<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">

  <title>Re: RFH: aliasing problems</title>

</head>

<body bgcolor="#ffffff" text="#000000">

Sergei Organov wrote:

<blockquote cite="midelj4bq$ogg$1@sea.gmane.org" type="cite">

  <pre wrap="">Steven Johnson

<a class="moz-txt-link-rfc2396E" href="mailto:sjohnson@sakuraindustries.com">&lt;sjohnson@sakuraindustries.com&gt;</a> writes:

  </pre>

  <blockquote type="cite">

    <pre wrap="">Hi,

Can anyone explain how -fstrict-aliasing is compatible with C99 Section

6.3.2.3(7).

because it says:

"A pointer to an object ... may be converted to a pointer to a different

object ... when converted back again the results shall compare equal to

the original pointer."

If this is not the definition of type punning with pointers, I do not

know what is.

    </pre>

  </blockquote>

  <pre wrap=""><!---->

This tells nothing about semantics of accessing corresponding object

by dereferencing the pointer, while aliasing is all about accessing actual

objects using pointers of different types.

  </pre>

</blockquote>

The full quotation is:<br>

<br>

<b>6.3.2.3(7):<br>

"A pointer to an object or incomplete type may be converted to a

pointer to a different object or incomplete type."<br>

</b><br>

<i>so the following is valid, according to this statement:<br>

void f(void)<br>

{<br>

  int *ip;<br>

  float *fp;<br>

  int i = 12345;<br>

<br>

  ip = &i;<br>

  fp = (float*)ip;<br>

</i>}<br>

<br>

<b>"If the resulting pointer is not correctly aligned for the
pointed-to type, the behavior is undefined. (In general, the concept
“correctly aligned” is transitive: if a pointer to type A is

correctly aligned for a pointer to type B, which in turn is correctly

aligned for a pointer to type C, then a pointer to type A is correctly

aligned for a pointer to type C.)"<br>

</b><br>

<i>So, this obviously contemplates accessing the data, otherwise

alignment is not an issue.<br>

</i><br>

<b>"Otherwise, when converted back again, the result shall compare

equal to the original pointer." <br>

</b><i><br>

Whenever the pointer is converted back to its original type, the
address it holds must be the same as before the conversion.  That cannot
be guaranteed if the conversion changes the pointer, because no information
travels with the pointer recording its original value or
original type.  So the data of the object pointed to MUST be the same for
the original pointer and the converted pointer.  If accessing through the
converted pointer is restricted, then the whole paragraph is nonsense
and redundant.  What does it specify if you are forbidden from
accessing the contents of the converted pointer?  What is its purpose?<br>

</i><br>
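As a minimal sketch of the round-trip guarantee quoted above (the function names are hypothetical, not from the standard or the attached code), the following converts an int pointer to void* and to unsigned char* and back. It only demonstrates the conversion rule and takes no position on dereferencing a converted pointer.<br>

```c
#include <stdbool.h>

/* Convert an int pointer to void* and back again; the restored pointer
   is expected to compare equal to the original. */
bool round_trip_via_void(int *original)
{
    void *erased = original;   /* implicit conversion to void* */
    int *restored = erased;    /* implicit conversion back to int* */
    return restored == original;
}

/* Same round trip through a character pointer. Character types have the
   weakest alignment requirement, so the alignment caveat cannot apply. */
bool round_trip_via_char(int *original)
{
    unsigned char *bytes = (unsigned char *)original;
    int *restored = (int *)bytes;
    return restored == original;
}
```

Calling either function with the address of any int object should return true.<br>
<br>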

<b>"When a pointer to an object is converted to a pointer to a

character type, the result points to the lowest addressed byte of the

object. Successive increments of the result, up to the size of the

object, yield pointers to the remaining bytes of the object."<br>

<br>

</b>Again, this certainly contemplates accessing memory through the type-cast
pointer, and describes what happens when it is incremented for the purpose of
accessing the object (using char pointers).<br>

<br>
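A small sketch of the character-pointer access described in that sentence, assuming nothing beyond standard C (the function names are made up for illustration): one function walks the bytes of an int from the lowest addressed byte, the other writes them back into a fresh int.<br>

```c
#include <stddef.h>

/* Copy the bytes of *value into out by stepping an unsigned char pointer
   from the lowest addressed byte. Returns the number of bytes copied. */
size_t copy_bytes_of_int(const int *value, unsigned char *out, size_t out_size)
{
    const unsigned char *p = (const unsigned char *)value;
    size_t n = sizeof *value < out_size ? sizeof *value : out_size;
    for (size_t i = 0; i < n; i++) {
        out[i] = p[i];
    }
    return n;
}

/* Rebuild an int from sizeof(int) bytes, again writing through an
   unsigned char pointer into the int object. */
int rebuild_int(const unsigned char *bytes)
{
    int result = 0;
    unsigned char *q = (unsigned char *)&result;
    for (size_t i = 0; i < sizeof result; i++) {
        q[i] = bytes[i];
    }
    return result;
}
```

The byte order seen in the buffer depends on the platform, but copying out and rebuilding should give back the original value.<br>
<br>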

The C99 rationale on page 49 says:<br>

<b>   Consequences of the treatment of pointer types in the Standard

include:<br>

      • A pointer to void may be converted to a pointer to an object of

any type.<br>

      • A pointer to any object of any type may be converted to a

pointer to void.<br>

      • If a pointer to an object is converted to a pointer to void and

back again to the original<br>

       pointer type, the result compares equal to original pointer.<br>

      • It is invalid to convert a pointer to an object of any type to

a pointer to an object of a<br>

         different type without an explicit cast.<br>

      • Even with an explicit cast, it is invalid to convert a function

pointer to an object pointer<br>

         or a pointer to void, or vice versa.<br>

    • It is invalid to convert a pointer to a function of one type to a

pointer to a function of a<br>

         different type without a cast.<br>

      • Pointers to functions that have different parameter-type

information (including the “old-<br>

         style” absence of parameter-type information) are different

types.<br>

</b><br>

Note:<br>

<b>      • It is invalid to convert a pointer to an object of any type

to a pointer to an object of a<br>

         different type without an explicit cast.<br>

</b>The corollary of this is "It is valid to convert a pointer to an

object of any type to a pointer to an object of a different type with

an explicit cast".  Otherwise this statement would be, "it is invalid

to convert a pointer to an object of any type to a pointer of a

different type FULLSTOP".<br>

<br>

Again, if you can't use the converted pointer there is no point in

converting it, so I say this says everything about type punning using

pointers, and it is allowed by C99.<br>

<b></b><br>

The Rationale says the spirit of the specification of the C language is:<br>

<br>

<b>   Keep the spirit of C. The C89 Committee kept as a major goal to

preserve the traditional spirit<br>

   of C. There are many facets of the spirit of C, but the essence is a

community sentiment of the<br>

 underlying principles upon which the C language is based. Some of the

facets of the spirit of C<br>

   can be summarized in phrases like:<br>

       • Trust the programmer.<br>

       • Don’t prevent the programmer from doing what needs to be done.<br>

       • Keep the language small and simple.<br>

     • Provide only one way to do an operation.<br>

       • Make it fast, even if it is not guaranteed to be portable.<br>

</b><br>

Strict aliasing is against this spirit.  Nowhere does the Rationale
provide a justification for such a radical and spirit-changing alteration
to C as strict aliasing makes.<br>

<br>

The paragraph everyone is hung up on is:<b><br>

An object shall have its stored value accessed only by an lvalue

expression that has one of<br>

the following types: (The intent of this list is to specify those

circumstances in which an object may or may not be aliased.)<br>

— a type compatible with the effective type of the object,<br>

— a qualified version of a type compatible with the effective type of

the object,<br>

— a type that is the signed or unsigned type corresponding to the

effective type of the<br>

    object,<br>

— a type that is the signed or unsigned type corresponding to a

qualified version of the<br>

    effective type of the object,<br>

— an aggregate or union type that includes one of the aforementioned

types among its<br>

    members (including, recursively, a member of a subaggregate or

contained union), or<br>

— a character type.<br>

</b><br>
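For comparison, here is a sketch of reinterpreting a float's bytes without any pointer cast between float and integer types, so it does not depend on how the list above is read. The function name is hypothetical, and the code assumes float is 32 bits wide.<br>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the object representation of a float as a 32-bit integer by
   copying its bytes. Assumption: float and uint32_t have the same size. */
uint32_t float_bits(float value)
{
    uint32_t bits;
    assert(sizeof bits == sizeof value);
    memcpy(&bits, &value, sizeof bits);  /* byte-wise copy, no float*-to-int* cast */
    return bits;
}
```

On a platform that uses IEEE 754 single precision for float, float_bits(1.0f) should be 0x3f800000.<br>
<br>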

I read "effective type" as the type of the object as seen through the lvalue
expression used to access it, which means it can change with that expression.
This is supported by the C99 Rationale, page 11:<br>

<br>

<b>  The definition of object does not employ the notion of type. Thus

an object has no type in and of<br>

  itself. However, since an object may only be designated by an lvalue

(see §6.3.2.1), the phrase<br>

  “the type of an object” is taken to mean, here and in the Standard,

“the type of the lvalue<br>

  designating this object,” and “the value of an object” means “the

contents of the object<br>

  interpreted as a value of the type of the lvalue designating the

object.”<br>

</b><br>

An lvalue is defined in 6.3.2.1 as:<br>

<br>

<b>  An lvalue is an expression with an object type or an incomplete

type other than void; ...<br>

  When an object is said to have a particular type, the type is

specified by the lvalue used to<br>

  designate the object.<br>

</b><br>

and the Rationale, on page 48, says:<br>

<br>

<b>   A difference of opinion within the C community centered around

the meaning of lvalue, one<br>

   group considering an lvalue to be any kind of object locator,

another group holding that an lvalue<br>

 is meaningful on the left side of an assigning operator. The C89

Committee adopted the<br>

   definition of lvalue as an object locator. The term modifiable

lvalue is used for the second of the<br>

   above concepts.<br>

</b><br>

Casting is an expression according to the formal language
specification.<br>
<br>
<b>lvalues</b> are <b>expressions</b>,<br>
<b>expressions</b> encompass <b>casting</b>,<br>
and the language specification clearly says a pointer to any object type can be cast to
a pointer to any other object type.<br>

<br>

Breaking down the problematic statement from the standard, one gets
(after properly evaluating all of the words in it):<br>

<b>An object shall have its stored value accessed only by an lvalue

expression that has one of<br>

the following types: (The intent of this list is to specify those

circumstances in which an object may or may not be aliased.)<br>

</b>An access can be a read or a write.<br>

<br>

<b>— a type compatible with the effective type of the object,</b><br>

On read, the value being read must be compatible with the lvalue
expression (including any casting) of the pointer being dereferenced.<br>
On write, the value being written into the object must be compatible
with the lvalue expression (including any casting) of the pointer being
dereferenced.<br>

<br>

The other items in the list don't really apply to this discussion.  I
cannot find any definitive reference that says that if you want to type pun you
must use unions.  It just isn't there.  The spec doesn't say what
everyone is saying it says, at all, and in fact it says a whole lot
to the exact opposite effect.<br>

<br>
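For reference, the union form that people point to looks roughly like the sketch below. The type and function names are hypothetical, and the code assumes float is 32 bits wide; GCC's documentation for -fstrict-aliasing describes this kind of access through the union type as allowed.<br>

```c
#include <stdint.h>

/* Hypothetical union holding the same storage as either a float or a
   32-bit unsigned integer. Assumption: float is 32 bits wide. */
union float_or_u32 {
    float    as_float;
    uint32_t as_u32;
};

/* Write one member, then read the other, always through the union type. */
uint32_t float_bits_via_union(float value)
{
    union float_or_u32 pun;
    pun.as_float = value;
    return pun.as_u32;
}
```

With IEEE 754 single precision, float_bits_via_union(1.0f) should be 0x3f800000.<br>
<br>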

Page 59/60 of the C99 Rationale says:<br>

<br>

<b>In practice, aliasing arises with the use of pointers. A contrived

example to illustrate the issues is<br>

           int a;<br>

           void f(int * b)<br>

           {<br>

                a = 1;<br>

              *b = 2;<br>

                g(a);<br>

           }<br>

   It is tempting to generate the call to g as if the source expression

were g(1), but b might point<br>

   to a, so this optimization is not safe. <br>

</b><br>

<i>So the compiler cannot eliminate the read of a as a common subexpression
by assuming it still holds the constant 1; it must reload the value for the call to
g, as it might be 2.</i><br>

<br>

<b>On the other hand, consider<br>

         int a;<br>

           void f( double * b )<br>

           {<br>

                a = 1;<br>

                *b = 2.0;<br>

              g(a);<br>

           }<br>

   Again the optimization is incorrect only if b points to a. However,

this would only have come<br>

   about if the address of a were somewhere cast to double*. The C89

Committee has decided<br>

   that such dubious possibilities need not be allowed for.<br>

<br>

</b><i>So, as an optimisation, because the compiler cannot determine whether double
*b points to a, it is allowed to assume it doesn't and call g(a) with the
constant 1, rather than look up the value.</i><br>

<br>

<b> In principle, then, aliasing only need be allowed for when the

lvalues all have the same type.<br>

</b><br>
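The two Rationale examples can be made compilable as in the sketch below. The g shown here, the last_seen variable and the two call_ helpers are assumptions added for illustration; the Rationale leaves g undefined. Only the uncontested calls are exercised: f_int with the address of a, and f_double with the address of a separate double.<br>

```c
int a;
static int last_seen;   /* hypothetical: records the argument g received */

void g(int value)
{
    last_seen = value;
}

/* First Rationale example: b may point to a, so a must be reloaded. */
void f_int(int *b)
{
    a = 1;
    *b = 2;
    g(a);
}

/* Second Rationale example: b is a double*, which the Committee says
   need not be assumed to alias the int a. */
void f_double(double *b)
{
    a = 1;
    *b = 2.0;
    g(a);
}

/* Helper: call f_int with b pointing at a, return what g saw. */
int call_f_int_on_a(void)
{
    f_int(&a);
    return last_seen;
}

/* Helper: call f_double with a distinct double object, return what g saw. */
int call_f_double_on_d(void)
{
    double d = 0.0;
    f_double(&d);
    return last_seen;
}
```

Here call_f_int_on_a() should return 2 and call_f_double_on_d() should return 1, with or without optimisation.<br>
<br>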

<i>But, following the rule, the following should be OK, and should be

analogous to the first example:<br>

         int a;<br>

           void f( double * b )<br>

           {<br>

                int *c;<br>

                a = 1;<br>

                c = (int*)b;<br>

                *c = 2;<br>

              g(a);<br>

           }<br>

Because the "effective type" of *c is int for the assignment of 2,
the compiler has the same problem as in the first example: it cannot
assume that c, and hence b, does not point to an int, because the lvalue
expression in the assignment has type int. The spec says that if
you convert a pointer back to its original type it will have the
original address, so the optimiser cannot assume that it isn't
pointing to a in this case.<br>

<br>

</i>If this is not the case, then again, type casting pointers has no
utility, so why is it still in the standard?<br>

<br>

Further, if strict aliasing as implemented by GCC is part of C99, then
GCC is broken (in C99 mode) without optimisations on, because it does
not enforce the rule. Take the attached code: it behaves differently
depending on which of the following command lines is used to build it.<br>

<br>

built with:<br>

gcc -std=c99 -pedantic -O -fstrict-aliasing -Wstrict-aliasing=2 test.c<br>

<br>

when run gives:<br>

5<br>

5<br>

13<br>

15<br>

5<br>

<br>

built with:<br>

gcc -std=c99 -pedantic -fstrict-aliasing -Wstrict-aliasing=2 test.c<br>

<br>

when run gives:<br>

5<br>

5<br>

5<br>

5<br>

5<br>

<br>

Neither build generates ANY warnings or errors.<br>

<br>

The fact that you get different behaviour with optimisations ON and
OFF, and no warning that there is a problem, to me indicates a broken
optimiser.  No one is going to convince me that when a program has
radically different behaviour with the optimiser ON or OFF, the
optimiser isn't broken (or the unoptimised compiler isn't broken, take
your pick).  It is also very concerning that whether or not the code is
inlined changes the behaviour, especially as GCC automatically inlines code
at higher optimisation levels. Further, my tests prove to me
that the presence or absence of the warning means precisely nothing.
The warning can be present, yet the code behaves precisely as intended;
the warning can be absent, yet the code is broken, as in this case.<br>

<br>

I don't read the C99 specification to give the strict interpretation

that GCC spins on it, and in any event, I don't see how the C99

specification makes it permissible for inline code to behave

differently than non-inline code (with the same operation), or for a

compiler to have completely different behaviour with optimisation ON or

OFF. <br>

<br>

Further, given the HUGE amount of code that is non-C99 for lots of other
reasons (such as the K&R parameter declarations all through the
network stack), I don't see why there is such a hangup about disabling
this highly problematic and suspicious optimisation.  Why file a PR on
this, and ignore the other obvious C99 non-conformances in the same
immediate vicinity of the file?<br>

<br>

Steven J

</body>

</html>