Why are these constructs using pre and post-increment undefined behavior?












741















#include <stdio.h>

int main(void)
{
    int i = 0;
    i = i++ + ++i;
    printf("%d\n", i); // 3

    i = 1;
    i = (i++);
    printf("%d\n", i); // 2 Should be 1, no ?

    volatile int u = 0;
    u = u++ + ++u;
    printf("%d\n", u); // 1

    u = 1;
    u = (u++);
    printf("%d\n", u); // 2 Should also be one, no ?

    register int v = 0;
    v = v++ + ++v;
    printf("%d\n", v); // 3 (Should be the same as u ?)

    int w = 0;
    printf("%d %d %d\n", w++, ++w, w); // shouldn't this print 0 2 2

    int x[2] = { 5, 8 }, y = 0;
    x[y] = y ++;
    printf("%d %d\n", x[0], x[1]); // shouldn't this print 0 8? or 5 0?
}





























  • 42





    Homework? Not trying to be a pain, but you should never write code with expressions like these. They are usually given as academic examples, sometimes showing that different compilers yield different output.

    – Jarrett Meyer
    Jun 4 '09 at 10:30






  • 10





    @Jarrett, nope, I just needed some pointers to "sequence points". While working I found a piece of code with i = i++ and thought "This isn't modifying the value of i". I tested it and wondered why. I have since removed this statement and replaced it with i++;

    – PiX
    Jun 4 '09 at 18:24






  • 22





    Explain these undefined behaviors? Explain what about them? How they behave is undefined.

    – Jesse Millikan
    Jul 10 '09 at 15:44






  • 187





    I think it's interesting that everyone ALWAYS assumes that questions like this are asked because the asker wants to USE the construct in question. My first assumption was that PiX knows that these are bad, but is curious why they behave the way they do on whatever compiler s/he was using... And yeah, what unWind said... it's undefined, it could do anything... including JCF (Jump and Catch Fire)

    – Brian Postow
    May 24 '10 at 13:41






  • 31





    I'm curious: Why don't compilers seem to warn on constructs such as "u = u++ + ++u;" if the result is undefined?

    – Learn OpenGL ES
    Sep 20 '12 at 16:23






















c increment undefined-behavior order-of-evaluation sequence-points














edited Dec 21 '18 at 11:42
Sourav Ghosh

asked Jun 4 '09 at 9:17
PiX




















14 Answers


















533





+500









C has the concept of undefined behavior, i.e. some language constructs are syntactically valid but you can't predict the behavior when the code is run.



As far as I know, the standard doesn't explicitly say why the concept of undefined behavior exists. In my mind, it's simply because the language designers wanted some leeway in the semantics. Requiring, for example, that all implementations handle signed integer overflow in exactly the same way would very likely impose serious performance costs, so they left the behavior undefined: if you write code that causes signed integer overflow, anything can happen.



So, with that in mind, why are these "issues"? The language clearly says that certain things lead to undefined behavior. There is no problem, there is no "should" involved. If the undefined behavior changes when one of the involved variables is declared volatile, that doesn't prove or change anything. It is undefined; you cannot reason about the behavior.



Your most interesting-looking example, the one with



u = (u++);


is a text-book example of undefined behavior (see Wikipedia's entry on sequence points).



























  • 37





    I knew it was undefined (the idea of seeing this code in production frightens me :)), but I was trying to understand the reason for these results, especially why u = u++ incremented u. In Java, for example, u = u++ returns 0 as (my brain) expected :) Thanks for the sequence points links BTW.

    – PiX
    Jun 4 '09 at 9:42






  • 8





    @PiX: Things are undefined for a number of possible reasons. These include: there is no clear "right result", different machine architectures would strongly favour different results, existing practice is not consistent, or beyond the scope of the standard (e.g. what filenames are valid).

    – Richard
    Jun 4 '09 at 10:57






  • 35





    The spirit of C: Trust the programmer... no matter how insane he is.

    – Fiddling Bits
    Nov 26 '13 at 2:48






  • 4





    @rusty Not sure what you mean. The term "undefined behavior" is used in the C standard. It means that even though some constructs are syntactically valid and will typically compile, they lead to undefined behavior, i.e. they do not make sense and should be avoided, since your program is broken if it has undefined behavior.

    – unwind
    Mar 22 '14 at 20:01






  • 6





    @MattMcNabb that is only well defined in C++11 not in C11.

    – Shafik Yaghmour
    Jul 14 '14 at 1:18



















76














Just compile and disassemble your line of code, if you are so inclined to know how exactly it is you get what you are getting.



This is what I get on my machine, together with what I think is going on:



$ cat evil.c
void evil(){
  int i = 0;
  i += i++ + ++i;
}
$ gcc evil.c -c -o evil.bin
$ gdb evil.bin
(gdb) disassemble evil
Dump of assembler code for function evil:
   0x00000000 <+0>:   push   %ebp
   0x00000001 <+1>:   mov    %esp,%ebp
   0x00000003 <+3>:   sub    $0x10,%esp
   0x00000006 <+6>:   movl   $0x0,-0x4(%ebp)  // i = 0   i = 0
   0x0000000d <+13>:  addl   $0x1,-0x4(%ebp)  // i++     i = 1
   0x00000011 <+17>:  mov    -0x4(%ebp),%eax  // j = i    i = 1  j = 1
   0x00000014 <+20>:  add    %eax,%eax        // j += j   i = 1  j = 2
   0x00000016 <+22>:  add    %eax,-0x4(%ebp)  // i += j   i = 3
   0x00000019 <+25>:  addl   $0x1,-0x4(%ebp)  // i++     i = 4
   0x0000001d <+29>:  leave
   0x0000001e <+30>:  ret
End of assembler dump.


(I... suppose that the 0x00000014 instruction was some kind of compiler optimization?)
































  • How do I get the machine code? I use Dev C++, and I played around with the 'Code Generation' option in the compiler settings, but got no extra file output or any console output

    – bad_keypoints
    Sep 24 '12 at 14:11






  • 4





    @ronnieaka gcc evil.c -c -o evil.bin and gdb evil.bin, then disassemble evil, or whatever the Windows equivalents of those are :)

    – badp
    Sep 24 '12 at 18:20








  • 17





    This answer does not really address the question of Why are these constructs undefined behavior?.

    – Shafik Yaghmour
    Jul 1 '14 at 14:00






  • 8





    As an aside, it'll be easier to compile to assembly (with gcc -S evil.c), which is all that's needed here. Assembling then disassembling it is just a roundabout way of doing it.

    – Kat
    Jul 27 '15 at 20:32








  • 39





    For the record, if for whatever reason you're wondering what a given construct does -- and especially if there's any suspicion that it might be undefined behavior -- the age-old advice of "just try it with your compiler and see" is potentially quite perilous. You will learn, at best, what it does under this version of your compiler, under these circumstances, today. You will not learn much if anything about what it's guaranteed to do. In general, "just try it with your compiler" leads to nonportable programs that work only with your compiler.

    – Steve Summit
    Feb 16 '16 at 21:26





















56














I think the relevant parts of the C99 standard are 6.5 Expressions, §2




Between the previous and next sequence point an object shall have its stored value
modified at most once by the evaluation of an expression. Furthermore, the prior value
shall be read only to determine the value to be stored.




and 6.5.16 Assignment operators, §4:




The order of evaluation of the operands is unspecified. If an attempt is made to modify
the result of an assignment operator or to access it after the next sequence point, the
behavior is undefined.




























  • 2





    Would the above imply that 'i=i=5;" would be Undefined Behavior?

    – supercat
    Nov 20 '11 at 21:41






  • 1





    @supercat as far as I know i=i=5 is also undefined behavior

    – dhein
    Sep 23 '13 at 15:39






  • 2





    @Zaibis: The rationale I like to use for most places where the rule applies is that in theory a multi-processor platform could implement something like A=B=5; as "Write-lock A; Write-lock B; Store 5 to A; Store 5 to B; Unlock B; Unlock A;", and a statement like C=A+B; as "Read-lock A; Read-lock B; Compute A+B; Unlock A and B; Write-lock C; Store result; Unlock C;". That would ensure that if one thread did A=B=5; while another did C=A+B; the latter thread would either see both writes as having taken place or neither. Potentially a useful guarantee. If one thread did I=I=5;, however, ...

    – supercat
    Sep 23 '13 at 16:18






  • 1





    ... and the compiler didn't notice that both writes were to the same location (if one or both lvalues involve pointers, that may be hard to determine), the generated code could deadlock. I don't think any real-world implementations implement such locking as part of their normal behavior, but it would be permissible under the standard, and if hardware could implement such behaviors cheaply it might be useful. On today's hardware such behavior would be way too expensive to implement as a default, but that doesn't mean it would always be thus.

    – supercat
    Sep 23 '13 at 16:19








  • 1





    @supercat but wouldn't the sequence point access rule of c99 alone be enough to declare it as undefined behavior? So it doesn't matter what technically the hardware could implement?

    – dhein
    Sep 23 '13 at 16:40



















48














The behavior can't really be explained because it invokes both unspecified behavior and undefined behavior, so we cannot make any general predictions about this code. If you read Olve Maudal's work, such as Deep C and Unspecified and Undefined, you can sometimes make good guesses in very specific cases with a specific compiler and environment, but please don't do that anywhere near production.



So moving on to unspecified behavior, the draft C99 standard, section 6.5 paragraph 3, says (emphasis mine):




The grouping of operators and operands is indicated by the syntax.74) Except as specified
later (for the function-call (), &&, ||, ?:, and comma operators), the order of evaluation of subexpressions and the order in which side effects take place are both unspecified.




So when we have a line like this:



i = i++ + ++i;


we do not know whether i++ or ++i will be evaluated first. This is mainly to give the compiler better options for optimization.



We also have undefined behavior here, since the program is modifying variables (i, u, etc.) more than once between sequence points. From draft standard section 6.5 paragraph 2 (emphasis mine):




Between the previous and next sequence point an object shall have its stored value
modified at most once by the evaluation of an expression. Furthermore, the prior value
shall be read only to determine the value to be stored.




it cites the following code examples as being undefined:



i = ++i + 1;
a[i++] = i;


In all these examples the code attempts to modify an object more than once between two consecutive sequence points; in each of these cases the next sequence point is at the ;:



i = i++ + ++i;
^   ^       ^

i = (i++);
^    ^

u = u++ + ++u;
^   ^       ^

u = (u++);
^    ^

v = v++ + ++v;
^   ^       ^


Unspecified behavior is defined in the draft c99 standard in section 3.4.4 as:




use of an unspecified value, or other behavior where this International Standard provides
two or more possibilities and imposes no further requirements on which is chosen in any
instance




and undefined behavior is defined in section 3.4.3 as:




behavior, upon use of a nonportable or erroneous program construct or of erroneous data,
for which this International Standard imposes no requirements




and notes that:




Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).








































    47














    Most of the answers here quote from the C standard, emphasizing that the behavior of these constructs is undefined. To understand why it is undefined, let's first look at these terms in the light of the C11 standard:



    Sequenced: (5.1.2.3)




    Given any two evaluations A and B, if A is sequenced before B, then the execution of A shall precede the execution of B.




    Unsequenced:




    If A is not sequenced before or after B, then A and B are unsequenced.




    Evaluations can be one of two things:





    • value computations, which work out the result of an expression; and


    • side effects, which are modifications of objects.


    Sequence Point:




    The presence of a sequence point between the evaluation of expressions A and B implies that every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B.




    Now coming to the question, for the expressions like



    int i = 1;
    i = i++;


    standard says that:



    6.5 Expressions:




    If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. [...]




    Therefore, the above expression invokes UB because two side effects on the same object i are unsequenced relative to each other. That means it is not specified whether the side effect of the assignment to i happens before or after the side effect of ++.

    Depending on whether the assignment occurs before or after the increment, different results will be produced, and that is one of the cases of undefined behavior.



    Let's rename the i on the left of the assignment to il and the i on the right of the assignment (in the expression i++) to ir; then the expression becomes



    il = ir++     // Note that the suffixes l and r are used for the sake of clarity.
                  // Both il and ir represent the same object.


    An important point regarding Postfix ++ operator is that:




    just because the ++ comes after the variable does not mean that the increment happens late. The increment can happen as early as the compiler likes as long as the compiler ensures that the original value is used.




    This means the expression il = ir++ could be evaluated either as



    temp = ir;      // i = 1
    ir = ir + 1;    // i = 2   side effect by ++ before assignment
    il = temp;      // i = 1   result is 1


    or



    temp = ir;      // i = 1
    il = temp;      // i = 1   side effect by assignment before ++
    ir = ir + 1;    // i = 2   result is 2


    resulting in two different results, 1 and 2, depending on the order of the side effects of the assignment and of ++; hence the expression invokes UB.







































      30














      Another way of answering this, rather than getting bogged down in arcane details of sequence points and undefined behavior, is simply to ask, what are they supposed to mean? What was the programmer trying to do?



      The first fragment asked about, i = i++ + ++i, is pretty clearly insane in my book. No one would ever write it in a real program, it's not obvious what it does, there's no conceivable algorithm someone could have been trying to code that would have resulted in this particular contrived sequence of operations. And since it's not obvious to you and me what it's supposed to do, it's fine in my book if the compiler can't figure out what it's supposed to do, either.



      The second fragment, i = i++, is a little easier to understand. Someone is clearly trying to increment i, and assign the result back to i. But there are a couple ways of doing this in C. The most basic way to add 1 to i, and assign the result back to i, is the same in almost any programming language:



      i = i + 1


      C, of course, has a handy shortcut:



      i++


      This means, "add 1 to i, and assign the result back to i". So if we construct a hodgepodge of the two, by writing



      i = i++


      what we're really saying is "add 1 to i, and assign the result back to i, and assign the result back to i". We're confused, so it doesn't bother me too much if the compiler gets confused, too.



      Realistically, the only time these crazy expressions get written is when people are using them as artificial examples of how ++ is supposed to work. And of course it is important to understand how ++ works. But one practical rule for using ++ is, "If it's not obvious what an expression using ++ means, don't write it."



      We used to spend countless hours on comp.lang.c discussing expressions like these and why they're undefined. Two of my longer answers, that try to really explain why, are archived on the web:




      • Why doesn't the Standard define what these do?

      • Doesn't operator precedence determine the order of evaluation?



























      • 1





        A rather nasty gotcha with regard to Undefined Behavior is that while it used to be safe on 99.9% of compilers to use *p=(*q)++; to mean if (p!=q) *p=(*q)++; else *p= __ARBITRARY_VALUE; that is no longer the case. Hyper-modern C would require writing something like the latter formulation (though there's no standard way of indicating code doesn't care what's in *p) to achieve the level of efficiency compilers used to provide with the former (the else clause is necessary in order to let the compiler optimize out the if which some newer compilers would require).

        – supercat
        Jun 30 '15 at 16:14








      • 1





        I've seen at least 5 similar questions about these ++ and -- madness last week or so. These seem to be some professors' favorite topic to puzzle their students..

        – artm
        Feb 8 '16 at 7:49



















      22














      While it is unlikely that any compilers and processors would actually do so, it would be legal, under the C standard, for the compiler to implement "i++" with the sequence:



      In a single operation, read `i` and lock it to prevent access until further notice
      Compute (1+read_value)
      In a single operation, unlock `i` and store the computed value


      While I don't think any processors support the hardware to allow such a thing to be done efficiently, one can easily imagine situations where such behavior would make multi-threaded code easier (e.g. it would guarantee that if two threads try to perform the above sequence simultaneously, i would get incremented by two) and it's not totally inconceivable that some future processor might provide a feature something like that.



      If the compiler were to write i++ as indicated above (legal under the standard) and were to intersperse the above instructions throughout the evaluation of the overall expression (also legal), and if it didn't happen to notice that one of the other instructions happened to access i, it would be possible (and legal) for the compiler to generate a sequence of instructions that would deadlock. To be sure, a compiler would almost certainly detect the problem in the case where the same variable i is used in both places, but if a routine accepts references to two pointers p and q, and uses (*p) and (*q) in the above expression (rather than using i twice) the compiler would not be required to recognize or avoid the deadlock that would occur if the same object's address were passed for both p and q.







































        21














        Often this question is linked as a duplicate of questions related to code like



        printf("%d %d\n", i, i++);


        or



        printf("%d %d\n", ++i, i++);


        or similar variants.



        While this is also undefined behaviour as stated already, there are subtle differences when printf() is involved when comparing to a statement such as:



           x = i++ + i++;




        In the following statement:



        printf("%d %d\n", ++i, i++);


        the order of evaluation of arguments in printf() is unspecified. That means, expressions i++ and ++i could be evaluated in any order. C11 standard has some relevant descriptions on this:



        Annex J, unspecified behaviours




        The order in which the function designator, arguments, and
        subexpressions within the arguments are evaluated in a function call
        (6.5.2.2).




        3.4.4, unspecified behavior




        Use of an unspecified value, or other behavior where this
        International Standard provides two or more possibilities and imposes
        no further requirements on which is chosen in any instance.



        EXAMPLE An example of unspecified behavior is the order in which the
        arguments to a function are evaluated.




        The unspecified behaviour itself is NOT an issue. Consider this example:



        printf("%d %d\n", ++x, y++);


        This too has unspecified behaviour because the order of evaluation of ++x and y++ is unspecified. But it's perfectly legal and valid statement. There's no undefined behaviour in this statement. Because the modifications (++x and y++) are done to distinct objects.



        What renders the following statement



        printf("%d %d\n", ++i, i++);


        as undefined behaviour is the fact that these two expressions modify the same object i without an intervening sequence point.





        Another detail is that the comma involved in the printf() call is a separator, not the comma operator.



        This is an important distinction because the comma operator does introduce a sequence point between the evaluations of its operands, which makes the following legal:



        int i = 5;
        int j;

        j = (++i, i++); // No undefined behaviour here because the comma operator
                        // introduces a sequence point between '++i' and 'i++'

        printf("i=%d j=%d\n",i, j); // prints: i=7 j=6


        The comma operator evaluates its operands left-to-right and yields only the value of the last operand. So in j = (++i, i++);, ++i increments i to 6 and i++ yields old value of i (6) which is assigned to j. Then i becomes 7 due to post-increment.



        So if the comma in the function call were a comma operator, then



        printf("%d %d\n", ++i, i++);


        would not be a problem. But it invokes undefined behaviour because the comma here is a separator.





        Those who are new to undefined behaviour would benefit from reading What Every C Programmer Should Know About Undefined Behavior to understand the concept and many other variants of undefined behaviour in C.



        This post: Undefined, unspecified and implementation-defined behavior is also relevant.
































        • This sequence int a = 10, b = 20, c = 30; printf("a=%d b=%d c=%d\n", (a = a + b + c), (b = b + b), (c = c + c)); appears to give stable behavior (right-to-left argument evaluation in gcc v7.3.0; result "a=110 b=40 c=60"). Is it because the assignments are considered as 'full-statements' and thus introduce a sequence point? Shouldn't that result in left-to-right argument/statement evaluation? Or, is it just manifestation of undefined behavior?

          – kavadias
          Oct 17 '18 at 20:20











        • @kavadias That printf statement involves undefined behaviour, for the same reason explained above. You are writing b and c in the 3rd & 4th arguments respectively and reading them in the 2nd argument. But there's no sequence point between these expressions (2nd, 3rd, & 4th args). gcc/clang has an option -Wsequence-point which can help find these, too.

          – P.P.
          Oct 18 '18 at 8:40



















        13














        The C standard says that a variable may be modified at most once between two sequence points. The semicolon that ends an expression statement, for instance, marks a sequence point.

        So every statement of the form:



        i = i++;
        i = i++ + ++i;


        and so on violates that rule. The standard also says that the behavior is undefined, not unspecified. Some compilers do detect these and produce some result, but that is not guaranteed by the standard.



        However, two different variables can be incremented between two sequence points.



        while (*dst++ = *src++);


        The above is a common coding practice while copying/analysing strings.
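As a sketch of why that idiom is safe, here it is wrapped in a small function. The name `copy_string` and the returned character count are assumptions made for illustration; the caller is assumed to provide a destination buffer that is large enough.

```c
#include <stddef.h>

/* Hypothetical helper: copies the NUL-terminated string at src into dst
   and returns the number of characters copied, excluding the terminator. */
size_t copy_string(char *dst, const char *src)
{
    size_t n = 0;

    /* dst and src are distinct objects and each is modified exactly once
       per evaluation of the controlling expression, so this is well defined. */
    while ((*dst++ = *src++) != '\0')
        n++;

    return n;
}
```

The end of the controlling expression of the while statement is itself a sequence point, so both increments are complete before the next iteration starts.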






























        • Of course it doesn't apply to different variables within one expression. It would be a total design failure if it did! All you need in the 2nd example is for both to be incremented between the statement ending and the next one beginning, and that's guaranteed, precisely because of the concept of sequence points at the centre of all this.

          – underscore_d
          Jul 19 '16 at 18:55





















        11














        While the syntax of expressions like a = a++ or a++ + a++ is legal, the behaviour of these constructs is undefined because a "shall" requirement in the C standard is not obeyed. C99 6.5p2:





        Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. [72] Furthermore, the prior value shall be read only to determine the value to be stored. [73]




        With footnote 73 further clarifying that






        This paragraph renders undefined statement expressions such as



          i = ++i + 1;
          a[i++] = i;


          while allowing



          i = i + 1;
          a[i] = i;





        The various sequence points are listed in Annex C of C11 (and C99):






        The following are the sequence points described in 5.1.2.3:




          • Between the evaluations of the function designator and actual arguments in a function call and the actual call. (6.5.2.2).

          • Between the evaluations of the first and second operands of the following operators: logical AND && (6.5.13); logical OR || (6.5.14); comma , (6.5.17).

          • Between the evaluations of the first operand of the conditional ? : operator and whichever of the second and third operands is evaluated (6.5.15).

          • The end of a full declarator: declarators (6.7.6);

          • Between the evaluation of a full expression and the next full expression to be evaluated. The following are full expressions: an initializer that is not part of a compound literal (6.7.9); the expression in an expression statement (6.8.3); the controlling expression of a selection statement (if or switch) (6.8.4); the controlling expression of a while or do statement (6.8.5); each of the (optional) expressions of a for statement (6.8.5.3); the (optional) expression in a return statement (6.8.6.4).

          • Immediately before a library function returns (7.1.4).

          • After the actions associated with each formatted input/output function conversion specifier (7.21.6, 7.29.2).

          • Immediately before and immediately after each call to a comparison function, and also between any call to a comparison function and any movement of the objects passed as arguments to that call (7.22.5).






        The wording of the same paragraph in C11 is:





        If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings. 84)






        You can detect such errors in a program by for example using a recent version of GCC with -Wall and -Werror, and then GCC will outright refuse to compile your program. The following is the output of gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005:



        % gcc plusplus.c -Wall -Werror -pedantic
        plusplus.c: In function ‘main’:
        plusplus.c:6:6: error: operation on ‘i’ may be undefined [-Werror=sequence-point]
        i = i++ + ++i;
        ~~^~~~~~~~~~~
        plusplus.c:6:6: error: operation on ‘i’ may be undefined [-Werror=sequence-point]
        plusplus.c:10:6: error: operation on ‘i’ may be undefined [-Werror=sequence-point]
        i = (i++);
        ~~^~~~~~~
        plusplus.c:14:6: error: operation on ‘u’ may be undefined [-Werror=sequence-point]
        u = u++ + ++u;
        ~~^~~~~~~~~~~
        plusplus.c:14:6: error: operation on ‘u’ may be undefined [-Werror=sequence-point]
        plusplus.c:18:6: error: operation on ‘u’ may be undefined [-Werror=sequence-point]
        u = (u++);
        ~~^~~~~~~
        plusplus.c:22:6: error: operation on ‘v’ may be undefined [-Werror=sequence-point]
        v = v++ + ++v;
        ~~^~~~~~~~~~~
        plusplus.c:22:6: error: operation on ‘v’ may be undefined [-Werror=sequence-point]
        cc1: all warnings being treated as errors


        The important part is to know what is a sequence point and what isn't. For example, the comma operator introduces a sequence point, so



        j = (i ++, ++ i);


        is well-defined, and will increment i by one, yielding the old value, discard that value; then at comma operator, settle the side effects; and then increment i by one, and the resulting value becomes the value of the expression - i.e. this is just a contrived way to write j = (i += 2) which is yet again a "clever" way to write



        i += 2;
        j = i;


        However, the , in function argument lists is not a comma operator, and there is no sequence point between evaluations of distinct arguments; instead their evaluations are unsequenced with regard to each other; so the function call



        int i = 0;
        printf("%d %d\n", i++, ++i, i);


        has undefined behaviour because there is no sequence point between the evaluations of i++ and ++i in function arguments, and the value of i is therefore modified twice, by both i++ and ++i, between the previous and the next sequence point.







































          9














          In https://stackoverflow.com/questions/29505280/incrementing-array-index-in-c someone asked about a statement like:



          int k[] = {0,1,2,3,4,5,6,7,8,9,10};
          int i = 0;
          int num;
          num = k[++i+k[++i]] + k[++i];
          printf("%d", num);


          which prints 7... the OP expected it to print 6.



          The ++i increments aren't guaranteed to all complete before the rest of the calculations. In fact, different compilers will get different results here. In the example you provided, the first two ++i were executed, then the values of k were read, then the last ++i, then k. Writing the increments out explicitly avoids the problem:



          num = k[i+1] + k[i+2] + k[i+3];
          i += 3;


          Modern compilers will optimize this very well. In fact, possibly better than the code you originally wrote (assuming it had worked the way you had hoped).
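For completeness, here is the explicit form as a small self-contained function. This is only a sketch: the name `sum_next_three` is hypothetical, and it assumes the array has at least *i + 4 elements.

```c
/* Hypothetical helper: sums the three elements following index *i,
   then advances *i by 3. Every read of *i happens before the single
   modification, which sits in its own statement. */
int sum_next_three(const int *k, int *i)
{
    int num = k[*i + 1] + k[*i + 2] + k[*i + 3];
    *i += 3;
    return num;
}
```

With the array from the question and i starting at 0, this computes k[1] + k[2] + k[3], which is 6, and leaves i at 3.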







































            5














            A good explanation of what happens in this kind of computation is provided in document n1188 from the ISO WG14 site.



            I'll summarize its ideas here.



            The main rule from the standard ISO 9899 that applies in this situation is 6.5p2.




            Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.




            The sequence points in an expression like i=i++ are before i= and after i++.



            The paper mentioned above explains that you can picture the program as being formed of small boxes, each box containing the instructions between two consecutive sequence points. The sequence points are defined in annex C of the standard; in the case of i=i++ there are two sequence points, which delimit a full expression. Such an expression corresponds syntactically to an expression-statement entry in the Backus-Naur form of the grammar (a grammar is provided in annex A of the standard).



            So the instructions inside a box have no defined order.



            i=i++


            can be interpreted as



            tmp = i
            i=i+1
            i = tmp


            or as



            tmp = i
            i = tmp
            i=i+1


            Because both of these interpretations of the code i=i++ are valid, and because they generate different answers, the behavior is undefined.



            So sequence points can be seen as the beginning and the end of each box that composes the program [the boxes are atomic units in C], and inside a box the order of instructions is not defined in all cases. Changing that order can sometimes change the result.
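The two interpretations of i=i++ can be written out as ordinary, well-defined C to see how they diverge. This is only an illustrative sketch: the function names are hypothetical, and because the original expression is undefined, a real compiler is not limited to either of these outcomes.

```c
/* Interpretation 1: the increment is stored first, then the saved
   old value is assigned, so the increment is lost. */
int assign_last(int i)
{
    int tmp = i;
    i = i + 1;
    i = tmp;
    return i;
}

/* Interpretation 2: the saved old value is assigned first, then the
   increment is stored, so the increment survives. */
int increment_last(int i)
{
    int tmp = i;
    i = tmp;
    i = i + 1;
    return i;
}
```

Starting from i = 1, the first version returns 1 and the second returns 2, which matches the two outputs people report for this statement.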



            EDIT:



            Other good sources for explaining such ambiguities are the entries on the c-faq site (also published as a book), namely here, here and here.
































            • What does this answer add to the existing answers? Also, the explanation for i=i++ is very similar to this answer.

              – haccks
              Nov 24 '17 at 7:00













            • @haccks I did not read the other answers. I wanted to explain in my own language what I learned from the mentioned document from the official site of ISO 9899 open-std.org/jtc1/sc22/wg14/www/docs/n1188.pdf

              – alinsoar
              Nov 24 '17 at 12:14





















            3














            The reason is that the program invokes undefined behavior. The problem lies in the evaluation order, because no sequence points are required according to the C++98 standard (no operation is sequenced before or after another, in C++11 terminology).



            However, if you stick to one compiler, you will usually find the behavior consistent, as long as you don't add function calls or pointers, which would make the behavior messier.





            • So first the GCC:
              Using Nuwen MinGW 15 GCC 7.1 you will get:



              #include <stdio.h>

              int main(int argc, char **argv)
              {
                  int i = 0;
                  i = i++ + ++i;
                  printf("%d\n", i); // 2

                  i = 1;
                  i = (i++);
                  printf("%d\n", i); // 1

                  volatile int u = 0;
                  u = u++ + ++u;
                  printf("%d\n", u); // 2

                  u = 1;
                  u = (u++);
                  printf("%d\n", u); // 1

                  register int v = 0;
                  v = v++ + ++v;
                  printf("%d\n", v); // 2
              }




            How does GCC work? It appears to evaluate the sub-expressions of the right-hand side (RHS) in left-to-right order, then assign the value to the left-hand side (LHS). This is exactly how Java and C# behave and what their standards define. (Yes, the equivalent code in Java and C# has defined behavior.) It evaluates each sub-expression of the RHS one by one, left to right; for each sub-expression, the pre-increment (++c) is evaluated first, then the value of c is used for the operation, then the post-increment (c++) is applied.



            According to GCC C++: Operators




            In GCC C++, the precedence of the operators controls the order in
            which the individual operators are evaluated




            The equivalent code in defined-behavior C++, as GCC understands it:



            #include <stdio.h>

            int main(int argc, char **argv)
            {
                int i = 0;
                // i = i++ + ++i;
                int r;
                r = i;
                i++;
                ++i;
                r += i;
                i = r;
                printf("%d\n", i); // 2

                i = 1;
                // i = (i++);
                r = i;
                i++;
                i = r;
                printf("%d\n", i); // 1

                volatile int u = 0;
                // u = u++ + ++u;
                r = u;
                u++;
                ++u;
                r += u;
                u = r;
                printf("%d\n", u); // 2

                u = 1;
                // u = (u++);
                r = u;
                u++;
                u = r;
                printf("%d\n", u); // 1

                register int v = 0;
                // v = v++ + ++v;
                r = v;
                v++;
                ++v;
                r += v;
                v = r;
                printf("%d\n", v); // 2
            }


            Then we go to Visual Studio. With Visual Studio 2015, you get:



            #include <stdio.h>

            int main(int argc, char **argv)
            {
                int i = 0;
                i = i++ + ++i;
                printf("%d\n", i); // 3

                i = 1;
                i = (i++);
                printf("%d\n", i); // 2

                volatile int u = 0;
                u = u++ + ++u;
                printf("%d\n", u); // 3

                u = 1;
                u = (u++);
                printf("%d\n", u); // 2

                register int v = 0;
                v = v++ + ++v;
                printf("%d\n", v); // 3
            }


            How does Visual Studio work? It takes another approach: it evaluates all pre-increment expressions in a first pass, then uses the variables' values in the operations in a second pass, assigns from RHS to LHS in a third pass, and in a last pass evaluates all the post-increment expressions.



            So the equivalent in defined-behavior C++, as Visual C++ understands it:



            #include <stdio.h>

            int main(int argc, char **argv)
            {
                int r;
                int i = 0;
                // i = i++ + ++i;
                ++i;
                r = i + i;
                i = r;
                i++;
                printf("%d\n", i); // 3

                i = 1;
                // i = (i++);
                r = i;
                i = r;
                i++;
                printf("%d\n", i); // 2

                volatile int u = 0;
                // u = u++ + ++u;
                ++u;
                r = u + u;
                u = r;
                u++;
                printf("%d\n", u); // 3

                u = 1;
                // u = (u++);
                r = u;
                u = r;
                u++;
                printf("%d\n", u); // 2

                register int v = 0;
                // v = v++ + ++v;
                ++v;
                r = v + v;
                v = r;
                v++;
                printf("%d\n", v); // 3
            }


            As the Visual Studio documentation states at Precedence and Order of Evaluation:




            Where several operators appear together, they have equal precedence and are evaluated according to their associativity. The operators in the table are described in the sections beginning with Postfix Operators.




























            • 1





              I've edited the question to add the UB in evaluation of function arguments, as this question is often used as a duplicate for that. (The last example)

              – Antti Haapala
              Oct 21 '17 at 10:46






            • 1





              Also the question is about c now, not C++

              – Antti Haapala
              Oct 21 '17 at 10:47



















            3














            Your question was probably not, "Why are these constructs undefined behavior in C?". Your question was probably, "Why did this code (using ++) not give me the value I expected?", and someone marked your question as a duplicate, and sent you here.



            This answer tries to answer that question: why did your code not give you the answer you expected, and how can you learn to recognize (and avoid) expressions that will not work as expected.



            I assume you've heard the basic definition of C's ++ and -- operators by now, and how the prefix form ++x differs from the postfix form x++. But these operators are hard to think about, so to make sure you understood, perhaps you wrote a tiny little test program involving something like



            int x = 5;
            printf("%d %d %d\n", x, ++x, x++);


            But, to your surprise, this program did not help you understand -- it printed some strange, unexpected, inexplicable output, suggesting that maybe ++ does something completely different, not at all what you thought it did.



            Or, perhaps you're looking at a hard-to-understand expression like



            int x = 5;
            x = x++ + ++x;
            printf("%d\n", x);


            Perhaps someone gave you that code as a puzzle. This code also makes no sense, especially if you run it -- and if you compile and run it under two different compilers, you're likely to get two different answers! What's up with that? Which answer is correct? (And the answer is that both of them are, or neither of them are.)



            As you've heard by now, all of these expressions are undefined, which means that the C language makes no guarantee about what they'll do. This is a strange and surprising result, because you probably thought that any program you could write, as long as it compiled and ran, would generate a unique, well-defined output. But in the case of undefined behavior, that's not so.



            What makes an expression undefined? Are expressions involving ++ and -- always undefined? Of course not: these are useful operators, and if you use them properly, they're perfectly well-defined.



            For the expressions we're talking about, what makes them undefined is that there's too much going on at once: we're not sure what order things will happen in, but the order matters to the result we get.



            Let's go back to the two examples I've used in this answer. When I wrote



            printf("%d %d %d\n", x, ++x, x++);


            the question is, before calling printf, does the compiler compute the value of x first, or x++, or maybe ++x? But it turns out we don't know. There's no rule in C which says that the arguments to a function get evaluated left-to-right, or right-to-left, or in some other order. So we can't say whether the compiler will do x first, then ++x, then x++, or x++ then ++x then x, or some other order. But the order clearly matters, because depending on which order the compiler uses, we'll clearly get different results printed by printf.



            What about this crazy expression?



            x = x++ + ++x;


            The problem with this expression is that it contains three different attempts to modify the value of x: (1) the x++ part tries to add 1 to x, store the new value in x, and return the old value of x; (2) the ++x part tries to add 1 to x, store the new value in x, and return the new value of x; and (3) the x = part tries to assign the sum of the other two back to x. Which of those three attempted assignments will "win"? Which of the three values will actually get assigned to x? Again, and perhaps surprisingly, there's no rule in C to tell us.



            You might imagine that precedence or associativity or left-to-right evaluation tells you what order things happen in, but they do not. You may not believe me, but please take my word for it, and I'll say it again: precedence and associativity do not determine every aspect of the evaluation order of an expression in C. In particular, if within one expression there are multiple different spots where we try to assign a new value to something like x, precedence and associativity do not tell us which of those attempts happens first, or last, or anything.





            So with all that background and introduction out of the way, if you want to make sure that all your programs are well-defined, which expressions can you write, and which ones can you not write?



            These expressions are all fine:



            y = x++;
            z = x++ + y++;
            x = x + 1;
            x = a[i++];
            x = a[i++] + b[j++];
            x[i++] = a[j++] + b[k++];
            x = *p++;
            x = *p++ + *q++;


            These expressions are all undefined:



            x = x++;
            x = x++ + ++x;
            y = x + x++;
            a[i] = i++;
            a[i++] = i;
            printf("%d %d %d\n", x, ++x, x++);


            And the last question is, how can you tell which expressions are well-defined, and which expressions are undefined?



            As I said earlier, the undefined expressions are the ones where there's too much going on at once, where you can't be sure what order things happen in, and where the order matters:




            1. If there's one variable that's getting modified (assigned to) in two or more different places, how do you know which modification happens first?

            2. If there's a variable that's getting modified in one place, and having its value used in another place, how do you know whether it uses the old value or the new value?


            As an example of #1, in the expression



            x = x++ + ++x;


            there are three attempts to modify x.



            As an example of #2, in the expression



            y = x + x++;


            we both use the value of x, and modify it.



            So that's the answer: make sure that in any expression you write, each variable is modified at most once, and if a variable is modified, you don't also attempt to use the value of that variable somewhere else.
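Applying that rule usually means deciding what you actually meant and then spelling it out in separate statements. As a sketch, here is one possible reading of y = x + x++; made well defined. The function name `sum_then_increment` is hypothetical, and "use the old value twice, then increment" is only an assumed intent.

```c
/* Hypothetical helper: one well-defined reading of `y = x + x++;`.
   The value of *x is only read in the first statement and only
   modified (once) in the second. */
int sum_then_increment(int *x)
{
    int y = *x + *x;  /* two reads, no modification */
    (*x)++;           /* the single modification, in its own statement */
    return y;
}
```

If the intent had been to add the incremented value instead, the increment would simply move before the sum. Either way the choice is made by the programmer and not left to the compiler.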





































              14 Answers

              533





              +500









              C has the concept of undefined behavior, i.e. some language constructs are syntactically valid but you can't predict the behavior when the code is run.



              As far as I know, the standard doesn't explicitly say why the concept of undefined behavior exists. In my mind, it's simply because the language designers wanted there to be some leeway in the semantics. Instead of, e.g., requiring that all implementations handle integer overflow in the exact same way, which would very likely impose serious performance costs, they just left the behavior undefined, so that if you write code that causes integer overflow, anything can happen.



              So, with that in mind, why are these "issues"? The language clearly says that certain things lead to undefined behavior. There is no problem, there is no "should" involved. If the undefined behavior changes when one of the involved variables is declared volatile, that doesn't prove or change anything. It is undefined; you cannot reason about the behavior.



              Your most interesting-looking example, the one with



              u = (u++);


              is a text-book example of undefined behavior (see Wikipedia's entry on sequence points).



























              • 37





                I knew it was undefined (the idea of seeing this code in production frightens me :)), but I tried to understand what the reason for these results was. Especially why u = u++ incremented u. In Java, for example, u = u++ returns 0 as (my brain) expected :) Thanks for the sequence points links BTW.

                – PiX
                Jun 4 '09 at 9:42






              • 8





                @PiX: Things are undefined for a number of possible reasons. These include: there is no clear "right result", different machine architectures would strongly favour different results, existing practice is not consistent, or beyond the scope of the standard (e.g. what filenames are valid).

                – Richard
                Jun 4 '09 at 10:57






              • 35





                The spirit of C: Trust the programmer... no matter how insane he is.

                – Fiddling Bits
                Nov 26 '13 at 2:48






              • 4





                @rusty Not sure what you mean. The term "undefined behavior" is used in the C standard. It means that even though some constructs are syntactically valid and will typically compile, they lead to undefined behavior, i.e. they do not make sense and should be avoided since your program is broken if it has undefined behavior.

                – unwind
                Mar 22 '14 at 20:01






              • 6





                @MattMcNabb that is only well defined in C++11 not in C11.

                – Shafik Yaghmour
                Jul 14 '14 at 1:18
















              edited Dec 16 '15 at 9:38 by DaveRandom

              answered Jun 4 '09 at 9:20 by unwind








              • 37





                I knew it was undefined (the idea of seeing this code in production frightens me :)), but I wanted to understand the reason for these results, especially why u = u++ incremented u. In Java, for example, u = u++ returns 0, as (my brain) expected :) Thanks for the sequence points links, BTW.

                – PiX
                Jun 4 '09 at 9:42






              • 8





                @PiX: Things are undefined for a number of possible reasons. These include: there is no clear "right result", different machine architectures would strongly favour different results, existing practice is not consistent, or beyond the scope of the standard (e.g. what filenames are valid).

                – Richard
                Jun 4 '09 at 10:57






              • 35





                The spirit of C: Trust the programmer... no matter how insane he is.

                – Fiddling Bits
                Nov 26 '13 at 2:48






              • 4





                @rusty Not sure what you mean. The term "undefined behavior" is used in the C standard. It means that even though some constructs are syntactically valid and will typically compile, they lead to undefined behavior, i.e. they do not make sense and should be avoided, since your program is broken if it has undefined behavior.

                – unwind
                Mar 22 '14 at 20:01






              • 6





                @MattMcNabb that is only well defined in C++11 not in C11.

                – Shafik Yaghmour
                Jul 14 '14 at 1:18



































              76







              Just compile and disassemble your line of code, if you are so inclined to know how exactly it is you get what you are getting.



              This is what I get on my machine, together with what I think is going on:



              $ cat evil.c
              void evil(){
                int i = 0;
                i += i++ + ++i;
              }
              $ gcc evil.c -c -o evil.bin
              $ gdb evil.bin
              (gdb) disassemble evil
              Dump of assembler code for function evil:
              0x00000000 <+0>:  push %ebp
              0x00000001 <+1>:  mov  %esp,%ebp
              0x00000003 <+3>:  sub  $0x10,%esp
              0x00000006 <+6>:  movl $0x0,-0x4(%ebp)   // i = 0    i = 0
              0x0000000d <+13>: addl $0x1,-0x4(%ebp)   // i++      i = 1
              0x00000011 <+17>: mov  -0x4(%ebp),%eax   // j = i    i = 1  j = 1
              0x00000014 <+20>: add  %eax,%eax         // j += j   i = 1  j = 2
              0x00000016 <+22>: add  %eax,-0x4(%ebp)   // i += j   i = 3
              0x00000019 <+25>: addl $0x1,-0x4(%ebp)   // i++      i = 4
              0x0000001d <+29>: leave
              0x0000001e <+30>: ret
              End of assembler dump.


              (I... suppose that the 0x00000014 instruction was some kind of compiler optimization?)
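Read back as C, the instruction sequence above amounts to roughly the following. This is only a sketch of what this one build did: `evil_as_compiled` and the temporary `j` are names made up here (`j` stands in for `%eax`), and the function returns `i` purely so the result can be inspected. Another compiler, version, or optimization level is free to produce something else entirely.

```c
/* Hypothetical C rendering of the disassembly; not generated by any tool. */
int evil_as_compiled(void)
{
    int i = 0;      /* +6:  movl $0x0,-0x4(%ebp)                        */
    i = i + 1;      /* +13: one of the two increments, applied up front */
    int j = i;      /* +17: load i into the register                    */
    j = j + j;      /* +20: the single load is doubled, standing in for */
                    /*      both operands of i++ + ++i                  */
    i = i + j;      /* +22: the += itself                               */
    i = i + 1;      /* +25: the other increment, applied at the end     */
    return i;
}
```

Calling `evil_as_compiled()` returns 4, matching the last annotation in the dump. The doubling at `+20` looks like the compiler reusing one load of `i` for both operands, which it is allowed to do precisely because the standard places no requirements on this expression.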














              edited Jul 1 '14 at 23:21

              answered May 24 '10 at 13:26 by badp













              • How do I get the machine code? I use Dev C++, and I played around with the 'Code Generation' option in the compiler settings, but got no extra file output or any console output.

                – bad_keypoints
                Sep 24 '12 at 14:11






              • 4





                @ronnieaka gcc evil.c -c -o evil.bin, then gdb evil.bin followed by disassemble evil at the gdb prompt, or whatever the Windows equivalents of those are :)

                – badp
                Sep 24 '12 at 18:20








              • 17





                This answer does not really address the question of why these constructs are undefined behavior.

                – Shafik Yaghmour
                Jul 1 '14 at 14:00






              • 8





                As an aside, it'll be easier to compile to assembly (with gcc -S evil.c), which is all that's needed here. Assembling then disassembling it is just a roundabout way of doing it.

                – Kat
                Jul 27 '15 at 20:32








              • 39





                For the record, if for whatever reason you're wondering what a given construct does -- and especially if there's any suspicion that it might be undefined behavior -- the age-old advice of "just try it with your compiler and see" is potentially quite perilous. You will learn, at best, what it does under this version of your compiler, under these circumstances, today. You will not learn much if anything about what it's guaranteed to do. In general, "just try it with your compiler" leads to nonportable programs that work only with your compiler.

                – Steve Summit
                Feb 16 '16 at 21:26










































              56







              I think the relevant parts of the C99 standard are 6.5 Expressions, §2:




              Between the previous and next sequence point an object shall have its stored value
              modified at most once by the evaluation of an expression. Furthermore, the prior value
              shall be read only to determine the value to be stored.




              and 6.5.16 Assignment operators, §4:




              The order of evaluation of the operands is unspecified. If an attempt is made to modify
              the result of an assignment operator or to access it after the next sequence point, the
              behavior is undefined.
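To make the first rule concrete, here is a small sketch that rewrites two of the expressions from the question so that each object is modified at most once between sequence points and is otherwise only read. The function names `store_then_advance` and `format_steps` are my own, chosen for illustration, and writing into a buffer instead of calling `printf` directly is just a convenience so the result can be checked.

```c
#include <stdio.h>

/* Rewrite of `x[y] = y++;`: y is read for the index and the stored value,
   then modified in a separate full expression. */
void store_then_advance(int x[], int *y)
{
    x[*y] = *y;     /* y is only read here */
    (*y)++;         /* the single modification of y, after the ';' */
}

/* Rewrite of `printf("%d %d %d\n", w++, ++w, w);`: each increment gets
   its own statement, so the order is fixed by the code. */
int format_steps(char *buf, size_t size, int w)
{
    int first = w++;    /* old value of w, then w is incremented */
    int second = ++w;   /* w is incremented, then read */
    int third = w;      /* plain read */
    return snprintf(buf, size, "%d %d %d", first, second, third);
}
```

With `x = {5, 8}` and `y = 0`, `store_then_advance` leaves `x` as `{0, 8}` and `y` as 1, and `format_steps` with `w = 0` produces `0 2 2`. Those are results the question guessed at, but here they follow from the code rather than from a compiler's choices.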















              edited Jun 4 '09 at 9:42

              answered Jun 4 '09 at 9:35 by Christoph








              • 2





                Would the above imply that 'i=i=5;" would be Undefined Behavior?

                – supercat
                Nov 20 '11 at 21:41






              • 1





                @supercat as far as I know i=i=5 is also undefined behavior

                – dhein
                Sep 23 '13 at 15:39






              • 2





                @Zaibis: The rationale I like to use for most places where the rule applies is that, in theory, a multi-processor platform could implement something like A=B=5; as "Write-lock A; Write-lock B; Store 5 to A; Store 5 to B; Unlock B; Unlock A;", and a statement like C=A+B; as "Read-lock A; Read-lock B; Compute A+B; Unlock A and B; Write-lock C; Store result; Unlock C;". That would ensure that if one thread did A=B=5; while another did C=A+B;, the latter thread would either see both writes as having taken place or neither. Potentially a useful guarantee. If one thread did I=I=5;, however, ...

                – supercat
                Sep 23 '13 at 16:18






              • 1





                ... and the compiler didn't notice that both writes were to the same location (if one or both lvalues involve pointers, that may be hard to determine), the generated code could deadlock. I don't think any real-world implementations implement such locking as part of their normal behavior, but it would be permissible under the standard, and if hardware could implement such behaviors cheaply it might be useful. On today's hardware such behavior would be way too expensive to implement as a default, but that doesn't mean it would always be thus.

                – supercat
                Sep 23 '13 at 16:19








              • 1





                @supercat But wouldn't the sequence point access rule of C99 alone be enough to declare it undefined behavior? So it doesn't matter what the hardware could technically implement?

                – dhein
                Sep 23 '13 at 16:40

























              48














              The behavior can't really be explained because it invokes both unspecified behavior and undefined behavior, so we cannot make any general predictions about this code. If you read Olve Maudal's work, such as Deep C and Unspecified and Undefined, you can sometimes make good guesses in very specific cases with a specific compiler and environment, but please don't do that anywhere near production.



              Moving on to unspecified behavior, the draft C99 standard, section 6.5 paragraph 3, says:

              The grouping of operators and operands is indicated by the syntax. Except as specified later (for the function-call (), &&, ||, ?:, and comma operators), the order of evaluation of subexpressions and the order in which side effects take place are both unspecified.




              So when we have a line like this:



              i = i++ + ++i;


              we do not know whether i++ or ++i will be evaluated first. This is mainly to give the compiler better options for optimization.



              We also have undefined behavior here, since the program modifies variables (i, u, etc.) more than once between sequence points. From the draft standard, section 6.5 paragraph 2:

              Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.




              It cites the following code examples as being undefined:



              i = ++i + 1;
              a[i++] = i;


              In all these examples the code attempts to modify an object more than once between two consecutive sequence points; the next sequence point is at the ; in each of these cases:

              i = i++ + ++i;
              ^   ^       ^

              i = (i++);
              ^    ^

              u = u++ + ++u;
              ^   ^       ^

              u = (u++);
              ^    ^

              v = v++ + ++v;
              ^   ^       ^


              Unspecified behavior is defined in the draft C99 standard, section 3.4.4, as:




              use of an unspecified value, or other behavior where this International Standard provides
              two or more possibilities and imposes no further requirements on which is chosen in any
              instance




              and undefined behavior is defined in section 3.4.3 as:




              behavior, upon use of a nonportable or erroneous program construct or of erroneous data,
              for which this International Standard imposes no requirements




              and notes that:




              Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).













































                  edited May 23 '17 at 12:34
                  Community

                  answered Aug 15 '13 at 19:25
                  Shafik Yaghmour























                      47














                      Most of the answers here quote the C standard, emphasizing that the behavior of these constructs is undefined. To understand why the behavior of these constructs is undefined, let's first understand these terms in the light of the C11 standard:



                      Sequenced: (5.1.2.3)




                      Given any two evaluations A and B, if A is sequenced before B, then the execution of A shall precede the execution of B.




                      Unsequenced:




                      If A is not sequenced before or after B, then A and B are unsequenced.




                      Evaluations can be one of two things:





                      • value computations, which work out the result of an expression; and


                      • side effects, which are modifications of objects.


                      Sequence Point:




                      The presence of a sequence point between the evaluation of expressions A and B implies that every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B.




                      Now, coming to the question: for expressions like



                      int i = 1;
                      i = i++;


                      the standard says that:



                      6.5 Expressions:




                      If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. [...]




                      Therefore, the above expression invokes UB because two side effects on the same object i are unsequenced relative to each other. That is, it is not specified whether the side effect of the assignment to i happens before or after the side effect of ++.

                      Depending on whether the assignment occurs before or after the increment, different results will be produced, and that is one case of undefined behavior.



                      Let's rename the i on the left of the assignment to il and the i on the right of the assignment (in the expression i++) to ir. The expression then looks like

                      il = ir++     // Note that the suffixes l and r are used for the sake of clarity.
                                    // Both il and ir represent the same object.


                      An important point regarding the postfix ++ operator is that:




                      just because the ++ comes after the variable does not mean that the increment happens late. The increment can happen as early as the compiler likes as long as the compiler ensures that the original value is used.




                      It means the expression il = ir++ could be evaluated either as



                      temp = ir;      // i = 1
                      ir = ir + 1;    // i = 2   side effect by ++ before assignment
                      il = temp;      // i = 1   result is 1


                      or



                      temp = ir;      // i = 1
                      il = temp;      // i = 1   side effect by assignment before ++
                      ir = ir + 1;    // i = 2   result is 2


                      This gives two different results, 1 and 2, depending on the order of the side effects of the assignment and of ++, and hence the expression invokes UB.




































                        47














                        Most of the answers here quoted from C standard emphasizing that the behavior of these constructs are undefined. To understand why the behavior of these constructs are undefined, let's understand these terms first in the light of C11 standard:



                        Sequenced: (5.1.2.3)




                        Given any two evaluations A and B, if A is sequenced before B, then the execution of A shall precede the execution of B.




                        Unsequenced:




                        If A is not sequenced before or after B, then A and B are unsequenced.




                        Evaluations can be one of two things:





                        • value computations, which work out the result of an expression; and


                        • side effects, which are modifications of objects.


                        Sequence Point:




                        The presence of a sequence point between the evaluation of expressions A and B implies that every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B.




                        Now coming to the question, for the expressions like



                        int i = 1;
                        i = i++;


                        standard says that:



                        6.5 Expressions:




                        If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. [...]




                        Therefore, the above expression invokes UB because two side effects on the same object i is unsequenced relative to each other. That means it is not sequenced whether the side effect by assignment to i will be done before or after the side effect by ++.

                        Depending on whether assignment occurs before or after the increment, different results will be produced and that's the one of the case of undefined behavior.



                        Lets rename the i at left of assignment be il and at the right of assignment (in the expression i++) be ir, then the expression be like



                        il = ir++     // Note that suffix l and r are used for the sake of clarity.
                        // Both il and ir represents the same object.


                        An important point regarding Postfix ++ operator is that:




                        just because the ++ comes after the variable does not mean that the increment happens late. The increment can happen as early as the compiler likes as long as the compiler ensures that the original value is used.




                        It means the expression il = ir++ could be evaluated either as



                        temp = ir;      // i = 1
                        ir = ir + 1; // i = 2 side effect by ++ before assignment
                        il = temp; // i = 1 result is 1


                        or



                        temp = ir;      // i = 1
                        il = temp; // i = 1 side effect by assignment before ++
                        ir = ir + 1; // i = 2 result is 2


                        resulting in two different results 1 and 2 which depends on the sequence of side effects by assignment and ++ and hence invokes UB.






                        share|improve this answer




























                          47












                          47








                          47







                          Most of the answers here quoted from C standard emphasizing that the behavior of these constructs are undefined. To understand why the behavior of these constructs are undefined, let's understand these terms first in the light of C11 standard:



                          Sequenced: (5.1.2.3)




                          Given any two evaluations A and B, if A is sequenced before B, then the execution of A shall precede the execution of B.




                          Unsequenced:




                          If A is not sequenced before or after B, then A and B are unsequenced.




                          Evaluations can be one of two things:





                          • value computations, which work out the result of an expression; and


                          • side effects, which are modifications of objects.


                          Sequence Point:




                          The presence of a sequence point between the evaluation of expressions A and B implies that every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B.




                          Now coming to the question, for the expressions like



                          int i = 1;
                          i = i++;


                          standard says that:



                          6.5 Expressions:




                          If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. [...]




                          Therefore, the above expression invokes UB because two side effects on the same object i is unsequenced relative to each other. That means it is not sequenced whether the side effect by assignment to i will be done before or after the side effect by ++.

                          Depending on whether assignment occurs before or after the increment, different results will be produced and that's the one of the case of undefined behavior.



                          Lets rename the i at left of assignment be il and at the right of assignment (in the expression i++) be ir, then the expression be like



                          il = ir++     // Note that suffix l and r are used for the sake of clarity.
                          // Both il and ir represents the same object.


                          An important point regarding Postfix ++ operator is that:




                          just because the ++ comes after the variable does not mean that the increment happens late. The increment can happen as early as the compiler likes as long as the compiler ensures that the original value is used.




                          It means the expression il = ir++ could be evaluated either as



                          temp = ir;      // i = 1
                          ir = ir + 1; // i = 2 side effect by ++ before assignment
                          il = temp; // i = 1 result is 1


                          or



                          temp = ir;      // i = 1
                          il = temp; // i = 1 side effect by assignment before ++
                          ir = ir + 1; // i = 2 result is 2


                          resulting in two different results 1 and 2 which depends on the sequence of side effects by assignment and ++ and hence invokes UB.






                          share|improve this answer















                          Most of the answers here quoted from C standard emphasizing that the behavior of these constructs are undefined. To understand why the behavior of these constructs are undefined, let's understand these terms first in the light of C11 standard:



                          Sequenced: (5.1.2.3)




                          Given any two evaluations A and B, if A is sequenced before B, then the execution of A shall precede the execution of B.




                          Unsequenced:




                          If A is not sequenced before or after B, then A and B are unsequenced.




                          Evaluations can be one of two things:





                          • value computations, which work out the result of an expression; and


                          • side effects, which are modifications of objects.


                          Sequence Point:




                          The presence of a sequence point between the evaluation of expressions A and B implies that every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B.




                          Now coming to the question, for the expressions like



                          int i = 1;
                          i = i++;


                          standard says that:



                          6.5 Expressions:




                          If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. [...]




                          Therefore, the above expression invokes UB because two side effects on the same object i is unsequenced relative to each other. That means it is not sequenced whether the side effect by assignment to i will be done before or after the side effect by ++.

                          Depending on whether assignment occurs before or after the increment, different results will be produced and that's the one of the case of undefined behavior.



                          Lets rename the i at left of assignment be il and at the right of assignment (in the expression i++) be ir, then the expression be like



                          il = ir++     // Note that suffix l and r are used for the sake of clarity.
                          // Both il and ir represents the same object.


                          An important point regarding Postfix ++ operator is that:




                          just because the ++ comes after the variable does not mean that the increment happens late. The increment can happen as early as the compiler likes as long as the compiler ensures that the original value is used.




                          It means the expression il = ir++ could be evaluated either as



                          temp = ir;      // i = 1
                          ir = ir + 1; // i = 2 side effect by ++ before assignment
                          il = temp; // i = 1 result is 1


                          or



                          temp = ir;      // i = 1
                          il = temp; // i = 1 side effect by assignment before ++
                          ir = ir + 1; // i = 2 result is 2


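                          These two orderings give two different results, 1 and 2, depending on the order in which the side effects of the assignment and of ++ take place. Because the standard leaves that order unsequenced, the expression invokes UB, and a compiler is not even obliged to produce one of these two results.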
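The two orderings can be written out as ordinary, well-defined C by giving each side effect its own statement. This is only a sketch of the reasoning above; the function names are invented for illustration, and a real compiler is free to do something else entirely with i = i++.

```c
/* Ordering 1: the side effect of ++ lands before the side effect of =. */
int increment_then_assign(int i)
{
    int temp = i;   /* value computation of i++ */
    i = i + 1;      /* side effect of ++ */
    i = temp;       /* side effect of the assignment */
    return i;
}

/* Ordering 2: the side effect of = lands before the side effect of ++. */
int assign_then_increment(int i)
{
    int temp = i;   /* value computation of i++ */
    i = temp;       /* side effect of the assignment */
    i = i + 1;      /* side effect of ++ */
    return i;
}
```

Called with 1, the first function returns 1 and the second returns 2, matching the two traces shown above.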















                          edited May 23 '17 at 12:18 by Community

                          answered Jun 27 '15 at 0:27 by haccks























                              30














                              Another way of answering this, rather than getting bogged down in arcane details of sequence points and undefined behavior, is simply to ask, what are they supposed to mean? What was the programmer trying to do?



                              The first fragment asked about, i = i++ + ++i, is pretty clearly insane in my book. No one would ever write it in a real program, it's not obvious what it does, there's no conceivable algorithm someone could have been trying to code that would have resulted in this particular contrived sequence of operations. And since it's not obvious to you and me what it's supposed to do, it's fine in my book if the compiler can't figure out what it's supposed to do, either.



                              The second fragment, i = i++, is a little easier to understand. Someone is clearly trying to increment i, and assign the result back to i. But there are a couple ways of doing this in C. The most basic way to add 1 to i, and assign the result back to i, is the same in almost any programming language:



                              i = i + 1


                              C, of course, has a handy shortcut:



                              i++


                              This means, "add 1 to i, and assign the result back to i". So if we construct a hodgepodge of the two, by writing



                              i = i++


                              what we're really saying is "add 1 to i, and assign the result back to i, and assign the result back to i". We're confused, so it doesn't bother me too much if the compiler gets confused, too.
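For comparison, here is a small sketch of the unambiguous ways to say "add 1 to i". Each statement modifies i exactly once. The function names are hypothetical and exist only so the variants can be compared side by side.

```c
/* Every variant performs a single modification of i per statement. */
int add_one_plain(int i)    { i = i + 1; return i; }
int add_one_postfix(int i)  { i++;       return i; }
int add_one_prefix(int i)   { ++i;       return i; }
int add_one_compound(int i) { i += 1;    return i; }
```

All four return the same value for the same argument, so there is never a reason to combine them into i = i++.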



                              Realistically, the only time these crazy expressions get written is when people are using them as artificial examples of how ++ is supposed to work. And of course it is important to understand how ++ works. But one practical rule for using ++ is, "If it's not obvious what an expression using ++ means, don't write it."



                              We used to spend countless hours on comp.lang.c discussing expressions like these and why they're undefined. Two of my longer answers, that try to really explain why, are archived on the web:




                              • Why doesn't the Standard define what these do?

                              • Doesn't operator precedence determine the order of evaluation?



























                              • 1





                                A rather nasty gotcha with regard to Undefined Behavior is that while it used to be safe on 99.9% of compilers to use *p=(*q)++; to mean if (p!=q) *p=(*q)++; else *p= __ARBITRARY_VALUE; that is no longer the case. Hyper-modern C would require writing something like the latter formulation (though there's no standard way of indicating code doesn't care what's in *p) to achieve the level of efficiency compilers used to provide with the former (the else clause is necessary in order to let the compiler optimize out the if which some newer compilers would require).

                                – supercat
                                Jun 30 '15 at 16:14








                              • 1





                                I've seen at least 5 similar questions about this ++ and -- madness in the last week or so. These seem to be some professors' favorite topic to puzzle their students.

                                – artm
                                Feb 8 '16 at 7:49
























                              edited May 26 '18 at 21:27

                              answered Jun 18 '15 at 11:55 by Steve Summit



















                              22














                              While it is unlikely that any compilers and processors would actually do so, it would be legal, under the C standard, for the compiler to implement "i++" with the sequence:



                              In a single operation, read `i` and lock it to prevent access until further notice
                              Compute (1+read_value)
                              In a single operation, unlock `i` and store the computed value


                              While I don't think any processors support the hardware to allow such a thing to be done efficiently, one can easily imagine situations where such behavior would make multi-threaded code easier (e.g. it would guarantee that if two threads try to perform the above sequence simultaneously, i would get incremented by two) and it's not totally inconceivable that some future processor might provide a feature something like that.



                              If the compiler were to write i++ as indicated above (legal under the standard) and were to intersperse the above instructions throughout the evaluation of the overall expression (also legal), and if it didn't happen to notice that one of the other instructions happened to access i, it would be possible (and legal) for the compiler to generate a sequence of instructions that would deadlock. To be sure, a compiler would almost certainly detect the problem in the case where the same variable i is used in both places, but if a routine accepts references to two pointers p and q, and uses (*p) and (*q) in the above expression (rather than using i twice) the compiler would not be required to recognize or avoid the deadlock that would occur if the same object's address were passed for both p and q.
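A routine that takes two pointers can sidestep the aliasing problem by giving each modification its own statement. The sketch below assumes the intent is "store the old value of *q in *p, then increment *q"; the name store_old_and_bump and the choice of what happens when both pointers refer to the same object are assumptions made for this example.

```c
/*
 * Hypothetical helper: stores the old value of *q in *p and increments *q.
 * Each object is modified in a separate statement, so the behavior is
 * defined even when p and q point to the same int.
 */
void store_old_and_bump(int *p, int *q)
{
    int old = *q;   /* read the value once */
    *q = old + 1;   /* first modification, complete at the ';' */
    *p = old;       /* second modification, sequenced after the first */
}
```

When p and q alias, the second store wins and the object keeps its old value. That outcome is a design choice in this sketch, not something the original expression guarantees.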














                                  edited Oct 23 '17 at 20:06

                                  answered Dec 5 '12 at 18:30 by supercat























                                      21














                                      Often this question is linked as a duplicate of questions related to code like



                                      printf("%d %d\n", i, i++);


                                      or



                                      printf("%d %d\n", ++i, i++);


                                      or similar variants.



                                      While this is also undefined behaviour, as already stated, there are subtle differences between a printf() call like these and a statement such as:



                                         x = i++ + i++;




                                      In the following statement:



                                      printf("%d %d\n", ++i, i++);


                                      the order of evaluation of the arguments to printf() is unspecified. That means the expressions i++ and ++i could be evaluated in any order. The C11 standard has some relevant descriptions of this:



                                      Annex J, unspecified behaviours




                                      The order in which the function designator, arguments, and
                                      subexpressions within the arguments are evaluated in a function call
                                      (6.5.2.2).




                                      3.4.4, unspecified behavior




                                      Use of an unspecified value, or other behavior where this
                                      International Standard provides two or more possibilities and imposes
                                      no further requirements on which is chosen in any instance.



                                      EXAMPLE An example of unspecified behavior is the order in which the
                                      arguments to a function are evaluated.




                                      The unspecified behaviour itself is NOT an issue. Consider this example:



                                      printf("%d %d\n", ++x, y++);


                                      This too has unspecified behaviour, because the order of evaluation of ++x and y++ is unspecified. But it is a perfectly legal and valid statement. There is no undefined behaviour in it, because the modifications (++x and y++) are made to distinct objects.



                                      What renders the following statement



                                      printf("%d %d\n", ++i, i++);


                                      as undefined behaviour is the fact that these two expressions modify the same object i without an intervening sequence point.
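One way to make such a call well defined is to perform each modification in its own statement and pass only plain values to the formatting function. The sketch below assumes the author wanted ++i evaluated before i++. The function name format_increments is hypothetical, and it writes into a buffer with snprintf instead of calling printf so the result is easy to inspect.

```c
#include <stdio.h>

/*
 * Hypothetical helper: writes "<value of ++i> <value of i++>\n" into buf
 * and returns the final value of i. The left-to-right order is an
 * assumption about what the original call was meant to do.
 */
int format_increments(char *buf, size_t size, int i)
{
    int pre  = ++i;   /* first modification, complete at the ';' */
    int post = i++;   /* second modification, sequenced after the first */

    snprintf(buf, size, "%d %d\n", pre, post);
    return i;
}
```

The arguments passed to snprintf no longer have side effects, so the unspecified order in which they are evaluated cannot change the output.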





                                      Another detail is that the comma involved in the printf() call is a separator, not the comma operator.



                                      This is an important distinction because the comma operator does introduce a sequence point between the evaluations of its operands, which makes the following legal:



                                      int i = 5;
                                      int j;

                                      j = (++i, i++); // No undefined behaviour here because the comma operator
                                      // introduces a sequence point between '++i' and 'i++'

                                      printf("i=%d j=%d\n", i, j); // prints: i=7 j=6


                                      The comma operator evaluates its operands left to right and yields only the value of the last operand. So in j = (++i, i++);, ++i increments i to 6, and i++ yields the old value of i (6), which is assigned to j. Then i becomes 7 due to the post-increment.



                                      So if the comma in the function call were a comma operator, then



                                      printf("%d %d\n", ++i, i++);


                                      would not be a problem. But it invokes undefined behaviour because the comma here is a separator.





                                      Those who are new to undefined behaviour would benefit from reading What Every C Programmer Should Know About Undefined Behavior to understand the concept and many other variants of undefined behaviour in C.



                                      This post: Undefined, unspecified and implementation-defined behavior is also relevant.
































                                      • This sequence int a = 10, b = 20, c = 30; printf("a=%d b=%d c=%d\n", (a = a + b + c), (b = b + b), (c = c + c)); appears to give stable behavior (right-to-left argument evaluation in gcc v7.3.0; result "a=110 b=40 c=60"). Is it because the assignments are considered 'full statements' and thus introduce a sequence point? Shouldn't that result in left-to-right argument/statement evaluation? Or is it just a manifestation of undefined behavior?

                                        – kavadias
                                        Oct 17 '18 at 20:20











                                      • @kavadias That printf statement involves undefined behaviour, for the same reason explained above. You are writing b and c in 3rd & 4th arguments respectively and reading in 2nd argument. But there's no sequence between these expressions (2nd, 3rd, & 4th args). gcc/clang has an option -Wsequence-point which can help find these, too.

                                        – P.P.
                                        Oct 18 '18 at 8:40



















                                      printf("%d %dn", ++x, y++);


                                      This too has unspecified behaviour because the order of evaluation of ++x and y++ is unspecified. But it's perfectly legal and valid statement. There's no undefined behaviour in this statement. Because the modifications (++x and y++) are done to distinct objects.



                                      What renders the following statement



                                      printf("%d %dn", ++i, i++);


                                      as undefined behaviour is the fact that these two expressions modify the same object i without an intervening sequence point.





                                      Another detail is that the comma involved in the printf() call is a separator, not the comma operator.



                                      This is an important distinction because the comma operator does introduce a sequence point between the evaluation of their operands, which makes the following legal:



                                      int i = 5;
                                      int j;

                                      j = (++i, i++); // No undefined behaviour here because the comma operator
                                      // introduces a sequence point between '++i' and 'i++'

                                      printf("i=%d j=%dn",i, j); // prints: i=7 j=6


                                      The comma operator evaluates its operands left-to-right and yields only the value of the last operand. So in j = (++i, i++);, ++i increments i to 6 and i++ yields old value of i (6) which is assigned to j. Then i becomes 7 due to post-increment.



                                      So if the comma in the function call were to be a comma operator then



                                      printf("%d %dn", ++i, i++);


                                      will not be a problem. But it invokes undefined behaviour because the comma here is a separator.





                                      For those who are new to undefined behaviour would benefit from reading What Every C Programmer Should Know About Undefined Behavior to understand the concept and many other variants of undefined behaviour in C.



                                      This post: Undefined, unspecified and implementation-defined behavior is also relevant.






                                      share|improve this answer


























                                      • This sequence int a = 10, b = 20, c = 30; printf("a=%d b=%d c=%dn", (a = a + b + c), (b = b + b), (c = c + c)); appears to give stable behavior (right-to-left argument evaluation in gcc v7.3.0; result "a=110 b=40 c=60"). Is it because the assignments are considered as 'full-statements' and thus introduce a sequence point? Shouldn't that result in left-to-right argument/statement evaluation? Or, is it just manifestation of undefined behavior?

                                        – kavadias
                                        Oct 17 '18 at 20:20











                                      • @kavadias That printf statement involves undefined behaviour, for the same reason explained above. You are writing b and c in 3rd & 4th arguments respectively and reading in 2nd argument. But there's no sequence between these expressions (2nd, 3rd, & 4th args). gcc/clang has an option -Wsequence-point which can help find these, too.

                                        – P.P.
                                        Oct 18 '18 at 8:40














                                      21












                                      21








                                      21







                                      Often this question is linked as a duplicate of questions related to code like



                                      printf("%d %dn", i, i++);


                                      or



                                      printf("%d %dn", ++i, i++);


                                      or similar variants.



                                      While this is also undefined behaviour as stated already, there are subtle differences when printf() is involved when comparing to a statement such as:



                                         x = i++ + i++;




                                      In the following statement:



                                      printf("%d %dn", ++i, i++);


                                      the order of evaluation of arguments in printf() is unspecified. That means, expressions i++ and ++i could be evaluated in any order. C11 standard has some relevant descriptions on this:



                                      Annex J, unspecified behaviours




                                      The order in which the function designator, arguments, and
                                      subexpressions within the arguments are evaluated in a function call
                                      (6.5.2.2).




                                      3.4.4, unspecified behavior




                                      Use of an unspecified value, or other behavior where this
                                      International Standard provides two or more possibilities and imposes
                                      no further requirements on which is chosen in any instance.



                                      EXAMPLE An example of unspecified behavior is the order in which the
                                      arguments to a function are evaluated.




                                      The unspecified behaviour itself is NOT an issue. Consider this example:



                                      printf("%d %dn", ++x, y++);


                                      This too has unspecified behaviour because the order of evaluation of ++x and y++ is unspecified. But it's perfectly legal and valid statement. There's no undefined behaviour in this statement. Because the modifications (++x and y++) are done to distinct objects.



                                      What renders the following statement



                                      printf("%d %dn", ++i, i++);


                                      as undefined behaviour is the fact that these two expressions modify the same object i without an intervening sequence point.





                                      Another detail is that the comma involved in the printf() call is a separator, not the comma operator.



                                      This is an important distinction because the comma operator does introduce a sequence point between the evaluation of their operands, which makes the following legal:



                                      int i = 5;
                                      int j;

                                      j = (++i, i++); // No undefined behaviour here because the comma operator
                                      // introduces a sequence point between '++i' and 'i++'

                                      printf("i=%d j=%dn",i, j); // prints: i=7 j=6


                                      The comma operator evaluates its operands left-to-right and yields only the value of the last operand. So in j = (++i, i++);, ++i increments i to 6 and i++ yields old value of i (6) which is assigned to j. Then i becomes 7 due to post-increment.



                                      So if the comma in the function call were to be a comma operator then



                                      printf("%d %dn", ++i, i++);


                                      will not be a problem. But it invokes undefined behaviour because the comma here is a separator.





                                      For those who are new to undefined behaviour would benefit from reading What Every C Programmer Should Know About Undefined Behavior to understand the concept and many other variants of undefined behaviour in C.



                                      This post: Undefined, unspecified and implementation-defined behavior is also relevant.






                                      share|improve this answer















                                      Often this question is linked as a duplicate of questions related to code like



                                      printf("%d %dn", i, i++);


                                      or



                                      printf("%d %dn", ++i, i++);


                                      or similar variants.



                                      While this is also undefined behaviour as stated already, there are subtle differences when printf() is involved when comparing to a statement such as:



                                         x = i++ + i++;




                                      In the following statement:



                                      printf("%d %dn", ++i, i++);


                                      the order of evaluation of arguments in printf() is unspecified. That means, expressions i++ and ++i could be evaluated in any order. C11 standard has some relevant descriptions on this:



                                      Annex J, unspecified behaviours




                                      The order in which the function designator, arguments, and
                                      subexpressions within the arguments are evaluated in a function call
                                      (6.5.2.2).




                                      3.4.4, unspecified behavior




                                      Use of an unspecified value, or other behavior where this
                                      International Standard provides two or more possibilities and imposes
                                      no further requirements on which is chosen in any instance.



                                      EXAMPLE An example of unspecified behavior is the order in which the
                                      arguments to a function are evaluated.




                                      The unspecified behaviour itself is NOT an issue. Consider this example:



                                      printf("%d %dn", ++x, y++);


                                      This too has unspecified behaviour because the order of evaluation of ++x and y++ is unspecified. But it's perfectly legal and valid statement. There's no undefined behaviour in this statement. Because the modifications (++x and y++) are done to distinct objects.



                                      What renders the following statement



                                      printf("%d %dn", ++i, i++);


                                      as undefined behaviour is the fact that these two expressions modify the same object i without an intervening sequence point.





                                      Another detail is that the comma involved in the printf() call is a separator, not the comma operator.



                                      This is an important distinction because the comma operator does introduce a sequence point between the evaluation of their operands, which makes the following legal:



                                      int i = 5;
                                      int j;

                                      j = (++i, i++); // No undefined behaviour here because the comma operator
                                      // introduces a sequence point between '++i' and 'i++'

                                      printf("i=%d j=%dn",i, j); // prints: i=7 j=6


                                      The comma operator evaluates its operands left-to-right and yields only the value of the last operand. So in j = (++i, i++);, ++i increments i to 6 and i++ yields old value of i (6) which is assigned to j. Then i becomes 7 due to post-increment.



                                      So if the comma in the function call were to be a comma operator then



                                      printf("%d %dn", ++i, i++);


                                      will not be a problem. But it invokes undefined behaviour because the comma here is a separator.





                                      For those who are new to undefined behaviour would benefit from reading What Every C Programmer Should Know About Undefined Behavior to understand the concept and many other variants of undefined behaviour in C.



                                      This post: Undefined, unspecified and implementation-defined behavior is also relevant.







                                      share|improve this answer














                                      share|improve this answer



                                      share|improve this answer








                                      edited May 23 '17 at 11:55 by Community
                                      answered Dec 30 '15 at 20:26 by P.P.













                                      • This sequence int a = 10, b = 20, c = 30; printf("a=%d b=%d c=%d\n", (a = a + b + c), (b = b + b), (c = c + c)); appears to give stable behavior (right-to-left argument evaluation in gcc v7.3.0; result "a=110 b=40 c=60"). Is it because the assignments are considered as 'full-statements' and thus introduce a sequence point? Shouldn't that result in left-to-right argument/statement evaluation? Or, is it just manifestation of undefined behavior?

                                        – kavadias
                                        Oct 17 '18 at 20:20











                                      • @kavadias That printf statement involves undefined behaviour, for the same reason explained above. You are writing b and c in the 3rd & 4th arguments respectively and reading them in the 2nd argument. But there's no sequence point between these expressions (2nd, 3rd, & 4th args). gcc/clang has an option -Wsequence-point which can help find these, too.

                                        – P.P.
                                        Oct 18 '18 at 8:40
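For the call discussed in these comments, a defined variant could perform the assignments as separate statements before printing. This is only a sketch: the struct `abc` and the functions `update_abc` and `print_abc` are hypothetical names, and the left-to-right order is an assumption about what was intended.

```c
#include <stdio.h>

/* Hypothetical container for the three variables from the comment. */
struct abc { int a, b, c; };

/* Each assignment is its own full expression, so a sequence point
 * separates every write from the reads in the other statements. */
struct abc update_abc(struct abc v)
{
    v.a = v.a + v.b + v.c;  /* uses the original b and c */
    v.b = v.b + v.b;
    v.c = v.c + v.c;
    return v;
}

/* Prints plain values only; no argument has a side effect. */
void print_abc(struct abc v)
{
    printf("a=%d b=%d c=%d\n", v.a, v.b, v.c);
}
```

With the starting values 10, 20 and 30 this gives a=60, b=40 and c=60, unlike the a=110 observed with the unsequenced call.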






























                                      13














                                      The C standard says that an object may be modified at most once between two consecutive sequence points. The end of a full expression, such as the semicolon that ends an expression statement, is a sequence point.

                                      So every statement of the form:



                                      i = i++;
                                      i = i++ + ++i;


                                      and so on violates that rule. The standard says that the behaviour is undefined, not merely unspecified. Some compilers detect these constructs and still produce some result, but that result is not guaranteed by the standard.
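Defined alternatives usually just split the work into separate statements. A minimal sketch, using hypothetical helper names, of two plausible intentions behind a statement like i = i++:

```c
/* Intention 1: simply increment i. */
int increment(int i)
{
    i++;            /* a single modification of i in this full expression */
    return i;
}

/* Intention 2: keep the old value and increment separately. */
int old_value_then_increment(int *i)
{
    int old = *i;   /* read in one statement */
    *i = *i + 1;    /* modify in the next */
    return old;
}
```

Which of the two is wanted depends on the surrounding code. Either way, each object is modified only once per full expression.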



                                      However, two different variables can be incremented between two sequence points.



                                      while (*dst++ = *src++);


                                      The above is a common coding practice when copying/analysing strings.
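A minimal sketch of that string-copy idiom as a complete function. The name `copy_string` is hypothetical, and the sketch assumes the destination buffer is large enough and does not overlap the source:

```c
/* Hypothetical helper in the style of strcpy(): copies src, including the
 * terminating '\0', into dst and returns the start of dst. */
char *copy_string(char *dst, const char *src)
{
    char *start = dst;

    /* dst and src are distinct objects, so each is modified only once per
     * iteration; the loop stops after the '\0' has been copied. */
    while ((*dst++ = *src++) != '\0')
        ;

    return start;
}
```

The explicit comparison with '\0' and the empty statement on its own line do not change the behaviour. They only make the intent visible and avoid compiler warnings about an assignment used as a condition.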
















                                      answered Sep 11 '14 at 12:36 by Nikhil Vidhani













                                      • Of course it doesn't apply to different variables within one expression. It would be a total design failure if it did! All you need in the 2nd example is for both to be incremented between the statement ending and the next one beginning, and that's guaranteed, precisely because of the concept of sequence points at the centre of all this.

                                        – underscore_d
                                        Jul 19 '16 at 18:55


































                                      11














                                      While the syntax of expressions like a = a++ or a++ + a++ is legal, the behaviour of these constructs is undefined because a "shall" requirement of the C standard is not obeyed. C99 6.5p2:





                                      2. Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. [72] Furthermore, the prior value shall be read only to determine the value to be stored. [73]




                                      With footnote 73 further clarifying that






                                      73) This paragraph renders undefined statement expressions such as



                                        i = ++i + 1;
                                        a[i++] = i;


                                        while allowing



                                        i = i + 1;
                                        a[i] = i;
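A small sketch of the two permitted forms from the footnote, using a hypothetical helper `fill_next`. The names are illustrative, and the caller is assumed to guarantee that the incremented index is valid for the array:

```c
/* Hypothetical helper: advances *i and stores the new index into a[*i].
 * Assumes *i + 1 is a valid index of a. */
void fill_next(int a[], int *i)
{
    *i = *i + 1;   /* the prior value is read only to compute the stored value */
    a[*i] = *i;    /* *i is read but not modified in this full expression */
}
```

Each statement modifies one object once, and the index is never read in the same full expression that modifies it for an unrelated purpose.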





                                      The various sequence points are listed in Annex C of C11 (and C99):






                                      1. The following are the sequence points described in 5.1.2.3:




                                        • Between the evaluations of the function designator and actual arguments in a function call and the actual call. (6.5.2.2).

                                        • Between the evaluations of the first and second operands of the following operators: logical AND && (6.5.13); logical OR || (6.5.14); comma , (6.5.17).

                                        • Between the evaluations of the first operand of the conditional ? : operator and whichever of the second and third operands is evaluated (6.5.15).

                                        • The end of a full declarator: declarators (6.7.6);

                                        • Between the evaluation of a full expression and the next full expression to be evaluated. The following are full expressions: an initializer that is not part of a compound literal (6.7.9); the expression in an expression statement (6.8.3); the controlling expression of a selection statement (if or switch) (6.8.4); the controlling expression of a while or do statement (6.8.5); each of the (optional) expressions of a for statement (6.8.5.3); the (optional) expression in a return statement (6.8.6.4).

                                        • Immediately before a library function returns (7.1.4).

                                        • After the actions associated with each formatted input/output function conversion specifier (7.21.6, 7.29.2).

                                        • Immediately before and immediately after each call to a comparison function, and also between any call to a comparison function and any movement of the objects passed as arguments to that call (7.22.5).
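Two of the operator sequence points from this list can be illustrated with a short sketch. The function names are hypothetical. Each function modifies the same object twice, but a sequence point separates the two modifications, so the behaviour is defined:

```c
/* Logical AND: the left operand, including its side effect, is fully
 * evaluated before the right operand is considered. */
int and_demo(int *i)
{
    return (*i)++ && (*i)++;
}

/* Conditional operator: there is a sequence point after the first operand,
 * before whichever of the other two operands is evaluated. */
int conditional_demo(int *i)
{
    return (*i)++ ? (*i)++ : -1;
}
```

Starting from 1, both functions leave the variable at 3. Starting from 0, `and_demo` short-circuits, so only the first increment happens.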






                                      The wording of the same paragraph in C11 is:





                                      2. If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings. [84]






                                      You can detect such errors in a program by, for example, using a recent version of GCC with -Wall and -Werror; GCC will then outright refuse to compile your program. The following is the output of gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005:



                                      % gcc plusplus.c -Wall -Werror -pedantic
                                      plusplus.c: In function ‘main’:
                                      plusplus.c:6:6: error: operation on ‘i’ may be undefined [-Werror=sequence-point]
                                      i = i++ + ++i;
                                      ~~^~~~~~~~~~~
                                      plusplus.c:6:6: error: operation on ‘i’ may be undefined [-Werror=sequence-point]
                                      plusplus.c:10:6: error: operation on ‘i’ may be undefined [-Werror=sequence-point]
                                      i = (i++);
                                      ~~^~~~~~~
                                      plusplus.c:14:6: error: operation on ‘u’ may be undefined [-Werror=sequence-point]
                                      u = u++ + ++u;
                                      ~~^~~~~~~~~~~
                                      plusplus.c:14:6: error: operation on ‘u’ may be undefined [-Werror=sequence-point]
                                      plusplus.c:18:6: error: operation on ‘u’ may be undefined [-Werror=sequence-point]
                                      u = (u++);
                                      ~~^~~~~~~
                                      plusplus.c:22:6: error: operation on ‘v’ may be undefined [-Werror=sequence-point]
                                      v = v++ + ++v;
                                      ~~^~~~~~~~~~~
                                      plusplus.c:22:6: error: operation on ‘v’ may be undefined [-Werror=sequence-point]
                                      cc1: all warnings being treated as errors


                                      The important part is to know what a sequence point is, and which constructs introduce one and which do not. For example, the comma operator introduces a sequence point, so



                                      j = (i ++, ++ i);


                                      is well-defined. It increments i by one and yields the old value, which is discarded; at the comma operator the side effects are settled; then i is incremented by one again, and the resulting value becomes the value of the expression. In other words, this is just a contrived way to write j = (i += 2), which is in turn a "clever" way to write



                                      i += 2;
                                      j = i;


                                      However, the , in function argument lists is not a comma operator, and there is no sequence point between evaluations of distinct arguments; instead their evaluations are unsequenced with regard to each other; so the function call



                                      int i = 0;
                                      printf("%d %d\n", i++, ++i, i);


                                      has undefined behaviour because there is no sequence point between the evaluations of i++ and ++i in function arguments, and the value of i is therefore modified twice, by both i++ and ++i, between the previous and the next sequence point.






                                          edited Sep 7 '18 at 12:49

























                                          answered Mar 26 '17 at 14:58









                                          Antti Haapala
























                                              9














                                              In https://stackoverflow.com/questions/29505280/incrementing-array-index-in-c someone asked about a statement like:



                                              int k[] = {0,1,2,3,4,5,6,7,8,9,10};
                                              int i = 0;
                                              int num;
                                              num = k[++i+k[++i]] + k[++i];
                                              printf("%d", num);


                                              which prints 7... the OP expected it to print 6.



                                              The ++i increments aren't guaranteed to all complete before the rest of the calculation. In fact, different compilers may produce different results here. In that example, the first two ++i were executed, then the values of k were read, then the last ++i was executed, and then k was read again. Writing the reads and the increment as separate steps avoids the problem:



                                              num = k[i+1] + k[i+2] + k[i+3];
                                              i += 3;


                                              Modern compilers will optimize this very well, possibly better than the code you originally wrote (assuming it had worked the way you had hoped).






                                                  edited May 23 '17 at 11:47









                                                  Community











                                                  answered Apr 8 '15 at 3:20









                                                  TomOnTime
























                                                      5














                                                      A good explanation of what happens in this kind of computation is provided in document n1188 from the ISO WG14 site.



                                                      I will explain the main ideas here.



                                                      The main rule from the standard ISO 9899 that applies in this situation is 6.5p2.




                                                      Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.




                                                      The sequence points in an expression like i=i++ are before i= and after i++.



                                                      In the paper that I quoted above it is explained that you can figure out the program as being formed by small boxes, each box containing the instructions between 2 consecutive sequence points. The sequence points are defined in annex C of the standard, in the case of i=i++ there are 2 sequence points that delimit a full-expression. Such an expression is syntactically equivalent with an entry of expression-statement in the Backus-Naur form of the grammar (a grammar is provided in annex A of the Standard).



                                                      So the instructions inside a box have no defined order.



                                                      i=i++


                                                      can be interpreted as



                                                      tmp = i
                                                      i = i + 1
                                                      i = tmp


                                                      or as



                                                      tmp = i
                                                      i = tmp
                                                      i = i + 1


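                                                      Both of these readings of i=i++ are valid, and they give different answers: in the first, i keeps its old value; in the second, i ends up incremented. The standard does not choose between them. Because i is modified twice between two sequence points, rule 6.5p2 is violated and the behavior is undefined.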



                                                      So sequence points can be seen as the beginning and the end of each box that makes up the program [the boxes are the atomic units in C], and inside a box the order of instructions is not defined in all cases. Changing that order can sometimes change the result.



                                                      EDIT:



                                                      Other good sources explaining such ambiguities are the entries on the c-faq site (also published as a book), namely here and here and here.
































                                                      • How does this answer add anything new to the existing answers? Also, the explanation for i=i++ is very similar to this answer.

                                                        – haccks
                                                        Nov 24 '17 at 7:00













                                                      • @haccks I did not read the other answers. I wanted to explain in my own words what I learned from the document mentioned above, from the official ISO 9899 site: open-std.org/jtc1/sc22/wg14/www/docs/n1188.pdf

                                                        – alinsoar
                                                        Nov 24 '17 at 12:14


























                                                      edited Nov 24 '17 at 12:15

























                                                      answered Oct 13 '17 at 13:58









                                                      alinsoar


























                                                      3














                                                      The reason is that the program invokes undefined behavior. The problem lies in the evaluation order: no sequence points are required between the sub-expressions according to the C++98 standard (in C++11 terminology, none of the operations is sequenced before or after another).

                                                      However, if you stick to one compiler, you will usually find the behavior consistent, as long as you don't add function calls or pointers, which make the behavior messier. Nothing guarantees this consistency, though; it is only what the compiler happens to do.





                                                      • First, GCC:
                                                        Using Nuwen MinGW 15 (GCC 7.1), you get:



                                                        #include <stdio.h>
                                                        int main(int argc, char ** argv)
                                                        {
                                                            int i = 0;
                                                            i = i++ + ++i;
                                                            printf("%d\n", i); // 2

                                                            i = 1;
                                                            i = (i++);
                                                            printf("%d\n", i); // 1

                                                            volatile int u = 0;
                                                            u = u++ + ++u;
                                                            printf("%d\n", u); // 2

                                                            u = 1;
                                                            u = (u++);
                                                            printf("%d\n", u); // 1

                                                            register int v = 0;
                                                            v = v++ + ++v;
                                                            printf("%d\n", v); // 2
                                                        }




                                                      How does GCC work here? Judging by these results, it evaluates the sub-expressions of the right-hand side (RHS) from left to right, then assigns the value to the left-hand side (LHS). This is exactly the order that Java and C# define in their standards. (Yes, the equivalent code in Java and C# has defined behavior.) Each sub-expression of the RHS is evaluated in turn, left to right; within each sub-expression, a pre-increment (++c) is applied first, then the value of c is used for the operation, then a post-increment (c++) is applied.



                                                      According to GCC C++: Operators:




                                                      In GCC C++, the precedence of the operators controls the order in
                                                      which the individual operators are evaluated




                                                      The equivalent code with defined behavior, as GCC interprets it:



                                                      #include <stdio.h>
                                                      int main(int argc, char ** argv)
                                                      {
                                                          int i = 0;
                                                          // i = i++ + ++i;
                                                          int r;
                                                          r = i;
                                                          i++;
                                                          ++i;
                                                          r += i;
                                                          i = r;
                                                          printf("%d\n", i); // 2

                                                          i = 1;
                                                          // i = (i++);
                                                          r = i;
                                                          i++;
                                                          i = r;
                                                          printf("%d\n", i); // 1

                                                          volatile int u = 0;
                                                          // u = u++ + ++u;
                                                          r = u;
                                                          u++;
                                                          ++u;
                                                          r += u;
                                                          u = r;
                                                          printf("%d\n", u); // 2

                                                          u = 1;
                                                          // u = (u++);
                                                          r = u;
                                                          u++;
                                                          u = r;
                                                          printf("%d\n", u); // 1

                                                          register int v = 0;
                                                          // v = v++ + ++v;
                                                          r = v;
                                                          v++;
                                                          ++v;
                                                          r += v;
                                                          v = r;
                                                          printf("%d\n", v); // 2
                                                      }


                                                      Next, Visual Studio. With Visual Studio 2015, you get:



                                                      #include <stdio.h>
                                                      int main(int argc, char ** argv)
                                                      {
                                                          int i = 0;
                                                          i = i++ + ++i;
                                                          printf("%d\n", i); // 3

                                                          i = 1;
                                                          i = (i++);
                                                          printf("%d\n", i); // 2

                                                          volatile int u = 0;
                                                          u = u++ + ++u;
                                                          printf("%d\n", u); // 3

                                                          u = 1;
                                                          u = (u++);
                                                          printf("%d\n", u); // 2

                                                          register int v = 0;
                                                          v = v++ + ++v;
                                                          printf("%d\n", v); // 3
                                                      }


                                                      How does Visual Studio work here? Judging by these results, it takes another approach: it evaluates all pre-increment expressions in a first pass, uses the variables' values in the operations in a second pass, assigns the RHS to the LHS in a third pass, and in a last pass evaluates all the post-increment expressions.



                                                      So the equivalent code with defined behavior, as Visual C++ interprets it:



                                                      #include <stdio.h>
                                                      int main(int argc, char ** argv)
                                                      {
                                                          int r;
                                                          int i = 0;
                                                          // i = i++ + ++i;
                                                          ++i;
                                                          r = i + i;
                                                          i = r;
                                                          i++;
                                                          printf("%d\n", i); // 3

                                                          i = 1;
                                                          // i = (i++);
                                                          r = i;
                                                          i = r;
                                                          i++;
                                                          printf("%d\n", i); // 2

                                                          volatile int u = 0;
                                                          // u = u++ + ++u;
                                                          ++u;
                                                          r = u + u;
                                                          u = r;
                                                          u++;
                                                          printf("%d\n", u); // 3

                                                          u = 1;
                                                          // u = (u++);
                                                          r = u;
                                                          u = r;
                                                          u++;
                                                          printf("%d\n", u); // 2

                                                          register int v = 0;
                                                          // v = v++ + ++v;
                                                          ++v;
                                                          r = v + v;
                                                          v = r;
                                                          v++;
                                                          printf("%d\n", v); // 3
                                                      }


                                                      As the Visual Studio documentation states in Precedence and Order of Evaluation:




                                                      Where several operators appear together, they have equal precedence and are evaluated according to their associativity. The operators in the table are described in the sections beginning with Postfix Operators.




























                                                      • 1





                                                        I've edited the question to add the UB in evaluation of function arguments, as this question is often used as a duplicate for that. (The last example)

                                                        – Antti Haapala
                                                        Oct 21 '17 at 10:46






                                                      • 1





                                                        Also the question is about c now, not C++

                                                        – Antti Haapala
                                                        Oct 21 '17 at 10:47
















                                                      3














                                                      The reason is that the program is running undefined behavior. The problem lies in the evaluation order, because there is no sequence points required according to C++98 standard ( no operations is sequenced before or after another according to C++11 terminology).



                                                      However if you stick to one compiler, you will find the behavior persistent, as long as you don't add function calls or pointers, which would make the behavior more messy.





                                                      • So first the GCC:
                                                        Using Nuwen MinGW 15 GCC 7.1 you will get:



                                                        #include<stdio.h>
                                                        int main(int argc, char ** argv)
                                                        {
                                                        int i = 0;
                                                        i = i++ + ++i;
                                                        printf("%dn", i); // 2

                                                        i = 1;
                                                        i = (i++);
                                                        printf("%dn", i); //1

                                                        volatile int u = 0;
                                                        u = u++ + ++u;
                                                        printf("%dn", u); // 2

                                                        u = 1;
                                                        u = (u++);
                                                        printf("%dn", u); //1

                                                        register int v = 0;
                                                        v = v++ + ++v;
                                                        printf("%dn", v); //2


                                                        }




                                                      How does GCC work? it evaluates sub expressions at a left to right order for the right hand side (RHS) , then assigns the value to the left hand side (LHS) . This is exactly how Java and C# behave and define their standards. (Yes, the equivalent software in Java and C# has defined behaviors). It evaluate each sub expression one by one in the RHS Statement in a left to right order; for each sub expression: the ++c (pre-increment) is evaluated first then the value c is used for the operation, then the post increment c++).



                                                      according to GCC C++: Operators




                                                      In GCC C++, the precedence of the operators controls the order in
                                                      which the individual operators are evaluated




                                                      the equivalent code in defined behavior C++ as GCC understands:



#include <stdio.h>

int main(int argc, char **argv)
{
    int i = 0;
    // i = i++ + ++i;
    int r;
    r = i;
    i++;
    ++i;
    r += i;
    i = r;
    printf("%d\n", i); // 2

    i = 1;
    // i = (i++);
    r = i;
    i++;
    i = r;
    printf("%d\n", i); // 1

    volatile int u = 0;
    // u = u++ + ++u;
    r = u;
    u++;
    ++u;
    r += u;
    u = r;
    printf("%d\n", u); // 2

    u = 1;
    // u = (u++);
    r = u;
    u++;
    u = r;
    printf("%d\n", u); // 1

    register int v = 0;
    // v = v++ + ++v;
    r = v;
    v++;
    ++v;
    r += v;
    v = r;
    printf("%d\n", v); // 2
}


Now for Visual Studio. With Visual Studio 2015, you get:



#include <stdio.h>

int main(int argc, char **argv)
{
    int i = 0;
    i = i++ + ++i;
    printf("%d\n", i); // 3

    i = 1;
    i = (i++);
    printf("%d\n", i); // 2

    volatile int u = 0;
    u = u++ + ++u;
    printf("%d\n", u); // 3

    u = 1;
    u = (u++);
    printf("%d\n", u); // 2

    register int v = 0;
    v = v++ + ++v;
    printf("%d\n", v); // 3
}


How does Visual Studio work? It takes another approach: it evaluates all pre-increment expressions in a first pass, uses the variables' values in the operations in a second pass, assigns from RHS to LHS in a third pass, and in a final pass evaluates all the post-increment expressions.



So the equivalent code with defined behavior, as Visual C++ appears to interpret it:



#include <stdio.h>

int main(int argc, char **argv)
{
    int r;
    int i = 0;
    // i = i++ + ++i;
    ++i;
    r = i + i;
    i = r;
    i++;
    printf("%d\n", i); // 3

    i = 1;
    // i = (i++);
    r = i;
    i = r;
    i++;
    printf("%d\n", i); // 2

    volatile int u = 0;
    // u = u++ + ++u;
    ++u;
    r = u + u;
    u = r;
    u++;
    printf("%d\n", u); // 3

    u = 1;
    // u = (u++);
    r = u;
    u = r;
    u++;
    printf("%d\n", u); // 2

    register int v = 0;
    // v = v++ + ++v;
    ++v;
    r = v + v;
    v = r;
    v++;
    printf("%d\n", v); // 3
}


As the Visual Studio documentation states in Precedence and Order of Evaluation:




                                                      Where several operators appear together, they have equal precedence and are evaluated according to their associativity. The operators in the table are described in the sections beginning with Postfix Operators.
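One caveat about the documentation quoted in this answer: precedence and associativity decide how operands are grouped, not the order in which the operands themselves are evaluated. The following is a minimal sketch of that distinction in C. The helpers `note`, `grouped_result` and `print_call_order` are hypothetical names made up for this illustration; they come from neither compiler's documentation.

```c
#include <stdio.h>

/* Hypothetical helpers, for illustration only. */
static int call_order[3];   /* ids of the operands, in the order they ran */
static int calls = 0;

/* Record that operand `id` was evaluated, then yield `value`. */
static int note(int id, int value)
{
    call_order[calls++] = id;
    return value;
}

/* Precedence groups this as 2 + (3 * 4), so the result is always 14.
   The order of the three note() calls is unspecified: each call runs
   to completion without interleaving, but the compiler may pick any
   order for them. */
int grouped_result(void)
{
    calls = 0;
    return note(1, 2) + note(2, 3) * note(3, 4);
}

/* Print the order this particular build happened to choose. */
void print_call_order(void)
{
    int result = grouped_result();
    printf("result %d, operands evaluated in order: %d %d %d\n",
           result, call_order[0], call_order[1], call_order[2]);
}
```

The result is 14 on every conforming compiler, while the order printed by `print_call_order()` may differ between compilers or optimization levels. Because the operands here are function calls, the behavior is merely unspecified; with unsequenced `++` on the same variable it becomes undefined.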




























• 1
  I've edited the question to add the UB in evaluation of function arguments, as this question is often used as a duplicate for that. (The last example)
  – Antti Haapala
  Oct 21 '17 at 10:46

• 1
  Also, the question is about C now, not C++.
  – Antti Haapala
  Oct 21 '17 at 10:47












































edited Jun 11 '17 at 1:17

answered Jun 10 '17 at 22:56
Muhammad Annaqeeb






















                                                      Your question was probably not, "Why are these constructs undefined behavior in C?". Your question was probably, "Why did this code (using ++) not give me the value I expected?", and someone marked your question as a duplicate, and sent you here.



                                                      This answer tries to answer that question: why did your code not give you the answer you expected, and how can you learn to recognize (and avoid) expressions that will not work as expected.



                                                      I assume you've heard the basic definition of C's ++ and -- operators by now, and how the prefix form ++x differs from the postfix form x++. But these operators are hard to think about, so to make sure you understood, perhaps you wrote a tiny little test program involving something like



                                                      int x = 5;
                                                      printf("%d %d %d\n", x, ++x, x++);


                                                      But, to your surprise, this program did not help you understand -- it printed some strange, unexpected, inexplicable output, suggesting that maybe ++ does something completely different, not at all what you thought it did.



                                                      Or, perhaps you're looking at a hard-to-understand expression like



                                                      int x = 5;
                                                      x = x++ + ++x;
                                                      printf("%d\n", x);


                                                      Perhaps someone gave you that code as a puzzle. This code also makes no sense, especially if you run it -- and if you compile and run it under two different compilers, you're likely to get two different answers! What's up with that? Which answer is correct? (And the answer is that both of them are, or neither of them are.)



                                                      As you've heard by now, all of these expressions are undefined, which means that the C language makes no guarantee about what they'll do. This is a strange and surprising result, because you probably thought that any program you could write, as long as it compiled and ran, would generate a unique, well-defined output. But in the case of undefined behavior, that's not so.



                                                      What makes an expression undefined? Are expressions involving ++ and -- always undefined? Of course not: these are useful operators, and if you use them properly, they're perfectly well-defined.



                                                      For the expressions we're talking about, what makes them undefined is that there's too much going on at once: we're not sure what order things will happen in, yet the order matters to the result we get.



                                                      Let's go back to the two examples I've used in this answer. When I wrote



                                                      printf("%d %d %d\n", x, ++x, x++);


                                                      the question is, before calling printf, does the compiler compute the value of x first, or x++, or maybe ++x? But it turns out we don't know. There's no rule in C which says that the arguments to a function get evaluated left-to-right, or right-to-left, or in some other order. So we can't say whether the compiler will do x first, then ++x, then x++, or x++ then ++x then x, or some other order. But the order clearly matters, because depending on which order the compiler uses, we'll clearly get different results printed by printf.



                                                      What about this crazy expression?



                                                      x = x++ + ++x;


                                                      The problem with this expression is that it contains three different attempts to modify the value of x: (1) the x++ part tries to add 1 to x, store the new value in x, and return the old value of x; (2) the ++x part tries to add 1 to x, store the new value in x, and return the new value of x; and (3) the x = part tries to assign the sum of the other two back to x. Which of those three attempted assignments will "win"? Which of the three values will actually get assigned to x? Again, and perhaps surprisingly, there's no rule in C to tell us.



                                                      You might imagine that precedence or associativity or left-to-right evaluation tells you what order things happen in, but they do not. You may not believe me, but please take my word for it, and I'll say it again: precedence and associativity do not determine every aspect of the evaluation order of an expression in C. In particular, if within one expression there are multiple different spots where we try to assign a new value to something like x, precedence and associativity do not tell us which of those attempts happens first, or last, or anything.





                                                      So with all that background and introduction out of the way, if you want to make sure that all your programs are well-defined, which expressions can you write, and which ones can you not write?



                                                      These expressions are all fine:



                                                      y = x++;
                                                      z = x++ + y++;
                                                      x = x + 1;
                                                      x = a[i++];
                                                      x = a[i++] + b[j++];
                                                      x[i++] = a[j++] + b[k++];
                                                      x = *p++;
                                                      x = *p++ + *q++;


                                                      These expressions are all undefined:



                                                      x = x++;
                                                      x = x++ + ++x;
                                                      y = x + x++;
                                                      a[i] = i++;
                                                      a[i++] = i;
                                                      printf("%d %d %d\n", x, ++x, x++);


                                                      And the last question is, how can you tell which expressions are well-defined, and which expressions are undefined?



                                                      As I said earlier, the undefined expressions are the ones where there's too much going on at once, where you can't be sure what order things happen in, and where the order matters:




                                                      1. If there's one variable that's getting modified (assigned to) in two or more different places, how do you know which modification happens first?

                                                      2. If there's a variable that's getting modified in one place, and having its value used in another place, how do you know whether it uses the old value or the new value?


                                                      As an example of #1, in the expression



                                                      x = x++ + ++x;


                                                      there are three attempts to modify x.



                                                      As an example of #2, in the expression



                                                      y = x + x++;


                                                      we both use the value of x, and modify it.



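                                                      So that's the answer: make sure that in any expression you write, each variable is modified at most once, and if a variable is modified, you don't also attempt to use the value of that variable somewhere else. (Using the old value only to compute the new one, as in x = x + 1, is fine, which is why that expression appears in the well-defined list above.)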









































































                                                          Your question was probably not, "Why are these constructs undefined behavior in C?". Your question was probably, "Why did this code (using ++) not give me the value I expected?", and someone marked your question as a duplicate, and sent you here.



                                                          This answer tries to answer that question: why did your code not give you the answer you expected, and how can you learn to recognize (and avoid) expressions that will not work as expected.



                                                          I assume you've heard the basic definition of C's ++ and -- operators by now, and how the prefix form ++x differs from the postfix form x++. But these operators are hard to think about, so to make sure you understood, perhaps you wrote a tiny little test program involving something like



                                                          int x = 5;
                                                          printf("%d %d %dn", x, ++x, x++);


                                                          But, to your surprise, this program did not help you understand -- it printed some strange, unexpected, inexplicable output, suggesting that maybe ++ does something completely different, not at all what you thought it did.



                                                          Or, perhaps you're looking at a hard-to-understand expression like



                                                          int x = 5;
                                                          x = x++ + ++x;
                                                          printf("%dn", x);


                                                          Perhaps someone gave you that code as a puzzle. This code also makes no sense, especially if you run it -- and if you compile and run it under two different compilers, you're likely to get two different answers! What's up with that? Which answer is correct? (And the answer is that both of them are, or neither of them are.)



                                                          As you've heard by now, all of these expressions are undefined, which means that the C language makes no guarantee about what they'll do. This is a strange and surprising result, because you probably thought that any program you could write, as long as it compiled and ran, would generate a unique, well-defined output. But in the case of undefined behavior, that's not so.



                                                          What makes an expression undefined? Are expressions involving ++ and -- always undefined? Of course not: these are useful operators, and if you use them properly, they're perfectly well-defined.



                                                          For the expressions we're talking about what makes them undefined is when there's too much going on at once, when we're not sure what order things will happen in, but when the order matters to the result we get.



                                                          Let's go back to the two examples I've used in this answer. When I wrote



                                                          printf("%d %d %dn", x, ++x, x++);


                                                          the question is, before calling printf, does the compiler compute the value of x first, or x++, or maybe ++x? But it turns out we don't know. There's no rule in C which says that the arguments to a function get evaluated left-to-right, or right-to-left, or in some other order. So we can't say whether the compiler will do x first, then ++x, then x++, or x++ then ++x then x, or some other order. But the order clearly matters, because depending on which order the compiler uses, we'll clearly get different results printed by printf.



                                                          What about this crazy expression?



                                                          x = x++ + ++x;


                                                          The problem with this expression is that it contains three different attempts to modify the value of x: (1) the x++ part tries to add 1 to x, store the new value in x, and return the old value of x; (2) the ++x part tries to add 1 to x, store the new value in x, and return the new value of x; and (3) the x = part tries to assign the sum of the other two back to x. Which of those three attempted assignments will "win"? Which of the three values will actually get assigned to x? Again, and perhaps surprisingly, there's no rule in C to tell us.



                                                          You might imagine that precedence or associativity or left-to-right evaluation tells you what order things happen in, but they do not. You may not believe me, but please take my word for it, and I'll say it again: precedence and associativity do not determine every aspect of the evaluation order of an expression in C. In particular, if within one expression there are multiple different spots where we try to assign a new value to something like x, precedence and associativity do not tell us which of those attempts happens first, or last, or anything.





                                                          So with all that background and introduction out of the way, if you want to make sure that all your programs are well-defined, which expressions can you write, and which ones can you not write?



                                                          These expressions are all fine:



                                                          y = x++;
                                                          z = x++ + y++;
                                                          x = x + 1;
                                                          x = a[i++];
                                                          x = a[i++] + b[j++];
                                                          x[i++] = a[j++] + b[k++];
                                                          x = *p++;
                                                          x = *p++ + *q++;


                                                          These expressions are all undefined:



                                                          x = x++;
                                                          x = x++ + ++x;
                                                          y = x + x++;
                                                          a[i] = i++;
                                                          a[i++] = i;
                                                          printf("%d %d %dn", x, ++x, x++);


                                                          And the last question is, how can you tell which expressions are well-defined, and which expressions are undefined?



                                                          As I said earlier, the undefined expressions are the ones where there's too much going on at once, where you can't be sure what order things happen in, and where the order matters:




                                                          1. If there's one variable that's getting modified (assigned to) in two or more different places, how do you know which modification happens first?

                                                          2. If there's a variable that's getting modified in one place, and having its value used in another place, how do you know whether it uses the old value or the new value?


                                                          As an example of #1, in the expression



                                                          x = x++ + ++x;


                                                          there are three attempts to modify x.



                                                          As an example of #2, in the expression



                                                          y = x + x++;


                                                          we both use the value of x, and modify it.
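To see why the order matters here, it can help to write out the two things you might have meant as separate statements. This is only a sketch: the function names are hypothetical, and neither version is "what the compiler does" with the original expression, since the original has no defined meaning at all.

```c
#include <assert.h>

/* Reading 1: both operands see the old value of x; the increment comes last. */
int sum_then_increment(int *x)
{
    int y = *x + *x;
    (*x)++;
    return y;
}

/* Reading 2: x++ runs first, so the left operand sees the new value. */
int increment_then_sum(int *x)
{
    int old = (*x)++;   /* the value of x++ is the old x */
    return *x + old;
}

/* Hypothetical check that the two readings give different answers. */
int demo_two_readings(void)
{
    int x = 3;
    assert(sum_then_increment(&x) == 6 && x == 4);
    x = 3;
    assert(increment_then_sum(&x) == 7 && x == 4);
    return 0;
}
```

Both functions are well-defined because each statement modifies x at most once and does not read it anywhere else in the same expression. Starting from x = 3 they give 6 and 7, which is exactly the ambiguity the single-expression version leaves unresolved.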



                                                          So that's the answer: make sure that in any expression you write, each variable is modified at most once, and if a variable is modified, you don't also attempt to use the value of that variable somewhere else, unless that use is only there to compute the new value being stored, as in x = x + 1.
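In practice, following this rule usually just means splitting one crowded expression into a few statements. The sketch below rewrites two of the undefined examples that way. The order I chose in each rewrite is an assumption about what the programmer intended, and the names `demo_rewrites`, `first`, `second`, and `third` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: well-defined rewrites of two undefined expressions.
   Returns 0 when every check passes. */
int demo_rewrites(void)
{
    int a[3] = { 0, 0, 0 };
    int i = 1;
    int x = 5;

    /* Instead of  a[i] = i++;  store at the old index, then increment. */
    a[i] = i;
    i++;
    assert(a[1] == 1 && i == 2);

    /* Instead of  printf("%d %d %d\n", x, ++x, x++);
       compute each argument in its own statement, in the order you want. */
    int first  = x;      /* 5 */
    int second = ++x;    /* x becomes 6, second is 6 */
    int third  = x++;    /* third is 6, x becomes 7 */
    printf("%d %d %d\n", first, second, third);
    assert(first == 5 && second == 6 && third == 6 && x == 7);

    return 0;
}
```

Each statement ends before the next one begins, so there is no longer any question about which modification happens first or which value gets used.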

















                                                          answered Aug 16 '18 at 11:54









                                                          Steve Summit














