AFAIK GCC's -O2 optimization level has "always" included the
-fdefer-pop optimization (turn it off with -fno-defer-pop). See "man gcc".
Wow, thank you for such interesting information. I had always considered popping the arguments back off the stack a sacred rule that could never be skipped, and voila, deferring it is a standard optimization technique :) The question is how much it really gains: obviously you can't use it in generic loops, and even if you could, calling functions in the middle of a time-critical loop is silly.
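
For anyone who wants to see the effect rather than guess at it, here is a minimal sketch that could be compiled both ways and compared. The file name defer.c and the functions add3 and caller are made up for illustration, and the noinline attribute is a GCC extension used only to keep the calls from being inlined away. It assumes a target that passes arguments on the stack, such as 32-bit x86; what the generated assembly actually looks like depends on the GCC version and target, so treat it as an experiment, not a guaranteed result.

```c
/* Sketch for inspecting caller-side stack cleanup; all names are hypothetical.
 * Try it on a target that passes arguments on the stack, e.g. 32-bit x86:
 *   gcc -m32 -O2 -S defer.c                  (defer-pop enabled by default)
 *   gcc -m32 -O2 -fno-defer-pop -S defer.c   (pop after every call)
 * then compare the stack-pointer adjustments inside caller().
 */

/* noinline (a GCC extension) keeps these as real calls at -O2. */
__attribute__((noinline)) int add3(int a, int b, int c)
{
    return a + b + c;
}

/* Three back-to-back calls: with deferred popping the compiler is allowed
 * to let the pushed arguments accumulate and clean them up in one go. */
int caller(int x)
{
    int r = add3(x, 1, 2);   /* x + 3  */
    r += add3(x, 3, 4);      /* x + 7  */
    r += add3(x, 5, 6);      /* x + 11 */
    return r;
}
```

Diffing the two .s files should show whether the stack cleanup after each call has been merged, which at least gives a rough idea of how many instructions are saved per call.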