
Re: [MiNT] GCC Fast Math



Hi,

On Wednesday 09 February 2011, Vincent Rivière wrote:
> I have discovered a "new" GCC compilation option: -ffast-math
> The documentation is here:
> http://gcc.gnu.org/onlinedocs/gcc-4.5.2/gcc/Optimize-Options.html#index-ffast_002dmath-811
> 
> Normally, the math functions like sqrt(), sin()... are implemented inside
> libm.a. They do their job and handle correctly the error cases by setting
> errno, etc.
> 
> The -ffast-math option inlines for example sqrt() to a single FPU
> instruction, without any library call. The drawback is that errno is not
> set, so it is a violation of the C standard.

Other typical differences (besides 1, errno not being set) are:
2. Values very close to infinity or zero are truncated to infinity or zero
3. NaN (not-a-number) handling differs
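
To illustrate 3), here is a minimal C sketch. It relies on -ffast-math
enabling -ffinite-math-only, which lets the compiler assume that NaNs
never occur. The helper names make_nan() and is_nan_by_compare() are made
up for this example.

```c
/* Hypothetical helper: produce a NaN at run time. The volatile keeps
   the compiler from evaluating 0.0 / 0.0 at compile time. */
double make_nan(void)
{
    volatile double zero = 0.0;
    return zero / zero;
}

/* Classic IEEE idiom: NaN is the only value that compares unequal to
   itself. Under -ffinite-math-only the compiler is allowed to assume
   x is never NaN, so it may reduce this to "return 0". */
int is_nan_by_compare(double x)
{
    return x != x;
}
```

Built without -ffast-math, is_nan_by_compare(make_nan()) returns 1. With
-ffast-math the compiler may fold the comparison to 0, so code relying on
such a check can silently stop detecting NaNs.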


> However most programs are OK with that, including POV-Ray.

Things that break are typically buggy programs.  They may e.g. get a
divide by zero due to 2), which only by luck they haven't gotten with
IEEE compliant number handling (by luck I mean that a few more rounds
of the same algorithm would have hit the same issue with IEEE numbers
too).
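
The "few more rounds" point can be sketched in C as below. This assumes a
target where fast math flushes very small (subnormal) results to zero,
which is not guaranteed on every platform. halvings_until_zero() is a
made-up name for the example.

```c
/* Count how many times a finite, non-negative value can be halved
   before it becomes exactly 0.0. The cap only guards against
   non-finite input (NaN or infinity never reach zero). */
int halvings_until_zero(double start)
{
    volatile double x = start;  /* volatile: force a real store each round */
    int rounds = 0;

    while (x != 0.0 && rounds < 5000) {
        x = x / 2.0;
        rounds++;
    }
    return rounds;
}
```

With strict IEEE double arithmetic and the default rounding mode,
halvings_until_zero(1.0) returns 1075.  Where subnormals are flushed to
zero it would stop after 1023 rounds.  Either way a later 1.0 / x divides
by zero, only the round in which that happens differs.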


> The job of math-68881.h was similar. But instead of relying on GCC's
> internals (like -ffast-math), magic is done through inline assembler.
> GCC is not optimal on register usage when transferring data from C to
> inline assembler. As a result, the code produced by -ffast-math is a bit
> better.
> 
> So I have totally disabled the usage of math-68881.h in math.h. Using
> explicitly -ffast-math should do a better job.
> 
> I have built several versions of POV-Ray using -m68020-60, and thanks to
> Jean-François Lemaire and Guillaume Tello, I got some bench results from
> a CT60/100 MHz. Surprisingly, the performance is almost the same on all
> versions. Maybe the overhead of a function call is insignificant
> compared to the execution time of an FPU instruction ?
> Well, now gprof is fixed, we could do some profiling to understand things
> better :-)

Gprof is unlikely to help with that.  For small, often-called functions
and things like function call overhead, it doesn't provide accurate
results, because it itself affects function call overhead (it calls an
extra function on each function call).

I would suggest using the Hatari debugger's profiler functionality, but
note that Hatari's 030 emulation isn't cycle accurate (nor does it provide
accurate cycle information to the profiler), so you can only get it to
count instructions between breakpoints you've set in the code.


> Anyway, those math options don't seem to affect the speed significantly,
> so don't bother with them.

If you had been profiling something other than POV-Ray, newer GCC's
ability to pre-calculate the results of math functions whose arguments it
can determine to be constant could have meant that there actually isn't
any difference in the generated code. :-)


> NB: I'm going to send my math-68881.h patch for C99 to GCC people, and I
> will also ask them if that file can really be useful for someone
> nowadays.


	- Eero