[Cin] Slackware 14.2 64-bit works too
Andrew Randrianasulu
randrianasulu at gmail.com
Tue Feb 12 11:37:12 CET 2019
Update: I was actually pointed at this article:
https://blogs.msdn.microsoft.com/vcblog/2015/10/19/do-you-prefer-fast-or-precise/
It is all about the Microsoft compiler, but the underlying problem should be
the same for all compilers. It even says that auto-vectorization can
_sometimes_ produce more correct results!
--------------------quote---------
Counter Example
The explanations so far would lead you to expect that /fp:fast will sometimes
(maybe always?) produce a result that is less accurate than /fp:precise. As a
simple example, let’s consider the sum of the first million reciprocals, or
Sum(1/n) for n = 1..1000000. I calculated the approximate result using floats,
and the correct result using Boost’s cpp_dec_float (to a precision of 100
decimal digits). With /O2 level of optimization, the results are:
float /fp:precise      14.3574
float /fp:fast         14.3929
cpp_dec_float<100>     14.39272672286
So the /fp:fast result is nearer the correct answer than the /fp:precise!
How can this be? With /fp:fast the auto-vectorizer emits the SIMD RCPPS machine
instruction, which is both faster and more accurate than the DIVSS emitted
for /fp:precise.
This is just one specific case. But the point is that even a complete error
analysis won’t tell you whether /fp:fast is acceptable in your App – there’s
more going on. The only way to be sure is to test your App under each regime
and compare answers.
----------------------quote end--------
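
To make the quoted example easier to play with, here is a minimal sketch in
Python that emulates single-precision arithmetic and sums 1/n for
n = 1..1000000 in two ways: with one accumulator in order, and with four
independent accumulators, roughly the way a vectorized loop would keep partial
sums. The helper names (f32, harmonic_forward_f32, harmonic_lanes_f32,
harmonic_reference) are made up for this illustration, and the four-lane
layout is my assumption about what a vectorizer might do, not something the
article states. The article credits RCPPS for the difference; this sketch only
shows that the order of summation alone already moves a float result.

```python
import math
import struct


def f32(x):
    """Round a Python float (a double) to the nearest IEEE 754 single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def harmonic_forward_f32(n):
    """Sum 1/k for k = 1..n in single precision, one accumulator, in order."""
    total = 0.0
    for k in range(1, n + 1):
        total = f32(total + f32(1.0 / k))
    return total


def harmonic_lanes_f32(n, lanes=4):
    """Same sum with `lanes` independent single-precision accumulators.

    Term k goes to accumulator (k - 1) % lanes; the partial sums are added
    together at the end. This is an assumed model of a vectorized loop.
    """
    acc = [0.0] * lanes
    for k in range(1, n + 1):
        i = (k - 1) % lanes
        acc[i] = f32(acc[i] + f32(1.0 / k))
    total = 0.0
    for partial in acc:
        total = f32(total + partial)
    return total


def harmonic_reference(n):
    """Reference value in double precision, using compensated summation."""
    return math.fsum(1.0 / k for k in range(1, n + 1))


if __name__ == "__main__":
    n = 1_000_000
    ref = harmonic_reference(n)
    fwd = harmonic_forward_f32(n)
    lan = harmonic_lanes_f32(n)
    print(f"reference (double)  {ref:.11f}")
    print(f"float, in order     {fwd:.6f}  error {abs(fwd - ref):.2e}")
    print(f"float, 4 partials   {lan:.6f}  error {abs(lan - ref):.2e}")
```

Running it prints the three sums and their distance from the reference, so you
can compare the two float variants on your own machine, which is the same
"test and compare" advice the article ends with.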
---------
In a message of Tuesday 12 February 2019 12:17:13, Andrea paz wrote:
> Thank you, Adrian. You have provided a lot of news and all
> interesting. I am not competent for compilations and coding, but it
> seems that we have almost reached the intrinsic limits of which the
> group of Lumiera spoke (IMO):
> https://www.lumiera.org/project/background/history/CinelerraWoes.html
> CFLAGS: In the Arch wiki they advise against the -O3 option because it
> brings instability and there are times when it is slower than -O2. But
> your results aren't bad. Can I apply these CFLAGS options to my Arch?
> Are they general or do they only apply to Slackware? Can I try
> vectorization with -ffast-math or are there any other ways that would
> advise against it to an incompetent like me?