I only noticed today that Java 9 has Math.fma(a, b, c), which computes a*b + c (for double and float values). Returns the fused multiply add of the three arguments; that is, returns the exact product of the first two arguments summed with the third argument and then rounded once to the nearest float. The rounding is done using the round to nearest even rounding mode. In contrast, if a * b + c is eval…
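To make the "rounded once" wording concrete, here is a minimal sketch that compares the plain expression with Math.fma on inputs chosen so that the intermediate rounding matters. The class name FmaDemo, the helper names unfused and fused, and the specific input values are my own illustrative choices, not anything from the Javadoc.

```java
final class FmaDemo {
    // Plain expression: a * b is rounded to double first,
    // then the addition of c is rounded a second time.
    static double unfused(double a, double b, double c) {
        return a * b + c;
    }

    // Math.fma (Java 9+): the exact product a * b is added to c
    // and the result is rounded only once.
    static double fused(double a, double b, double c) {
        return Math.fma(a, b, c);
    }

    public static void main(String[] args) {
        // Illustrative inputs: the exact product is 1 - 2^-54, which lies
        // exactly halfway between two adjacent doubles and rounds to 1.0.
        double a = 1.0 + 0x1.0p-27;
        double b = 1.0 - 0x1.0p-27;
        double c = -1.0;

        System.out.println("a * b + c         = " + unfused(a, b, c));
        System.out.println("Math.fma(a, b, c) = " + fused(a, b, c));
    }
}
```

With these inputs the plain expression loses the low-order term and yields 0.0, while Math.fma keeps it and returns -2^-54. Whether that extra accuracy is worth it depends on the hardware: on a CPU without an FMA instruction the method has to be emulated in software and may be much slower.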
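MSVC has supported AVX/AVX2 instructions for years, and according to this MSDN blog post, it can automatically generate fused-multiply-add (FMA) instructions. However, neither of the following functions compiles to an FMA instruction:

float func1(float x, float y, float z) { return x * y + z; }
float func2(float x, float y, float z) { return std::fma(x, y, z); }

Even worse, std::fma is not implemented as a single FMA instruction. It performs terribly, much slower than a plain x * y + z (poor std::fma performance is to be expected if the implementation does not rely on the FMA instruction). I used /arch:AVX2 /O…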