Comments on: Optimizing Matrix Multiplication
https://attractivechaos.wordpress.com/2016/08/28/optimizing-matrix-multiplication/
Just another WordPress.com weblog
Sun, 19 Feb 2017 16:39:11 +0000
By: Siddhesh Urkude
https://attractivechaos.wordpress.com/2016/08/28/optimizing-matrix-multiplication/#comment-3079
Sun, 19 Feb 2017 16:39:11 +0000
Optimizing matrix multiplication, explained very clearly. It's easy to understand through your article. I was searching for an article like this. Thanks for sharing.

By: Paweł
https://attractivechaos.wordpress.com/2016/08/28/optimizing-matrix-multiplication/#comment-2916
Fri, 14 Oct 2016 13:45:43 +0000
I recommend also including an implementation with instruction reordering (which eliminates cache misses) and OpenMP directives for parallelization.
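To make the suggestion above concrete, here is a minimal sketch in C. It assumes that the reordering meant here is swapping the two inner loops (i-k-j order) so that the innermost loop walks rows of both `b` and `c` contiguously. That is only one reading of the comment. The function name `matmul_omp` and the row-major square layout are hypothetical choices for illustration, not code from the post.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: c = a * b for n x n row-major float matrices.
 * Loop order is i-k-j, so the innermost loop reads b and writes c
 * along contiguous rows. Each thread handles whole rows of c, so no
 * two threads write the same element. */
void matmul_omp(int n, const float *a, const float *b, float *c)
{
    assert(n >= 0 && a && b && c);
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        float *ci = c + (size_t)i * n;
        for (int j = 0; j < n; ++j)
            ci[j] = 0.0f;                      /* clear the output row */
        for (int k = 0; k < n; ++k) {
            float aik = a[(size_t)i * n + k];  /* reused across the j loop */
            const float *bk = b + (size_t)k * n;
            for (int j = 0; j < n; ++j)
                ci[j] += aik * bk[j];
        }
    }
}
```

With gcc this would be built with `-fopenmp`. Without that flag the pragma is ignored and the function still runs correctly on one thread.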
By: BLAS Pascal
https://attractivechaos.wordpress.com/2016/08/28/optimizing-matrix-multiplication/#comment-2872
Mon, 29 Aug 2016 14:19:18 +0000
For GotoBLAS, from which OpenBLAS was forked, you might want to read the paper https://www.cs.utexas.edu/users/pingali/CS378/2008sp/papers/gotoPaper.pdf
By: 2#
https://attractivechaos.wordpress.com/2016/08/28/optimizing-matrix-multiplication/#comment-2871
Mon, 29 Aug 2016 08:56:59 +0000
This is a very interesting example where you could use multithreading to achieve faster multiplication.
By: rurban
https://attractivechaos.wordpress.com/2016/08/28/optimizing-matrix-multiplication/#comment-2869
Sun, 28 Aug 2016 22:44:40 +0000
What about the new gather-scatter vector intrinsics, esp. the 512-bit ones? Are these included in SSE? I guess not, as they need AVX or Knights Landing/Phi.
By: zhanxwzhanxw
https://attractivechaos.wordpress.com/2016/08/28/optimizing-matrix-multiplication/#comment-2868
Sun, 28 Aug 2016 21:49:09 +0000
I like Eigen a lot. It can use Intel MKL as a backend, which also performs well on operations other than matrix multiplication.
By: attractivechaos
https://attractivechaos.wordpress.com/2016/08/28/optimizing-matrix-multiplication/#comment-2867
Sun, 28 Aug 2016 20:17:28 +0000
Have a look at the “matrix multiplication algorithm” wiki page and you will get some hints. I guess they are faster mostly because they are better at minimizing cache misses by splitting and reordering the computation block by block. As to other possible explanations: I have intentionally disabled multithreading, and although the Linux server supports AVX, the gcc there doesn’t, and I explicitly told OpenBLAS not to use AVX.
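As a rough illustration of what "splitting and reordering the computation block by block" can look like, here is a minimal sketch in C. It is not how OpenBLAS or Eigen are actually implemented. The function name `matmul_blocked`, the block-size parameter `bs`, and the row-major square layout are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

static size_t min_sz(size_t x, size_t y) { return x < y ? x : y; }

/* Hypothetical helper: c += a * b for n x n row-major float matrices.
 * The work is split into bs x bs tiles so that the pieces of a, b and c
 * touched by the three inner loops are small enough to stay in cache.
 * The caller is expected to zero c beforehand if a plain product is wanted. */
void matmul_blocked(size_t n, size_t bs,
                    const float *a, const float *b, float *c)
{
    assert(bs > 0);
    for (size_t ii = 0; ii < n; ii += bs) {
        size_t i_end = min_sz(ii + bs, n);
        for (size_t kk = 0; kk < n; kk += bs) {
            size_t k_end = min_sz(kk + bs, n);
            for (size_t jj = 0; jj < n; jj += bs) {
                size_t j_end = min_sz(jj + bs, n);
                /* Multiply one tile of a by one tile of b. */
                for (size_t i = ii; i < i_end; ++i) {
                    for (size_t k = kk; k < k_end; ++k) {
                        float aik = a[i * n + k];
                        for (size_t j = jj; j < j_end; ++j)
                            c[i * n + j] += aik * b[k * n + j];
                    }
                }
            }
        }
    }
}
```

A suitable `bs` depends on the cache sizes of the machine and would normally be found by measurement. The sketch handles sizes where `n` is not a multiple of `bs` by clamping each tile at the matrix edge.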
By: Øystein Schønning-Johansen
https://attractivechaos.wordpress.com/2016/08/28/optimizing-matrix-multiplication/#comment-2866
Sun, 28 Aug 2016 19:56:45 +0000
A great post as usual! Thanks. Do you have any theories about why OpenBLAS and Eigen are so much better? Are they using threading or maybe AVX?