The Arduino library provides functions like shiftOut() and digitalWrite(). These functions are simple and effective, but they are slow. Of course, they’re doing a lot more than just toggling bits: digitalWrite(), for example, has to look up which port and bit a pin number maps to on every call. Faster isn’t always necessary, and heavily optimized code can be harder to debug. And as Donald Knuth said,
…premature optimization is the root of all evil.
So what happens when you do need to optimize? For example, if shiftOut() is too slow for your project, what do you do? In Ralph’s post, Fastest AVR software SPI in the West, he breaks down different software SPI implementations into the assembly code they compile to.
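To give a rough idea of the starting point for this kind of optimization, here is a minimal sketch of bit-banging a byte by writing to a port register directly instead of calling digitalWrite() per bit. This is not Ralph’s code. The names fast_shift_out_msb_first, fake_port, DATA_MASK and CLOCK_MASK are hypothetical, and fake_port is a plain variable standing in for a real AVR register such as PORTB so that the sketch also compiles and runs on a PC.

```c
#include <stdint.h>

/* Stand-in for an AVR port register (e.g. PORTB from <avr/io.h>).
   Assumption: a plain variable is used here so the sketch runs off-target. */
static volatile uint8_t fake_port = 0;

/* Hypothetical pin assignments: bit 3 carries data, bit 5 carries the clock. */
#define DATA_MASK  ((uint8_t)(1u << 3))
#define CLOCK_MASK ((uint8_t)(1u << 5))

/* Host-only bookkeeping: what a receiver sampling on the rising clock
   edge would have seen. Not needed on real hardware. */
static uint8_t sampled_byte = 0;
static uint8_t clock_pulses = 0;

static void sample_on_rising_edge(void)
{
    uint8_t bit = (fake_port & DATA_MASK) ? 1u : 0u;
    sampled_byte = (uint8_t)((sampled_byte << 1) | bit);
    clock_pulses++;
}

/* Shift one byte out, most significant bit first, using direct
   read-modify-write operations on the port instead of digitalWrite(). */
void fast_shift_out_msb_first(uint8_t value)
{
    for (uint8_t i = 0; i < 8; i++) {
        if (value & 0x80) {
            fake_port |= DATA_MASK;              /* data line high */
        } else {
            fake_port &= (uint8_t)~DATA_MASK;    /* data line low */
        }
        fake_port |= CLOCK_MASK;                 /* clock rising edge */
        sample_on_rising_edge();                 /* host-only check */
        fake_port &= (uint8_t)~CLOCK_MASK;       /* clock falling edge */
        value = (uint8_t)(value << 1);           /* move next bit to the top */
    }
}
```

On real hardware you would replace fake_port with the actual port register, drop the host-only sampling, and make sure both pins are configured as outputs first. How fast the result ends up depends on the code the compiler generates for this loop, which is what the linked post examines.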
To get the fastest version, you also need to change compiler flags. So this is, in my opinion, an interesting case study in the kind of performance benefit you can get when you do some serious optimization.
Of course, you really shouldn’t, unless you need it…
Check out his post: Fastest AVR software SPI in the West
Knuth quote from his paper “Structured Programming with go to Statements.”

