Last week I put together an alternative to the digitalWriteFast library for digitalRead/Write optimisations, and it got me thinking about the digitalRead/Write implementations in Arduino and opportunities for optimisation.
I read Issue 140 on the Google project and see that it was not merged because "the complication of providing two separate versions of the pin mapping seem like a bigger burden than the potential benefit".
This version doesn't require two copies of each pin mapping. The pin mapping macros are changed, but there's no duplication. In fact, there are only 130 additional lines of code.
pinMode, digitalRead & digitalWrite will all automatically inline whenever the pin number is known at compile time.
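As a rough illustration of how this kind of compile-time dispatch can work, here is a minimal sketch that runs on a host compiler. It is not the code in this pull request: `fakePort`, `digitalWriteSlowSketch`, `digitalWriteSketch` and `demoDispatch` are hypothetical names, and the sketch assumes GCC's `__builtin_constant_p` is used to detect a constant pin number.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for an AVR PORT register so the sketch runs off-target (assumption).
static volatile uint8_t fakePort = 0;

// Hypothetical slow path: used when the pin is only known at run time.
void digitalWriteSlowSketch(uint8_t pin, bool value) {
    const uint8_t mask = static_cast<uint8_t>(1u << (pin & 7));
    fakePort = value ? static_cast<uint8_t>(fakePort | mask)
                     : static_cast<uint8_t>(fakePort & ~mask);
}

// Hypothetical wrapper. When the call is inlined and `pin` is a compile-time
// constant, the compiler can fold the mask and keep only the fast branch;
// on AVR that branch could reduce to a single sbi/cbi instruction.
__attribute__((always_inline)) inline void digitalWriteSketch(uint8_t pin, bool value) {
    if (__builtin_constant_p(pin)) {
        const uint8_t mask = static_cast<uint8_t>(1u << (pin & 7));
        fakePort = value ? static_cast<uint8_t>(fakePort | mask)
                         : static_cast<uint8_t>(fakePort & ~mask);
    } else {
        digitalWriteSlowSketch(pin, value);
    }
}

// Both paths must leave the register in the same state.
void demoDispatch() {
    fakePort = 0;
    digitalWriteSketch(3, true);          // constant pin: fast path candidate
    volatile uint8_t runtimePin = 5;
    digitalWriteSketch(runtimePin, true); // run-time pin: slow path
    assert(fakePort == 0x28);
}
```

Which branch is taken depends on the optimisation level, so the two branches are written to behave identically; only the generated code differs.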
Statistics
Before
In the current Arduino library I measured a digitalWrite() at approx. 107 clock ticks (about 6.7 µs at 16 MHz). The digitalWrite function takes up 8 bytes in flash for each call, plus 108 bytes for the implementation.
After
With this change, many pins can digitalWrite() a constant value in 2 clock ticks (125 ns). The single sbi instruction takes up 2 bytes in flash.
If the pin supports PWM, the operation takes 7 clock ticks (437 ns) because the timer register must be written to turn off any active PWM. Each such digitalWrite() takes up 8 bytes in flash.
For some pins on the Mega that require a read/modify/write cycle (i.e. no sbi support), the operation takes 17 clock ticks (~1 µs) and takes up 26 bytes of flash, in part due to the need to disable interrupts.
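To show why the read/modify/write case costs more, here is a host-runnable sketch of the usual save/disable/restore pattern. `fakeSREG`, `fakeCli`, `fakePortH`, `atomicWriteBitSketch` and `demoAtomicWrite` are hypothetical stand-ins for AVR's `SREG`, `cli()` and a port outside sbi range; on real hardware the actual register and intrinsic would be used instead.

```cpp
#include <cassert>
#include <cstdint>

// Host-side stand-ins (assumptions) for the AVR status register and a port.
static volatile uint8_t fakeSREG = 0x80;  // bit 7 models the global interrupt enable flag
static volatile uint8_t fakePortH = 0;

// Models cli(): clears the global interrupt enable bit.
inline void fakeCli() { fakeSREG = static_cast<uint8_t>(fakeSREG & 0x7F); }

// Read/modify/write of one bit with interrupts disabled around the update,
// so an interrupt handler cannot change the port between the read and the write.
inline void atomicWriteBitSketch(volatile uint8_t &port, uint8_t bit, bool value) {
    const uint8_t saved = fakeSREG;  // remember whether interrupts were enabled
    fakeCli();
    const uint8_t mask = static_cast<uint8_t>(1u << bit);
    port = value ? static_cast<uint8_t>(port | mask)
                 : static_cast<uint8_t>(port & ~mask);
    fakeSREG = saved;                // restore the previous interrupt state
}

void demoAtomicWrite() {
    fakePortH = 0x01;
    atomicWriteBitSketch(fakePortH, 4, true);
    assert(fakePortH == 0x11);
    assert(fakeSREG == 0x80);        // interrupt flag is back to its prior value
}
```

Saving and restoring the status register, rather than unconditionally re-enabling interrupts, keeps the helper safe to call from code that already runs with interrupts off.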
Where the pin is not known at compile time, the digitalWrite() function takes approx. 112 clock ticks (~7 µs). The digitalWrite function takes up 8 bytes in flash for each call, plus 100 bytes for the implementation (i.e. nearly identical to the current implementation in these cases).
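For anyone who wants to check the tick-to-time figures, the conversion at a 16 MHz clock can be sketched as below. `kClockHz`, `ticksToNanos` and `checkTickConversions` are illustrative names, not part of the patch, and results are truncated to whole nanoseconds.

```cpp
#include <cassert>
#include <cstdint>

// Assumed CPU clock: 16 MHz, as on a standard Uno or Mega.
constexpr uint64_t kClockHz = 16000000ULL;

// Converts a clock-tick count to nanoseconds (integer division truncates).
constexpr uint64_t ticksToNanos(uint64_t ticks, uint64_t clockHz = kClockHz) {
    return ticks * 1000000000ULL / clockHz;
}

static_assert(ticksToNanos(2) == 125, "single sbi");
static_assert(ticksToNanos(7) == 437, "PWM-capable pin, 437.5 ns truncated");
static_assert(ticksToNanos(17) == 1062, "read/modify/write pin, about 1 us");
static_assert(ticksToNanos(112) == 7000, "pin not known at compile time");

void checkTickConversions() {
    assert(ticksToNanos(16) == 1000);  // 16 ticks at 16 MHz is exactly 1 us
}
```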
Advantages
Disadvantages
Tested with gcc 4.7 (Linux) and gcc 4.3.2 (Windows).
What do you think? Please let me know if there's anything else these changes need or could use. :)