Optimize AES-CTR encryption/decryption #20

Closed
kixelated
Contributor

Much of the per-packet time is spent in `cipher.NewCTR`, which is called
for every packet. Each call allocates ~600 bytes on the heap, which is
excessive for encrypting 1000 bytes.

I added a `Reset` function that allows the cipher to be reused across
packets. I also sped up the CTR cipher by taking advantage of the fixed
`aes.BlockSize`.

Unfortunately, forking the Go CTR implementation requires `xorBytes`,
which is not exported. There is an open Go issue that will hopefully
export it, since the function is generally useful. Until then, this PR
has to carry a lot of copy-pasted assembly for the fast byte-XOR routines.

golang/go#30553

```
name                 old time/op   new time/op   delta
EncryptRTP-8          3.66µs ± 6%   3.36µs ± 6%   -8.04%  (p=0.001 n=10+10)
EncryptRTPInPlace-8   3.38µs ± 8%   3.13µs ± 5%   -7.33%  (p=0.000 n=10+10)
DecryptRTP-8          3.69µs ± 7%   3.37µs ± 8%   -8.80%  (p=0.000 n=10+10)
Write-8               3.80µs ± 9%   3.45µs ± 5%   -9.33%  (p=0.000 n=10+10)
WriteRTP-8            3.72µs ± 7%   3.46µs ± 8%   -6.96%  (p=0.005 n=10+10)

name                 old speed     new speed     delta
EncryptRTP-8         277MB/s ± 6%  301MB/s ± 6%   +8.76%  (p=0.001 n=10+10)
EncryptRTPInPlace-8  300MB/s ± 7%  323MB/s ± 5%   +7.85%  (p=0.000 n=10+10)
DecryptRTP-8         277MB/s ± 7%  304MB/s ± 7%   +9.68%  (p=0.000 n=10+10)
Write-8              266MB/s ± 8%  294MB/s ± 5%  +10.24%  (p=0.000 n=10+10)
WriteRTP-8           272MB/s ± 7%  293MB/s ± 8%   +7.61%  (p=0.005 n=10+10)
```
kixelated requested a review from Sean-Der on March 23, 2019 00:34
backkem added the review label on Mar 23, 2019
@kixelated
Contributor Author

Let me know if you would prefer that I just fork pions/srtp instead of this approach.

@kixelated
Contributor Author

Kind of an overkill approach.

kixelated closed this on Mar 29, 2019
kixelated deleted the optimize-new-ctr branch on March 29, 2019 17:51
backkem removed the review label on Mar 29, 2019