Skip to content

Add Base64 safe #13672

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Mar 20, 2020
Merged

Add Base64 safe #13672

merged 4 commits into from
Mar 20, 2020

Conversation

juancarlospaco
Copy link
Collaborator

  • base64 adds URL-Safe Base64, implements RFC-4648 Section-7.
  • Allows to use Base64 on URL and file paths.
  • Documentation with links and examples. Changelog. Test.

@Clyybber
Copy link
Contributor

Clyybber commented Mar 17, 2020

No advantage over just replacing those chars yourself IMO.
EDIT: To clarify: Just do .replace('+', '-').replace('/', '_')

@Araq
Copy link
Member

Araq commented Mar 17, 2020

Can we get some benchmark numbers? It's some overhead.

@juancarlospaco
Copy link
Collaborator Author

@Clyybber
You are iterating/processing/scanning 93240504396054 chars to change the last one at run-time.
Instead of just generating it properly as the RFC-4648 says, even more on slow target like JavaScript.
🙂

## **See also:**
## * `encode proc<#encode,string>`_ for encoding a string
## * `decode proc<#decode,string>`_ for decoding a string
runnableExamples:
assert encode(['n', 'i', 'm']) == "bmlt"
assert encode(@['n', 'i', 'm']) == "bmlt"
assert encode([1, 2, 3, 4, 5]) == "AQIDBAU="
encodeInternal(s)
encodeInternal(s, if safe: cb64safe else: cb64)
Copy link
Member

@timotheecour timotheecour Mar 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is if safe: cb64safe else: cb64 guaranteed to evaluate only once (here + in other overload) ?
the template will expand as:

while ...:
  result[outputIndex] = (if safe: cb64safe else: cb64)[x and 63]

instead, I'd simply not make that assumption and pass a constant to it

EDIT: my suspicion was correct, see comment below

@timotheecour
Copy link
Member

timotheecour commented Mar 19, 2020

Can we get some benchmark numbers? It's some overhead.

  • current PR has exact same speed as with .replace('+', '-').replace('/', '_')
  • with modification I hinted at [1], it gives a 4X speedup

nim c -r -d:case2 -d:case_workaround -d:danger $timn_D/tests/nim/all/t10369.nim
2.581079
nim c -r -d:case2 -d:case_safe -d:danger $timn_D/tests/nim/all/t10369.nim
0.6306889999999999

when defined case2:
  import base64
  import random
  import times
  import strutils
  proc main()=
    let n = 10000000
    let m = 100
    var s = newString(n)
    for i in 0..<n:
      s[i]=cast[char](rand(uint8.high.int))
    let t = cpuTime()
    for j in 0..<m:
      when defined case_safe:
        let s2 = encode(s, safe = true)
      elif defined case_workaround:
        let s2 = encode(s, safe = false).replace('+', '-').replace('/', '_')
        doAssert s2[^1] != '\0' # prevent optimize out
      else:
        static: doAssert false
    echo cpuTime()-t
    # echo s2
  main()

[1]

modified: lib/pure/base64.nim
@ lib/pure/base64.nim:160 @ proc encode*(s: string, safe = false): string =
  ## * `decode proc<#decode,string>`_ for decoding a string
  runnableExamples:
    assert encode("Hello World") == "SGVsbG8gV29ybGQ="
- encodeInternal(s, if safe: cb64safe else: cb64)
+ if safe: encodeInternal(s, cb64safe)
+ else: encodeInternal(s, cb64)

but I'd argue that, even with that 4X speedup, this PR would still be worth it because it implements a standard and we shouldn't let users re-invent the wheel

note:

the patch needs to be applied to the other encode overload as well

## Encodes `s` into base64 representation.
##
## This procedure encodes an openarray (array or sequence) of either integers
## or characters.
##
## If ``safe`` is ``true`` then it will encode using the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit (doesn't have to be done in this PR): single ticks now works (simpler == better)

@Araq Araq merged commit 70d9363 into nim-lang:devel Mar 20, 2020
@juancarlospaco juancarlospaco deleted the base64safe branch March 20, 2020 09:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants