Skip to content

Optimize _key to prevent string allocation when formatting Symbols #593

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

moberegger
Copy link

A lot of memory is allocated in jbuilder's _key method. Doing something as simple as json.set! :key, 'whatever' allocates a string for the key. Since this runs for every json property, the memory allocation adds up. This can result in more frequent and longer garbage collection cycles, which adds latency.

For some context, waaaay back in 20212 there was a PR filed titled "Substantial performance improvements". One of the big pieces here were optimizations to key formatting. The solution introduced was a caching mechanism that would only format a key (often a symbol) to a string a single time. The default formatter that would get applied would simply be a to_s, so a given key like :foo would become 'foo' and that would be cached.

I suppose this was deemed overkill for simple key value pairs, and thus another PR was filed to make the key formatter optional, and instead just return the key as a string if no key formatter was explicitly provided. On the surface this makes sense, but this is where the problem was introduced: this causes a new string allocation for each key. That PR provides some benchmarks that look promising, but the benchmark didn't account for the additional memory allocations (which can offset any performance gains due to GC cycles) and didn't benchmark against something that would actually yield cache hits - the code under bench ran a single time, and even if it didn't, sample would randomly select 1 of 100 keys which are unlikely to be cached.

Regardless, there is an easy way to keep the key formatter optional without introducing extra memory overhead by using Symbol#name instead of Symbol#to_s. The former results in an interned string so that same string object is used, preventing extra memory allocation.

[dev-local] irb(main)> :foo.name.object_id
26704
[dev-local] irb(main)> :foo.name.object_id
26704
[dev-local] irb(main)> :foo.to_s.object_id
32712
[dev-local] irb(main)> :foo.to_s.object_id
33224

While keys are very likely to be Symbols since that is what method_missing provides when doing something like json.name :foo, to maintain backwards compatibility the call to .name will only be attempted if the key is in fact a String; it's possible there are things in the wild that don't use symbols as keys (ex: json.set! "foo", :bar).

Comparing the _key method directly, you not only see the extra allocation saved but also a performance improvement.

ruby 3.4.4 (2025-05-14 revision a38531fd3f) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
              before   352.461k i/100ms
               after   399.830k i/100ms
Calculating -------------------------------------
              before      3.753M (± 6.1%) i/s  (266.43 ns/i) -     18.680M in   5.003777s
               after      4.316M (± 4.1%) i/s  (231.70 ns/i) -     21.591M in   5.014181s

Comparison:
               after:  4315862.1 i/s
              before:  3753327.2 i/s - 1.15x  slower
Calculating -------------------------------------
              before   120.000  memsize (     0.000  retained)
                         3.000  objects (     0.000  retained)
                         1.000  strings (     0.000  retained)
               after    80.000  memsize (     0.000  retained)
                         2.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)

Comparison:
               after:         80 allocated
              before:        120 allocated - 1.50x more

This performance gain is also noticed when using the jbuilder DSL as well. For example

json = Jbuilder.new
name = 'John Doe'

Benchmark.ips do |x|
  x.report('before') { json.set! :name, name }
  x.report('after') { json.set! :name, name }

  x.hold! 'ips_key'
  x.compare!
end
ruby 3.4.4 (2025-05-14 revision a38531fd3f) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
              before   475.059k i/100ms
               after   575.917k i/100ms
Calculating -------------------------------------
              before      5.297M (± 1.1%) i/s  (188.80 ns/i) -     26.603M in   5.023404s
               after      6.592M (± 0.9%) i/s  (151.69 ns/i) -     33.403M in   5.067424s

Comparison:
               after:  6592290.6 i/s
              before:  5296510.0 i/s - 1.24x  slower

@moberegger moberegger changed the title Prevent memory allocation on symbolized keys Optimize _key to prevent string allocation when formatting Symbols May 28, 2025
@moberegger
Copy link
Author

@rafaelfranca We've observed massive savings in memory allocation from this change alone. We're talking upwards of hundreds of thousands of strings allocations per minute!

Figured I'd let you know in case you needed some convincing... and also to show off a bit 😆.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant