Use non-string symbols internally? #883

dcodeIO · 2019-10-02T01:19:25Z

For a while now I've been thinking about replacing the string names we use internally with something that is faster to look up in a hash map. I'm pretty sure other compilers do this for various reasons, but after looking into it I doubt that it would speed up compilation enough to justify such an extensive change. For instance, we currently look up names by doing a `Map#has` (hashing the string each time), but if we used non-string symbols, we'd still have to do that hash lookup when parsing to turn a string into a non-string symbol, leading to about the same overhead. Also, my first reflex was to use JS symbols for this, which has the drawback that the compiler would have to build a map of all the internal symbols on load, potentially degrading start-up time, while strings are just static data.
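To make the trade-off concrete, here is a minimal sketch of string interning, assuming small integer ids as the non-string symbol. The `Interner` class and its methods are hypothetical and not part of the compiler. It shows that the string still has to be hashed once in `intern`, which is where the parser would pay the cost; only later lookups by id avoid it.

```typescript
// Hypothetical interner: maps each distinct name to a small integer id.
class Interner {
  private ids = new Map<string, number>();
  private names: string[] = [];

  // The string is hashed here, once per occurrence in the source text.
  intern(name: string): number {
    let id = this.ids.get(name);
    if (id === undefined) {
      id = this.names.length;
      this.ids.set(name, id);
      this.names.push(name);
    }
    return id;
  }

  // Reverse lookup by id is a plain array access, no hashing involved.
  nameOf(id: number): string {
    return this.names[id];
  }
}

const interner = new Interner();
const fooId = interner.intern("foo");
const barId = interner.intern("bar");
const fooAgain = interner.intern("foo"); // same id as fooId
```

Whether this wins anything depends on how often a name is looked up again after it has been interned, which is the open question here.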

Pinning this here for future reference, or if someone has an idea or experience with actual benefits.

willemneal · 2019-10-02T14:37:27Z

I think it's fine to create a symbols abstraction that has a parsing input and a lookup table. This way we can start with the current hashing, and since both approaches share the idea of building the map first, we could swap in the more efficient solution later. In the future, I see the compiler as more of a server that caches everything it needs, in which case the initial overhead of lookup tables becomes worth it. But personally, I too doubt that the work is worth it at the current stage. Features are more important than performance. (That's not to say that the performance aspects of features shouldn't be considered.)
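A rough sketch of what such an abstraction might look like, with all names (`SymbolTable`, `StringSymbols`, `Scope`) made up for illustration. The first implementation simply passes strings through, so behaviour matches the current hashing, while callers only depend on the interface.

```typescript
// Hypothetical abstraction: S is whatever the symbol representation is.
interface SymbolTable<S> {
  intern(name: string): S;
  nameOf(symbol: S): string;
}

// Starting point: symbols are just the strings themselves.
class StringSymbols implements SymbolTable<string> {
  intern(name: string): string {
    return name;
  }
  nameOf(symbol: string): string {
    return symbol;
  }
}

// Example consumer that never touches the symbol representation directly.
class Scope<S> {
  private members = new Map<S, string>();
  private table: SymbolTable<S>;

  constructor(table: SymbolTable<S>) {
    this.table = table;
  }

  define(name: string, kind: string): void {
    this.members.set(this.table.intern(name), kind);
  }

  lookup(name: string): string | undefined {
    return this.members.get(this.table.intern(name));
  }
}

const scope = new Scope(new StringSymbols());
scope.define("foo", "function");
```

Swapping in a different representation later would then mean providing another `SymbolTable` implementation without changing `Scope`.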

MaxGraey · 2019-10-02T15:53:39Z

Instead of a simple hash table, it may be better to use an LRU cache, which allows O(1) lookup and O(1) insert/delete.
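For reference, a minimal LRU cache sketch that relies on the insertion order of a JS `Map`. The `LruCache` class is hypothetical, and it assumes a capacity of at least 1. Note that an LRU cache evicts entries, so it could only sit in front of a complete symbol table, not replace it.

```typescript
// Hypothetical LRU cache; a Map iterates keys in insertion order,
// so the first key is always the least recently used one.
class LruCache<K, V> {
  private entries = new Map<K, V>();
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity; // assumed to be >= 1
  }

  // Membership test that does not change recency.
  has(key: K): boolean {
    return this.entries.has(key);
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) {
      this.entries.delete(key);
    } else if (this.entries.size >= this.capacity) {
      // Evict the least recently used entry.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
    this.entries.set(key, value);
  }
}

const cache = new LruCache<string, number>(2);
cache.set("a", 1);
cache.set("b", 2);
cache.get("a");    // "a" is now the most recently used
cache.set("c", 3); // evicts "b"
```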

MaxGraey · 2019-12-14T23:48:11Z

It would also be interesting to try adopting qp-tries (faster than crit-bit tries) or a double-array trie for the symbol table. @jedisct1 I know you have a Rust implementation of qp-trie, what do you think?

jedisct1 · 2019-12-16T11:33:18Z

qp-tries are GREAT!

But for them to be fast, the compiled code has to be friendly to cache prefetching. It works pretty well in C and Rust, but I don't know how an AS version would perform. There's one way to know :)

dcodeIO · 2020-05-28T18:15:05Z

Closing this issue as part of the 2020 vacuum because it is not absolutely crucial at this point in time. My expectation is that there will be new opportunities once compiling the compiler to WebAssembly becomes the default, like interning strings.

stale bot added the stale label Nov 1, 2019

MaxGraey added the enhancement label Nov 1, 2019

stale bot removed the stale label Nov 1, 2019

AssemblyScript deleted a comment from stale bot Nov 1, 2019

MaxGraey added the performance label Mar 4, 2020

dcodeIO closed this as completed May 28, 2020
