Skip to content

Use non-string symbols internally? #883

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
dcodeIO opened this issue Oct 2, 2019 · 5 comments
Closed

Use non-string symbols internally? #883

dcodeIO opened this issue Oct 2, 2019 · 5 comments

Comments

@dcodeIO
Copy link
Member

dcodeIO commented Oct 2, 2019

For a while now I've been thinking about switching the string names we have internally with something that is faster to look up in a hash map. Pretty sure that other compilers do this for various reasons, but after looking through it I have my doubts that it will speed-up compilation significantly, justifying such an extensive change. For instance, currently we are looking up names in code by doing a Map#has (each time hashing the string), but if we'd use non-string symbols, we'd have to do that hash lookup when parsing to make a non-string symbol of a string anyway, leading to about the same overhead. Also, my first reflex was to utilize JS symbols for this, which has the drawback that the compiler would have to build a map of all the internal symbols on load, potentially degrading start-up time, while strings are just static data.

Pinning this here for future reference, or if someone has an idea or experience with actual benefits.

@willemneal
Copy link
Contributor

I think that it's fine to create a symbols abstraction that has a parsing input and lookup table. This way we can start with the current hashing. But since we have the same idea of building the map first, we could swap out later for the more efficient solution. In the future, I see the compiler as more of a server that will cache all of the things it needs to. So in the case of lookup tables, their initial overhead becomes worth it. But personally, I too doubt that the work at the current stage is worth it. Features are more important than performance. (not to say that the performance aspects of the features shouldn't be part of considered.

@MaxGraey
Copy link
Member

MaxGraey commented Oct 2, 2019

Instead simple hashtable may be better use LRU cache which allow O(1) lookup and O(1) insert/delete

@stale stale bot added the stale label Nov 1, 2019
@stale stale bot removed the stale label Nov 1, 2019
@AssemblyScript AssemblyScript deleted a comment from stale bot Nov 1, 2019
@MaxGraey
Copy link
Member

Also will be interesting try to adopt qp-tries (faster than crit-bit tries) or double-array-trie for symbolic table. @jedisct1 I know you have Rust's implementation of qp-trie what do you think?

@jedisct1
Copy link

qp-tries are GREAT!

But them being fast requires the compiled code to be friendly to cache prefetch. It works pretty well in C and Rust, but I don't know how an AS version would perform. There's one way to know :)

@dcodeIO
Copy link
Member Author

dcodeIO commented May 28, 2020

Closing this issue as part of 2020 vacuum because it is not absolutely crucial at this point in time. My expectation is that there will be new opportunities once compiling the compiler to WebAssembly becomes the default, like internalizing strings.

@dcodeIO dcodeIO closed this as completed May 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants