Skip to content

Commit ad37c0b

Browse files
committed
auto merge of #12740 : nical/rust/json-streaming, r=erickt
Hi rust enthusiasts, With this patch I propose to add a "streaming" API to the existing json parser in libserialize. By "streaming" I mean a parser that let you act on JsonEvents that are generated as while parsing happens, as opposed to parsing the entire source, generating a big data structure and working with this data structure. I think both approaches have their pros and cons so this pull request adds the streaming API, preserving the existing one. The streaming API is simple: It consist into an Iterator<JsonEvent> that consumes an Iterator<char>. JsonEvent is an enum with values such as NumberValue(f64), BeginList, EndList, BeginObject, etc. The user would ideally use the API as follows: ``` for evt in StreamingParser::new(src) { match evt { BeginList => { // ... } // ... } } ``` The iterator provides a stack() method returning a slice of StackNodes which represent "where we currently are" in the logical structure of the json stream (for instance at "foo.bar[3].x" you get [ Key("foo"), Key("bar"), Index(3), Key("x") ].) I wrote "ideally" above because the current way rust expands for loops, you can't call the stack() method because the iterator is already borrowed. So for know you need to manually advance the iterator in the loop. I hope this is something we can cope with, until for loops are better integrated with the compiler. Streaming parsers are useful when you want to read from a json stream, generate a custom data structure and you know how the json is going to be structured. For example, imagine you have to parse a 3D mesh file represented in the json format. In this case you probably expect to have large arrays of vertices and using the generic parser will be very inefficient because it will create a big list of all these vertices, which you will copy into a contiguous array afterwards (so you end up doing a lot of small allocations, parsing the json once and parsing the data structure afterwards). With a streaming parser, you can add the vertices to a contiguous array as they come in without paying the cost of creating the intermediate Json data structure. You have much fewer allocations since you write directly in the final data structure and you can be smart in how you will pre-allocate it. I added added this directly into serialize::json rather than in its own library because it turns out I can reuse most of the existing code whereas maintaining a separate library (which I did originally) forces me to duplicate this code. I wrote this trying to minimize the size of the patch so there may be places where the code could be nicer at the expenses of more changes (let me know what you prefer). This is my first (potential) contribution to rust, so please let me know if I am doing something wrong (maybe I should have first introduced this proposition in the mailing list, or opened a github issue, etc.?). I work a few meters away from @pknfelix so I am not too hard to find :)
2 parents f77784b + 02c45de commit ad37c0b

File tree

3 files changed

+1054
-266
lines changed

3 files changed

+1054
-266
lines changed

0 commit comments

Comments
 (0)