Add loggit with example app, instrument redux for profiling, and add initial profiling data #1
Conversation
Commits:

- Broke all of Redux
- Broke all of Redux more
- At a good checkpoint, basic logging working
- Done with hack quality for now, halfway started GC
- Working, debugging compaction since markTodo isn't descriptive enough
- Done with tinkering
- In progress with HOC but stopping
- Revert "In progress with HOC but stopping" (this reverts commit 672e5fe)
- Moving TodoItem state to log
- Trying to disable hot loading, tweaking some profiling
- Adding #start to interface for renderer, adding RafReactRenderer
- Initial progress on MemoizingOptimizer
- With memoizing snapshot optimizer, and initial analysis for test set 1
- Some cleanup, adding more graphs
- comment cleanup
- Added PrecomputeReactRenderer, split out RenderTimer
- Tons of cleanup, profiling APIs, shell stitching pieces together
- Adding results for test2
- Link to test2 results
- Rename to loggit-todomvc
Oh also, just to clarify, this is building from some awesome code in Redux, but the pull request is just to this local repo and not intended to be pulled into the proper redux repo. :)
Is this similar to https://github.com/faassen/reselect?
@gaearon Yeah I think so, although that's just from looking at the description. I hadn't seen that, but I just saw you tweet about it and bookmarked it now. :)

One difference is that the idea here is that there's no first-class "state" that's more truthful than anything else computed, since it's all computed from the log.
The leak worries me. Something to do with Redux per se?
@gaearon I'm not sure; I don't know more than what I posted there. I looked a bit earlier when I was in the air and couldn't track it down. I didn't see anything obvious and couldn't figure out how to get more profiling detail (I couldn't track down what these were in the Profiles tab either). I tried running the todomvc and counter apps off redux master to see, but ran into setup problems trying to run both. These are probably just problems with my setup, but I'll open an issue on redux in case not.
Thanks @kevinrobinson for taking the time to work on this. I've been reading through this and I've jotted down a few thoughts.

I constantly keep wondering what assumptions we make about the server architecture if we model the application like this. Do we stream the entire log from the server to keep the application up to date? Do we fetch a snapshot or compacted log from the server for the initial bootstrapping of the application?

Another thought that came up as I was reading through the PR: should we establish a benchmark against the same application in redux to have a baseline to compare with?

As for how this relates to reselect, maybe we should clearly state the value proposition for modeling the application this way. To me, this could have a big impact on how we do collaborative editing and partial connectivity, which might not be as simple to achieve with a stateful approach.
@arnihermann Thanks for the thoughtful response, I really appreciate it. :) Great points all around.

First, in regards to server architecture: while there are similarities here to event-sourcing or log-based systems on the backend, the idea here isn't to stream all events from the server. The purpose is to record all information the UI knows about user actions, server requests/responses, and real-time pushes, from its own perspective. Even if the backend has a standard REST API that the UI is naively polling, there are still hard problems with optimistic updates and with server updates blowing away actions the user is in the middle of performing. Having the log on the UI side to describe this, regardless of interaction with the server, preserves the information needed to resolve this in different ways.

And in regards to the value proposition, yeah, I'll write up something expanding a bit on what I described in my talk and move this hacking to a proper repo. That's the really important part here: that this could help with features like collaborative editing and partial connectivity. Talking about how this idea compares to approaches on the backend, or about the mechanics of some of these optimizations, sometimes distracts from that, so I'll try to articulate it more clearly. Thanks!

Those are great questions about bootstrapping, etc., which I think are slightly decoupled; you could use any of those approaches. Using TodoMVC isn't the best example here, but I think grounding this in real use cases will help. For here, on a cold start I think you'd want to get a snapshot of server truth for the initial render. The mechanics of how to make requests to the server aren't even addressed here (this first cut was focused on getting a rough gauge of the computational efficiency). For the mechanics of server communication, I think there are two interesting paths from here: one is similar to what I described in the first part of my talk, where components sink data structures that describe what they need, and that's reconciled. The other is a more imperative design where, as facts are logged, they may also trigger side effects like requests. Experimenting to find an intuitive API is, I think, the real work here.

With profiling, my intention was to compare this to the same TodoMVC code in Redux, but this was the rough first cut. There are a few minor differences that I made before deciding to do this, but I think that's roughly what's here. Let me know if you think there are other ways to go about this, and especially how to structure the tests to make sure they're measuring something meaningful.

Really appreciate the thoughts, thank you!
This is what makes me super interested. I have this exact scenario in mind for one of my systems that already uses logs on the server. I have a suspicion that having the log on the UI side makes it even more powerful if the server does it as well -- that reconciling will be easier than if the server is e.g. restful. Not sure about it though. I'm going to try out applying this idea to my problem domain and see how it affects the client application.
@arnihermann Sounds awesome! Yeah I'd love to hear more and see what you discover. :)

Moved this into its own repo, feel free to open questions, issues or feedback there! https://github.com/kevinrobinson/loggit
Overview
Inspiration struck flying home from Paris after React Europe. :) This is starting from @gaearon's awesome Redux as a base, and building an API for using a log as the primary data structure. There are some comments in the code and some rambling unedited explanation here, but this is super rough and not really polished as a proper presentation or blog post. I'll write something proper about this, but wanted to share since some folks were interested and I would love some help reviewing or helping shake this down. :)
This set of work had two goals: sketching an API that uses a log as the primary data structure in a small example app, and instrumenting it (and redux) to gather some initial profiling data.
What is it:
As data flows through, it's appended to an immutable log. React components describe what computations they want run on that log in a `computations` method. They can then ask the library to perform those computations directly in the render method. The library provides a `loggit` object to components as the API, which includes:

- `recordFact`: takes a fact about something that happened, an `Action` since this is using Redux as a starting point
- `computeFor`: the way components ask for computation to be performed. They need to pass a reference to themselves (the React component), and need to implement a `computations` method that returns a map of `Computation` objects.

Reducers are the only computations used at this point. `Computation` objects are like plain reducer functions, but they explicitly have the shape `{initial, reducer}`, rather than having the initial value hidden inside the reducer. This is so that even if there are no facts, the return value of the `Computation` still has the same shape, and is not undefined.

You can see a slideshow with notes walking through whiteboarding the overall idea here: https://t.co/CJSlYMaFWs
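To make the API shape concrete, here's a rough sketch of what a component using it might look like. This isn't code from the repo; the exact return value of `computeFor` and the fields on the fact are assumptions made for illustration.

```js
// A rough sketch (not code from the repo) of a component using the loggit API
// described above. `recordFact`, `computeFor`, `computations`, and the
// {initial, reducer} Computation shape come from the description; everything
// else is illustrative.
import React from 'react';

class TodoCount extends React.Component {
  // Each Computation carries its initial value explicitly, so the result
  // has the right shape even when the log has no facts yet.
  computations() {
    return {
      todoCount: {
        initial: 0,
        reducer: (count, fact) => (fact.type === 'ADD_TODO' ? count + 1 : count)
      }
    };
  }

  handleAdd(text) {
    // Facts are Redux-style Actions appended to the immutable log.
    this.props.loggit.recordFact({ type: 'ADD_TODO', text });
  }

  render() {
    // Ask the library to run this component's computations over the log.
    const { todoCount } = this.props.loggit.computeFor(this);
    return <span>{todoCount} todos</span>;
  }
}
```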
Starting points to read code
- `redux/examples/loggit-todomvc/components/MainSection.js` (line 45 in 41a7e65)
- Where the `loggit` API is created for components to use: `redux/examples/loggit-todomvc/loggit/shell.js` (line 19 in 41a7e65)
- The memoizing snapshot optimizer: https://github.com/kevinrobinson/redux/blob/41a7e65cdccfc850583a2bb502a66a880d5e9a5f/examples/loggit-todomvc/loggit/optimizers/memoizing_snapshot_optimizer.js
Test case
There's some data from an initial test case I worked on, but most of the data is from what I called test case #2. The test case starts with 1187 actions already committed to the log (or run through the data layer for redux). Then it starts the proper app, and for 3 seconds performs a randomly selected action every 10ms, including new todos with random text. This yields around 1382 actions. The measure of the test is the time taken in the `compute` and `render` portions of the app, based on just grabbing timings from performance.now. All profiling is done on my relatively new MacBook Pro, but I'd love to figure out how to profile easily on a phone. There's a variation on the same test case that just runs for 30 seconds instead, which yields around 3021 actions.

For the test cases, I tried swapping a few different pieces in and out. These are mostly different optimizers, which improve the efficiency of performing computation, and renderers, which improve the efficiency of updating the DOM. There's also some initial data about a compaction strategy for bounding the size of the log.
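For a concrete picture of the test's shape, here's a rough sketch of that loop. It isn't the actual harness in the repo; `recordFact` and `randomAction` are stand-ins, and the real harness times the compute and render portions separately.

```js
// A rough sketch of the test-case shape described above: fire a randomly
// selected action every 10ms for a fixed duration, and accumulate the
// synchronous per-action cost with performance.now().
function runTestCase(recordFact, randomAction, durationMs) {
  const timings = [];
  const interval = setInterval(() => {
    const start = performance.now();
    recordFact(randomAction()); // e.g. add a todo with random text, check one, delete one, ...
    timings.push(performance.now() - start);
  }, 10);

  setTimeout(() => {
    clearInterval(interval);
    const total = timings.reduce((sum, t) => sum + t, 0);
    console.log(`actions: ${timings.length}, total ms: ${total.toFixed(1)}`);
  }, durationMs);
}

// 3-second variant; the 30-second variant just passes 30000.
// runTestCase(loggit.recordFact, randomAction, 3000);
```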
Renderers
There are a few different renderers, which are notified when facts are appended to the log and can respond in some way to update the view. These are:

- `NaiveReactRenderer`: after each fact, it calls `React.render` and does a top-down render.
- `RafReactRenderer`: runs a requestAnimationFrame loop, which will batch up multiple facts between frames.
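As a rough illustration of the batching idea (not the repo's implementation), a renderer in the `RafReactRenderer` style might look something like this:

```js
// A minimal sketch in the spirit of RafReactRenderer: facts mark the renderer
// dirty, and a requestAnimationFrame loop does at most one top-down render per
// frame, batching however many facts arrived in between.
import React from 'react';

class SketchRafRenderer {
  constructor(rootComponentClass, mountNode) {
    this.rootComponentClass = rootComponentClass;
    this.mountNode = mountNode;
    this.dirty = false;
  }

  // Kick off the animation-frame loop.
  start(log) {
    this.log = log;
    const tick = () => {
      if (this.dirty) {
        this.dirty = false;
        // React 0.13-era top-level render, matching the React.render call
        // mentioned for NaiveReactRenderer above.
        React.render(
          React.createElement(this.rootComponentClass, { log: this.log }),
          this.mountNode
        );
      }
      requestAnimationFrame(tick);
    };
    requestAnimationFrame(tick);
  }

  // Called whenever a fact is appended to the log.
  onFactAppended() {
    this.dirty = true;
  }
}
```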
Compute optimizers

There are also a few optimizers, which can optimize the computation instead of just reducing over the whole log each time. These are:

- `NoopOptimizer`: does nothing, it just reduces over the log each time.
- `MemoizingOptimizer`: memoizes some computations (with not very sensible semantics around bounding the cache).
- `MemoizingOptimizerV2`: same thing, just trying to be simpler about how the internals work and simpler about bounding the size of the cache.
- `MemoizingSnapshotOptimizer`: memoizes calls but also caches snapshots of reductions over the log, to start from when performing subsequent reduce operations. The cache here isn't bounded at the moment.
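The snapshot idea is roughly the following. This is only a sketch of the approach described above, not the repo's code, and it assumes the log is a plain array of facts:

```js
// A rough sketch of the snapshot idea behind MemoizingSnapshotOptimizer:
// remember the reduced value up to a given log index per computation, and only
// reduce over facts appended since then. As in the description above, nothing
// bounds this cache.
class SketchSnapshotOptimizer {
  constructor() {
    this.snapshots = new Map(); // computation key -> { index, value }
  }

  compute(key, computation, log) {
    const prior = this.snapshots.get(key) || { index: 0, value: computation.initial };
    // Reduce only the facts appended after the last snapshot.
    const value = log.slice(prior.index).reduce(computation.reducer, prior.value);
    this.snapshots.set(key, { index: log.length, value });
    return value;
  }
}
```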
PrecomputeReactRenderer

And finally, there's a `PrecomputeReactRenderer`, which is slightly more involved and experimental. It uses a seam in the loggit API to React components that lets it track which components need which computations. It can then use this information to be more efficient about updating the UI than just doing a top-down render. This involves reaching into React internals describing the component tree, but essentially it walks the tree, checking if the computations that the component needs have changed; if they haven't, then we don't need to perform the render/reconciliation step for that component. We do still need to check its children though, since data doesn't flow top-down anymore (it's as if it were side-loaded). This is similar to the discussion in facebook/react#3398 (comment) and what @josephsavona described in his awesome Relay talk at React Europe, although the implementation here is simple and naive.
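Conceptually, the tree walk looks something like the sketch below. `computationsNeededBy` and `childrenOf` are hypothetical stand-ins for the React-internals plumbing the real code reaches into; they are not real React APIs.

```js
// A conceptual sketch of the tree walk described above, not the actual
// PrecomputeReactRenderer.
function updateTree(component, changedComputationKeys) {
  const needed = computationsNeededBy(component);
  if (needed.some(key => changedComputationKeys.has(key))) {
    // Something this component depends on changed: render and reconcile this
    // subtree as usual.
    component.forceUpdate();
    return;
  }
  // This component's computations are unchanged, so skip its render, but still
  // check its children: data no longer flows strictly top-down, so a child may
  // depend on a computation that did change.
  childrenOf(component).forEach(child => updateTree(child, changedComputationKeys));
}
```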
Initial profiling

redux as baseline
3 seconds
This is the timeline for a single run with redux. Keep in mind there is noise and actual randomness in these tests and this is just a single run.

30 seconds

On a 30-second run of the same test case, this is what we see. It looks like some DOM nodes are leaking and unable to be GCed; I haven't figured out why yet. When I saw this initially, running on larger time scales would let these get GCed, but that wasn't the case when trying again now, so I'm not sure why they're still retained.

memosnap+raf
3 seconds
This is the timeline for a single "memosnap+raf" run:

30 seconds
The DOM node leak shows up here on the longer run as well, so it's definitely something to look into. You can see there's a lot more GC see-sawing than with redux, but I was surprised that the memory usage after a GC run isn't much higher.

memosnap+precompute
3 seconds
This is the timeline for the "memosnap" optimizer and the "precompute" renderer:

30 seconds

The DOM node leak is still there:

Comparing performance of different strategies, and with redux
For test case #2, I did seven runs of a few different configurations, and then graphed how they came out. I also added some instrumentation to redux to see how this compared to it as a baseline. I was a bit surprised with how the results came out, so I'll look over it again (which is why I'm not summarizing here), and I'd love more eyes on this. The graph showing the TLDR and the raw dataset are here: https://docs.google.com/a/twitter.com/spreadsheets/d/1_j-exUs3XjqjXh4Xa4D7nDspH5vaRlj9_dEKmB7iGCM/pubhtml.
Compaction
I wrote an initial key-based compaction strategy, just as a proof of concept for reclaiming memory space. It's only triggered manually for now, but you can see the effects here, after a 30-second run with `memosnap+raf` and then forcing a compaction.

I'm not quite sure how to interpret it. Performing the compaction didn't lock everything up, but the UI wasn't doing any other work at the time. From its perspective, 74% of the messages in the log were compacted away, but I'm not sure how to interpret the impact on memory shown in the timeline.

Next

For memory usage, it turns out I don't understand Chrome devtools well enough to figure this out. `performance.memory.usedJSHeapSize` wouldn't change during my tests (presumably it's allocating larger chunks and the test just didn't fill the buffer). More confusingly, the number shown in the Timeline view didn't match the number I'd see when taking a heap snapshot in the Profiles view. So I need to learn how to read those better, and if anyone knows how and can help, that would be awesome.

I also don't know what's going on with Chrome, and what work it's doing that is preventing everything from running at 60fps. It's that magic "outlined box" that, from what I remember, means Chrome is doing work but isn't able to introspect and tell you exactly what kind of work it is.

It'd be awesome to get feedback on how to improve the test case here. It succeeds in throwing work at the system, but it's not entirely realistic in its rate of updates and frequency of DOM changes.
It'd be interesting to see if using `setState` instead of `forceUpdate` in `PrecomputeReactRenderer` would allow for invalidating a component but then still short-circuiting rendering and reconciliation further down (versus forcing the entire render with `forceUpdate` now). This might work with loggit exposing a `shouldComponentUpdate` method and components implementing the hook and calling it. That might improve rendering performance further. And it'd be worth trying immutable data.
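As a purely hypothetical sketch of how that hook could look (none of this exists in the repo yet; the names are illustrative):

```js
// The idea above: loggit exposes a shouldComponentUpdate helper that knows
// whether any computation this component asked for has changed, and the
// component simply delegates to it.
import React from 'react';

class TodoItem extends React.Component {
  shouldComponentUpdate(nextProps, nextState) {
    // Let loggit short-circuit reconciliation for this subtree when none of
    // this component's computations changed.
    return this.props.loggit.shouldComponentUpdate(this, nextProps, nextState);
  }

  render() {
    const { todo } = this.props.loggit.computeFor(this);
    return <li>{todo.text}</li>;
  }
}
```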
Having said all that, I'd like to write up and share this to give some context for talking about performance (especially with some mobile profiling). But the real next work here is in API design for computations. I think this example shows well that reducers aren't an intuitive fit for modeling these kinds of computations (how `CHECK_TODO` works is a good example). A good thing to try here, which could also provide some efficiency gains, is supporting chained computation in the same style as nuclear-js. So in addition to `reducers` there could also be a `merge` computation with the shape `{inputs: {foo: computation, bar: computation}, merge}`, where the semantics are that each computation in the `inputs` set needs to be computed first, and then the `merge` fn is run with `{foo: value, bar: value}`. It can perform any arbitrary computation with those values and return the result. This would work well for this todos app, where it'd be simpler to collect todos as a map and then map that to a sorted list afterward. Or keep an index of sort order separately, or which todos are checked separately, and merge them into a higher-level entity. (There's a rough sketch of this shape after the next paragraph.)

Also, this doesn't really demonstrate the real value of this approach when it comes to server communication, since there's no server. :) So it would be a good improvement to add that here, and then add computations to perform the kinds of things I mentioned in the talk.
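Here's the rough sketch of the proposed `merge` shape mentioned above, using hypothetical todo computations as inputs. Nothing here exists in the repo yet; it only illustrates the semantics: each input is computed over the log first, then `merge` combines the resulting values.

```js
// Two plain {initial, reducer} computations used as inputs.
const todosById = {
  initial: {},
  reducer: (todos, fact) =>
    fact.type === 'ADD_TODO'
      ? Object.assign({}, todos, { [fact.id]: { id: fact.id, text: fact.text } })
      : todos
};

const sortOrder = {
  initial: [],
  reducer: (order, fact) =>
    fact.type === 'ADD_TODO' ? order.concat(fact.id) : order
};

// The proposed merge computation: inputs are computed first, then `merge` is
// called with their values, here producing a sorted list from the map.
const sortedTodos = {
  inputs: { todosById, sortOrder },
  merge: ({ todosById, sortOrder }) => sortOrder.map(id => todosById[id])
};
```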
I made a few changes in the initial hacking too, so this is slightly different from the Redux example; it'd be good to clarify that so it's an apples-to-apples comparison when using Redux as a baseline. And there are probably some straightforward optimizations, like using immutable data in the Redux app, that would be good to look at.
Conclusion
I'd love to hear any feedback folks have. This is early work but I think it demonstrates some of the ideas fairly well and can serve as a useful communication point if other people find it interesting or useful. I'll write up some explanations of smaller pieces of this that are short blog-post sized and would be more amenable to consumption. :)
Thanks!
-Kevin
https://twitter.com/krob