21-Reinforcement-Learning #62
I start with the Passive-ADP-Agent (Fig 21.2) and I want to visualize it evaluating a fixed policy in the 3x4 world (Fig 21.1). I have decided to use D3.js to visualize the environment. I want to use the Python code as a base for my implementation. |
Our original plan was to stick to JavaScript, as we wanted to make the whole thing statically hostable. I will wait for @redblobgames for any revisions on that. |
@alireza-a Sounds like a good start. We have been using two.js for visualization, so before you use d3.js, check whether two.js works (to stay consistent with the existing code base). Translating the Python code to JavaScript seems like a good approach. |
Sure, I will only depend on two.js. I ported MDP, GridWorld, and PassiveADPAgent (only the parts that I need) from the Python implementation to JS. I could not find anything about testing in the repo. Do we rely only on the visualizations since it's a static site? If so, should I make a pull request only after the visualizations have been added? |
Testing would be nice. I'm not sure how to best approach it. Yes, we could have automated tests for the algorithms, but the main focus of the work is the visualizations, and those we'll have to test by asking people to try it out (user testing instead of unit testing). Yes, make a pull request after you've made a visualization or two. For some visualizations you may have to write the algorithm in a non-standard way to make the visualization work. A lot of this project will be experimental. Implementing the algorithms is work but it's work we know how to do. Interactive/animated visualizations for algorithms is something where we don't know which visualizations will be good, so we may have to make some and throw some away. Some visualizations will be for the problems, some will show concepts, and other visualizations will show the solutions (algorithms). Some visualizations can ask the reader to interact; look at this example. Some visualizations may be animated without the reader providing much input; look at these examples. We'll figure this out by experimenting :-) and I think it may be different in each chapter. |
So far, I have ported the MDP, GridWorld, and Policy Evaluation from the Python implementation, and I have visualized Policy Evaluation on the GridWorld with a fixed policy (Fig 21.1). Here is the Policy Evaluation Demo. The MDP and GridWorld are shared between chapters 17 and 21. Would this mean a reimplementation of MDP and GridWorld by whoever takes on chapter 17, or do you prefer a single implementation shared between the two chapters? |
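For readers following along, here is a minimal sketch of what policy evaluation on this grid could look like in plain JavaScript. It is not the code from the demo. The layout (wall at `1,1`, terminals at `3,2` and `3,1`), the step reward of -0.04, the 0.8/0.1/0.1 motion model, and the fixed policy are assumptions taken from the textbook's 4x3 world, and every name (`POLICY`, `transitions`, `evaluatePolicy`) is hypothetical.

```javascript
// Assumed textbook 4x3 world; states are "x,y" strings with y = 0 at the bottom.
const COLS = 4, ROWS = 3;
const WALL = '1,1';
const TERMINALS = { '3,2': 1, '3,1': -1 }; // terminal utilities
const STEP_REWARD = -0.04;                 // reward in every non-terminal state
const GAMMA = 1;                           // no discounting

const MOVES = { up: [0, 1], down: [0, -1], left: [-1, 0], right: [1, 0] };
const SIDEWAYS = {
  up: ['left', 'right'], down: ['left', 'right'],
  left: ['up', 'down'], right: ['up', 'down'],
};

// Assumed fixed policy to evaluate (one action per non-terminal state).
const POLICY = {
  '0,2': 'right', '1,2': 'right', '2,2': 'right',
  '0,1': 'up', '2,1': 'up',
  '0,0': 'up', '1,0': 'left', '2,0': 'left', '3,0': 'left',
};

// Deterministic move; bumping into the wall or the border leaves the state unchanged.
function move(state, action) {
  const [x, y] = state.split(',').map(Number);
  const [dx, dy] = MOVES[action];
  const nx = x + dx, ny = y + dy;
  const blocked = nx < 0 || nx >= COLS || ny < 0 || ny >= ROWS || `${nx},${ny}` === WALL;
  return blocked ? state : `${nx},${ny}`;
}

// 0.8 in the intended direction, 0.1 for each perpendicular direction.
function transitions(state, action) {
  const [a, b] = SIDEWAYS[action];
  return [[0.8, move(state, action)], [0.1, move(state, a)], [0.1, move(state, b)]];
}

// Repeated in-place Bellman updates for the fixed policy until values stop changing.
function evaluatePolicy(policy, tolerance = 1e-9, maxSweeps = 10000) {
  const U = { ...TERMINALS };
  for (const s of Object.keys(policy)) U[s] = 0;
  for (let sweep = 0; sweep < maxSweeps; sweep++) {
    let delta = 0;
    for (const s of Object.keys(policy)) {
      let expected = 0;
      for (const [p, next] of transitions(s, policy[s])) expected += p * U[next];
      const updated = STEP_REWARD + GAMMA * expected;
      delta = Math.max(delta, Math.abs(updated - U[s]));
      U[s] = updated;
    }
    if (delta < tolerance) break;
  }
  return U;
}

const utilities = evaluatePolicy(POLICY);
```

Keeping the grid, policy, and update loop as plain data and functions like this would also make it easy to share between chapters 17 and 21 later, if that turns out to be useful.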
@alireza-a cool! The main goal is the visualizations, so feel free to reuse code between chapters if that works better, or reimplement if that works better. With the experimental nature of this project, sometimes it's easier to have two copies of the code while rapidly iterating on the visualizations, and then go back and merge them after we figure out which visualizations worked well. |
My next steps:
|
State Value Graph Demo (comments & questions)
@redblobgames is this close to what you had in mind? |
@alireza-a pretty cool to see the convergence! It's not quite what I had in mind but it works :-)
I also wonder about how to show the individual steps. Right now you tap to see one number improve at a time, but it's unclear how that estimate is actually improving. This might be something that we can think about once we get to the later algorithms — we can show how the algorithms differ. Or maybe the individual steps aren't what we want to focus on, and there's some other aspect of the algorithm we want the reader to focus on. |
My original implementation used animations. To show what happened at every step, I slowed down the animations. That made convergence take far too long, so I switched to tapping. I will try a slider this time. Also, I will be preoccupied with final exams until April 23rd. |
Ah, makes sense. A slider can either control animation speed or the simulation time.
|
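A rough sketch of the two slider options, assuming the state values are kept in a plain object. The names `createTimeline` and `stepDelayMs` are made up for illustration and do not come from the demo.

```javascript
// Option 1: the slider controls simulation time.
// Record a snapshot after every update, then let the slider scrub through them.
function createTimeline(initialValues) {
  const snapshots = [{ ...initialValues }];
  return {
    // Store a copy so later updates do not mutate recorded history.
    record(values) { snapshots.push({ ...values }); },
    get length() { return snapshots.length; },
    // Map a slider position in [0, 1] to the nearest recorded snapshot.
    at(fraction) {
      const clamped = Math.min(1, Math.max(0, fraction));
      const index = Math.round(clamped * (snapshots.length - 1));
      return snapshots[index];
    },
  };
}

// Option 2: the slider controls animation speed.
// A speed of 1 uses the base delay; a speed of 4 runs four times faster.
function stepDelayMs(speed, baseDelayMs = 800) {
  return baseDelayMs / Math.max(speed, 0.1);
}

const timeline = createTimeline({ '0,0': 0 });
timeline.record({ '0,0': 0.5 });
timeline.record({ '0,0': 0.8 });
const midway = timeline.at(0.5); // snapshot in the middle of the run
```

Scrubbing through recorded snapshots lets the reader move backwards as well as forwards, which a pure speed control does not, at the cost of keeping the whole history in memory.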
@redblobgames I like the progress bar, start/pause, and next/back interface you used in your A* search article. I will create a similar interface. |
I reimplemented all the visualizations with D3.js and created a similar interface to the A* search article. Here is a demo for what I have so far.
Is there anything else I should try before moving on to the next agent? |
Looks cool! The diagram right now shows how states are evaluated and values updated. Can you put something in there describing what the colors mean? What is dark green vs light green? BTW if you want arrows in d3, I have some code (adapted from the SVG documentation) to produce an arrowhead marker:
Once you create the marker (just once on the page) you can attach the marker to any

I think the main thing before moving on to the next agent/diagram is to make a list of the concepts you wanted to show in this diagram, and which you want in the next diagram(s). Sometimes it's useful to have a concept introduced in one and then used in the next without having to explain it again, and sometimes it's useful to have two diagrams side by side to show a concept that can't be shown by itself. |
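The arrowhead snippet referred to above is not reproduced in this thread. As a stand-in, here is a minimal sketch of the standard SVG `<marker>` approach, written as plain string builders so it runs without a browser. The id `arrowhead`, the sizes, and the helper names `arrowheadMarker` and `arrowLine` are assumptions for illustration; with D3 the same elements and attributes would be created through `selection.append` and `selection.attr` instead.

```javascript
// Build the marker definition; it only needs to appear once per SVG, inside <defs>.
function arrowheadMarker(id = 'arrowhead', size = 6) {
  return (
    `<marker id="${id}" viewBox="0 0 10 10" refX="10" refY="5" ` +
    `markerWidth="${size}" markerHeight="${size}" orient="auto">` +
    // Triangle whose tip sits at (10, 5), matching refX/refY above.
    `<path d="M 0 0 L 10 5 L 0 10 z" /></marker>`
  );
}

// Build a line that references the marker by id through marker-end.
function arrowLine(x1, y1, x2, y2, markerId = 'arrowhead') {
  return (
    `<line x1="${x1}" y1="${y1}" x2="${x2}" y2="${y2}" ` +
    `stroke="black" marker-end="url(#${markerId})" />`
  );
}

const svg =
  `<svg xmlns="http://www.w3.org/2000/svg" width="120" height="40">` +
  `<defs>${arrowheadMarker()}</defs>` +
  `${arrowLine(10, 20, 100, 20)}</svg>`;
```

The `marker-end` attribute also works on `<path>` and `<polyline>` elements, so one definition can serve every policy arrow on the page.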
I am thinking of redesigning the states (squares) to make the changes in state-value more pronounced. I want to remove the text representing the state value and replace it with a bar in the square. This design is what I'm currently considering. |
@alireza-a I think that's a good idea. When there are lots of things changing on the page it may be difficult for the reader to read all the numbers. The visual representations would make it easier to see at a glance. Your proposed design images 2 and 3 could be used for another design too: instead of showing all the lines in one chart on the right, you could show one chart inside each state's square. Maybe the line chart could instead be a bar chart with light gray bars (not prominent) but the current state (the rightmost bar) would be the green/red like in your proposed design. That way when you are looking at the current state you could also see the history of the value. |
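One way the per-state history chart could be prepared, as a sketch only: turn each state's value history into a list of bar descriptions that the drawing code (D3 or otherwise) can render. The function name `historyBars`, the colour names, and the assumption that values lie in [-1, 1] are all illustrative.

```javascript
// Convert a state's value history into bar specs for a small chart inside its square.
// Older values are light gray; only the current (rightmost) value is green or red.
function historyBars(history, width, height, maxBars = 10) {
  const recent = history.slice(-maxBars); // keep only the latest values
  const barWidth = width / maxBars;
  return recent.map((value, i) => {
    const isCurrent = i === recent.length - 1;
    const clamped = Math.max(-1, Math.min(1, value)); // assumes values in [-1, 1]
    return {
      x: i * barWidth,
      width: barWidth,
      height: Math.abs(clamped) * height,
      fill: isCurrent ? (value >= 0 ? 'green' : 'red') : 'lightgray',
    };
  });
}

const bars = historyBars([0, 0.5, -0.25], 40, 20, 4);
```

Separating the layout from the rendering like this keeps the chart logic easy to check without a browser, and the same specs could drive either a bar chart or the line chart from the earlier design.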
This is fun! I'll create a history bar chart for each state and update the animations to match. |
Looks nice! |
I'm occupied with final exams until Aug 17th. I'll get back to this once I'm free. |
Visualizations and code for chapter 21