Book representation #371
Believe it or not but this PR is pretty much complete. It only took 2 or 3 commits (pretty sure the first commit is 8310cce) to recursively walk the @azerupi or @budziq would you be able to have a look through #360 before merging this one. I branched directly from there instead of master, so this PR contains a lot of commits from that PR. When you merge it the history should clean itself up. |
Wow, that's a lot of work, @Michael-F-Bryan! But would you be willing to squash the commits somewhat? |
@azerupi, sure thing. Most of the commits are checkpoints I've made along the way and #360 will probably be easier to review as a single unit. I'll squash #360 to make that easier to review and merge, then once we're all happy with that I can clean up the history for this commit. Would you like me to post a small summary as a comment on #360 so you know what it's meant to do? I haven't tied any of this code into the actual |
I'm not sure if it's worth the effort to keep two separate PRs. Why not incorporate all of the changes into #360, change its title, and close this one? I would advise against merging it until it's actually used anyway. As for the comment/summary: yes, please do! And please add descriptive information to the squashed commit. |
Yeah, that's a good idea. I was thinking of keeping the two separate because they are logically separate (one parses I'm not sure if I want to hold out on merging until it's actually used though. Integrating the new Anyway, here's a rough summary of the work I've done so far: Summary Module
Book Module
Loader Module
Tests
I've made sure everything has documentation (with I feel like I'm exposing more than I'd like to here because all the data types ( I'm also trying out the new Sorry for the wall of words and the massive PR, in retrospect I really should have broken this up into much smaller pieces! EDIT: Forgot to mention all the tests |
@Michael-F-Bryan thanks for the summary. It will help with the review 👍
It might just be me, but I subscribe to the line of thought that anything that is not used or does not implement a required use case should be immediately purged from the code base (mostly because I work on very large legacy codebases day to day and hate code that was "just left because it might be used someday" with the heat of a thousand suns) 😉 These things tend to quickly bitrot, add complexity, and reduce readability and maintainability. On the other hand, actually plugging in your changes, seeing how they interact with the rest of the codebase, and replacing the old versions will let you iterate and minimise the design if required.
No rush. I'll gladly wait to see how the final api will turn out 👍
As long as the changes are relatively self-contained in commits, this number is not that staggering. But the first thing that comes to my mind is that you might be trying to make the design too broad. How about enumerating, in a short list here, the actual use cases that are not covered by the legacy code and the problems that you are trying to address? |
When you put it that way it makes a lot of sense!
I was worried about that too, but over half the code is actually tests to ensure the implementation does what I expect it to. I've found doing unit and integration tests like that also helps quite a bit with the overall API ergonomics. I originally wanted to do this refactoring because I was trying to make a PDF backend and found the existing I've tried to make sure the |
Here are a couple use cases I was thinking of:

MathJax Plugin Author

I want to have easy access to all of the source text from each chapter in a book. I should be able to implement a

```rust
struct MathJaxPreprocessor;

impl Visitor for MathJaxPreprocessor {
    fn visit_chapter_content(&mut self, content: &mut String) {
        let pattern = Regex::new(...).unwrap();
        // `replace_all` returns a `Cow<str>`, so write the result back through the reference.
        *content = pattern.replace_all(...).into_owned();
    }
}
```

PDF Renderer Author

As a renderer author I want complete access to all facets of the book, preferably in some sort of AST-like structure so I can walk the AST, generating a LaTeX document in memory which I can then compile to PDF. Similar to the plugin author I would like some sort of

```rust
use pdf::Renderer;
use mdbook::{MdBook, Visitor};

fn main() {
    let renderer = Renderer::new(...);
    let mut md = MdBook::new("/path/to/book/root"); // Get the actual path using `clap`
    md.set_renderer("pdf", Box::new(renderer)); // takes a boxed `Visitor`
    md.build().unwrap();
}
```

MdBook Developer

It'd be nice to structure the process of building a book as a pipeline of transformations, with each step manipulating a tree-like The Thanks for asking for a couple use cases. Even as I was writing them up I figured out a nicer way to expose the |
@Michael-F-Bryan thanks for the writeup. Here are some of my thoughts:
Yep that is a serious problem currently and major roadblock in supporting other renderers 👍
This will probably be the hardest one to tackle, as MathJax is a superset of some LaTeX with math-specific plugins plus MathML (so the output of MathJax on HTML + MathJax.js will most likely be largely incompatible with a PDF one, which would be LaTeX based). I suspect that we cannot do this cleanly, and it might be easier to have separate, incompatible plugins for math symbols for each renderer (otherwise we might need a MathJax-to-image converter). I would strongly advise implementing a second renderer (as stripped-down and featureless as possible, certainly lacking MathJax support) alongside this PR to actually show what is needed from it. Most likely we will arrive at a simpler design, which we can always rewrite again if the need arises.
Are we sure that we need a design of such caliber (yet)? The book is a relatively flat, sequential structure (the summary is actually needed only to establish viewing order and render the TOC). I would rather see a minimal PDF renderer along with major decoupling and simplification than a more complicated (and arguably over-engineered) MdBook design without an actual renderer implementation that warrants it.
I would suggest avoiding the visitor pattern if possible, as it will needlessly complicate the design without much benefit. The question is which element should drive the iteration (the book or the renderer). Arguably the renderer will have more domain knowledge about its output and requirements and will be able to make better choices about which items and data from the book it would like to visit or revisit. Most likely, making the book contents iterable and accessible in some sane fashion from the renderer would make the design easier (granted, there would be some minor code duplication between renderers that we might extract into separate logic if we have more than 1-2 renderers 😉).
This is a nice idea and would be quite easily implemented with some simple imperative design that would boil down to the following pseudocode. Please do not take it as a real suggestion but rather as another point of view for discussion:

```rust
let mut md = MdBook::new("/path/to/book/root");

// for renderer-agnostic preprocessing that works on bare markdown (like the `{{#playpen}}` and upcoming `{{#include}}`)
for prep in preprocessors {
    // possibly plugins here too
    prep.process(&mut md)?;
}

for renderer in renderers {
    renderer.register_plugins(ctx); // get the plugins from some global context
    // run renderer-dependent plugins and preprocessing internally in the render implementation
    renderer.render(&md).build()?;
}
```
|
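To make the shape of that pipeline easier to discuss, here is a minimal, self-contained sketch of it. Everything in it is an assumption made for illustration: the `Book`, `Preprocessor`, and `Renderer` types, the `build` function, and the made-up `{{#shout}}` marker are hypothetical stand-ins and not mdBook's actual API.

```rust
/// Hypothetical in-memory book: just a flat list of chapter sources.
#[derive(Debug, Clone, PartialEq)]
pub struct Book {
    pub chapters: Vec<String>,
}

/// A renderer-agnostic step that rewrites the raw markdown in place.
pub trait Preprocessor {
    fn process(&self, book: &mut Book) -> Result<(), String>;
}

/// A backend that turns the preprocessed book into some output.
pub trait Renderer {
    fn render(&self, book: &Book) -> Result<String, String>;
}

/// Run every preprocessor in order, then hand the result to each renderer.
pub fn build(
    book: &mut Book,
    preprocessors: &[Box<dyn Preprocessor>],
    renderers: &[Box<dyn Renderer>],
) -> Result<Vec<String>, String> {
    for prep in preprocessors {
        prep.process(book)?;
    }
    // Renderers only need read access once preprocessing is done.
    let book: &Book = book;
    renderers.iter().map(|r| r.render(book)).collect()
}

/// Example preprocessor: expands a made-up `{{#shout}}` marker.
pub struct Shout;

impl Preprocessor for Shout {
    fn process(&self, book: &mut Book) -> Result<(), String> {
        for chapter in &mut book.chapters {
            *chapter = chapter.replace("{{#shout}}", "!!!");
        }
        Ok(())
    }
}

/// Example renderer: joins all chapters into one plain-text document.
pub struct PlainText;

impl Renderer for PlainText {
    fn render(&self, book: &Book) -> Result<String, String> {
        Ok(book.chapters.join("\n\n"))
    }
}
```

In this sketch the book drives nothing; the caller owns the loop, which is the property the pseudocode was arguing for. Renderer-specific plugins would live inside each `render` implementation.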
I wasn't planning on making it too complex; to me a book is just a collection of chapters, and each chapter looks roughly like this:

```rust
struct Chapter {
    name: String,
    content: String,
    number: Option<SectionNumber>, // SectionNumber is just a newtype'd Vec<u32>
    nested_chapters: Vec<Chapter>, // because chapters can have sub-chapters
}
```

I probably shouldn't have used the word "AST", what I was meaning is a graph where each node is a This is roughly the design we already have, except I've replaced the |
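As a rough illustration of the "newtype'd `Vec<u32>`" idea, a section number could look like the sketch below. The `child` helper and the `1.2.3.` display format (with a trailing dot) are assumptions made for this example, not something the PR confirms.

```rust
use std::fmt;

/// Newtype around the list of indices that locate a chapter, e.g. `[1, 2, 3]`.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct SectionNumber(pub Vec<u32>);

impl SectionNumber {
    /// Section number for the `index`-th (1-based) child of `self`.
    pub fn child(&self, index: u32) -> SectionNumber {
        let mut parts = self.0.clone();
        parts.push(index);
        SectionNumber(parts)
    }
}

impl fmt::Display for SectionNumber {
    /// Formats `[1, 2, 3]` as `1.2.3.`; an empty number formats as an empty string.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for part in &self.0 {
            write!(f, "{}.", part)?;
        }
        Ok(())
    }
}
```

Keeping the numbers as a list rather than a pre-formatted string means a renderer can decide how to print them, or use them for indexing, without parsing a string back apart.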
Would we be able to use this data except for formatting a string prefix for chapter name? I don't see a real possibility for it in case of visitor approach. In case of non visitor approach we might use these for some wacky indexing/traversal if need be (but please see my next comment)
I hope that we are not planning to create any cycles. I guess that we could use a tree structure. We might even get away with a plain list/vec representing the tree traversal, and I would suggest a vec as a first POC to find out if we need anything more sophisticated.
Might be. I'm not opposed to storing everything in memory. We are looking at trivial amounts of data anyway. Anyhow I would suggest that this effort would be driven by the needs of a budding pdf (or any other) renderer. Then we would be able to iterate on the design to find the cleanest and most maintainable one. |
I've started an initial PDF renderer here. It actually works and took about 5 minutes to write up, I just need to break the markdown up and convert it to LaTeX but that problem is kinda orthogonal to the book's representation. I've attached the outputted document after running it over the example book (just knock off the trailing ".txt", apparently github doesn't like attaching "*.tex" documents). |
Very cool 👍 (although I would call it a LaTeX renderer, which might be even more awesome and versatile in its own right). The second important question is whether we bundle some additional renderers with mdbook or split them out (even at the repo level). I guess that I would start with an approach similar to ripgrep: one canonical repo with several crates. Then you could add this WIP renderer to this PR, making the design much easier to reason about, and later experiment with a split into different repos. |
I think that's the only reason we need it. MdBook makes a distinction between prefix chapters, numbered chapters, and suffix chapters so I thought we'd still need it. I prefer having a newtype'd list of numbers though because I spent forever trying to figure out what the string in
Yep, Rust's borrowing doesn't like graphs with cycles (you end up with multiple mutable references), plus a tree structure seems to be the logical way of representing a book because your book almost always reflects how you've laid things out on the file system.
Haha, I'll implement a The visitor pattern may be a premature optimisation/generalisation on my part because I've been working on a toy language and that's about the only (feasible) way of inspecting and manipulating your AST 😜 |
f3ed090
to
da03672
Compare
Sorry it's taken me so long to get back to this! I got the go ahead to start working on a Rust library/DLL which will be integrated into a product at work so I've been a tad distracted. I've squashed the commits to make the history a bit easier to navigate, plus I implemented a depth-first iterator over the items in a book as per popular demand 😉 I reckon my implementation may be a bit nicer than the existing one, so that's a bonus too. Should my next step be to try and start integrating |
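For readers following along, a depth-first iterator over nested chapters can be sketched with an explicit stack. This is a minimal illustration under assumed names (`Chapter`, `Chapters`), with a trimmed-down chapter type; it is not the implementation from this PR.

```rust
/// Simplified chapter: a name plus any nested sub-chapters.
#[derive(Debug, Clone, PartialEq)]
pub struct Chapter {
    pub name: String,
    pub nested_chapters: Vec<Chapter>,
}

impl Chapter {
    pub fn new(name: &str, nested_chapters: Vec<Chapter>) -> Chapter {
        Chapter { name: name.to_string(), nested_chapters }
    }
}

/// Depth-first (pre-order) iterator over a forest of chapters.
pub struct Chapters<'a> {
    // Chapters still to visit; the next one to yield is at the end.
    stack: Vec<&'a Chapter>,
}

impl<'a> Chapters<'a> {
    pub fn new(roots: &'a [Chapter]) -> Self {
        // Push in reverse so the first root is popped first.
        Chapters { stack: roots.iter().rev().collect() }
    }
}

impl<'a> Iterator for Chapters<'a> {
    type Item = &'a Chapter;

    fn next(&mut self) -> Option<&'a Chapter> {
        let chapter = self.stack.pop()?;
        // Children go on top of the stack so they are visited before siblings.
        self.stack.extend(chapter.nested_chapters.iter().rev());
        Some(chapter)
    }
}
```

With something like this, a renderer can write `for chapter in Chapters::new(&book)` and drive the traversal itself, which is the alternative to the visitor pattern suggested earlier in the thread.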
Ugh, apparently Appveyor's "stable" target isn't actually the stable release, so even though struct field shorthands are stable (as of 1.17) the build fails trying to compile the From what I can tell the dependency chain is hyper > url > idna > unicode-bidi, but I'm not sure what I can do to fix it on my end... EDIT: Fixed. Rust was being installed directly from static.rust-lang.org and that's probably gone stale, so I'm using rustup instead. |
Thanks a lot @Michael-F-Bryan . I would put the last commit azerupi@092b445 in a separate PR as it is not really related to rest of the work and could be merged immediately. I'll try to give a review of rest of the PR shortly |
Hi, here are some of my random thoughts.
I find the code very clean and very good craftsmanship 👍! But I have a feeling that Summary and Loader are more than a little over-complicated for the relatively simple task at hand, while the Book seems OK. Could we pare down the implementation and strip the number of entities to the bare minimum? Currently it's a few times longer and more complex while performing exactly the same tasks (not counting the enormous amount of tests, which I adore ❤️).
Don't get me wrong. I love the refactoring and separation for the most part (I haven't done a serious code review yet, as IMHO the top-level organization could do with some more thinking).
And I would love to see the code actually used in action, replacing the original implementations (I also think that the actual removal and replacement of the original implementations should be part of this PR), once it has been pared down.
src/loader/book.rs
Outdated
@@ -0,0 +1,322 @@
#![allow(missing_docs, unused_variables, unused_imports, dead_code)]
I would not add any of these pragmas. If anything I would add most of these in deny mode
Oops! I added those during development to cut down the noise in the compiler output. Looks like I forgot to remove them afterwards.
src/lib.rs
Outdated
extern crate log;
extern crate pretty_assertions;
#[cfg(test)]
extern crate tempdir;
do we need it here? It's used only in loader/book.rs tests mod
I found the `pretty_assertions` crate invaluable when trying to debug assertions where the `assert!()` is comparing two complex structures. However, you have a point in that it's not really worth adding yet another dependency for a feature which is only used 3 or 4 times... I think I'll remove it and see how I go without it.
Actually, I meant `tempdir` here. `pretty_assertions` cannot be imported anywhere else due to macros.
Ah okay, what would you prefer me to do here? Do you think I should try to remove the `tempdir` dependency, rearrange the imports, or even move the `extern crate` statement to `loader/book.rs`?
`tempdir` is very handy in tests, so I guess that we could keep the import local to them as long as it's used in only one suite. If you plan to use it elsewhere (or in other suites), then it's fine where it is.
///
/// You need to pass in the book's source directory because all the links in
/// `SUMMARY.md` give the chapter locations relative to it.
pub fn load_book_from_disk<P: AsRef<Path>>(summary: &Summary, src_dir: P) -> Result<Book> {
I would suggest renaming the method to just `load_book`.
src/loader/summary.rs
Outdated
@@ -0,0 +1,726 @@
#![allow(dead_code, unused_variables)]
I would not add these pragmas
/// Enum representing any type of item which can be added to a book.
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub enum BookItem {
I'm not sure why we need the separation between `Chapter`, `SummaryItem`, and `BookItem`.
The extra level of indirection from `BookItem` is necessary because the `SUMMARY.md` format says your book is (conceptually) a list of `Chapter`s and section separators.

The difference between `BookItem` and `SummaryItem` exists because summary parsing is a separate process from loading a book. Parsing the summary should just need to take in your `SUMMARY.md` as a string and not need to touch the filesystem or load chapters into memory. That's why `Summary` is a list of `SummaryItem`s, which are either section separators or `Link`s (e.g. `- [Introduction](./src/intro.md)`). Hence the distinct `Link` entity instead of reusing `Chapter`.
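The split described here can be sketched as two parallel sets of types plus one function that converts between them. The field names, the `sub_items` layout, and the `load_items` function with its injected `read` closure are all assumptions made for illustration; the PR's real types carry more information (section numbers, for instance).

```rust
use std::path::{Path, PathBuf};

/// One entry in `SUMMARY.md`: a name and a location, but no chapter text yet.
#[derive(Debug, Clone, PartialEq)]
pub struct Link {
    pub name: String,
    pub location: PathBuf,
    pub nested_items: Vec<SummaryItem>,
}

/// What the summary parser produces; no IO is needed to build these.
#[derive(Debug, Clone, PartialEq)]
pub enum SummaryItem {
    Link(Link),
    Separator,
}

/// A chapter whose contents have been loaded into memory.
#[derive(Debug, Clone, PartialEq)]
pub struct Chapter {
    pub name: String,
    pub content: String,
    pub sub_items: Vec<BookItem>,
}

/// What the loader produces from the summary items.
#[derive(Debug, Clone, PartialEq)]
pub enum BookItem {
    Chapter(Chapter),
    Separator,
}

/// Turn parsed summary items into loaded book items.
///
/// `read` is the only place that touches storage; in real use it might wrap
/// `std::fs::read_to_string`, while tests can pass an in-memory closure.
pub fn load_items<F>(items: &[SummaryItem], read: &mut F) -> Result<Vec<BookItem>, String>
where
    F: FnMut(&Path) -> Result<String, String>,
{
    let mut loaded = Vec::with_capacity(items.len());
    for item in items {
        loaded.push(match item {
            SummaryItem::Separator => BookItem::Separator,
            SummaryItem::Link(link) => BookItem::Chapter(Chapter {
                name: link.name.clone(),
                content: read(link.location.as_path())?,
                // Recurse into nested links, reborrowing the same reader.
                sub_items: load_items(&link.nested_items, &mut *read)?,
            }),
        });
    }
    Ok(loaded)
}
```

The point of the sketch is that `Link` and `Chapter` look alike but sit on opposite sides of the IO boundary: everything on the `Summary` side can be built and tested from a string alone.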
.iter_mut()
.enumerate()
.filter_map(|(i, item)| item.maybe_link_mut().map(|l| (i, l)))
.rev()
Wouldn't just `last()` work here?
That's what I originally tried too. What you are trying to do is get the last chapter from the `sub_items` so you can append the new `Chapter` to it, but what happens when you call `last()` and the last item is a `Separator` instead of a `Chapter`?
I agree that it's not overly pretty or easy to understand; I'll see if I can pull it out into a method.
Also, `maybe_link_mut()` feels like a bit of a hack so I don't need to write a match statement inside a closure. It needs to go...
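One way the "pull it out into a method" idea might look is a small helper that walks the items backwards and returns the first `Link` it finds, so trailing separators are skipped. This is only a sketch with simplified, assumed types; `last_link_mut` is a hypothetical name and not code from the PR.

```rust
#[derive(Debug, PartialEq)]
pub struct Link {
    pub name: String,
    pub nested_items: Vec<SummaryItem>,
}

#[derive(Debug, PartialEq)]
pub enum SummaryItem {
    Link(Link),
    Separator,
}

/// Return the last `Link` in `items`, skipping any trailing separators.
///
/// A plain `items.last_mut()` is not enough, because the final item may be a
/// `Separator`, which has nowhere to append a nested item.
pub fn last_link_mut(items: &mut [SummaryItem]) -> Option<&mut Link> {
    items.iter_mut().rev().find_map(|item| match item {
        SummaryItem::Link(link) => Some(link),
        SummaryItem::Separator => None,
    })
}
```

Using `find_map` on the reversed iterator keeps the `match` in one place and avoids needing a separate `maybe_link_mut()` accessor on the enum.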
///
/// This is roughly the equivalent of `[Some section](./path/to/file.md)`.
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct Link {
do we need another entity?
You'll probably want to look at my previous comment for more info, but basically a `Chapter` and a `Link` have very different purposes even if they both look quite similar. One is a chapter and its contents loaded from disk (with all the IO and logic that entails), whereas the other represents a single entry in the `SUMMARY.md` file.
/// Given a particular level (e.g. 3), go that many levels down the `Link`'s
/// nested items then append the provided item to the last `Link` in the
/// list.
fn push_item_at_nesting_level(links: &mut Vec<SummaryItem>, mut item: SummaryItem, level: usize, mut section_number: SectionNumber)
This signature is quite awkward. Why not make this a method on `Summary`?
appveyor.yml
Outdated
- rust-%RUST_VERSION%-%TARGET%.exe /VERYSILENT /NORESTART /DIR="C:\Program Files (x86)\Rust"
- SET PATH=%PATH%;C:\Program Files (x86)\Rust\bin
- rustc -V
- ps: >-
Please split this out into a separate PR. It is orthogonal and is likely to be merged much earlier than the rest of this PR.
Done, I pulled it out into #373 so it won't be held up by this PR.
092b445
to
da03672
Compare
@budziq I definitely agree with what you are saying about over-complicating things and introducing loads of entities which may or may not even be necessary. I've had a bit of a re-think about what I'm doing and which things I've completely removed the I think a lot of the complication comes about because I've considered parsing I'd kinda like to leave the Out of curiosity, I just went through and did a really rough count of lines of executable code excluding tests and docs after incorporating some of the feedback you've given me. This PR would replace 320 lines (230 from |
So let's take these out of the clean room! I'd very much like to see these lines replacing the originals and how they interact with the rest of the codebase in this PR (removing the old implementations would help show how it all comes together). |
@budziq I started integrating this PR in and, surprisingly, it merged quite nicely. I just needed to change The only non-trivial issue is that @budziq and @azerupi what are your thoughts on I'm also a little scared because once the compiler was satisfied all the tests passed... The reason I'm worried is I know I've at least broken inter-chapter links, so how many other things could I have accidentally broken and not know about? 😨 |
I would make a builder type |
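I'm thinking I might need to back up a commit or two and rethink how I'm merging this into the main library. I'd prefer to try and keep the changes to `MDBook` as minimal as possible because this PR is mainly about the book's internal representation, whereas as I started integrating the builder in, I started creeping into configuration and overall design and broke a lot of stuff. It'll be a bit easier when #374 lands because then I can use those tests to make sure I haven't accidentally introduced regressions. |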
Sorry :(
I was swamped last week and I'm on vacation for the next two weeks too.
I'll try to get back on track as soon as I return!
…On 15 July 2017 at 06:14, Michael Bryan ***@***.***> wrote:
I'm thinking I might need to back up a commit or two and rethink how I'm
merging this into the main library. I'd prefer to try and keep the changes
to MDBook as minimal as possible because this PR is mainly about the
book's internal representation, whereas as I started integrating the
builder in, I started creeping into configuration and overall design and
broke a lot of stuff.
It'll be a bit easier when #374
<https://github.com/azerupi/mdBook/pull/374> lands because then I can use
those tests to make sure I haven't accidentally introduced regressions.
|
No stress. As I think you said at some point earlier on, I'd rather take a little longer and put some thought into the design than rush things 😁 I'm thinking the next victim for refactoring will probably be the configuration system. It feels like it's threaded through the entire codebase as it's evolved and features have been added, so it may need some TLC before we can do things like plugins and alternative renderers. |
From the [pull request comment][pr], here's a rough summary of what was done in the squashed commits.

---

\# Summary Parser

- Added a private submodule called `mdbook::loader::summary` which contains all the code for parsing `SUMMARY.md`
- A `Summary` contains a title (optional), then some prefix, numbered, and suffix chapters (technically `Vec<SummaryItem>`)
- A `SummaryItem` is either a `Link` (i.e. link to a chapter), or a separator
- A `Link` contains the chapter name, its location relative to the book's `src/` directory, and a list of nested `SummaryItems`
- The `SummaryParser` (a state machine-based parser) uses `pulldown_cmark` to turn the `SUMMARY.md` string into a stream of `Events`, it then iterates over those events changing its behaviour depending on the current state,
- The states are `Start`, `PrefixChapters`, `NestedChapters(u32)` (the `u32` represents your nesting level, because lists can contain lists), `SuffixChapters`, and `End`
- Each state will read the appropriate link and build up the `Summary`, skipping any events which aren't a link, horizontal rule (separator), or a list

\# Loader

- Created a basic loader which can be used to load the `SUMMARY.md` in a directory.

\# Tests

- Added a couple unit tests for each state in the parser's state machine
- Added integration tests for parsing a dummy SUMMARY.md then asserting the result is exactly what we expected

[pr]: https://github.com/azerupi/mdBook/pull/371#issuecomment-312636102
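The state machine described in the commit message can be illustrated with a heavily simplified sketch. The `State` variants are the ones listed in the commit message; everything else is an assumption: the `Event` enum below is a hypothetical stand-in for the `pulldown_cmark` event stream (it is not that crate's API), the `Summary` here only records names, and separators are left out.

```rust
/// Parser states, as named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum State {
    Start,
    PrefixChapters,
    NestedChapters(u32),
    SuffixChapters,
    End,
}

/// Hypothetical, simplified event stream (NOT the `pulldown_cmark` API).
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    Link(String),
    ListStart,
    ListEnd,
    Other,
}

/// Simplified summary: names only, with the nesting level for numbered chapters.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Summary {
    pub prefix: Vec<String>,
    pub numbered: Vec<(u32, String)>,
    pub suffix: Vec<String>,
}

pub struct SummaryParser {
    state: State,
    summary: Summary,
}

impl SummaryParser {
    pub fn new() -> Self {
        SummaryParser { state: State::Start, summary: Summary::default() }
    }

    pub fn state(&self) -> State {
        self.state
    }

    /// Feed one event; the current state decides where a link is recorded.
    pub fn step(&mut self, event: &Event) {
        self.state = match (self.state, event) {
            (State::Start, Event::Link(name)) | (State::PrefixChapters, Event::Link(name)) => {
                self.summary.prefix.push(name.clone());
                State::PrefixChapters
            }
            (State::Start, Event::ListStart) | (State::PrefixChapters, Event::ListStart) => {
                State::NestedChapters(1)
            }
            (State::NestedChapters(level), Event::Link(name)) => {
                self.summary.numbered.push((level, name.clone()));
                State::NestedChapters(level)
            }
            (State::NestedChapters(level), Event::ListStart) => State::NestedChapters(level + 1),
            // Leaving the outermost list ends the numbered chapters.
            (State::NestedChapters(1), Event::ListEnd) => State::SuffixChapters,
            (State::NestedChapters(level), Event::ListEnd) => State::NestedChapters(level - 1),
            (State::SuffixChapters, Event::Link(name)) => {
                self.summary.suffix.push(name.clone());
                State::SuffixChapters
            }
            // Any other event is skipped without changing state.
            (state, _) => state,
        };
    }

    /// Mark the parser as finished and hand back what was collected.
    pub fn finish(&mut self) -> Summary {
        self.state = State::End;
        std::mem::take(&mut self.summary)
    }
}
```

Because each transition is one `match` arm on `(state, event)`, each state can be unit tested in isolation, which matches the testing approach the commit message describes.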
This is a squashed commit. It roughly encompasses the following changes.

---

\# Book

- Created another private submodule, mdbook::loader::book
- This submodule contains the data types representing a Book
- For now the Book just contains a list of BookItems (either chapters or separators)
- A Chapter contains its name, contents (as one long string), an optional section number (only numbered chapters have numbers, obviously), and any nested chapters
- There's a function for loading a single Chapter from disk using its associated Link entry from the SUMMARY.md
- Another function builds up the Book by recursively visiting all Links and separators in the Summary and joining them into a single Vec<SummaryItem>. This is the only non-dumb-data-type item which is actually exported from the book module

\# Loader

- Made the loader use the book::load_book_from_disk function for loading a book in the loader's directory.

\# Tests

- Made sure you can load from disk by writing some files to a temporary directory
- Made sure the Loader can load the entire example-book from disk and doesn't crash or hit an error
- Increased test coverage from 34.4% to 47.7% (as reported by cargo kcov)
Hey @budziq and @azerupi I've finally found time to get back to this PR 🎉 I feel like I'm going to need to make a couple major breaking changes to the way I'll probably need to refactor it to be less coupled to the configuration and rest of the system, and so you can only ever get an instance of To help get a better idea of how things should look,
|
8ff7664
to
77fe87e
Compare
When I started, it was supposed to be the "facade" of the library. The type you use to manipulate the book. I am not attached to it per se, so if there is a better way to structure the code it's worth discussing. :)
Most of the time, configuration will be done once. But I would like to keep the ability to change the configuration on the fly. It doesn't mean the
I think the I'm open to other ideas, it's just what I have come up with at this time :)
I like that idea! 👍 |
I think it would be better in the long run if you can still load a book into memory and render it manually, so What do you think the best thing for me to do from here is? I've currently got a nicely decoupled way of loading the book from disk, now I need to convert the current system to using that instead of the old book representation. The issue is the current system assumes you can have a partially uninitialized state and be able to change all sorts of configuration things on the fly, while the way I've done things, you load the book from disk and shouldn't be able to change configuration options like the source directory afterwards because that'd result in an inconsistent state. Sorry it's taking so long to finish off this PR! I was really enthusiastic about making the internal representation for a book less coupled and "nicer" to start off with, but when I started trying to integrate it with the |
Sorry for late replies. It's pretty hectic lately. IMHO It would be best if
Don't sweat it you are doing important work and doing great job at it 👍 There is no rush.
Yep. Largish refactors like this one tend to be on the painful side. Usually it is easier to refactor and integrate organically along the way instead of transplanting in one go.
On the bright side, I've looked on such things as a todo list given for free by the compiler. I always shudder at larger Python code rewrites if test coverage is not great (and even if coverage is good, it tends to require soaking most of the code into my frontal lobes before we even start ;) ). |
@budziq, I am not sure I agree with you here. I don't have much time to go into details, but the way I see it is that we indeed need a book struct that is iterable but it is not the The reason why I think that is that if we want to support multi-lang books we will need to easily duplicate those iterable book structures without duplicating all the common metadata. So I think we need an overarching structure (which was the If I understand your stream of thoughts correctly, you want to dissociate the configuration from the book data as much as possible? That doesn't sound like a bad idea, but at some point we need something to orchestrate the whole, how do you envision that? |
Right. I keep forgetting about the planned multilang support as it has been outside of my requirements (for now). Well I kind of like the idea of I'm still unclear if we would actually need any mapping between book translation summaries at all, except for some lints about translations diverging from the original. @azerupi do you have a use case in mind in regard to the common metadata? Most likely I'm missing something there. |
To me, it sounds like there are three general structures at play here. First there's a On top of those two there'll be a It sounds like the Is there an issue/PR for multilingual support I can read through to get up to speed with how we're wanting to do it? From what I've seen of the |
@Michael-F-Bryan Yes, that is mostly how I envisioned things :)
There isn't anything complete. The best I can give you is issue #146 where I described how I imagined the different structs. The related PR #147 that never got merged, could be a good starting point. |
I think we should leave that to the renderer. Like you said, only few renderers will need mapping between languages (mainly the HTML one), so it makes sense to let the renderers implement it to suit their needs. |
@Michael-F-Bryan I propose that we try to integrate some part of this into master already. I am pushing this specific part to move one step closer to multi-lingual books. Does that sound ok? If you don't feel like making a new PR I can definitely take care of it. I have some free time today and this week in general :) |
I had a couple hours between classes so I decided to finish this off. I've rolled back to when I finished the I've made a PR (#409) and all the tests pass (woohoo! 🎉) so it looks like everything should work. @budziq and @azerupi should we start the reviewing process now? |
I'm closing this in favor of #409. |
This PR adds a new internal representation for the book, plus functionality for loading that book using the `Summary` we got when parsing `SUMMARY.md`.

It is part of a larger issue (#359) which is working towards refactoring the `Book` structure and decoupling it from the other bits like rendering and configuration.