Skip to content

Making configuration more flexible #457

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Nov 12, 2017

Conversation

Michael-F-Bryan
Copy link
Contributor

This attempts to start cleaning up the configuration API. The idea is that all items should be public seeing as a Config struct is just a bag of options with no behaviour (besides convenience methods for loading from disk and such).

Removing the getters and setters and builder pattern may be a bit controversial, but my reasoning is you usually use getters/setters to allow for forward compatibility without breaking downstream. However, if we remove a config item that someone downstream is using, I'd want their build to break. This indicates they are using an option which is no longer available and they should update their stuff accordingly.

I'm also trying to make sure the html renderer isn't special. You still specify html renderer-specific options using the output.html table, but the output table itself is just a BTreeMap<String, toml::Value>. To make life more convenient for downstream I've added a try_get_output() method which allows you to get a table according to key and deserialize it automatically using serde.

This way, later on when we give the rendering and plugin pipeline a makeover, you can pass in a reference to the global Config struct and individual renderers/plugins can get the options specific to it.

@Michael-F-Bryan
Copy link
Contributor Author

So far I've just created the type definitions and a couple tests to make sure it deserializes as I expect it to. I'm going to start migrating the existing mdbook across to use the new configs now.

@Michael-F-Bryan
Copy link
Contributor Author

Michael-F-Bryan commented Sep 30, 2017

I finished updating everything which relies on configuration. Now that MDBook no longer has dozens of builder methods for updating the configuration, it should allow us to make the crate a lot less decoupled and stateful.

All the tests pass on my machine. @azerupi or @budziq, would you be able to review this?

@Michael-F-Bryan Michael-F-Bryan changed the title Config Making configuration less coupled and more extensible Sep 30, 2017
@Michael-F-Bryan Michael-F-Bryan changed the title Making configuration less coupled and more extensible Making configuration more flexible Sep 30, 2017
@Michael-F-Bryan Michael-F-Bryan mentioned this pull request Oct 1, 2017
6 tasks
@azerupi azerupi self-requested a review October 1, 2017 11:17
Copy link

@Phaiax Phaiax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better than the getter/setter config.

@@ -283,50 +295,52 @@ impl Renderer for HtmlHandlebars {

let rendered = self.post_process(rendered,
"print.html",
book.get_html_config().get_playpen_config());
&book.config.html_config().unwrap_or_default().playpen);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already a html_config in scope (l. 243)

data.insert("livereload".to_owned(), json!(livereload));
}

// Add google analytics tag
if let Some(ref ga) = book.get_google_analytics_id() {
if let Some(ref ga) = book.config.html_config().and_then(|html| html.google_analytics) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you use book.config instead of the config argument? (I guess there is a purpose, but I haven't read the whole source yet.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I think I was blindly trying to fix the compile errors and not thinking of the bigger picture, but this should really use the config argument.

That way, when we start restructuring the renderer/plugin interface you can pass in just the Book (representation of a book on disk) and the Config. Avoiding passing around the small kitchen sink which is MDBook 😜

tests/config.rs Outdated
}
// #[test]
// fn do_not_overwrite_unspecified_config_values() {
// let dir = TempDir::new("mdbook").expect("Could not create a temp dir");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still relevant?


debug!("[*]: Register handlebars template");
handlebars.register_template_string("index", String::from_utf8(theme.index.clone())?)?;

debug!("[*]: Register handlebars helpers");
self.register_hbs_helpers(&mut handlebars);

let mut data = make_data(book)?;
let mut data = make_data(book, &book.config)?;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why adding the additional argument?

src/config.rs Outdated
/// This is for compatibility only. It will be removed completely once the
/// rendering and plugin system is established.
pub fn html_config(&self) -> Option<HtmlConfig> {
self.try_get_output("html").ok()
Copy link

@Phaiax Phaiax Oct 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not cached, so it will be deserialized a few times, but I think there is no need to optimize this.
And editing a variable in code is not as straight forward as with the BookConfig. Maybe you should add a line in the documentation on how to change a html config variable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... To be honest, I hadn't thought about letting people mutate the configuration after it has been loaded.

I think I'll rewrite the config to internally hold the raw Value, with getters and setters for accessing things like an output table (e.g. HtmlConfig) by key. The toml crate has a get_mut() method which I could expose to people so they can get something by index (e.g. output.html) and possibly mutate it.

Copy link
Contributor Author

@Michael-F-Bryan Michael-F-Bryan Oct 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just experimented with Config holding a raw Value and then providing getters and setters for the various components (book settings, outputs, etc) and it doesn't end up being overly ergonomic. The issue is that every time you want to access something from the BookConfig you need to deserialize it again, which requires dealing with errors and can't really be cached.

I think if I leave the internals pub that should be enough to let people update and inspect the config if they need to.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also tried a little but, but it did not end well, (see comment in line 57).
https://play.rust-lang.org/?gist=29a32500c428bfae20593421ec7bc4ab&version=stable

I would propose two other ways:

  1. Do not use toml::Value at all and just use HtmlConfig as the only struct in output with direct deserialization. Add an API to get the original config string (raw file contents), so that the plugin can deserialize for itself.
  2. Move the HtmlConfig struct into the HtmlHandlebars. Let the HtmlHandlebars deserialize from a toml::Value and store it's own deserialized config.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the long run I want to go for number 2 because that'll completely separate the handlebars renderer from the book. I think I marked the HtmlConfig getter as deprecated to indicate that.

src/config.rs Outdated
.try_into()
.chain_err(|| "Couldn't deserialize the value"),
None => bail!("Key Not Found"),
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can these two match arms produce a more helpful error message?
It is only helpful for library users, but still.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good point. Although the deserializing error is chained, meaning if you iterate through the causes you can see the root reason for why deserializing failed. Giving you a high level "deserializing failed" message, plus a more low level error if you need more information.

I'm not sure what to do in the None case though. I really want to avoid returning an Option<Result<T>> if I can, that just feels ugly.

@Phaiax
Copy link

Phaiax commented Oct 11, 2017

I almost wanted to ask if I can refactor the config, but then I found this nice PR.

I did a simple review and I vote for merging this.

@Michael-F-Bryan
Copy link
Contributor Author

Michael-F-Bryan commented Oct 16, 2017

I've just rebased and fixed up the things @Phaiax raised. @azerupi, do you reckon this is ready to merge?

"--curly-quotes 'Convert straight quotes to curly quotes, except for those \
that occur in code blocks and code spans'",
)
.arg_from_usage(
"[dir] 'A directory for your book{n}(Defaults to Current Directory \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the "curly quote" option removed?

Copy link
Contributor Author

@Michael-F-Bryan Michael-F-Bryan Oct 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I removed it because something like that belongs more in the config file than as a command line argument.

My thoughts were that options to alter the generated output and which usually stay the same belong in the config file, while options that alter runtime behaviour (e.g. --open or --no-create) are command line arguments... Does that sound logical?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds reasonable to me, but I am not sure about the intent behind the curly quote option.

@Phaiax Phaiax mentioned this pull request Oct 22, 2017
6 tasks
@Phaiax
Copy link

Phaiax commented Oct 23, 2017

ping @azerupi

@Phaiax Phaiax mentioned this pull request Nov 8, 2017
@Michael-F-Bryan
Copy link
Contributor Author

I'm going to go back and do another review of this PR to check that it works as we want. @Phaiax, I know you need access to configuration as part of #51, so when I'm happy would you like to give this another look over and see if it suits your needs?

At the moment I'm thinking of giving people access to the book.toml config file's contents with some config.get("search") method that returns an arbitrary toml::Value which will be specific to their plugin/feature/use case. You can also convert a Value into any arbitrary serializable thing using Value::try_into().

@Michael-F-Bryan
Copy link
Contributor Author

Michael-F-Bryan commented Nov 11, 2017

I think I've found quite a nice API for the Config. It's probably easiest to explain by showing a couple of the tests:

#[test]
fn load_arbitrary_output_type() {
    #[derive(Debug, Deserialize, PartialEq)]
    struct RandomOutput {
        foo: u32,
        bar: String,
        baz: Vec<bool>,
    }

    let src = r#"
    [output.random]
    foo = 5
    bar = "Hello World"
    baz = [true, true, false]
    "#;

    let should_be = RandomOutput {
        foo: 5,
        bar: String::from("Hello World"),
        baz: vec![true, true, false],
    };

    let cfg = Config::from_str(src).unwrap();
    let got: RandomOutput = cfg.get_deserialized("output.random").unwrap();

    assert_eq!(got, should_be);

    let baz: Vec<bool> = cfg.get_deserialized("output.random.baz").unwrap();
    let baz_should_be = vec![true, true, false];

    assert_eq!(baz, baz_should_be);
}

(get_deserialized() is just a helper which will get() a key then use serde to deserialize it automagically)

And I've also added a get_mut() function so you can mutate elements in the Config.

#[test]
fn mutate_some_stuff() {
    // really this is just a sanity check to make sure the borrow checker
    // is happy...
    let src = COMPLEX_CONFIG;
    let mut config = Config::from_str(src).unwrap();
    let key = "output.html.playpen.editable";

    assert_eq!(config.get(key).unwrap(), &Value::Boolean(true));
    *config.get_mut(key).unwrap() = Value::Boolean(false);
    assert_eq!(config.get(key).unwrap(), &Value::Boolean(false));
}

@budziq if you get a moment, can you have a look at this and let me know what you think? All the interesting stuff is in src/config.rs.

@Phaiax
Copy link

Phaiax commented Nov 11, 2017

This is a good design. 👍

I only looked over the config.rs and that seems good to me.

@Michael-F-Bryan
Copy link
Contributor Author

Something that's just occurred to me is that this will be a breaking change. At the moment configuration things specific to the book are present as top level items, e.g.

title = "my book"
authors = ["Michael-F-Bryan"]

[output.html]
theme = "./path/to/theme"

Whereas with this configuration I've taken some inspiration from Cargo.toml and put the "metadata" stuff under their own section.

Meaning the previous example should now look like this:

[book]     #   <-- new section
title = "my book"
authors = ["Michael-F-Bryan"]

[output.html]
theme = "./path/to/theme"

We're not 1.0 so technically we can do whatever we want, but at the same time I'd prefer not to break people's stuff without letting them know how to fix it. I think I'll add a compatibility thing to Config's Deserialize impl which will check for the previous format and load that, emitting a warning to tell the user to upgrade (with a link to the configuration page in the docs).

Copy link

@Phaiax Phaiax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ill restart the review because a new commit arrived


source = "my-src" # the source files will be found in `root/my-src` instead of `root/src`
src = "my-src" # the source files will be found in `root/my-src` instead of `root/src`
build-dir = "build"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You moved the build-dir into the common configuration. Hmmm. If I have multiple output generators, I cannot put them into different directorys. I think the other way was better. What are the arguments for doing it this way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thoughts were that (in the long term) you would have a single build/ directory where all renderers would put their rendered output. During the rendering stage a particular renderer is simply told the directory they should put things in, instead of each renderer doing their own thing and deciding it themselves.

That way mdbook would ensure you have a standardized directory layout looking something like this:

  • my_book
    • build
      • html
      • pdf
      • some_other_renderer
    • src
    • book.toml

With the special case being that if you're only using one renderer, mdbook will put tell a renderer to put its stuff directly in the root (avoiding the redundant extra level of nesting from build/html/).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, that sounds fine :)

Some(dest_dir) => book.with_destination(dest_dir),
None => book,
};
if let Some(dest_dir) = args.value_of("dest-dir") {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.. did you move the destination because this argument must be independent of the backend?

Copy link
Contributor Author

@Michael-F-Bryan Michael-F-Bryan Nov 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I was thinking a renderer would be told to put its stuff in the $dest-dir/renderer_name/ directory, or $dest-dir/ if there's only one renderer.

That way renderers are "sandboxed" and should only be touching stuff in the directory they're told to write output to. #409 means renderers are being given an entirely in-memory representation of the book, so these two combined would mean we're a lot more decoupled from the file system and the environment.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kk

}

cfg.rest = table;
cfg
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

src/config.rs Outdated
if is_legacy_format(&table) {
warn!("It looks like you are using the legacy book.toml format.");
warn!("We'll parse it for now, but you should probably convert to the new format.");
warn!("See the mdbook documentation for more details");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The explicit differences are mentioned neither here nor in the docs. I think it would be a nice gesture to just say:

In short words: Just move all top level configuration entries like title, author and description into a group called [book].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I've seen rustc and other tools emit messages along the same line of "this is going to stop working soon so update your stuff. If you do x, y, and z then that should fix 99% of your problems" and found it a really nice gesture.

src/config.rs Outdated
warn!("It looks like you are using the legacy book.toml format.");
warn!("We'll parse it for now, but you should probably convert to the new format.");
warn!("See the mdbook documentation for more details");
warn!("http://azerupi.github.io/mdBook/format/config.html");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just moved into rust-lang-nursery.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just updated the link and it seems to work.

@Phaiax
Copy link

Phaiax commented Nov 12, 2017

I vote for merging this finally. 💃
Any bugs or changes can be fixed later, we don't release immediately anyway.

@Michael-F-Bryan
Copy link
Contributor Author

Any bugs or changes can be fixed later, we don't release immediately anyway.

Awesome! I'll wait for travis to give the green light before hitting the merge button.

@steveklabnik
Copy link
Member

So, this is now causing the nomicon to fail building: https://travis-ci.org/rust-lang-nursery/nomicon/builds/313641706?utm_source=email&utm_medium=notification

is book.toml required now?

@Michael-F-Bryan
Copy link
Contributor Author

It looks like that was an accidental regression, when loading the config we should fall back to Config::default() if book.toml doesn't exist. I've made a PR (#506) to fix this.

@Michael-F-Bryan Michael-F-Bryan mentioned this pull request Jan 19, 2018
3 tasks
Ruin0x11 pushed a commit to Ruin0x11/mdBook that referenced this pull request Aug 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants