Offline support #546

Open
sorin-davidoi opened this issue Jan 16, 2018 · 24 comments

@sorin-davidoi
Contributor

sorin-davidoi commented Jan 16, 2018

We could leverage the Service Worker API to make the book available offline.

Brief outline:

  • Off by default
  • Add option to make available offline
  • Cache all JavaScript and HTML files in the Service Worker
  • Handle offline state in the UI
  • Handle update logic
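The first four items of the outline can be sketched roughly as follows. This is a minimal sketch, not mdBook's implementation: `urlsToPrecache` and `chooseSource` are hypothetical names, and plain arrays stand in for the browser's Cache API so the decision logic runs outside a browser.

```typescript
// Where a request should be answered from.
type Source = "cache" | "network" | "offline-page";

// "Off by default": nothing is precached until the user opts in.
// Once they do, only the HTML and JavaScript files are selected.
function urlsToPrecache(userOptedIn: boolean, bookFiles: readonly string[]): string[] {
  if (!userOptedIn) {
    return [];
  }
  return bookFiles.filter((file) => /\.(html|js)$/.test(file));
}

// Cache-first lookup: use the cached copy when there is one, otherwise go to
// the network, and show an offline notice when neither is possible.
function chooseSource(url: string, cachedUrls: readonly string[], online: boolean): Source {
  if (cachedUrls.indexOf(url) !== -1) {
    return "cache";
  }
  return online ? "network" : "offline-page";
}
```

In a real service worker the same decision would sit inside a `fetch` event handler, with the array lookup replaced by a lookup in the Cache API.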

The main question is: would this be worth it? If I understand correctly, The Rust Book is bundled with Rust (or packaged in some distributions), so most of the people interested in it already have it locally. But then again, it might be nice to provide this regardless.

@Michael-F-Bryan
Contributor

How would an offline version work? My understanding is that it'd have to cache all the pages in the book and if the user goes offline we serve pages up from the cache. That'd involve the user effectively trying to download everything the moment they visit a book and the service worker starts up in the background, wouldn't it? That's probably not a good idea for something as big as The Book or Rust By Example...

My opinion is that if it's easy enough to implement and maintain, then I'm all for it. I've (briefly) played around with Service Workers and found they can have issues with updating and caching. That could just be me being a noob though.

@sorin-davidoi
Contributor Author

We could make it configurable, such that the user makes an active choice regarding which chapters should be available offline. My idea would be to cache nothing by default and show an unobtrusive prompt the first time the user loads the page, offering to make the book available offline (if it's small enough, otherwise just the first few chapters).

@Michael-F-Bryan
Contributor

> We could make it configurable, such that the user makes an active choice regarding which chapter should be available offline.

I don't know if this would work in practice. People reading a book probably won't want to spend a while configuring how they read the book, and I'm quite conscious of how many buttons and options we add to the UI.

@sorin-davidoi
Contributor Author

Yes, I guess you're right. Could we then try to cache as much as possible?

@Michael-F-Bryan
Contributor

I just checked and the user guide itself is already 1.5M (644K when compressed), so storing the entire book may be a bit heavy. If you're only counting the HTML though, it drops to 284k (uncompressed).

What are your thoughts?

@sorin-davidoi
Contributor Author

Well, it has to be HTML + JS, but I think that it is perfectly reasonable - this PWA caches entire videos in the Service Worker, so I don't think we should hit major roadblocks.

We could even compute, at compile time, how much we need to store and show that to the user. I imagine a small banner at the bottom, something like: "Want to make available offline (2.45 MB)?"
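A rough sketch of that idea, assuming the build step can produce a list of output files with their sizes. The `Asset` shape and both function names are hypothetical, and sizes are reported in decimal megabytes:

```typescript
// One file emitted by the build, with its size on disk (hypothetical shape).
interface Asset {
  path: string;
  bytes: number;
}

// Sum the sizes of everything that would be stored for offline use.
function totalBytes(assets: readonly Asset[]): number {
  return assets.reduce((sum, asset) => sum + asset.bytes, 0);
}

// Build the banner text, with the size in decimal megabytes (1 MB = 1,000,000 bytes).
function offlinePrompt(assets: readonly Asset[]): string {
  const megabytes = totalBytes(assets) / 1000000;
  return "Want to make available offline (" + megabytes.toFixed(2) + " MB)?";
}
```

Because the total is known when the book is built, the string could be baked into the generated page rather than computed in the browser.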

@Michael-F-Bryan
Contributor

Yeah, I guess that's feasible, although we'd need to make sure the notification collapses down or hides when the user isn't interacting with it.

How does cache invalidation work? At the moment your typical dev experience is to run mdbook serve on your computer and then it'll livereload as you're editing. If the service worker gets in the way of that livereload, or if it tries to download the entire book a dozen times a minute (because you typed a couple words, hit save, then typed some more, all in quick succession), then that's gonna be a problem...
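One common answer to the invalidation question is to ship a manifest that maps each file to a revision (for example a content hash) and to re-download only the entries whose revision changed. The sketch below assumes such a manifest exists; `Manifest`, `filesToRefetch`, and `filesToEvict` are hypothetical names, not part of mdBook.

```typescript
// Maps each file path to a revision string, for example a content hash.
type Manifest = Record<string, string>;

// Files that are new or whose revision changed since the last build.
function filesToRefetch(previous: Manifest, next: Manifest): string[] {
  return Object.keys(next)
    .filter((path) => previous[path] !== next[path])
    .sort();
}

// Files that no longer exist in the new build and can be dropped from the cache.
function filesToEvict(previous: Manifest, next: Manifest): string[] {
  return Object.keys(previous)
    .filter((path) => !Object.prototype.hasOwnProperty.call(next, path))
    .sort();
}
```

With this scheme, saving a single chapter several times in a row would cause that chapter to be fetched again, not the whole book.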

We can also use book.toml to tell the HTML renderer whether to include the service worker.

@sorin-davidoi
Contributor Author

Service Workers are usually disabled in the development environment to avoid the issues you've mentioned. Not sure how that would work in this case.

@ghost

ghost commented Feb 22, 2018

This is a great idea. The service worker acts as a cache and updater, pulling from a central place such as an S3 bucket.

Does stdweb provide Service Worker scaffolding?
https://github.com/koute/stdweb

@jasonwilliams
Member

jasonwilliams commented May 10, 2018

@Michael-F-Bryan having something like The Book available offline is exactly the sort of thing service workers were built for. We should certainly do this.
We could either do it automatically (users can easily clear their cache) or add a “store for offline” button. I don’t think 1.5M is heavy, but that’s subjective.

@sorin-davidoi what’s the update on your PR?

Linking: rust-lang/rust#20866 (comment)

@Michael-F-Bryan
Contributor

@jasonwilliams I agree that it'd be super useful and would be another step towards making mdbook more convenient to work with offline.

We just need to keep in mind that a book isn't always backed by a server, so something like mdbook build --open should continue to work and not care about service workers. From memory there are some general issues with running JavaScript from a file://... URL and using it to fetch documents from the file system.

I support the idea, we just need to be mindful of how it's implemented so it'll integrate well with the current workflow. Off the top of my head, this would require we

  • make sure books are still usable without a server
  • not use exorbitant amounts of bandwidth/memory
  • ensure the service worker cache gets cleared automatically during development (clearing the browser cache and restarting/updating workers manually is a pain)
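The first and third requirements could be approached by never registering the worker in the situations where it would get in the way. Service workers only run in a secure context, so a book opened from a `file://` URL cannot use one anyway; local development hosts can be excluded explicitly. This is only a sketch: `shouldRegisterServiceWorker`, `PageLocation`, and the `offlineEnabled` flag are hypothetical names.

```typescript
// The two fields of window.location this check needs.
interface PageLocation {
  protocol: string; // e.g. "https:", "http:", "file:"
  hostname: string; // e.g. "example.github.io", "localhost"
}

function shouldRegisterServiceWorker(loc: PageLocation, offlineEnabled: boolean): boolean {
  // Assumption: the book author opted in through some configuration flag.
  if (!offlineEnabled) {
    return false;
  }
  // Books opened straight from disk (file://) or over plain http are skipped.
  if (loc.protocol !== "https:") {
    return false;
  }
  // Local development servers are skipped so live reload is never served stale files.
  const devHosts = ["localhost", "127.0.0.1", "[::1]"];
  return devHosts.indexOf(loc.hostname) === -1;
}
```

In the browser this would be called with `window.location` before any call to `navigator.serviceWorker.register`.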

@sorin-davidoi
Contributor Author

@jasonwilliams No updates, there were some issues that needed to be handled before implementing this (see #571).

@jasonwilliams
Member

@sorin-davidoi what are the issues?
Is there anything I can pick up from this to carry on the work you were doing? It looks like this has just come to a stop.

@sorin-davidoi
Contributor Author

@jasonwilliams I think the main issue was the use of external dependencies, since their paths need to be hardcoded in multiple places (which also makes it much harder to compute their revision). Not sure how valid this still is, since I stopped watching this project a while back.

@jasonwilliams
Member

@sorin-davidoi thanks, why are the changing paths a problem?
The Workbox tool you use seems to accept regular expressions; could you not make a regex based on the version? Or does this not solve the problem?
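To make the regex idea concrete, here is a sketch that matches a dependency path regardless of its version number. The `/vendor/<name>/<version>/<file>` layout is purely an assumption for illustration and is not how mdBook or the PR lays out its dependencies. Workbox's `registerRoute` accepts a `RegExp` as its matcher, so a pattern like this could be passed to it.

```typescript
// Hypothetical URL layout: /vendor/<name>/<major.minor.patch>/<file>
const versionedDependency = /\/vendor\/[a-z0-9-]+\/\d+\.\d+\.\d+\/[^/]+$/;

// True for any URL that points at a versioned third-party file,
// whichever version is currently in use.
function isVersionedDependency(url: string): boolean {
  return versionedDependency.test(url);
}
```

The pattern avoids hardcoding a version, but it does not by itself tell the worker which files to prefetch.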

I can start by bringing the external deps into the project locally, but is this a hard copy and paste or a build function which can fetch these deps? (like npm)

@sorin-davidoi
Contributor Author

> The Workbox tool you use seems to accept regular expressions; could you not make a regex based on the version?

I guess that could work.

> I can start by bringing the external deps into the project locally, but is this a hard copy and paste or a build function which can fetch these deps? (like npm)

Probably a hard copy, but it would be nice to hear the opinion of one of the maintainers.

@jasonwilliams
Member

Assuming the regex works, were there any other blockers to getting this working?

@sorin-davidoi
Contributor Author

Ideally you would want to have a list of all these assets upfront, so they can be prefetched. If you don't have that you risk running into the following situation:

  • all the pages (html) are prefetched
  • user goes offline
  • user goes to page X which we didn't load before
  • page X needs resource R (e.g. MathJax, syntax highlighting for a particular language) but it can't be fetched
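The failure described in this list can be caught at build time if each page's required resources are known. The sketch below assumes the build can produce that information; `Page`, its `requires` field, and `missingOffline` are hypothetical names.

```typescript
// A rendered page and the extra resources it loads (hypothetical shape).
interface Page {
  url: string;
  requires: string[];
}

// For every page, list the resources it needs that are not in the precache list.
// An empty result means every page can be rendered offline.
function missingOffline(
  pages: readonly Page[],
  precached: readonly string[],
): Record<string, string[]> {
  const gaps: Record<string, string[]> = {};
  for (const page of pages) {
    const missing = page.requires.filter((resource) => precached.indexOf(resource) === -1);
    if (missing.length > 0) {
      gaps[page.url] = missing;
    }
  }
  return gaps;
}
```

A build could fail, or just warn, whenever the result is non-empty.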

@jasonwilliams
Member

jasonwilliams commented Aug 9, 2019

So long as we've identified all the assets (which I think you have in the PR), we should be OK.
Is there a good directory right now to put all of these assets?
I'm guessing in theme.

@jasonwilliams
Member

Help Needed!

#1000 seems to be working pretty well from what I can see so far; I just need some feedback now. The PR allows the Rust book to be read offline, and new changes will still take effect.

You can just navigate to https://jason-williams.co.uk/book/ and try it out, then leave any feedback in the issue above.

@sanmai-NL

sanmai-NL commented Nov 9, 2020

@jasonwilliams I’d like to help you. I use mdBook in higher education. Even today, we had a problem with our harmless github.io subdomain being blocked by Cisco Umbrella. Also, education administrators rightfully dislike educational content being dependent on (uncontracted) service providers’ availability. Having our mdBook available offline, this time from its canonical URL rather than a local copy (which can get out of date, which is hard to manage for hundreds of students at the same time), would be a great feature.

@nihaals
Contributor

nihaals commented Nov 9, 2020

github.io seems to be a common thing to be blocked, even outside Cisco.

@sanmai-NL

@nihaals: yet at the same time it is a common and convenient publishing domain ...

By the way, this is a duplicate of #463.

@nihaals
Contributor

nihaals commented Nov 9, 2020

The way you can help is by reading from #1000 (comment) and giving any feedback or ideas (or possibly implementing the algorithm I wrote, which can just be updated if we end up using only one of the config options).
