-
Notifications
You must be signed in to change notification settings - Fork 153
fix config caching #1129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix config caching #1129
Conversation
docs/display-race.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed these two test pages were inadvertently committed.
test/build-test.ts
Outdated
@@ -42,7 +42,7 @@ describe("build", () => { | |||
|
|||
await rm(actualDir, {recursive: true, force: true}); | |||
if (generate) console.warn(`! generating ${expectedDir}`); | |||
const config = Object.assign(await readConfig(undefined, path), {output: outputDir}); | |||
const config = {...await readConfig(undefined, path), output: outputDir}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is to avoid mutating the Config
instance: the same config is used by other tests.
@@ -4,7 +4,7 @@ import {normalizeConfig as config, mergeToc, readConfig, setCurrentDate} from ". | |||
import {LoaderResolver} from "../src/dataloader.js"; | |||
|
|||
describe("readConfig(undefined, root)", () => { | |||
before(() => setCurrentDate(new Date("2024-01-11T01:02:03"))); | |||
before(() => setCurrentDate(new Date("2024-01-10T16:00:00"))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is to match the build tests — since the config is now cached, we want the footer to have the same data, so the behavior of the tests is the same independent of order.
@@ -84,7 +85,7 @@ export async function searchIndex(config: Config, effects = defaultEffects): Pro | |||
) | |||
); | |||
|
|||
indexCache.set(config, {json, freshUntil: +new Date() + reindexingDelay}); | |||
indexCache.set(pages, {json, freshUntil: Date.now() + reindexDelay}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cache is keyed by the pages
instead of the config
so that if the set of pages change, the search index is recomputed.
try { | ||
visited.add(statSync(join(root, ".observablehq")).ino); | ||
} catch { | ||
// ignore the .observablehq directory, if it exists |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that if you have a .md
file in your docs/.observablehq/cache
, it was being picked up by the default pages. We don’t want that, so let’s just ignore everything in .observablehq
under the root.
for (const file of visitMarkdownFiles(root)) { | ||
if (file === "index.md" || file === "404.md") continue; | ||
const source = readFileSync(join(root, file), "utf8"); | ||
const parsed = parseMarkdown(source, {path: file, md}); | ||
files.push({file, source}); | ||
hash.update(file).update(source); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also tried hashing the mtime of the file instead of the source, which is slightly faster in the cached case. But I don’t think it’s fast enough that it matters, and this felt simpler.
@@ -88,18 +89,29 @@ export async function readDefaultConfig(root?: string): Promise<Config> { | |||
return normalizeConfig(await importConfig(tsPath), root); | |||
} | |||
|
|||
let cachedPages: {key: string; pages: Page[]} | null = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m only storing a single value (the most recent value) in the cache for readPages
, just to keep things simple. I don’t expect that there would be more than one set of pages needed simultaneously.
(the issue predates this PR)
const cachedConfig = configCache.get(spec); | ||
if (cachedConfig) return cachedConfig; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you prefer this pattern over:
if (configCache.has(spec)) return configCache.get(spec);
For performance maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For TypeScript. (Though, it could be faster, too.)
Previously you had declared the WeakMap
without types, so keys and values were type any
by default. In this case TypeScript considers the return value of map.get
as any
, too, and hence doesn’t require that it returns a defined value and allows map.get(key).property
even though it is unsafe.
When we declare the cache as WeakMap<Config["pages"], {json: string; freshUntil: number}>
, then TypeScript will enforce type checking because map.get(key)
now returns {json: string; freshUntil: number} | undefined
. So if we don’t use the pattern above, we get an Object is possibly 'undefined'
type error:

TypeScript isn’t smart enough to know that if map.has(key)
returns true, then a subsequent map.get(key)
with the same key
returns a defined value. (There’s no way to declare such a relationship in an interface, at least not yet.) So as a type-safe alternative, we call map.get(key)
first, and then test the returned value which can’t change, and hence TypeScript knows that the value is defined.
src/search.ts
Outdated
processTerm(term) { | ||
return term.match(/\p{N}/gu)?.length > 6 ? null : term.slice(0, 15).toLowerCase(); // fields to return with search results | ||
processTerm(term: string) { | ||
return (term.match(/\p{N}/gu)?.length ?? 0) > 6 ? null : term.slice(0, 15).toLowerCase(); // fields to return with search results |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm moving this comment to line 22 (the issue predates this PR, I just happened to see it here).
Previously
normalizeConfig
would return a newConfig
instance given the same spec. We want it to return the same instance given the same spec, so that downstream code (such as search) can use theConfig
as a key in aWeakMap
. This applies two additional optimizations:readPages
is now cached, keyed on the file contents within the directory; andparseMarkdownMetadata
is used instead ofparseMarkdown
withinreadPages
, avoiding unnecessary work to compute the default page list.