Skip to content

feat: etag support #746

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 12 commits into from
Closed

feat: etag support #746

wants to merge 12 commits into from

Conversation

privatenumber
Copy link

@privatenumber privatenumber commented Oct 27, 2020

This PR contains a:

  • bugfix
  • new feature
  • code refactor
  • test update
  • typo fix
  • metadata update

Motivation / Use-Case

Breaking Changes

N/A

Additional Info

Will work on tests tomorrow but opening a PR to get a head start on review and feedback.

@privatenumber privatenumber changed the title feat: Etag support feat: etag support Oct 27, 2020
@codecov
Copy link

codecov bot commented Oct 27, 2020

Codecov Report

Merging #746 (420c6fb) into master (1df8477) will decrease coverage by 6.93%.
The diff coverage is 14.28%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #746      +/-   ##
==========================================
- Coverage   99.15%   92.21%   -6.94%     
==========================================
  Files          10       10              
  Lines         236      257      +21     
  Branches       72       78       +6     
==========================================
+ Hits          234      237       +3     
- Misses          2       16      +14     
- Partials        0        4       +4     
Impacted Files Coverage Δ
src/middleware.js 67.85% <14.28%> (-32.15%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1df8477...420c6fb. Read the comment docs.

compiler.hooks.done.tap('webpack-dev-middleware', computeEtags);
} else {
compiler.plugin('done', computeEtags);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No hooks here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already can load assets here, please look at logic

Copy link
Author

@privatenumber privatenumber Oct 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw that there's access to the memfs instance, but is there a manifest object (eg. build stats) I can iterate over to get all the asset paths?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why you need iterate?

Copy link
Member

@alexander-akait alexander-akait Oct 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway if you need this, you have context, put this logic inside done hooks and collect them context.etags

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

I want to iterate a list of distribution files to efficiently calculate etags. If I only have the memfs instance, I'd have to traverse the entire fs, which isn't optimal.

export default function wrapper(context) {
const { compiler } = context;
const etagRegistry = new Map();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memory leak

Copy link
Member

@alexander-akait alexander-akait Oct 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to WeakMap and use assets source as key

Copy link
Author

@privatenumber privatenumber Oct 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, Weakmaps don't allow primitives (eg. strings) to be used as keys.

I'll clear all entries via Map#clear at the beginning of computeEtags.

Copy link
Member

@alexander-akait alexander-akait Oct 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it is still memory leak, you should not use primitives for weak map, you have asset object for this

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved the other thread about this so all conversation is in this thread.

Can you explain why there would still be a memory leak if unused etags are being cleaned up per build?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why there would still be a memory leak if unused etags are being cleaned up per build?

Because we create etagRegistry on middleware start so next compilation will put new files again and again, and if you use name.[contenthash][ext] you will increase size infinitely, it will be difficult to debug, but this is a potential leak, using WeakMap help to solve it, when asset was removed from comilation.assets it will be removed from WeakMap, so we free memory

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks for explaining.

That's addressed by etagRegistry.clear() on every build completion because .clear() removes all entries in a Map.

I'm wondering how you imagine using a WeakMap is possible? I would jump to using it if it were possible to get the asset object (they key) when a new HTTP request comes in here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's addressed by etagRegistry.clear() on every build completion because .clear() removes all entries in a Map.

it was reduce performance, we should avoid it

I'm wondering how you imagine using a WeakMap is possible? I would jump to using it if it were possible to get the asset object (they key) when a new HTTP request comes in here.

store source of assets and use it as key

@@ -49,6 +75,19 @@ export default function wrapper(context) {
return;
}

const assetEtag = etagRegistry.get(filename);
if (assetEtag) {
Copy link
Member

@alexander-akait alexander-akait Oct 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need option to enable/disable it, in future we can enable it by default

Copy link
Member

@alexander-akait alexander-akait left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also we have immutable flag in assetInfo, this flag assets will do be changed never

@privatenumber
Copy link
Author

As I work on the tests, I'm just realizing that express adds etags by default.

Feels redundant to make it disabled by default on the middleware, and this also makes it difficult to test because etags are on even if the middleware option is disabled.

Do you mind if I make this a PR against #747? That way I can pull in the vanilla HTTP version and use connect in the test.

@alexander-akait
Copy link
Member

As I work on the tests, I'm just realizing that express adds etags by default.

Yes, we will enable it by default, but we really need a option to disable this

Do you mind if I make this a PR against #747? That way I can pull in the vanilla HTTP version and use connect in the test.

Yes

@alexander-akait
Copy link
Member

I studied all the material and other implementation, so I want to implement options:

  • cacheControl
  • etag
  • lastModified

It should be not hard, WIP on this, but I think release will be on the next week only

@alexander-akait
Copy link
Member

Anyway thanks for good idea

@privatenumber
Copy link
Author

@evilebottnawi

I thought about time-based caching (cache-control/if-modified-since) as well but concluded it wouldn't be as useful as a content-based one (etag) in a rapid development environment.

I imagine it would actually lead to:

  • unexpected behavior (changed code not getting immediately fetched)
  • users not understanding the feature and disabling caching in the browser
  • people turning it off/not using it

@alexander-akait
Copy link
Member

I will provide some docs on this to help developers, anyway etag is very easy to implement, so it will be implemented in near future and enabled by default

for (const assetId in assets) {
if (hasOwnProperty.call(assets, assetId)) {
const { existsAt: fsPath } = assets[assetId];
const etag = getETag(assets[assetId].source());
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems this is no longer possible in Webpack 5, due to error:

[webpack-dev-middleware] Error: Content and Map of this Source is not available (only size() is supported)
at SizeOnlySource._error (/node_modules/webpack-sources/lib/SizeOnlySource.js:16:10)
at SizeOnlySource.source (/node_modules/webpack-sources/lib/SizeOnlySource.js:26:14)

@alexander-akait Do you have a recommended work around for this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up leveraging chunkHashs, which is actually more convenient because I can just use the hash as the etag instead of generating one from the file source:

const { chunkHashs: assetHashes } = stats.compilation.records;
for (const assetId in assetHashes) {
	if (hasOwnProp(assetHashes, assetId)) {
		etagRegistry.set(assetId, JSON.stringify(assetHashes[assetId]));
	}
}

If there's an alternative you recommend, LMK.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/cc @sokra What do you think?

const { assets } = stats.compilation;
for (const assetId in assets) {
if (hasOwnProperty.call(assets, assetId)) {
const { existsAt: fsPath } = assets[assetId];
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems existsAt doesn't exist in Webpack 5 either

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants