-
Notifications
You must be signed in to change notification settings - Fork 12.8k
Avoid writing files that are not changed while compiling incrementally. #6937
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
function writeFile(fileName: string, data: string, writeByteOrderMark?: boolean): void { | ||
// If a BOM is required, emit one | ||
if (writeByteOrderMark) { | ||
data = "\uFEFF" + data; | ||
} | ||
|
||
const md5 = getMd5(data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i do not think we should do this for all files. in a normal batch compilation secnario, you will doing a fileExits, stat, and createHash needlessly.
Instead, we should expose a new optional function on sys to getModifiedTime, and createHash. in the watch logic, we can query them as needed, and store the map outputFileName => { mtime, hash} there as well.
as noted above, we should split this into infrastructure, in sys, and the calls should be part of the watch implementation. |
…ntation into compiler host
@mhegazy Updated~ 😄 |
const hash = sys.createHash(data); | ||
const mtimeBefore = sys.getModifiedTime(fileName); | ||
|
||
if (mtimeBefore && outputFingerprints.hasOwnProperty(fileName)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the hasProperty
function
const outputFingerprints: Map<OutputFingerprint> = | ||
options.watch && sys.createHash && sys.getModifiedTime ? {} : undefined; | ||
|
||
const fileWriter: typeof sys.writeFile = outputFingerprints ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not just a function?
function writeFile(fileName, data, writeByteOrderMark) {
if (outputFingerprints) {
const hash = sys.createHash(data);
const mtimeBefore = sys.getModifiedTime(fileName);
if (mtimeBefore && hasProperty(outputFingerprints, fileName)) {
const fingerprint = outputFingerprints[fileName];
// If output has not been changed, and the file has no external modification
if (fingerprint.byteOrderMark === writeByteOrderMark &&
fingerprint.hash === hash &&
fingerprint.mtime.getTime() === mtimeBefore.getTime()) {
return;
}
}
}
sys.writeFile(fileName, data, writeByteOrderMark);
if (outputFingerprints) {
const mtimeAfter = sys.getModifiedTime(fileName);
outputFingerprints[fileName] = {
hash,
byteOrderMark: writeByteOrderMark,
mtime: mtimeAfter
};
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mhegazy I personally do not like parallel branches with same condition (because they can not indicate a strong relationship between those branches), so if I write them into a single function, I would like a single if...else
with two sys.writeFile
in each branch. And after merging them together, function fileWriter
would be awkward to exist alone with writeFile
. If I inline fileWriter
into writeFile
, it would result in a long try
block. 😢 What's your suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about two functions then:
const outputFingerprints: Map<OutputFingerprint>;
function writeFile(fileName, data, writeByteOrderMark) {
if (options.watch && sys.createHash && sys.getModifiedTime) {
writeFileIfUpdated(fileName, data, writeByteOrderMark);
}
else {
sys.writeFile(fileName, data, writeByteOrderMark);
}
}
function writeFileIfUpdated(fileName, data, writeByteOrderMark) {
if (!outputFingerprints) {
outputFingerprints = {};
}
const hash = sys.createHash(data);
const mtimeBefore = sys.getModifiedTime(fileName);
if (mtimeBefore && hasProperty(outputFingerprints, fileName)) {
const fingerprint = outputFingerprints[fileName];
// If output has not been changed, and the file has no external modification
if (fingerprint.byteOrderMark === writeByteOrderMark &&
fingerprint.hash === hash &&
fingerprint.mtime.getTime() === mtimeBefore.getTime()) {
return;
}
sys.writeFile(fileName, data, writeByteOrderMark);
const mtimeAfter = sys.getModifiedTime(fileName);
outputFingerprints[fileName] = {
hash,
byteOrderMark: writeByteOrderMark,
mtime: mtimeAfter
};
}
}
@mhegazy Thanks, updated~ |
👍 |
@DanielRosenwasser any more comments? |
Everything looks good to me. We should actually pull down the change and test out something larger like Monaco to see what sort of differences we can observe first. |
@DanielRosenwasser Tried it with VS Code (1.8 beta v.s. 1.8 beta cherry-picking related commits) on Windows 10. i7 5500u / 512G PCI-e SSD (RW 1+GB/s) Without PR Files: 1149
Lines: 316504
Nodes: 1623336
Identifiers: 509501
Symbols: 125935924
Types: 124763
Memory used: 812866K
I/O read: 0.32s
I/O write: 4.81s
Parse time: 4.69s
Bind time: 1.70s
Check time: 12.04s
Emit time: 7.23s
Total time: 25.67s
1:20:15 AM - Compilation complete. Watching for file changes.
1:20:21 AM - File change detected. Starting incremental compilation...
Files: 1149
Lines: 316504
Nodes: 1623336
Identifiers: 509501
Symbols: 126064848
Types: 124763
Memory used: 947241K
I/O read: 0.00s
I/O write: 4.13s
Parse time: 1.72s
Bind time: 0.05s
Check time: 10.64s
Emit time: 6.30s
Total time: 18.70s
1:20:40 AM - Compilation complete. Watching for file changes. With PR Files: 1149
Lines: 316504
Nodes: 1623336
Identifiers: 509501
Symbols: 125935924
Types: 124763
Memory used: 799590K
I/O read: 0.22s
I/O write: 4.74s
Parse time: 4.06s
Bind time: 2.05s
Check time: 11.04s
Emit time: 7.45s
Total time: 24.61s
1:17:41 AM - Compilation complete. Watching for file changes.
1:18:31 AM - File change detected. Starting incremental compilation...
Files: 1149
Lines: 316504
Nodes: 1623336
Identifiers: 509501
Symbols: 126064848
Types: 124763
Memory used: 905468K
I/O read: 0.02s
I/O write: 0.21s
Parse time: 1.85s
Bind time: 0.05s
Check time: 10.45s
Emit time: 2.17s
Total time: 14.53s
1:18:46 AM - Compilation complete. Watching for file changes. But if I exclude folder from Windows Defender, it doesn't help much: Without PR Files: 1149
Lines: 316504
Nodes: 1623336
Identifiers: 509501
Symbols: 125935924
Types: 124763
Memory used: 817125K
I/O read: 0.74s
I/O write: 0.91s
Parse time: 5.17s
Bind time: 1.70s
Check time: 11.63s
Emit time: 3.77s
Total time: 22.28s
1:25:02 AM - Compilation complete. Watching for file changes.
1:25:08 AM - File change detected. Starting incremental compilation...
Files: 1149
Lines: 316504
Nodes: 1623336
Identifiers: 509501
Symbols: 126064848
Types: 124763
Memory used: 920819K
I/O read: 0.00s
I/O write: 0.82s
Parse time: 1.71s
Bind time: 0.05s
Check time: 10.49s
Emit time: 2.97s
Total time: 15.22s
1:25:24 AM - Compilation complete. Watching for file changes. With PR Files: 1149
Lines: 316504
Nodes: 1623336
Identifiers: 509501
Symbols: 125935924
Types: 124763
Memory used: 810966K
I/O read: 0.29s
I/O write: 1.08s
Parse time: 4.34s
Bind time: 1.79s
Check time: 10.99s
Emit time: 3.95s
Total time: 21.07s
1:27:43 AM - Compilation complete. Watching for file changes.
1:27:51 AM - File change detected. Starting incremental compilation...
Files: 1149
Lines: 316504
Nodes: 1623336
Identifiers: 509501
Symbols: 126064848
Types: 124763
Memory used: 910171K
I/O read: 0.00s
I/O write: 0.30s
Parse time: 1.66s
Bind time: 0.03s
Check time: 10.38s
Emit time: 2.24s
Total time: 14.30s
1:28:05 AM - Compilation complete. Watching for file changes. |
I do not think the value of this change is performance optimization ( as you noted, io write time is usually negligible compared to the rest of the compilation process), the value is not updating the file timestamps unless you need to, that allows for other tools in the tool chain that are watching the outputs to better optimize their watch behavior. I am busy with the 1.8 release logistics at the moment, but should be able to pull down the change and work with it for a bit, to get some test coverage. |
@@ -559,6 +559,12 @@ namespace ts { | |||
sourceMap: false, | |||
}; | |||
|
|||
interface OutputFingerprint { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mark this /* @internal */
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DanielRosenwasser I am confused, it's not even exported, why does it need /** @internal */
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops! Misread the code, forgive me. I was taking a quick glance at this PR again since it's been a bit.
sorry for the delay. merged manually. |
@mhegazy 😄 Thanks~ |
Is there any reason to check the fileModified timestamp? I have a setup where I (for performance reasons) have the compilation split into three different processes (with a bit of overlap). This means that in the overlapping cases, the hash will match, but the fileModified time will not. It works fine if I remove the file time check from writeFileIfUpdated |
#3113
If output and related file satisfy these conditions, it will now skip outputting the file: