-
Notifications
You must be signed in to change notification settings - Fork 510
Feature request: intelligent diff #16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I think this can be done by:
thoughts? |
@harleyttd splitting sentences is more versatile than just that single test, and different types of text warrant different tactics for it. From experience, I think this should be handled by user code. |
Google's diff library to solved this problem for me, but the link above is bad: |
I agree... This would be a monumental feature. I've been using Google's Diff Match Patch, however, ran into problems where it omitted real changes that were made between comparison texts. JSDiff is built much better IMO, but falls short of Google's intelligent semantic cleanup... |
Kevin, could you give us an idea of when and if you will be adding this to jsDiff? I'd appreciate it. |
@HelpingHand1 this isn't really on my radar right now. I'd be glad to look at pull requests if someone wanted to implement, of course. |
Did anyone take this up? There are probably already algorithms to look at we could inspire ourselves from, Git off the top of my head. |
I haven't. In the meantime, I've given my users the ability to manually choose which type of diff they would like to see, but this feature would be optimal... |
In cases like this, would it make sense to merge consecutive word diffs of the same type, including the whitespace/punctuation between them? |
I've solved this by re-arranging the resulting array of diffs to chain diffs of the same type. function diffWordsPretty(words1, words2) {
const uglyDiffs = diffWords(words1, words2);
let result = [];
let added = [];
let removed = [];
uglyDiffs.forEach(diff => {
if (diff.value.split("").every(c => c === " ") && !diff.removed && !diff.added) {
added.push({ ...diff, added: true });
removed.push({ ...diff, removed: true });
} else if (diff.added) {
added.push(diff);
} else if (diff.removed) {
removed.push(diff);
} else {
result = [ ...result, ...removed, ...added, diff ];
added = [];
removed = [];
}
});
result = [ ...result, ...removed, ...added ];
return result;
} |
Hmm. I've recently offered to help out maintaining jsdiff and am looking through old issues. I think this issue is really two issues in one. On the one hand we have a clear observation of a problem I agree is real and would be nice to fix: word diffs of texts that have been substantially rewritten are a nightmare to read because they alternate between word removed, word added, word removed, word added etc; it'd be nicer if we got a chain of word removals followed by a chain of word additions. (@johnloven presents a workaround to achieve this.) On the other hand, we have the titular request for a diff that automatically decides between character-level, word-level, or sentence-level diffs based on the density of changes. I think my answer to the actual, specific request in this issue is "no", since:
For that reason, I'm going to close this issue. However, I wonder if we can nonetheless fix the problem that motivated this feature request, perhaps by baking logic like @johnloven's into jsdiff. I'll think about it some more... |
FWIW it looks to me at a glance like @gdavoianb's proposal at #160 is very similar in nature to @johnloven's in this thread, and aims at solving the very same problem that this issue was motivated by. I haven't studied it in detail yet, but I will try to find time to do so in due course. |
Hi,
Thanks for the great library! I have one problem with it, though. With larger changes, this library just falls to pieces. For example, running this:
Produces this:
In the majority of cases, I want either word diffs or character diffs, but as larger changes are made (which isn't as common), I find it unpleasant. A character replacement here is even worse.
Would it possible to add an "intelligent" diff mode which will detect the density of diffs in a word or sentence and run, say, a sentence diff on that sentence instead?
Thanks a lot!
The text was updated successfully, but these errors were encountered: