Great work with Dwifft. Thanks for it.
One challenge with the LCS algorithm is that it is `O(m*n)` in both CPU and memory consumption. Though this isn't a problem in practice when the size of the arrays is moderate, it becomes a problem when the arrays are large.
Anything that can be done to reduce the size of the arrays passed to the LCS algorithm has very beneficial results.
This approach, which I've used in the past in another language, is essentially to trim the matching prefixes and suffixes from the arrays, passing only the "middle" of each array to the LCS algorithm.
Of course, if there is a change at both the head and tail of the array, this optimization has no benefit. In the common case, though, the benefit is huge.
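To make the idea concrete, here's a minimal sketch of the trimming step. The function name and return shape are illustrative, not Dwifft's actual API; the real PR would wire the prefix/suffix counts back into the resulting diff's indices.

```swift
// Hypothetical helper: strips the shared prefix and suffix from two arrays
// before diffing, returning the trim counts and the remaining "middles",
// which are all the LCS algorithm then needs to see.
func trimCommonEnds<T: Equatable>(_ lhs: [T], _ rhs: [T])
    -> (prefix: Int, suffix: Int, lhsMiddle: [T], rhsMiddle: [T]) {
    // Walk forward while the heads match.
    var prefix = 0
    while prefix < lhs.count && prefix < rhs.count && lhs[prefix] == rhs[prefix] {
        prefix += 1
    }
    // Walk backward while the tails match, without overlapping the prefix.
    var suffix = 0
    while suffix < lhs.count - prefix && suffix < rhs.count - prefix
        && lhs[lhs.count - 1 - suffix] == rhs[rhs.count - 1 - suffix] {
        suffix += 1
    }
    let lhsMiddle = Array(lhs[prefix..<(lhs.count - suffix)])
    let rhsMiddle = Array(rhs[prefix..<(rhs.count - suffix)])
    return (prefix, suffix, lhsMiddle, rhsMiddle)
}
```

For example, diffing `[1, 2, 3, 4, 5]` against `[1, 2, 9, 4, 5]` trims a prefix of 2 and a suffix of 2, leaving the LCS algorithm a 1×1 problem instead of 5×5.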
Based on the guidelines, I thought it would be good to discuss the idea before opening a PR. Nevertheless, it may be easiest to just see the code.
In my previous implementation, I also added a change beyond what I've proposed here: picking an upper bound on the `m*n` work factor. When that limit is exceeded, a `reloadData` is really the best approach.
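That bound check is trivial but worth stating, since it runs on the sizes *after* trimming. The function name and default threshold below are illustrative, not from Dwifft:

```swift
// Hypothetical guard: if the trimmed arrays would still make the O(m*n)
// diff too expensive, signal that the caller should fall back to a full
// reload (reloadData) instead of computing a diff at all.
func shouldFallBackToReload(m: Int, n: Int, maxWork: Int = 1_000_000) -> Bool {
    return m * n > maxWork
}
```

The caller would run `trimCommonEnds` first, test the middles' counts with this guard, and only invoke the LCS algorithm when the remaining work is under the cap.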