This issue first showed up when generating new "is_required_phrase" rules or updating existing rules to become required phrases.
Some short rules are for license keys or names, and these can contain stopwords. For instance:
- in the "h2-1.0" license, "h2" is a stopword
- in "liliq-p-1.1" and "cern-ohl-p-2.0", "p" is a stopword
- in "Server Side Public License", "side" is a stopword
- in "LGPL licensed bash script", "script" is a stopword
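To make the effect concrete, here is a minimal sketch of how dropping stopwords changes the tokens of these short rules. The `tokenize` helper and the `STOPWORDS` set are hypothetical: the set contains only the four words named in this issue, and the splitting logic is a simplification that may differ from the actual tokenizer.

```python
import re

# Hypothetical stopword subset: only the words mentioned in this issue.
STOPWORDS = frozenset({"h2", "p", "side", "script"})


def tokenize(text, stopwords=STOPWORDS):
    """Lowercase the text, split on non-alphanumeric characters,
    and drop any word found in the stopword set.
    This is an illustrative simplification, not the real tokenizer."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in stopwords]


for rule_text in (
    "h2-1.0",
    "liliq-p-1.1",
    "Server Side Public License",
    "LGPL licensed bash script",
):
    # Print the rule text next to the tokens that survive stopword removal.
    print(repr(rule_text), "->", tokenize(rule_text))
```

Under these assumptions, "h2-1.0" keeps only its version digits, so a required phrase built from that rule no longer carries the part that identifies the license.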
There are a few ways to resolve this:
- remove some longer stopwords like "side" and "script"
- abandon using stopwords entirely
- just ensure that these rules are not generated or updated as required phrase rules, and accept a tiny bit of inaccuracy for these few seldom-seen rules
(Note that these license ids will always be matched in license expressions, since license expression matching does not ignore stopwords.)
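The third option could look roughly like the guard below: a rule is skipped as a required phrase candidate if any of its words is a stopword. This is only a sketch. `may_mark_as_required_phrase`, `contains_stopword`, and the `STOPWORDS` set are hypothetical names introduced for illustration, and the word splitting is a simplification of whatever the real tokenizer does.

```python
import re

# Hypothetical stopword subset: only the words mentioned in this issue.
STOPWORDS = frozenset({"h2", "p", "side", "script"})


def contains_stopword(rule_text, stopwords=STOPWORDS):
    """Return True if any lowercase alphanumeric word in the rule text
    is a stopword (simplified splitting, for illustration only)."""
    words = re.findall(r"[a-z0-9]+", rule_text.lower())
    return any(w in stopwords for w in words)


def may_mark_as_required_phrase(rule_text, stopwords=STOPWORDS):
    """Hypothetical guard: refuse to generate or update a rule as a
    required phrase when stopword removal would alter its tokens."""
    return not contains_stopword(rule_text, stopwords)


for rule_text in ("cern-ohl-p-2.0", "Server Side Public License", "MIT License"):
    # Show which rule texts would be skipped by the guard.
    print(repr(rule_text), "->", may_mark_as_required_phrase(rule_text))
```

The trade-off is the one described above: the skipped rules lose required phrase treatment, which costs a little accuracy for a handful of rarely seen rules.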