Apply rules to annotate candidates (in addition to the ML part of the filter-rank module) #150

Riruk · 2021-05-11T09:06:51Z

We could use hard-coded rules to identify commits that are relevant to security fixes.

For example: if a commit message says that the commit fixes a particular CVE, then that commit is a very strong candidate for the vulnerability fix.
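A minimal sketch of such a rule, assuming the commit message and the CVE identifier are available as plain strings (the function name `references_cve` is hypothetical, not part of the existing code):

```python
import re

# CVE identifiers look like CVE-<year>-<sequence of at least 4 digits>.
CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)


def references_cve(commit_message: str, cve_id: str) -> bool:
    """Return True if the commit message mentions the given CVE identifier."""
    mentioned = {match.upper() for match in CVE_PATTERN.findall(commit_message)}
    return cve_id.upper() in mentioned
```

Comparing whole identifiers (rather than doing a substring search) avoids matching CVE-2020-1234 against a message that only mentions CVE-2020-12345.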

I will create a PR for the implementation, and I propose using this issue to discuss which hard-coded rules we could create.

Riruk · 2021-05-11T09:12:17Z

The initial work is done in the context of the following pull request: #149

copernico · 2021-05-11T09:24:49Z

Another rule could be something like:

the advisory mentions a file (or class), and a given candidate is the only one (or one of the few) that touches that file (or class)
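A rough sketch of how this could be checked, assuming we already have a mapping from candidate commit ids to the paths they change (the function name, the mapping, and the `max_candidates` threshold are all hypothetical):

```python
from pathlib import PurePosixPath
from typing import Dict, List


def candidates_touching_file(
    changed_files: Dict[str, List[str]],
    mentioned_file: str,
    max_candidates: int = 3,
) -> List[str]:
    """Return the candidates that touch the file mentioned in the advisory,
    but only if they are few enough for the match to be informative."""
    matching = [
        commit_id
        for commit_id, paths in changed_files.items()
        # Compare file names only, so "User.java" does not match "SuperUser.java".
        if any(PurePosixPath(path).name == mentioned_file for path in paths)
    ]
    return matching if len(matching) <= max_candidates else []
```

What counts as "one of the few" is a judgment call; the default of 3 is only a placeholder.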

copernico · 2021-05-11T09:47:52Z

Another obvious one:

the Advisory mentions the commit at hand

NOTE: this is not conclusive per se (some advisories point to commits that contain changelog changes, not the actual code fixes).
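As a sketch only, assuming the advisory is available as plain text and commits are identified by their hexadecimal hash (the function name is hypothetical):

```python
import re

# Full or abbreviated commit hashes: 7 to 40 hexadecimal characters.
HASH_PATTERN = re.compile(r"\b[0-9a-f]{7,40}\b")


def advisory_mentions_commit(advisory_text: str, commit_id: str) -> bool:
    """Return True if the advisory text contains the commit hash or a prefix of it."""
    commit_id = commit_id.lower()
    mentioned = HASH_PATTERN.findall(advisory_text.lower())
    return any(commit_id.startswith(candidate) for candidate in mentioned)
```

The pattern can also match unrelated hexadecimal strings or long numbers, so, in line with the note above, a match should be treated as a hint rather than as proof.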

copernico · 2021-05-11T10:20:43Z

As for the naming: I guess we could settle on calling these just "rules" (implying "manually-defined", or "handcrafted"),
unless the context might cause ambiguity.

copernico · 2021-05-17T13:53:10Z

Hi @Riruk, any progress on this issue?

Riruk · 2021-05-17T14:09:25Z

Hi @copernico, I was a bit busy last week with a paper. It's submitted now, so I will get back to working on the prospector tool tomorrow.

Riruk · 2021-05-25T14:06:05Z

What about adding a rule "contains text in commit message"?

copernico · 2021-05-26T14:56:56Z

> What about adding a rule "contains text in commit message"?

The "text" would be a keyword extracted from the advisory? If so, yes, definitely a useful rule.

copernico · 2021-05-29T07:24:52Z

Work continues in #161

copernico · 2021-05-31T10:44:31Z

I think we need to elaborate on the representation of the results to be shown to the user before proceeding.
I would propose that, after the candidates are obtained from Git and they are processed to compute their features, we apply one or more "analysis" steps that produce "annotations" to be attached to the candidates. The user will then be able to inspect the results by seeing which "annotations" are attached to each candidate.

Examples:

commit X1 from repository R

Reason: token "user" in advisory matches the path "src/main/whatever/User.java" changed in the commit
Reason: the commit is in tag "v1.2.0" but not in the subsequent "v1.2.1"

Note: the reasons are human-readable, but are constructed automatically based on the candidate features and annotations.
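To make the proposal more concrete, here is a minimal sketch of the annotation mechanism. The `Candidate` class, the `Rule` signature, and the function names are illustrative assumptions, not the actual data model of the tool:

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Candidate:
    commit_id: str
    repository: str
    changed_paths: List[str]
    annotations: List[str] = field(default_factory=list)


# A rule inspects a candidate and the advisory tokens, and returns a
# human-readable reason if it applies, or None otherwise.
Rule = Callable[[Candidate, List[str]], Optional[str]]


def path_token_rule(candidate: Candidate, advisory_tokens: List[str]) -> Optional[str]:
    """Report the first advisory token that appears in a changed path."""
    for token in advisory_tokens:
        for path in candidate.changed_paths:
            if token.lower() in path.lower():
                return (
                    f'token "{token}" in advisory matches the path '
                    f'"{path}" changed in the commit'
                )
    return None


def annotate(candidate: Candidate, advisory_tokens: List[str], rules: List[Rule]) -> Candidate:
    """Apply every rule and attach the resulting reasons as annotations."""
    for rule in rules:
        reason = rule(candidate, advisory_tokens)
        if reason is not None:
            candidate.annotations.append(reason)
    return candidate
```

Each rule stays independent of the others, so new rules can be added to the list without touching the code that presents the annotations to the user.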

copernico · 2021-05-31T14:28:36Z

One point I'm not quite sure how to handle is the conceptual distinction between the above annotations and the features computed with the extract_* functions. In some sense, they look the same. Maybe, the annotations are what the user will see...?

Riruk · 2021-06-08T12:00:37Z

Work continues in #175

copernico assigned copernico and Riruk May 11, 2021

copernico added assuremoss component/prospector labels May 11, 2021

copernico changed the title ~~Use hard-coded rules before actual ML part of the filter-rank module~~ Use hard-coded rules before (or in addition to) the ML part of the filter-rank module May 11, 2021

copernico changed the title ~~Use hard-coded rules before (or in addition to) the ML part of the filter-rank module~~ Apply rules to annotate candidates (in addition to the ML part of the filter-rank module) May 31, 2021

copernico mentioned this issue Jun 21, 2021

Refactor CommitWithFeatures based on the new "rule" mechanism #190

Closed

copernico closed this as completed Jul 13, 2021
