Identifiers for spdx-license-identifier rules do not align with match #3634

@mrombout

Description

For matches produced by the 1-spdx-id matcher, the rule_identifier value of a match does not align with the rule identifier listed in license_rule_references. ScanCode correctly creates a matching rule, but the hashes at the end of the two identifiers differ (see "How To Reproduce").

This prevents any automation from reliably retrieving rule information for this kind of match.

How To Reproduce

  1. Create a file with an SPDX license identifier comment:
    echo "/* SPDX-License-Identifier: MIT */" > Application.java
    
  2. Run ScanCode with the --license-references option:
    scancode --json-pp scancode_output.json --license --license-references --license-text Application.java
    
  3. In scancode_output.json (file attached) there is now one match whose rule_identifier does not exist in license_rule_references, and one license_rule_references entry whose identifier is never used:
    $ cat scancode_output.json | jq .files[].license_detections[].matches[].rule_identifier
    "spdx-license-identifier-mit-e0e2f62999b9522e22ba5602a715c2acd64e958b"
    $ cat scancode_output.json | jq .license_rule_references[].identifier
    "spdx-license-identifier-mit-2410ec7d8cecfb84d911cb1c29ba44ab907b8b8f"
    

Note how the unique hashes at the end of the two identifiers differ. Both identifiers refer to mit, and license_rule_references[].text and .files[].license_detections[].matches[].matched_text match, so they certainly refer to the same thing.
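
The same check can be automated rather than compared by eye. The following is a minimal Python sketch, not part of ScanCode: it only assumes the JSON layout visible in the jq queries above (files[].license_detections[].matches[].rule_identifier and license_rule_references[].identifier). The function names are hypothetical, and the use of a "path" key on each file entry is an assumption.

```python
import json


def match_rule_identifiers(scan):
    """Collect (path, rule_identifier) pairs for every license match in the scan."""
    pairs = []
    for file_entry in scan.get("files", []):
        # "path" is assumed to be present on each file entry; None is used otherwise.
        path = file_entry.get("path")
        for detection in file_entry.get("license_detections", []):
            for match in detection.get("matches", []):
                pairs.append((path, match.get("rule_identifier")))
    return pairs


def find_unresolved_rule_identifiers(scan):
    """Return matches whose rule_identifier has no entry in license_rule_references."""
    known = {ref.get("identifier") for ref in scan.get("license_rule_references", [])}
    return [(path, rid) for path, rid in match_rule_identifiers(scan) if rid not in known]


def find_unused_references(scan):
    """Return reference identifiers that no match points to, in sorted order."""
    used = {rid for _, rid in match_rule_identifiers(scan)}
    refs = {ref.get("identifier") for ref in scan.get("license_rule_references", [])}
    return sorted(refs - used)


def load_scan(json_path):
    """Load a ScanCode JSON output file from disk."""
    with open(json_path, encoding="utf-8") as handle:
        return json.load(handle)
```

Calling find_unresolved_rule_identifiers(load_scan("scancode_output.json")) on the attached output should report the identifier ending in e0e2f62999b9522e22ba5602a715c2acd64e958b, and find_unused_references should report the one ending in 2410ec7d8cecfb84d911cb1c29ba44ab907b8b8f. Both lists would be empty if the identifiers aligned.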

System configuration

  • What OS are you running on? (Windows/MacOS/Linux)

Linux (Ubuntu 22.04.3 LTS), 64bit

  • What version of scancode-toolkit was used to generate the scan file?
ScanCode version: 32.0.8
ScanCode Output Format version: 3.0.0
SPDX License list version: 3.21
  • What installation method was used to install/run scancode? (pip/source download/other)

With pip on Python 3.10.12.
