
feat(vector-db): add cve_packages table #1243


Merged
merged 1 commit into stacklok:main on Mar 7, 2025


samuv
Contributor

@samuv samuv commented Mar 6, 2025

This PR introduces a new table, cve_packages, to store package-version pairs that have at least one vulnerability classified as high or critical. The data is stored in the S3 bucket alongside malicious, deprecated, and archived data. The goal is to have this data ready for when we export the dependency list. The data is generated by the GitHub Action and is currently triggered manually.

Changes

Added a new table cve_packages with the following fields:

  • name (TEXT, NOT NULL) – package name
  • version (TEXT, NOT NULL) – package version
  • type (TEXT, NOT NULL) – package type (e.g., npm, pypi)

Created indexes on the name, name:version, and name:version:type fields to optimize queries.
Modified import_packages.py to process and insert package versions with high/critical vulnerabilities into cve_packages.
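
For reviewers who want to picture the schema, the changes above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it assumes a SQLite database, and the index names, the `SAMPLE_ROWS` input shape, the `severity` field, and the helper functions are all hypothetical.

```python
import sqlite3

# Hypothetical input: one record per package version with its highest
# vulnerability severity. The real import data may be shaped differently.
SAMPLE_ROWS = [
    {"name": "example-pkg", "version": "1.0.0", "type": "npm", "severity": "high"},
    {"name": "example-pkg", "version": "1.0.1", "type": "npm", "severity": "low"},
    {"name": "example-lib", "version": "2.3.0", "type": "pypi", "severity": "CRITICAL"},
]

SEVERITIES_TO_KEEP = {"high", "critical"}


def create_cve_packages(conn: sqlite3.Connection) -> None:
    """Create the table and the three indexes described in the PR."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS cve_packages (
            name    TEXT NOT NULL,
            version TEXT NOT NULL,
            type    TEXT NOT NULL
        );
        -- Index names are assumptions made for this sketch.
        CREATE INDEX IF NOT EXISTS idx_cve_name
            ON cve_packages (name);
        CREATE INDEX IF NOT EXISTS idx_cve_name_version
            ON cve_packages (name, version);
        CREATE INDEX IF NOT EXISTS idx_cve_name_version_type
            ON cve_packages (name, version, type);
        """
    )


def insert_high_critical(conn: sqlite3.Connection, rows: list[dict]) -> int:
    """Insert only the package versions rated high or critical."""
    wanted = [
        (r["name"], r["version"], r["type"])
        for r in rows
        if r["severity"].lower() in SEVERITIES_TO_KEEP
    ]
    conn.executemany(
        "INSERT INTO cve_packages (name, version, type) VALUES (?, ?, ?)",
        wanted,
    )
    conn.commit()
    return len(wanted)


conn = sqlite3.connect(":memory:")
create_cve_packages(conn)
inserted = insert_high_critical(conn, SAMPLE_ROWS)
```

The low-severity row is filtered out before insertion, so the table only ever holds versions with at least one high or critical vulnerability.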

@samuv samuv force-pushed the add-cve-to-vector-db branch from 72c1e48 to 901cbbe Compare March 6, 2025 15:00
Contributor

@aponcedeleonch aponcedeleonch left a comment


Looks good! Should we calculate the embeddings for the packages with CVEs? Or is that left for a future PR?

@samuv
Contributor Author

samuv commented Mar 7, 2025

> Looks good! Should we calculate the embeddings for the packages with CVEs? Or is that left for a future PR?

For packages with CVEs, we realized that embeddings aren’t needed, since the lookup is just a straightforward SELECT query on package:version.
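
To illustrate why no embeddings are involved, the lookup could look something like the sketch below. It assumes a SQLite table with the three columns from the PR description; the function name `has_high_or_critical_cve` and the sample row are hypothetical and not taken from the codebase.

```python
import sqlite3


def has_high_or_critical_cve(
    conn: sqlite3.Connection, name: str, version: str, pkg_type: str
) -> bool:
    """Exact-match lookup: a row exists only for flagged package versions."""
    row = conn.execute(
        "SELECT 1 FROM cve_packages "
        "WHERE name = ? AND version = ? AND type = ? LIMIT 1",
        (name, version, pkg_type),
    ).fetchone()
    return row is not None


# Self-contained setup with one made-up row, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cve_packages ("
    "name TEXT NOT NULL, version TEXT NOT NULL, type TEXT NOT NULL)"
)
conn.execute(
    "INSERT INTO cve_packages (name, version, type) VALUES (?, ?, ?)",
    ("example-pkg", "1.0.0", "npm"),
)

flagged = has_high_or_critical_cve(conn, "example-pkg", "1.0.0", "npm")
not_flagged = has_high_or_critical_cve(conn, "example-pkg", "1.0.1", "npm")
```

Because the match is on exact strings, a plain indexed query is enough and no similarity search is required.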

@aponcedeleonch aponcedeleonch merged commit 43de72a into stacklok:main Mar 7, 2025
5 checks passed