[Feature][Customize] Add Support for Incremental CSV Upload in the Customize Plugin #8216

@narrowizard

Search before asking

  • I have searched the issues and found no similar feature request.

Use case

As a DevLake user leveraging the Customize plugin to upload issues and issue_repo_commits data for further analysis, I need the ability to perform incremental CSV uploads. This would allow me to append new data to existing records without overwriting or replacing the entire dataset.

Description

Currently, the Customize plugin in DevLake only supports full data uploads, which replace all existing data with the new data from the uploaded CSV file. While this functionality works for initial data loads, it poses significant challenges as the dataset grows over time:

  1. Data Integrity Risks: Full uploads may inadvertently overwrite or lose historical data, compromising the dataset's accuracy and completeness.
  2. File Maintenance Overhead: CSV files become increasingly large as time progresses, making them cumbersome to maintain and manage.

To address these challenges, I propose adding incremental upload support to the Customize plugin. This feature would enable users to append new records from CSV files to the existing dataset without requiring a complete overwrite.

Benefits:

  • Enhanced Data Integrity: Ensures existing data remains untouched while appending new entries.
  • Improved Scalability: Reduces the need to maintain and manage increasingly large CSV files.
  • Better User Experience: Simplifies data upload workflows for users.

I envision this feature functioning as follows:

  1. Users upload a new CSV file containing only new data entries.
  2. The Customize plugin compares the uploaded data with existing records.
  3. New records are appended to the domain layer, while existing records remain unchanged.

This functionality would greatly improve the usability of the Customize plugin and make it more suitable for long-term data collection and analysis workflows.
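
To make the intended behavior concrete, here is a minimal Python sketch of the compare-and-append steps. It is only an illustration, not DevLake code: the `append_new_rows` helper, the in-memory `table` dict standing in for a domain-layer table, and the use of an `id` column as the record key are all assumptions made for this example.

```python
import csv
import io


def append_new_rows(existing, csv_text, key="id"):
    """Append rows from csv_text whose key is not already stored.

    existing: dict mapping key value -> row dict (a stand-in for a table;
    hypothetical, not the plugin's real storage).
    Returns the list of keys that were newly added.
    """
    added = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row_key = row[key]
        if row_key in existing:
            # Existing records remain unchanged, as the proposal describes.
            continue
        existing[row_key] = row
        added.append(row_key)
    return added


# Hypothetical data: one stored issue, then an upload with one old and one new row.
table = {"1": {"id": "1", "title": "old title"}}
upload = "id,title\n1,changed title\n2,new issue\n"
added = append_new_rows(table, upload)
```

In this sketch the row with `id` 1 is skipped and only the row with `id` 2 is appended. Whether a re-uploaded existing record should be skipped or updated is a design choice the actual implementation would still need to settle.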

Let me know if additional details or clarifications are needed!

Related issues

No

Are you willing to submit a PR?

  • Yes, I am willing to submit a PR!
