Handle changes when importing

It would be useful to optionally change how repeated imports are handled. Currently, if a file doesn't exist in the expected target directory, it is created. We frequently import a directory tree of files and then reorganize them in Girder, so they no longer mirror the original directory tree. Reimporting then creates duplicates of all of these files. It would be great if import had an option to "skip if file already is in Girder somewhere". This could be done by matching the import path; if the file size has changed, we would update the existing file. A more sophisticated method would be to match on the computed hash: the file might have been renamed either on the assetstore or in Girder, and if the hash matches, it would be nice not to create a duplicate. This would be slower, since the hash has to be computed.
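As a rough sketch of the matching rules described above (not Girder's actual import code), the decision could look like the following. `FileRecord`, `decide_import_action`, and the `hasher` parameter are hypothetical names introduced for illustration, and the sketch assumes each existing record stores its import path, its size, and optionally a SHA-512 digest.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class FileRecord:
    """Hypothetical stand-in for a file already known to Girder."""
    import_path: str
    size: int
    sha512: Optional[str] = None


def sha512_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha512()
    with open(path, 'rb') as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def decide_import_action(
    path: str,
    size: int,
    existing: List[FileRecord],
    use_hash: bool = False,
    hasher: Callable[[str], str] = sha512_of,
) -> Tuple[str, Optional[FileRecord]]:
    """Return ('skip' | 'update' | 'create', matching record or None)."""
    # Cheap check first: the same import path was seen before.
    for record in existing:
        if record.import_path == path:
            # Same path but a different size means the file changed on disk.
            return ('skip' if record.size == size else 'update', record)
    # Optional slow check: the file may have been renamed, so compare hashes.
    if use_hash:
        digest = hasher(path)
        for record in existing:
            if record.sha512 == digest:
                return ('skip', record)
    return ('create', None)
```

The path check runs first because it needs no file reads; the hash is only computed when the option is enabled and no path match was found. In a real implementation the linear scans would presumably be database queries.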

It would also be nice to have a feature that flags any file in Girder that is no longer available on its assetstore. For filesystem assetstores, this would confirm that the path is reachable. For S3 assetstores, this would have to confirm that the asset is still in the bucket, so it would probably be slow. If we did this, we would probably want to show a list of such files (optionally limited to a specific assetstore or a specific import path) and then offer an option to delete the associated Girder items, and probably prune empty Girder folders, too.
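The check described above might be sketched as follows for the filesystem case. `ImportedFile` and `find_missing` are hypothetical names, not part of Girder, and the `exists` parameter is an assumption that lets the same loop be reused with a different (slower) existence check, such as one that queries an S3 bucket.

```python
import os
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class ImportedFile:
    """Hypothetical stand-in for an imported Girder file."""
    file_id: str
    assetstore_id: str
    path: str


def find_missing(
    files: Iterable[ImportedFile],
    assetstore_id: Optional[str] = None,
    import_prefix: Optional[str] = None,
    exists: Callable[[str], bool] = os.path.isfile,
) -> List[ImportedFile]:
    """List files whose backing path is no longer reachable."""
    missing = []
    for f in files:
        # Optional filter: only files on one assetstore.
        if assetstore_id is not None and f.assetstore_id != assetstore_id:
            continue
        # Optional filter: only files under one import path.
        if import_prefix is not None and not f.path.startswith(import_prefix):
            continue
        if not exists(f.path):
            missing.append(f)
    return missing
```

The returned list is what would be shown to the user before any deletion of items or pruning of empty folders takes place.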


Handle changes when importing #6
