create_id_breakdown is a bottleneck and needs rewriting #77

@ross-spencer

Given an 8 million line SF YAML (631,286 row database), create_id_breakdown is taking too long. It is largely unoptimized and not brilliantly written, so I believe any rewrite should bring pretty decent efficiency gains. Let's have a look at what we can do.

Edit: For reference, with this one function removed, the script is quicker by over an hour and completes in 77 seconds. There may be other bottlenecks along the way, as much relies on the output here, but one step at a time.
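A timing comparison like the one above could be backed up with a profile of the call itself. This is only a sketch using the standard library's cProfile; `profile_call` is a hypothetical helper, not something in the script, and how create_id_breakdown is actually invoked is not shown here.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report).

    Hypothetical helper for illustration; not part of the script.
    """
    profiler = cProfile.Profile()
    # runcall executes the function under the profiler and returns its result.
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and show the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()
```

Wrapping the suspect call in `profile_call(...)` and printing the report should show whether the time goes into Python-level loops or into the database calls.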

NB. The rewrite could lean more heavily on sqlite queries, which do not seem to be a bottleneck at all, or it could focus on improving the data structures we're using.
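To make the two options concrete, here is a rough sketch of each. The table `id_results` and its columns `file_id` and `method` are made up for illustration and are not the real schema; the point is only the shape of the work: one GROUP BY query, or one pass over the rows into a dict of sets, rather than repeated lookups per row.

```python
import sqlite3
from collections import defaultdict


def build_demo_db():
    """Create a tiny in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    # Assumed layout: one row per (file, identification method).
    conn.execute("CREATE TABLE id_results (file_id INTEGER, method TEXT)")
    rows = [
        (1, "Signature"),
        (1, "Extension"),
        (2, "Extension"),
        (3, "Signature"),
        (4, "None"),
    ]
    conn.executemany("INSERT INTO id_results VALUES (?, ?)", rows)
    conn.commit()
    return conn


def breakdown_in_sql(conn):
    """Option 1: let sqlite count distinct files per method in one query."""
    cursor = conn.execute(
        "SELECT method, COUNT(DISTINCT file_id) "
        "FROM id_results GROUP BY method"
    )
    return dict(cursor.fetchall())


def breakdown_in_python(conn):
    """Option 2: read the rows once and group them in a dict of sets."""
    files_by_method = defaultdict(set)
    for file_id, method in conn.execute(
        "SELECT file_id, method FROM id_results"
    ):
        # Set membership is O(1) on average, unlike scanning a list.
        files_by_method[method].add(file_id)
    return {method: len(files) for method, files in files_by_method.items()}
```

Both functions return the same counts on the demo data. Which one suits the real function depends on what the rest of the script needs from the breakdown, since much relies on its output.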
