Switch to Papa Parse for CSV data loading and parsing in a worker thread #59

Closed
RandomFractals opened this issue Jan 1, 2022 · 3 comments
Labels
data Data task enhancement New feature or request

Comments

@RandomFractals
Owner

Even though I got the data parsing part decently tuned with the tableschema library and the table.iter() setup in #50, I still need to move parsing into worker threads.

Papa Parse offers worker-thread parsing as one of its config options, and it looks more up to date and more resilient to malformed CSV data files.

https://www.papaparse.com/
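
For reference, a minimal sketch of what such a config could look like. The `CsvParseConfig` type and `createWorkerParseConfig` helper are hypothetical names made up for illustration; only the option names (`worker`, `header`, `dynamicTyping`, `skipEmptyLines`) come from the Papa Parse docs. Note that the `worker` option relies on Web Workers, so this assumes parsing runs in a browser or webview context.

```typescript
// Hypothetical local type mirroring a few documented Papa Parse config options.
// It is declared here only so the sketch type-checks without the library.
interface CsvParseConfig {
  worker: boolean;         // parse in a Web Worker instead of the main thread
  header: boolean;         // treat the first row as column names
  dynamicTyping: boolean;  // convert numeric and boolean strings while parsing
  skipEmptyLines: boolean; // ignore blank lines in the input
}

// Hypothetical helper: builds the options for off-main-thread parsing.
function createWorkerParseConfig(dynamicTyping: boolean = true): CsvParseConfig {
  return {
    worker: true,
    header: true,
    dynamicTyping,
    skipEmptyLines: true,
  };
}

// Usage, assuming `Papa` is the imported papaparse module and `file` is a
// browser File object (both are assumptions, not shown here):
//
//   Papa.parse(file, {
//     ...createWorkerParseConfig(),
//     complete: (results) => console.log(results.data.length, results.errors),
//   });
```

Keeping the options in one helper makes it easy to flip `dynamicTyping` later without touching the call site.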

@RandomFractals
Owner Author

RandomFractals commented Jan 1, 2022

This looks promising. Parsing CTA ridership CSV from #56 in a browser with Papa Parse:

papa-parse-demo

@RandomFractals
Owner Author

Takes about 2 minutes to parse 1.1 GB of data with dynamic typing. This can be sped up further by turning off dynamic typing and letting the grid handle data formatting when it displays the visible rows ...
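
A rough sketch of that idea, assuming rows are parsed with `dynamicTyping: false` so every cell stays a string, and values are converted only for the page of rows currently on screen. All names here (`RawRow`, `toTypedValue`, `formatVisibleRows`) are hypothetical, and the conversion rules are a simplification, not Papa Parse's own dynamic typing logic.

```typescript
// With dynamic typing off, every parsed cell is a plain string.
type RawRow = Record<string, string>;
type TypedValue = string | number | boolean | null;

// Hypothetical converter: a simplified stand-in for dynamic typing.
function toTypedValue(text: string): TypedValue {
  const trimmed = text.trim();
  if (trimmed === "") return null;
  if (trimmed === "true") return true;
  if (trimmed === "false") return false;
  const asNumber = Number(trimmed);
  // Anything that is not a number (dates, names, ...) stays a string.
  return isNaN(asNumber) ? text : asNumber;
}

// Converts only the rows in the visible window, leaving the rest untouched.
function formatVisibleRows(
  rows: RawRow[],
  start: number,
  count: number
): Record<string, TypedValue>[] {
  return rows.slice(start, start + count).map((row) => {
    const typed: Record<string, TypedValue> = {};
    for (const key of Object.keys(row)) {
      typed[key] = toTypedValue(row[key]);
    }
    return typed;
  });
}
```

The conversion cost then scales with the number of rows displayed rather than with the full file size.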


@RandomFractals
Owner Author

This is done. Will handle loading all parsed data into table view incrementally with pagination in #60
