Pandas read_csv throws an exception when it encounters a line that appears to have too many fields, but it can be made to skip these bad lines and report them on stderr by passing error_bad_lines=False. While Pandas does not make it easy to deal with these lines (pandas-dev/pandas#5686), it would be nice if csvs-to-sqlite could offer something. Maybe parse the read_csv output, then traverse the file and save the bad lines separately so the user can fix and reprocess them?
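One possible shape for this, as a rough sketch rather than anything csvs-to-sqlite actually implements: instead of parsing read_csv's warnings after the fact, pre-validate each row's field count against the header using the stdlib csv module, route non-conforming rows to a side file, and hand only the clean rows to read_csv. The function name and the bad_lines_path parameter are hypothetical, invented here for illustration.

```python
import csv
import io

import pandas as pd


def read_csv_saving_bad_lines(path, bad_lines_path, delimiter=","):
    # Split rows by field count: rows matching the header's width are fed
    # to pandas, the rest are written verbatim to bad_lines_path so the
    # user can fix and reprocess them later.
    good = io.StringIO()
    good_writer = csv.writer(good, delimiter=delimiter)
    with open(path, newline="") as src, \
            open(bad_lines_path, "w", newline="") as bad:
        reader = csv.reader(src, delimiter=delimiter)
        bad_writer = csv.writer(bad, delimiter=delimiter)
        header = next(reader)
        good_writer.writerow(header)
        for row in reader:
            target = good_writer if len(row) == len(header) else bad_writer
            target.writerow(row)
    good.seek(0)
    return pd.read_csv(good, delimiter=delimiter)
```

The cost is a second pass over the file, but it avoids having to scrape pandas' warning output, and the rejected rows land in a file ready for a fix-and-rerun cycle.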
With --skip-errors one can take the stderr output and use sed/grep to pull those lines out of the CSV and fix them up separately. It would still be helpful if this tool dumped the lines somewhere.
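That post-processing step could also be done in a few lines of Python, sketched below under two assumptions: the stderr output was saved to a file (e.g. with `2> errors.txt` on the command line), and the warnings follow pandas' usual "Skipping line N: expected X fields, saw Y" format, where N appears to be a 1-based physical line number. The extract_bad_lines function is hypothetical, not part of csvs-to-sqlite.

```python
import re


def extract_bad_lines(csv_path, errors_path, out_path):
    # errors_path holds stderr captured from a run such as:
    #   csvs-to-sqlite data.csv data.db --skip-errors 2> errors.txt
    # Collect the line numbers pandas flagged (assumed format:
    # "Skipping line 42: expected 3 fields, saw 5").
    with open(errors_path) as f:
        bad = {int(m.group(1))
               for m in re.finditer(r"Skipping line (\d+)", f.read())}
    # Copy just those raw lines into out_path for fixing and reprocessing.
    with open(csv_path) as src, open(out_path, "w") as dst:
        for lineno, line in enumerate(src, start=1):
            if lineno in bad:
                dst.write(line)
```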
Do you see csvs-to-sqlite as a tool that should eventually handle most scenarios by itself (error handling, remote files, compressed formats), or as one to be used alongside other established command-line tools?