DEPR: float_precision in read_csv #64395

@jbrockmendel

The float_precision parameter of read_csv accepts "high" (the default), "legacy", and "round_trip". When the parameter was introduced in #8044, "high" apparently came with a performance penalty, but that was no longer the case by 2017 (#17154 (comment)), so the default was changed from "legacy" to "high" in #36228. I can't think of any reason why anyone would use "legacy".
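
For anyone who wants to compare the three options locally, here is a minimal sketch. It assumes a pandas version that still accepts float_precision with the C engine (a future version may warn or reject it if this deprecation goes through). The sample literal and the helper name `parse_with` are made up for illustration, and the printed values may differ between pandas versions.

```python
import io

import pandas as pd

# Hypothetical sample data: one column holding a single long decimal literal.
CSV_TEXT = "x\n1.700000000000000177635684\n"


def parse_with(float_precision):
    """Parse CSV_TEXT with the C engine and return the single parsed float."""
    df = pd.read_csv(
        io.StringIO(CSV_TEXT),
        engine="c",
        float_precision=float_precision,
    )
    return float(df["x"].iloc[0])


# Run the same input through each of the three converters.
results = {mode: parse_with(mode) for mode in ("high", "legacy", "round_trip")}

for mode, value in results.items():
    print(f"{mode:>10}: {value!r}")
```

If all three modes print the same value for the inputs you care about, the parameter is not buying you anything for that data.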

I'm not aware of anyone who uses this parameter at all. Let's deprecate it and simplify the code+API.

Update: I patched the code to always use float_precision="high" to see whether any tests broke. Aside from a test specifically asserting that "legacy" is inaccurate, the only failure was test_precise_conversion (4 cases out of 42), where we parse 1.700000000000000177635684 to 1.7. I'm fine with this level of rounding, though I think using fast_float might improve it to 1.7000000000000002, which is what pyarrow gives. I'd also be OK with saying that "round_trip"-level precision is only available with the python engine (though that engine also gives 1.7).
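
To put the size of that rounding in perspective, here is a small standard-library sketch (no pandas involved, Python 3.9+ for `math.nextafter`). It shows that 1.7 and 1.7000000000000002 are adjacent doubles, so the discrepancy above is exactly one unit in the last place. Python's built-in `float()` is used here as the correctly rounded reference; that is my choice for illustration, not something the pandas parser does in "high" mode.

```python
import math

LITERAL = "1.700000000000000177635684"

lower = 1.7                              # value the patched parser returned, per the issue
upper = math.nextafter(lower, math.inf)  # next representable double above 1.7
exact = float(LITERAL)                   # Python's correctly rounded conversion

print(repr(upper))             # 1.7000000000000002
print(exact == upper)          # True: the literal is nearest to the upper neighbor
print(upper - lower == 2**-52) # True: the two results differ by exactly one ULP
```

So the "high" result is off by a single ULP from the correctly rounded value, a relative error of roughly 1.3e-16.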

Labels: Deprecate (Functionality to remove in pandas), IO CSV (read_csv, to_csv), Needs Discussion (Requires discussion from core team before further action)