Checks
Reproducible example
import polars as pl
import pandas as pd
df = pl.DataFrame({'dt': "2024-06-03 20:02:48.6800000"})
dt_format = "%Y-%m-%d %H:%M:%S%.6f0"  # NOTE: this format has a trailing 0
df['dt'].str.to_datetime(dt_format, time_unit='ns')
Log output
Traceback (most recent call last):
File "/Users/anto/src/poste/sda-poste-logistics/script/polars_bug_datetime.py", line 7, in <module>
df["dt"].str.to_datetime(dt_format, time_unit="ns")
File "/Users/anto/src/poste/sda-poste-logistics/venv/lib/python3.11/site-packages/polars/series/utils.py", line 107, in wrapper
return s.to_frame().select_seq(f(*args, **kwargs)).to_series()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/anto/src/poste/sda-poste-logistics/venv/lib/python3.11/site-packages/polars/dataframe/frame.py", line 8524, in select_seq
return self.lazy().select_seq(*exprs, **named_exprs).collect(_eager=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/anto/src/poste/sda-poste-logistics/venv/lib/python3.11/site-packages/polars/lazyframe/frame.py", line 1909, in collect
return wrap_df(ldf.collect(callback))
^^^^^^^^^^^^^^^^^^^^^
polars.exceptions.InvalidOperationError: conversion from `str` to `datetime[ns]` failed in column 'dt' for 1 out of 1 values: ["2024-06-03 20:02:48.6800000"]
You might want to try:
- setting `strict=False` to set values that cannot be converted to `null`
- using `str.strptime`, `str.to_date`, or `str.to_datetime` and providing a format string
Issue description
Converting a string to a datetime with an explicit format string should allow decoding custom formats.
In this example, the input has 7 fractional-second digits after the decimal point. The last digit is always zero, so it should be consumed by the literal trailing 0 in the format string.
Instead, polars raises the error shown above during the conversion.
Stripping the extra zero from the string before attempting the conversion works correctly in polars:
dt_format1 = "%Y-%m-%d %H:%M:%S%.6f" # NOTE: no trailing 0 in the format
df['dt'].str.strip_suffix('0').str.to_datetime(dt_format1, time_unit='ns')
Pandas accepts the original format and converts the string correctly, as does Python's datetime:
x = pd.to_datetime(
df["dt"].to_pandas(use_pyarrow_extension_array=True),
format="%Y-%m-%d %H:%M:%S%.6f0",
)
df = df.with_columns(pl.from_pandas(x))
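The standard-library part of that claim can be checked with a minimal sketch like the one below. One assumption to flag: Python's `datetime.strptime` does not use the `%.6f` spelling from the polars format above, so this sketch substitutes the closest stdlib equivalent, a literal `.` followed by `%f` (which matches one to six digits) and then the literal `0`.

```python
from datetime import datetime

# Same input string as in the reproducible example: 7 fractional digits.
raw = "2024-06-03 20:02:48.6800000"

# Assumption: stdlib spelling of the format. "%f" consumes up to 6 digits,
# and the trailing literal "0" consumes the 7th digit.
stdlib_format = "%Y-%m-%d %H:%M:%S.%f0"

parsed = datetime.strptime(raw, stdlib_format)
print(parsed)
```

This only shows that a literal character after the fractional-seconds directive is matched against the input in the standard library; it says nothing about how polars parses the format internally.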
Expected behavior
The column should be converted to datetime without error, as pandas and the datetime module from the Python standard library do.
Installed versions
--------Version info---------
Polars: 1.0.0-rc.2
Index type: UInt32
Platform: macOS-14.5-arm64-arm-64bit
Python: 3.11.3 (main, Sep 1 2023, 14:56:45) [Clang 14.0.3 (clang-1403.0.22.14.1)]
----Optional dependencies----
adbc_driver_manager: 1.0.0
cloudpickle: 3.0.0
connectorx: 0.3.3
deltalake: <not installed>
fastexcel: 0.10.4
fsspec: 2023.12.2
gevent: 24.2.1
great_tables: <not installed>
hvplot: 0.10.0
matplotlib: 3.8.4
nest_asyncio: 1.6.0
numpy: 1.26.4
openpyxl: <not installed>
pandas: 2.2.2
pyarrow: 16.1.0
pydantic: 2.7.3
pyiceberg: 0.6.1
sqlalchemy: 2.0.30
torch: <not installed>
xlsx2csv: 0.8.2
xlsxwriter: 3.2.0