Description
Code Sample, a copy-pastable example if possible
import pandas as pd
import datetime as dt
df = pd.DataFrame({'Date': ['2017-01-02', '2017-01-03', '2017-01-04'],
                   'T': [10, 11, 12],
                   'RM': [28, 29, 30]})
df['Date'] = pd.to_datetime(df.Date, infer_datetime_format=True)
df.set_index('Date', inplace=True)
df = df.asfreq('D')
print(df)
print('Dataframe index of dtype: {} and freq: {}'.format(df.index.dtype_str, df.index.freq))
print("Dropping one row")
df = df.drop(df.index[1])
print(df)
print('The new index is of dtype: {} and freq: {}'.format(df.index.dtype_str, df.index.freq))
print('''Let's assign in place to a nonexistent index label: 2017-01-03''')
df.loc['2017-01-03', 'RM'] = 290
print(df)
print('''The dataframe has shape: {} and its new index is of dtype: {}'''.format(df.shape, df.index.dtype_str))
Problem description
According to the pandas documentation, updating a cell value through an index lookup (df.loc and df.at) should work only when the index label already exists in the DataFrame.
The problem occurs when I try to update cells addressed by a DatetimeIndex label (a date) that does not exist in the DataFrame. According to the documentation, an exception should be raised in this case.
What actually happens is that, without raising any exception, pandas 1) converts the DatetimeIndex into an object index (making it unusable for time-series processing) and 2) inserts a new row with the specified label into the DataFrame, setting every column to NaN except the updated one.
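For a frame that has already ended up in the state described above, one possible way to get a DatetimeIndex back is to re-parse the labels with pd.to_datetime and sort. This is only a sketch: the frame below builds the mixed object index by hand (an assumption made so the example does not depend on version-specific assignment behavior), and the values are illustrative.

```python
import pandas as pd

# Simulate the reported end state by hand: two Timestamp labels plus one
# plain string label, stored in an object-dtype index (illustrative data).
mixed_index = pd.Index(
    [pd.Timestamp("2017-01-02"), pd.Timestamp("2017-01-04"), "2017-01-03"],
    dtype=object,
)
df = pd.DataFrame(
    {"T": [10.0, 12.0, float("nan")], "RM": [28, 30, 290]},
    index=mixed_index,
)

# Re-parse every label as a datetime, then restore chronological order.
df.index = pd.to_datetime(df.index)
df = df.sort_index()

print(type(df.index).__name__)
print(df)
```

This repairs the index after the fact; it does not stop the unwanted row from being added in the first place.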
I worked around the problem by wrapping the update commands in conditional 'if' checks, but according to the documentation this looks like a bug in pandas.
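A minimal sketch of the kind of guard described above, assuming the goal is to skip (rather than raise on) labels that are missing from the index. The helper name set_if_present is hypothetical and not part of pandas; the data mirrors the example after the row for 2017-01-03 has been dropped.

```python
import pandas as pd

df = pd.DataFrame(
    {"T": [10, 12], "RM": [28, 30]},
    index=pd.to_datetime(["2017-01-02", "2017-01-04"]),
)


def set_if_present(frame, when, column, value):
    """Assign only when the timestamp already exists in the index.

    Hypothetical helper: returns True if the cell was updated and False
    if the label was missing, so the frame is never enlarged.
    """
    ts = pd.Timestamp(when)  # look up with a Timestamp, not a raw string
    if ts in frame.index:
        frame.loc[ts, column] = value
        return True
    return False


updated_missing = set_if_present(df, "2017-01-03", "RM", 290)
updated_existing = set_if_present(df, "2017-01-04", "RM", 300)

print(updated_missing, updated_existing)
print(df)
```

Converting the key to a pd.Timestamp before the membership test keeps the lookup independent of how string keys are interpreted, and returning a flag lets the caller decide whether a missing date should be logged or treated as an error.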
Expected Output
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.8.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-60-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.23.4
pytest: 4.5.0
pip: 19.1.1
setuptools: 41.0.1
Cython: 0.29.7
numpy: 1.17.0
scipy: 1.2.1
pyarrow: None
xarray: None
IPython: 7.5.0
sphinx: 2.0.1
patsy: 0.5.1
dateutil: 2.8.0
pytz: 2019.1
blosc: None
bottleneck: 1.2.1
tables: 3.5.1
numexpr: 2.6.8
feather: None
matplotlib: 3.1.0
openpyxl: 2.6.1
xlrd: 1.2.0
xlwt: 1.3.0
xlsxwriter: 1.1.8
lxml: 4.3.0
bs4: 4.7.1
html5lib: 0.9999999
sqlalchemy: 1.3.3
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.7.0