Open
Labels
Master Tracker: High level tracker for similar issues
Description
This is the list of things that are in pandas 2.2 release notes that need to be addressed in pandas-stubs. I have removed the sections Performance improvements and Bug fixes.
PRs welcome. If you open a PR, check off the item and add a link to the PR that closed it. One PR can address multiple issues.
Some of these may already have been taken care of; if so, check them off and add a comment such as "previously complete".
Upcoming changes in pandas 3.0
- Dedicated string data type (backed by Arrow) by default
Enhancements
- ADBC Driver support in to_sql and read_sql
- Create a pandas Series based on one or more conditions
- Series.struct accessor for PyArrow structured data
- Series.list accessor for PyArrow list data
- Calamine engine for `read_excel`
- `DataFrame.to_sql` with `method` parameter set to `multi` works with Oracle on the backend
- `Series.attrs` / `DataFrame.attrs` now uses a deepcopy for propagating `attrs` (BUG: `copy.deepcopy()` doesn't deepcopy the metadata in `.attrs` pandas#54134).
- `get_dummies` now returning extension dtypes `boolean` or `bool[pyarrow]` that are compatible with the input dtype (BUG: `pd.get_dummies` should return `bool[pyarrow]` types pandas#56273)
- `read_csv` now supports `on_bad_lines` parameter with `engine="pyarrow"` (ENH: Pandas 2.0 with pyarrow engine add the argument like 'skip_bad_lines=True' pandas#54480)
- `read_sas` returns `datetime64` dtypes with resolutions better matching those stored natively in SAS, and avoids returning object-dtype in cases that cannot be stored with `datetime64[ns]` dtype (ENH: non-nano datetime64s for read_sas pandas#56127)
- `read_spss` now returns a `DataFrame` that stores the metadata in `DataFrame.attrs` (read_spss doens't return the metadata pandas#54264)
- `tseries.api.guess_datetime_format` is now part of the public API (ENH: make guess_datetime_format public pandas#54727)
- `DataFrame.apply` now allows the usage of numba (via `engine="numba"`) to JIT compile the passed function, allowing for potential speedups (ENH: numba engine in df.apply pandas#54666)
- `ExtensionArray._explode` interface method added to allow extension type implementations of the `explode` method (ENH: make `_explode` a method of the `ExtensionArray` interface pandas#54833)
- `ExtensionArray.duplicated` added to allow extension type implementations of the `duplicated` method (ENH/PERF: add ExtensionArray.duplicated pandas#55255)
- `Series.ffill`, `Series.bfill`, `DataFrame.ffill`, and `DataFrame.bfill` have gained the argument `limit_area`; 3rd party `ExtensionArray` authors need to add this argument to the method `_pad_or_backfill` (ENH: add limit_area argument to ffill() method as interpolate( method='ffill', limit_area='inside') as been deprecated pandas#56492)
- Allow passing `read_only`, `data_only` and `keep_links` arguments to openpyxl using `engine_kwargs` of `read_excel` (ENH: _openpyxl.py load_workbook allow to modify the read_only, data_only and keep_links parameters using engine_kwargs pandas#55027)
- Implement `Series.interpolate` and `DataFrame.interpolate` for `ArrowDtype` and masked dtypes (`.interpolate()` Method Incompatible with `float[pyarrow]` Dtype pandas#56267)
- Implement masked algorithms for `Series.value_counts` (ENH: Implement masked algorithm for value_counts pandas#54984)
- Implemented `Series.dt` methods and attributes for `ArrowDtype` with `pyarrow.duration` type (BUG: Cannot access Timedelta properties with Arrow Backend pandas#52284)
- Implemented `Series.str.extract` for `ArrowDtype` (BUG: `str.extract` Method Not Implemented for `pd.ArrowDtype(pa.string())` pandas#56268)
- Improved error message that appears in `DatetimeIndex.to_period` with frequencies which are not supported as period frequencies, such as `"BMS"` (ENH: Raise TypeError when converting DatetimeIndex to PeriodIndex with invalid period frequency pandas#56243)
- Improved error message when constructing `Period` with invalid offsets such as `"QS"` (BUG: QuarterBegin Does not work with Period pandas#55785)
- The dtypes `string[pyarrow]` and `string[pyarrow_numpy]` now both utilize the `large_string` type from PyArrow to avoid overflow for long columns (BUG: new string dtype fails with >2 GB of data in a single column pandas#56259)
Notable bug fixes
- `check_exact` now only takes effect for floating-point dtypes in `testing.assert_frame_equal` and `testing.assert_series_equal`. In particular, integer dtypes are always checked exactly (BUG: assert_series_equal not raising on unequal series? pandas#55882)
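The `check_exact` behavior can be illustrated with a small sketch. The helper `series_match` is hypothetical (it is not part of pandas) and only wraps `assert_series_equal` so the outcome is a boolean; the values are chosen so the result does not depend on the exact pandas version.

```python
import pandas as pd
from pandas.testing import assert_series_equal


def series_match(left: pd.Series, right: pd.Series) -> bool:
    """Hypothetical helper: True if assert_series_equal accepts the pair."""
    try:
        assert_series_equal(left, right, check_exact=False)
    except AssertionError:
        return False
    return True


# Floating-point data: check_exact=False compares within a tolerance,
# so a difference of 1e-9 is accepted.
floats_match = series_match(pd.Series([1.0, 2.0]), pd.Series([1.0, 2.0 + 1e-9]))

# Integer data: unequal values are rejected.
ints_match = series_match(pd.Series([1, 2, 3]), pd.Series([1, 2, 4]))

print(floats_match, ints_match)
```

From a typing point of view the signature is unchanged; only the runtime meaning of `check_exact=False` for integer dtypes differs.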
Deprecations
- Chained assignment
- Deprecate aliases `M`, `Q`, `Y`, etc. in favour of `ME`, `QE`, `YE`, etc. for offsets
- Deprecated automatic downcasting
- Changed `Timedelta.resolution_string` to return `h`, `min`, `s`, `ms`, `us`, and `ns` instead of `H`, `T`, `S`, `L`, `U`, and `N`, for compatibility with respective deprecations in frequency aliases (BUG: Either incorrect unit validation for 'T' in to_timedelta() or incorrect documentation pandas#52536)
- Deprecated `offsets.Day.delta`, `offsets.Hour.delta`, `offsets.Minute.delta`, `offsets.Second.delta`, `offsets.Milli.delta`, `offsets.Micro.delta`, `offsets.Nano.delta`, use `pd.Timedelta(obj)` instead (DEPR: Tick.delta pandas#55498)
- Deprecated `pandas.api.types.is_interval` and `pandas.api.types.is_period`, use `isinstance(obj, pd.Interval)` and `isinstance(obj, pd.Period)` instead (DEPR: is_decimal, is_interval pandas#55264)
- Deprecated `read_gbq` and `DataFrame.to_gbq`. Use `pandas_gbq.read_gbq` and `pandas_gbq.to_gbq` instead https://pandas-gbq.readthedocs.io/en/latest/api.html (DEPR: read_gbq, DataFrame.to_gbq pandas#55525)
- Deprecated `DataFrameGroupBy.fillna` and `SeriesGroupBy.fillna`; use `DataFrameGroupBy.ffill`, `DataFrameGroupBy.bfill` for forward and backward filling or `DataFrame.fillna` to fill with a single value (or the Series equivalents) (DEPR: groupby.fillna pandas#55718)
- Deprecated `DateOffset.is_anchored`, use `obj.n == 1` for non-Tick subclasses (for Tick this was always False) (DOC: DateOffset.is_anchored track down intention, fix docstring pandas#55388)
- Deprecated `DatetimeArray.__init__` and `TimedeltaArray.__init__`, use `array` instead (DEPR: `DTA/TDA.__init__` pandas#55623)
- Deprecated `Index.format`, use `index.astype(str)` or `index.map(formatter)` instead (DEPR: Index.format? pandas#55413)
- Deprecated `Series.ravel`, the underlying array is already 1D, so ravel is not necessary (DEPR: Deprecate Series.ravel pandas#52511)
- Deprecated `Series.resample` and `DataFrame.resample` with a `PeriodIndex` (and the 'convention' keyword), convert to `DatetimeIndex` (with `.to_timestamp()`) before resampling instead (DEPR: Resample with PeriodIndex? pandas#53481). Note: this deprecation was later undone in pandas 2.3.3 (QST: FutureWarning: Resampling with a PeriodIndex is deprecated, how to resample now? pandas#57033)
- Deprecated `Series.view`, use `Series.astype` instead to change the dtype (DEPR: Series.view pandas#20251)
- Deprecated `offsets.Tick.is_anchored`, use `False` instead (DOC: DateOffset.is_anchored track down intention, fix docstring pandas#55388)
- Deprecated `core.internals` members `Block`, `ExtensionBlock`, and `DatetimeTZBlock`, use public APIs instead (DEPR: deprecate exposing blocks in core.internals pandas#55139)
- Deprecated `year`, `month`, `quarter`, `day`, `hour`, `minute`, and `second` keywords in the `PeriodIndex` constructor, use `PeriodIndex.from_fields` instead (DEPR: `PeriodIndex.__new__` accepting ordinals, fields pandas#55960)
- Deprecated accepting a type as an argument in `Index.view`, call without any arguments instead (API/DEPR: Index.view _typ check, return type pandas#55709)
- Deprecated allowing non-integer `periods` argument in `date_range`, `timedelta_range`, `period_range`, and `interval_range` (DEPR: fractional periods in date_range, timedelta_range, period_range… pandas#56036)
- Deprecated allowing non-keyword arguments in `DataFrame.to_clipboard` (DEPR: Positional arguments in to_* I/O methods pandas#54229)
- Deprecated allowing non-keyword arguments in `DataFrame.to_csv` except `path_or_buf` (DEPR: Positional arguments in to_* I/O methods pandas#54229)
- Deprecated allowing non-keyword arguments in `DataFrame.to_dict` (DEPR: Positional arguments in to_* I/O methods pandas#54229)
- Deprecated allowing non-keyword arguments in `DataFrame.to_excel` except `excel_writer` (DEPR: Positional arguments in to_* I/O methods pandas#54229)
- Deprecated allowing non-keyword arguments in `DataFrame.to_gbq` except `destination_table` (DEPR: Positional arguments in to_* I/O methods pandas#54229)
- Deprecated allowing non-keyword arguments in `DataFrame.to_hdf` except `path_or_buf` (DEPR: Positional arguments in to_* I/O methods pandas#54229)
- Deprecated allowing non-keyword arguments in `DataFrame.to_html` except `buf` (DEPR: Positional arguments in to_* I/O methods pandas#54229)
- Deprecated allowing non-keyword arguments in `DataFrame.to_json` except `path_or_buf` (DEPR: Positional arguments in to_* I/O methods pandas#54229)
- Deprecated allowing non-keyword arguments in `DataFrame.to_latex` except `buf` (DEPR: Positional arguments in to_* I/O methods pandas#54229)
- Deprecated allowing non-keyword arguments in `DataFrame.to_markdown` except `buf` (DEPR: Positional arguments in to_* I/O methods pandas#54229)
- Deprecated allowing non-keyword arguments in `DataFrame.to_parquet` except `path` (DEPR: Positional arguments in to_* I/O methods pandas#54229)
- Deprecated allowing non-keyword arguments in `DataFrame.to_pickle` except `path` (DEPR: Positional arguments in to_* I/O methods pandas#54229)
- Deprecated allowing non-keyword arguments in `DataFrame.to_string` except `buf` (DEPR: Positional arguments in to_* I/O methods pandas#54229)
- Deprecated allowing non-keyword arguments in `DataFrame.to_xml` except `path_or_buffer` (DEPR: Positional arguments in to_* I/O methods pandas#54229)
- Deprecated allowing passing `BlockManager` objects to `DataFrame` or `SingleBlockManager` objects to `Series` (DEPR: accepting Manager objects in DataFrame/Series pandas#52419)
- Deprecated behavior of `Index.insert` with an object-dtype index silently performing type inference on the result, explicitly call `result.infer_objects(copy=False)` for the old behavior instead (API: Index.insert too much dtype inference pandas#51363)
- Deprecated casting non-datetimelike values (mainly strings) in `Series.isin` and `Index.isin` with `datetime64`, `timedelta64`, and `PeriodDtype` dtypes (DEPR: casting in DatetimeLikeArrayMixin.isin pandas#53111)
- Deprecated dtype inference in `Index`, `Series` and `DataFrame` constructors when giving a pandas input, call `.infer_objects` on the input to keep the current behavior (DEPR: Series and Index shouldn't do inference on pandas objects pandas#56012)
- Deprecated dtype inference when setting a `Index` into a `DataFrame`, cast explicitly instead (DEPR: Disallow dtype inference when setting Index into DataFrame pandas#56102)
- Deprecated including the groups in computations when using `DataFrameGroupBy.apply` and `DataFrameGroupBy.resample`; pass `include_groups=False` to exclude the groups (API: way to exclude the grouped column with apply pandas#7155)
- Deprecated indexing an `Index` with a boolean indexer of length zero (BUG: inconsistency in `Index.__getitem__` for Numpy and non-numpy dtypes pandas#55820)
- Deprecated not passing a tuple to `DataFrameGroupBy.get_group` or `SeriesGroupBy.get_group` when grouping by a length-1 list-like (Groupby on single key should be accessible by tuple of length 1 pandas#25971)
- Deprecated string `AS` denoting frequency in `YearBegin` and strings `AS-DEC`, `AS-JAN`, etc. denoting annual frequencies with various fiscal year starts (DEPR: deprecate the alias 'A' in favour of 'Y' for year end frequency pandas#54275)
- Deprecated string `A` denoting frequency in `YearEnd` and strings `A-DEC`, `A-JAN`, etc. denoting annual frequencies with various fiscal year ends (DEPR: deprecate the alias 'A' in favour of 'Y' for year end frequency pandas#54275)
- Deprecated string `BAS` denoting frequency in `BYearBegin` and strings `BAS-DEC`, `BAS-JAN`, etc. denoting annual frequencies with various fiscal year starts (DEPR: deprecate the alias 'A' in favour of 'Y' for year end frequency pandas#54275)
- Deprecated string `BA` denoting frequency in `BYearEnd` and strings `BA-DEC`, `BA-JAN`, etc. denoting annual frequencies with various fiscal year ends (DEPR: deprecate the alias 'A' in favour of 'Y' for year end frequency pandas#54275)
- Deprecated strings `H`, `BH`, and `CBH` denoting frequencies in `Hour`, `BusinessHour`, `CustomBusinessHour` (BUG: Either incorrect unit validation for 'T' in to_timedelta() or incorrect documentation pandas#52536)
- Deprecated strings `H`, `S`, `U`, and `N` denoting units in `to_timedelta` (BUG: Either incorrect unit validation for 'T' in to_timedelta() or incorrect documentation pandas#52536)
- Deprecated strings `H`, `T`, `S`, `L`, `U`, and `N` denoting units in `Timedelta` (BUG: Either incorrect unit validation for 'T' in to_timedelta() or incorrect documentation pandas#52536)
- Deprecated strings `T`, `S`, `L`, `U`, and `N` denoting frequencies in `Minute`, `Second`, `Milli`, `Micro`, `Nano` (BUG: Either incorrect unit validation for 'T' in to_timedelta() or incorrect documentation pandas#52536)
- Deprecated support for combining parsed datetime columns in `read_csv` along with the `keep_date_col` keyword (DEPR: read_csv keywords: keep_date_col, delim_whitespace pandas#55569)
- Deprecated the `DataFrameGroupBy.grouper` and `SeriesGroupBy.grouper`; these attributes will be removed in a future version of pandas (DEPR: groupby.grouper pandas#56521)
- Deprecated the `Grouping` attributes `group_index`, `result_index`, and `group_arraylike`; these will be removed in a future version of pandas (DEPR: Certain Grouper and Grouping attributes pandas#56148)
- Deprecated the `delim_whitespace` keyword in `read_csv` and `read_table`, use `sep="\\s+"` instead (DEPR: read_csv keywords: keep_date_col, delim_whitespace pandas#55569)
- Deprecated the `errors="ignore"` option in `to_datetime`, `to_timedelta`, and `to_numeric`; explicitly catch exceptions instead (DEPR: deprecate errors='ignore' in to_datetime and make output dtype predictable pandas#54467)
- Deprecated the `fastpath` keyword in the `Series` constructor (CLN: remove fastpath & verify_integrity from constructors pandas#20110)
- Deprecated the `kind` keyword in `Series.resample` and `DataFrame.resample`, explicitly cast the object's `index` instead (DEPR: kind keyword in resample pandas#55895)
- Deprecated the `ordinal` keyword in `PeriodIndex`, use `PeriodIndex.from_ordinals` instead (DEPR: `PeriodIndex.__new__` accepting ordinals, fields pandas#55960)
- Deprecated the `unit` keyword in `TimedeltaIndex` construction, use `to_timedelta` instead (DEPR: DatetimeIndex/TimedeltaIndex constructor keywords pandas#55499)
- Deprecated the `verbose` keyword in `read_csv` and `read_table` (DEPR: read_csv keywords: keep_date_col, delim_whitespace pandas#55569)
- Deprecated the behavior of `DataFrame.replace` and `Series.replace` with `CategoricalDtype`; in a future version replace will change the values while preserving the categories. To change the categories, use `ser.cat.rename_categories` instead (DEPR/API: Series[categorical].replace behavior pandas#55147)
- Deprecated the behavior of `Series.value_counts` and `Index.value_counts` with object dtype; in a future version these will not perform dtype inference on the resulting `Index`, do `result.index = result.index.infer_objects()` to retain the old behavior (DEPR: dtype inference in value_counts pandas#56161)
- Deprecated the default of `observed=False` in `DataFrame.pivot_table`; will be `True` in a future version (DEPR: observed=False default for DataFrame.pivot_table pandas#56236)
- Deprecated the extension test classes `BaseNoReduceTests`, `BaseBooleanReduceTests`, and `BaseNumericReduceTests`, use `BaseReduceTests` instead (DEPR: BaseNoReduceTests pandas#54663)
- Deprecated the option `mode.data_manager` and the `ArrayManager`; only the `BlockManager` will be available in future versions (DEPR: ArrayManager pandas#55043)
- Deprecated the previous implementation of `DataFrame.stack`; specify `future_stack=True` to adopt the future version (DEPR/BUG: DataFrame.stack including all null rows when stacking multiple levels pandas#53515)
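Several of the deprecations above name a direct replacement. The sketch below shows four of them side by side, using only the replacement calls so that it should run without deprecation warnings. The helper `to_numeric_or_original` and the sample data are hypothetical and exist only for illustration.

```python
import io

import pandas as pd

# Tick.delta is deprecated: pd.Timedelta(obj) gives the same duration.
two_hours = pd.Timedelta(pd.offsets.Hour(2))

# is_interval / is_period are deprecated: use isinstance checks.
is_interval = isinstance(pd.Interval(0, 1), pd.Interval)
is_period = isinstance(pd.Period("2024-01"), pd.Period)

# delim_whitespace=True is deprecated: pass a whitespace regex as sep.
text = "a   b\n1   2\n3   4\n"
frame = pd.read_csv(io.StringIO(text), sep=r"\s+")


def to_numeric_or_original(values):
    """Hypothetical stand-in for errors="ignore": catch the error explicitly."""
    try:
        return pd.to_numeric(values)
    except (ValueError, TypeError):
        return values


parsed = to_numeric_or_original(["1", "2"])      # converted to numbers
unparsed = to_numeric_or_original(["1", "x"])    # returned unchanged

print(two_hours, is_interval, is_period)
print(frame)
print(list(parsed), unparsed)
```

For the stubs, the practical question for each item is usually whether the deprecated name or keyword should be dropped or marked as deprecated in the corresponding signature.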