Assigning values with ``DataFrame.__setitem__`` now consistently assigns a new array, rather than mutating inplace (:issue:`33457`, :issue:`35271`, :issue:`35266`)

Previously, ``DataFrame.__setitem__`` would sometimes operate inplace on the
underlying array, and sometimes assign a new array. Fixing this inconsistency
can have behavior-changing implications for workloads that relied on inplace
mutation. The two most common cases are creating a ``DataFrame`` from an array
and slicing a ``DataFrame``.

*Previous Behavior*

The array would be mutated inplace for some dtypes, like NumPy's ``int64`` dtype.

.. code-block:: python

    >>> import pandas as pd
    >>> import numpy as np
    >>> a = np.array([1, 2, 3])
    >>> df = pd.DataFrame(a, columns=['a'])
    >>> df['a'] = 0
    >>> a  # mutated inplace
    array([0, 0, 0])

But not others, like :class:`Int64Dtype`.

.. code-block:: python

    >>> import pandas as pd
    >>> import numpy as np
    >>> a = pd.array([1, 2, 3], dtype="Int64")
    >>> df = pd.DataFrame(a, columns=['a'])
    >>> df['a'] = 0
    >>> a  # not mutated
    <IntegerArray>
    [1, 2, 3]
    Length: 3, dtype: Int64


*New Behavior*

In pandas 1.1.0, ``DataFrame.__setitem__`` consistently sets on a new array rather than
mutating the existing array inplace.

For NumPy's ``int64`` dtype:

.. ipython:: python

    import pandas as pd
    import numpy as np
    a = np.array([1, 2, 3])
    df = pd.DataFrame(a, columns=['a'])
    df['a'] = 0
    a  # not mutated

For :class:`Int64Dtype`:

.. ipython:: python

    import pandas as pd
    import numpy as np
    a = pd.array([1, 2, 3], dtype="Int64")
    df = pd.DataFrame(a, columns=['a'])
    df['a'] = 0
    a  # not mutated
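
A workload that wants to confirm which behavior its installed pandas version exhibits could compare the source array before and after the assignment. The following is only a minimal sketch and not part of the change itself; the names ``original`` and ``source_unchanged`` are illustrative.

```python
import numpy as np
import pandas as pd

a = np.array([1, 2, 3])
original = a.copy()  # snapshot to compare against after the assignment

df = pd.DataFrame(a, columns=["a"])
df["a"] = 0  # column assignment goes through DataFrame.__setitem__

# With the behavior described above, the source array keeps its values
# because the column was replaced by a new array.
source_unchanged = np.array_equal(a, original)
print("source array unchanged:", source_unchanged)
```

If the check reports ``False``, the assignment wrote into the source array, which is the old dtype-dependent behavior.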

This also affects cases where a second ``Series`` or ``DataFrame`` is a view on a first ``DataFrame``.

.. code-block:: python

    df = pd.DataFrame({"A": [1, 2, 3]})
    df2 = df[['A']]
    df['A'] = np.array([0, 0, 0])

Previously, whether ``df2`` was mutated depended on the dtype of the array being assigned to. Now, a
new array is consistently assigned, so ``df2`` is not mutated.
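
Code that should not depend on whether a selection is a view can take an explicit copy before assigning. This is a hedged sketch of that defensive pattern rather than something the change requires; ``df2_values`` is an illustrative name.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]})

# An explicit copy gives df2 its own data, independent of later
# assignments to df.
df2 = df[["A"]].copy()

df["A"] = np.array([0, 0, 0])

df2_values = df2["A"].tolist()
print(df2_values)
```

Because ``df2`` holds its own copy, its contents stay the same regardless of how the later assignment to ``df`` is carried out.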