PERF: avoid materializing values[indexer] in Block.setitem by hyoj0942 · Pull Request #64251 · pandas-dev/pandas

hyoj0942 · 2026-02-20T07:55:13Z

closes PERF: object-dtype iloc setitem with datetimelike list-like indexers is slow on large arrays #64250
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.
I have reviewed and followed all the contribution guidelines
If I used AI to develop this pull request, I prompted it to follow AGENTS.md.

This avoids materializing values[indexer] in the object-dtype datetimelike compatibility path of Block.setitem.

Changes:

gate the optimization on non-scalar indexers
use length_of_indexer for common indexer types instead of materializing values[indexer]
keep the existing behavior for uncommon indexers by falling back to len(values[indexer])
add a whatsnew entry for the performance improvement
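
A minimal sketch of the idea behind the change list above, assuming only NumPy: for common indexer types, the number of elements to be set can be derived from the indexer itself, so `values[indexer]` never has to be built. The helper name `count_selected` is hypothetical and is not the function used in pandas; the PR itself relies on `length_of_indexer`.

```python
import numpy as np


def count_selected(indexer, length):
    """Hypothetical helper: how many positions `indexer` selects along a
    1-D axis of size `length`, without creating `values[indexer]`."""
    if isinstance(indexer, slice):
        # slice.indices clips start/stop/step to the axis length
        return len(range(*indexer.indices(length)))
    arr = np.asarray(indexer)
    if arr.dtype == bool:
        # a boolean mask selects one element per True entry
        return int(np.count_nonzero(arr))
    # assumed here: a 1-D array or list of integer positions
    return int(arr.size)


values = np.empty(1_000, dtype=object)
mask = np.zeros(1_000, dtype=bool)
mask[::4] = True

# Same answer as len(values[mask]), but no temporary object array is allocated.
print(count_selected(mask, len(values)), len(values[mask]))
```

The saving comes from skipping the temporary array: `values[mask]` copies every selected object reference, while counting the `True` entries only reads the mask.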

Validation run locally:

python -m pytest pandas/tests/indexing/test_iloc.py -k "test_iloc_setitem_custom_object"
python -m pre_commit run --files pandas/core/internals/blocks.py
python -m pre_commit run --files doc/source/whatsnew/v3.1.0.rst

hyoj0942 · 2026-02-20T08:01:54Z

ASV results for this change (same machine, same env):

asv run -e -E existing -b SetitemObjectDtypeDatetimelike --record-samples --dry-run

Note: for apples-to-apples comparison, baseline was measured by reverting only the optimization in blocks.py while keeping the benchmark class constant.

nrows	frac	bool baseline	bool this PR	bool speedup	ndarray baseline	ndarray this PR	ndarray speedup
10000	0.01	14.4μs	14.4μs	1.000x	8.03μs	8.14μs	0.986x
10000	0.50	120μs	102μs	1.176x	117μs	99.8μs	1.172x
10000	0.99	213μs	188μs	1.133x	224μs	194μs	1.155x
1000000	0.01	957μs	715μs	1.338x	253μs	201μs	1.259x
1000000	0.50	15.2ms	11.7ms	1.299x	12.9ms	11.2ms	1.152x
1000000	0.99	18.8ms	17.1ms	1.099x	24.4ms	20.6ms	1.184x
10000000	0.01	15.9ms	10.6ms	1.500x	3.18ms	2.85ms	1.116x
10000000	0.50	151ms	115ms	1.313x	221ms	161ms	1.373x
10000000	0.99	177ms	166ms	1.066x	437ms	316ms	1.383x
20000000	0.01	43.2ms	25.5ms	1.694x	10.5ms	7.78ms	1.350x
20000000	0.50	302ms	233ms	1.296x	458ms	333ms	1.375x
20000000	0.99	371ms	335ms	1.107x	936ms	668ms	1.401x

Geometric mean speedup:

bool: ~1.24x
ndarray: ~1.24x
overall: ~1.24x
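
For reference, the geometric means above can be recomputed from the per-row speedup columns of the table. This is a small sketch using only the standard library; the function name `geometric_mean` and the list names are illustrative, and the inputs are the rounded speedups as printed, so the last digit may differ slightly from a calculation on the raw timings.

```python
import math


def geometric_mean(ratios):
    """Geometric mean of positive ratios, computed via the mean of logs."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Speedup columns copied from the table (bool and ndarray indexers).
bool_speedups = [1.000, 1.176, 1.133, 1.338, 1.299, 1.099,
                 1.500, 1.313, 1.066, 1.694, 1.296, 1.107]
ndarray_speedups = [0.986, 1.172, 1.155, 1.259, 1.152, 1.184,
                    1.116, 1.373, 1.383, 1.350, 1.375, 1.401]

for name, ratios in [
    ("bool", bool_speedups),
    ("ndarray", ndarray_speedups),
    ("overall", bool_speedups + ndarray_speedups),
]:
    print(f"{name}: {geometric_mean(ratios):.3f}x")
```

A geometric mean is the usual choice for averaging ratios, because a 2x speedup and a 0.5x slowdown then cancel to 1x instead of averaging to 1.25x.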

rhshadrach

Thanks for the PR!

hyoj0942 · 2026-02-23T01:44:25Z

Pushed follow-up commit 5a09be1244 addressing review feedback:

reduced ASV parameterization to indexer_kind only
removed broad try/except fallback in _datetimelike_compat_num_set
removed newly added internals tests and rely on existing public API tests for coverage

jbrockmendel · 2026-03-26T01:11:41Z

Can you add a whatsnew for the perf improvement? otherwise LGTM

hyoj0942 · 2026-03-26T06:00:22Z

Added a whatsnew entry in doc/source/whatsnew/v3.1.0.rst and updated the checklist in the PR description.

Stale

jbrockmendel · 2026-04-10T15:27:47Z

thanks @hyoj0942

…-comparison * upstream/main:
PERF: use lookup instead of hash_inner_join for merge with unique right keys (pandas-dev#64691)
BUG : update `SeriesGroupBy.ohlc()` to honor `as_index=False` (pandas-dev#65141)
PERF: Use DataFrame-level reductions in DataFrame.agg with list of funcs (pandas-dev#65031)
DOC: document required external libraries in read_* I/O docstrings (pandas-dev#65143)
DOC: improve MultiIndex.is_monotonic_increasing/decreasing docstrings (pandas-dev#65154)
BUG: Raise ValueError for non-boolean numeric_only in DataFrame/Series reductions (GH#53098) (pandas-dev#65131)
BUG: Timedelta.round() raises ZeroDivisionError when internal unit is 's' and target frequency is sub-second (pandas-dev#64836)
ENH: Add replace method to Index (closes pandas-dev#19495) (pandas-dev#65099)
PERF: improve StringArray.isna (pandas-dev#57733)
BUG: read parquet files with older pytz (DEP: keep lower pytz minimum version) (pandas-dev#65133)
DEPR: deprecate dates-with-datetime64 in _maybe_downcast_for_indexing (pandas-dev#64871)
DOC: note that DataFrame.values is not writeable (pandas-dev#65142)
CLN: Update groupby observed defaults (pandas-dev#65148)
PERF: avoid materializing values[indexer] in Block.setitem (pandas-dev#64251)
DOC: update GroupBy.sum/min/max See Also sections (pandas-dev#65144)

PERF: avoid materializing values[indexer] in Block.setitem

beecb96

hyoj0942 mentioned this pull request Feb 20, 2026

PERF: object-dtype iloc setitem with datetimelike list-like indexers is slow on large arrays #64250

Closed

3 tasks

rhshadrach previously requested changes Feb 20, 2026

Comment thread asv_bench/benchmarks/indexing.py Outdated

Comment thread pandas/core/internals/blocks.py Outdated

Comment thread pandas/tests/internals/test_internals.py Outdated

PERF: address review feedback on Block.setitem num_set path

5a09be1

jbrockmendel added the Performance (memory or execution speed performance) label Mar 6, 2026

jbrockmendel reviewed Mar 15, 2026

Comment thread pandas/core/internals/blocks.py

PERF: simplify Block.setitem datetimelike fastpath

8c94c1e

jbrockmendel reviewed Mar 17, 2026

Comment thread pandas/tests/indexing/test_iloc.py Outdated

jbrockmendel reviewed Mar 17, 2026

Comment thread asv_bench/benchmarks/indexing.py Outdated

PERF: drop extra setitem datetimelike coverage

72874fd

DOC: add whatsnew for iloc datetimelike setitem perf

53eee23

hyoj0942 and others added 5 commits March 26, 2026 15:11

Merge upstream/main into perf-block-setitem-datetimelike-numset

c8d355c

Merge branch 'main' into perf-block-setitem-datetimelike-numset

f97a166

Merge branch 'main' into perf-block-setitem-datetimelike-numset

ac3103b

Merge branch 'main' into perf-block-setitem-datetimelike-numset

b879e3d

Merge branch 'main' into perf-block-setitem-datetimelike-numset

3f9071d

jbrockmendel approved these changes Mar 26, 2026

hyoj0942 requested a review from rhshadrach March 30, 2026 09:20

jbrockmendel merged commit 32b7892 into pandas-dev:main Apr 10, 2026
45 checks passed

mroeschke added this to the 3.1 milestone Apr 10, 2026

jbrockmendel merged 10 commits into pandas-dev:main from hyoj0942:perf-block-setitem-datetimelike-numset
