CI: Add pyarrow 11 to ci #51308

phofl · 2023-02-10T18:33:41Z

closes #xxxx (Replace xxxx with the GitHub issue number)
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

lithomas1

Can we put a hold on this for now? I would like to discuss with Arrow folks about the best way to test between our two projects.

(Extended thoughts below)

lithomas1 · 2023-02-10T21:13:10Z

Status Quo

Currently, we test on pyarrow versions 8, 9, and 10 (IIRC because some version of pyarrow in between our min/latest version broke, but the min/latest version passed). This currently adds a lot of burden on our CI (builds are queueing for 2 hours again even though Github bumped up our concurrent runners), since the builds are created from the cross product of matrix parameters, and also hasn't really caught that many issues (which is expected, since failures would mean that pyarrow had a regression).

IMO, we should share some of that testing burden between the two projects, both to reduce load on the pandas CI, and to ensure that pyarrow has enough testing coverage to catch its own bugs in Pandas/Pyarrow integration.

Proposed changes

Pyarrow test pandas nightlies in addition to testing min/latest pandas
- pyarrow seems to run their tests on pandas nightly.
- They should add a job to run pandas tests that test Arrow Extension Arrays/CSV IO/Parquet IO etc.
  - Pyarrow uses pandas internals, so it is easy for them to break something, because of changes to internal structures on our side
  - We should provide a mark to only run Arrow tests so that they don't have to run the full test suite.
We drop testing of intermediate versions of pyarrow and move to testing only min/max again
- This aligns with what we're doing with numpy
- There is a risk of breakage due to pyarrow's usage of our internals, so we should kinda figure out what pyarrow needs in our internals and mark those functions in the code.
- Maybe we can have a "pseudo-public" API for those functions.
(optionally) We test pyarrow nightlies
- If CI capacity frees up enough, we should also test pyarrow nightlies (I couldn't find the nightly builds for the Python bindings though).

Alternatives

Add the "arrow" mark on our side and only run those tests on the parametrized arrow build.

I will open an issue on their side in a couple days if no objections.

cc @mroeschke @jorisvandenbossche

mroeschke · 2023-02-10T21:26:02Z

Generally I would support only testing pyarrow min + max again right away.

pyarrow has seemed fairly stable within the past few versions and appears to be primarily additive in functionality. Especially since pyarrow functionality has been primarily opt-in-without-fallback, I don't see too much utility testing intermediate versions of pyarrow anymore.

phofl · 2023-02-10T21:29:54Z

I am a bit hesitant about only testing min and max as long as we have the flags that control behavior dependent on the pyarrow version. Wdyt about testing 8 on python 3.8, 9 on python 3.9 and so on?

mroeschke · 2023-02-10T21:37:05Z

FWIW I think since we bumped up the pyarrow version recently we only have version dependent fallback behavior in 2 spots, argsort and fillna in special cases. (They appear to also be or versions < 7 which we recently made our min version)

argsort

        null_placement = {"last": "at_end", "first": "at_start"}.get(na_position, None)
        if null_placement is None or pa_version_under7p0:
            # Although pc.array_sort_indices exists in version 6
            # there's a bug that affects the pa.ChunkedArray backing
            # https://issues.apache.org/jira/browse/ARROW-12042
            fallback_performancewarning("7")
            return super().argsort(
                ascending=ascending, kind=kind, na_position=na_position
            )

fillna

        if method is not None and pa_version_under7p0:
            # fill_null_{forward|backward} added in pyarrow 7.0
            fallback_performancewarning(version="7")
            return super().fillna(value=value, method=method, limit=limit)

jorisvandenbossche · 2023-02-10T22:11:24Z

I don't fully understand why we at some point moved towards parametrizing the build matrix on all supported versions of pyarrow. Personally, I would keep testing all versions of pyarrow (at least if we currently only support the latest 4 versions that seem doable), but that doesn't mean that we have to build every possible combination of python version and single-cpu / non-single-cpu with every version of pyarrow?

We could also generally run one version of pyarrow per version of Python? (roughly speaking) That gives 8 builds (4*2) instead of 32 (4*4*2)

jorisvandenbossche · 2023-02-10T22:18:42Z

Pyarrow test pandas nightlies in addition to testing min/latest pandas

pyarrow seems to run their tests on pandas nightly.

Indeed, we already have a build with pandas nightly (which regularly catches some changes/regressions)

They should add a job to run pandas tests that test Arrow Extension Arrays/CSV IO/Parquet IO etc.

Yes, this is indeed something we could do. I am not actually sure why we don't do that for pandas, as we do exactly that for dask (run a subset of dask's tests using the nightly arrow)

I think starting with specifically running a few files/directories (parquet and csv tests, the arrow extension arrays) might be a good enough start (not necessarily needing a mark, although if we start with marking a few whole files, that's also easy to add)

(optionally) We test pyarrow nightlies

I would emphasize this a bit more, and it should actually be easy to add (we already have a build with upstream dev versions, and we can add pyarrow nightly to the same build?)

mroeschke · 2023-02-10T22:28:18Z

but that doesn't mean that we have to build every possible combination of python version and single-cpu / non-single-cpu with every version of pyarrow

The pyaarrow CSV engine tests are marked as single_cpu while the other tests are not. In the past, I noticed the arrow CSV engine would stall if run with xdist.

lithomas1 · 2023-02-10T22:39:15Z

They should add a job to run pandas tests that test Arrow Extension Arrays/CSV IO/Parquet IO etc.

Yes, this is indeed something we could do. I am not actually sure why we don't do that for pandas, as we do exactly that for dask (run a subset of dask's tests using the nightly arrow)

I think starting with specifically running a few files/directories (parquet and csv tests, the arrow extension arrays) might be a good enough start (not necessarily needing a mark, although if we start with marking a few whole files, that's also easy to add)

Cool. I'll open up an issue on the Arrow side then.

(optionally) We test pyarrow nightlies

I would emphasize this a bit more, and it should actually be easy to add (we already have a build with upstream dev versions, and we can add pyarrow nightly to the same build?)

I can add this if you give me a link. Googling only gives me the R nightlies.

jorisvandenbossche · 2023-02-10T22:41:20Z

but that doesn't mean that we have to build every possible combination of python version and single-cpu / non-single-cpu with every version of pyarrow

The pyaarrow CSV engine tests are marked as single_cpu while the other tests are not. In the past, I noticed the arrow CSV engine would stall if run with xdist.

Yes, but what I mentioned was running each combo of python version and single/non-single CPU just with 1 version of pyarrow instead of 4. That still runs the csv tests with pyarrow for single-cpu.

jorisvandenbossche · 2023-02-10T22:42:47Z

I can add this if you give me a link. Googling only gives me the R nightlies.

@lithomas1 see https://arrow.apache.org/docs/dev/developers/python.html#installing-nightly-packages (it's indeed a bit hidden in the development section instead of having it on the install page).

mroeschke · 2023-02-10T23:36:18Z

Yes, but what I mentioned was running each combo of python version and single/non-single CPU just with 1 version of pyarrow instead of 4. That still runs the csv tests with pyarrow for single-cpu.

Right that's fair. I think I would still prefer to test the min & max version of pyarrow. Anecdotally, pyarrow development has notably affected the minimum pyarrow version build or all builds affecting all Pyarrow versions.

github-actions · 2023-03-25T00:05:20Z

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

lithomas1 · 2023-03-25T00:53:09Z

Sorry, this slipped off my radar for a bit.

Right now, I'm currently working on getting rid of all hanging tests for read_csv with pyarrow. Then, we should be able to get rid of the single_cpu pyarrow builds.

I'm holding off on adding pyarrow nightlies, until we can first reduce the CI matrix first.

phofl · 2023-03-25T05:11:22Z

I stumbled upon this on the desk side today, what are your thoughts on adding the arrow nightly to the numpy nightly build?

lithomas1 · 2023-03-25T14:04:32Z

I stumbled upon this on the desk side today, what are your thoughts on adding the arrow nightly to the numpy nightly build?

I don't think they test with Cython 3 ATM, so it wouldn't be practical (They currently do support the latest beta as of a couple of days ago, but it is untested in their CI).
It also could cause some noise in our CI when numpydev breaks arrow.

mroeschke · 2023-03-25T20:59:36Z

IMO I think we could already:

Start only testing pyarrow min version & unpinned (latest)
Add a new job in the ubuntu.yml workflow for testing just arrow nightly

That would be a net of -3 jobs in the workflow while testing pyarrow nightly which is probably more valuable than testing intermediate versions of pyarrow

mroeschke · 2023-03-31T17:45:36Z

I think we can close this as we cleaned up some of the Arrow testing in the CI as apart of #52211

CI: Add pyarrow 11 to ci

a0e8cfc

lithomas1 requested changes Feb 10, 2023

View reviewed changes

simonjayhawkins added CI Continuous Integration Arrow pyarrow functionality labels Feb 22, 2023

github-actions bot added the Stale label Mar 25, 2023

mroeschke closed this Mar 31, 2023

phofl deleted the pyarrow_testing branch April 27, 2023 19:53

jorisvandenbossche mentioned this pull request Jul 19, 2023

CI: Test pyarrow nightly instead of intermediate versions #52211

Merged

5 tasks

Uh oh!

CI: Add pyarrow 11 to ci #51308

CI: Add pyarrow 11 to ci #51308

Uh oh!

Conversation

phofl commented Feb 10, 2023

Uh oh!

lithomas1 left a comment

Choose a reason for hiding this comment

Uh oh!

lithomas1 commented Feb 10, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mroeschke commented Feb 10, 2023

Uh oh!

phofl commented Feb 10, 2023

Uh oh!

mroeschke commented Feb 10, 2023

Uh oh!

jorisvandenbossche commented Feb 10, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jorisvandenbossche commented Feb 10, 2023

Uh oh!

mroeschke commented Feb 10, 2023

Uh oh!

lithomas1 commented Feb 10, 2023

Uh oh!

jorisvandenbossche commented Feb 10, 2023

Uh oh!

jorisvandenbossche commented Feb 10, 2023

Uh oh!

mroeschke commented Feb 10, 2023

Uh oh!

github-actions bot commented Mar 25, 2023

Uh oh!

lithomas1 commented Mar 25, 2023

Uh oh!

phofl commented Mar 25, 2023

Uh oh!

lithomas1 commented Mar 25, 2023

Uh oh!

mroeschke commented Mar 25, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mroeschke commented Mar 31, 2023

Uh oh!

Uh oh!

lithomas1 commented Feb 10, 2023 •

edited

Loading

jorisvandenbossche commented Feb 10, 2023 •

edited

Loading

mroeschke commented Mar 25, 2023 •

edited

Loading