
Groupby benchmark fails after PR #2317 merge #2463

Closed
@gshimansky

Description

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):

Ubuntu 20.04.1 LTS

  • Modin version (modin.__version__):

0.8.2.1+16.g3e32d02

  • Python version:

Python 3.8.6

  • Code we can use to reproduce:

python -u -m pytest --log-cli-level=info --benchmark-sort=name --benchmark-verbose --benchmark-json=results.json -r A -s --tb=native 'ci/benchmarks/test_benchmarks.py::test_groupby_sum[agg(sum)-no_indexing-10000x10000-int-modin]'

The benchmark fails with the following exception:

Traceback (most recent call last):
  File "/localdisk/gashiman/modin/ci/benchmarks/test_benchmarks.py", line 206, in test_groupby_sum
    result = benchmark.pedantic(
  File "/nfs/site/home/gashiman/.local/lib/python3.8/site-packages/pytest_benchmark/fixture.py", line 139, in pedantic
    return self._raw_pedantic(target, args=args, kwargs=kwargs, setup=setup, rounds=rounds,
  File "/nfs/site/home/gashiman/.local/lib/python3.8/site-packages/pytest_benchmark/fixture.py", line 213, in _raw_pedantic
    runner(loops_range)
  File "/nfs/site/home/gashiman/.local/lib/python3.8/site-packages/pytest_benchmark/fixture.py", line 87, in runner
    sys.settrace(None)
  File "/localdisk/gashiman/modin/ci/benchmarks/test_benchmarks.py", line 170, in benchmark_groupby_agg_sum_function
    result = gb.agg(sum)
  File "/localdisk/gashiman/modin/modin/pandas/groupby.py", line 399, in aggregate
    result = self._apply_agg_function(
  File "/localdisk/gashiman/modin/modin/pandas/groupby.py", line 887, in _apply_agg_function
    new_manager = groupby_qc.groupby_agg(
  File "/localdisk/gashiman/modin/modin/backends/pandas/query_compiler.py", line 2633, in groupby_agg
    new_modin_frame = self._modin_frame._apply_full_axis(
  File "/localdisk/gashiman/modin/modin/engines/base/frame/data.py", line 1301, in _apply_full_axis
    return self.broadcast_apply_full_axis(
  File "/localdisk/gashiman/modin/modin/engines/base/frame/data.py", line 1673, in broadcast_apply_full_axis
    new_axes = [
  File "/localdisk/gashiman/modin/modin/engines/base/frame/data.py", line 1674, in <listcomp>
    self._compute_axis_labels(i, new_partitions)
  File "/localdisk/gashiman/modin/modin/engines/base/frame/data.py", line 289, in _compute_axis_labels
    return self._frame_mgr_cls.get_indices(
  File "/localdisk/gashiman/modin/modin/engines/ray/pandas_on_ray/frame/partition_manager.py", line 99, in get_indices
    new_idx = ray.get(new_idx)
  File "/localdisk/gashiman/miniconda3/lib/python3.8/site-packages/ray/worker.py", line 1452, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(TypeError): ray::deploy_ray_func() (pid=4067967, ip=10.241.129.55)
  File "python/ray/_raylet.pyx", line 446, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 468, in ray._raylet.execute_task
ray.exceptions.RayTaskError: ray::deploy_ray_func() (pid=4067969, ip=10.241.129.55)
  File "python/ray/_raylet.pyx", line 482, in ray._raylet.execute_task
  File "/localdisk/gashiman/modin/modin/engines/ray/pandas_on_ray/frame/axis_partition.py", line 105, in deploy_ray_func
    result = func(*args)
  File "/localdisk/gashiman/modin/modin/engines/base/frame/axis_partition.py", line 224, in deploy_axis_func
    result = func(dataframe, **kwargs)
  File "/localdisk/gashiman/modin/modin/engines/base/frame/data.py", line 1036, in _map_reduce_func
    series_result = func(df, *args, **kwargs)
  File "/localdisk/gashiman/modin/modin/backends/pandas/query_compiler.py", line 2634, in <lambda>
    axis, lambda df: groupby_agg_builder(df)
  File "/localdisk/gashiman/modin/modin/backends/pandas/query_compiler.py", line 2627, in groupby_agg_builder
    return compute_groupby(df)
  File "/localdisk/gashiman/modin/modin/backends/pandas/query_compiler.py", line 2618, in compute_groupby
    result = agg_func(grouped_df, **agg_kwargs)
  File "/localdisk/gashiman/modin/modin/utils.py", line 123, in wrapper
    result = func(*args, **kwargs)
TypeError: unsupported operand type(s) for +: 'int' and 'tuple'
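
A possible reading of the last frames, offered as an assumption rather than a confirmed diagnosis: if agg_func ends up being Python's built-in sum applied directly to the pandas groupby object, the same message appears, because iterating a groupby object yields (key, group) tuples and sum starts from the integer 0. The sketch below reproduces that message with plain pandas. The frame and the column names key and val are made up for illustration and are not taken from the benchmark.

```python
import pandas as pd

# Hypothetical toy data; not the 10000x10000 frame used by the benchmark.
df = pd.DataFrame({"key": [1, 1, 2], "val": [10, 20, 30]})
grouped = df.groupby("key")

# Built-in sum() iterates its argument and adds each item to 0.
# A groupby object yields (key, group) tuples, so the first step is 0 + tuple.
try:
    sum(grouped)
except TypeError as exc:
    message = str(exc)

print(message)

# Naming the aggregation as a string lets pandas dispatch to its own groupby sum.
by_name = grouped.agg("sum")
print(by_name)
```

If that reading is right, the failure is about how the callable reaches the partition-level groupby, not about the data itself. The string form "sum" is shown only as a point of comparison.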


Labels

bug 🦗: Something isn't working
