Binary operations throw exceptions when a Series has duplicated index values #2265

Closed
@prutskov

Description

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
  • Modin version: 0.8.1.1
  • Python version: 3.8.3
  • Exact command to reproduce:
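import modin.pandas as pd
import pandas

data = {'a': [0, 1, 2, 3]}
data2 = {'a': [4, 5, 6, 7]}
# Index with duplicated labels (0 and 1 each appear twice)
index = [0, 1, 0, 1]

# First operand: duplicated index
md_df = pd.DataFrame(data, index=index)
pd_df = pandas.DataFrame(data, index=index)

# Second operand: default RangeIndex 0..3
md_df2 = pd.DataFrame(data2)
pd_df2 = pandas.DataFrame(data2)

# pandas computes the product; Modin raises ValueError
print(f'pandas res \n{pd_df.a * pd_df2.a }')
print(f'modin res \n{md_df.a * md_df2.a }')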

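Describe the problem

Multiplying a Modin Series that has duplicated index values by another Series raises `ValueError: cannot reindex from a duplicate axis`, while the equivalent pandas expression runs without error.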

Source code / logs

Log
    print(f'modin res \n{md_df.a * md_df2.a }')
  File "/localdisk/aprutsko/modin/modin/pandas/series.py", line 290, in __mul__
    return self.mul(right)                 
  File "/localdisk/aprutsko/modin/modin/pandas/series.py", line 1131, in mul                 
    return super(Series, new_self).mul(                       
  File "/localdisk/aprutsko/modin/modin/pandas/base.py", line 1794, in mul                     
    return self._binary_op(                                  
  File "/localdisk/aprutsko/modin/modin/pandas/base.py", line 244, in _binary_op                                  
    new_query_compiler = getattr(self._query_compiler, op)(other, **kwargs)
  File "/localdisk/aprutsko/modin/modin/data_management/functions/binary_function.py", line 48, in caller               
    query_compiler._modin_frame._binary_op(
  File "/localdisk/aprutsko/modin/modin/engines/base/frame/data.py", line 1735, in _binary_op            
    left_parts, right_parts, joined_index = self._copartition(
  File "/localdisk/aprutsko/modin/modin/engines/base/frame/data.py", line 1666, in _copartition  
    reindexed_self = self._frame_mgr_cls.map_axis_partitions( 
  File "/localdisk/aprutsko/modin/modin/engines/base/frame/partition_manager.py", line 323, in map_axis_partitions
    return cls.broadcast_axis_partitions(
  File "/localdisk/aprutsko/modin/modin/engines/base/frame/partition_manager.py", line 249, in broadcast_axis_partitions
    [                                                                     
  File "/localdisk/aprutsko/modin/modin/engines/base/frame/partition_manager.py", line 250, in <listcomp>           
    part.apply(                 
  File "/localdisk/aprutsko/modin/modin/engines/base/frame/axis_partition.py", line 172, in apply              
    return self._wrap_partitions(self.deploy_axis_func(*args))
  File "/localdisk/aprutsko/modin/modin/engines/base/frame/axis_partition.py", line 218, in deploy_axis_func     
    result = func(dataframe, **kwargs)
  File "/localdisk/aprutsko/modin/modin/engines/base/frame/data.py", line 1667, in <lambda>                          
    axis, self._partitions, lambda df: df.reindex(joined_index, axis=axis)
  File "/localdisk/aprutsko/miniconda3/lib/python3.8/site-packages/pandas/util/_decorators.py", line 309, in wrapper  
    return func(*args, **kwargs)       
  File "/localdisk/aprutsko/miniconda3/lib/python3.8/site-packages/pandas/core/frame.py", line 4031, in reindex                 
    return super().reindex(**kwargs)    
  File "/localdisk/aprutsko/miniconda3/lib/python3.8/site-packages/pandas/core/generic.py", line 4458, in reindex                   
    return self._reindex_axes(           
  File "/localdisk/aprutsko/miniconda3/lib/python3.8/site-packages/pandas/core/frame.py", line 3877, in _reindex_axes      
    frame = frame._reindex_index(                           
  File "/localdisk/aprutsko/miniconda3/lib/python3.8/site-packages/pandas/core/frame.py", line 3896, in _reindex_index
    return self._reindex_with_indexers(
  File "/localdisk/aprutsko/miniconda3/lib/python3.8/site-packages/pandas/core/generic.py", line 4521, in _reindex_with_indexers
    new_data = new_data.reindex_indexer(
  File "/localdisk/aprutsko/miniconda3/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 1276, in reindex_indexer
    self.axes[axis]._can_reindex(indexer)                                
  File "/localdisk/aprutsko/miniconda3/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3285, in _can_reindex
    raise ValueError("cannot reindex from a duplicate axis")
ValueError: cannot reindex from a duplicate axis
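
The failing frame in the log is the `df.reindex(joined_index, axis=axis)` lambda in `_copartition`. The following is a minimal sketch, using plain pandas only, of why that call fails when the frame's own axis has duplicated labels while the pandas operator still succeeds. The variable names (`left`, `right`, `joined_index`, `reindex_error`, `product`) are illustrative and not taken from Modin, and the way `joined_index` is built here is an assumption about what the joined index looks like, not a copy of Modin's logic.

```python
import pandas

# Same data as the reproducer, built as plain pandas Series.
left = pandas.Series([0, 1, 2, 3], index=[0, 1, 0, 1], name="a")
right = pandas.Series([4, 5, 6, 7], name="a")

# Assumed stand-in for the joined index: an outer join of both indexes.
joined_index = left.index.join(right.index, how="outer")

# Mirrors the failing call in the log: reindexing an object whose own axis
# contains duplicates raises ValueError (the exact message depends on the
# pandas version).
try:
    left.to_frame().reindex(joined_index, axis=0)
    reindex_error = None
except ValueError as err:
    reindex_error = err

# The pandas operator aligns the operands itself and returns a result,
# as the pandas line of the reproducer does.
product = left * right

print(repr(reindex_error))
print(product)
```

This suggests the problem is specific to aligning through `reindex`, which requires the source axis to be unique, rather than to the multiplication itself.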

Labels

bug 🦗Something isn't working
