CLN: ASV indexing #19031

Merged 4 commits on Jan 3, 2018

mroeschke
Member

I moved some benchmarks that were testing methods of (mostly) MultiIndexes to index_object.py. Otherwise this is mostly cleanup, and files that start with i are now linted.

$ asv dev -b ^indexing
· Discovering benchmarks
· Running 49 total benchmarks (1 commits * 1 environments * 49 benchmarks)
[  0.00%] ·· Building for existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ·· Benchmarking existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ··· Setting up /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/indexing.py:211
[  2.04%] ··· Running indexing.IntervalIndexing.time_getitem_list         240μs
[  4.08%] ··· Running indexing.IntervalIndexing.time_getitem_scalar       136μs
[  6.12%] ··· Running indexing.IntervalIndexing.time_loc_list             208μs
[  8.16%] ··· Running indexing.IntervalIndexing.time_loc_scalar           222μs
[  8.16%] ··· Setting up /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/indexing.py:245
[ 10.20%] ··· Running indexing.MethodLookup.time_lookup_iloc             10.4μs
[ 12.24%] ··· Running indexing.MethodLookup.time_lookup_ix               10.3μs
[ 14.29%] ··· Running indexing.MethodLookup.time_lookup_loc              10.1μs
[ 16.33%] ··· Running ...iesIndex.time_frame_assign_timeseries_index     6.08ms
[ 18.37%] ··· Running ....DataFrameNumericIndexing.time_bool_indexer     1.44ms
[ 20.41%] ··· Running indexing.DataFrameNumericIndexing.time_iloc         448μs
[ 22.45%] ··· Running ...ing.DataFrameNumericIndexing.time_iloc_dups      550μs
[ 24.49%] ··· Running indexing.DataFrameNumericIndexing.time_loc          827μs
[ 26.53%] ··· Running ...xing.DataFrameNumericIndexing.time_loc_dups     6.57ms
[ 28.57%] ··· Running ...g.DataFrameStringIndexing.time_boolean_rows      680μs
[ 30.61%] ··· Running ...rameStringIndexing.time_boolean_rows_object      671μs
[ 32.65%] ··· Running ...xing.DataFrameStringIndexing.time_get_value      215μs
[ 32.65%] ····· /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/indexing.py:115: FutureWarning: get_value is deprecated and will be removed in a future release. Please use .at[] or .iat[] accessors instead
                self.df.get_value(self.idx_scalar, self.col_scalar)

[ 34.69%] ··· Running ...DataFrameStringIndexing.time_getitem_scalar      238μs
[ 36.73%] ··· Running indexing.DataFrameStringIndexing.time_ix            327μs
[ 38.78%] ··· Running indexing.DataFrameStringIndexing.time_loc           265μs
[ 40.82%] ··· Running ...Column.time_frame_getitem_single_column_int      167μs
[ 42.86%] ··· Running ...lumn.time_frame_getitem_single_column_label      158μs
[ 44.90%] ··· Running ...xing.InsertColumns.time_assign_with_setitem     55.0ms
[ 46.94%] ··· Running indexing.InsertColumns.time_insert                  103ms
[ 48.98%] ··· Running indexing.MultiIndexing.time_frame_ix               17.5ms
[ 51.02%] ··· Running indexing.MultiIndexing.time_index_slice            11.7ms
[ 53.06%] ··· Running indexing.MultiIndexing.time_series_ix              17.2ms
[ 55.10%] ··· Running ...ing.NonNumericSeriesIndexing.time_get_value         ok
[ 55.10%] ···· 
               ========== ========
                 index            
               ---------- --------
                 string    23.4ms 
                datetime   4.74ms 
               ========== ========

[ 55.10%] ····· 
                
                For parameters: 'string'
                /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/indexing.py:94: FutureWarning: get_value is deprecated and will be removed in a future release. Please use .at[] or .iat[] accessors instead
                  self.s.get_value(self.lbl)
                
                For parameters: 'datetime'
                /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/indexing.py:94: FutureWarning: get_value is deprecated and will be removed in a future release. Please use .at[] or .iat[] accessors instead
                  self.s.get_value(self.lbl)

[ 57.14%] ··· Running ...ericSeriesIndexing.time_getitem_label_slice         ok
[ 57.14%] ···· 
               ========== ========
                 index            
               ---------- --------
                 string    26.0ms 
                datetime   5.32ms 
               ========== ========

[ 59.18%] ··· Running ...umericSeriesIndexing.time_getitem_pos_slice         ok
[ 59.18%] ···· 
               ========== ========
                 index            
               ---------- --------
                 string    2.89ms 
                datetime   474μs  
               ========== ========

[ 61.22%] ··· Running ...onNumericSeriesIndexing.time_getitem_scalar         ok
[ 61.22%] ···· 
               ========== ========
                 index            
               ---------- --------
                 string    24.6ms 
                datetime   4.69ms 
               ========== ========

[ 63.27%] ··· Running ...ng.NumericSeriesIndexing.time_getitem_array         ok
[ 63.27%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    58.4ms 
                pandas.core.indexes.numeric.Float64Index   269ms  
               ========================================== ========

[ 65.31%] ··· Running ...umericSeriesIndexing.time_getitem_list_like         ok
[ 65.31%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    56.7ms 
                pandas.core.indexes.numeric.Float64Index   265ms  
               ========================================== ========

[ 67.35%] ··· Running ...ng.NumericSeriesIndexing.time_getitem_lists         ok
[ 67.35%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    64.9ms 
                pandas.core.indexes.numeric.Float64Index   269ms  
               ========================================== ========

[ 69.39%] ··· Running ...g.NumericSeriesIndexing.time_getitem_scalar         ok
[ 69.39%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    2.68ms 
                pandas.core.indexes.numeric.Float64Index   3.40ms 
               ========================================== ========

[ 71.43%] ··· Running ...ng.NumericSeriesIndexing.time_getitem_slice         ok
[ 71.43%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    284μs  
                pandas.core.indexes.numeric.Float64Index   3.61ms 
               ========================================== ========

[ 73.47%] ··· Running indexing.NumericSeriesIndexing.time_iloc_array         ok
[ 73.47%] ···· 
               ========================================== =======
                                 param1                          
               ------------------------------------------ -------
                 pandas.core.indexes.numeric.Int64Index    363μs 
                pandas.core.indexes.numeric.Float64Index   328μs 
               ========================================== =======

[ 75.51%] ··· Running ...g.NumericSeriesIndexing.time_iloc_list_like         ok
[ 75.51%] ···· 
               ========================================== =======
                                 param1                          
               ------------------------------------------ -------
                 pandas.core.indexes.numeric.Int64Index    231μs 
                pandas.core.indexes.numeric.Float64Index   236μs 
               ========================================== =======

[ 77.55%] ··· Running ...xing.NumericSeriesIndexing.time_iloc_scalar         ok
[ 77.55%] ···· 
               ========================================== =======
                                 param1                          
               ------------------------------------------ -------
                 pandas.core.indexes.numeric.Int64Index    132μs 
                pandas.core.indexes.numeric.Float64Index   134μs 
               ========================================== =======

[ 79.59%] ··· Running indexing.NumericSeriesIndexing.time_iloc_slice         ok
[ 79.59%] ···· 
               ========================================== =======
                                 param1                          
               ------------------------------------------ -------
                 pandas.core.indexes.numeric.Int64Index    220μs 
                pandas.core.indexes.numeric.Float64Index   221μs 
               ========================================== =======

[ 81.63%] ··· Running indexing.NumericSeriesIndexing.time_ix_array           ok
[ 81.63%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    58.6ms 
                pandas.core.indexes.numeric.Float64Index   266ms  
               ========================================== ========

[ 83.67%] ··· Running ...ing.NumericSeriesIndexing.time_ix_list_like         ok
[ 83.67%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    57.0ms 
                pandas.core.indexes.numeric.Float64Index   264ms  
               ========================================== ========

[ 85.71%] ··· Running indexing.NumericSeriesIndexing.time_ix_scalar          ok
[ 85.71%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    3.46ms 
                pandas.core.indexes.numeric.Float64Index   3.60ms 
               ========================================== ========

[ 87.76%] ··· Running indexing.NumericSeriesIndexing.time_ix_slice           ok
[ 87.76%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    3.33ms 
                pandas.core.indexes.numeric.Float64Index   3.74ms 
               ========================================== ========

[ 89.80%] ··· Running indexing.NumericSeriesIndexing.time_loc_array          ok
[ 89.80%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    58.2ms 
                pandas.core.indexes.numeric.Float64Index   266ms  
               ========================================== ========

[ 91.84%] ··· Running ...ng.NumericSeriesIndexing.time_loc_list_like         ok
[ 91.84%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    57.4ms 
                pandas.core.indexes.numeric.Float64Index   266ms  
               ========================================== ========

[ 93.88%] ··· Running indexing.NumericSeriesIndexing.time_loc_scalar         ok
[ 93.88%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    58.6ms 
                pandas.core.indexes.numeric.Float64Index   112ms  
               ========================================== ========

[ 95.92%] ··· Running indexing.NumericSeriesIndexing.time_loc_slice          ok
[ 95.92%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    2.82ms 
                pandas.core.indexes.numeric.Float64Index   3.63ms 
               ========================================== ========

[ 97.96%] ··· Running indexing.PanelIndexing.time_subset                 5.09ms
[100.00%] ··· Running indexing.Take.time_take                                ok
[100.00%] ···· 
               ========== ========
                 index            
               ---------- --------
                  int      11.1ms 
                datetime   11.0ms 
               ========== ========
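
For readers unfamiliar with asv, the per-parameter tables above (for example `index: string / datetime`) come from parametrized benchmark classes. The following is a minimal sketch of that convention, not the code from this PR: `params`, `param_names`, `setup`, and the `time_` prefix are asv conventions, while `SeriesLikeIndexing`, the dict standing in for a Series, and the toy `run_benchmark` helper are hypothetical and only illustrate the shape.

```python
import timeit
from datetime import datetime, timedelta


class SeriesLikeIndexing:
    """asv-style benchmark: one timing is reported per entry in ``params``."""

    # asv calls setup() and each time_* method once per parameter value
    # and uses param_names as the header of the result table.
    params = ["string", "datetime"]
    param_names = ["index"]

    def setup(self, index):
        n = 10_000
        if index == "string":
            keys = [f"key{i}" for i in range(n)]
        else:
            start = datetime(2018, 1, 1)
            keys = [start + timedelta(minutes=i) for i in range(n)]
        # Hypothetical stand-in for a pandas Series: a plain dict lookup table.
        self.data = dict(zip(keys, range(n)))
        self.label = keys[n // 2]

    def time_getitem_scalar(self, index):
        self.data[self.label]


def run_benchmark(cls, method_name, number=1000):
    """Toy runner (not asv itself): time one method for every parameter."""
    results = {}
    for param in cls.params:
        bench = cls()
        bench.setup(param)
        method = getattr(bench, method_name)
        elapsed = timeit.timeit(lambda: method(param), number=number)
        results[param] = elapsed / number
    return results


if __name__ == "__main__":
    timings = run_benchmark(SeriesLikeIndexing, "time_getitem_scalar")
    for param, seconds in timings.items():
        print(f"{param:>10}  {seconds * 1e6:.3f}\u03bcs")
```

Because every `time_*` method in such a class shares one `params` list and one `setup`, a class that is parametrized over several index types cannot easily be divided into one file per index type.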

@mroeschke
Member Author

Here are the asv results for the benchmarks that were moved:

$ asv dev -b ^index_object.MultiIndex
· Discovering benchmarks
· Running 11 total benchmarks (1 commits * 1 environments * 11 benchmarks)
[  0.00%] ·· Building for existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ·· Benchmarking existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ··· Setting up /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/index_object.py:135
[  9.09%] ··· Running ...IndexValues.time_datetime_level_values_copy     25.0ms
[ 18.18%] ··· Running ...dexValues.time_datetime_level_values_sliced      538μs
[ 27.27%] ··· Running ...tiIndexDuplicates.time_remove_unused_levels     1.12ms
[ 36.36%] ··· Running ...MultiIndexGet.time_multiindex_large_get_loc      423ms
[ 45.45%] ··· Running ...IndexGet.time_multiindex_large_get_loc_warm      895ms
[ 54.55%] ··· Running ...t.MultiIndexGet.time_multiindex_med_get_loc     4.14ms
[ 63.64%] ··· Running ...tiIndexGet.time_multiindex_med_get_loc_warm     19.1ms
[ 72.73%] ··· Running ...IndexGet.time_multiindex_small_get_loc_warm     14.7ms
[ 81.82%] ··· Running ...ultiIndexGet.time_multiindex_string_get_loc      691μs
[ 90.91%] ··· Running ...x_object.MultiIndexInteger.time_get_indexer      337ms
[100.00%] ··· Running ..._object.MultiIndexInteger.time_is_monotonic      227ms

$ asv dev -b ^index_object.Float64IndexMethod
· Discovering benchmarks
· Running 1 total benchmarks (1 commits * 1 environments * 1 benchmarks)
[  0.00%] ·· Building for existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ·· Benchmarking existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[100.00%] ··· Running index_object.Float64IndexMethod.time_get_loc       7.80ms
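
The `-b` option used in these commands takes a regular expression over benchmark names. The sketch below shows how such a pattern selects names, using only full (untruncated) names that appear in the output in this thread. It assumes search-style matching, which is why the commands anchor the pattern with `^`; `select` is a hypothetical helper, not part of asv.

```python
import re

# Full benchmark names copied from the asv output in this thread.
NAMES = [
    "indexing.IntervalIndexing.time_getitem_list",
    "indexing.Take.time_take",
    "index_object.Float64IndexMethod.time_get_loc",
    "multiindex_object.GetLoc.time_med_get_loc",
]


def select(pattern, names):
    """Return the names a ``-b``-style pattern would pick.

    Assumption: the pattern is applied with search semantics, so an
    unanchored pattern can match anywhere inside the name.
    """
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


if __name__ == "__main__":
    for pattern in ("^indexing", "^index_object.Float64IndexMethod", "index"):
        print(pattern, "->", select(pattern, NAMES))
```

Under that assumption, an unanchored `index` would also pick up `multiindex_object` benchmarks, while `^indexing` restricts the run to the `indexing` module. Note that `.` in the pattern matches any character, not only a literal dot.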

@mroeschke changed the title from "Asv clean indexing" to "CLN: ASV indexing" on Jan 2, 2018
@gfyoung
Member

gfyoung commented Jan 2, 2018

@mroeschke: Looks like there are some lint errors, but otherwise Travis is happy.

[np.arange(1000), np.arange(20), list(string.ascii_letters)],
names=['one', 'two', 'three'])
self.mi_med = MultiIndex.from_product(
[np.arange(1000), np.arange(10), list('A')],
Contributor

I think we should make a sub-dir scheme here, i.e.

benchmarks/index/.....

and split out multi, numeric, datetime, etc.

@jreback added the Benchmark (Performance (ASV) benchmarks) and Indexing (Related to indexing on series/frames, not to indexes themselves) labels on Jan 2, 2018
@codecov

codecov bot commented Jan 3, 2018

Codecov Report

Merging #19031 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #19031      +/-   ##
==========================================
- Coverage   91.57%   91.56%   -0.01%     
==========================================
  Files         150      150              
  Lines       48942    48942              
==========================================
- Hits        44818    44816       -2     
- Misses       4124     4126       +2

| Flag | Coverage Δ | |
|---|---|---|
| #multiple | 89.93% <ø> (-0.01%) | ⬇️ |
| #single | 41.75% <ø> (ø) | ⬆️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| pandas/util/testing.py | 84.74% <0%> (-0.22%) | ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ab000a9...207c797.

@mroeschke
Member Author

@jreback Some of the benchmarks in the file are parametrized over the various index types, or combine different indexes in their setup, which would make them difficult to split.

Instead, I put the MultiIndex benchmarks in their own file.

$ asv dev -b ^multiindex_object
· Discovering benchmarks
· Running 15 total benchmarks (1 commits * 1 environments * 15 benchmarks)
[  0.00%] ·· Building for existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ·· Benchmarking existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ··· Setting up /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/multiindex_object.py:129
[  6.67%] ··· Running ...ject.Values.time_datetime_level_values_copy     24.9ms
[ 13.33%] ··· Running ...ct.Values.time_datetime_level_values_sliced      534μs
[ 20.00%] ··· Running multiindex_object.Duplicated.time_duplicated        223ms
[ 26.67%] ··· Running ...object.Duplicates.time_remove_unused_levels     1.18ms
[ 33.33%] ··· Running multiindex_object.GetLoc.time_large_get_loc         432ms
[ 40.00%] ··· Running ...index_object.GetLoc.time_large_get_loc_warm      890ms
[ 46.67%] ··· Running multiindex_object.GetLoc.time_med_get_loc          4.14ms
[ 53.33%] ··· Running multiindex_object.GetLoc.time_med_get_loc_warm     18.9ms
[ 60.00%] ··· Running ...index_object.GetLoc.time_small_get_loc_warm     15.0ms
[ 66.67%] ··· Running multiindex_object.GetLoc.time_string_get_loc        702μs
[ 73.33%] ··· Running multiindex_object.Integer.time_get_indexer          349ms
[ 80.00%] ··· Running multiindex_object.Integer.time_is_monotonic         245ms
[ 86.67%] ··· Running ...index_object.Sortlevel.time_sortlevel_int64      777ms
[ 93.33%] ··· Running multiindex_object.Sortlevel.time_sortlevel_one     18.2ms
[100.00%] ··· Running ...iindex_object.Sortlevel.time_sortlevel_zero     18.3ms
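
To check that the move did not change the benchmarks' behavior, the timings from the earlier `index_object` run can be compared with this `multiindex_object` run. The sketch below does that arithmetic for a few benchmarks. The pairing of old and new names is an assumption based on the matching suffixes (for example `time_multiindex_large_get_loc` and `GetLoc.time_large_get_loc`), and `parse_duration` and `relative_change` are hypothetical helpers.

```python
# Unit suffixes as printed by asv; both the Greek mu and the micro sign are accepted.
UNITS = {"s": 1.0, "ms": 1e-3, "\u03bcs": 1e-6, "\u00b5s": 1e-6, "ns": 1e-9}


def parse_duration(text):
    """Convert a string such as '4.14ms' to seconds."""
    text = text.strip()
    # Try the longest suffixes first so that 'ms' is not read as 's'.
    for suffix in sorted(UNITS, key=len, reverse=True):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * UNITS[suffix]
    raise ValueError(f"unrecognized duration: {text!r}")


def relative_change(before, after):
    """Fractional change from ``before`` to ``after`` (0.05 means 5% slower)."""
    old = parse_duration(before)
    new = parse_duration(after)
    return (new - old) / old


# (index_object run, multiindex_object run), values copied from this thread.
TIMINGS = {
    "large_get_loc": ("423ms", "432ms"),
    "large_get_loc_warm": ("895ms", "890ms"),
    "med_get_loc": ("4.14ms", "4.14ms"),
    "med_get_loc_warm": ("19.1ms", "18.9ms"),
    "get_indexer": ("337ms", "349ms"),
    "is_monotonic": ("227ms", "245ms"),
}

if __name__ == "__main__":
    for name, (before, after) in TIMINGS.items():
        change = relative_change(before, after)
        print(f"{name:<20} {before:>8} -> {after:>8}  {change:+.1%}")
```

These are single quick runs on one machine, so changes of a few percent in either direction are probably noise rather than an effect of moving the code.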

@jreback added this to the 0.23.0 milestone on Jan 3, 2018
@jreback merged commit c883128 into pandas-dev:master on Jan 3, 2018
@jreback
Contributor

jreback commented Jan 3, 2018

thanks @mroeschke

@mroeschke deleted the asv_clean_indexing branch on January 3, 2018 17:40