All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog.
- Refactor. Please refer to the `Design Overview` section in the docs for more details.
- Support both `matplotlib` and `plotly`.
- Update tutorials according to the refactored code.
- Better unit tests.
- Semi-automate docstring generation.
- Formal documentation hosted on readthedocs.org
- Keep track of historical documentation
- Unit tests
- `info_plots.target_plot_interact`: visualise average target value across interaction between two features
- `info_plots.actual_plot_interact`: visualise prediction distribution across interaction between two features
- `get_dataset`: store models and datasets for three different problems (binary classification, multi-class classification, regression)
- Tutorials in jupyter notebook format
- Move all information related plots under `info_plots`, including
  - `info_plots.target_plot`
  - `info_plots.target_plot_interact`
  - `info_plots.actual_plot`
  - `info_plots.actual_plot_interact`
- Move all utility functions under `xx_utils.py`
  - `utils.py`: general utility functions
  - `info_plot_utils.py`: utility functions for information plots
  - `pdp_calc_utils.py`: utility functions for pdp related calculation
  - `pdp_plot_utils.py`: utility functions for pdp related plots
- `class PDPIsolate`
  - Rename `class pdp_isolate_obj` to `class PDPIsolate`
  - Remove `self.classifier`, `self.model_features`, `self.actual_columns`: no longer needed
  - Add `self.which_class`, `self.percentile_info`, `self.count_data`, `self.hist_data`: store class information for multi-class problems, percentile information for grid points, and value counts as well as feature values for numeric features
- `class PDPInteract`
  - Rename `class pdp_interact_obj` to `class PDPInteract`
  - Remove `self.classifier`, `self.model_features`: no longer needed
  - Add `self.which_class`: store class information for multi-class problems
  - Combine `self.pdp_isolate_out1` and `self.pdp_isolate_out2` into `self.pdp_isolate_outs`
- `pdp.pdp_isolate`
  - Replace `train_X` with `dataset` to store the whole dataset instead of only the subset used for model training; accordingly, add `model_features` to indicate the features used for model training
  - Add `grid_type`, `grid_range`: define type and range for grid points
  - Add `memory_limit`, `n_jobs`: limit memory usage, support parallel processing
  - Set `predict_kwds` default value to `None` instead of `{}`
  - Add `data_transformer`: support dataset transformation
- `pdp.pdp_plot`
  - Add `plot_pts_dist`: enable plotting the distribution of data points
  - Remove `plot_org_pts`: plotting original data points is no longer supported
  - Set `cluster_method` default value to `'accurate'` instead of `None`
  - Add `show_percentile`: show percentile information of grid points
  - Set `ncols` default value to 2 instead of `None`
  - Add `which_classes`, remove `multi_flag`, `which_class`: plotting a single class is now supported by `which_classes`
- `pdp.pdp_interact`
  - Replace `train_X` with `dataset` to store the whole dataset instead of only the subset used for model training; accordingly, add `model_features` to indicate the features used for model training
  - Set `num_grid_points` default value to `None` instead of `[10, 10]`
  - Add `grid_type`, `grid_range`: define type and range for grid points
  - Set `percentile_ranges` default value to `None` instead of `[None, None]`
  - Set `cust_grid_points` default value to `None` instead of `[None, None]`
  - Set `predict_kwds` default value to `None` instead of `{}`
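
The changelog does not state why the `{}`, `[10, 10]`, and `[None, None]` defaults were replaced by `None`. One common Python reason, offered here only as an assumption, is that mutable default arguments are created once and shared across calls. The sketch below uses two hypothetical functions (`predict_shared` and `predict_fresh`, neither of which is part of this project) to show the difference.

```python
def predict_shared(X, predict_kwds={}):
    """Anti-pattern: the default dict is created once and reused by every call."""
    predict_kwds.setdefault("calls", 0)
    predict_kwds["calls"] += 1  # mutates the shared default object
    return predict_kwds["calls"]


def predict_fresh(X, predict_kwds=None):
    """Safer pattern: use None as a sentinel and build a new dict per call."""
    if predict_kwds is None:
        predict_kwds = {}
    predict_kwds.setdefault("calls", 0)
    predict_kwds["calls"] += 1  # mutates a dict local to this call
    return predict_kwds["calls"]
```

Calling `predict_shared` twice without `predict_kwds` returns 1 and then 2, because state leaks between calls, while `predict_fresh` returns 1 both times.
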
- `pdp.pdp_interact_plot`
  - Add `plot_type`, `plot_pdp`, remove `only_inter`: define the plot type and whether to plot pdp for both features; showing only the contour plot is now supported by `plot_type` and `plot_pdp`
  - Add `which_classes`, remove `multi_flag`, `which_class`: plotting a single class is now supported by `which_classes`
  - Set `ncols` default value to 2 instead of `None`
  - Remove `center`, `plot_org_pts`, `plot_lines`, `frac_to_plot`, `cluster`, `n_cluster_centers`, `cluster_method`: plotting separate pdp plots is no longer supported
- `info_plots.target_plot`
  - Add `grid_type`, `grid_range`: define type and range for grid points
  - Add `show_percentile`: show percentile information of grid points
  - Add `show_outliers`: whether to show data points outside the grid range
  - Add `endpoint`: whether stop is the last grid point
  - Add `ncols`: define number of columns for multiple plots
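
To make the grid-related parameters above more concrete, here is a minimal sketch of how percentile-based and equally spaced grid points could be computed. It is not the library's implementation: `make_grid` and its helpers are hypothetical, the `"percentile"` and `"equal"` values for `grid_type` are assumed, and `endpoint` is assumed to behave like the option of the same name in `numpy.linspace`.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a pre-sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def linspace(start, stop, num, endpoint=True):
    """Evenly spaced points; stop is the last point only when endpoint is True."""
    if num < 2:
        return [float(start)]
    step = (stop - start) / (num - 1 if endpoint else num)
    return [start + i * step for i in range(num)]


def make_grid(values, num_grid_points=10, grid_type="percentile",
              grid_range=None, endpoint=True):
    """Hypothetical helper: build grid points for one numeric feature."""
    vals = sorted(values)
    if not vals:
        raise ValueError("values must not be empty")
    if grid_type == "percentile":
        # Grid points follow the data distribution (dense where data is dense).
        qs = linspace(0, 100, num_grid_points, endpoint)
        grid = [percentile(vals, q) for q in qs]
    elif grid_type == "equal":
        # Grid points are evenly spaced over grid_range (default: min..max).
        lo, hi = grid_range if grid_range is not None else (vals[0], vals[-1])
        grid = linspace(lo, hi, num_grid_points, endpoint)
    else:
        raise ValueError("grid_type must be 'percentile' or 'equal'")
    return sorted(set(grid))  # drop duplicates caused by tied values
```

For skewed data such as `[1, 2, 3, 4, 100]`, the percentile grid stays close to where most values lie, whereas the equal grid spreads points evenly between the minimum and maximum.
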
- `info_plots.actual_plot`
  - Add `model`, `X`, `feature`, remove `pdp_isolate_out`: no longer depends on `pdp.pdp_isolate`, so all parameters needed for calculating the results must be defined
  - Add `num_grid_points`, `grid_type`, `percentile_range`, `grid_range`, `cust_grid_points`, `show_percentile`, `show_outliers`, `endpoint`, `which_classes`, `predict_kwds`
  - Set `ncols` default value to 2 instead of `None`
  - Add `which_classes`, remove `multi_flag`, `which_class`: plotting a single class is now supported by `which_classes`
  - Set `predict_kwds` default value to `None` instead of `{}`
- Python3 compatibility
- All plotting related functions now return a `matplotlib.figure.Figure` object as well as the Matplotlib axes for further modification