Commit 21be690

Merge branch 'master' into master
2 parents 6683a60 + 8d76edd commit 21be690

153 files changed: +6555 -911 lines changed

.gitignore

Lines changed: 10 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -122,3 +122,13 @@ fontList-v300.json
122122
# tex folders
123123
tex.cache/
124124
.Rproj.user
125+
testtt.py
126+
real.py
127+
0to2_beforeduringafter.csv
128+
129+
TEST.ipynb
130+
TrhCsCh.csv
131+
Untitled.ipynb
132+
tt.py
133+
# Visual Studio
134+
.vscode/

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
The Clear BSD License
22

3-
Copyright (c) 2016-2020 Joses W. Ho
3+
Copyright (c) 2016-2023 Joses W. Ho
44
All rights reserved.
55

66
Redistribution and use in source and binary forms, with or without

README.md

Lines changed: 24 additions & 7 deletions
@@ -1,11 +1,28 @@
11
# DABEST-Python
2-
[![Travis CI build status](https://travis-ci.org/ACCLAB/DABEST-python.svg?branch=master)](https://travis-ci.org/ACCLAB/DABEST-python)
2+
<!-- [![Travis CI build status](https://travis-ci.org/ACCLAB/DABEST-python.svg?branch=master)](https://travis-ci.org/ACCLAB/DABEST-python) -->
33
[![minimal Python version](https://img.shields.io/badge/Python%3E%3D-3.6-6666ff.svg)](https://www.anaconda.com/distribution/)
4-
[![PyPI version](https://badge.fury.io/py/dabest.svg)](https://badge.fury.io/py/dabest)
4+
[![PyPI version](https://badge.fury.io/py/dabest.svg?kill_cache=1)](https://badge.fury.io/py/dabest)
55
[![Downloads](https://pepy.tech/badge/dabest/month)](https://pepy.tech/project/dabest/month)
66
[![Free-to-view citation](https://zenodo.org/badge/DOI/10.1038/s41592-019-0470-3.svg)](https://rdcu.be/bHhJ4)
77
[![License](https://img.shields.io/badge/License-BSD%203--Clause--Clear-orange.svg)](https://spdx.org/licenses/BSD-3-Clause-Clear.html)
88

9+
## Version Update
10+
11+
**DABEST v2023.02.14** for Python is now released!
12+
13+
This new version provides the following new features:
14+
15+
1. [**Repeated measures.**](https://acclab.github.io/DABEST-python-docs/repeatedmeasures.html) Augments the prior function for plotting (independent) multiple test groups versus a shared control; it can now do the same for repeated-measures experimental design. Together, these two methods can be used to replace both flavors of the 1-way ANOVA with an estimation analysis.
16+
17+
2. [**Proportional data.**](https://acclab.github.io/DABEST-python-docs/proportion-plot.html) Generates proportional bar plots, proportional differences, and calculates Cohen's h. Also enables plotting Sankey diagrams for paired binary data. This is the estimation graphic equivalent to a bar chart with Fisher's exact test.
18+
19+
3. [**The ∆∆ plot.**](https://acclab.github.io/DABEST-python-docs/deltadelta.html) Calculates the delta-delta (∆∆) for 2 × 2 experimental design and plots the four groups with their relevant effect sizes. This design can be used as a replacement for the 2 × 2 ANOVA.
20+
21+
4. [**Mini-meta.**](https://acclab.github.io/DABEST-python-docs/minimetadelta.html) Calculates and plots a weighted delta (∆) for meta-analysis of experimental replicates. Useful for summarizing data from multiple replicated experiments, for example by different scientists in the same lab or the same scientist at different times. When the observed values are known (and share a common metric), this function makes such meta-analysis convenient.
22+
23+
We recommend all users update to the new version. Please see the [updated documentation](https://acclab.github.io/DABEST-python-docs/) for more details and relevant tutorials.
24+
25+
926
## Contents
1027
<!-- TOC depthFrom:1 depthTo:2 withLinks:1 updateOnSave:1 orderedList:0 -->
1128
- [About](#about)
@@ -32,7 +49,7 @@ An estimation plot has two key features.
3249

3350
2. It presents the effect size as a **bootstrap 95% confidence interval** on a **separate but aligned axes**.
3451

35-
![The five kinds of estimation plots](docs/source/_images/showpiece.png?raw=true "The five kinds of estimation plots.")
52+
![The five kinds of estimation plots](docs/source/_images/showpiece.png "The five kinds of estimation plots.")
3653

3754
DABEST powers [estimationstats.com](https://www.estimationstats.com/), allowing everyone access to high-quality estimation plots.
3855

@@ -78,7 +95,7 @@ iris_dabest = dabest.load(data=iris, x="species", y="petal_width",
7895
# Produce a Cumming estimation plot.
7996
iris_dabest.mean_diff.plot();
8097
```
81-
![A Cumming estimation plot of petal width from the iris dataset](https://github.com/ACCLAB/DABEST-python/blob/master/iris.png)
98+
![A Cumming estimation plot of petal width from the iris dataset](iris.png)
8299

83100
Please refer to the official [tutorial](https://acclab.github.io/DABEST-python-docs/tutorial.html) for more useful code snippets.
84101

@@ -108,11 +125,11 @@ We also have a [Code of Conduct](https://github.com/ACCLAB/DABEST-python/blob/ma
108125
### A wish list for new features
109126
Currently, DABEST offers functions to handle data traditionally analyzed with Student’s paired and unpaired t-tests. It also offers plots for multiplexed versions of these, and the estimation counterpart to a 1-way analysis of variance (ANOVA), the shared-control design. While these five functions execute a large fraction of common biomedical data analyses, there remain three others: 2-way data, time-series group data, and proportional data. We aim to add these new functions to both the R and Python libraries.
110127

111-
● In many experiments, four groups are investigated to isolate an interaction, for example: a genotype × drug effect. Here, wild-type and mutant animals are each subjected to drug or sham treatments; the data are traditionally analysed with a 2×2 ANOVA. We have received requests by email, Twitter, and GitHub to implement an estimation counterpart to the 2-way ANOVA. To do this, we will implement ∆∆ plots, in which the difference of means (∆) of two groups is subtracted from a second two-group ∆.
128+
● In many experiments, four groups are investigated to isolate an interaction, for example: a genotype × drug effect. Here, wild-type and mutant animals are each subjected to drug or sham treatments; the data are traditionally analysed with a 2×2 ANOVA. We have received requests by email, Twitter, and GitHub to implement an estimation counterpart to the 2-way ANOVA. To do this, we will implement ∆∆ plots, in which the difference of means (∆) of two groups is subtracted from a second two-group ∆. **Implemented in v2023.02.14.**
112129

113-
● Currently, DABEST can analyse multiple paired data in a single plot, and multiple groups with a common, shared control. However, a common design in biomedical science is to follow the same group of subjects over multiple, successive time points. An estimation plot for this would combine elements of the two other designs, and could be used in place of a repeated-measures ANOVA.
130+
● Currently, DABEST can analyse multiple paired data in a single plot, and multiple groups with a common, shared control. However, a common design in biomedical science is to follow the same group of subjects over multiple, successive time points. An estimation plot for this would combine elements of the two other designs, and could be used in place of a repeated-measures ANOVA. **Implemented in v2023.02.14**
114131

115-
● We have observed that proportional data are often analyzed in neuroscience and other areas of biomedical research. However, compared to other data types, the charts are frequently impoverished: often, they omit error bars, sample sizes, and even P values—let alone effect sizes. We would like DABEST to feature proportion charts, with error bars and a curve for the distribution of the proportional differences.
132+
● We have observed that proportional data are often analyzed in neuroscience and other areas of biomedical research. However, compared to other data types, the charts are frequently impoverished: often, they omit error bars, sample sizes, and even P values—let alone effect sizes. We would like DABEST to feature proportion charts, with error bars and a curve for the distribution of the proportional differences. **Implemented in v2023.02.14**
116133

117134
We encourage contributions for the above features.
118135
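Two of the effect sizes named in the README changes above can be written down compactly. The sketch below is illustrative only and does not use DABEST's internals: `delta_delta` and `cohens_h` are hypothetical helper names, the ∆∆ follows the README's wording (one two-group ∆ subtracted from a second two-group ∆), and Cohen's h uses the standard arcsine transformation with an assumed test-minus-control sign convention.

```python
import math
import statistics


def delta_delta(control_1, test_1, control_2, test_2):
    """Delta-delta for a 2 x 2 design (hypothetical helper, not dabest API).

    Each delta is a difference of group means; the first delta is
    subtracted from the second.
    """
    delta_1 = statistics.fmean(test_1) - statistics.fmean(control_1)
    delta_2 = statistics.fmean(test_2) - statistics.fmean(control_2)
    return delta_2 - delta_1


def cohens_h(p_control, p_test):
    """Cohen's h for two proportions in [0, 1] (hypothetical helper).

    Assumes a test-minus-control sign convention.
    """
    phi_control = 2 * math.asin(math.sqrt(p_control))
    phi_test = 2 * math.asin(math.sqrt(p_test))
    return phi_test - phi_control
```

For binary 0/1 data such as the `proportional=True` example in the `load` docstring, the proportions passed to `cohens_h` would simply be the group means.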

dabest/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -23,4 +23,4 @@
2323
from ._stats_tools import effsize as effsize
2424
from ._classes import TwoGroupsEffectSize, PermutationTest
2525

26-
__version__ = "0.3.1"
26+
__version__ = "2023.02.14"

dabest/_api.py

Lines changed: 50 additions & 5 deletions
@@ -4,8 +4,10 @@
44
55

66

7-
def load(data, idx, x=None, y=None, paired=False, id_col=None,
8-
ci=95, resamples=5000, random_seed=12345):
7+
def load(data, idx=None, x=None, y=None, paired=None, id_col=None,
8+
ci=95, resamples=5000, random_seed=12345, proportional=False,
9+
delta2 = False, experiment = None, experiment_label = None,
10+
x1_level = None, mini_meta=False):
911
'''
1012
Loads data in preparation for estimation statistics.
1113
@@ -18,10 +20,19 @@ def load(data, idx, x=None, y=None, paired=False, id_col=None,
1820
List of column names (if 'x' is not supplied) or of category names
1921
(if 'x' is supplied). This can be expressed as a tuple of tuples,
2022
with each individual tuple producing its own contrast plot
21-
x : string, default None
23+
x : string or list, default None
24+
Column name(s) of the independent variable. This can be expressed as
25+
a list of 2 elements if and only if 'delta2' is True; otherwise it
26+
can only be a string.
2227
y : string, default None
2328
Column names for data to be plotted on the x-axis and y-axis.
24-
paired : boolean, default False.
29+
paired : string, default None
30+
The type of the experiment under which the data are obtained. If 'paired'
31+
is None then the data will not be treated as paired data in the subsequent
32+
calculations. If 'paired' is 'baseline', then in each tuple of x, other
33+
groups will be paired up with the first group (as control). If 'paired' is
34+
'sequential', then in each tuple of x, each group will be paired up with
35+
its previous group (as control).
2536
id_col : default None.
2637
Required if `paired` is True.
2738
ci : integer, default 95
@@ -34,6 +45,28 @@ def load(data, idx, x=None, y=None, paired=False, id_col=None,
3445
This integer is used to seed the random number generator during
3546
bootstrap resampling, ensuring that the confidence intervals
3647
reported are replicable.
48+
proportional : boolean, default False.
49+
An indicator of whether the data is binary or not. When set to True, it
50+
specifies that the data consists of binary data, where the values are
51+
limited to 0 and 1. The code is not suitable for analyzing proportion
52+
data that contains non-numeric values, such as strings like ‘yes’ and ‘no’.
53+
When False or not provided, the algorithm assumes that
54+
the data is continuous and uses a non-proportional representation.
55+
delta2 : boolean, default False
56+
Indicator of delta-delta experiment
57+
experiment : String, default None
58+
The name of the column of the dataframe which contains the label of
59+
experiments
60+
experiment_label : list, default None
61+
A list of strings specifying the order of subplots for delta-delta plots.
62+
This can be expressed as a list of 2 elements if and only if 'delta2'
63+
is True; otherwise it can only be a string.
64+
x1_level : list, default None
65+
A list of String to specify the order of subplots for delta-delta plots.
66+
This can be expressed as a list of 2 elements if and only if 'delta2'
67+
is True; otherwise it can only be a string.
68+
mini_meta : boolean, default False
69+
Indicator of weighted delta calculation.
3770
3871
Returns
3972
-------
@@ -59,7 +92,19 @@ def load(data, idx, x=None, y=None, paired=False, id_col=None,
5992
6093
>>> my_data = dabest.load(df, idx=("Control 1", "Test 1"))
6194
95+
For proportion plot.
96+
97+
>>> np.random.seed(88888)
98+
>>> N = 10
99+
>>> c1 = np.random.binomial(1, 0.2, size=N)
100+
>>> t1 = np.random.binomial(1, 0.5, size=N)
101+
>>> df = pd.DataFrame({'Control 1' : c1, 'Test 1': t1})
102+
>>> my_data = dabest.load(df, idx=("Control 1", "Test 1"),proportional=True)
103+
104+
105+
62106
'''
63107
from ._classes import Dabest
64108

65-
return Dabest(data, idx, x, y, paired, id_col, ci, resamples, random_seed)
109+
return Dabest(data, idx, x, y, paired, id_col, ci, resamples, random_seed, proportional, delta2, experiment, experiment_label, x1_level, mini_meta)
110+
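The `paired` docstring in this file describes two pairing schemes, 'baseline' and 'sequential'. Below is a minimal sketch of how one `idx` tuple could be turned into (control, test) pairs under each scheme. It assumes that the unpaired case (`paired=None`) also compares every group against the first one as a shared control. `contrast_pairs` is a hypothetical helper written for illustration and is not part of the dabest API.

```python
def contrast_pairs(idx_tuple, paired=None):
    """Return (control, test) name pairs for a single idx tuple.

    Hypothetical helper: mirrors the wording of the `paired` docstring,
    not the library's actual implementation.
    """
    if paired not in (None, "baseline", "sequential"):
        raise ValueError("paired must be None, 'baseline' or 'sequential'")

    if paired == "sequential":
        # Each group is paired with the group immediately before it.
        return [(idx_tuple[i - 1], idx_tuple[i])
                for i in range(1, len(idx_tuple))]

    # 'baseline' (and, by assumption, the unpaired case): every later
    # group is compared against the first group, which acts as control.
    return [(idx_tuple[0], group) for group in idx_tuple[1:]]


groups = ("Control", "Test 1", "Test 2")
baseline_pairs = contrast_pairs(groups, paired="baseline")
sequential_pairs = contrast_pairs(groups, paired="sequential")
```

With three groups, the two schemes differ only in the second contrast: 'baseline' compares "Test 2" with "Control", whereas 'sequential' compares it with "Test 1".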

dabest/_bootstrap_tools.py

Lines changed: 17 additions & 12 deletions
@@ -18,8 +18,13 @@ class bootstrap:
1818
NaNs are automatically discarded.
1919
2020
paired: boolean, default False
21-
Whether or not x1 and x2 are paired samples.
22-
21+
Whether or not x1 and x2 are paired samples. If 'paired' is None then
22+
the data will not be treated as paired data in the subsequent calculations.
23+
If 'paired' is 'baseline', then in each tuple of x, other groups will be
24+
paired up with the first group (as control). If 'paired' is 'sequential',
25+
then in each tuple of x, each group will be paired up with the previous
26+
group (as control).
27+
2328
statfunction: callable, default np.mean
2429
The summary statistic called on data.
2530
@@ -47,8 +52,8 @@ class bootstrap:
4752
Whether or not the summary is the difference between two groups.
4853
If False, only x1 was supplied.
4954
50-
is_paired: boolean
51-
Whether or not the difference reported is between 2 paired groups.
55+
is_paired : string, default None
56+
The type of the experiment under which the data are obtained
5257
5358
statistic: callable
5459
The function used to compute the summary.
@@ -85,19 +90,19 @@ class bootstrap:
8590
8691
pvalue_2samp_ind_ttest: float
8792
P-value obtained from scipy.stats.ttest_ind.
88-
If a single array was given (x1 only), or if `paired` is True,
93+
If a single array was given (x1 only), or if `paired` is not None,
8994
returns 'NIL'.
9095
See https://docs.scipy.org/doc/scipy-1.0.0/reference/generated/scipy.stats.ttest_ind.html
9196
9297
pvalue_2samp_related_ttest: float
9398
P-value obtained from scipy.stats.ttest_rel.
94-
If a single array was given (x1 only), or if `paired` is False,
99+
If a single array was given (x1 only), or if `paired` is None,
95100
returns 'NIL'.
96101
See https://docs.scipy.org/doc/scipy-1.0.0/reference/generated/scipy.stats.ttest_rel.html
97102
98103
pvalue_wilcoxon: float
99104
P-value obtained from scipy.stats.wilcoxon.
100-
If a single array was given (x1 only), or if `paired` is False,
105+
If a single array was given (x1 only), or if `paired` is None,
101106
returns 'NIL'.
102107
The Wilcoxons signed-rank test is a nonparametric paired test of
103108
the null hypothesis that the related samples x1 and x2 are from
@@ -113,7 +118,7 @@ class bootstrap:
113118
114119
'''
115120
def __init__(self, x1, x2=None,
116-
paired=False,
121+
paired=None,
117122
statfunction=None,
118123
smoothboot=False,
119124
alpha_level=0.05,
@@ -155,7 +160,7 @@ def __init__(self, x1, x2=None,
155160
if len(x1) != len(x2):
156161
raise ValueError('x1 and x2 are not the same length.')
157162

158-
if (x2 is None) or (paired is True) :
163+
if (x2 is None) or (paired is not None) :
159164

160165
if x2 is None:
161166
tx = x1
@@ -165,7 +170,7 @@ def __init__(self, x1, x2=None,
165170
ttest_2_paired = 'NIL'
166171
wilcoxonresult = 'NIL'
167172

168-
elif paired is True:
173+
elif paired is not None:
169174
diff = True
170175
tx = x2 - x1
171176
ttest_single = 'NIL'
@@ -188,7 +193,7 @@ def __init__(self, x1, x2=None,
188193
pct_low_high = np.nan_to_num(pct_low_high).astype('int')
189194

190195

191-
elif x2 is not None and paired is False:
196+
elif x2 is not None and paired is None:
192197
diff = True
193198
x2 = pd.Series(x2).dropna()
194199
# Generate statarrays for both arrays.
@@ -268,7 +273,7 @@ def __repr__(self):
268273
else:
269274
stat = self.statistic
270275

271-
diff_types = {True: 'paired', False: 'unpaired'}
276+
diff_types = {'sequential': 'paired', 'baseline': 'paired', None: 'unpaired'}
272277
if self.is_difference:
273278
a = 'The {} {} difference is {}.'.format(diff_types[self.is_paired],
274279
stat, self.summary)
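
For the paired branch above, where `tx = x2 - x1`, the following is a rough sketch of a bootstrap interval on the mean paired difference. It uses only the standard library and a plain percentile interval, an assumption made for brevity that may differ from how the `bootstrap` class computes its limits. `paired_mean_diff_ci` is a hypothetical name; its defaults mirror those of `load` (`ci=95`, `resamples=5000`, `random_seed=12345`).

```python
import random
import statistics


def paired_mean_diff_ci(x1, x2, ci=95, resamples=5000, random_seed=12345):
    """Mean of (x2 - x1) with a percentile bootstrap interval.

    Hypothetical sketch; not the dabest implementation.
    """
    if len(x1) != len(x2):
        raise ValueError('x1 and x2 are not the same length.')

    # Paired data: work on the element-wise differences.
    diffs = [b - a for a, b in zip(x1, x2)]

    # Seeded generator so the interval is reproducible.
    rng = random.Random(random_seed)
    boot_means = sorted(
        statistics.fmean(rng.choices(diffs, k=len(diffs)))
        for _ in range(resamples)
    )

    # Percentile interval: cut off alpha/2 from each tail.
    alpha = (100 - ci) / 100
    lower_idx = int(resamples * alpha / 2)
    upper_idx = min(resamples - 1, int(resamples * (1 - alpha / 2)))
    return statistics.fmean(diffs), boot_means[lower_idx], boot_means[upper_idx]


before = [1, 2, 3, 4, 5]
after = [2, 3, 5, 5, 7]
mean_diff, ci_low, ci_high = paired_mean_diff_ci(before, after)
```

Because the generator is seeded, repeated calls with the same inputs return the same interval, which is the purpose the `random_seed` docstring describes.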
