Skip to content

Commit dd81cf5

Browse files
authored
Merge pull request #426 from jvdp1/median
Addition of a subroutine to compute the median of array elements
2 parents 88e2219 + a13c700 commit dd81cf5

10 files changed

+779
-8
lines changed

Diff for: doc/specs/stdlib_stats.md

+94
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,10 @@ The Pearson correlation between two rows (or columns), say `x` and `y`, of `arra
2626

2727
`result = [[stdlib_stats(module):corr(interface)]](array, dim [, mask])`
2828

29+
### Class
30+
31+
Generic subroutine
32+
2933
### Arguments
3034

3135
`array`: Shall be a rank-1 or a rank-2 array of type `integer`, `real`, or `complex`.
@@ -83,6 +87,10 @@ The scaling can be changed with the logical argument `corrected`. If `corrected`
8387

8488
`result = [[stdlib_stats(module):cov(interface)]](array, dim [, mask [, corrected]])`
8589

90+
### Class
91+
92+
Generic subroutine
93+
8694
### Arguments
8795

8896
`array`: Shall be a rank-1 or a rank-2 array of type `integer`, `real`, or `complex`.
@@ -134,6 +142,10 @@ Returns the mean of all the elements of `array`, or of the elements of `array` a
134142

135143
`result = [[stdlib_stats(module):mean(interface)]](array, dim [, mask])`
136144

145+
### Class
146+
147+
Generic subroutine
148+
137149
### Arguments
138150

139151
`array`: Shall be an array of type `integer`, `real`, or `complex`.
@@ -166,6 +178,80 @@ program demo_mean
166178
end program demo_mean
167179
```
168180

181+
## `median` - median of array elements
182+
183+
### Status
184+
185+
Experimental
186+
187+
### Description
188+
189+
Returns the median of all the elements of `array`, or of the elements of `array`
190+
along dimension `dim` if provided, and if the corresponding element in `mask` is
191+
`true`.
192+
193+
The median of the elements of `array` is defined as the "middle"
194+
element, after that the elements are sorted in an increasing order, e.g. `array_sorted =
195+
sort(array)`. If `n = size(array)` is an even number, the median is:
196+
197+
```
198+
median(array) = array_sorted( floor( (n + 1) / 2.))
199+
```
200+
201+
and if `n` is an odd number, the median is:
202+
203+
```
204+
median(array) = mean( array_sorted( floor( (n + 1) / 2.):floor( (n + 1) / 2.) + 1 ) )
205+
```
206+
207+
The current implementation is a quite naive implementation that relies on sorting
208+
the whole array, using the subroutine `[[stdlib_sorting(module):ord_sort(interface)]]`
209+
provided by the `[[stdlib_sorting(module)]]` module.
210+
211+
### Syntax
212+
213+
`result = [[stdlib_stats(module):median(interface)]](array [, mask])`
214+
215+
`result = [[stdlib_stats(module):median(interface)]](array, dim [, mask])`
216+
217+
### Class
218+
219+
Generic subroutine
220+
221+
### Arguments
222+
223+
`array`: Shall be an array of type `integer` or `real`.
224+
225+
`dim`: Shall be a scalar of type `integer` with a value in the range from 1 to `n`, where `n` is the rank of `array`.
226+
227+
`mask` (optional): Shall be of type `logical` and either a scalar or an array of the same shape as `array`.
228+
229+
### Return value
230+
231+
If `array` is of type `real`, the result is of type `real` with the same kind as `array`.
232+
If `array` is of type `real` and contains IEEE `NaN`, the result is IEEE `NaN`.
233+
If `array` is of type `integer`, the result is of type `real(dp)`.
234+
235+
If `dim` is absent, a scalar with the median of all elements in `array` is returned. Otherwise, an array of rank `n-1`, where `n` equals the rank of `array`, and a shape similar to that of `array` with dimension `dim` dropped is returned.
236+
237+
If `mask` is specified, the result is the median of all elements of `array` corresponding to `true` elements of `mask`. If every element of `mask` is `false`, the result is IEEE `NaN`.
238+
239+
240+
### Example
241+
242+
```fortran
243+
program demo_median
244+
use stdlib_stats, only: median
245+
implicit none
246+
real :: x(1:6) = [ 1., 2., 3., 4., 5., 6. ]
247+
real :: y(1:2, 1:3) = reshape([ 1., 2., 3., 4., 5., 6. ], [ 2, 3])
248+
print *, median(x) !returns 3.5
249+
print *, median(y) !returns 3.5
250+
print *, median(y, 1) !returns [ 1.5, 3.5, 5.5 ]
251+
print *, median(y, 1,y > 3.) !returns [ NaN, 4.0, 5.5 ]
252+
end program demo_median
253+
```
254+
169255
## `moment` - central moments of array elements
170256

171257
### Status
@@ -199,6 +285,10 @@ The _k_-th order moment about `center` is defined as :
199285

200286
`result = [[stdlib_stats(module):moment(interface)]](array, order, dim [, center [, mask]])`
201287

288+
### Class
289+
290+
Generic subroutine
291+
202292
### Arguments
203293

204294
`array`: Shall be an array of type `integer`, `real`, or `complex`.
@@ -264,6 +354,10 @@ The use of the term `n-1` for scaling is called Bessel 's correction. The scalin
264354

265355
`result = [[stdlib_stats(module):var(interface)]](array, dim [, mask [, corrected]])`
266356

357+
### Class
358+
359+
Generic subroutine
360+
267361
### Arguments
268362

269363
`array`: Shall be an array of type `integer`, `real`, or `complex`.

Diff for: src/CMakeLists.txt

+1
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,7 @@ set(fppFiles
1919
stdlib_stats_corr.fypp
2020
stdlib_stats_cov.fypp
2121
stdlib_stats_mean.fypp
22+
stdlib_stats_median.fypp
2223
stdlib_stats_moment.fypp
2324
stdlib_stats_moment_all.fypp
2425
stdlib_stats_moment_mask.fypp

Diff for: src/Makefile.manual

+6
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,7 @@ SRCFYPP =\
1919
stdlib_stats_corr.fypp \
2020
stdlib_stats_cov.fypp \
2121
stdlib_stats_mean.fypp \
22+
stdlib_stats_median.fypp \
2223
stdlib_stats_moment.fypp \
2324
stdlib_stats_moment_all.fypp \
2425
stdlib_stats_moment_mask.fypp \
@@ -121,6 +122,11 @@ stdlib_stats_mean.o: \
121122
stdlib_optval.o \
122123
stdlib_kinds.o \
123124
stdlib_stats.o
125+
stdlib_stats_median.o: \
126+
stdlib_optval.o \
127+
stdlib_kinds.o \
128+
stdlib_sorting.o \
129+
stdlib_stats.o
124130
stdlib_stats_moment.o: \
125131
stdlib_optval.o \
126132
stdlib_kinds.o \

Diff for: src/common.fypp

+63
Original file line numberDiff line numberDiff line change
@@ -156,4 +156,67 @@ ${prefix + joinstr.join([line.strip() for line in txt.split("\n")]) + suffix}$
156156
#:endcall
157157
#:enddef
158158

159+
#!
160+
#! Generates an array rank suffix for subarrays along a dimension
161+
#!
162+
#! Args:
163+
#! varname (str): Name of the variable to be used as origin
164+
#! rank (int): Rank of the original variable
165+
#! dim (int): Dimension of the variable
166+
#!
167+
#! Returns:
168+
#! Array rank suffix string enclosed in braces
169+
#!
170+
#! E.g.,
171+
#! select_subvector('j', 5, 2)
172+
#! -> (j1, :, j3, j4, j5)
173+
#!
174+
#! Used, e.g., in
175+
#! stdlib_stats_median.fypp
176+
#!
177+
#:def select_subvector(varname, rank, idim)
178+
#:assert rank > 0
179+
#:call join_lines(joinstr=", ", prefix="(", suffix=")")
180+
#:for i in range(1, idim)
181+
${varname}$${i}$
182+
#:endfor
183+
:
184+
#:for i in range(idim + 1, rank + 1)
185+
${varname}$${i}$
186+
#:endfor
187+
#:endcall
188+
#:enddef
189+
190+
#!
191+
#! Generates an array rank suffix for arrays
192+
#!
193+
#! Args:
194+
#! varname (str): Name of the variable to be used as origin
195+
#! rank (int): Rank of the original array variable
196+
#! idim (int): Dimension of the variable dropped
197+
#!
198+
#! Returns:
199+
#! Array rank suffix string enclosed in braces
200+
#!
201+
#! E.g.,
202+
#! reduce_subvector('j', 5, 2)
203+
#! -> (j1, j3, j4, j5)
204+
#!
205+
#! Used, e.g., in
206+
#! stdlib_stats_median.fypp
207+
#!
208+
#:def reduce_subvector(varname, rank, idim)
209+
#:assert rank > 0
210+
#:if rank > 1
211+
#:call join_lines(joinstr=", ", prefix="(", suffix=")")
212+
#:for i in range(1, idim)
213+
${varname}$${i}$
214+
#:endfor
215+
#:for i in range(idim + 1, rank + 1)
216+
${varname}$${i}$
217+
#:endfor
218+
#:endcall
219+
#:endif
220+
#:enddef
221+
159222
#:endmute

Diff for: src/stdlib_stats.fypp

+61-6
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,7 @@
22
#:set RANKS = range(1, MAXRANK + 1)
33
#:set REDRANKS = range(2, MAXRANK + 1)
44
#:set RC_KINDS_TYPES = REAL_KINDS_TYPES + CMPLX_KINDS_TYPES
5+
#:set IR_KINDS_TYPES_OUTPUT = list(zip(INT_KINDS,INT_TYPES, ['dp']*len(INT_KINDS))) + list(zip(REAL_KINDS, REAL_TYPES, REAL_KINDS))
56
module stdlib_stats
67
!! Provides support for various statistical methods. This includes currently
78
!! descriptive statistics
@@ -11,14 +12,14 @@ module stdlib_stats
1112
implicit none
1213
private
1314
! Public API
14-
public :: corr, cov, mean, moment, var
15+
public :: corr, cov, mean, median, moment, var
1516

1617

1718
interface corr
1819
!! version: experimental
1920
!!
2021
!! Pearson correlation of array elements
21-
!! ([Specification](../page/specs/stdlib_stats.html#description))
22+
!! ([Specification](../page/specs/stdlib_stats.html#corr-pearson-correlation-of-array-elements))
2223
#:for k1, t1 in RC_KINDS_TYPES
2324
#:set RName = rname("corr",1, t1, k1)
2425
module function ${RName}$(x, dim, mask) result(res)
@@ -110,7 +111,7 @@ module stdlib_stats
110111
!! version: experimental
111112
!!
112113
!! Covariance of array elements
113-
!! ([Specification](../page/specs/stdlib_stats.html#description_1))
114+
!! ([Specification](../page/specs/stdlib_stats.html#cov-covariance-of-array-elements))
114115
#:for k1, t1 in RC_KINDS_TYPES
115116
#:set RName = rname("cov",1, t1, k1)
116117
module function ${RName}$(x, dim, mask, corrected) result(res)
@@ -209,7 +210,7 @@ module stdlib_stats
209210
!! version: experimental
210211
!!
211212
!! Mean of array elements
212-
!! ([Specification](../page/specs/stdlib_stats.html#description_2))
213+
!! ([Specification](../page/specs/stdlib_stats.html#mean-mean-of-array-elements))
213214
#:for k1, t1 in RC_KINDS_TYPES
214215
#:for rank in RANKS
215216
#:set RName = rname("mean_all",rank, t1, k1)
@@ -305,11 +306,65 @@ module stdlib_stats
305306
end interface mean
306307

307308

309+
interface median
310+
!! version: experimental
311+
!!
312+
!! Median of array elements
313+
!! ([Specification](../page/specs/stdlib_stats.html#median-median-of-array-elements))
314+
#:for k1, t1, o1 in IR_KINDS_TYPES_OUTPUT
315+
#:for rank in RANKS
316+
#:set name = rname("median_all",rank, t1, k1, o1)
317+
module function ${name}$ (x, mask) result(res)
318+
${t1}$, intent(in) :: x${ranksuffix(rank)}$
319+
logical, intent(in), optional :: mask
320+
real(${o1}$) :: res
321+
end function ${name}$
322+
#:endfor
323+
#:endfor
324+
325+
#:for k1, t1, o1 in IR_KINDS_TYPES_OUTPUT
326+
#:for rank in RANKS
327+
#:set name = rname("median",rank, t1, k1, o1)
328+
module function ${name}$(x, dim, mask) result(res)
329+
${t1}$, intent(in) :: x${ranksuffix(rank)}$
330+
integer, intent(in) :: dim
331+
logical, intent(in), optional :: mask
332+
real(${o1}$) :: res${reduced_shape('x', rank, 'dim')}$
333+
end function ${name}$
334+
#:endfor
335+
#:endfor
336+
337+
#:for k1, t1, o1 in IR_KINDS_TYPES_OUTPUT
338+
#:for rank in RANKS
339+
#:set name = rname('median_all_mask',rank, t1, k1, o1)
340+
module function ${name}$(x, mask) result(res)
341+
${t1}$, intent(in) :: x${ranksuffix(rank)}$
342+
logical, intent(in) :: mask${ranksuffix(rank)}$
343+
real(${o1}$) :: res
344+
end function ${name}$
345+
#:endfor
346+
#:endfor
347+
348+
#:for k1, t1, o1 in IR_KINDS_TYPES_OUTPUT
349+
#:for rank in RANKS
350+
#:set name = rname('median_mask',rank, t1, k1, o1)
351+
module function ${name}$(x, dim, mask) result(res)
352+
${t1}$, intent(in) :: x${ranksuffix(rank)}$
353+
integer, intent(in) :: dim
354+
logical, intent(in) :: mask${ranksuffix(rank)}$
355+
real(${o1}$) :: res${reduced_shape('x', rank, 'dim')}$
356+
end function ${name}$
357+
#:endfor
358+
#:endfor
359+
360+
end interface
361+
362+
308363
interface var
309364
!! version: experimental
310365
!!
311366
!! Variance of array elements
312-
!! ([Specification](../page/specs/stdlib_stats.html#description_4))
367+
!! ([Specification](../page/specs/stdlib_stats.html#var-variance-of-array-elements))
313368

314369
#:for k1, t1 in RC_KINDS_TYPES
315370
#:for rank in RANKS
@@ -418,7 +473,7 @@ module stdlib_stats
418473
!! version: experimental
419474
!!
420475
!! Central moment of array elements
421-
!! ([Specification](../page/specs/stdlib_stats.html#description_3))
476+
!! ([Specification](../page/specs/stdlib_stats.html#moment-central-moments-of-array-elements))
422477
#:for k1, t1 in RC_KINDS_TYPES
423478
#:for rank in RANKS
424479
#:set RName = rname("moment_all",rank, t1, k1)

0 commit comments

Comments
 (0)