Commit 5a81476

Merge pull request #44 from fastverse/development
Development
2 parents 48bf1eb + c7e6efb commit 5a81476

19 files changed: +858 -450 lines changed

.Rbuildignore

Lines changed: 5 additions & 0 deletions
@@ -2,3 +2,8 @@
 ^\.appveyor\.yml$
 ^README\.md$
 LICENSE
+^.*\.Rproj$
+^\.Rproj\.user$
+^_pkgdown\.yml$
+^docs$
+^pkgdown$
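
Each `.Rbuildignore` line is a Perl-like regular expression that `R CMD build` matches case-insensitively against paths relative to the package root. The sketch below only approximates that matching in base R so the new patterns can be checked by hand; `is_ignored` is a hypothetical helper, not part of R or of this commit, and it does not model how excluding a directory also excludes its contents.

```r
# Patterns copied from the .Rbuildignore hunk above.
patterns <- c("^\\.appveyor\\.yml$", "^README\\.md$", "LICENSE",
              "^.*\\.Rproj$", "^\\.Rproj\\.user$", "^_pkgdown\\.yml$",
              "^docs$", "^pkgdown$")

# Hypothetical helper: TRUE when at least one pattern matches the path.
is_ignored <- function(path, patterns) {
  hits <- vapply(patterns, grepl, logical(1), x = path,
                 perl = TRUE, ignore.case = TRUE)
  any(hits)
}

# Paths chosen for illustration only.
paths <- c("kit.Rproj", "_pkgdown.yml", "docs", "R/docs.R",
           "inst/LICENSE.note", "DESCRIPTION")
result <- vapply(paths, is_ignored, logical(1), patterns = patterns)
print(result)
```

The unanchored `LICENSE` pattern matches any path that contains that string, whereas the anchored `^docs$` pattern matches only the top-level `docs` entry and leaves `R/docs.R` alone.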

.github/workflows/pkgdown.yaml

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples
+# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help
+on:
+  push:
+    branches: [main, master]
+  pull_request:
+  release:
+    types: [published]
+  workflow_dispatch:
+
+name: pkgdown.yaml
+
+permissions: read-all
+
+jobs:
+  pkgdown:
+    runs-on: ubuntu-latest
+    # Only restrict concurrency for non-PR jobs
+    concurrency:
+      group: pkgdown-${{ github.event_name != 'pull_request' || github.run_id }}
+    env:
+      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
+    permissions:
+      contents: write
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: r-lib/actions/setup-pandoc@v2
+
+      - uses: r-lib/actions/setup-r@v2
+        with:
+          use-public-rspm: true
+
+      - uses: r-lib/actions/setup-r-dependencies@v2
+        with:
+          extra-packages: any::pkgdown, local::.
+          needs: website
+
+      - name: Build site
+        run: pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE)
+        shell: Rscript {0}
+
+      - name: Deploy to GitHub pages 🚀
+        if: github.event_name != 'pull_request'
+        uses: JamesIves/[email protected]
+        with:
+          clean: false
+          branch: gh-pages
+          folder: docs

DESCRIPTION

Lines changed: 4 additions & 1 deletion
@@ -8,10 +8,13 @@ Authors@R: c(person("Morgan", "Jacob", role = c("aut", "cre", "cph"), email = "m
 Author: Morgan Jacob [aut, cre, cph], Sebastian Krantz [ctb]
 Maintainer: Morgan Jacob <[email protected]>
 Description: Basic functions, implemented in C, for large data manipulation. Fast vectorised ifelse()/nested if()/switch() functions, psum()/pprod() functions equivalent to pmin()/pmax() plus others which are missing from base R. Most of these functions are callable at C level.
+URL: https://fastverse.github.io/kit/, https://github.com/fastverse/kit
 License: GPL-3
 Depends: R (>= 3.1.0)
+Suggests: knitr, rmarkdown
+VignetteBuilder: knitr
 Encoding: UTF-8
-BugReports: https://github.com/2005m/kit/issues
+BugReports: https://github.com/fastverse/kit/issues
 NeedsCompilation: yes
 ByteCompile: TRUE
 Repository: CRAN
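
DESCRIPTION is a Debian control file (DCF), so the fields touched by this hunk can be read back with base R's `read.dcf`. This is a minimal sketch that parses an excerpt copied from the diff rather than the real file; the variable names are illustrative only.

```r
# Excerpt of the fields added or changed in the hunk above.
description_text <- c(
  "URL: https://fastverse.github.io/kit/, https://github.com/fastverse/kit",
  "Suggests: knitr, rmarkdown",
  "VignetteBuilder: knitr",
  "BugReports: https://github.com/fastverse/kit/issues"
)

# read.dcf accepts a connection and returns a character matrix
# with one column per field.
con <- textConnection(description_text)
desc <- read.dcf(con)
close(con)

# Comma-separated fields are split into their individual entries.
urls <- strsplit(desc[1, "URL"], ",\\s*")[[1]]
suggests <- strsplit(desc[1, "Suggests"], ",\\s*")[[1]]

print(urls)
print(suggests)
print(desc[1, "BugReports"])
```

Against a checked-out package, the same call would take the path to the DESCRIPTION file in place of the text connection.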

NEWS.md

Lines changed: 254 additions & 0 deletions
@@ -0,0 +1,254 @@
+# kit 0.0.20 <small>(2025-04-17)</small>
+
+### Notes
+
+- Update copyright date in C files
+
+- Fix note on CRAN regarding Rf_isFrame
+
+# kit 0.0.19 <small>(2024-09-07)</small>
+
+### Bug Fixes
+
+- Fix multiple warnings in C code.
+
+# kit 0.0.18 <small>(2024-06-06)</small>
+
+### Bug Fixes
+
+- Fix `iif` tests for new version of R.
+
+# kit 0.0.17 <small>(2024-05-03)</small>
+
+### Bug Fixes
+
+- Fix `nswitch`. Thanks to Sebastian Krantz for raising an issue.
+
+### Notes
+
+- Update copyright date in C files
+
+- Fix note on CRAN regarding SETLENGTH
+
+# kit 0.0.16 <small>(2024-03-01)</small>
+
+### Notes
+
+- Check if `"kit.nThread"` is defined before setting it to `1L`
+
+# kit 0.0.15 <small>(2023-10-01)</small>
+
+### Notes
+
+- Correct typo in configure file
+
+# kit 0.0.14 <small>(2023-08-12)</small>
+
+### Notes
+
+- Update configure file to extend support for GCC
+
+- Correct warnings in NEWS.Rd (strong)
+
+- Correct typo in funique.Rd thanks to @davidbudzynski
+
+# kit 0.0.13 <small>(2023-02-24)</small>
+
+### Notes
+
+- Function `pprod` now returns double output even if inputs are integer - in line with `base::prod` - to avoid integer overflows.
+
+- Update configure file
+
+# kit 0.0.12 <small>(2022-10-26)</small>
+
+### New Features
+
+- Function `pcountNA` is equivalent to `pcount(..., value = NA)`.
+
+- Functions `pcountNA` and `pcount(..., value = NA)` allow `NA` counting with mixed data types (including `data.frame`). `pcountNA` also supports list-vectors as inputs and counts empty or `NULL` elements as `NA`.
+
+- Functions `panyv`, `panyNA`, `pallv` and `pallNA` are added as efficient wrappers around `pcount` and `pcountNA`. They are parallel equivalents of scalar functions `base::anyNA` and `anyv`, `allv` and `allNA` in the 'collapse' R package.
+
+- Functions `pfirst` and `plast` are added to efficiently obtain the row-wise first and last non-missing value or non-empty element of lists. They are parallel equivalents to the (column-wise) `ffirst` and `flast` functions in the 'collapse' R package. Implemented by @SebKrantz.
+
+- Functions `psum/pprod/pmean` also support logical vectors as input. Implemented by @SebKrantz.
+
+### Bug Fixes
+
+- Function `charToFact` was not returning proper results. Thanks to @alex-raw for raising an issue.
+
+### Notes
+
+- Function `pprod` now returns double output even if inputs are integer - in line with `base::prod` - to avoid integer overflows.
+
+- C compiler warnings on CRAN R-devel caused by compilation with -Wstrict-prototypes are now fixed. Declaration of functions without prototypes is deprecated in all versions of C. Thanks to Sebastian Krantz for the PR.
+
+# kit 0.0.11 <small>(2022-03-19)</small>
+
+### New Features
+
+- Function `pcount` now supports data.frame.
+
+### Bug Fixes
+
+- Function `pcount` now works with specific NA values, i.e. NA_real_, NA_character_ etc.
+
+# kit 0.0.10 <small>(2021-11-28)</small>
+
+### New Features
+
+- Functions `psum`, `pmean`, `pprod`, `pany` and `pall` now support lists. Thanks to Sebastian Krantz for the request and code suggestion.
+
+### Bug Fixes
+
+- Function `topn` should now work for ALTREP objects. Thanks to @ben-schwen for raising an issue.
+
+# kit 0.0.9 <small>(2021-09-12)</small>
+
+### Notes
+
+- Re-organise header to prevent compilation errors with new version of Clang due to conflicts between R C headers and OpenMP.
+
+# kit 0.0.8 <small>(2021-08-21)</small>
+
+### New Features
+
+- Function `funique` now preserves the attributes if the input is a `data.table`, `tibble` or similar object. Thanks to Sebastian Krantz for the request.
+
+- Function `topn` now defaults to base R `order` for large values of `n`. Please see the updated documentation for more information: `?kit::topn`.
+
+- Function `charToFact` gains a new argument `addNA=TRUE` to be used to include (or not) `NA` in levels of the output.
+
+- Functions `shareData`, `getData` and `clearData` are implemented to share data objects between R sessions. These functions are experimental and might change in the future. Feedback is welcome. Please see `?kit::shareData` for more information.
+
+### Notes
+
+- A few `calloc` calls at C level have been replaced by the R C API function `Calloc` to avoid valgrind errors/warnings in Travis CI.
+
+- Errors reported by `rchk` on CRAN have been fixed.
+
+# kit 0.0.7 <small>(2021-03-07)</small>
+
+### New Features
+
+- Function `charToFact` gains a new argument `decreasing=FALSE` to be used to order levels of the output in decreasing or increasing order.
+
+- Function `topn` gains a new argument `index=TRUE` to be used to return the index (`TRUE`) or the values (`FALSE`) of the input vector.
+
+### Bug Fixes
+
+- CRAN reported memory access errors detected with valgrind and AddressSanitizer. An attempt to fix these errors has been submitted as part of this package version. It also seems that these same errors were causing some tests to fail for `funique` and `psort` on some platforms.
+
+### Notes
+
+- Functions `pmean`, `pprod` and `psum` will result in an error if used with factors. Documentation has been updated.
+
+# kit 0.0.6 <small>(2021-02-21)</small>
+
+### New Features
+
+- Functions `funique` and `fduplicated` gain an additional argument `fromLast=FALSE` to indicate whether the search should start from the end or beginning [PR#11](https://github.com/2005m/kit/pull/11).
+
+- Functions `pall`, `pany`, `pmean`, `pprod` and `psum` accept `data.frame` as input [PR#15](https://github.com/2005m/kit/pull/15). Please see documentation for more information.
+
+- Function `charToFact` is equivalent to base R `as.factor` but is much quicker and only converts character vectors to factors. Note that it is parallelised. For more details and benchmarks please see `?kit::charToFact`.
+
+- Function `psort` is experimental and equivalent to base R `sort` but is only for character vectors. It can sort by "C locale" or by "R session locale". For more details and benchmarks please see `?kit::psort`.
+
+### Notes
+
+- A few OpenMP directives were missing for functions `vswitch` and `nswitch` for character vectors. These have been added in [PR#12](https://github.com/2005m/kit/pull/12).
+
+- Function `funique` was not preserving attributes for character, logical and complex vectors/data.frames. Thanks to Sebastian Krantz (@SebKrantz) for bringing that to my attention. This has been fixed in [PR#13](https://github.com/2005m/kit/pull/13).
+
+- Functions `funique` and `uniqLen` should now be faster for `factor` and `logical` vectors [PR#14](https://github.com/2005m/kit/pull/14).
+
+# kit 0.0.5 <small>(2020-11-21)</small>
+
+### New Features
+
+- Function `uniqLen(x)` is equivalent to base R `length(unique(x))` and `uniqueN` in package [data.table](https://CRAN.R-project.org/package=data.table). Function `uniqLen`, implemented in C, supports vectors, `data.frame` and `matrix`. It should be faster than these functions. For more details and benchmarks please see `?kit::uniqLen`.
+
+- Function `vswitch` now supports mixed encoding and gains an additional argument `checkEnc=TRUE`. Thanks to Xianying Tan (@shrektan) for the request and review [PR#7](https://github.com/2005m/kit/pull/7).
+
+- Function `nswitch` is a nested version of function `vswitch` and also supports mixed encoding. Please see `?kit::nswitch` for further details. Thanks to Xianying Tan (@shrektan) for the request and review [PR#10](https://github.com/2005m/kit/pull/10).
+
+### Notes
+
+- Small algorithmic improvement for functions `fduplicated`, `funique` and `countOccur` for `vectors`, `data.frame` and `matrix`.
+
+- A tests folder has been added to the source package to track coverage and bugs.
+
+### C-Level Facilities
+
+- Function `nif` has been split into two distinct functions at C level: one evaluates its arguments lazily and is for R users, and the other (nifInternalR) is not lazy and is intended for usage at C level.
+
+# kit 0.0.4 <small>(2020-07-21)</small>
+
+### New Features
+
+- Function `countOccur(x)`, implemented in C, is comparable to `base` R function `table`. It returns a `data.frame` and is between 3 and 50 times faster. For more details, please see `?kit::countOccur`.
+
+- Functions `funique` and `fduplicated` now support matrices. Additionally, these two functions should also have better performance compared to the previous release.
+
+- Function `topn` has an additional argument `hasna=TRUE` to indicate whether the data contains `NA` values or not. If the data does not contain `NA` values, the function should be faster.
+
+### C-Level Facilities
+
+- A few C functions have been added to subset `data.frame` and `matrix` as well as do other operations. These functions are not exported or visible to the user but might become available and callable at C level in the future.
+
+### Bug Fixes
+
+- Function `fpos` was not properly handling `NaN` and `NA` for complex and double. This should now be fixed. The function has also been changed so that a vector is returned when the 'needle' and 'haystack' are vectors.
+
+- Functions `funique` and `fduplicated` were not properly handling data containing `POSIX` data. This has now been fixed.
+
+# kit 0.0.3 <small>(2020-06-21)</small>
+
+### New Features
+
+- Functions `fduplicated(x)` and `funique(x)`, implemented in C, are comparable to `base` R functions `duplicated` and `unique`. For more details, please see `?kit::funique`.
+
+- Functions `psum` and `pprod` now have better performance for type double and complex.
+
+### Bug Fixes
+
+- Function `count(x, y)` now checks that `x` and `y` have the same class and levels. So does `pcount`.
+
+- Function `pmean` was not callable at C level because of a typo. This is now fixed.
+
+# kit 0.0.2 <small>(2020-05-22)</small>
+
+### New Features
+
+- Function `count(x, value)`, implemented in C, to simply count the number of times an element `value` occurs in a vector or in a list `x`. For more details, please see `?kit::count`.
+
+- Functions `pmean(..., na.rm=FALSE)`, `pall(..., na.rm=FALSE)`, `pany(..., na.rm=FALSE)` and `pcount(..., value)`, implemented in C, are similar to the already available functions `psum` and `pprod`. These functions respectively apply base R functions `mean`, `all` and `any` element-wise. For more details, benchmarks and help, please see `?kit::pmean`.
+
+### Bug Fixes
+
+- Fix Solaris Unicode warnings for NEWS file. Benchmarks have been moved from the NEWS file to each function Rd file.
+
+- Fix some `NA` edge cases for `pprod` and `psum` so these functions behave more like base R functions `prod` and `sum`.
+
+- Fix installation errors for versions of R (< 3.5.0).
+
+# kit 0.0.1 <small>(2020-05-03)</small>
+
+### Initial Release
+
+- Function `fpos(needle, haystack, all=TRUE, overlap=TRUE)`, implemented in C, is inspired by base function `which` when used in the following form `which(x == y, arr.ind =TRUE)`. Function `fpos` returns the index(es) or position(s) of a matrix/vector within a larger matrix/vector. Please see `?kit::fpos` for more details.
+
+- Function `iif(test, yes, no, na=NULL, tprom=FALSE, nThread=getOption("kit.nThread"))`, originally contributed as `fifelse` in package [data.table](https://CRAN.R-project.org/package=data.table), was moved to package kit to be developed independently. Unlike the current version of `fifelse`, `iif` allows type promotion like base function `ifelse`. For further details about the differences with `fifelse`, as well as `hutils::if_else` and `dplyr::if_else`, please see `?kit::iif`.
+
+- Function `nif(..., default=NULL)`, implemented in C, is inspired by *SQL CASE WHEN*. It is comparable to [dplyr](https://CRAN.R-project.org/package=dplyr) function `case_when`; however, it evaluates its arguments in a lazy way (i.e. only when needed). Function `nif` was originally contributed as function `fcase` in the [data.table](https://CRAN.R-project.org/package=data.table) package but then moved to package kit so its development may resume independently. Please see `?kit::nif` for more details.
+
+- Functions `pprod(..., na.rm=FALSE)` and `psum(..., na.rm=FALSE)`, implemented in C, are inspired by base functions `pmin` and `pmax`. These new functions work only for integer, double and complex types and do not recycle vectors. Please see `?kit::psum` for more details.
+
+- Function `setlevels(x, old, new, skip_absent=FALSE)`, implemented in C, may be used to set levels of a factor object. Please see `?kit::setlevels` for more details.
+
+- Function `topn(vec, n=6L, decreasing=TRUE)`, implemented in C, returns the top largest or smallest `n` values for a given numeric vector `vec`. It is inspired by `dplyr::top_n` and equivalent to base functions `order` and `sort` in specific cases as shown in the documentation. Please see `?kit::topn` for more details.
+
+- Function `vswitch(x, values, outputs, default=NULL, nThread=getOption("kit.nThread"))`, implemented in C, is a vectorised version of `base` R function `switch`. This function can also be seen as a particular case of function `nif`. Please see `?kit::vswitch` for more details.
+
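
The NEWS entries above describe `psum` as an element-wise sum across vectors and explain that `pprod` returns double to avoid integer overflow. The sketch below illustrates both ideas in base R only. `psum_ref` is a hypothetical reference helper, not the package's C implementation, and unlike the documented `psum` it inherits base R's recycling rules. The optional comparison with `kit` runs only if the package is installed.

```r
# Hypothetical base R reference: add the input vectors element by element.
psum_ref <- function(...) Reduce(`+`, list(...))

x <- c(1L, 2L, 3L)
y <- c(10L, 20L, 30L)
row_sums <- psum_ref(x, y)
print(row_sums)

# Integer multiplication overflows past .Machine$integer.max and yields NA
# (with a warning, suppressed here), whereas base::prod returns a double.
big <- 100000L
int_product <- suppressWarnings(big * big)
dbl_product <- prod(big, big)
print(int_product)
print(dbl_product)

# Optional side-by-side check, skipped when kit is not available.
if (requireNamespace("kit", quietly = TRUE)) {
  print(kit::psum(x, y))
  print(kit::pprod(big, big))
}
```

Returning double from a product trades exact integer typing for a much larger representable range, which is the behaviour the 0.0.13 note aligns with `base::prod`.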
