Skip to content

Style roxygen code examples #381

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

lorenzwalthert
Copy link
Collaborator

@lorenzwalthert lorenzwalthert commented Mar 24, 2018

  • This is WIP with high-level template functions and a suggestion for the hierarchical structure.
  • I think having a text and path interface for identify start and stop is convenient (identify_start_and_stop_of_royxgen_examples_from_text()), at least as long as we are in the experimental state.
  • I also added a test structure for the identification part and numbered test files, which is a friction to other tests in the repo but I think it can help to quickly identify a failing combinations of the elements we want to test. We can possibly remove the numbers eventually.
  • I used long function names on purpose to be very verbose.

To do:

  • completing function templates.
  • more tests for identification of start / stop plus unit test for other functionality this PR provides.
  • Documentation.
  • namespace handling.
  • consider relocating functions in R/utils.R to separate files (like string-manipulation.R, string-creation.R)

@lorenzwalthert lorenzwalthert force-pushed the style-roxygen-code-examples branch from 6e64394 to 23f2d8e Compare March 24, 2018 23:16

plain_examples <- map(start_stop_sequences, remove_roxygen_mask,
text = masked_examples
)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oups this seems to need mapping over both sequences anybody masked examples (which is a list with each element corresponding to one example).

@lorenzwalthert
Copy link
Collaborator Author

As noted in #332, we should aim for consistency with styling of Rmd files. In particular, styling has to happen inside transform_code because all other approaches fail to identify (and communicate to the end user) in absence of a change in the code.
We could just call style_text() on the identified code within transform_code, which would have the advantage that we could style roxygen comments in roxygen comments recursively. This implies passing a lot of arguments downstream. Also, creating such a recursive style_text() call is not in the spirit of styler (so far) and hence we may want to try to do it differently.

Probably a better and more consistent approach would be to see code examples as part of the code body that is styled with transformer functions. That would imply to

  • create a new column is_roxygen_code_example prior to parsing that identifies roxygen code examples.
  • remove the roxygen mask including the header for all roxygen code example tokens.
  • compute nested parse table.
  • style all code as implemented currently (so roxygen code examples are treated as normal code).
  • adding roxygen mask, basically inverting its removal.

Probably that's too complicated if not infeasible.

In addition, we should use consistent naming for file names. R/transform-code.R should have rmd in the name since it mostly deals with rmd. The first function transform_code() should be renamed since it is essentially transform_file() and transform_file() should be transform_file_verbose(). The new transform_file() should be moved to R/transform-files.R.

Since this will lead to an API change, we better think twice how we name the argument and what its values can be. In consideration of extending style functionality for roxygen comments, e.g. by maximal line width, indention etc. as suggested in #332 (comment), we may want to call the argument include_roxygen to the tidyverse style guide, which can be NULL or a vector with arguments for elements of roxygen comments to style.

@lorenzwalthert lorenzwalthert force-pushed the style-roxygen-code-examples branch 5 times, most recently from e1dc788 to 30df9c0 Compare April 7, 2018 16:31
@lorenzwalthert
Copy link
Collaborator Author

lorenzwalthert commented Apr 7, 2018

Further todo:

  • Add tests.
  • Handle don't run.
  • Make styling of roxygen an option to top-level API.

@lorenzwalthert lorenzwalthert force-pushed the style-roxygen-code-examples branch from 30df9c0 to fa90de7 Compare April 7, 2018 16:34
}
}

#' @importFrom purrr map_at flatten_chr
parse_transform_serialize_roxygen <- function(text, transformers) {
roxygen_seqs <- identify_start_to_stop_of_roxygen_examples_from_text(text)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably need to split that up into multiple functions. Also, some of it might be re-used for dontrun handling.

@lorenzwalthert lorenzwalthert force-pushed the style-roxygen-code-examples branch 2 times, most recently from 447ea81 to 5af7bef Compare April 7, 2018 17:05
@codecov-io
Copy link

codecov-io commented Apr 7, 2018

Codecov Report

Merging #381 into master will decrease coverage by 0.51%.
The diff coverage is 94.06%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #381      +/-   ##
==========================================
- Coverage   90.63%   90.12%   -0.52%     
==========================================
  Files          30       34       +4     
  Lines        1474     1570      +96     
==========================================
+ Hits         1336     1415      +79     
- Misses        138      155      +17
Impacted Files Coverage Δ
R/addins.R 0% <0%> (ø) ⬆️
R/roxygen-examples-parse.R 100% <100%> (ø)
R/testing.R 73.46% <100%> (-0.79%) ⬇️
R/transform-files.R 100% <100%> (ø) ⬆️
R/ui.R 100% <100%> (ø) ⬆️
R/roxygen-examples-find.R 100% <100%> (ø)
R/utils.R 81.48% <66.66%> (-5.48%) ⬇️
R/roxygen-examples-add-remove.R 92.3% <92.3%> (ø)
R/roxygen-examples.R 96.15% <96.15%> (ø)
R/parse.R 83.33% <0%> (-11.46%) ⬇️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f5bd8fd...47761db. Read the comment docs.

@lorenzwalthert
Copy link
Collaborator Author

@jonmcalder you can review if you want.

@lorenzwalthert
Copy link
Collaborator Author

@krlmlr any idea how to handle \dontrun{} segments? I am playing around with one approach, but it seems rather brittle and inelegant. Maybe you have any knowledge from roxygen2 that could help us here?

@krlmlr
Copy link
Member

krlmlr commented May 2, 2018

Maybe with tools::parse_Rd()?

file <- tempfile()
writeLines(
  c(
    "\\examples{",
    "\\dontrun{",
    "TRUE",
    "}",
    "}"
  ),
  file)
parsed <- tools::parse_Rd(
  file
)
str(parsed)
#> List of 2
#>  $ :List of 3
#>   ..$ : atomic [1:1] 
#> 
#>   .. ..- attr(*, "Rd_tag")= chr "RCODE"
#>   .. ..- attr(*, "srcref")=Class 'srcref'  atomic [1:6] 1 11 1 11 11 11
#>   .. .. .. ..- attr(*, "srcfile")=Class 'srcfile' <environment: 0x55eafd898290> 
#>   ..$ :List of 2
#>   .. ..$ : atomic [1:1] 
#> 
#>   .. .. ..- attr(*, "Rd_tag")= chr "VERB"
#>   .. .. ..- attr(*, "srcref")=Class 'srcref'  atomic [1:6] 2 10 2 10 10 10
#>   .. .. .. .. ..- attr(*, "srcfile")=Class 'srcfile' <environment: 0x55eafd898290> 
#>   .. ..$ : atomic [1:1] TRUE
#> 
#>   .. .. ..- attr(*, "Rd_tag")= chr "VERB"
#>   .. .. ..- attr(*, "srcref")=Class 'srcref'  atomic [1:6] 3 1 3 5 1 5
#>   .. .. .. .. ..- attr(*, "srcfile")=Class 'srcfile' <environment: 0x55eafd898290> 
#>   .. ..- attr(*, "Rd_tag")= chr "\\dontrun"
#>   .. ..- attr(*, "srcref")=Class 'srcref'  atomic [1:6] 2 1 4 1 1 1
#>   .. .. .. ..- attr(*, "srcfile")=Class 'srcfile' <environment: 0x55eafd898290> 
#>   ..$ : atomic [1:1] 
#> 
#>   .. ..- attr(*, "Rd_tag")= chr "RCODE"
#>   .. ..- attr(*, "srcref")=Class 'srcref'  atomic [1:6] 4 2 4 2 2 2
#>   .. .. .. ..- attr(*, "srcfile")=Class 'srcfile' <environment: 0x55eafd898290> 
#>   ..- attr(*, "Rd_tag")= chr "\\examples"
#>   ..- attr(*, "srcref")=Class 'srcref'  atomic [1:6] 1 1 5 1 1 1
#>   .. .. ..- attr(*, "srcfile")=Class 'srcfile' <environment: 0x55eafd898290> 
#>  $ : atomic [1:1] 
#> 
#>   ..- attr(*, "Rd_tag")= chr "TEXT"
#>   ..- attr(*, "srcref")=Class 'srcref'  atomic [1:6] 5 2 5 2 2 2
#>   .. .. ..- attr(*, "srcfile")=Class 'srcfile' <environment: 0x55eafd898290> 
#>  - attr(*, "class")= chr "Rd"
#>  - attr(*, "srcref")=Class 'srcref'  atomic [1:6] 0 0 6 1 0 1
#>   .. ..- attr(*, "srcfile")=Class 'srcfile' <environment: 0x55eafd898290> 
#>  - attr(*, "macros")=<environment: 0x55eafd71e0b0>

Created on 2018-05-02 by the reprex package (v0.2.0).

@lorenzwalthert
Copy link
Collaborator Author

Well I am not sure if we want to replace source code with styled derivatives that are potentially out of date.

@krlmlr
Copy link
Member

krlmlr commented May 2, 2018

\dontrun{} doesn't always mean outdated code. It may be an example that won't run on CRAN, or just something that's valid syntax but doesn't work. Either way, after parse_Rd() you should be able to leave out these parts.

@lorenzwalthert
Copy link
Collaborator Author

Yes sure, not always, but it can be outdated. Let's say you update a code example in your roxygen comments and then run styler without doing devtools::document(). This will replace the source with its styled derivative and you will loose your changes. So I wondered if there is a way around taking code directly from roxygen comments, which we can do if it does not involve a dontrun.
We could also look into taking all example code from the .Rd and require up to date documentation prior to styling when roxygen code examples are to be styled. This would mean to start all over since we now worked in a solution based in the source, that is, taking code from the roxygen comments and not from the .Rd files

@krlmlr
Copy link
Member

krlmlr commented May 3, 2018

How about

  1. look for @examples section in roxygen2 comments
  2. remove the '#
  3. style
  4. put back the '#
  5. replace the @examples section with the styled version

?

@lorenzwalthert
Copy link
Collaborator Author

This is what we have but that does not allow \dontrun{} segments.

@krlmlr
Copy link
Member

krlmlr commented May 20, 2018

2a) Run parseRd().

  1. Style code in all elements of the returned structure.

3a) Serialize the parsed structure.

@lorenzwalthert
Copy link
Collaborator Author

So I am not sure if I understand you correctly, you want the following:

  1. look for @examples section in roxygen2 comments
  2. remove the '#.
  3. Parse blocks.
    a. wrap the extracted code in \examples{...}.
    b. Call as.character(tools::parse_Rd(...) to get a vector that contains the individual elements of the example in useful blocks.
  4. Process the blocks
  5. wrap the extracted code in \examples{...}.
  6. put back the '#
  7. replace the @examples section with the styled version

Here an illustration of step 3 to 5:

library(magrittr)
file <- tempfile()
writeLines(
  c(
    "\\examples{",
    "hi <- 3",
    "\\dontrun{",
    "TRUE", "function(x) {", "  x", "}",
    "}",
    "x",
    "}"
  ),
  file
)
parsed <- tools::parse_Rd(
  file,
  fragment = TRUE
)
#> Warning in tools::parse_Rd(file, fragment = TRUE): /var/folders/8l/
#> fhmv3yj12kncddcjqwc19tkr0000gr/T//Rtmp6bX2B6/file20645f19d510:1: unexpected
#> section header '\examples'
char <- as.character(parsed, deparse = TRUE)
char
#>  [1] ""                "{"               "\n"             
#>  [4] "hi <- 3\n"       "\\dontrun"       "{"              
#>  [7] "\n"              "TRUE\n"          "function(x) {\n"
#> [10] "  x\n"           "}\n"             "}"              
#> [13] "\n"              "x\n"             "}"              
#> [16] "\n"


# styline the blocks in palce.
# Code here

paste0("\\examples", paste0(char, collapse = "")) %>% cat()
#> \examples{
#> hi <- 3
#> \dontrun{
#> TRUE
#> function(x) {
#>   x
#> }
#> }
#> x
#> }

Created on 2018-05-21 by the reprex package (v0.2.0).

Problem: tools::parse_Rd() only takes a file as input, which means we need to write a temp file if we want to use the always-up-to-date source code instead of the .Rd derivative. Rather not elegant. What do you think? We could also give warnings and take derivatives if they are newer than the source file but then we need a mapping from .Rd to .R.

@lorenzwalthert
Copy link
Collaborator Author

lorenzwalthert commented May 21, 2018

Oh we could just use textConnection to feed in a string into tools::parse_Rd().

@krlmlr
Copy link
Member

krlmlr commented May 21, 2018

Yes, it's a bit of an effort but I think it's worth it to have nicely styled examples.

@lorenzwalthert
Copy link
Collaborator Author

Ok, I agree. I think me and @jonmcalder have already put in some time, so I think it's worth going forward this way. I think we need a lot of TDD to accommodate all edge cases.

@lorenzwalthert
Copy link
Collaborator Author

We can also set up a test infrastructure on Travis to download and style random CRAN packages for testing.

@lorenzwalthert lorenzwalthert force-pushed the style-roxygen-code-examples branch 2 times, most recently from d0af5a4 to c31b5a7 Compare July 8, 2018 18:26
@lorenzwalthert
Copy link
Collaborator Author

I don't understand "line breaks are not respected". Would strsplit(., "\n", fixed = TRUE)[[1]] help? Or is this about line breaks near the braces in \dontrun{} et al.?

See #381 (comment)

@lorenzwalthert
Copy link
Collaborator Author

I think it's good practice to leave a PR for a few days and go back afterwards. I think I got the big picture right, but there are some small things that can simplify the code (and add the feature preserving line-breaks).

@lorenzwalthert lorenzwalthert force-pushed the style-roxygen-code-examples branch from b347652 to 688feff Compare July 26, 2018 08:40
@lorenzwalthert
Copy link
Collaborator Author

Will do a PR on top of this PR to solve various problems. -.-

@lorenzwalthert
Copy link
Collaborator Author

Finally, it seems we got it right 🎉

@lorenzwalthert
Copy link
Collaborator Author

We just added ~3000 lines of code to styler. Most of it is tests though, but still a few hundred in the Rdirectory.

library(dplyr)
log <- gitsum::parse_log_detailed_full_run()
log %>%
  arrange(desc(date)) %>%
  slice(1:63) %>%
  unnest(nested) %>%
  filter(dirname(changed_file) == "R") %>%
  summarize(sum(insertions), sum(deletions))

#> # A tibble: 1 x 2
#>  `sum(insertions)` `sum(deletions)`
#>              <int>            <int>
#> 1               488              354

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants