Skip to content

if-else with '=' assignment, wrong curly placement #259

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
lefec opened this issue Oct 25, 2017 · 12 comments
Closed

if-else with '=' assignment, wrong curly placement #259

lefec opened this issue Oct 25, 2017 · 12 comments

Comments

@lefec
Copy link

lefec commented Oct 25, 2017

Hi, I didn't find a previous issue about this.

I have this (working) code:

x = 5

if(x >= 5)
  y = TRUE else 
    y = FALSE

With style_active_region() producing:

if (x >= 5) {
  y
} <- TRUE else
  y <- FALSE

It breaks my code here.

@lorenzwalthert
Copy link
Collaborator

Thanks @lefec for reaching out. This is indeed very unfortunate. There are two aspects to it.

  1. If you replace the assignment = with <-, you won't get invalid code back. The problem here is really an inconsistency in the R parser discussed in inconsistency in getParseData? #100.
  2. The indention of the braces will still not be correct, i.e.
reprex::reprex_info()
#> Created by the reprex package v0.1.1.9000 on 2017-10-25

library(styler)
style_text("x = 5
if(x >= 5)
y <- TRUE else
  y <- FALSE
")
#> x <- 5
#> if (x >= 5) {
#>   y <- TRUE
#>   } else {
#>   y <- FALSE
#>   }

Which is an issue already reported #257 and which I hope to solve soon.

@krlmlr What do you suggest? Should we just disable adding braces if there is any = assignment in the code in order to prevent (1) from happening? Invalidating code is the worst styler can do anyways and I think it should be avoided at any cost. Eventually we need to fix the parsing inconsistency but I think that might take some time.

@krlmlr
Copy link
Member

krlmlr commented Oct 25, 2017

I understand that even if #100 is closed the underlying problems are not resolved completely:

styler:::create_tree("if (TRUE) a = 1")
#> Warning: replacing previous import 'scales::viridis_pal' by
#> 'viridis::viridis_pal' when loading 'DiagrammeR'
#>                                              levelName
#> 1  ROOT (token: short_text [lag_newlines/spaces] {id})
#> 2   °--expr:  [0/0] {16}                              
#> 3       ¦--IF: if [0/1] {1}                           
#> 4       ¦--'(': ( [0/0] {2}                           
#> 5       ¦--expr:  [0/0] {4}                           
#> 6       ¦   °--NUM_CONST: TRUE [0/0] {3}              
#> 7       ¦--')': ) [0/1] {5}                           
#> 8       ¦--expr:  [0/1] {9}                           
#> 9       ¦   °--SYMBOL: a [0/0] {7}                    
#> 10      ¦--EQ_ASSIGN: = [0/1] {8}                     
#> 11      °--expr:  [0/0] {11}                          
#> 12          °--NUM_CONST: 1 [0/0] {10}

What does it take to put the EX_ASSIGN and the expr/NUM_CONST where it belongs? Can we compute the parent information ourselves from line + col data, to avoid relying on the (sometimes) faulty parent ID in the parse data?

@lorenzwalthert
Copy link
Collaborator

I haven't looked into the solution Jim Hester provided in #100, but when mentioning #100, I just meant to "solve" the parsing problem, i.e. getting a consistent parse data (that is, the same parsing data as if <- was used instead of =, just with text being different for the assignment token) which would solve the problem. That involves getting the right parent from the parse data. I think this is not trivial. We can also parse, replace the tokens EQ_ASSIGN with arrow assignment, serialise, parse again and replace text from the second serialisation with the text from the first, which is not particularly elegant.

@krlmlr
Copy link
Member

krlmlr commented Oct 26, 2017

Compare the parse tree between arrow assignment and equals assignment:

styler:::create_tree("a <- b <- 1")
#>                                              levelName
#> 1  ROOT (token: short_text [lag_newlines/spaces] {id})
#> 2   °--expr:  [0/0] {11}                              
#> 3       ¦--expr:  [0/1] {3}                           
#> 4       ¦   °--SYMBOL: a [0/0] {1}                    
#> 5       ¦--LEFT_ASSIGN: <- [0/1] {2}                  
#> 6       ¦--expr:  [0/1] {6}                           
#> 7       ¦   °--SYMBOL: b [0/0] {4}                    
#> 8       ¦--LEFT_ASSIGN: <- [0/1] {5}                  
#> 9       °--expr:  [0/0] {8}                           
#> 10          °--NUM_CONST: 1 [0/0] {7}
styler:::create_tree("a = b = 1")
#>                                             levelName
#> 1 ROOT (token: short_text [lag_newlines/spaces] {id})
#> 2  ¦--expr:  [0/1] {3}                               
#> 3  ¦   °--SYMBOL: a [0/0] {1}                        
#> 4  ¦--EQ_ASSIGN: = [0/1] {2}                         
#> 5  ¦--expr:  [0/1] {6}                               
#> 6  ¦   °--SYMBOL: b [0/0] {4}                        
#> 7  ¦--EQ_ASSIGN: = [0/1] {5}                         
#> 8  °--expr:  [0/0] {8}                               
#> 9      °--NUM_CONST: 1 [0/0] {7}

Seems that all we need to do here is to wrap EQ_ASSIGN sequences into a separate expression. Maybe this operator is special because it has lowest precedence?

  1. Walk all nests
  2. Look at sequences of EQ_ASSIGN
  3. If there is one in the nest, wrap everything between the token before the first EQ_ASSIGN and the token after the last EQ_ASSIGN into a separate expression, very similarly to the addition of braces (Wrap expr in expr before enclosing with curly braces #263).

@lorenzwalthert
Copy link
Collaborator

Ok, I see. Looks like that could be a valid approach. Need to look at more examples maybe. We could try to fix it in the nested structure or the flat structure. If we fix it in the flat structure (that is, reassigning parents and ids before nesting), this might have the advantage that we fix the problem early in the food chain, so other packages / tasks that do nesting in a different way than we do might benefit. However, that might be potentially more difficult.

@lorenzwalthert
Copy link
Collaborator

Ok, I guess for the time being I just want to follow your suggestion and fix the problem in the nested parse data.

@lorenzwalthert
Copy link
Collaborator

I think you can't do it exactly that way because in the case below, two = appear on the same level but there is an else in between that you don't want to wrap.

reprex::reprex_info()
#> Created by the reprex package v0.1.1.9000 on 2017-11-02

styler:::create_tree("if (TRUE) \na<- 3 else b <- 4")
#>                                                  levelName
#> 1  ROOT (token: short_text [lag_newlines/spaces] {pos_id})
#> 2   °--expr:  [0/0] {1}                                   
#> 3       ¦--IF: if [0/1] {2}                               
#> 4       ¦--'(': ( [0/0] {3}                               
#> 5       ¦--expr:  [0/0] {5}                               
#> 6       ¦   °--NUM_CONST: TRUE [0/0] {4}                  
#> 7       ¦--')': ) [0/0] {6}                               
#> 8       ¦--expr:  [1/1] {7}                               
#> 9       ¦   ¦--expr:  [0/0] {9}                           
#> 10      ¦   ¦   °--SYMBOL: a [0/0] {8}                    
#> 11      ¦   ¦--LEFT_ASSIGN: <- [0/1] {10}                 
#> 12      ¦   °--expr:  [0/0] {12}                          
#> 13      ¦       °--NUM_CONST: 3 [0/0] {11}                
#> 14      ¦--ELSE: else [0/1] {13}                          
#> 15      °--expr:  [0/0] {14}                              
#> 16          ¦--expr:  [0/1] {16}                          
#> 17          ¦   °--SYMBOL: b [0/0] {15}                   
#> 18          ¦--LEFT_ASSIGN: <- [0/1] {17}                 
#> 19          °--expr:  [0/0] {19}                          
#> 20              °--NUM_CONST: 4 [0/0] {18}
styler:::create_tree("if (TRUE) \na = 3 else b = 4")
#>                                                  levelName
#> 1  ROOT (token: short_text [lag_newlines/spaces] {pos_id})
#> 2   °--expr:  [0/0] {1}                                   
#> 3       ¦--IF: if [0/1] {2}                               
#> 4       ¦--'(': ( [0/0] {3}                               
#> 5       ¦--expr:  [0/0] {5}                               
#> 6       ¦   °--NUM_CONST: TRUE [0/0] {4}                  
#> 7       ¦--')': ) [0/0] {6}                               
#> 8       ¦--expr:  [1/1] {8}                               
#> 9       ¦   °--SYMBOL: a [0/0] {7}                        
#> 10      ¦--EQ_ASSIGN: = [0/1] {9}                         
#> 11      ¦--expr:  [0/1] {11}                              
#> 12      ¦   °--NUM_CONST: 3 [0/0] {10}                    
#> 13      ¦--ELSE: else [0/1] {12}                          
#> 14      ¦--expr:  [0/1] {14}                              
#> 15      ¦   °--SYMBOL: b [0/0] {13}                       
#> 16      ¦--EQ_ASSIGN: = [0/1] {15}                        
#> 17      °--expr:  [0/0] {17}                              
#> 18          °--NUM_CONST: 4 [0/0] {16}

@krlmlr
Copy link
Member

krlmlr commented Nov 2, 2017

Good point. I guess we need to look for sequences of EQ_ASSIGN that are interspersed with exactly one other token.

@lorenzwalthert
Copy link
Collaborator

lorenzwalthert commented Nov 2, 2017

Yes, I agree. So I am just trying to first split a nest into blocks, each containing one valid = expression and then map over the blocks like this.

relocate_eq_assign_nest <- function(pd) {
  is_eq_assign <- which(pd$token == "EQ_ASSIGN")
  if (length(is_eq_assign) > 0) {
    browser()
    block_id <-
      cumsum((is_eq_assign - lag(is_eq_assign, default = is_eq_assign[1])) > 2)
    blocks <- split(pd, block_id)
    pd <- map_dfr(blocks, relocate_eq_assign_one)
  }
  pd
}

Well not exactly, but somehow like this.

@krlmlr
Copy link
Member

krlmlr commented Nov 2, 2017

Can you use diff()?

@lorenzwalthert
Copy link
Collaborator

lorenzwalthert commented Nov 2, 2017

I modified create_treee() so it can also return just the structure of an expression, i.e

reprex::reprex_info()
#> Created by the reprex package v0.1.1.9000 on 2017-11-02

styler:::create_tree(
  "x <- 5
  
  if(x >= 5)
  y <- TRUE else 
  y <- FALSE",
  structure_only = TRUE
)
#>                 levelName
#> 1  Hierarchical structure
#> 2   ¦--1                 
#> 3   ¦   ¦--1             
#> 4   ¦   ¦   °--1         
#> 5   ¦   ¦--2             
#> 6   ¦   °--3             
#> 7   ¦       °--1         
#> 8   °--2                 
#> 9       ¦--1             
#> 10      ¦--2             
#> 11      ¦--3             
#> 12      ¦   ¦--1         
#> 13      ¦   ¦   °--1     
#> 14      ¦   ¦--2         
#> 15      ¦   °--3         
#> 16      ¦       °--1     
#> 17      ¦--4             
#> 18      ¦--5             
#> 19      ¦   ¦--1         
#> 20      ¦   ¦   °--1     
#> 21      ¦   ¦--2         
#> 22      ¦   °--3         
#> 23      ¦       °--1     
#> 24      ¦--6             
#> 25      °--7             
#> 26          ¦--1         
#> 27          ¦   °--1     
#> 28          ¦--2         
#> 29          °--3         
#> 30              °--1

I think I managed to solve the problem, so we have

reprex::reprex_info()
#> Created by the reprex package v0.1.1.9000 on 2017-11-02

all.equal(
  styler:::create_tree("a <- b <- 1", structure_only = TRUE),
  styler:::create_tree("a =  b = 1", structure_only = TRUE)
)
#> [1] TRUE

all.equal(
  styler:::create_tree(
  "x = 5
  
  if(x >= 5)
    y = TRUE else 
      y = FALSE",
  structure_only = TRUE
  ),
  styler:::create_tree(
    "x <- 5
    
    if(x >= 5)
    y <- TRUE else 
    y <- FALSE",
    structure_only = TRUE
  )
)
#> [1] TRUE

Will push later.

@lorenzwalthert
Copy link
Collaborator

@lefec finally hoping to close this with #276.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants