Skip to content

Conversation

@eborden
Copy link
Collaborator

@eborden eborden commented Sep 8, 2019

My intent with this PR is to take stock of the status of the datadecl branch and to push for an in progress merge. This work has been pending with sporadic changes for a long time. I believe that is because of the monolithic nature of the work. An all or none strategy has not worked and is counter to brittany's basic design on top of exact print (to be able to incrementally support new features). As well most incremental progress has been put towards refreshing the branch with each sporadic effort. This bit-rotting has been a time suck.

Instead I would like to see the current work (pending a few updates) merged with the goal of providing layout for newtype and record style data types.

Current Status

Supported

  • single records
  • multi-field records types
  • tabular alignment
  • basic deriving
  • deriving strategies
  • existential quantification

Blockers

  • retaining comments
  • CPP for older ghc versions

Deferred

  • normal types
  • sum types

@eborden eborden added this to the Version 1.0.0.0 milestone Sep 8, 2019
@eborden eborden requested review from lspitzner and tfausak September 8, 2019 22:43
@eborden eborden self-assigned this Sep 8, 2019
@tfausak
Copy link
Collaborator

tfausak commented Sep 9, 2019

👀 I'll review this soon. I glanced over the test cases and liked what I saw!

Copy link
Collaborator

@tfausak tfausak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started reading layoutDataDecl but I can't really grok it. No fault of yours though! I'm not familiar enough with this codebase to do a thorough code review. Is there anything in particular you'd like me to look at?

| ColBindStmt
| ColDoLet -- the non-indented variant
| ColRec
| ColRecUpdate -- used for both RecCon and RecUpd. TODO: refactor to reflect?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be the benefit of differentiating between record construction and record update? I thought they were always formatted the same.

@lspitzner
Copy link
Owner

I fully approve of the idea of getting this merged. I started working on supporting the older ghcs, and with this commit it compiles with both ghc-8.4 and ghc-8.6. Unfortunately, the tests on 8.4 do not pass.

One failure case is some whitespace problem that I have not investigated but that is probably relatively easy to fix.

The more annoying issue is that 8.4 does not have DerivingVia, which means that we need test-cases that only apply starting at some compiler version. Expanding the test file schema like this:

#group mytestgroup

#test mytestname
#pending myreason
func = ...

#test othertestname
func = ...

#test conditionaltestname
#starting-at-ghc 8.6
func = ...

should do the trick.

(this might point towards the question if we want to reduce the number of supported compiler versions, but I think I want this feature regardless; it seems it would be required again in the future even if we only supported two versions at a time.)

I plan to keep working on this when I find the time, in small pieces. If anyone else works on this over more than an evening, you might want to leave a note so that we don't conflict / do redundant work. This advice applies to myself, too.

@eborden
Copy link
Collaborator Author

eborden commented Oct 14, 2019

@lspitzner The big blocker in my mind right now is comments. They are something that could precipitate a lot of change in this code (though maybe they won't 🤷‍♂️) If you have the time I might recommend doing some research on what that might look like instead of attacking multi version support.

100% agree on committing early and often and keeping the lines of communication going so we don't have redundant work.

@lspitzner
Copy link
Owner

I have started on supporting comments, will push something this night.

@eborden
Copy link
Collaborator Author

eborden commented Oct 22, 2019

I did a bit of fiddling with it, but didn't get anywhere. I'd love to have a brain dump from you on how brittany is currently handling comments.

Comment on lines 405 to 421
data Foo = Bar
{ foo :: Baz
, bars :: Bizzz
}
-- a
deriving --b
( -- c
ToJSON -- d
, -- e
FromJSON --f
) -- g
via -- h
( -- i
SomeType --j
, -- k
ABC --l
)
Copy link
Collaborator Author

@eborden eborden Oct 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that is a lot of fun!

Maybe something like this:

  -- a
  deriving --b
           ( -- c
             ToJSON -- d
           , -- e
             FromJSON --f
           ) -- g
           via  -- h
           ( -- i
             SomeType --j
           , -- k
             ABC --l
           )

The alignment with deriving would then be dropped in align left and replaced with a single indent.

  -- a
  deriving --b
    ( -- c
      ToJSON -- d
    , -- e
      FromJSON --f
    ) -- g
    via  -- h
    ( -- i
      SomeType --j
    , -- k
      ABC --l
    )

@eborden
Copy link
Collaborator Author

eborden commented Nov 4, 2019

@lspitzner I've updated the branch to build with all supported GHC versions. There are a number of test failures that will need to be picked off. One specific issue is literate tests that use syntax that is not available in earlier versions, like DerivingVia in 8.2.2.

  src-literatetests/Main.hs:151: 
  1) data type declarations record deriving via
       expected: Right 
                 data Foo = Bar
                   { foo  :: Baz
                   , bars :: Bizzz
                   }
                   deriving ToJSON via (SomeType)
                   deriving (ToJSON, FromJSON) via (SomeType)

        but got: Left "parsing error: parse error on input \8216via\8217"

We'll likely need a mechanism to disable certain tests depending on the GHC version.

@lspitzner
Copy link
Owner

great work, I'll have a look later. I think one set of errors comes down to one trivial mistake that causes duplication of offsets/comments in front of a data-def. That, together with the deriving-via fix you mentioned, I think 8.4 is clean already.

@lspitzner
Copy link
Owner

All green! nice. I still need to review the last few commits, and I want to test this on one or two larger codebases.

@eborden
Copy link
Collaborator Author

eborden commented Nov 7, 2019

Nice work on the min-ghc constraint in tests! That will come in handy. I'll give this a run on Freckle's codebase (180 KLOC) and see if anything funky pops up.

@eborden
Copy link
Collaborator Author

eborden commented Nov 7, 2019

I'm only seeing one issue. Interstitial comments are getting de-indented one level too much.

data CurrentLevel = CurrentLevel
   { clLevel :: FreckleTextLevel
-  -- ^ Either the last Snapshot or Starting/Overridden level
+-- ^ Either the last Snapshot or Starting/Overridden level
   , clStartingLevel :: FreckleTextLevel
-  -- ^ Specifically the Starting/Overridden level
+-- ^ Specifically the Starting/Overridden level
   , clOverridden :: Bool
-  -- ^ True if level has been Overridden
+-- ^ True if level has been Overridden
   }
   deriving Show

@tfausak
Copy link
Collaborator

tfausak commented Nov 7, 2019

I ran this branch over ITProTV's medium sized (72 KLOC) simple codebase and only ran into one bona fide problem:

data EnterpriseGrantsForCompanyResponse = EnterpriseGrantsForCompanyResponse Types.Company [ EnterpriseGrantResponse
]

(We have conf_layout.lconfig_cols set to 100.) I'm not exactly sure how that should be formatted, but I'd be alright with one of these:

data EnterpriseGrantsForCompanyResponse
  = EnterpriseGrantsForCompanyResponse Types.Company [EnterpriseGrantResponse]

data EnterpriseGrantsForCompanyResponse = EnterpriseGrantsForCompanyResponse
  Types.Company
  [EnterpriseGrantResponse]

data EnterpriseGrantsForCompanyResponse
  = EnterpriseGrantsForCompanyResponse
    Types.Company
    [EnterpriseGrantResponse]

@eborden
Copy link
Collaborator Author

eborden commented Nov 7, 2019

This would be consistent with how brittany formats types in signatures.

data EnterpriseGrantsForCompanyResponse = EnterpriseGrantsForCompanyResponse
  Types.Company
  [EnterpriseGrantResponse]

@eborden
Copy link
Collaborator Author

eborden commented Nov 7, 2019

@tfausak I added some tests and layouting for that situation.

@eborden
Copy link
Collaborator Author

eborden commented Nov 7, 2019

@lspitzner I'll need your help with the comment issue.

@tfausak
Copy link
Collaborator

tfausak commented Nov 7, 2019

Awesome, thanks for the quick fix! I re-ran Brittany at 393bdb0 against my codebase and everything worked. I'm also seeing the comment problem, but everything builds properly after formatting.

@lspitzner
Copy link
Owner

My notes from testing on some non-public code:

  • ran into the same comment indentation problem. It is way more than "reduce by one 'layer'/tab". Need to reverse-engineer ghc-exactprint behaviour :D
  • for small records (think Point { x :: Int, y :: Int }) a one-line format might be nice;
  • for dictionary-style records (that contain functions), long type signatures are line-broken and indented not very well. This is something where I was not even sure what I had expected as ideal output;
  • for data-types with non-trivial existentials + constraints you get really nasty linebreaks;
  • this code-base puts deriving directly after the closing brace (of a record). This saves a line, but would not work well for more than one constructor. But then it is really rare that you have records + multiple constructors. It also leads to more indentation/inconsistency if you have multiple deriving (strategy) lines. Maybe I'll put this behind a config flag. (I know some of you are not a fan of more config flags. I'll ask the other people on that codebase first anyway.)

Will add test-cases/examples for the above later.

I plan to write some documentation to explain the processing of comments. I'll also look into this particular comment problem; the ghc-exactprint semantics of its DP (DeltaPosition) value are not intuitive, and this looks like another unexpected special case.

@tfausak
Copy link
Collaborator

tfausak commented Nov 12, 2019

I re-ran 09da6d4 against ITProTV's code base and everything looks great!

Most of our record data declarations were written with deriving immediately after the closing brace, but I think it makes sense to put that on its own line. I'd be alright with Brittany simply choosing to put them on their own line and not having a config for it.

@eborden
Copy link
Collaborator Author

eborden commented Nov 12, 2019

I'd be alright with Brittany simply choosing to put them on their own line and not having a config for it.

The choice to put them on a new line sacrifices one vertical line for consistency in the current GHC world where DerivingStrategies is becoming more popular. I'm in favor of keeping the simplicity of always having a new line.

@tfausak
Copy link
Collaborator

tfausak commented Nov 12, 2019

Ah, I found a problem. Trying to format a data declaration like this:

data T = C {}

Returns a syntactically invalid result:

data T =

I think it should probably keep the empty braces around, but as far as I know that's the same as not taking any arguments. I'd be fine with either of these:

data T = C
data T = C {}

@eborden
Copy link
Collaborator Author

eborden commented Nov 12, 2019

@tfausak I've added a test and corrected that bug.

@tfausak
Copy link
Collaborator

tfausak commented Nov 13, 2019

Awesome, thanks!

@eborden
Copy link
Collaborator Author

eborden commented Nov 14, 2019

@lspitzner do you have an examples of these two?

  • for dictionary-style records (that contain functions), long type signatures are line-broken and indented not very well. This is something where I was not even sure what I had expected as ideal output;
  • for data-types with non-trivial existentials + constraints you get really nasty linebreaks;

lspitzner and others added 19 commits December 9, 2019 11:17
The `Makefile` includes `stack test` configurations to support building
versions of `brittany` with supported versions of `ghc`. Each version
uses a separate `.stack-work` directory to allow minimal compilation on
each change.
`docWrapNodePrior` caused duplication of offset of `data` keyword
and of comments connected to it.
Only single line formatting of normal records was being supported. For
records with long names we need multi line formatting. This also needs
to support both multi and left indentation policies.
Add tests for nullary prefix data types and nullary record data types.
- Fix bug in BackendUtil/lowest level of brittany about
  alignment being ignored after a comment,
- Properly layout large (more than single-line) types in
  record fields and in data decl rhs arguments,
- Properly layout data decl constructors with large "heads"
  (forall, constraints),
- Add a config flag to control single-line layout of record
  definition,
@eborden
Copy link
Collaborator Author

eborden commented Dec 9, 2019

☝️ pushed up conflict fixes. Please git pull -r if you have additional commits to add.

@tfausak
Copy link
Collaborator

tfausak commented Dec 9, 2019

@tfausak could you perhaps check this one more time on your larger codebase(s) please?

Sorry, I didn't see this until now. I re-ran the latest commit (a1282c3) against the ITProTV code base and everything looked great! 👍

@lspitzner
Copy link
Owner

@tfausak great, thanks.

@eborden thanks for the rebase. I have found one glitch in a special case that I have not had time to look at in the last few days. Will have a look now. If it has a simple fix I'd like to include it, otherwise I'll merge this and release. Will hopefully get back to you in the next hour or so.

@eborden
Copy link
Collaborator Author

eborden commented Dec 9, 2019

@lspitzner I'm not necessarily pushing for a release. I'm just pushing for merging to master. If there are further improvements to be made before a release then we should defer releasing until they are also merged to master.

@lspitzner
Copy link
Owner

But once it is on master, we cannot make releases without this feature. And I'd like to make a new release with the regression fixes that are on master asap.

I'll just make a patch release 0.12.1.1 and merge this immediately after. I am not happy with my latest commit (had to add another Bool to the BriDoc datatype..) but it passes all tests now. I am pretty certain that this only affects rather special cases.

@lspitzner lspitzner merged commit 434854f into lspitzner:master Dec 9, 2019
@lspitzner
Copy link
Owner

Thanks again @eborden . I have just released 0.12.1.1 without this feature, but we will definitely include it in the next release. I really think this is ready for publication even when it does not cover mulitple constructors yet; I just wanted to get the bugfixes out without having to organize the announcement that we should have with the new feature release.

@eborden
Copy link
Collaborator Author

eborden commented Dec 10, 2019

Thanks @lspitzner. My current plan is working on sums next, which I'll follow with GADTs. Let me know if this conflicts with any of your initiatives.

@eborden eborden deleted the datadecl branch December 10, 2019 17:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants