Leading comma 2 #4971

phadej · 2017-12-24T01:11:24Z

Another take on #4953

Here I followed @23Skidoo's suggestion and used a less tricky approach: simply passing a CabalSpecVersion value (not a dictionary!). For that I had to change ParsecParser to a newtype (good), but I went all-in and vendored a small part of parsers. Unfortunately I made two changes at once, introducing CharParsing and redoing CabalSpecVersion, so I cannot say which one affects performance more. Performance is noticeably degraded.
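To illustrate the idea of passing the spec version as a plain value, here is a minimal sketch. It is not Cabal's actual `ParsecParser`: the names `SpecVersion`, `Parser`, `askSpecVersion` and `leadingComma` are hypothetical, and the parser is modelled as a simple function rather than a wrapper around Parsec.

```haskell
-- Hypothetical, simplified model; these are NOT Cabal's real definitions.
data SpecVersion = SpecV20 | SpecV22
  deriving (Eq, Ord, Show)

-- A parser is a function of the spec version and the remaining input.
-- The version is an ordinary argument, not a type-class dictionary.
newtype Parser a = Parser
  { runParser :: SpecVersion -> String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \v s -> case p v s of
    Nothing      -> Nothing
    Just (a, s') -> Just (f a, s')

instance Applicative Parser where
  pure a = Parser $ \_ s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \v s -> case pf v s of
    Nothing      -> Nothing
    Just (f, s') -> case pa v s' of
      Nothing       -> Nothing
      Just (a, s'') -> Just (f a, s'')

instance Monad Parser where
  Parser p >>= k = Parser $ \v s -> case p v s of
    Nothing      -> Nothing
    Just (a, s') -> runParser (k a) v s'

-- Read the spec version that was passed in when the parser was run.
askSpecVersion :: Parser SpecVersion
askSpecVersion = Parser $ \v s -> Just (v, s)

-- Consume an optional leading comma, but only for new enough specs.
-- Returns True if a comma was consumed.
leadingComma :: Parser Bool
leadingComma = do
  v <- askSpecVersion
  if v >= SpecV22
    then Parser $ \_ s -> case s of
      (',' : rest) -> Just (True, rest)
      _            -> Just (False, s)
    else pure False
```

With this shape, `runParser leadingComma SpecV22 ", base"` consumes the comma, while the same input under `SpecV20` is left untouched for the rest of the grammar to reject.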

Parsing all of Hackage:

leading-comma (old)

leading-comma % /home/ogre/Documents/other-haskell/cabal/dist-newstyle/build/x86_64-linux/ghc-8.2.2/Cabal-2.1.0.0/t/parser-hackage-tests/build/parser-hackage-tests/parser-hackage-tests parse-parsec +RTS -s
Reading index from: /home/ogre/.cabal/packages/hackage.haskell.org/01-index.tar
99594 files processed
 333,809,501,824 bytes allocated in the heap
  22,579,224,952 bytes copied during GC
      10,012,264 bytes maximum residency (6340 sample(s))
         737,704 bytes maximum slop
              25 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     316742 colls,     0 par   27.968s  27.840s     0.0001s    0.0081s
  Gen  1      6340 colls,     0 par    0.610s   0.607s     0.0001s    0.0011s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time   58.508s  ( 58.300s elapsed)
  GC      time   28.578s  ( 28.447s elapsed)
  EXIT    time   -0.001s  ( -0.001s elapsed)
  Total   time   87.086s  ( 86.747s elapsed)

  %GC     time      32.8%  (32.8% elapsed)

  Alloc rate    5,705,318,987 bytes per MUT second

  Productivity  67.2% of total user, 67.2% of total elapsed

leading-comma-2 (this)

leading-comma-2 % /home/ogre/Documents/other-haskell/cabal/dist-newstyle/build/x86_64-linux/ghc-8.2.2/Cabal-2.1.0.0/t/parser-hackage-tests/build/parser-hackage-tests/parser-hackage-tests parse-parsec +RTS -s
Reading index from: /home/ogre/.cabal/packages/hackage.haskell.org/01-index.tar
99594 files processed
 433,418,031,816 bytes allocated in the heap
  32,397,561,472 bytes copied during GC
      14,846,248 bytes maximum residency (9899 sample(s))
         693,000 bytes maximum slop
              31 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     409259 colls,     0 par   35.270s  35.131s     0.0001s    0.0106s
  Gen  1      9899 colls,     0 par    0.869s   0.865s     0.0001s    0.0009s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time   80.541s  ( 80.309s elapsed)
  GC      time   36.139s  ( 35.996s elapsed)
  EXIT    time   -0.001s  ( -0.001s elapsed)
  Total   time  116.679s  (116.304s elapsed)

  %GC     time      31.0%  (30.9% elapsed)

  Alloc rate    5,381,329,078 bytes per MUT second

  Productivity  69.0% of total user, 69.1% of total elapsed

TL;DR

I think this approach is better Haskell, though a little slower. For comparison, parsing Hackage with ReadP (that run also includes parsing with Parsec, hence 563 - 116 =~ 450) is about 4x slower. And I think the implementation can be optimised; one just has to profile it properly (e.g. find out why it allocates more).

 685,191,711,600 bytes allocated in the heap
 115,559,383,376 bytes copied during GC
   1,147,583,896 bytes maximum residency (6699 sample(s))
       4,493,928 bytes maximum slop
            2633 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     654315 colls,     0 par   202.929s  203.841s     0.0003s    0.9998s
  Gen  1      6699 colls,     0 par    2.600s   2.628s     0.0004s    0.0216s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time  331.056s  (333.344s elapsed)
  GC      time  205.529s  (206.469s elapsed)
  EXIT    time   -0.001s  ( -0.001s elapsed)
  Total   time  536.584s  (539.812s elapsed)

  %GC     time      38.3%  (38.2% elapsed)

  Alloc rate    2,069,714,691 bytes per MUT second

  Productivity  61.7% of total user, 61.8% of total elapsed


---

Please include the following checklist in your PR:

* [x] Patches conform to the [coding conventions](https://github.com/haskell/cabal/#conventions).
* [x] Any changes that could be relevant to users have been recorded in the changelog.
* [x] The documentation has been updated, if necessary.
* [x] If the change is docs-only, `[ci skip]` is used to avoid triggering the build bots.

Please also shortly describe how you tested your change. Bonus points for added tests!

phadej · 2017-12-24T03:26:59Z

A few tactical INLINEs, and performance is improved:

 396,398,110,000 bytes allocated in the heap
  26,608,655,168 bytes copied during GC
      14,035,480 bytes maximum residency (8012 sample(s))
         741,800 bytes maximum slop
              32 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     376633 colls,     0 par   29.331s  29.279s     0.0001s    0.0100s
  Gen  1      8012 colls,     0 par    0.649s   0.648s     0.0001s    0.0010s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time   69.094s  ( 69.016s elapsed)
  GC      time   29.980s  ( 29.926s elapsed)
  EXIT    time   -0.001s  ( -0.001s elapsed)
  Total   time   99.073s  ( 98.941s elapsed)

  %GC     time      30.3%  (30.2% elapsed)

  Alloc rate    5,737,059,255 bytes per MUT second

  Productivity  69.7% of total user, 69.8% of total elapsed

It's 10ms per cabal file.

phadej · 2017-12-24T11:33:00Z

Christmas magic happened, the travis is green! 🎆

... except that now I rebased, so who knows what happens...

23Skidoo · 2017-12-24T13:22:00Z

10 ms per .cabal file is acceptable IMO, do you think the gap could be made narrower with some more optimisations?

Do we have any automated perf checks for the parser in the test suite? May be worth adding one, unless there will be too many false positives/negatives due to Travis perf being too wobbly.

It may be also worthwhile to look at the distribution of parse times, maybe there are some pathological outliers that we should fix.

phadej · 2017-12-24T13:50:20Z

Note: 10ms is just a 1ms slowdown from 9ms from the leading-comma branch.

I tried running the profiler on a test suite that parses all cabal files of packages starting with m. I didn't see a difference between leading-comma and leading-comma-2 (this). This branch allocates a little more, and thus is slower. For example, the INLINEs caused something to happen, and 5s (of ~100) was cut from GC time. I think the new parser generates more garbage than it strictly needs to, but I didn't see anything obvious.

It would be great to have benchmarks, but Travis is way too noisy an environment (and there are no guarantees that the machines are the same)!

Below is another round of "benchmarks", on a quiet machine:

master

4303 files processed
  13,400,116,376 bytes allocated in the heap
     694,625,432 bytes copied during GC
       3,783,704 bytes maximum residency (566 sample(s))
         738,280 bytes maximum slop
              12 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     11950 colls,     0 par    1.379s   1.377s     0.0001s    0.0042s
  Gen  1       566 colls,     0 par    0.043s   0.043s     0.0001s    0.0017s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    3.900s  (  3.902s elapsed)
  GC      time    1.422s  (  1.420s elapsed)
  EXIT    time   -0.001s  ( -0.001s elapsed)
  Total   time    5.321s  (  5.322s elapsed)

  %GC     time      26.7%  (26.7% elapsed)

  Alloc rate    3,436,352,335 bytes per MUT second

  Productivity  73.3% of total user, 73.3% of total elapsed

leading-comma

4303 files processed
  13,404,932,160 bytes allocated in the heap
     703,423,400 bytes copied during GC
       4,872,272 bytes maximum residency (565 sample(s))
         734,184 bytes maximum slop
              12 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     11956 colls,     0 par    1.390s   1.387s     0.0001s    0.0043s
  Gen  1       565 colls,     0 par    0.043s   0.043s     0.0001s    0.0015s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    3.814s  (  3.817s elapsed)
  GC      time    1.433s  (  1.430s elapsed)
  EXIT    time   -0.001s  ( -0.001s elapsed)
  Total   time    5.246s  (  5.247s elapsed)

  %GC     time      27.3%  (27.3% elapsed)

  Alloc rate    3,514,858,037 bytes per MUT second

  Productivity  72.7% of total user, 72.7% of total elapsed

leading-comma-2 (this)

4303 files processed
  15,388,158,568 bytes allocated in the heap
     823,986,376 bytes copied during GC
       3,773,480 bytes maximum residency (626 sample(s))
         736,216 bytes maximum slop
              11 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     13848 colls,     0 par    1.478s   1.475s     0.0001s    0.0039s
  Gen  1       626 colls,     0 par    0.045s   0.045s     0.0001s    0.0011s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    4.538s  (  4.542s elapsed)
  GC      time    1.522s  (  1.520s elapsed)
  EXIT    time   -0.001s  ( -0.001s elapsed)
  Total   time    6.061s  (  6.061s elapsed)

  %GC     time      25.1%  (25.1% elapsed)

  Alloc rate    3,390,624,744 bytes per MUT second

  Productivity  74.9% of total user, 74.9% of total elapsed
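The relative cost of the three runs can be worked out directly from the summaries. The sketch below copies the "Total time" and "bytes allocated in the heap" figures from the logs; the `Run` type and the helper names are made up for illustration.

```haskell
import Text.Printf (printf)

-- Figures copied from the three "+RTS -s" summaries (4303 files each).
data Run = Run
  { runName    :: String
  , totalSecs  :: Double  -- "Total time", user seconds
  , allocBytes :: Double  -- "bytes allocated in the heap"
  }

master, leadingComma, leadingComma2 :: Run
master        = Run "master"          5.321 13400116376
leadingComma  = Run "leading-comma"   5.246 13404932160
leadingComma2 = Run "leading-comma-2" 6.061 15388158568

-- Relative overhead of x over the baseline, in percent.
overheadPct :: Double -> Double -> Double
overheadPct base x = (x - base) / base * 100

-- One summary line comparing a run against a baseline.
report :: Run -> Run -> String
report base r =
  printf "%s vs %s: %.1f%% time, %.1f%% allocation"
    (runName r) (runName base)
    (overheadPct (totalSecs base) (totalSecs r))
    (overheadPct (allocBytes base) (allocBytes r))
```

Evaluating `report master leadingComma2` in GHCi shows this branch at roughly 14% more total time and roughly 15% more allocation than master, while `report master leadingComma` stays within about 2% either way, which is within the noise of a single run.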

CabalSpecVersion type-class will allow gathering per-spec conditionals. Currently it's used for selecting parsers / grammatical structure.

Leading (or trailing) commas for CommaFSep/CommaVSep fields, i.e. fields with a mandatory comma. These are (atm):

- build-depends
- build-tool-depends
- build-tools
- mixins
- pkgconfig-depends
- reexported-modules
- setup-depends
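As a rough illustration of version-dependent comma handling, here is a standalone sketch that splits a comma-separated field value. It does not use Cabal's parser: `SpecVersion`, `parseCommaList` and the helpers are hypothetical names, the 2.2 cut-off is taken from the linked etlas issue, and tolerating one empty item at each end is an assumption of this sketch rather than a statement about what Cabal accepts.

```haskell
import Data.Char (isSpace)

-- Hypothetical stand-in for the spec version; not Cabal's real type.
data SpecVersion = SpecV20 | SpecV22
  deriving (Eq, Ord, Show)

-- Split on every comma, keeping empty pieces.
splitOnComma :: String -> [String]
splitOnComma s = case break (== ',') s of
  (x, [])       -> [x]
  (x, _ : rest) -> x : splitOnComma rest

-- Strip whitespace from both ends.
trim :: String -> String
trim = f . f
  where f = reverse . dropWhile isSpace

-- Drop one empty item at the front, unless it is the only item.
dropLeadingEmpty :: [String] -> [String]
dropLeadingEmpty ("" : xs@(_ : _)) = xs
dropLeadingEmpty xs                = xs

-- Drop one empty item at the back, unless it is the only item.
dropTrailingEmpty :: [String] -> [String]
dropTrailingEmpty = reverse . dropLeadingEmpty . reverse

-- Parse a comma-separated field value. A leading or trailing comma is
-- tolerated only for spec versions >= 2.2 (assumption of this sketch).
parseCommaList :: SpecVersion -> String -> Either String [String]
parseCommaList v input
  | all isSpace input = Right []
  | any null items    = Left "unexpected comma"
  | otherwise         = Right items
  where
    raw = map trim (splitOnComma input)
    items
      | v >= SpecV22 = dropTrailingEmpty (dropLeadingEmpty raw)
      | otherwise    = raw
```

Here `parseCommaList SpecV22 ", base, containers"` yields `Right ["base","containers"]`, whereas the same input under `SpecV20` yields `Left "unexpected comma"`.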

Tag Backpack fields (mixins, signatures) with `availableSince [2,0]`. This "fixes" haskell#4448, as the fields are recognised and warned about, but parsed as empty if cabal-version < 2.0 (the actual cut-off is `! (>= 1.25)`). For example, a file with

    cabal-version: >=1.10
    library:
      mixins: foo-indef requires (Foo42 as FooImpl)

will be accepted, with a warning, and the parsed `mixins` in `BuildInfo` will be an empty list.

Also, `availableSince` is removed from `build-tool-depends`, as we **want** to parse it (and not warn) in old Cabal files. It can be thought of as added retrospectively to old specs, but old `Cabal`s don't know how to use it.
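The behaviour described for `availableSince` can be modelled with a small sketch. This is not Cabal's FieldGrammar API: the `Field` record, `parseField` and `mixinsField` are hypothetical, versions are plain `[Int]` lists compared lexicographically, and the cut-off is simplified to the tagged version itself.

```haskell
import Data.List (intercalate)

-- Hypothetical, simplified field description; not Cabal's real API.
data Field a = Field
  { fieldName  :: String
  , fieldSince :: [Int]        -- first spec version that supports the field
  , fieldParse :: String -> a  -- parser for the raw field value
  , fieldEmpty :: a            -- value used when the spec is too old
  }

-- Render a version such as [2,0] as "2.0".
showVersion :: [Int] -> String
showVersion = intercalate "." . map show

-- Parse a field under a given cabal-version. The field is always
-- recognised; under an older spec it produces a warning and is parsed
-- as empty instead of being rejected.
parseField :: [Int] -> Field a -> String -> ([String], a)
parseField specVersion field raw
  | specVersion >= fieldSince field = ([], fieldParse field raw)
  | otherwise =
      ( [ "field '" ++ fieldName field
            ++ "' is available since cabal-version "
            ++ showVersion (fieldSince field) ++ "; ignoring it" ]
      , fieldEmpty field )

-- Toy version of the mixins field: the raw value is kept as one entry.
mixinsField :: Field [String]
mixinsField = Field "mixins" [2, 0] (\s -> [s]) []
```

Running `parseField [1,10] mixinsField "foo-indef requires (Foo42 as FooImpl)"` gives one warning and an empty list, while the same call with `[2,0]` gives no warnings and the parsed entry.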

23Skidoo

LGTM. I like these changes, and the 1 ms per .cabal file slowdown doesn't look like a big deal.

23Skidoo · 2017-12-25T01:35:04Z

@phadej You may want to try heap profiling both branches, maybe it will tell you why the second one allocates more.

phadej requested review from 23Skidoo and hvr December 24, 2017 01:11

phadej added 2 commits December 24, 2017 19:26

phadej force-pushed the leading-comma-2 branch 2 times, most recently from fc0f160 to 6cb9d87 Compare December 24, 2017 17:48

phadej added 2 commits December 24, 2017 19:50

Introduce CharParsing, redo leading-comma PR

f0b6497

Add INLINE pragmas for funAppMon methods of ParsecParser

a316340

phadej force-pushed the leading-comma-2 branch from 6cb9d87 to a316340 Compare December 24, 2017 17:51

23Skidoo approved these changes Dec 25, 2017

phadej merged commit 9705f68 into haskell:master Dec 25, 2017

phadej deleted the leading-comma-2 branch December 25, 2017 10:28

This was referenced Dec 25, 2017

Versioned grammars + leading comma support for build-depends #4953

Closed

cabal-version check doesn't happen early enough #4448

Closed

phadej mentioned this pull request Jan 31, 2018

Optional commas in dependency lists #1509

Closed

jneira mentioned this pull request May 22, 2019

Allow etlas to parse leading commas like cabal spec >= 2.2 typelead/etlas#103

Open
