Add TASTY pickling of quotes and implement `~` on quotes #3662

nicolasstucki · 2017-12-12T21:10:39Z

Based on #3634

dottybot

Hello, and thank you for opening this PR! 🎉

All contributors have signed the CLA, thank you! ❤️

Commit Messages

We want to keep history, but for that to actually be useful we have
some rules on how to format our commit messages (relevant xkcd).

Please stick to these guidelines for commit messages:

Separate subject from body with a blank line

When fixing an issue, start your commit message with Fix #<ISSUE-NBR>:

Limit the subject line to 72 characters

Capitalize the subject line

Do not end the subject line with a period

Use the imperative mood in the subject line ("Add" instead of "Added")

Wrap the body at 80 characters

Use the body to explain what and why vs. how

adapted from https://chris.beams.io/posts/git-commit

Have an awesome day! ☀️

allanrenucci · 2017-12-13T08:34:28Z

compiler/src/dotty/tools/dotc/core/tasty/TastyString.scala

+object TastyString {
+
+  /** Decode the TASTY String into TASTY bytes */
+  def stringToTasty(str: String): Array[Byte] = { // TODO factor out this and tastyToString


str.getBytes?

No, the string encoding messes up the bytes

allanrenucci · 2017-12-13T08:35:49Z

compiler/src/dotty/tools/dotc/core/tasty/TastyString.scala

+  }
+
+  /** Encode TASTY bytes into a TASTY String */
+  def tastyToString(bytes: Array[Byte]): String = {


new String(bytes)?

Same problem
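To illustrate why both suggestions fail, here is a minimal sketch (the object and method names are hypothetical, not the PR's code, and UTF-8 is assumed as the charset). Decoding arbitrary bytes as text and encoding them back does not return the original array, whereas a one-char-per-byte mapping over 0..255 does round-trip.

```scala
import java.nio.charset.StandardCharsets.UTF_8

object RoundTripDemo {
  // Arbitrary bytes, including values that are not valid UTF-8 on their own
  val original: Array[Byte] = Array(0x00, 0x7f, 0x80, 0xff).map(_.toByte)

  // Invalid sequences are replaced while decoding, so the bytes are not preserved
  val viaUtf8: Array[Byte] = new String(original, UTF_8).getBytes(UTF_8)

  // Hypothetical lossless mapping: one char in 0..255 per byte
  def toChars(bytes: Array[Byte]): String =
    new String(bytes.map(b => (b & 0xff).toChar))

  // Inverse of toChars: keep the low 8 bits of every char
  def toBytes(s: String): Array[Byte] =
    s.toCharArray.map(_.toByte)
}
```

With these definitions, `RoundTripDemo.viaUtf8` differs from `original`, while `toBytes(toChars(original))` has the same elements as `original`.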

This is a tricky problem. Looking at Stack Overflow, people say you should use a codec for this, typically Base64. The scheme of mapping each byte to a character in the range 0..255 looks like it would work, but it's not optimal. Strings are represented in classfiles as UTF-8 characters, with one byte for the range 0..127 and two bytes for the range 128..255. This means that, assuming a uniform bit distribution, you get an overhead of 50%. An 8->7 bit codec would give an overhead of less than 15%.
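The overhead figures can be checked with a small calculation. This is a sketch with hypothetical names. It assumes the classfile encoding uses one byte for chars 1..127 and two bytes for char 0 and for chars 128..255, and that every byte value is equally likely.

```scala
object OverheadSketch {
  /** Encoded size in bytes of a char in 0..255 under the assumption above. */
  def encodedSize(c: Int): Int = if (c >= 1 && c <= 0x7f) 1 else 2

  /** Average output bytes per input byte when each byte becomes one char in 0..255. */
  val directAverage: Double = (0 to 255).map(encodedSize).sum / 256.0

  /** Output chars per input byte for an 8->7 bit repacking (each char is below 128). */
  val sevenBitAverage: Double = 8.0 / 7.0
}
```

`directAverage` comes out at roughly 1.5 (about 50% overhead), and `sevenBitAverage` at roughly 1.14 (under 15%), ignoring the extra byte that a zero char costs.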

There's another problem of string size. String constants in classfiles are limited to 65535 bytes of encoded data. This might not be enough for a larger quoted program.
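One way around the size limit is to split the encoded string into several constants. The sketch below is only an illustration: the names are hypothetical, the greedy strategy is an assumption, and the scheme scalac actually uses may differ.

```scala
object ChunkSketch {
  /** Maximum encoded size of one string constant, in bytes. */
  val MaxBytes = 65535

  /** Encoded size of one char, assuming the classfile's modified UTF-8. */
  def encodedSize(c: Char): Int =
    if (c >= 1 && c <= 0x7f) 1 else if (c <= 0x7ff) 2 else 3

  /** Greedily split `s` so that every chunk fits into MaxBytes once encoded. */
  def chunks(s: String): List[String] = {
    val out = List.newBuilder[String]
    val cur = new StringBuilder
    var size = 0
    for (c <- s) {
      val n = encodedSize(c)
      if (size + n > MaxBytes) {
        // Current chunk is full: emit it and start a new one
        out += cur.toString
        cur.clear()
        size = 0
      }
      cur.append(c)
      size += n
    }
    if (cur.nonEmpty) out += cur.toString
    out.result()
  }
}
```

Concatenating the chunks in order gives back the original string, so a reader only needs to join them before decoding.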

scalac solves both of these problems when serializing its pickles as annotations. I think we should copy that scheme. I tried to find it but could not. @retronym @lrytz @adriaanm does one of you have an idea where the code that serializes a Pickle as an annotation is?

I think we can leave it like this for this PR, but then we should open an issue for future improvements.

I will start looking at the alternatives. I also think we should start with this for now to unblock the next PRs and allow people to use it.

@odersky https://github.com/scala/scala/blob/2f3791c3079d998d29788d121552c27517f58a6c/src/compiler/scala/tools/nsc/backend/jvm/BCodeHelpers.scala#L1036-L1119

@lrytz thanks for the link. Could you also point me to the place where the String/Array[String] is converted back into an Array[Byte]? Thanks.

It took me a while to find it.. Need to clean this up / document. Method parseScalaSigBytes calls ConstantPool.getBytes which goes through ByteCodecs.decode.

The encoding is explained here http://www.scala-lang.org/old/sites/default/files/sids/dubochet/Mon,%202010-05-31,%2015:25/Storage%20of%20pickled%20Scala%20signatures%20in%20class%20files.pdf

first, repack all 8-bit bytes into 7-bit values (shifting the leftover bits into the next value)

then increment each value by 1 (in 7 bits), so 0x7f becomes 0x00

then encode 0x00 as 0xc0 0x80, which is an overlong UTF-8 encoding for zero. It's what the JVM classfile spec uses to avoid having 0x00 in strings. It's called "modified UTF-8".

The reason for incrementing by 1 is that 0x7f is expected to be less common than 0x00, so the two-byte encoding is hit less often.
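The two-byte form of zero can be observed from plain Java APIs, since `java.io.DataOutputStream.writeUTF` also writes modified UTF-8. The sketch below (object and method names are hypothetical) drops the two-byte length prefix that `writeUTF` emits and returns only the encoded characters.

```scala
import java.io.{ByteArrayOutputStream, DataOutputStream}

object ModifiedUtf8Demo {
  /** Modified UTF-8 bytes of `s`, without the 2-byte length prefix of writeUTF. */
  def modifiedUtf8(s: String): Array[Byte] = {
    val buf = new ByteArrayOutputStream()
    val out = new DataOutputStream(buf)
    out.writeUTF(s)
    out.flush()
    buf.toByteArray.drop(2)
  }
}
```

Encoding the single char 0 this way yields the two bytes 0xc0 0x80, while chars 1 to 0x7f each take one byte.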

The confusing part is that the class ScalaSigBytes used in the backend to encode the signature uses ByteCodecs.encode8to7, but does the +1 itself. It doesn't need to map 0x00 to the two byte version because ASM will do it when writing the annotation to the classfile. However, in the unpickler, we don't use ASM to read the annotation, but just get the bytes from the classfile directly. So there we'll see the two byte encoding. ByteCodecs.decode does the necessary work.
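The steps described above can be sketched as follows. This is an illustration under stated assumptions, not scalac's `ByteCodecs`: the object name, the helper names, and the bit order inside each 7-bit group are chosen here for the example. The mapping of 0x00 to 0xc0 0x80 is left to whatever writes the string into the classfile.

```scala
object SevenBitCodecSketch {
  /** Repack 8-bit bytes into 7-bit values, least significant bits first. */
  def encode8to7(src: Array[Byte]): Array[Byte] = {
    val dst = new Array[Byte]((src.length * 8 + 6) / 7)
    var acc = 0   // bit accumulator
    var bits = 0  // number of valid bits in acc
    var j = 0
    for (b <- src) {
      acc |= (b & 0xff) << bits
      bits += 8
      while (bits >= 7) {
        dst(j) = (acc & 0x7f).toByte
        j += 1
        acc >>>= 7
        bits -= 7
      }
    }
    if (bits > 0) dst(j) = (acc & 0x7f).toByte // leftover bits, zero padded
    dst
  }

  /** Inverse of encode8to7; padding bits at the end are dropped. */
  def decode7to8(src: Array[Byte]): Array[Byte] = {
    val dst = new Array[Byte](src.length * 7 / 8)
    var acc = 0
    var bits = 0
    var j = 0
    for (b <- src) {
      acc |= (b & 0x7f) << bits
      bits += 7
      while (bits >= 8) {
        dst(j) = (acc & 0xff).toByte
        j += 1
        acc >>>= 8
        bits -= 8
      }
    }
    dst
  }

  /** 8->7 repacking followed by +1 modulo 128, so 0x7f (not 0x00) becomes char 0. */
  def encode(bytes: Array[Byte]): String =
    new String(encode8to7(bytes).map(b => ((b + 1) & 0x7f).toChar))

  /** Undo the +1 shift, then unpack the 7-bit values. */
  def decode(str: String): Array[Byte] =
    decode7to8(str.toCharArray.map(c => ((c - 1) & 0x7f).toByte))
}
```

Every char produced by `encode` is below 128, and `decode(encode(bytes))` returns the original bytes.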

nicolasstucki · 2017-12-28T14:07:29Z

All requested changes have been made.

nicolasstucki · 2018-01-08T13:06:55Z

Rebased to make sure we do not have regressions.

nicolasstucki added the stat:wip label Dec 12, 2017

dottybot reviewed Dec 12, 2017

nicolasstucki force-pushed the add-meta-with-tasty branch from 4ad2209 to a71e267 Compare December 12, 2017 21:45

allanrenucci reviewed Dec 13, 2017

nicolasstucki force-pushed the add-meta-with-tasty branch 24 times, most recently from cbca568 to 166c49c Compare December 20, 2017 10:18

nicolasstucki mentioned this pull request Dec 20, 2017

A symmetric meta programming framework #3634

Merged

nicolasstucki force-pushed the add-meta-with-tasty branch from 280f625 to 8cedf06 Compare December 20, 2017 16:45

nicolasstucki force-pushed the add-meta-with-tasty branch 2 times, most recently from 48bbffd to 8939eb9 Compare December 28, 2017 11:54

nicolasstucki assigned odersky and unassigned nicolasstucki Dec 28, 2017

nicolasstucki requested a review from odersky January 8, 2018 09:59

nicolasstucki added 18 commits January 8, 2018 13:54

Avoid running sepparate compilation tests under legacy tests

86939bb

Those tests have never fully been supported, all files are compiled at the same time.

Rename quote test files

f021562

Add TASTY quote pickling for trees

ddb4449

* Pickle/unpickle quotes
* Add Splicer
* Add tree interpreter for splicing staged expression
* Add concrete implementations of Quoted

Fix bug while unpickling splices in quotes

c40a022

Implement primitive liftable

c80ea20

Add String to Liftable expressions and abstract over all primitives

1674cd8

Re-enable valueTypeNameToJavaType

10753bc

Disable spourious containsQuotesOrSplices = true

a15fc90

Rename QuoteUnpickler to TastyUnpickler

e4c2a33

Add comments to RawQuotes

29ef05d

Replace PrimitiveExprs by the sinlge class ConstantExpr

65acf1c

Avoid printing encoded TASTY in TastyExpr and TastyType

2ff30d6

Simplify RawQuoted instantiations

2bd3b6f

Implement extractor for quoted trees

990e3d1

Re-work Interpreter

2aa05ef

Move quote unpickling to the core package

8b94c86

Encapsulate quoted types in top level type

82d36bd

Add environment to the interpreter

f86bcc5

nicolasstucki force-pushed the add-meta-with-tasty branch from 6289d07 to f86bcc5 Compare January 8, 2018 13:06

odersky approved these changes Jan 8, 2018

nicolasstucki merged commit 39e466d into scala:master Jan 8, 2018

allanrenucci deleted the add-meta-with-tasty branch January 13, 2018 09:24

nicolasstucki mentioned this pull request Jan 20, 2018

Improve compression of pickled quotes #3877

Closed
