## Implementation
The current implementation was created by modifying the compiler -- just because that seemed easier.
An implementation as a plug-in would be possible by recognizing the desugared code generated by the existing SymbolicXMLBuilder and reconstructing it. I don't think, however, that this is a good choice, because:
- It would be tremendously unstable: any change in the implementation of SymbolicXMLBuilder would make programs stop compiling.
- The plug-in would not be able to distinguish between code generated by the compiler and equivalent code coming from source. This is unlikely, but it would be extremely confusing if it happened. More importantly, during the development of this project I can't see a way to provide a reliable backward-compatibility test.
- A plug-in can't solve the problem of loss of lexical information.
In summary, the current code compiles, builds, and passes the tests (with only one of them seriously modified), but it is definitely not production-grade.
## High-level description of the changes
- MarkupParsers has been moderately changed to:
  - Allow the SymbolicXMLBuilder to work with Tree => Tree instead of Tree (see the sketch after this list) -- this can and should be undone.
  - Support the `<?scala` PI.
  - Delegate namespace prefix handling to the SymbolicXMLBuilder (and from there, currently, on to the unmarshaller).
  - Separate the building of intermediate results (handle.mkXMLSeq, handle.pattern) from the obtaining of the final result (handle.unmarshallLiteral / handle.unmarshallPattern).
- SymbolicXMLBuilder has been rewritten beyond recognition. It uses and abuses functional programming to build the syntax tree from the inside out (see the sketch after this list). In retrospect, this wasn't entirely necessary. Will it cause performance issues? Should it be changed back to a more procedural approach?
- A scala PI has been added to Predef to provide the ScalaXMLUnmarshaller as the default unmarshaller. The ugly definition of `$scope` can be removed, just not yet: starr needs it to build Predef.
- Typers has been changed to pull scala.Dynamic out of experimental, as the compiler is not built with -Xexperimental and Dynamic is absolutely necessary to implement backward compatibility. I've also corrected the documentation of scala.Dynamic to match reality.
- Finally, ScalaXMLUnmarshaller and the pretty uninteresting trait XMLUnmarshaller are completely new.
- Oh, and two ugly ones:
  - This test shows how horribly the implementation details leak into compiler messages.
  - We need at least 2 MB of stack space in partest now :-(
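To make the Tree => Tree remark above concrete, here is a toy sketch of the inside-out style; the Tree cases below are simplified stand-ins I invented for illustration, not the compiler's real trees:

```scala
// Simplified stand-ins for compiler trees (invented for this sketch).
sealed trait Tree
case class Ident(name: String) extends Tree
case class Literal(value: String) extends Tree
case class Select(qual: Tree, name: String) extends Tree
case class Apply(fun: Tree, args: List[Tree]) extends Tree

// Each markup event becomes a Tree => Tree that wraps the receiver
// built so far in one more method call.
def call(name: String, args: Tree*): Tree => Tree =
  receiver => Apply(Select(receiver, name), args.toList)

// startXmlExpr().charData("hi").endXmlExpr(), assembled by composing
// functions and only applied to the unmarshaller reference at the end:
val build: Tree => Tree =
  call("startXmlExpr") andThen call("charData", Literal("hi")) andThen call("endXmlExpr")

val tree: Tree = build(Ident("$xmlUnmarshaller"))
```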
The current code replaces XML literals with very (possibly very, very) long expressions, e.g.:
```xml
<i18n:msg xmlns:i18n="http://www.gremideprogramadors.org/scala-i18n">
  Welcome back, {user}! You last logged in on <date/>
</i18n:msg>
```
translates to (line breaks added for readability):
```scala
$xmlUnmarshaller.startXmlExpr().
  `sTag_i18n:msg`().
  startAttributes().
  `startAttribute_xmlns:i18n`().charData("http://www.gremideprogramadors.org/scala-i18n").endAttribute().
  endAttributes().
  charData("\n Welcome back, ").scalaExpr(user).charData("! You last logged in on ").
  `sTag_date`().startAttributes().endAttributes().eTag().
  charData("\n ").
  eTag().
  endXmlExpr()
```
This was convenient during concept development -- a sort of unmarshaller providing a Fluent Interface to build XML -- but typing such possibly very long expressions has proven a problem for the typer phase, often requiring in excess of 2 MB of stack to process a couple of KB of XML. It can be argued that it doesn't make sense to embed large chunks of XML in code, but still, a few more KB would be frequent if the feature were used, e.g., to provide templates for HTML applications.
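As an aside, this fluent interface is also where the scala.Dynamic change above comes in: the generated method names encode element and attribute names, so no fixed trait could declare them all. Here is a minimal sketch of how a builder could answer them via applyDynamic; all names and signatures are my assumptions, not the shipped ScalaXMLUnmarshaller:

```scala
import scala.language.dynamics

// Hypothetical builder: accepts the name-encoded calls the compiler emits.
class FluentBuilder(events: List[String]) extends Dynamic {
  // b.`sTag_i18n:msg`() has no static member, so the compiler rewrites it
  // to b.applyDynamic("sTag_i18n:msg")()
  def applyDynamic(name: String)(args: Any*): FluentBuilder =
    new FluentBuilder(events :+ s"$name(${args.mkString(", ")})")

  // calls with fixed names can still be ordinary members
  def charData(text: String): FluentBuilder =
    new FluentBuilder(events :+ s"charData($text)")

  def endXmlExpr(): List[String] = events  // stand-in for the real result
}

// usage, mirroring a fragment of the generated chain:
val events = new FluentBuilder(Nil).`sTag_i18n:msg`().charData("Welcome back, ").endXmlExpr()
```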
The plan is to use a sequence of definitions of synthetic values instead, e.g.:
```scala
{
  val $1 = $xmlUnmarshaller.startXmlExpr()
  val $2 = $1.`startSTag_i18n:msg`()
  val $3 = $2.`startAttribute_xmlns:i18n`()
  val $4 = $3.charData("http://www.gremideprogramadors.org/scala-i18n")
  val $5 = $4.endAttribute()
  val $6 = $5.endSTag()
  val $7 = $6.charData("\n Welcome back, ")
  val $8 = $7.scalaExpr(user)
  val $9 = $8.charData("! You last logged in on ")
  val $10 = $9.`emptyElemTag_date`()
  val $11 = $10.charData("\n ")
  val $12 = $11.eTag()
  $12.endXmlExpr()
}
```
Is this a good idea? Will it cause some other performance problem, e.g. in the handling of names in this scope? It is pretty much what the old SymbolicXMLBuilder did, so I don't expect it to cause other problems.
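A quick, self-contained illustration of why the rewrite should be semantically safe (toy builder, all names invented): for a builder without relevant side effects, the flat val form merely linearises the same data flow:

```scala
// Toy builder: each step records an event.
class Builder(val log: List[String]) {
  def step(s: String): Builder = new Builder(log :+ s)
}

val chained = new Builder(Nil).step("a").step("b").step("c").log

val flat = {
  val $1 = new Builder(Nil)
  val $2 = $1.step("a")
  val $3 = $2.step("b")
  $3.step("c").log
}

assert(chained == flat)  // same result; only the tree shape (and stack depth) differs
```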
The current code replaces XML patterns with long expressions which end up providing an object with an unapplySeq method -- a sort of unmarshaller providing a Fluent Interface to build an unapply or unapplySeq method. E.g.,
```scala
xml match { case <tag>{v @ _*}</tag> => v }
```
translates to (line breaks added for readability):
```scala
xml match {
  case $xmlUnmarshaller.startXmlPattern().sTag_tag().startAttributes().endAttributes().
       scalaStarPattern().eTag().endXmlPattern(v @ _) => v
}
```
Note that the list of scala patterns (which includes all variable bindings) is 'flattened', which has two consequences:
1. The resulting patterns are much simpler, with most of the matching done in the classes provided by the unmarshaller rather than in code generated by the compiler. This is expected to reduce the impact of [SI-1133](https://issues.scala-lang.org/browse/SI-1133).
1. Occurrences of `_*` are translated into `_`. Tests have been created to check for backward compatibility, but the possibility remains that behaviour has been altered in some subtle way.
Note that the existing ScalaXMLUnmarshaller provides an unapplySeq method, but other unmarshallers could provide an unapply method instead. This was attempted for the ScalaXMLUnmarshaller, but I just couldn't get the unapply method to have a Product of the right arity as its (compile-time) return type. I believe solving this problem would make for more efficient pattern-matching code.
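For reference, a minimal sketch of the two extractor shapes being contrasted (invented object names and placeholder logic, not the shipped code); the fixed-arity unapply must expose a concrete Product, here a Tuple2, as its static result type:

```scala
import scala.xml.{Node, NodeSeq}

// Variable arity, as the current ScalaXMLUnmarshaller-style object provides:
// the bound patterns surface as a Seq.
object SeqExtractor {
  def unapplySeq(xml: NodeSeq): Option[Seq[Node]] =
    Some(xml.theSeq)  // placeholder matching logic
}

// Fixed arity: the compiler needs the Tuple2 result type statically.
object PairExtractor {
  def unapply(xml: NodeSeq): Option[(Node, Node)] = xml.theSeq match {
    case Seq(a, b) => Some((a, b))
    case _         => None
  }
}

// usage:
// xml match { case SeqExtractor(nodes @ _*) => nodes }
// xml match { case PairExtractor(a, b)      => (a, b) }
```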