Add SimpleTable construct based on Named Tuples #81
@lihaoyi pushed a queryable for arbitrary named tuples, and now only nested case classes need to extend SimpleTable.Source

I've been experimenting with augmenting However this is not possible as covariant so I think it will have to remain as is
Latest commit renames Lift to MapOver, and removes some duplication.

Moved the named tuple queryable stuff to a dedicated file.

The queryables function should only cache factories that still require the mappers to be provided.
Now I put everything available in one import.
@lihaoyi I have updated the PR description with an explanation of choices made

Thanks, will take a look

@lihaoyi I have updated the PR description with changes to the build, documentation, and testing

here is a benchmark to compare against good to know anyway, perhaps there are tweaks that can be made before anyone immediately jumps to macros

Not surprising there's some performance penalty, but it's probably fine. ScalaSql isn't meant to be hyper-optimized and generally the bottleneck for database queries is on the actual database anyway, rather than in application code
Abstract
In this PR, introduce a new com.lihaoyi::scalasql-namedtuples module, supporting only Scala 3.7+, with three main contributions:
- scalasql.namedtuples.SimpleTable class as an alternative to Table. The pitch is that you can use a "basic" case class with no higher-kinded type parameters to represent your models.
- scalasql.namedtuples.NamedTupleQueryable - provides implicit Queryable so you can return a named tuple from a query.
- scalasql.simple package object, that re-exports the scalasql package, plus SimpleTable and NamedTupleQueryable
Example
Defining a table
return named tuples from queries
Design
This PR manages to introduce SimpleTable entirely without needing to change the core library. It is designed around leveraging the new Named Tuples and programmatic structural typing facilities introduced in Scala 3.7. No macros are needed.
It was also designed such that any query should be pretty much source compatible if you change from Table to SimpleTable (with the exception of dropping [Sc] type arguments).
Within a query, e.g. City.select.map(c => ...), we still need c to be an object that has all the fields of City, but they need to be wrapped by either scalasql.Expr[T] or scalasql.Column[T].
With Table this is done by the case class having an explicit type parameter (e.g. City[T[_]](name: T[String]...)), so you would just substitute the parameter. Of course with SimpleTable the main idea is that you do not declare this T[_] type parameter, but the scalasql.query package expects it to be there.
The solution in this PR is to represent the table row within queries by Record[City, Expr] (rather than City[Expr]). Record[C, T[_]] is a new class, essentially a structurally typed tuple that extends scala.Selectable with a named tuple Fields type member, derived by mapping T over NamedTuple.From[C]. Record (and SimpleTable) still support using a nested case class field to share common columns (with a caveat*).
When you return a Record[C, T] from a query, you still need to get back a C, so SimpleTable provides an implicit Queryable.Row[Record[C, Expr], C], which is generated by compile-time derivation (via inline methods).
Implementation
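As a rough illustration of the Record idea from the Design section above, here is a minimal sketch of a Selectable class whose Fields type member is computed by mapping a wrapper over NamedTuple.From[C]. It is not the PR's implementation: Expr here is a hypothetical stand-in for scalasql.Expr, the Map-backed storage is an assumption made for brevity, and nested records are ignored. It assumes Scala 3.7+.

```scala
object RecordSketch:
  // Hypothetical stand-in for scalasql.Expr: just remembers a SQL fragment.
  final case class Expr[T](sql: String)

  // Simplified Record: field names and types come from `Fields`,
  // values are looked up by name at runtime (assumption: Map-backed storage).
  final class Record[C, T[_]](data: Map[String, Any]) extends Selectable:
    type Fields = NamedTuple.Map[NamedTuple.From[C], T]
    def selectDynamic(name: String): Any = data(name)

  // A "basic" case class with no higher-kinded type parameter.
  case class City(id: Long, name: String)

  val city = Record[City, Expr](
    Map("id" -> Expr[Long]("city.id"), "name" -> Expr[String]("city.name"))
  )

  // `city.name` is typed as Expr[String] via the Fields member.
  val cityName: Expr[String] = city.name
```

The selection city.name type checks because the compiler looks the name up in Fields and rewrites the call to selectDynamic("name") followed by a cast, so no refinement type is needed.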
To make a simpler diff, SimpleTable is entirely defined in terms of Table, i.e. here is the signature:
The metadata0 argument is expected to be generated automatically from an inline given in SimpleTableMacros.scala (I suggest renaming it to SimpleTableDerivation.scala).
Table[V[_[_]]], being a higher-kinded type, normally expects some case class Foo[T[_]], and fills in V[Expr] or V[Column] in various places in queries, and V[Sc] for results. However for SimpleTable, when T[_] is scalasql.Sc we want to return C, and otherwise return this Record[C, T], so MapOver needs to be a match type:
(Tombstone is used here to try and introduce a unique type that would never be used for any other purpose, i.e. be disjoint in the eyes of the match type resolver - also so we can convince ourselves that if T returns Tombstone it is probably the identity and not some accident.)
See #83 for another approach that removes the V[_[_]] from Table, Insert and various other places.
Design of Record
Record[C, T[_]] is implemented as a structural type that tries to wrap the fields of C in T. It has a few design constraints:
- If C has a field of type X that is a nested Table, the corresponding field in Record[C, T] must also be Record[X, T].
- […] Expr or Column) to wrap fields in from the outer level.
First decision: Record uses a Fields type member for structural selection, rather than the traditional type refinements.
Why:
- […] Array rather than a hash map,
- Fields derived via NamedTuple.From[C] is treated as part of the class implementation, this means you never get a huge refinement type showing up whenever you hover in the IDE.
Second decision: how to decide which fields are "scalar" data and which are nested records.
Constraints:
- […] Table, the only evidence that a field of type X is a nested table is implicit evidence of type Table.ImplicitMetadata[X]
- […] scala.compiletime intrinsic) that can tell you if there exists an implicit of type X.
Choices:
- […] foo: Ref[Foo], unclear how much this would be intrusive at each use-site
- […] SimpleTable.Nested) that the nested case class should extend - this does however prevent using "third party" classes as a nested table
The implicit derivation of Metadata also enforces that whenever an implicit metadata is discovered for use as a field, the class must extend SimpleTable.Nested.
Alternatives
Why is there Record[C, T] and not ExprRecord[C] or ColumnRecord[C] classes?
This was explored in #83, which requires a large change to the scalasql.query package, i.e. a new type hierarchy for Table (but makes the boundary between read-only queries, column updates, and results more explicit in types). It's also unclear if it relies upon "hacks" to work.
Why use Record[C, T] and not named tuples in queries?
What is needed to get rid of SimpleTable.Nested?
Let's remind ourselves of the current definition of SimpleTable:
First thing - we determined that the transitive closure of available implicit SimpleTable.GivenMetadata[Foo] needs to be added as an argument to Record.
In #82 we explored this by just precomputing all the field types ahead of time in a macro, so the types would look a bit like Record[City, Expr, (id: Expr[Long], name: Expr[String], nested: (fooId: Expr[Long], ...))], which was very verbose.
An alternative could be to pass as a type parameter the classes which have a metadata defined. Something like Record[City, Expr, Foo | Bar] or Record[Foo, Expr, Empty.type], and modify the Record class as such:
This could be a sweet spot between verbosity and extensibility to "uncontrolled" third party classes - but it is uncertain who in reality would be blocked by needing to extend SimpleTable.Nested. The potential impact on compilation times is also still to be determined, as is the best place to compute this type without causing explosions of implicit searches.
You can see a prototype here: bishabosha/scalasql#table-named-tuples-infer-nested-tables
Build changes
introduce top level scalasql-namedtuples module
- com.lihaoyi:scalasql-namedtuples_3
- 3.7.0
- scalasql/namedtuples
- scalasql("3.6.2") - so that it can re-export all of scalasql from the scalasql.simple package object
Also declare scalasql-namedtuples.test module
- scalasql/namedtuples/test
- scalasql("3.6.2").test, so the custom test framework can be used to capture test results.
Testing changes
The main approach to testing was to copy test sources that already exist, and convert them to use SimpleTable with otherwise no other changes.
Assumptions made when copying:
- […] scalasql tests are testing the query translation to SQL, rather than specifically the implementation of Table.Metadata generated by macros.
- […] SimpleTable and Table are the signatures of implicits available, and the implementation of Table.Metadata, the test coverage for SimpleTable should focus on type checking, and that the fundamentals of TableMetadata are implemented correctly in a "round trip".
- […] scalasql/test/src/ExampleTests.scala, scalasql/test/src/datatypes/DataTypesTests.scala and scalasql/test/src/datatypes/OptionalTests.scala, renaming the traits and switching from Table to SimpleTable, otherwise unchanged.
- […] scalasql/test/src/ConcreteTestSuites.scala to scalasql/namedtuples/test/src/SimpleTableConcreteTestSuites.scala, commenting out most objects except OptionalTests and DataTypesTests, which now extend the duplicated and renamed suites. I also renamed the package to scalasql.namedtuples
- […] scalasql/test/src/WorldSqlTests.scala (to scalasql/namedtuples/test/src/example/WorldSqlTestsNamedTuple.scala) to ensure that every example in tutorial.md compiles after switching to SimpleTable, and also to provide snippets I will include in the tutorial.md.
- […] OptionalTests.scala and DataTypesTests.scala so that they would generate unique names that can be included in reference.md.
New tests:
- […] SimpleTableH2Example tests
- […] scalasql/namedtuples/test/src/datatypes/LargeObjectTest.scala to stress test the compiler for large sized classes.
- […] scalasql/namedtuples/test/src/example/foo.scala for quick testing of compilation, typechecking etc.
- […] copy method with Record#updates in SimpleTableOptionalTests.scala
Documentation changes
tutorial.md and reference.md are generated from scala source files and test results in docs/generateDocs.mill.
I decided that rather than duplicate both tutorial.md and reference.md for SimpleTable, it would be better to avoid duplication, or potential drift, by reusing the original documents, but include specific notes when use of SimpleTable or NamedTupleQueryable adds new functionality or requires different code.
tutorial.md
To update tutorial.md I wrote the new text as usual in WorldSqlTests.scala. These texts exclusively talk about differences between the two approaches, such as declaring case classes, returning named tuples, or using the updates method on record. To support the new texts, I needed to include code snippets. But like in WorldSqlTests.scala I would prefer the snippets to be verified in a test suite. So the plan was to copy WorldSqlTests.scala to a new file, update the examples to use SimpleTable and include snippets from there.
To support including snippets from another file I updated the generateTutorial task in docs/generateDocs.mill. The change was that if the scanner sees a line // +INCLUDE SNIPPET [FOO] somefile in WorldSqlTests.scala, then it switches to reading the lines from the somefile file, looking for the first line containing // +SNIPPET [FOO], then splices all lines of somefile until it reaches a line containing // -SNIPPET [FOO], then it switches back to reading the lines in WorldSqlTests.scala.
The main idea is that snippets within somefile should be declared in the same order that they are included from WorldSqlTests.scala, meaning that the scanner traverses both files from top to bottom once (beginning from the previous position whenever switching back).
So to declare the snippets as mentioned above I copied WorldSqlTests.scala to scalasql/namedtuples/test/src/example/WorldSqlTestsNamedTuple.scala, replaced Table by SimpleTable and declared in there the snippets I wanted (and included them from WorldSqlTests.scala).
Any other changes (e.g. newlines, indentation etc) are likely due to updating scalafmt.
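The include mechanism described above can be sketched as a small pure function. This is only an approximation of the idea, not the code in docs/generateDocs.mill: the object and method names are invented, the snippet file is passed in as already-loaded lines rather than resolved from the somefile path, and a missing marker is treated as an error (an assumption).

```scala
object SnippetSplice:
  // Matches a line such as `// +INCLUDE SNIPPET [FOO] somefile`, capturing FOO.
  private val Include = """.*// \+INCLUDE SNIPPET \[([^\]]+)\].*""".r

  /** Replaces each include line in `main` with the lines found in `other`
    * between `// +SNIPPET [tag]` and `// -SNIPPET [tag]` (markers excluded).
    * `other` is scanned once, top to bottom, resuming after the last snippet.
    */
  def splice(main: Seq[String], other: IndexedSeq[String]): Seq[String] =
    var pos = 0 // current position in the snippet file
    main.flatMap {
      case Include(tag) =>
        val start = other.indexWhere(_.contains(s"// +SNIPPET [$tag]"), pos)
        require(start >= 0, s"no +SNIPPET [$tag] at or after line $pos")
        val end = other.indexWhere(_.contains(s"// -SNIPPET [$tag]"), start + 1)
        require(end >= 0, s"no -SNIPPET [$tag] after line $start")
        pos = end + 1
        other.slice(start + 1, end)
      case line => Seq(line)
    }
```

Because pos only moves forward, a snippet declared earlier in the snippet file than one that was already included would not be found, which matches the ordering requirement stated above.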
reference.md
This file is generated by the generateReference task in docs/generateDocs.mill. It works by formatting the data from out/recordedTests.json (captured by running tests with a custom framework) and grouping tests by the suite they occur in.
Like with tutorial.md I thought it best to only add extra snippets that highlight the differences between the two kinds of table.
So the first thing was to capture the output of the simple table tests: in the build I set the SCALASQL_RECORDED_TESTS_NAME and SCALASQL_RECORDED_SUITE_DESCRIPTIONS_NAME environment variables in the scalasql-namedtuples.test module, in this case to recordedTestsNT.json and out/recordedSuiteDescriptionsNT.json.
Next I updated the generateReference task so that it also includes the recorded outputs from recordedTestsNT.json. This task handles grouping of tests and removing duplicates (e.g. the mysql, h2 variants). I made it so that for each suite, e.g. DataTypes, it finds the equivalent suite in the simple table results, and then only includes the test names it hadn't seen, at the end of that suite.
Therefore, to include any test result from SimpleTableDataTypesTests.scala or SimpleTableOptionalTests.scala, it is only necessary to rename an individual test, and it will be appended to the bottom of the relevant group in reference.md. For this PR I did this by adding a - with SimpleTable suffix to relevant tests (i.e. the demonstration of nested classes, and the usage of the Record#updates method).
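The merging rule described above (append only unseen test names from the equivalent simple table suite) might look roughly like the following. The RecordedTest shape, the function name, and the rule that the equivalent suite is found by stripping a SimpleTable prefix are all assumptions made for this sketch; the real task reads the recorded JSON files and may match suites differently.

```scala
object ReferenceMerge:
  // Hypothetical shape of one recorded test result: its suite and test name.
  final case class RecordedTest(suite: String, name: String)

  /** For each original suite (in first-seen order), lists its test names and
    * then appends the names from the equivalent SimpleTable suite that were
    * not already present.
    */
  def merge(
      original: Seq[RecordedTest],
      simple: Seq[RecordedTest]
  ): Seq[(String, Seq[String])] =
    // Assumption: "SimpleTableDataTypes" is the equivalent of "DataTypes".
    val simpleBySuite = simple.groupBy(_.suite.stripPrefix("SimpleTable"))
    original.map(_.suite).distinct.map { suite =>
      val seen = original.filter(_.suite == suite).map(_.name).distinct
      val extra = simpleBySuite
        .getOrElse(suite, Nil)
        .map(_.name)
        .distinct
        .filterNot(seen.contains)
      suite -> (seen ++ extra)
    }
```

Under this rule a copied test that keeps its original name is dropped as a duplicate, while one renamed with the - with SimpleTable suffix shows up at the bottom of its group.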