Add SimpleTable construct based on Named Tuples #81

bishabosha · 2025-05-04T21:15:40Z

Abstract

This PR introduces a new com.lihaoyi::scalasql-namedtuples module, supporting only Scala 3.7+, with three main contributions:

scalasql.namedtuples.SimpleTable class as an alternative to Table. The pitch is that you can use a "basic" case class with no higher-kinded type parameters to represent your models.
scalasql.namedtuples.NamedTupleQueryable - provides implicit Queryable so you can return a named tuple from a query.
scalasql.simple package object, that re-exports scalasql package, plus SimpleTable and NamedTupleQueryable

Example

Defining a table

import scalasql.simple.{*, given}

case class City(
    id: Int,
    name: String,
    countryCode: String,
    district: String,
    population: Long
)

object City extends SimpleTable[City]

Returning named tuples from queries

val query = City.select.map(c => (name = c.name, population = c.population))
val res: Seq[(name: String, population: Long)] = db.run(query)

Design

This PR manages to introduce SimpleTable entirely without needing to change the core library. It is designed around leveraging the new Named Tuples and programmatic structural typing facilities introduced in Scala 3.7. No macros are needed.

It was also designed so that any query should be pretty much source compatible if you change from Table to SimpleTable (with the exception of dropping [Sc] type arguments).

Within a query, e.g. City.select.map(c => ...) we still need c to be an object that has all the fields of City, but they need to be wrapped by either scalasql.Expr[T] or scalasql.Column[T].

With Table this is done by the case class having an explicit type parameter (e.g. City[T[_]](name: T[String]...)) - so you would just substitute the parameter. Of course with SimpleTable the main idea is that you do not declare this T[_] type parameter, but the scalasql.query package expects it to be there.
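To make the "substitute the parameter" idea concrete, here is a minimal sketch of the higher-kinded style that Table expects. It does not use scalasql: `Sc` is redefined locally as an identity alias, and `FakeExpr` and `CityHK` are hypothetical stand-ins invented for illustration.

```scala
// Local stand-ins, NOT the scalasql types.
type Sc[T] = T                            // identity: Sc[Int] is just Int
final case class FakeExpr[T](sql: String) // hypothetical placeholder for an expression type

// The Table-style model: every field is wrapped in the type parameter T.
final case class CityHK[T[_]](id: T[Int], name: T[String])

// Substituting Sc gives plain values (a query result row).
val row: CityHK[Sc] = CityHK[Sc](1, "Zurich")

// Substituting FakeExpr gives the shape used inside a query.
val query: CityHK[FakeExpr] = CityHK[FakeExpr](FakeExpr("id"), FakeExpr("name"))
```

SimpleTable's goal is to keep the second shape available inside queries without forcing the `T[_]` parameter onto the user's case class.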

The solution in this PR is to represent the table row within queries by Record[City, Expr], (rather than City[Expr]).
Record[C, T[_]] is a new class: essentially a structurally typed tuple that extends scala.Selectable with a named tuple Fields type member, derived by mapping T over NamedTuple.From[C].

Record (and SimpleTable) still support using a nested case class field to share common columns (with a caveat*).

When you return a Record[C, T] from a query, you need to still get back a C, so SimpleTable provides an implicit Queryable.Row[Record[C, Expr], C], which is generated by compiletime derivation (via inline methods).
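The Record idea described above can be sketched in a few lines of plain Scala 3.7. This is only an illustration of the Selectable plus Fields mechanism, not the PR's implementation: `MiniRecord`, `Wrapped`, and `CityRow` are hypothetical names, nested records are not handled, and field lookup by name is done naively.

```scala
// Hypothetical wrapper standing in for Expr or Column.
final case class Wrapped[A](value: A)

// A structurally typed record: the compiler types `rec.field` via `Fields`
// and rewrites the selection to a `selectDynamic` call with a cast.
final class MiniRecord[C, T[_]](names: Vector[String], data: IArray[AnyRef])
    extends Selectable:
  type Fields = NamedTuple.Map[NamedTuple.From[C], T]
  def selectDynamic(name: String): Any = data(names.indexOf(name))

final case class CityRow(id: Int, name: String)

val rec = MiniRecord[CityRow, Wrapped](
  Vector("id", "name"),
  IArray[AnyRef](Wrapped(1), Wrapped("Zurich"))
)

// Statically typed as Wrapped[String], even though MiniRecord declares no `name` member.
val cityName: Wrapped[String] = rec.name
```

Because the type only mentions `CityRow` and `Wrapped`, it stays short when written out or shown in an IDE, which matches the motivation given for Record.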

Implementation

To make a simpler diff, SimpleTable is entirely defined in terms of Table. i.e. here is the signature:

class SimpleTable[C](
    using name: sourcecode.Name,
    metadata0: Table.Metadata[[T[_]] =>> SimpleTable.MapOver[C, T]]
) extends Table[[T[_]] =>> SimpleTable.MapOver[C, T]](using name, metadata0)

The metadata0 argument is expected to be generated automatically from an inline given in SimpleTableMacros.scala (I suggest renaming it to SimpleTableDerivation.scala).

Table[V[_[_]]], being higher-kinded, normally expects some case class Foo[T[_]], and fills in V[Expr] or V[Column] in various places in queries, and V[Sc] for results. However, for SimpleTable, when T[_] is scalasql.Sc we want to return C, and otherwise return this Record[C, T], so MapOver needs to be a match type:

object SimpleTable {
  type MapOver[C, T[_]] = T[Internal.Tombstone.type] match {
    case Internal.Tombstone.type => C // T is `Sc`
    case _ => Record[C, T]
  }
}

(Tombstone is used here to try and introduce a unique type that would never be used for any other purpose, i.e. be disjoint in the eyes of the match type resolver - also so we can convince ourselves that if T returns Tombstone it is probably the identity and not some accident.)
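The Tombstone trick can be tried in isolation. The sketch below redefines everything locally (`Tombstone`, `Id`, `Box`, `Rec`, and `MapOverDemo` are hypothetical stand-ins, not the library's definitions) and only checks how the match type reduces.

```scala
object Tombstone                        // a type nothing else will ever produce
type Id[A] = A                          // plays the role of Sc
final case class Box[A](value: A)       // plays the role of Expr or Column
final class Rec[C, T[_]]                // plays the role of Record

// Apply T to the tombstone: only an identity-like T gives the tombstone back.
type MapOverDemo[C, T[_]] = T[Tombstone.type] match
  case Tombstone.type => C
  case _              => Rec[C, T]

// With Id, the scrutinee is Tombstone.type itself, so the result is C.
val plain: MapOverDemo[String, Id] = "a plain row"

// With Box, Box[Tombstone.type] is provably disjoint from Tombstone.type
// (two unrelated classes), so the second case is selected.
val wrapped: MapOverDemo[String, Box] = new Rec[String, Box]
```

Both value definitions only compile if the match type reduces as described, so the file type-checking is itself the demonstration.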

See #83 for another approach that removes the V[_[_]] from Table, Insert and various other places.

Design of Record

Record[C, T[_]] is implemented as a structural type that tries to wrap the fields of C in T. It has a few design constraints:

When C has a field of type X that is a nested Table, the corresponding field in Record[C, T] must also be Record[X, T].
when selecting a nested Record, preserve which type (e.g. Expr or Column) to wrap fields in from the outer level.
simple types in the IDE

First decision: Record uses a Fields type member for structural selection, rather than the traditional type refinements.
Why:

can be constructed without macros
internals can be based on Array rather than a hash map,
Fields derived via NamedTuple.From[C] is treated as part of the class implementation, this means you never get a huge refinement type showing up whenever you hover in the IDE.

Second decision: how to decide which fields are "scalar" data and which are nested records.
Constraints:

previously with Table, the only evidence that a field of type X is a nested table is implicit evidence of type Table.ImplicitMetadata[X]
match types can only dispatch on statically known information, and there is not currently any match type (or scala.compiletime intrinsic) that can tell you if there exists an implicit of type X.

Choices:

pre-compute the transitive closure of all possible nested fields as a third type argument to Record, which in typical cases would be empty
require that the field have some marker, e.g. foo: Ref[Foo], unclear how much this would be intrusive at each use-site
introduce a marker class (SimpleTable.Nested) that the nested case class should extend - this does however prevent using "third party" classes as a nested table

The third option was chosen. The implicit derivation of Metadata also enforces that whenever an implicit metadata is discovered for use as a field, the class must extend SimpleTable.Nested.

object SimpleTable {

  // needs to be a class so the match type reducer can "prove disjoint" to various other types. 
  abstract class Nested

  final class Record[C, T[_]](private val data: IArray[AnyRef]) extends Selectable {
    /**
     * For each field `x: X` of class `C` there exists a field `x` in this record of type
     * `Record[X, T]` if `X` is a case class that represents a table, or `T[X]` otherwise.
     */
    type Fields = NamedTuple.Map[
      NamedTuple.From[C],
      [X] =>> X match {
        case SimpleTable.Nested => Record[X, T]
        case _ => T[X]
      }
    ]
  }
}
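The per-field decision inside Fields can be checked on its own. In this sketch `Marker`, `Wrapper`, `NestedRec`, `FieldOf`, and `Address` are hypothetical local names mirroring Nested, T, and Record; the real Fields type applies the same kind of match to every field through NamedTuple.Map.

```scala
abstract class Marker                   // plays the role of SimpleTable.Nested
final case class Wrapper[A](value: A)   // plays the role of Expr or Column
final class NestedRec[C, T[_]]          // plays the role of Record

// Fields extending the marker become nested records; everything else is wrapped in T.
type FieldOf[X, T[_]] = X match
  case Marker => NestedRec[X, T]
  case _      => T[X]

final case class Address(street: String) extends Marker

// Int and String are final classes unrelated to Marker, so the fallback case applies.
val scalarInt = summon[FieldOf[Int, Wrapper] =:= Wrapper[Int]]
val scalarStr = summon[FieldOf[String, Wrapper] =:= Wrapper[String]]

// Address is a subtype of Marker, so the first case applies.
val nested = summon[FieldOf[Address, Wrapper] =:= NestedRec[Address, Wrapper]]
```

This also shows why the marker has to be a class: the reducer can only fall through to `case _` once it has proved the field type disjoint from the marker, which is possible for unrelated classes.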

Alternatives

Why is there Record[C, T] and not ExprRecord[C] or ColumnRecord[C] classes?

This was explored in #83, which requires a large change to the scalasql.query package, i.e. a new type hierarchy for Table (but makes more explicit in types the boundary between read-only queries, column updates, and results). It's also unclear if it relies upon "hacks" to work.

Why use Record[C, T] and not named tuples in queries?

It's almost impossible (and expensive when possible) to preserve the mapping that a large named tuple type (with no reference to the original class) should map back to the class after running the query.
Also would be ambiguous with when you explicitly want to return a named tuple, rather than map back to the table class.
Record is a very cheap association directly back to the class it derives from, it also is a compact type if ever needed to be written explicitly, or shown by an IDE.

What is needed to get rid of SimpleTable.Nested?

Let's remind ourselves of the current definition of SimpleTable:

class SimpleTable[C](
    using name: sourcecode.Name,
    metadata0: Table.Metadata[[T[_]] =>> SimpleTable.MapOver[C, T]]
) extends Table[[T[_]] =>> SimpleTable.MapOver[C, T]](using name, metadata0) {
  given simpleTableGivenMetadata: SimpleTable.GivenMetadata[C] =
    SimpleTable.GivenMetadata(metadata0)
}

First thing - we determined that the transitive closure of available implicit SimpleTable.GivenMetadata[Foo] needs to be added as an argument to Record.

In #82 we explored this by just precomputing all the field types ahead of time in a macro, so the types would look a bit like Record[City, Expr, (id: Expr[Long], name: Expr[String], nested: (fooId: Expr[Long], ...))] which was very verbose.

An alternative could be to pass as a type parameter the classes which have a metadata defined. Something like Record[City, Expr, Foo | Bar] or Record[Foo, Expr, Empty.type], and modify the Record class as such:

-final class Record[C, T[_]](private val data: IArray[AnyRef]) extends Selectable {
+final class Record[C, T[_], <TABLES>](private val data: IArray[AnyRef]) extends Selectable {
    /**
     * For each field `x: X` of class `C` there exists a field `x` in this record of type
     * `Record[X, T]` if `X` is a case class that represents a table, or `T[X]` otherwise.
     */
    type Fields = NamedTuple.Map[
      NamedTuple.From[C],
-     [X] =>> X match {
-       case Nested => Record[X, T]
+     [X] =>> IsSub[X, <TABLES>] match {
+       case true => Record[X, T]
        case _ => T[X]
      }
    ]
  }

This could be a sweet spot between verbosity and extensibility to "uncontrolled" third party classes, but it is uncertain who in reality would be blocked by needing to extend SimpleTable.Nested. The potential impact on compilation times is also still to be determined, as is the best place to compute this type without causing explosions of implicit searches.
You can see a prototype here: bishabosha/scalasql#table-named-tuples-infer-nested-tables

Build changes

introduce top level scalasql-namedtuples module

publishes as com.lihaoyi:scalasql-namedtuples_3
scalaVersion 3.7.0
sources are located in scalasql/namedtuples
depends on module scalasql("3.6.2") - so that it can re-export all of scalasql from the scalasql.simple package object

Also declare scalasql-namedtuples.test module

sources in scalasql/namedtuples/test
depends on module scalasql("3.6.2").test, so the custom test framework can be used to capture test results.

Testing changes

The main approach to testing was to copy test sources that already exist, and convert them to use SimpleTable with otherwise no other changes.

Assumptions made when copying:

the majority of existing scalasql tests are testing the query translation to SQL, rather than specifically the implementation of Table.Metadata generated by macros.
Since the only differences between using SimpleTable and Table are the signatures of the implicits available and the implementation of Table.Metadata, the test coverage for SimpleTable should focus on type checking, and on checking that the fundamentals of Table.Metadata are implemented correctly in a "round trip".
so I copied the tests from scalasql/test/src/ExampleTests.scala, scalasql/test/src/datatypes/DataTypesTests.scala and scalasql/test/src/datatypes/OptionalTests.scala, renaming the traits and switching from Table to SimpleTable, otherwise unchanged.
I also had to copy scalasql/test/src/ConcreteTestSuites.scala to scalasql/namedtuples/test/src/SimpleTableConcreteTestSuites.scala, commenting out most objects except OptionalTests and DataTypesTests, which now extend the duplicated and renamed suites. I also renamed the package to scalasql.namedtuples
finally I also copied scalasql/test/src/WorldSqlTests.scala (to scalasql/namedtuples/test/src/example/WorldSqlTestsNamedTuple.scala) to ensure that every example in tutorial.md compiles after switching to SimpleTable, and also to provide snippets I will include in the tutorial.md.
I also renamed a few tests in the duplicates of OptionalTests.scala and DataTypesTests.scala so that they would generate unique names that can be included in reference.md.

New tests:

demonstrations of returning Named Tuples from the various SimpleTableH2Example tests
scalasql/namedtuples/test/src/datatypes/LargeObjectTest.scala to stress test the compiler for large sized classes.
scalasql/namedtuples/test/src/example/foo.scala for quick testing of compilation, typechecking etc.
replacement of case class copy method with Record#updates in SimpleTableOptionalTests.scala

Documentation changes

tutorial.md and reference.md are generated from scala source files and test results in docs/generateDocs.mill.

I decided that rather than duplicate both tutorial.md and reference.md for SimpleTable, it would be better to avoid duplication, or potential drift, by reusing the original documents, but include specific notes when use of SimpleTable or NamedTupleQueryable adds new functionality or requires different code.

tutorial.md

To update tutorial.md I wrote the new text as usual in WorldSqlTests.scala. These texts exclusively talk about differences between the two approaches, such as declaring case classes, returning named tuples, or using the updates method on record. To support the new texts, I needed to include code snippets. But like in WorldSqlTests.scala I would prefer the snippets to be verified in a test suite. So the plan was to copy WorldSqlTests.scala to a new file, update the examples to use SimpleTable and include snippets from there.

To support including snippets from another file I updated the generateTutorial task in docs/generateDocs.mill. The change was that if the scanner sees a line // +INCLUDE SNIPPET [FOO] somefile in WorldSqlTests.scala, then it switches to reading the lines from somefile, looking for the first line containing // +SNIPPET [FOO], then splices all lines of somefile until it reaches a line containing // -SNIPPET [FOO], then it switches back to reading the lines in WorldSqlTests.scala.

The main idea is that snippets within somefile should be declared in the same order that they are included from WorldSqlTests.scala, meaning that the scanner traverses both files from top to bottom once (beginning from the previous position whenever switching back).
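The scanning rule described above can be sketched as a small pure function. This is not the code in docs/generateDocs.mill: `spliceSnippets` and `readFile` are hypothetical names, and the sketch assumes that the marker lines themselves are excluded from the spliced output and that a missing snippet is an error.

```scala
import scala.collection.mutable

/** Replaces each `// +INCLUDE SNIPPET [TAG] file` line in `mainLines` with the
  * lines found between `// +SNIPPET [TAG]` and `// -SNIPPET [TAG]` in that file.
  * Each included file is scanned forward only, resuming from where the previous
  * snippet ended, so snippets must be declared in inclusion order.
  */
def spliceSnippets(
    mainLines: Seq[String],
    readFile: String => IndexedSeq[String] // hypothetical: however the build loads sources
): Seq[String] =
  val IncludeLine = """.*// \+INCLUDE SNIPPET \[(\w+)\] (\S+).*""".r
  val cursors = mutable.Map.empty[String, Int] // next unread line per included file
  val out = Seq.newBuilder[String]
  for line <- mainLines do
    line match
      case IncludeLine(tag, file) =>
        val lines = readFile(file)
        val from = cursors.getOrElse(file, 0)
        val start = lines.indexWhere(_.contains(s"// +SNIPPET [$tag]"), from)
        require(start >= 0, s"snippet [$tag] not found in $file after line $from")
        val end = lines.indexWhere(_.contains(s"// -SNIPPET [$tag]"), start + 1)
        require(end >= 0, s"snippet [$tag] is not closed in $file")
        out ++= lines.slice(start + 1, end) // body only, markers dropped (assumption)
        cursors(file) = end + 1             // resume after this snippet next time
      case other =>
        out += other
  out.result()
```

Keeping one cursor per included file is what makes the single top-to-bottom pass possible: a snippet declared earlier in the file than the previous one would simply not be found.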

So to declare the snippets as mentioned above I copied WorldSqlTests.scala to scalasql/namedtuples/test/src/example/WorldSqlTestsNamedTuple.scala, replaced Table by SimpleTable and declared in there the snippets I wanted (and included them from WorldSqlTests.scala).

Any other changes (e.g. newlines, indentation etc) are likely due to updating scalafmt.

reference.md

This file is generated by the generateReference task in docs/generateDocs.mill. It works by formatting the data from out/recordedTests.json (captured by running tests with a custom framework) and grouping tests by the suite they occur in.

Like with tutorial.md I thought it best to only add extra snippets that highlight the differences between the two kinds of table.

First, to capture the output of the simple table tests, I set the SCALASQL_RECORDED_TESTS_NAME and SCALASQL_RECORDED_SUITE_DESCRIPTIONS_NAME environment variables in the build of the scalasql-namedtuples.test module, in this case to recordedTestsNT.json and out/recordedSuiteDescriptionsNT.json.

Next I updated the generateReference task so that it also includes the recorded outputs from recordedTestsNT.json. This task handles grouping of tests and removing duplicates (e.g. the mysql, h2 variants). I made it so that for each suite, e.g. DataTypes, it finds the equivalent suite in the simple table results, and then only includes the test names it hadn't seen, at the end of that suite.

So therefore to include any test result from SimpleTableDataTypesTests.scala or SimpleTableOptionalTests.scala, it is only necessary to rename an individual test, and it will be appended to the bottom of the relevant group in reference.md. For this PR I did this by adding a - with SimpleTable suffix to relevant tests (i.e. the demonstration of nested classes, and the usage of Record#updates method)
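The merging rule can be illustrated with a short sketch. It is a simplification under stated assumptions: `mergeSuites` is a hypothetical name, each suite is reduced to a list of test names, and the "equivalent suite" is assumed to be found under the same key (how the real task matches suites is not shown here).

```scala
/** For every suite in `base`, append the test names from the matching suite in
  * `simple` that `base` has not already recorded, preserving their order.
  */
def mergeSuites(
    base: Map[String, Seq[String]],
    simple: Map[String, Seq[String]]
): Map[String, Seq[String]] =
  base.map { (suite, tests) =>
    val seen = tests.toSet
    val extra = simple.getOrElse(suite, Nil).filterNot(seen) // only unseen names
    suite -> (tests ++ extra)
  }
```

Under this rule a copied test that keeps its original name is dropped as a duplicate, which is why renaming a test (for example with a suffix) is enough to make it show up in reference.md.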

bishabosha · 2025-05-17T16:08:44Z

@lihaoyi pushed a queryable for arbitrary named tuples, and now only nested case classes need to extend SimpleTable.Source

bishabosha · 2025-05-18T16:15:26Z

I've been experimenting with augmenting class Record[C, T[_]] with two subclasses to "improve" inferred types: namely, convert Record to trait Record[C, +T[_]], and introduce ExprRecord[C] extends Record[C, Expr] and ColRecord extends ExprRecord[C] with Record[C, Column], because in the end the actual data would always be a ColRecord, just sometimes with a "view" as an ExprRecord (e.g. for V[Expr]).

However this is not possible, as a covariant T cannot be referenced from the Fields type of Record, and otherwise if T is invariant (or a member type) then it is illegal to override. And ColRecord must extend ExprRecord, or else there would be a class cast exception somewhere.

so I think it will have to remain as is

bishabosha · 2025-05-18T16:44:32Z

latest commit renames Lift to MapOver, and removes some duplication

bishabosha · 2025-05-19T19:07:06Z

moved the named tuple queryable stuff to a dedicated file.

the queryables function should only cache factories that still require the mappers to be provided.

bishabosha · 2025-05-21T05:06:00Z

now I put everything available in one import, import scalasql.simple.{*, given}, which exports all of scalasql.package, NamedTupleQueryable.given and SimpleTable

bishabosha · 2025-05-24T15:18:15Z

@lihaoyi I have updated the PR description with an explanation of choices made

lihaoyi · 2025-05-24T15:19:07Z

Thanks, will take a look

bishabosha · 2025-05-26T22:26:33Z

@lihaoyi I have updated the PR description with changes to the build, documentation, and testing

bishabosha · 2025-05-31T20:33:47Z

here is a benchmark to compare against Table; it seems to be a 1.5x increase in compilation time - https://gist.github.com/bishabosha/fd52492bb8977fa89e8dc28239ee48a6

good to know anyway, perhaps there are tweaks that can be made before anyone immediately jumps to macros

lihaoyi · 2025-05-31T20:40:16Z

Not surprising there's some performance penalty, but it's probably fine. ScalaSql isn't meant to be hyper-optimized and generally the bottleneck for database queries is on the actual database anyway, rather than in application code

bishabosha added 8 commits April 23, 2025 22:01

bump mill-version (critical)

fafb80a

wip: add new module

d437520

wip: simple table and inline defs

716309a

wip: queryable

5364d93

wip: vExpr

b47953b

wip: use record type that maps back to case class

8d7fcbc

wip: pass example tests for each dialect

3fa8eb3

wip: pass datatypes test, introduce marker class

35fea06

bishabosha force-pushed the feature/table-named-tuples branch from 1501969 to 496dc23 Compare May 5, 2025 16:30

bishabosha added 6 commits May 5, 2025 18:39

pass optiontests, add record updater

63b2fce

Update scalafmt to suppoer 3.7 syntax

20fd1e1

REFORMAT SOURCES

96b759d

Set semanticdb version explicitly

f20982f

format reference

f83f2af

add large object test

5e6edca

bishabosha force-pushed the feature/table-named-tuples branch from 496dc23 to 5e6edca Compare May 5, 2025 16:39

bishabosha mentioned this pull request May 5, 2025

SimpleTable with Record, w/o marker class #82

Closed

bishabosha added 4 commits May 11, 2025 16:41

dont print AST while compiling

17c7012

dont blow stack while computing size of case class

aa2f0ea

only require SimpleTable.Source for nested

678360e

add named tuple queryable

e415f63

switch order of Lift

1b5ab9b

bishabosha force-pushed the feature/table-named-tuples branch from 944299f to 006172d Compare May 18, 2025 16:43

bishabosha added 3 commits May 20, 2025 22:14

rename Lift, remove rowExpr

65e0159

only cache the factories

5c606bf

the queryables function should only cache factories that still require the mappers to be provided.

move named tuple querable to new file

9efa287

bishabosha force-pushed the feature/table-named-tuples branch from e0f83fe to 339eb4b Compare May 20, 2025 20:37

remove some wrapper classes

6cf41b2

bishabosha force-pushed the feature/table-named-tuples branch from 339eb4b to 6cf41b2 Compare May 20, 2025 21:10

reexport everything via scalasql.simple package

cf16d30

Add scaladoc to SimpleTable

487ce9e

bishabosha force-pushed the feature/table-named-tuples branch 4 times, most recently from aca6099 to e1e5cf6 Compare May 23, 2025 23:19

wip: tutorial.md

0de7f5c

bishabosha force-pushed the feature/table-named-tuples branch from e1e5cf6 to 0de7f5c Compare May 23, 2025 23:39

fix formatting

40c3237

bishabosha force-pushed the feature/table-named-tuples branch from 4600d21 to 40c3237 Compare May 24, 2025 11:20

record SimpleTable tests in reference.md

88ede3a

bishabosha force-pushed the feature/table-named-tuples branch from a7997c1 to 88ede3a Compare May 24, 2025 12:27

record more SimpleTable tests in reference.md

f152114

bishabosha force-pushed the feature/table-named-tuples branch from 410c2c6 to f152114 Compare May 24, 2025 12:44

bishabosha mentioned this pull request May 25, 2025

SimpleTable with separate classes for Columns and Record (of Expr) #83

Closed

lihaoyi merged commit eb6bf3a into com-lihaoyi:main May 28, 2025
6 checks passed

bishabosha deleted the feature/table-named-tuples branch May 28, 2025 06:57
