Skip to content

Conversation

@vergenzt
Copy link

@vergenzt vergenzt commented Dec 12, 2025

I tried to make no semantic changes to anything, but that could use some verifying. I did make substantial editorial decisions about what "belongs" where -- totally open to feedback on any of it.

I may try to do some sort of automated analysis to confirm that all of the rules expressions from before and after are the same. We'll see.

Would you be open to merging something like this?

I tried to make no semantic changes to anything, but that could use some verifying.
@vergenzt
Copy link
Author

vergenzt commented Dec 13, 2025

Okay I was able to figure out a good way to do that validation. I used an ast-grep rule to fetch all relevant grammar rule JS key/value pairs, normalized whitespace for each (since I dedented when extracting to the grammar/ folder), then sorted and wrote to a file and was able to diff the files.

On both main and this branch, I ran the following:

ast-grep scan --json --inline-rules='id: _
           language: js
           rule:
             kind: pair
             inside:
               kind: object
               inside:
                 any:
                   - pattern: module.exports = { $$$ }
                   - kind: pair
                     has:
                       field: key
                       regex: rules
' grammar* | jq '.[].text | gsub("\\\\s+"; " ")' | sort > rules-$(git rev-parse HEAD)

This then produced the following files:

  • rules-224f3a49757b4997fc89b89fe31b84c635002534 (all rules detected at 224f3a4, tip of this branch)
  • rules-fe77f6868d6cdea593052a6af390116495093dc1 (all rules detected at fe77f68, current tip of main)

Then, git diff --no-index rules-fe77f6868d6cdea593052a6af390116495093dc1 rules-224f3a49757b4997fc89b89fe31b84c635002534 for me yields:

diff --git a/rules-fe77f6868d6cdea593052a6af390116495093dc1 b/rules-224f3a49757b4997fc89b89fe31b84c635002534
index c025468..320e991 100644
--- a/rules-fe77f6868d6cdea593052a6af390116495093dc1
+++ b/rules-224f3a49757b4997fc89b89fe31b84c635002534
@@ -148,7 +148,7 @@
 "direction: $ => choice($.keyword_desc, $.keyword_asc)"
 "distinct_from: $ => seq($.keyword_is, $.keyword_distinct, $.keyword_from)"
 "dollar_quote: () => /\\$[^\\$]*\\$/"
-"double: $ => choice( make_keyword(\"float8\"), unsigned_type($, parametric_type($, $.keyword_double, ['precision', 'scale'])), unsigned_type($, parametric_type($, seq($.keyword_double, $.keyword_precision), ['precision', 'scale'])), unsigned_type($, parametric_type($, $.keyword_real, ['precision', 'scale'])), )"
+"double: $ => choice( $.keyword_float8, unsigned_type($, parametric_type($, $.keyword_double, ['precision', 'scale'])), unsigned_type($, parametric_type($, seq($.keyword_double, $.keyword_precision), ['precision', 'scale'])), unsigned_type($, parametric_type($, $.keyword_real, ['precision', 'scale'])), )"
 "drop_column: $ => seq( $.keyword_drop, optional( $.keyword_column, ), optional($._if_exists), field('name', $.identifier), )"
 "drop_constraint: $ => seq( $.keyword_drop, $.keyword_constraint, optional($._if_exists), $.identifier, optional($._drop_behavior), )"
 "drop_database: $ => prec.left(seq( $.keyword_drop, $.keyword_database, optional($._if_exists), $.identifier, optional($.keyword_with), optional($.keyword_force), ))"
@@ -306,6 +306,7 @@
 "keyword_filter: _ => make_keyword(\"filter\")"
 "keyword_first: _ => make_keyword(\"first\")"
 "keyword_float: _ => make_keyword(\"float\")"
+"keyword_float8: _ => make_keyword(\"float8\")"
 "keyword_following: _ => make_keyword(\"following\")"
 "keyword_follows: _ => make_keyword(\"follows\")"
 "keyword_for: _ => make_keyword(\"for\")"

... which of course now reminds me that I did change one thing, which was to extract a separate keyword_float8 rule instead of the previous inline call to make_keyword. 😄

Looks like that's confirmed to be the only change though!!

@DerekStride
Copy link
Owner

I like the idea here, splitting everything out into separate modules seems nice, are you aware of other projects that make use of this pattern? We're already quite different in that we don't commit the generated code so I worry about adding more friction.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants