Next-generation Tree-sitter Java binding. A "Java-first" fork optimized for modern developer experience, safety, and ecosystem breadth.
- Ecosystem Breadth: Expanded support for the modern stack (Kotlin, Zig, Angular, Vue.js), supplementing the official grammars maintained by the upstream project.
- Modern Java Ergonomics: Moving away from C-style wrappers toward a library that feels native to modern Java, while maintaining a low Java 11+ baseline requirement.
Note on Building: While the compiled library targets Java 11, building the project from source requires JDK 17+ (due to Gradle 9 requirements).
- Strict Null Safety: Integration with JSpecify and Error Prone for compile-time safety at the JNI boundary.
- Idiomatic Patterns: Lazy collection patterns (e.g., `getNamedChildren()`) and strict handling (e.g., `parseStringOrThrow()`).
- Advanced Query Support: First-class support for Tree-sitter predicates and directives (e.g., `#eq?`, `#match?`, `#set!`) directly within the Java API.
- Resource Management: Automated native memory management using the Cleaner API with `AutoCloseable` support.
```java
try (TSParser parser = new TSParser();
     TSLanguage json = new TreeSitterJson()) {
    parser.setLanguage(json);
    // Use parseStringOrThrow for strict null handling
    try (TSTree tree = parser.parseStringOrThrow(null, "[1, null]")) {
        TSNode rootNode = tree.getRootNode();
        // Access children via index
        TSNode arrayNode = rootNode.getNamedChild(0);
        // Or use the new lazy list pattern for easier iteration
        for (TSNode child : arrayNode.getNamedChildren()) {
            System.out.println(child.getType());
        }
    }
}
```

We maintain both official and high-demand community grammars.
| Language | Source | Support Level |
|---|---|---|
| Java, Python, C++, Go, etc. | Official | Upstream grammars, bundled here |
| Kotlin, Zig | Community | Maintained & packaged in this fork |
| Vue, Angular | Community | Extended support for web stack |
We bridge the gap between Java's GC and C's manual memory management using a dual-layered approach:
- AutoCloseable: Primary resources (Parsers, Trees, Cursors) implement `AutoCloseable` for deterministic cleanup via `try-with-resources`.
- Cleaner API: A `Cleaner` fallback ensures that if a Java object is garbage collected without being closed, the underlying native memory is still freed, preventing leaks in long-running processes.
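The dual-layer cleanup described above can be sketched with the JDK's `java.lang.ref.Cleaner`. This is only an illustration: `NativeResource` and its free counter are hypothetical stand-ins, and the library's real classes release native memory rather than incrementing a counter, so their internals may differ.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a native-backed object; not a class from this library.
final class NativeResource implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // The cleanup action must not reference the owning NativeResource,
    // otherwise the owner could never become unreachable.
    private static final class FreeAction implements Runnable {
        private final AtomicInteger freeCount;

        FreeAction(AtomicInteger freeCount) {
            this.freeCount = freeCount;
        }

        @Override
        public void run() {
            // A real binding would call the native free function here.
            freeCount.incrementAndGet();
        }
    }

    private final AtomicInteger freeCount = new AtomicInteger();
    private final Cleaner.Cleanable cleanable;

    NativeResource() {
        // Layer 2: the Cleaner runs FreeAction if this object is collected unclosed.
        this.cleanable = CLEANER.register(this, new FreeAction(freeCount));
    }

    int freeCount() {
        return freeCount.get();
    }

    @Override
    public void close() {
        // Layer 1: deterministic cleanup. Cleanable.clean() runs the action
        // at most once, so repeated close() calls do not free twice.
        cleanable.clean();
    }
}
```

Used in a `try (NativeResource r = new NativeResource()) { ... }` block, the action runs when the block exits; if the block is forgotten, the same action is expected to run some time after the object becomes unreachable.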
By using JSpecify annotations and Error Prone static analysis, we enforce null-safety across the JNI
boundary at compile time. This helps keep the "C-heavy" nature of Tree-sitter from surfacing as NullPointerException or JVM crashes in
your Java application.
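The split between a nullable call and a strict `...OrThrow` variant can be pictured with a small stand-alone sketch. Everything here is hypothetical: `NullBoundarySketch`, `parseOrNull`, and `parseOrThrow` are illustrative names rather than this library's API, and the JSpecify annotations are left out so the snippet compiles without extra dependencies.

```java
import java.util.Objects;

// Hypothetical illustration of a nullable call paired with a strict variant.
final class NullBoundarySketch {
    private NullBoundarySketch() {}

    // Stand-in for a native call that may hand back NULL (modelled as Java null).
    static String parseOrNull(String source) {
        return source.isEmpty() ? null : "tree(" + source + ")";
    }

    // Strict variant: turns a null result into an immediate, descriptive exception
    // instead of letting it travel further into application code.
    static String parseOrThrow(String source) {
        Objects.requireNonNull(source, "source must not be null");
        String tree = parseOrNull(source);
        if (tree == null) {
            throw new IllegalStateException("parsing produced no tree");
        }
        return tree;
    }
}
```

With nullness annotations on both methods, a checker could flag callers that dereference the nullable result without a check, while callers of the strict variant need no null handling.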
We use Zig as our C/C++ compiler toolchain. This allows us to produce perfectly matched native binaries for Linux, macOS, and Windows (x86_64 and aarch64) from a single CI environment without complex cross-compilation headers.
```bash
# Compile Java and native modules
./gradlew compile
# Build and test all subprojects
./gradlew build
# (Re)generate NodeType/NodeField/NodeSchema sources for a language module
# Produces/updates src/main/java/org/treesitter/<Lang>NodeType.java (+ NodeField/NodeSchema) from upstream node-types.json
./gradlew :tree-sitter-tsx:generateNodeTypes
# (Re)generate NodeType/NodeField/NodeSchema sources for all language modules
./gradlew :generateNodeTypes
```

Most upstream tree-sitter grammars publish a `node-types.json` file describing supported node types. This repo can
generate and ship:
- `org.treesitter.<Lang>NodeType`: named node types (enum)
- `org.treesitter.<Lang>NodeField`: field names (enum)
- `org.treesitter.<Lang>NodeSchema`: lightweight schema helpers derived from `node-types.json`
Example (TSX): org.treesitter.TsxNodeType.ABSTRACT_CLASS_DECLARATION.getType().equals("abstract_class_declaration").
You can also do TsxNodeType.from(node) / TsxNodeType.fromType(node.getType()) which return TsxNodeType.__NULL__
for null input (or unknown types).
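The sentinel behaviour of `from` / `fromType` can be illustrated with a miniature enum. This is a sketch under assumptions: `DemoNodeType` is a hypothetical hand-written stand-in, and the generated `<Lang>NodeType` classes may be implemented differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical miniature of a generated node-type enum; not generated code.
enum DemoNodeType {
    ARRAY("array"),
    NUMBER("number"),
    __NULL__(""); // sentinel returned for null or unknown type names

    private static final Map<String, DemoNodeType> BY_TYPE = new HashMap<>();

    static {
        // Index every real constant by its grammar type name.
        for (DemoNodeType t : values()) {
            if (t != __NULL__) {
                BY_TYPE.put(t.type, t);
            }
        }
    }

    private final String type;

    DemoNodeType(String type) {
        this.type = type;
    }

    String getType() {
        return type;
    }

    // Never returns null: unknown or null names map to the sentinel.
    static DemoNodeType fromType(String type) {
        if (type == null) {
            return __NULL__;
        }
        return BY_TYPE.getOrDefault(type, __NULL__);
    }
}
```

Because the lookup always yields an enum constant, callers can `switch` on the result directly and treat `__NULL__` as the default branch instead of adding a separate null check.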
Field/schema usage (TSX):

```java
Set<TsxNodeField> possibleFields = TsxNodeSchema.fields(TsxNodeType.FUNCTION_DECLARATION);
Set<TsxNodeType> allowedNameTypes = TsxNodeSchema.allowedTypes(TsxNodeType.FUNCTION_DECLARATION, TsxNodeField.NAME);
boolean isNameRequired = TsxNodeSchema.isRequired(TsxNodeType.FUNCTION_DECLARATION, TsxNodeField.NAME);
Set<TsxNodeType> allowedChildTypes = TsxNodeSchema.allowedChildTypes(TsxNodeType.FUNCTION_DECLARATION);
```

Generation is done via the Gradle task `:tree-sitter-<lang>:generateNodeTypes`, and the generated sources are checked in
under each subproject's `src/main/java` so they ship in published artifacts without requiring codegen at consumer build
time.
The project distinguishes between the Java library version (`libVersion`) and the upstream grammar version
(`upstreamVersion`). We are not currently working on Maven Central publishing. For now, we provide a pre-bundled ZIP
to ensure all native binaries are perfectly matched to the library version.
We use a lockstep versioning strategy for releases. This means that every module in the repository shares the exact
same libVersion (e.g., 0.1.0).
When a new release is cut, all modules are published with this new version number, regardless of whether their specific
parser or upstream grammar changed.
This provides a simple and predictable experience: you only ever need to specify one version number for all
ai.brokk:tree-sitter-* dependencies in your build file, and they are guaranteed to be perfectly compatible with each
other.
When building or publishing a new release of the Java bindings, specify the libVersion:
```bash
# Build with version
./gradlew build -PlibVersion=0.1.0
# Publish with version
./gradlew publish -PlibVersion=0.1.0
```

The `upstreamVersion` is managed in each subproject's `gradle.properties` and controls which version of the native
tree-sitter C code is downloaded and compiled.
Note: Native binaries are generated into `src/main/resources/lib` during the build process and are ignored by Git. They are built automatically in CI and do not need to be committed to the repository.
- Wide Compatibility: Low Java 11 minimum requirement.
- 100% Tree-sitter API coverage.
- Easy-to-bootstrap cross-compilation environments powered by Zig.
- Built-in official parsers.
- Load parsers as shared objects from disk.
- x86_64-windows
- x86_64-macos
- aarch64-macos
- x86_64-linux
- aarch64-linux
Currently, we distribute perfectly matched native binaries via a pre-bundled ZIP to avoid Git history bloat. For full
instructions on how to automate fetching and caching these dependencies via Gradle flatDir, please see
our Installation Guide.
Want to add a new community grammar? Check out our Guide to Adding Parsers to see how our code-generation task handles the boilerplate.
| Name | Grammar Version | Source |
|---|---|---|
| tree-sitter-agda | 1.3.3 | official |
| tree-sitter-angular | 0.8.3 | community |
| tree-sitter-bash | 0.25.1 | official |
| tree-sitter-c | 0.24.1 | official |
| tree-sitter-c-sharp | 0.23.1 | official |
| tree-sitter-cpp | 0.23.4 | official |
| tree-sitter-css | 0.25.0 | official |
| tree-sitter-embedded-template | 0.25.0 | official |
| tree-sitter-go | 0.25.0 | official |
| tree-sitter-haskell | 0.23.1 | official |
| tree-sitter-html | 0.23.2 | official |
| tree-sitter-java | 0.23.5 | official |
| tree-sitter-javascript | 0.25.0 | official |
| tree-sitter-json | 0.24.8 | official |
| tree-sitter-julia | 0.25.0 | official |
| tree-sitter-kotlin | 0.3.8 | community |
| tree-sitter-ocaml | 0.23.2 | official |
| tree-sitter-php | 0.24.2 | official |
| tree-sitter-python | 0.25.0 | official |
| tree-sitter-regex | 0.25.0 | official |
| tree-sitter-ruby | 0.23.1 | official |
| tree-sitter-rust | 0.24.0 | official |
| tree-sitter-scala | 0.24.0 | official |
| tree-sitter-tsx | 0.23.2 | official |
| tree-sitter-typescript | 0.23.2 | official |
| tree-sitter-verilog | 1.0.3 | official |
| tree-sitter-vue | ce8011a4 | community |
| tree-sitter-zig | 6479aa13 | community |
```java
import java.io.File;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;
// Assumption: the binding classes live in org.treesitter, like the generated node types.
import org.treesitter.*;

class Main {
    public static void main(String[] args) throws Exception {
        String jsonSource = "[1, null]";
        // TSParser, TSLanguage, TSTree, TSQuery, TSQueryCursor, TSTreeCursor implement AutoCloseable.
        // They are also registered in the Cleaner, but explicit closing via try-with-resources is recommended.
        try (TSParser parser = new TSParser();
             TSLanguage json = new TreeSitterJson()) {
            // Set language parser
            parser.setLanguage(json);
            // Parse with string input
            try (TSTree tree = parser.parseString(null, jsonSource)) {
                assert tree != null;
                parser.reset();
                // Or parse with encoding
                try (TSTree tree2 = parser.parseStringEncoding(null, jsonSource, TSInputEncoding.TSInputEncodingUTF8)) {
                    // ...
                }
                parser.reset();
                // Or parse with custom reader
                byte[] buffer = new byte[1024];
                TSReader reader = (buf, offset, position) -> {
                    byte[] sourceBytes = jsonSource.getBytes(StandardCharsets.UTF_8);
                    // Returning 0 signals that there is no more input
                    if (offset >= sourceBytes.length) {
                        return 0;
                    }
                    // The whole source fits in the 1024-byte buffer, so copy it in one go
                    ByteBuffer byteBuffer = ByteBuffer.wrap(buf);
                    byteBuffer.put(sourceBytes);
                    return sourceBytes.length;
                };
                try (TSTree tree3 = parser.parse(buffer, null, reader, TSInputEncoding.TSInputEncodingUTF8)) {
                    assert tree3 != null;
                }
                // Traverse the AST tree with DOM-like APIs
                TSNode rootNode = tree.getRootNode();
                // Access children as a standard Java List
                List<TSNode> children = rootNode.getChildren();
                TSNode arrayNode = rootNode.getNamedChild(0);
                // Or traverse the AST with cursor
                try (TSTreeCursor rootCursor = new TSTreeCursor(rootNode)) {
                    rootCursor.gotoFirstChild();
                }
                // Or query the AST with S-expression using modern Stream API
                try (TSQuery query = new TSQuery(json, "((document) @root)");
                     TSQueryCursor cursor = new TSQueryCursor()) {
                    cursor.exec(query, rootNode);
                    // Use .stream() for functional patterns.
                    // Note: use .copy() if you need to collect matches, as the cursor reuses match objects!
                    List<TSQueryMatch> matches = cursor.stream()
                            .map(TSQueryMatch::copy)
                            .collect(Collectors.toList());
                    // Or use the enhanced for-loop (Iterable).
                    // Re-execute first, since the stream above has already advanced the cursor.
                    cursor.exec(query, rootNode);
                    for (TSQueryMatch match : cursor) {
                        System.out.println("Pattern index: " + match.getPatternIndex());
                    }
                }
                // Debug the parser with a logger
                TSLogger logger = (type, message) -> {
                    System.out.println(message);
                };
                parser.setLogger(logger);
                // Or output the AST tree as DOT graph
                File dotFile = File.createTempFile("json", ".dot");
                parser.printDotGraphs(dotFile);
            }
        }
    }
}
```
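The `.copy()` note in the example above is easier to see with a stand-alone sketch of an iterator that reuses one mutable object. `DemoMatch`, `ReusingCursor`, and `CopySketch` are hypothetical classes written for this illustration; they only mimic the reuse behaviour the note describes and say nothing about how `TSQueryCursor` is implemented.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical mutable match object that the cursor overwrites on every step.
final class DemoMatch {
    int patternIndex;

    DemoMatch copy() {
        DemoMatch snapshot = new DemoMatch();
        snapshot.patternIndex = patternIndex;
        return snapshot;
    }
}

// Iterator that updates and returns the same DemoMatch instance each time.
final class ReusingCursor implements Iterator<DemoMatch> {
    private final DemoMatch shared = new DemoMatch();
    private final int count;
    private int next = 0;

    ReusingCursor(int count) {
        this.count = count;
    }

    @Override
    public boolean hasNext() {
        return next < count;
    }

    @Override
    public DemoMatch next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        shared.patternIndex = next++;
        return shared;
    }
}

final class CopySketch {
    private CopySketch() {}

    // Collects matches with or without copying, then reads back their indices.
    static List<Integer> collect(int count, boolean copy) {
        List<DemoMatch> kept = new ArrayList<>();
        ReusingCursor cursor = new ReusingCursor(count);
        while (cursor.hasNext()) {
            DemoMatch match = cursor.next();
            kept.add(copy ? match.copy() : match);
        }
        List<Integer> indices = new ArrayList<>();
        for (DemoMatch match : kept) {
            indices.add(match.patternIndex);
        }
        return indices;
    }
}
```

Here `CopySketch.collect(3, false)` yields `[2, 2, 2]` because every list entry is the same object holding the last value, while `CopySketch.collect(3, true)` yields `[0, 1, 2]`.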