This file provides context for Claude Code sessions working on this codebase.
Duckgres is a PostgreSQL wire protocol server backed by DuckDB. It allows any PostgreSQL client (psql, pgAdmin, lib/pq, psycopg2, JDBC, etc.) to connect and execute queries against DuckDB databases.
Duckgres supports three run modes: standalone (default), control-plane, and duckdb-service.
Standalone: PostgreSQL Client → TLS → Duckgres Server → DuckDB (per-user database)
Control Plane: PostgreSQL Client → TLS/Auth/PG Protocol → Control Plane → Flight SQL (UDS) → Worker (DuckDB)
- main.go: Entry point, configuration loading (CLI flags, env vars, YAML), mode routing
- server/server.go: Server struct, connection handling, graceful shutdown,
CreateDBConnection()(standalone function) - server/conn.go: Client connection handling, query execution, COPY protocol
- server/protocol.go: PostgreSQL wire protocol message encoding/decoding
- server/exports.go: Exported wrappers for protocol functions (used by control plane workers)
- server/catalog.go: pg_catalog compatibility views and macros initialization
- server/types.go: Type OID mapping between DuckDB and PostgreSQL
- server/ratelimit.go: Rate limiting for brute-force protection
- server/certs.go: Auto-generation of self-signed TLS certificates
- server/sysinfo.go: System memory detection and auto memory limit computation
- server/parent.go: Child process spawning for ProcessIsolation mode
- server/worker.go: Per-connection child worker (ProcessIsolation mode)
- transpiler/: AST-based SQL transpiler (PostgreSQL → DuckDB)
transpiler.go: Main API, transform pipeline orchestrationconfig.go: Configuration types (DuckLakeMode, ConvertPlaceholders)transform/: Individual transform implementations
- controlplane/: Multi-process control plane architecture
control.go: Control plane main loop (TCP listener, TLS, auth, PG protocol, SQL transpilation, connection routing)worker_mgr.go: Flight SQL worker pool management (spawn, health check, least-connections routing, rolling update)session_mgr.go: Session lifecycle management (maps PG connections to Flight SQL sessions on workers)handover.go: Graceful deployment (listener FD transfer between control planes)sdnotify.go: systemd sd_notify integrationvalidation.go: Configuration validation
- duckdbservice/: Standalone DuckDB Arrow Flight SQL service (used as worker in control-plane mode)
service.go: Flight SQL server lifecycle, gRPC setupflight_handler.go: Arrow Flight SQL handler (DoPut, DoGet, GetFlightInfo, session management)arrow_helpers.go: Arrow/DuckDB type mapping and conversionauth.go: Bearer token authentication middlewareconfig.go: Service configuration (listen addr, bearer token, max sessions)
The server implements the PostgreSQL v3 protocol:
- Frontend (client→server): Query, Parse, Bind, Describe, Execute, Sync, Close, CopyData, CopyDone
- Backend (server→client): AuthOK, RowDescription, DataRow, CommandComplete, ReadyForQuery, CopyInResponse, CopyOutResponse
- Client sends Query message ('Q')
- Server parses SQL, rewrites pg_catalog references
- Server executes via DuckDB's database/sql driver
- Server sends RowDescription + DataRow messages
- Server sends CommandComplete + ReadyForQuery
Supports prepared statements (Parse/Bind/Execute) for parameterized queries and binary result formats.
psql and other clients expect PostgreSQL system catalogs. We provide compatibility by:
-
Creating views in main schema (server/catalog.go
initPgCatalog()):pg_database,pg_class_full,pg_collation,pg_policy,pg_rolespg_statistic_ext,pg_publication,pg_publication_rel,pg_inherits, etc.
-
Creating macros for PostgreSQL functions (server/catalog.go):
pg_get_userbyid,pg_table_is_visible,format_type,pg_get_exprobj_description,col_description,pg_get_indexdef, etc.
-
AST-based SQL transpilation (transpiler/ package): The transpiler parses PostgreSQL SQL into an AST using pg_query_go (PostgreSQL's C parser), applies transforms, and deparses back to DuckDB-compatible SQL. Transforms include:
- PgCatalogTransform:
pg_catalog.pg_class→pg_class_full, strips schema prefix from functions - TypeCastTransform:
::pg_catalog.regtype→::VARCHAR - VersionTransform:
version()→ PostgreSQL-compatible version string - SetShowTransform: Converts SET/SHOW commands, marks ignored parameters
- DDLTransform: (DuckLake mode) Strips PRIMARY KEY, UNIQUE, REFERENCES, SERIAL types
- PlaceholderTransform: Counts $1, $2 parameters for prepared statements
- PgCatalogTransform:
Supports bulk data transfer:
- COPY TO STDOUT: Streams query results to client
- COPY FROM STDIN: Receives data from client, inserts row by row
- Supports CSV format with HEADER, DELIMITER, and NULL options
- standalone (default): Single process, handles everything including TLS, auth, PG protocol, and DuckDB execution.
- control-plane: Multi-process. The control plane owns client connections end-to-end (TLS, auth, PG wire protocol, SQL transpilation) and routes queries to a pool of Flight SQL worker processes over Unix sockets.
- duckdb-service: Thin DuckDB execution engine exposed via Arrow Flight SQL. Spawned automatically by the control plane as worker processes, or run standalone for testing.
Key CLI flags for control plane mode:
--mode control-plane|duckdb-service|standalone--min-workers N(default 0, pre-warm workers at startup)--max-workers N(default 0 = unlimited)--memory-budget SIZE(e.g., "24GB", default: 75% system RAM)--socket-dir /path(Unix sockets for Flight SQL workers)--handover-socket /path(graceful deployment between control planes)
Key CLI flags for duckdb-service mode:
--duckdb-listen(listen address, e.g.,unix:///var/run/duckgres/duckdb.sockor:8816)--duckdb-token(bearer token for authentication)--duckdb-max-sessions(max concurrent sessions, 0=unlimited)
Three-tier configuration (highest to lowest priority):
- CLI flags (
--port,--config,--mode, etc.) - Environment variables (
DUCKGRES_PORT, etc.) - YAML config file
- Built-in defaults
The project uses just as a command runner. Run just to see all available recipes for building, testing, running, metrics, and scripts.
- Add view creation SQL in
initPgCatalog()incatalog.go - If the view needs query rewriting (e.g.,
pg_catalog.viewname→viewname):- Add mapping in
transpiler/transform/pgcatalog.goinpgCatalogViewMappings
- Add mapping in
- Add
CREATE MACROin thefunctionsslice ininitPgCatalog() - The transpiler automatically strips
pg_catalog.prefix from function calls
- Create a new file in
transpiler/transform/implementing theTransforminterface - Register the transform in
transpiler/transpiler.goNew()function - Add tests in
transpiler/transpiler_test.go
- Add message type constant in
protocol.go - Add write function (e.g.,
writeCopyData()) - Handle in message loop in
conn.go
github.com/duckdb/duckdb-go/v2- DuckDB Go drivergithub.1485827954.workers.dev/pganalyze/pg_query_go/v6- PostgreSQL SQL parser (CGO, uses libpg_query)gopkg.in/yaml.v3- YAML config parsing
- No replication
- Some pg_catalog tables are stubs (return empty)
- Unmapped DuckDB types (MAP, STRUCT, UNION, ENUM, BIT) fall back to OidText
- DML RETURNING is not supported via extended query protocol (see below)
DML statements with RETURNING clauses produce result rows but cannot be described without executing the mutation. The extended query protocol's Describe step probes schema by executing the query, which would cause unintended side effects. We reject these at Describe time with SQLSTATE 0A000 (feature_not_supported).
Three functions form the detection chain:
scanForReturning(upper, topLevelOnly)— SQL-aware lexer that scans for the RETURNING keyword while skipping strings (single-quoted, E-strings, dollar-quoted), double-quoted identifiers, block/line comments, and tracking parenthesis depth.containsReturning(upper)— wrapper that matches RETURNING at depth 0 only. Used for plain DML (INSERT/UPDATE/DELETE prefix).containsReturningAnyDepth(upper)— wrapper that matches RETURNING at any depth. Used for WITH-prefixed queries because writable CTEs place RETURNING insideAS (...)parens.isDMLReturning(query)— top-level guard called fromhandleDescribe. Routes to depth-0 or any-depth scanning based on the query prefix.
In writable CTEs, RETURNING is syntactically required inside the CTE body:
WITH d AS (DELETE FROM t RETURNING *) SELECT * FROM d
-- ^^^^^^^^^ depth 1, inside AS (...)Depth-0-only scanning structurally cannot detect this. The any-depth scan accepts a small false-positive risk (a column literally named returning in a CTE) in exchange for preventing mutation during Describe.
stripLeadingNoise(query)— loopsstripLeadingComments+TrimLeftto handle interleaved parentheses, whitespace, and comments before the query keyword.queryReturnsResults(query)— determines whether a query produces result rows (SELECT, WITH, VALUES, SHOW, DML RETURNING, etc.). Gates whether Describe attempts schema probing at all.
| Case | Handling |
|---|---|
RETURNING inside subquery (SELECT returning FROM s) |
Skipped (depth > 0 for plain DML) |
RETURNING in string literal 'returning' |
Skipped by string scanner |
RETURNING in E-string E'foo\'s RETURNING bar' |
Skipped by backslash-escape-aware scanner |
RETURNING in $$/$tag$ dollar-quoted string |
Skipped by dollar-quote scanner |
RETURNING in "double-quoted" identifier |
Skipped by identifier scanner |
| RETURNING in block/line comments | Skipped by comment scanner |
RETURNING*, RETURNING(id), RETURNING,col |
Matched (trailing *, (, , accepted) |
RETURNING_ID, RETURNINGS |
Not matched (identifier continuation) |
| Writable CTE with RETURNING in AS (...) | Matched via any-depth scan for WITH prefix |
Parenthesized queries with newlines (\nSELECT 1) |
Handled by stripLeadingNoise |
- False negatives are dangerous — the Describe path executes the query, causing unintended mutations. Err on the side of false positives.
- False positives are safe — the client gets an error but no data corruption. A column named
returningtriggering the guard is acceptable. - All detection is heuristic (string scanning). If precision becomes critical, consider using
pg_query_goAST parsing instead. - LIMIT 0 does NOT prevent CTE side effects — PostgreSQL CTEs are optimization fences, so writable CTEs execute even with LIMIT 0.
- DuckDB does not currently support MERGE. If it adds MERGE RETURNING in the future, add
MERGEto the prefix check inisDMLReturning.
See TODO.md for the full feature roadmap and known issues.