Add query timing tooling into toolbox #1

Open
frankhereford wants to merge 7 commits into main from frank/db-profiling-tooling


@frankhereford frankhereford commented Apr 29, 2026

This tool was developed in support of issue cityofaustin/atd-data-tech#26858, in which we built a system for tuning our local database and also optimized the production database.

Local tuning

An open PR shows an example of this tool being used to optimize a set of database runtime parameters.

Testing

You're welcome to check out the tool and profile some databases. Alternatively, note that the tool was used to complete the issue above and seems to have done the trick.

- Introduced logging configuration and integrated logging throughout the benchmarking process.
- Updated default PostgreSQL port from 5431 to 5432.
- Added command-line argument for setting the logging level and connection timeout.
- Enhanced logging for better visibility into benchmark execution and errors.
- Clarified instructions on overriding `PG*` environment variables at runtime, specifying the use of environment or argument flags.
- Removed redundant details about running the benchmark against a specific Postgres instance.
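
The logging-level flag, the connection-timeout flag, and the `PG*` override behavior described in the notes above could fit together roughly as follows. This is a minimal sketch, not the code in `benchmark_db.py`: the flag names (`--host`, `--port`, `--connect-timeout`, `--log-level`), the 10-second default timeout, and the helpers `build_parser` and `configure_logging` are assumptions made for illustration. `PGHOST`, `PGPORT`, and `PGCONNECT_TIMEOUT` are the standard libpq environment variable names.

```python
import argparse
import logging
import os


def build_parser(env=None):
    """Build a parser whose defaults come from PG* environment variables.

    Flag names are hypothetical; they are not taken from benchmark_db.py.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Time SQL statements against Postgres")
    # Environment variables supply the defaults; explicit flags override them.
    parser.add_argument("--host", default=env.get("PGHOST", "localhost"))
    parser.add_argument("--port", type=int, default=int(env.get("PGPORT", "5432")))
    parser.add_argument(
        "--connect-timeout",
        type=int,
        default=int(env.get("PGCONNECT_TIMEOUT", "10")),  # assumed default
        help="seconds to wait for a connection",
    )
    parser.add_argument(
        "--log-level",
        default="INFO",
        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
    )
    return parser


def configure_logging(level_name):
    """Configure root logging once and return a named logger."""
    logging.basicConfig(
        level=getattr(logging, level_name),
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    return logging.getLogger("db_benchmark")
```

With this arrangement, a value set in the environment applies unless the matching flag is passed, so `PGPORT=5431` combined with `--port 5432` would connect on 5432.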

Copilot AI left a comment

Pull request overview

Adds a dockerized database benchmarking helper under toolbox/ to time representative Vision Zero queries/views and support Postgres tuning work tied to issue #26858.

Changes:

  • Introduces a Python (psycopg) benchmark runner with optional interactive loop UI.
  • Adds a default SQL workload (timing_queries.sql) for timing.
  • Adds Docker/Docker Compose scaffolding plus README usage/tuning notes.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| toolbox/db_benchmark/benchmark_db.py | Implements timed query execution, cache stats collection, and optional curses loop UI |
| toolbox/db_benchmark/timing_queries.sql | Provides a default set of queries/views to benchmark |
| toolbox/db_benchmark/docker-compose.yaml | Defines the benchmark runner service and connection env vars |
| toolbox/db_benchmark/Dockerfile | Builds a minimal Python image to run the benchmark script |
| toolbox/db_benchmark/requirements.txt | Adds psycopg[binary] dependency |
| toolbox/db_benchmark/README.md | Documents quick start, targeting, and tuning-related parameters |
| toolbox/db_benchmark/.gitignore | Ignores Python bytecode cache |


Comment thread toolbox/db_benchmark/README.md Outdated
Comment thread toolbox/db_benchmark/benchmark_db.py Outdated
Comment on lines +116 to +121
start = perf_counter()
cur.execute(sql_text)
logger.info("SQL execution finished, fetching rows")
rows = cur.fetchall()
end = perf_counter()
logger.info("Fetched %s rows, collecting cache stats", len(rows))

Copilot AI Apr 30, 2026

Using cur.fetchall() will materialize the entire result set in memory, which can OOM the container and also makes the timing heavily dependent on client-side fetch/memory overhead (especially for the large export views in timing_queries.sql). Consider iterating with fetchmany()/cursor.stream() to count rows without storing them, and decide explicitly whether you want to time server execution only or execution+fetch.
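
One way to act on this suggestion is to time execution and fetching separately while counting rows in batches. The sketch below is an illustration only, not the PR's implementation: the function name `time_query`, the `batch_size` default, and the returned dictionary keys are assumptions. It relies only on the DB-API `execute()` and `fetchmany()` cursor methods, which psycopg cursors provide.

```python
from time import perf_counter


def time_query(cur, sql_text, batch_size=1000):
    """Time execute and fetch separately, counting rows without storing them.

    `cur` is any DB-API style cursor exposing execute() and fetchmany().
    """
    start = perf_counter()
    cur.execute(sql_text)
    executed = perf_counter()

    row_count = 0
    while True:
        batch = cur.fetchmany(batch_size)
        if not batch:
            break
        # Only the current batch is held in Python memory at any time.
        row_count += len(batch)
    fetched = perf_counter()

    return {
        "rows": row_count,
        "execute_seconds": executed - start,
        "fetch_seconds": fetched - executed,
    }
```

One caveat: with psycopg's default client-side cursor, the full result is already transferred to the client by the time `execute()` returns, so batching mainly avoids building every Python row object at once. Bounding memory for very large exports would likely also need a server-side (named) cursor or `cursor.stream()`.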

Member Author

This could be an interesting future experiment.

Comment thread toolbox/db_benchmark/benchmark_db.py Outdated
Comment thread toolbox/db_benchmark/benchmark_db.py Outdated
- Removed the deprecated `--loop` argument and integrated its functionality directly into the main execution flow.
- Updated README.md to reflect the continuous execution of SQL statements in an interactive terminal UI, including instructions for quitting the loop.
- Updated hit ratio calculation to handle division by zero using NULLIF.
- Refactored the run loop to remove the interval_seconds variable, allowing continuous execution without configurable delays.
- Improved logging messages for better clarity on loop iterations and results.
- Updated README.md to clarify that the benchmark runs SQL statements back-to-back without a delay.
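
The `NULLIF` guard mentioned in the notes above can be illustrated as follows. The source does not show the actual cache-stats query, so the SQL here is a hypothetical example built on the `blks_hit` and `blks_read` columns of PostgreSQL's `pg_stat_database` view, and `hit_ratio` is an assumed helper that mirrors the same semantics in Python.

```python
# Hypothetical query: NULLIF turns a zero denominator into NULL, so the
# division yields NULL instead of raising a division-by-zero error.
HIT_RATIO_SQL = """
SELECT blks_hit::float / NULLIF(blks_hit + blks_read, 0) AS hit_ratio
FROM pg_stat_database
WHERE datname = current_database();
"""


def hit_ratio(blks_hit, blks_read):
    """Python mirror of the SQL above: None when no blocks were requested."""
    total = blks_hit + blks_read
    if total == 0:
        return None  # corresponds to SQL NULL
    return blks_hit / total
```

Returning `None` (SQL `NULL`) rather than `0` keeps "no activity yet" distinguishable from "every read missed the cache", which matters when the ratio is displayed on a freshly reset database.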
@frankhereford frankhereford changed the title Add profile tooling into toolbox Add query profile tooling into toolbox May 7, 2026
@frankhereford frankhereford changed the title Add query profile tooling into toolbox Add query timing tooling into toolbox May 7, 2026