Summary
Create TestRunner.Report module with shared markdown report generation functions for both JSON and Lisp test runners. This enables consistent report output format across both DSLs and eliminates code duplication.
Context
Architecture reference: Test Runner Refactoring Plan - see "Module Responsibilities" section for PtcDemo.TestRunner.Report
Dependencies: None (Phase 1 tasks can be done in parallel)
Related issues: #196 (TestRunner.Base - completed), Epic #195
Current State
Report generation is currently implemented in LispTestRunner (lines 679-798):
- write_report/2 - writes report content to file
- generate_report/1 - builds full markdown report from summary
- generate_results_table/1 - creates markdown table of all test results
- generate_failed_details/1 - formats detailed failure information
- generate_all_programs_section/1 - lists all programs attempted per test
- generate_test_programs/1 - helper for single test program listing
- format_timestamp/1 - formats DateTime for report header
The JSON TestRunner has no report generation capability.
Note: LispTestRunner also has private implementations of format_cost/1, format_duration/1, truncate/2, and format_attempt_result/1 (lines 652-675) that duplicate the functions in TestRunner.Base. The Report module should use the Base functions rather than reimplementing these.
Acceptance Criteria
- PtcDemo.TestRunner.Report module created at demo/lib/ptc_demo/test_runner/report.ex
- write_report/3 accepts path, summary, and DSL name to write report file
- generate_report/2 accepts summary and DSL name, returns markdown string
- generate_results_table/1 generates consistent table format from results
- generate_failed_details/1 formats failure details section
- generate_all_programs_section/1 generates all programs section
- format_timestamp/1 formats DateTime as "YYYY-MM-DD HH:MM:SS UTC"
- Uses Base.format_cost/1, Base.format_duration/1, Base.truncate/2, and Base.format_attempt_result/1 - no duplication
- @moduledoc and @doc with examples
- @spec type specifications for all public functions
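The format_timestamp/1 helper listed under Current State can be sketched with Calendar.strftime/2. The module name and body below are illustrative assumptions, not the extracted implementation:

```elixir
defmodule ReportSketch do
  # Hypothetical format_timestamp/1; the real code is extracted from
  # LispTestRunner into demo/lib/ptc_demo/test_runner/report.ex.
  @spec format_timestamp(DateTime.t()) :: String.t()
  def format_timestamp(%DateTime{} = dt) do
    # Produces "YYYY-MM-DD HH:MM:SS UTC"; the trailing " UTC" is a literal.
    Calendar.strftime(dt, "%Y-%m-%d %H:%M:%S UTC")
  end
end

{:ok, dt, 0} = DateTime.from_iso8601("2024-05-01T12:30:45Z")
IO.puts(ReportSketch.format_timestamp(dt))
# => 2024-05-01 12:30:45 UTC
```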
Implementation Hints
Files to create:
demo/lib/ptc_demo/test_runner/report.ex - new module with report generation
Code to extract from LispTestRunner:
- Lines 679-798 in demo/lib/ptc_demo/lisp_test_runner.ex
- Change write_report/2 to write_report/3 with DSL name parameter
- Change generate_report/1 to generate_report/2 with DSL name parameter
- Report title should use DSL name: # PTC-#{dsl_name} Test Report
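The widened arities could look like this minimal sketch; the module name and function bodies are placeholders, not the code extracted from LispTestRunner:

```elixir
defmodule ReportSignatures do
  # Hypothetical sketch of the new 2- and 3-arity signatures.
  @spec generate_report(map(), String.t()) :: String.t()
  def generate_report(summary, dsl_name) do
    # Report title uses the caller-supplied DSL name.
    "# PTC-#{dsl_name} Test Report\n\nPassed: #{summary.passed}/#{summary.total}\n"
  end

  @spec write_report(Path.t(), map(), String.t()) :: :ok | {:error, File.posix()}
  def write_report(path, summary, dsl_name) do
    File.write(path, generate_report(summary, dsl_name))
  end
end

IO.puts(ReportSignatures.generate_report(%{passed: 1, total: 2}, "Lisp"))
```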
Patterns to follow:
- Follow TestRunner.Base module structure (see demo/lib/ptc_demo/test_runner/base.ex)
- Use @moduledoc and @doc with examples
- Add @spec for all public functions
Reuse from TestRunner.Base:
alias PtcDemo.TestRunner.Base
# Use these instead of reimplementing:
Base.format_cost/1 # for cost formatting
Base.format_duration/1 # for duration formatting
Base.truncate/2 # for string truncation
Base.format_attempt_result/1 # for formatting program results
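As an illustration of that reuse, here is a hypothetical results-table row built on Base helpers. The Base bodies below are stubs so the sketch runs standalone; the real implementations live in demo/lib/ptc_demo/test_runner/base.ex:

```elixir
defmodule PtcDemo.TestRunner.Base do
  # Stubbed helpers for this sketch only; bodies are assumptions.
  def format_cost(cost), do: "$" <> :erlang.float_to_binary(cost, decimals: 4)
  def format_duration(ms), do: "#{ms / 1000}s"

  def truncate(string, max) do
    if String.length(string) > max, do: String.slice(string, 0, max) <> "...", else: string
  end
end

defmodule ReportRow do
  alias PtcDemo.TestRunner.Base

  # One markdown results-table row, delegating formatting to Base
  # instead of reimplementing it.
  def row(result) do
    status = if result.passed, do: "PASS", else: "FAIL"
    "| #{result.index} | #{Base.truncate(result.query, 40)} | #{status} | #{result.attempts} |"
  end
end

IO.puts(ReportRow.row(%{index: 1, query: "Count items", passed: true, attempts: 1}))
# => | 1 | Count items | PASS | 1 |
```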
Summary map structure expected (from Base.build_summary/5):
%{
  passed: integer(),
  failed: integer(),
  total: integer(),
  total_attempts: integer(),
  duration_ms: integer(),
  model: String.t(),
  data_mode: atom(),
  results: [result_map()],
  stats: %{total_tokens: integer(), total_cost: float()},
  timestamp: DateTime.t()
}
Result map structure expected (each item in results list):
%{
  index: integer(),
  query: String.t(),
  passed: boolean(),
  attempts: integer(),
  program: String.t() | nil,           # final successful program
  all_programs: [{String.t(), any()}], # all attempted {program, result} pairs
  error: String.t() | nil,             # error message if failed
  description: String.t(),
  constraint: tuple()
}
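The two map shapes above translate naturally into module typespecs. A sketch, where the type names (summary/0, result_map/0) are assumptions:

```elixir
defmodule ReportTypes do
  # Sketch only: type names mirror the map shapes documented in this issue.
  @type result_map :: %{
          index: integer(),
          query: String.t(),
          passed: boolean(),
          attempts: integer(),
          program: String.t() | nil,
          all_programs: [{String.t(), any()}],
          error: String.t() | nil,
          description: String.t(),
          constraint: tuple()
        }

  @type summary :: %{
          passed: integer(),
          failed: integer(),
          total: integer(),
          total_attempts: integer(),
          duration_ms: integer(),
          model: String.t(),
          data_mode: atom(),
          results: [result_map()],
          stats: %{total_tokens: integer(), total_cost: float()},
          timestamp: DateTime.t()
        }
end
```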
Edge Cases
- Empty results list: Should generate valid report with "0/0 passed" and empty tables
- All tests passed: Should omit "Failed Tests" section (currently returns empty string)
- Results without :all_programs key: Graceful handling with "(no programs)" fallback (already implemented in source)
- Results without :program key: Use "-" as fallback (already implemented in source)
- DSL name capitalization: Accept as-is (caller provides "JSON" or "Lisp")
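The two missing-key fallbacks can be handled with Map.get/3. A hypothetical sketch (module and function names are illustrative):

```elixir
defmodule RowFallbacks do
  # "-" when the :program key is missing or nil.
  def program_cell(result), do: Map.get(result, :program) || "-"

  # "(no programs)" when :all_programs is absent or empty.
  def programs_list(result) do
    case Map.get(result, :all_programs, []) do
      [] -> "(no programs)"
      programs -> Enum.map_join(programs, "\n", fn {program, _result} -> program end)
    end
  end
end

IO.puts(RowFallbacks.program_cell(%{query: "Sum values"}))  # => -
IO.puts(RowFallbacks.programs_list(%{}))                    # => (no programs)
```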
Test Plan
Unit tests: Deferred to Phase 2, task 2.3 (separate issue)
Manual verification:
- Module compiles without warnings: mix compile --warnings-as-errors
- Functions are callable: verify in iex -S mix that functions exist and accept correct arities
- Generate sample report:
alias PtcDemo.TestRunner.Report
summary = %{
  passed: 1, failed: 1, total: 2,
  total_attempts: 3, duration_ms: 5000,
  model: "test-model", data_mode: :schema,
  stats: %{total_tokens: 100, total_cost: 0.01},
  timestamp: DateTime.utc_now(),
  results: [
    %{index: 1, query: "Count items", passed: true, attempts: 1,
      program: "(count items)", all_programs: [{"(count items)", 5}],
      description: "Should count", constraint: {:eq, 5}},
    %{index: 2, query: "Sum values", passed: false, attempts: 2,
      program: nil, all_programs: [{"(sum x)", {:error, "undefined"}}],
      error: "Expected 10, got 5", description: "Should sum",
      constraint: {:eq, 10}}
  ]
}
IO.puts(Report.generate_report(summary, "Test"))
Out of Scope
- Refactoring LispTestRunner to use this module (Phase 2, task 2.1)
- Unit tests for Report module (Phase 2, task 2.3)
- HTML or other output formats
- Custom report templates
Documentation Updates
None - internal demo module, no public API docs required