[Phase 5] Add typo suggestions for unknown operations #43

@andreasronge

Summary

Add typo suggestions to validation error messages using Jaro string similarity (Elixir's built-in String.jaro_distance/2). When a program uses an unknown operation name, the error message suggests the closest valid operation (e.g., "Unknown operation 'filer'. Did you mean 'filter'?"). This helps LLMs self-correct their generated programs.

Context

Architecture reference: Phase 5: Polish - "Error messages optimized for LLM consumption", "Validation with helpful suggestions"
Dependencies: None - Phases 1-4 are complete
Related issues: None

Current State

Verified via codebase analysis:

Unknown operations (lib/ptc_runner/validator.ex:25):

"Unknown operation '#{op}'"

Unknown comparison fields (lib/ptc_runner/validator.ex:204):
When field name typos occur, no suggestion is given - just a generic error.

Existing validation covers all operations (literal, load, var, let, if, and, or, not, merge, concat, zip, pipe, filter, map, select, eq, neq, gt, gte, lt, lte, contains, get, sum, count, avg, min, max, first, last, nth, reject, call) but provides no "did you mean?" suggestions.

Acceptance Criteria

  • Unknown operation names suggest the closest valid operation (if similarity > 0.8)
    • Example: "Unknown operation 'filer'. Did you mean 'filter'?"
    • Example: "Unknown operation 'selct'. Did you mean 'select'?"
    • Example: "Unknown operation 'xyz'" (no suggestion if nothing is close)
  • Suggestions use the Jaro similarity score from String.jaro_distance/2 (built into Elixir's String module; this is plain Jaro, not Jaro-Winkler)
  • Only suggests when similarity is high enough (> 0.8 threshold) to avoid confusing suggestions
  • E2E test demonstrates typo correction feedback
  • Existing tests pass
  • No external dependencies added (use Elixir stdlib)

Implementation Hints

Files to modify:

  • lib/ptc_runner/validator.ex - Add suggestion helper function, update unknown operation error
  • test/ptc_runner/validator_test.exs - Add tests for typo suggestions

Patterns to follow:
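Use String.jaro_distance/2, which is built into Elixir. Despite the name, it returns a similarity score between 0.0 (nothing in common) and 1.0 (identical):

String.jaro_distance("filer", "filter")
# => approximately 0.944, i.e. (5/5 + 5/6 + 5/5) / 3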
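
To get a feel for how the 0.8 threshold separates a real typo match from unrelated names, the snippet below scores one typo against a few candidates. It is only an exploratory sketch: the three candidates are an arbitrary subset of the valid operations, chosen for illustration.

```elixir
# Compare one typo against a small, illustrative subset of operation names.
typo = "filer"

scores =
  ~w(first filter map)
  |> Enum.map(fn name -> {name, String.jaro_distance(typo, name)} end)
  |> Enum.sort_by(fn {_name, score} -> score end, :desc)

# Print each candidate with its score rounded to three decimals.
for {name, score} <- scores do
  IO.puts("#{name}: #{Float.round(score, 3)}")
end
```

Of these three, only "filter" should clear the 0.8 threshold; "first" shares just its two leading letters with the typo and "map" shares none.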

Key implementation:

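# Operation names known to the validator (same list as under "Current State").
@valid_operations ~w(literal load var let if and or not merge concat zip pipe filter map select eq neq gt gte lt lte contains get sum count avg min max first last nth reject call)

# Returns " Did you mean '<op>'?" when the closest valid operation scores
# above 0.8, otherwise "". Empty or non-string input yields no suggestion.
defp suggest_operation(unknown_op) when is_binary(unknown_op) and unknown_op != "" do
  # Compare in lowercase so that "FILTER" still matches "filter".
  normalized = String.downcase(unknown_op)

  {suggested, score} =
    @valid_operations
    |> Enum.map(fn valid -> {valid, String.jaro_distance(normalized, valid)} end)
    |> Enum.max_by(fn {_op, score} -> score end)

  if score > 0.8, do: " Did you mean '#{suggested}'?", else: ""
end

defp suggest_operation(_unknown_op), do: ""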

Edge cases to consider:

  • Very short operation names (1-2 chars) may have misleading similarity scores
  • Case sensitivity: should "FILTER" suggest "filter"? (recommend: yes, lowercase compare)
  • Empty string input: should return no suggestion
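
One possible way to handle these edge cases is sketched below. The module name, the `closest/2` function, and the minimum length of 3 are all assumptions made for illustration; the issue does not prescribe a minimum length, and the candidate list is assumed to be non-empty.

```elixir
defmodule EdgeCaseSketch do
  # Hypothetical guard values; only the 0.8 threshold comes from the issue.
  @min_length 3
  @threshold 0.8

  # Returns {:ok, name} for a confident match and :none otherwise.
  # `candidates` must be a non-empty list of lowercase strings.
  def closest(input, candidates) when is_binary(input) do
    # Lowercase first so the comparison is case insensitive.
    normalized = String.downcase(input)

    if String.length(normalized) < @min_length do
      # Covers the empty string and very short names.
      :none
    else
      {name, score} =
        candidates
        |> Enum.map(&{&1, String.jaro_distance(normalized, &1)})
        |> Enum.max_by(&elem(&1, 1))

      if score > @threshold, do: {:ok, name}, else: :none
    end
  end
end

ops = ~w(filter let lte lt)
IO.inspect(EdgeCaseSketch.closest("FILTER", ops)) # {:ok, "filter"}
IO.inspect(EdgeCaseSketch.closest("", ops))       # :none
IO.inspect(EdgeCaseSketch.closest("le", ops))     # :none (below the minimum length)
```

The length guard matters because short strings can score deceptively high. By the Jaro formula, "le" scores (2/2 + 2/3 + 2/2) / 3, roughly 0.89, against "let", so without the guard "let" would be suggested even if the author meant "lte".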

Test Plan

Unit tests:

  • "filer" suggests "filter" (common typo)
  • "selct" suggests "select" (missing letter)
  • "filtter" suggests "filter" (extra letter)
  • "xyz" has no suggestion (nothing close)
  • "FILTER" suggests "filter" (case insensitive)
  • "lit" vs "literal": the Jaro score works out to (3/3 + 3/7 + 3/3) / 3, about 0.81, which is only just above the 0.8 threshold, so a suggestion is expected - verify in the test
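
The unit tests above could look roughly like the following standalone script. `SuggestionSketch` and its `message/1` function are hypothetical stand-ins so the script runs on its own; the real tests would go through the validator in lib/ptc_runner/validator.ex, whose public API is not shown in this issue.

```elixir
ExUnit.start()

defmodule SuggestionSketch do
  # Hypothetical stand-in for the validator's unknown-operation message.
  @valid_operations ~w(literal load var let if and or not merge concat zip pipe filter map select
                       eq neq gt gte lt lte contains get sum count avg min max first last nth reject call)

  def message(op) when is_binary(op) do
    normalized = String.downcase(op)

    # Pick the valid operation with the highest Jaro similarity.
    {best, score} =
      @valid_operations
      |> Enum.map(&{&1, String.jaro_distance(normalized, &1)})
      |> Enum.max_by(&elem(&1, 1))

    if score > 0.8 do
      "Unknown operation '#{op}'. Did you mean '#{best}'?"
    else
      "Unknown operation '#{op}'"
    end
  end
end

defmodule SuggestionSketchTest do
  use ExUnit.Case, async: true

  test "close typos get a suggestion" do
    assert SuggestionSketch.message("filer") =~ "Did you mean 'filter'?"
    assert SuggestionSketch.message("selct") =~ "Did you mean 'select'?"
    assert SuggestionSketch.message("filtter") =~ "Did you mean 'filter'?"
    assert SuggestionSketch.message("FILTER") =~ "Did you mean 'filter'?"
  end

  test "unrelated or empty input gets no suggestion" do
    assert SuggestionSketch.message("xyz") == "Unknown operation 'xyz'"
    assert SuggestionSketch.message("") == "Unknown operation ''"
  end
end
```

Saved as a .exs file, this can be run with the `elixir` command; ExUnit runs the tests when the script exits.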

E2E test:

test "provides helpful typo suggestion for misspelled operation" do
  program = ~s({"op": "filer", "where": {"op": "eq", "field": "x", "value": 1}})
  assert {:error, {:validation_error, msg}} = PtcRunner.run(program)
  assert msg =~ "Did you mean 'filter'?"
end

Out of Scope

  • Field name suggestions (e.g., "wher" -> "where") - separate issue
  • Required field suggestions (e.g., "literal requires 'value'") - separate issue
  • Path context in error messages (e.g., "at steps[2]") - separate issue
  • Structured JSON error output - separate issue
  • Parse error improvements - separate issue
