
[Story] LSP Phase 6: Incremental linting #47

@ekropotin

Description


User Story

As a developer working with large Markdown files, I want linting to update incrementally as I type, so I get fast feedback without performance lag.

Acceptance Criteria

  • Only re-lints the affected portions of the document when it changes
  • Maintains sub-100ms response time for typical edits
  • Correctly handles multi-line changes and structural modifications
  • Preserves diagnostics for unchanged document sections
  • Handles edge cases (insertions, deletions, replacements)
  • Thread-safe architecture supporting concurrent requests

Additional Context

Technical Implementation Details:

Critical Architecture Refactoring Required:
This phase requires the fundamental architectural changes outlined in the LSP implementation plan:

  1. Replace Single-Use Contract: Transition from the disposable MultiRuleLinter to a persistent, stateful linter
  2. Thread-Safe Context: Replace Rc<Context> with Arc<Context> for concurrent access (see the sketch after this list)
  3. Incremental Parsing: Implement tree-sitter incremental parsing infrastructure
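
The point of the Rc → Arc swap in item 2 is thread safety: Rc<T> is neither Send nor Sync, so an Rc<Context> cannot cross into the LSP server's worker threads, whereas Arc<Context> can. A minimal, self-contained sketch (the assert_send_sync helper and the empty Context stand-in are illustrative, not QuickMark code):

use std::rc::Rc;
use std::sync::Arc;

struct Context; // stand-in for QuickMark's Context

// Compile-time check that a type may be shared across threads.
fn assert_send_sync<T: Send + Sync>() {}

fn main() {
    assert_send_sync::<Arc<Context>>();   // compiles: Arc<T> is Send + Sync when T is
    // assert_send_sync::<Rc<Context>>(); // rejected by the compiler: Rc is neither
    let _single_threaded = Rc::new(Context); // Rc remains fine within one thread
}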

New Stateful Architecture:

pub struct PersistentMultiRuleLinter {
    file_cache: Arc<DashMap<PathBuf, FileState>>,
    config: Arc<RwLock<QuickmarkConfig>>,
    rule_registry: &'static [Rule],
}

pub struct FileState {
    content: String,
    version: i32,
    parser: tree_sitter::Parser,   // Reused across edits for incremental parsing
    parse_tree: tree_sitter::Tree,
    last_analysis: SystemTime,
    cached_violations: Vec<RuleViolation>,
    dirty_ranges: Vec<Range>,      // Tracks what needs re-linting
}

pub trait StatefulRuleLinter: Send + Sync {
    fn analyze_document(&self, context: &Context) -> Vec<RuleViolation>;
    fn analyze_changes(&self, context: &Context, changes: &[TextEdit]) -> Vec<RuleViolation>;
    fn invalidate_range(&mut self, range: Range); // Mark range for re-analysis
}
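
For context, the server side might drive a set of these linters roughly as follows. This is a hypothetical sketch assuming the trait, Context, and RuleViolation types in this issue; run_linters and the full_relint flag are illustrative names:

fn run_linters(
    linters: &[Box<dyn StatefulRuleLinter>],
    context: &Context,
    changes: &[TextEdit],
    full_relint: bool,
) -> Vec<RuleViolation> {
    linters
        .iter()
        .flat_map(|linter| {
            if full_relint {
                // First open or configuration change: analyze the whole document.
                linter.analyze_document(context)
            } else {
                // Normal typing: only re-analyze what the edits touched.
                linter.analyze_changes(context, changes)
            }
        })
        .collect()
}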

Thread-Safe Context:

pub struct Context {
    pub file_path: PathBuf,
    pub config: Arc<QuickmarkConfig>,
    pub lines: Arc<Vec<String>>,  // Immutable, shareable
    pub node_cache: Arc<HashMap<String, Vec<NodeInfo>>>,
    pub document_content: Arc<String>,
    pub parse_tree: Arc<tree_sitter::Tree>,
}

Incremental Parsing Strategy:

impl FileState {
    pub fn apply_changes(&mut self, changes: &[TextDocumentContentChangeEvent]) -> Vec<Range> {
        let mut affected_ranges = Vec::new();

        for change in changes {
            // Apply the text change to the cached content
            self.content = apply_text_change(&self.content, change);

            // Use tree-sitter incremental parsing against the previous tree.
            // Note: a full implementation must first call Tree::edit with an
            // InputEdit built from `change`, so unchanged subtrees can be reused.
            let old_tree = self.parse_tree.clone();
            self.parse_tree = self
                .parser
                .parse(&self.content, Some(&old_tree))
                .expect("Incremental parse failed");

            // Calculate the affected range for re-linting
            let affected_range = self.calculate_affected_range(change, &old_tree);
            affected_ranges.push(affected_range);
            self.dirty_ranges.push(affected_range);
        }

        affected_ranges
    }

    fn calculate_affected_range(&self, change: &TextDocumentContentChangeEvent,
                                old_tree: &Tree) -> Range {
        // Compare the old and new trees (e.g. via Tree::changed_ranges)
        // Expand the range to include structural dependencies (e.g., list items, headings)
        // Account for rule-specific dependencies
        todo!("expand the edited range to its enclosing structural nodes")
    }
}
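
The apply_text_change helper used above is not defined in this issue; one possible shape for it, as a minimal sketch built on lsp_types. Note it treats the LSP character field as a byte column for simplicity, whereas the protocol default is UTF-16 code units, which a real implementation must honour:

use lsp_types::{Position, TextDocumentContentChangeEvent};

// Convert an LSP (line, character) position into a byte offset into `text`.
fn position_to_offset(text: &str, pos: Position) -> usize {
    let mut offset = 0;
    for (i, line) in text.split_inclusive('\n').enumerate() {
        if i as u32 == pos.line {
            return offset + (pos.character as usize).min(line.len());
        }
        offset += line.len();
    }
    text.len()
}

// Apply a single didChange event to the cached document content.
fn apply_text_change(content: &str, change: &TextDocumentContentChangeEvent) -> String {
    match change.range {
        // Incremental sync: splice the new text over the changed range.
        Some(range) => {
            let start = position_to_offset(content, range.start);
            let end = position_to_offset(content, range.end);
            let mut updated = String::with_capacity(content.len() + change.text.len());
            updated.push_str(&content[..start]);
            updated.push_str(&change.text);
            updated.push_str(&content[end..]);
            updated
        }
        // Full sync: the event carries the entire new document.
        None => change.text.clone(),
    }
}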

Smart Caching Strategy:

impl FileState {
    fn merge_violations(&mut self, 
                       new_violations: Vec<RuleViolation>, 
                       affected_ranges: &[Range]) -> Vec<RuleViolation> {
        // Remove old violations in affected ranges
        self.cached_violations.retain(|v| {
            !affected_ranges.iter().any(|range| range.contains(v.range))
        });
        
        // Add new violations
        self.cached_violations.extend(new_violations);
        
        // Clear dirty ranges
        self.dirty_ranges.clear();
        
        self.cached_violations.clone()
    }
}

Performance Optimizations:

  • Range-based rule filtering: Only run rules that can be affected by the changed ranges
  • AST node caching: Cache frequently accessed node types between edits
  • Violation diffing: Only send diagnostic updates for changed violations
  • Debounced analysis: Batch rapid successive changes before re-linting (see the sketch after this list)
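
A rough sketch of how the debounce could look with tokio (debounce_changes and the channel-of-unit-events design are illustrative, not an existing QuickMark API; it also requires tokio's "sync" feature in addition to "time"):

use std::time::Duration;
use tokio::sync::mpsc;
use tokio::time::timeout;

// Waits for a quiet period of `delay` after the last change event before
// calling `relint`, so a burst of keystrokes triggers a single re-lint.
async fn debounce_changes<F: FnMut()>(mut rx: mpsc::Receiver<()>, delay: Duration, mut relint: F) {
    while rx.recv().await.is_some() {
        // Keep absorbing events until `delay` passes with no new ones.
        loop {
            match timeout(delay, rx.recv()).await {
                Ok(Some(())) => continue, // another edit arrived; restart the quiet window
                Ok(None) => return,       // sender dropped; stop debouncing
                Err(_) => break,          // quiet period elapsed
            }
        }
        relint();
    }
}

With this in place, handle_did_change would only record the change and send a unit event into the channel, leaving the actual incremental lint to the debounce task.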

Change Event Handling:

impl LspServer {
    async fn handle_did_change(&mut self, params: DidChangeTextDocumentParams) {
        let uri = params.text_document.uri;
        let changes = params.content_changes;
        
        // Incremental linting
        let violations = self.linter.lint_incremental(&uri, &changes).await;
        
        // Convert to diagnostics and publish only if changed
        let new_diagnostics = self.violations_to_diagnostics(violations);
        if self.diagnostics_changed(&uri, &new_diagnostics) {
            self.publish_diagnostics(uri, new_diagnostics).await;
        }
    }
}
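
The diagnostics_changed check above could be backed by a per-document cache of the last published diagnostics. A minimal sketch (DiagnosticsCache is a hypothetical name; it relies on lsp_types::Diagnostic implementing PartialEq):

use std::collections::HashMap;
use lsp_types::{Diagnostic, Url};

// Remembers what was last published for each document so that unchanged
// diagnostic sets are not re-sent to the client.
struct DiagnosticsCache {
    published: HashMap<Url, Vec<Diagnostic>>,
}

impl DiagnosticsCache {
    fn diagnostics_changed(&mut self, uri: &Url, new: &[Diagnostic]) -> bool {
        match self.published.get(uri) {
            Some(old) if old.as_slice() == new => false, // identical set: skip publishing
            _ => {
                self.published.insert(uri.clone(), new.to_vec());
                true
            }
        }
    }
}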

Dependencies Added:

tree-sitter = "0.20"   # Incremental parsing is part of the core API (no extra feature flag)
rayon = "1.7"          # Parallel rule execution
tokio = { version = "1", features = ["time"] }  # Debouncing
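
As an illustration of where rayon would slot in (the Rule and Violation types and the 80-character check are placeholders, not QuickMark's actual rule API), independent rules can be fanned out across a thread pool and their violations flattened:

use rayon::prelude::*;

// Placeholder types standing in for QuickMark's rule and violation types.
struct Rule { name: &'static str }
struct Violation { rule: &'static str, line: usize }

impl Rule {
    // Pretend each rule scans the document independently.
    fn check(&self, lines: &[String]) -> Vec<Violation> {
        lines.iter().enumerate()
            .filter(|(_, line)| line.len() > 80) // trivial stand-in check
            .map(|(i, _)| Violation { rule: self.name, line: i })
            .collect()
    }
}

fn run_rules_parallel(rules: &[Rule], lines: &[String]) -> Vec<Violation> {
    // Each rule runs on its own rayon worker; results are flattened.
    rules.par_iter()
        .flat_map_iter(|rule| rule.check(lines))
        .collect()
}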

Migration Strategy:
This phase requires significant refactoring of the core linting architecture. Consider implementing in stages:

  1. Thread-safe context and stateful linters
  2. Basic incremental parsing without optimization
  3. Smart caching and range-based rule filtering
  4. Performance optimizations and fine-tuning

This is Phase 6 of a 6-phase LSP implementation plan for QuickMark.
