    Repositories list

    • baxbench

      Public
      Python
      218821 Updated Jan 30, 2026
    • swe-star

      Public
      Python
      0300 Updated Jan 27, 2026
    • swt-bench

      Public
      [NeurIPS 2024] Evaluation harness for SWT-Bench, a benchmark for evaluating LLM repository-level test-generation
      Python
      1570120 Updated Jan 15, 2026
    • insights

      Public
      We track and analyze the activity and performance of autonomous code agents in the wild
      TypeScript
      44900 Updated Dec 5, 2025
    • Heavily compressed Docker images for SWE Bench Verified
      Go
      1400 Updated Oct 1, 2025
    • SWEBench

      Public
      SWE-bench [Multimodal]: Can Language Models Resolve Real-world Github Issues?
      Python
      754000 Updated Jul 30, 2025
    • tests

      Public
      0000 Updated May 5, 2025