deepanshuiiitv/Autonomous_Hacks_AI_Hackathon

📄 Document Classifier & Discrepancy Detector

Problem Statement 4 — Agentic AI System

An LLM-free, autonomous, agentic document analysis system that classifies documents, detects discrepancies, and computes a confidence-aware alignment score using deterministic multi-step reasoning.


🚀 Project Overview

This project solves Problem Statement 4: Document Classifier + Discrepancy Detector.

Given 3–5 short documents (PDF or text), the system autonomously:

  • Reads and preprocesses documents

  • Extracts structured factual claims

  • Compares claims across documents

  • Detects numeric and semantic contradictions

  • Computes:

    • Alignment Score
    • Confidence Score
    • Explainable reasoning trace

The system follows a plan → act → evaluate → refine loop and does not rely on LLMs or prompt-based generation.


❗ Key Clarification (Very Important)

This system does NOT use any Large Language Model (LLM).

It is a deterministic, rule-based agentic pipeline built using:

  • Regex-based fact extraction
  • Structured reasoning
  • Mathematical evaluation

This design is intentional and optimized for:

  • Auditability
  • Reliability
  • Zero hallucination
  • Enterprise and financial use-cases
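As a rough illustration of regex-based fact extraction, the sketch below pulls numeric claims for a couple of entities out of raw text. The pattern, the `Claim` type, and the `extract_claims` name are assumptions made for this example; the project's actual patterns live in `tools/claim_tool.py` and are not shown in this README.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern: an entity keyword, then the first number that follows
# it within the same sentence, then an optional unit.
ENTITY_PATTERN = re.compile(
    r"\b(?P<entity>revenue|profit)\b[^.\d]*?"
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>million|billion|%)?",
    re.IGNORECASE,
)


@dataclass(frozen=True)
class Claim:
    entity: str
    value: float
    unit: str


def extract_claims(text: str) -> list[Claim]:
    """Return one Claim per entity mention that is followed by a number."""
    claims = []
    for match in ENTITY_PATTERN.finditer(text):
        claims.append(
            Claim(
                entity=match.group("entity").lower(),
                value=float(match.group("value")),
                unit=(match.group("unit") or "").lower(),
            )
        )
    return claims
```

Because the same input always yields the same claims, every extracted value can be traced back to the exact text span that produced it. A pattern this simple would also pick up a year if it appeared before the figure (for example "revenue in 2023 was 120 million"), so a real extractor would need stricter rules.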

🧠 Why This Is NOT a Prompt-Only Project

This project explicitly avoids:

  • ❌ One-shot prompt → output
  • ❌ Chatbot-style demos
  • ❌ Probabilistic hallucinations
  • ❌ Black-box reasoning
  • ❌ Static or linear pipelines

Instead, it implements:

  • ✅ Autonomous agent orchestration
  • ✅ Multi-step reasoning loop
  • ✅ Internal state & memory
  • ✅ Self-evaluation & stopping criteria
  • ✅ Fully explainable decision trace

🏗️ Agentic Architecture

An ASCII diagram is used intentionally, for clarity and so that judges can read it easily.

User Uploads Documents
        ↓
FastAPI API Layer
        ↓
OrchestratorAgent
        ↓
┌─────────────────────────────┐
│ PlannerAgent                │
│ (decides strategy &         │
│ required entities)          │
└──────────────┬──────────────┘
               ↓
┌─────────────────────────────┐
│ Tool Execution Layer        │
│                             │
│ - Summarizer Tool           │
│ - Claim Extraction Tool     │
│ - Comparison Tool           │
│ - Contradiction Tools       │
└──────────────┬──────────────┘
               ↓
┌─────────────────────────────┐
│ EvaluatorAgent              │
│ (confidence calculation &   │
│ self-assessment)            │
└──────────────┬──────────────┘
               ↓
┌─────────────────────────────┐
│ Score Tool                  │
│ (final alignment score)     │
└──────────────┬──────────────┘
               ↓
Final Output

🔄 Agent Execution Flow

  1. Planning

    • Planner scans documents
    • Determines required entities (e.g. revenue, profit, expansion)
  2. Action Selection

    • Agent autonomously selects next action:

      • summarize
      • extract claims
      • compare
      • detect contradictions
      • refine strategy
  3. Tool Execution

    • Each action is executed via a dedicated tool
    • Outputs stored in shared state
  4. Self-Evaluation

    • Agreement, coverage, and contradictions assessed
    • Confidence recomputed after each iteration
  5. Iteration / Refinement

    • Strategy refined if confidence is low
    • Agent stops automatically when confidence stabilizes
  6. Final Scoring

    • Alignment score derived from confidence
    • Full reasoning trace preserved
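The flow above can be sketched as a small loop over a shared state. This is a minimal sketch, not the code in `agents/orchestrator.py`: `AgentState`, `run_agent`, the `epsilon` threshold, and the iteration cap are all assumed names and values, and the sketch runs every tool in a fixed order on each iteration instead of selecting the next action dynamically.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentState:
    """Shared state: inputs, tool outputs, confidence history, and trace."""
    documents: list[str]
    results: dict[str, object] = field(default_factory=dict)
    confidence_history: list[float] = field(default_factory=list)
    trace: list[str] = field(default_factory=list)


def run_agent(
    state: AgentState,
    actions: dict[str, Callable[[AgentState], object]],
    evaluate: Callable[[AgentState], float],
    max_iterations: int = 5,   # assumed safety cap
    epsilon: float = 0.01,     # assumed "stabilized" threshold
) -> AgentState:
    for iteration in range(1, max_iterations + 1):
        # Act: run each tool and store its output in the shared state.
        for name, tool in actions.items():
            state.results[name] = tool(state)
            state.trace.append(f"iteration {iteration}: ran {name}")

        # Evaluate: recompute confidence after every iteration.
        confidence = evaluate(state)
        state.confidence_history.append(confidence)
        state.trace.append(f"iteration {iteration}: confidence={confidence:.2f}")

        # Stop once confidence changes by less than epsilon between iterations.
        history = state.confidence_history
        if len(history) >= 2 and abs(history[-1] - history[-2]) < epsilon:
            state.trace.append("stopping: confidence stabilized")
            break
    return state
```

The `trace` list plays the role of the reasoning trace: each entry records which tool ran and what confidence resulted, so the final output can be explained step by step.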

📊 Confidence & Alignment Logic

Confidence is computed using interpretable mathematical signals:

  • Agreement: consistency of extracted entities across documents

  • Coverage: presence of all required entities

  • Contradictions: numeric or semantic conflicts, which reduce confidence

Contradictions override agreement: a detected conflict lowers confidence even when the remaining entities agree, which keeps the scoring conservative.
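One way these three signals could be combined is shown below. The weights, the contradiction cap, and the 0 to 100 scaling are assumptions chosen for illustration; the formula actually used in `agents/evaluator_agent.py` and `tools/score_tool.py` is not given in this README.

```python
def compute_confidence(agreement: float, coverage: float, contradictions: int) -> float:
    """Combine agreement and coverage (both in [0, 1]) with a contradiction count.

    All weights and caps here are illustrative assumptions.
    """
    if not (0.0 <= agreement <= 1.0 and 0.0 <= coverage <= 1.0):
        raise ValueError("agreement and coverage must be in [0, 1]")

    base = 0.6 * agreement + 0.4 * coverage

    if contradictions > 0:
        # Override: any contradiction caps confidence, however high agreement is.
        # The cap tightens by 0.1 per additional contradiction, down to zero.
        cap = max(0.0, 0.5 - 0.1 * (contradictions - 1))
        return min(base, cap)
    return base


def alignment_score(confidence: float) -> float:
    """Map confidence in [0, 1] onto a 0-100 alignment score."""
    return round(confidence * 100, 1)
```

With these assumed numbers, three documents that agree fully and cover every required entity score 100, while the same documents with a single contradiction score at most 50.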


📂 Project Structure

Autonomous_Hacks_AI_Hackathon/
│
├── agents/
│   ├── orchestrator.py        # Core agent loop & decisions
│   ├── planner_agent.py       # Strategy planning
│   ├── evaluator_agent.py     # Confidence calculation
│
├── tools/
│   ├── summarizer_tool.py
│   ├── claim_tool.py
│   ├── compare_tool.py
│   ├── detect_tool.py
│   ├── numeric_contradiction_tool.py
│   └── score_tool.py
│
├── core/
│   └── io.py                  # PDF / text ingestion
│
├── api/
│   └── routes.py              # FastAPI endpoints
│
├── ui/
│   └── index.html             # Drag-and-drop UI
│
├── main.py                    # Application entrypoint
├── requirements.txt
└── README.md

🖥️ User Interface

  • Drag & drop 3–5 documents

  • Add / remove documents dynamically

  • Displays:

    • Alignment score
    • Confidence score
    • Agent reasoning timeline

The UI is intentionally minimal so that the logic, reasoning, and explainability stay in focus.


▶️ How to Run the Project

1️⃣ Create Virtual Environment

Windows

python -m venv env
env\Scripts\activate

Linux / macOS

python3 -m venv env
source env/bin/activate

2️⃣ Install Dependencies

pip install -r requirements.txt

3️⃣ Start the Server

uvicorn main:app --reload

4️⃣ Access the Application

Web UI

http://127.0.0.1:8000

Swagger API Docs

http://127.0.0.1:8000/docs
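This README does not list the endpoint paths. Since FastAPI serves its OpenAPI schema at `/openapi.json` by default, a small script can list the routes once the server is running. The helper names below are assumptions for this example, and it presumes the project has not disabled or moved the default schema URL.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def fetch_openapi(base_url: str = "http://127.0.0.1:8000") -> dict:
    """Download the OpenAPI schema from a running server."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as response:
        return json.load(response)


def list_routes(schema: dict) -> list[tuple[str, str]]:
    """Return sorted (METHOD, path) pairs found in an OpenAPI schema."""
    routes = []
    for path, operations in schema.get("paths", {}).items():
        for method in operations:
            # Path items may also hold non-method keys such as "parameters".
            if method in HTTP_METHODS:
                routes.append((method.upper(), path))
    return sorted(routes)
```

With the server started as in step 3, calling `list_routes(fetch_openapi())` returns the available method and path pairs, which is the same information the Swagger page renders.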

🧪 Test PDFs (in the test directory) & Expected Results

✅ Test Set 1 — Mostly Aligned

  • Test1_Doc1.pdf
  • Test1_Doc2.pdf
  • Test1_Doc3.pdf

Expected: High alignment score (≈ 80–95)


⚠️ Test Set 2 — Numeric Contradiction

  • Test2_Doc1.pdf
  • Test2_Doc2.pdf
  • Test2_Doc3.pdf

Expected: Medium score, numeric conflict detected
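A numeric conflict of the kind this test set exercises can be detected by comparing the value each document reports for the same entity. The sketch below is an assumption about how such a check might look, not the contents of `tools/numeric_contradiction_tool.py`; the function name and the 5% relative tolerance are hypothetical.

```python
from itertools import combinations


def find_numeric_conflicts(
    values_by_doc: dict[str, float],
    rel_tolerance: float = 0.05,  # assumed threshold
) -> list[tuple[str, str]]:
    """Return document pairs whose values differ by more than rel_tolerance."""
    conflicts = []
    for (doc_a, a), (doc_b, b) in combinations(sorted(values_by_doc.items()), 2):
        largest = max(abs(a), abs(b))
        if largest == 0:
            continue  # both values are zero, so they agree
        if abs(a - b) / largest > rel_tolerance:
            conflicts.append((doc_a, doc_b))
    return conflicts
```

A relative tolerance lets small rounding differences (120 versus 121) pass, while a gap such as 120 versus 95 is flagged.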


❌ Test Set 3 — Direct Contradictions

  • Test3_Doc1.pdf
  • Test3_Doc2.pdf
  • Test3_Doc3.pdf

Expected: Low score, antonym contradictions
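An antonym contradiction can be flagged with a fixed lookup table, which keeps the check deterministic. The word pairs and function names below are hypothetical and only illustrate the idea; the project's own list (presumably in `tools/detect_tool.py`) is not shown here.

```python
import re

# Hypothetical antonym pairs; exact word forms only, no stemming.
ANTONYMS = {
    ("increased", "decreased"),
    ("rose", "fell"),
    ("profit", "loss"),
}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def find_antonym_conflicts(text_a: str, text_b: str) -> list[tuple[str, str]]:
    """Return (word_in_a, word_in_b) pairs that are listed antonyms."""
    words_a, words_b = tokenize(text_a), tokenize(text_b)
    conflicts = []
    for left, right in sorted(ANTONYMS):
        if left in words_a and right in words_b:
            conflicts.append((left, right))
        elif right in words_a and left in words_b:
            conflicts.append((right, left))
    return conflicts
```

This sketch ignores context, so it would also flag two documents that use opposite words about different entities. Pairing it with the extracted entity for each sentence would reduce such false positives.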


🧪 Test Set 4 — Coverage / Missing Claims

  • Test4_Doc1.pdf
  • Test4_Doc2.pdf
  • Test4_Doc3.pdf

Expected: Low–medium score, coverage mismatch


🏁 Final One-Line Summary

An LLM-free, deterministic, agentic document verification system that autonomously extracts, compares, and scores factual consistency across multiple documents with full explainability.

