An LLM-free, autonomous, agentic document analysis system that classifies documents, detects discrepancies, and computes a confidence-aware alignment score using deterministic multi-step reasoning.
This project solves Problem Statement 4: Document Classifier + Discrepancy Detector.
Given 3–5 short documents (PDF or text), the system autonomously:

- Reads and preprocesses documents
- Extracts structured factual claims
- Compares claims across documents
- Detects numeric and semantic contradictions
- Computes:
  - Alignment Score
  - Confidence Score
  - Explainable reasoning trace
The system follows a plan → act → evaluate → refine loop and does not rely on LLMs or prompt-based generation.
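As a minimal sketch of that loop (every name here is hypothetical and does not match the project's actual modules in `agents/`), the plan → act → evaluate → refine cycle can be expressed as:

```python
# Hypothetical sketch of the plan -> act -> evaluate -> refine loop.
# Names and the toy confidence formula are illustrative only.

def make_plan(state):
    # Plan: decide the next action based on what is still missing.
    return "extract" if not state["claims"] else "compare"

def execute(action, state):
    # Act: stand-in tool execution that records an extracted claim.
    if action == "extract":
        state["claims"].append({"entity": "revenue", "value": 10.0})
    return action

def evaluate(state):
    # Evaluate: toy confidence that grows as claims are extracted.
    return min(1.0, 0.5 * len(state["claims"]) + 0.4)

def run_agent(documents, max_iters=5, target=0.85):
    state = {"docs": documents, "claims": [], "trace": [], "confidence": 0.0}
    for step in range(max_iters):
        action = make_plan(state)              # plan
        execute(action, state)                 # act
        state["confidence"] = evaluate(state)  # evaluate
        state["trace"].append((step, action, state["confidence"]))
        if state["confidence"] >= target:      # stop when confident enough
            break                              # otherwise refine and repeat
    return state

state = run_agent(["doc1", "doc2", "doc3"])
```

Each iteration appends to the trace, which is what makes the final reasoning explainable.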
This system does NOT use any Large Language Model (LLM).
It is a deterministic, rule-based agentic pipeline built using:
- Regex-based fact extraction
- Structured reasoning
- Mathematical evaluation
This design is intentional and optimized for:
- Auditability
- Reliability
- Zero hallucination
- Enterprise and financial use-cases
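To illustrate what regex-based fact extraction can look like (the pattern, entity names, and output fields below are assumptions for the sketch, not the project's exact rules):

```python
import re

# Hypothetical pattern for numeric financial claims, e.g.
# "Revenue was $4.2 million" -> {"entity": "revenue", "value": 4.2, "unit": "million"}
CLAIM_PATTERN = re.compile(
    r"(?P<entity>revenue|profit|expenses)\s+(?:was|is|of)\s+\$?"
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>million|billion)?",
    re.IGNORECASE,
)

def extract_claims(text):
    claims = []
    for m in CLAIM_PATTERN.finditer(text):
        claims.append({
            "entity": m.group("entity").lower(),
            "value": float(m.group("value")),
            "unit": (m.group("unit") or "").lower(),
        })
    return claims

claims = extract_claims("Revenue was $4.2 million and profit was $1.1 million.")
```

Because the pattern is deterministic, the same input always yields the same structured claims — nothing is generated or guessed.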
This project explicitly avoids:
- ❌ One-shot prompt → output
- ❌ Chatbot-style demos
- ❌ Probabilistic hallucinations
- ❌ Black-box reasoning
- ❌ Static or linear pipelines
Instead, it implements:
- ✅ Autonomous agent orchestration
- ✅ Multi-step reasoning loop
- ✅ Internal state & memory
- ✅ Self-evaluation & stopping criteria
- ✅ Fully explainable decision trace
ASCII diagrams are intentionally used for clarity and judge readability.
```
       User Uploads Documents
                ↓
         FastAPI API Layer
                ↓
         OrchestratorAgent
                ↓
┌─────────────────────────────┐
│        PlannerAgent         │
│   (decides strategy &       │
│    required entities)       │
└──────────────┬──────────────┘
               ↓
┌─────────────────────────────┐
│    Tool Execution Layer     │
│                             │
│  - Summarizer Tool          │
│  - Claim Extraction Tool    │
│  - Comparison Tool          │
│  - Contradiction Tools      │
└──────────────┬──────────────┘
               ↓
┌─────────────────────────────┐
│       EvaluatorAgent        │
│  (confidence calculation &  │
│   self-assessment)          │
└──────────────┬──────────────┘
               ↓
┌─────────────────────────────┐
│         Score Tool          │
│  (final alignment score)    │
└──────────────┬──────────────┘
               ↓
          Final Output
```
The agent loop proceeds in six stages:

1. Planning
   - Planner scans documents
   - Determines required entities (e.g. revenue, profit, expansion)
2. Action Selection
   - Agent autonomously selects the next action:
     - summarize
     - extract claims
     - compare
     - detect contradictions
     - refine strategy
3. Tool Execution
   - Each action is executed via a dedicated tool
   - Outputs are stored in shared state
4. Self-Evaluation
   - Agreement, coverage, and contradictions are assessed
   - Confidence is recomputed after each iteration
5. Iteration / Refinement
   - Strategy is refined if confidence is low
   - Agent stops automatically when confidence stabilizes
6. Final Scoring
   - Alignment score is derived from confidence
   - Full reasoning trace is preserved
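The "confidence stabilizes" stopping rule can be sketched as a plateau check over the confidence history (the window size and epsilon below are illustrative assumptions, not the project's tuned values):

```python
def has_stabilized(history, window=2, epsilon=0.01):
    # Hypothetical stopping check: confidence counts as stable when the
    # last `window` iteration-to-iteration changes are all below `epsilon`.
    if len(history) < window + 1:
        return False
    recent = history[-(window + 1):]
    return all(abs(b - a) < epsilon for a, b in zip(recent, recent[1:]))

print(has_stabilized([0.4, 0.7, 0.8]))           # still climbing -> False
print(has_stabilized([0.4, 0.8, 0.805, 0.806]))  # plateaued -> True
```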
Confidence is computed using interpretable mathematical signals:

- Agreement: consistency of extracted entities across documents
- Coverage: presence of all required entities
- Contradictions: numeric or semantic conflicts reduce confidence

Contradictions override agreement, ensuring conservative and realistic scoring.
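One way these signals could combine (the weights and penalty below are assumptions for illustration, not the project's actual formula) is a weighted score where any contradiction caps the result:

```python
def confidence_score(agreement, coverage, contradictions, weights=(0.6, 0.4)):
    # agreement and coverage are in [0, 1]; contradictions is a count.
    # Weights are illustrative. Contradictions override agreement by
    # capping the score and applying a multiplicative penalty,
    # which keeps the result conservative.
    base = weights[0] * agreement + weights[1] * coverage
    if contradictions > 0:
        base = min(base, 0.5) * (0.8 ** contradictions)
    return round(base, 3)

print(confidence_score(agreement=0.9, coverage=1.0, contradictions=0))  # 0.94
print(confidence_score(agreement=0.9, coverage=1.0, contradictions=2))  # 0.32
```

Note how two contradictions drag a 0.94 score down to 0.32 even though agreement and coverage are high — the conservative behavior the design calls for.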
```
Autonomous_Hacks_AI_Hackathon/
│
├── agents/
│   ├── orchestrator.py               # Core agent loop & decisions
│   ├── planner_agent.py              # Strategy planning
│   └── evaluator_agent.py            # Confidence calculation
│
├── tools/
│   ├── summarizer_tool.py
│   ├── claim_tool.py
│   ├── compare_tool.py
│   ├── detect_tool.py
│   ├── numeric_contradiction_tool.py
│   └── score_tool.py
│
├── core/
│   └── io.py                         # PDF / text ingestion
│
├── api/
│   └── routes.py                     # FastAPI endpoints
│
├── ui/
│   └── index.html                    # Drag-and-drop UI
│
├── main.py                           # Application entrypoint
├── requirements.txt
└── README.md
```
- Drag & drop 3–5 documents
- Add / remove documents dynamically
- Displays:
  - Alignment score
  - Confidence score
  - Agent reasoning timeline
UI is intentionally minimal to highlight logic, reasoning, and explainability.
Windows:

```shell
python -m venv env
env\Scripts\activate
```

Linux / macOS:

```shell
python3 -m venv env
source env/bin/activate
```

Install dependencies and start the server:

```shell
pip install -r requirements.txt
uvicorn main:app --reload
```

Web UI: http://127.0.0.1:8000

Swagger API Docs: http://127.0.0.1:8000/docs
Test 1:
- Test1_Doc1.pdf
- Test1_Doc2.pdf
- Test1_Doc3.pdf

Expected: High alignment score (≈ 80–95)

Test 2:
- Test2_Doc1.pdf
- Test2_Doc2.pdf
- Test2_Doc3.pdf

Expected: Medium score, numeric conflict detected

Test 3:
- Test3_Doc1.pdf
- Test3_Doc2.pdf
- Test3_Doc3.pdf

Expected: Low score, antonym contradictions

Test 4:
- Test4_Doc1.pdf
- Test4_Doc2.pdf
- Test4_Doc3.pdf

Expected: Low–medium score, coverage mismatch
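The numeric-conflict expectation in Test 2 can be illustrated with a relative-difference check (the 5% tolerance here is an assumption for the sketch, not the project's configured threshold):

```python
def numeric_contradiction(a, b, tolerance=0.05):
    # Flag two numeric claims about the same entity as contradictory when
    # their relative difference exceeds the tolerance (assumed 5%).
    if a == b == 0:
        return False
    rel_diff = abs(a - b) / max(abs(a), abs(b))
    return rel_diff > tolerance

print(numeric_contradiction(4.2, 4.2))  # identical values -> False
print(numeric_contradiction(4.2, 5.0))  # ~16% apart -> True
```

A relative (rather than absolute) difference keeps the check meaningful whether the claims are in units, millions, or billions.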
An LLM-free, deterministic, agentic document verification system that autonomously extracts, compares, and scores factual consistency across multiple documents with full explainability.