Skip to content

An intelligent CV tailoring system that optimizes resumes for specific job postings using advanced LLM technology and cosine similarity analysis.

Notifications You must be signed in to change notification settings

Creative-Geek/Knitty

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

17 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

Knitty: Resume Factory

Knitty Logo
An intelligent CV tailoring system that optimizes resumes for specific job postings using advanced LLM technology and cosine similarity analysis.

โš ๏ธ ALPHA VERSION WARNING โš ๏ธ

This entire project is currently in ALPHA stage and should be considered experimental. Both the FastAPI backend and Streamlit GUI are under active development and not recommended for production use. Use at your own risk for testing and experimentation purposes only.

๐Ÿš€ Overview

Knitty is a production-ready AI-powered CV tailoring system that optimizes resumes for specific job postings. The system analyzes your existing CV alongside a target job posting, then generates an optimized version that better aligns with the job requirements while maintaining authenticity and professional quality.

You can access Knitty through:

  • โš™๏ธ FastAPI Backend (โš ๏ธ ALPHA): RESTful API for programmatic access and integration
  • ๐Ÿ–ฅ๏ธ Streamlit GUI (โš ๏ธ ALPHA): User-friendly web interface for interactive CV enhancement

โš ๏ธ Both interfaces are experimental and in active development.

โœจ Key Features

  • ๐Ÿ”„ Multi-LLM Architecture: Leverages four specialized AI models for different tasks
  • ๐Ÿ“„ PDF Processing: Extracts and processes CV content from PDF files
  • ๐ŸŒ Web Scraping: Automatically fetches job postings from LinkedIn and other platforms
  • ๐Ÿง  RAG-Powered Analysis: Uses Retrieval-Augmented Generation for intelligent job content extraction
  • ๐Ÿ“Š Similarity Scoring: Measures CV-job alignment using cosine similarity
  • ๐ŸŽฏ Smart Enhancement: Generates tailored CVs while maintaining professional quality
  • ๐Ÿ” Iterative Improvement: Automatically retries generation if similarity doesn't improve
  • ๐Ÿ“ฑ HTML Export: Converts final CVs to professional HTML format

๐Ÿ—๏ธ Architecture

The system operates through a sophisticated multi-stage pipeline:

---
config:
  theme: redux
---
flowchart TD
    A["CV PDF"] --> B["PDF Text Extraction"]
    A1["Additional Info"] --> B1["Combine CV Content"]
    B --> B1
    B1 --> C["CV Keyword Extraction<br>Fast LLM"] & D["CV Embedding Generation<br>Embed LLM"] & M["Smart LLM Enhancement"]
    E["Job URL/Text"] --> F{"Input Type?"}
    F -- URL --> G["Web Scraping<br>Playwright + BeautifulSoup"]
    F -- Text --> H["Direct Text Processing"]
    G --> I["RAG Processing<br>Context LLM + Vector Store"]
    H --> I
    I --> J["Job Keyword Extraction<br>Fast LLM"] & K["Job Embedding Generation<br>Embed LLM"] & M
    C --> L["Cosine Similarity Analysis"] & M
    D --> L
    J --> L & M
    K --> L
    L --> M
    M --> N["Enhanced CV Generation"]
    N --> O["Similarity Check"]
    O -- Improved --> P["Final CV Output"]
    O -- Not Improved --> Q["Iterative Retry"]
    Q --> M
    P --> R["HTML Export<br>Special Sauce API"] & S["Markdown Export"]
     A:::Ash
     M:::Ash
     E:::Ash
     L:::Ash
     P:::Ash
    classDef Ash stroke-width:1px, stroke-dasharray:none, stroke:#999999, fill:#EEEEEE, color:#000000
    style A fill:#e1f5fe
    style M fill:#f3e5f5
    style E fill:#e8f5e8
    style L fill:#fff3e0
    style P fill:#e8f5e8

Loading

LLM Specialization

  • Fast LLM (for example: GPT-OSS-120B): Quick keyword extraction from CV and job postings
  • Context LLM: Large-context processing for web-scraped content and RAG operations
  • Embed LLM: Generates high-quality embeddings for similarity analysis
  • Smart LLM (Currently using: Gemini 2.5 Pro): Advanced CV enhancement and generation

๐Ÿ› ๏ธ Technical Implementation

Core Technologies

  • LangChain: Document processing, embeddings, and LLM orchestration
  • Playwright: Robust web scraping for job postings
  • BeautifulSoup: HTML cleaning and text extraction
  • NumPy: Mathematical operations for similarity calculations
  • PyPDF: PDF text extraction and processing

Advanced Features

  • RAG Pipeline: Uses InMemoryVectorStore for intelligent content retrieval
  • Async Processing: Non-blocking web scraping with proper Windows compatibility
  • Template System: YAML-based CV templates for consistent formatting
  • Configuration Management: File-based prompt and template management
  • Error Handling: Comprehensive error recovery and retry mechanisms

๐Ÿš€ Getting Started

Installation

# Clone the repository
git clone https://github.com/Creative-Geek/Knitty.git
cd Knitty

# Install dependencies
pip install -e .

# Install Playwright browsers for web scraping
playwright install chromium

Environment Setup

Create a .env file in the project root with your LLM API configurations:

# Fast LLM (Keyword Extraction) - e.g., Groq Llama 3
FAST_LLM_API_KEY="your-api-key"
FAST_LLM_API_BASE="https://api.groq.com/openai/v1"
FAST_LLM_MODEL_NAME="llama3-8b-8192"

# Smart LLM (CV Enhancement) - e.g., OpenAI GPT or Google Gemini
SMART_LLM_API_KEY="your-api-key"
SMART_LLM_API_BASE="https://api.openai.com/v1"
SMART_LLM_MODEL_NAME="gpt-4-turbo"

# Embedding LLM (Similarity Analysis) - e.g., OpenAI Embeddings
EMBED_LLM_API_KEY="your-api-key"
EMBED_LLM_API_BASE="https://api.openai.com/v1"
EMBED_LLM_MODEL_NAME="text-embedding-ada-002"

# Optional: HTML Generation Service
SPECIAL_SAUCE_API_URL="https://cv-service.com/generate"
SPECIAL_SAUCE_API_KEY="your-service-key"

Running the Application

โš ๏ธ ALPHA WARNING: Both interfaces are experimental. Use for testing and development only.

Option 1: FastAPI Backend (โš ๏ธ ALPHA)

python app.py
# or
uvicorn app:app --host 0.0.0.0 --port 8000 --reload

Access the API at http://localhost:8000 with interactive API docs at http://localhost:8000/docs

Option 2: Streamlit GUI (โš ๏ธ ALPHA)

streamlit run streamlit_app.py

Open your browser to http://localhost:8501 and start enhancing CVs through an intuitive interface.

Option 3: Jupyter Notebook

jupyter notebook notebook.ipynb

Great for experimentation and research.

๐Ÿ“Š How It Works

1. Input Processing

  • Extracts text from PDF CVs using PyPDFLoader
  • Combines CV content with additional information
  • Processes job postings via web scraping or direct text input

2. Keyword Analysis

  • Uses Fast LLM to extract professional keywords from both sources
  • Returns structured JSON arrays of relevant terms and phrases
  • Focuses on technical skills, experience, and job requirements

3. Similarity Measurement

  • Generates embeddings for keyword sets using specialized models
  • Calculates cosine similarity to quantify CV-job alignment
  • Establishes baseline metrics for improvement tracking

4. Intelligent Enhancement

  • Uses Smart LLM to generate optimized CV based on analysis
  • Maintains professional tone and template compliance
  • Includes iterative improvement with similarity feedback

5. Quality Assurance

  • Measures improvement in cosine similarity scores
  • Automatically retries if enhancement doesn't improve alignment
  • Exports final CV in both markdown and HTML formats

๐Ÿ“ˆ Results

The system typically achieves:

  • Baseline Similarity: 40-70% initial CV-job alignment
  • Improvement Range: +5% to +15% similarity increase
  • Quality Metrics: Professional presentation with strategic keyword integration
  • Processing Time: ~30-60 seconds for complete enhancement

๐Ÿ”ฎ Future Vision

You'll have to wait for that ๐Ÿ˜‰

๐Ÿค Contributing

We welcome contributions to enhance and evolve Knitty! Areas where we could use help:

  • Frontend Enhancement: Improve the Streamlit GUI with additional features
  • Backend Optimization: Performance improvements to the FastAPI server
  • Model Fine-tuning: Optimize LLM prompts and enhance CV generation quality
  • DevOps: Docker containerization and cloud deployment setup
  • Testing: Expand test coverage and add integration tests
  • Documentation: API documentation and user guides
  • Feature Development: New features like batch processing, templates, and analytics

๐Ÿ“š Documentation

โš–๏ธ License

This project is currently in research and development phase. License terms will be established as the project evolves toward production release.

๐Ÿ™ Acknowledgments

  • LangChain team for excellent LLM orchestration tools
  • OpenAI for embedding and language model APIs
  • Playwright team for robust web automation capabilities
  • The open-source community for inspiration and foundational technologies

Ready to revolutionize your job applications? Get started with Knitty today!

About

An intelligent CV tailoring system that optimizes resumes for specific job postings using advanced LLM technology and cosine similarity analysis.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •