Automated script to download full bhavcopy and security deliverable data from NSE India website.
download_nse_bhavcopy.py: Downloads NSE bhavcopy files for the specified date range.
analyze_existing_files.py: Analyzes existing NSE bhavcopy files in a directory and identifies missing dates.
indian_holidays.py: Shared module containing NSE market holidays. It loads actual holiday dates from a comprehensive CSV file (667+ holidays, 1990-2024+) and falls back to basic recurring holidays if the file is unavailable. Both the downloader and the analyzer use this module to skip days when the market is closed.
- ✅ Automated navigation through NSE website with calendar-based date selection
- ✅ Configurable date range via command-line arguments
- ✅ Weekly browser session batching (Monday-Friday) for improved performance
- ✅ Intelligent operation skipping - full navigation only on first day of week
- ✅ Automatic weekend and public holiday skipping with logged entries
- ✅ Shared public holiday configuration across all scripts
- ✅ Rotating browser user agents for each weekly browser session
- ✅ Optimized wait times (5 seconds per selector)
- ✅ 60-second download wait with retry mechanism
- ✅ Random sleep intervals between downloads (3-7 seconds)
- ✅ Comprehensive logging to file and console (stored in logs/ folder)
- ✅ CSV summary table with download status, file size, and pandas shape analysis
- ✅ Tracks failed downloads with weekday information
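The weekly batching and skip rules in the list above can be sketched roughly as follows. This is an illustration only, not the script's actual code: `group_trading_days_by_week` and its `holidays` argument are hypothetical names, and grouping by ISO week is an assumption about how "Monday-Friday" sessions are formed.

```python
from datetime import date, timedelta
from itertools import groupby


def group_trading_days_by_week(start, end, holidays=frozenset()):
    """Return a list of weekly batches of trading days between start and end.

    Weekends and dates found in `holidays` are skipped. Each batch holds the
    trading days that share one ISO week, i.e. one browser session.
    """
    days = []
    current = start
    while current <= end:
        # weekday() is 0 for Monday through 6 for Sunday
        if current.weekday() < 5 and current not in holidays:
            days.append(current)
        current += timedelta(days=1)
    # (ISO year, ISO week) as the key keeps weeks from merging across years
    return [list(batch) for _, batch in groupby(days, key=lambda d: d.isocalendar()[:2])]


batches = group_trading_days_by_week(date(2025, 7, 1), date(2025, 7, 10))
for batch in batches:
    first_day, *rest = batch
    print(f"new session, full navigation on {first_day}; date selection only on {rest}")
```

Only the first day of each batch would need the full navigation, search, and checkbox steps; the remaining days reuse the open session.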
- Python 3.8 or higher
- Google Chrome browser installed
- ChromeDriver (managed automatically by Selenium)
Create or refresh requirements.txt from pyproject.toml:
uv pip compile pyproject.toml -o requirements.txt
Then install dependencies into your environment:
uv sync
If you prefer not to use uv, you can use requirements.txt directly with pip:
python -m venv .venv
.venv\Scripts\Activate.ps1
pip install -r requirements.txt
This repository uses pre-commit hooks to automatically fix basic linting issues before commits. The hooks handle:
- Removing trailing whitespace
- Ensuring files end with a newline
- Formatting Python code with Black
- Sorting imports with isort
- Linting with flake8
- Formatting Markdown files
- Install pre-commit:
pip install pre-commit
Or using uv:
uv add --dev pre-commit
- Install the hooks:
pre-commit install
- (Optional) Run on all files:
pre-commit run --all-files
Now, pre-commit will automatically run on every commit, fixing issues where possible.
To automatically fix linting issues when saving files (before committing), configure your editor to format on save. This complements the pre-commit hooks and provides immediate feedback.
- Install the following extensions:
  - Python (Microsoft)
  - Pylint (Microsoft) or Flake8
  - Black Formatter
  - isort
  - Prettier - Code formatter (for Markdown)
- Update your settings.json (User or Workspace):
{
  "python.formatting.provider": "black",
  "python.formatting.blackArgs": ["--line-length", "88"],
  "python.sortImports.args": ["--profile", "black"],
  "editor.formatOnSave": true,
  "editor.codeActionsOnSave": {
    "source.organizeImports": true
  },
  "files.trimTrailingWhitespace": true,
  "files.insertFinalNewline": true,
  "[python]": {
    "editor.defaultFormatter": "ms-python.black-formatter"
  },
  "[markdown]": {
    "editor.defaultFormatter": "esbenp.prettier-vscode"
  }
}
- For Markdown formatting, ensure Prettier is configured for .md files.
This setup will automatically format Python code with Black, sort imports with isort, remove trailing whitespace, ensure files end with newlines, and format Markdown files on save.
Run with the default settings:
python download_nse_bhavcopy.py
This will download data from July 1-10, 2025 to data/202507/
Specify a custom date range:
python download_nse_bhavcopy.py --start-date 2025-06-01 --end-date 2025-06-15
Specify a custom output directory:
python download_nse_bhavcopy.py --start-date 2025-07-01 --end-date 2025-07-10 --output-dir data/custom_folder
Run through uv:
uv run python download_nse_bhavcopy.py --start-date 2025-07-01 --end-date 2025-07-10
Arguments:
- --start-date: Start date in YYYY-MM-DD format (default: 2025-07-01)
- --end-date: End date in YYYY-MM-DD format (default: 2025-07-10)
- --output-dir: Output directory (default: data/YYYYMM based on start date)
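As a rough illustration of how these three options and their defaults fit together, here is a minimal argparse sketch. The function name `parse_args` and the end-before-start check are assumptions for the example; the real script may parse and validate its arguments differently.

```python
import argparse
from datetime import datetime


def parse_args(argv=None):
    """Parse the documented options and return (start, end, output_dir)."""
    parser = argparse.ArgumentParser(description="Download NSE bhavcopy files")
    parser.add_argument("--start-date", default="2025-07-01", help="Start date (YYYY-MM-DD)")
    parser.add_argument("--end-date", default="2025-07-10", help="End date (YYYY-MM-DD)")
    parser.add_argument("--output-dir", default=None, help="Output directory (default: data/YYYYMM)")
    args = parser.parse_args(argv)

    # strptime raises ValueError if a date is not in YYYY-MM-DD format
    start = datetime.strptime(args.start_date, "%Y-%m-%d").date()
    end = datetime.strptime(args.end_date, "%Y-%m-%d").date()
    if end < start:
        # Assumed validation, not confirmed by the script
        parser.error("--end-date must not be earlier than --start-date")

    # Derive data/YYYYMM from the start date when no directory is given
    output_dir = args.output_dir or f"data/{start:%Y%m}"
    return start, end, output_dir


print(parse_args([]))
print(parse_args(["--start-date", "2025-06-01", "--end-date", "2025-06-15"]))
```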
Download data for entire June 2025:
python download_nse_bhavcopy.py --start-date 2025-06-01 --end-date 2025-06-30
Download single day:
python download_nse_bhavcopy.py --start-date 2025-07-15 --end-date 2025-07-15
Use analyze_existing_files.py to scan a directory containing NSE bhavcopy files:
uv run python analyze_existing_files.py --input-dir "C:\path\to\nse\data" --output-dir analysis
This will generate:
- existing_files_summary.csv - Details of all found files with size and shape
- missing_files.csv - List of missing weekday dates (excluding weekends and public holidays)
Arguments:
- --input-dir: Directory containing NSE CSV files (required)
- --output-dir: Where to save analysis results (default: analysis)
- --no-recursive: Search only in the specified directory, not subdirectories
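The missing-date detection can be pictured with the sketch below. It is not the analyzer's actual code: the function names are hypothetical, and it assumes the range to check runs from the earliest to the latest date found in the file names. The file name pattern follows the `sec_bhavdata_full_DDMMYYYY.csv` convention used by the downloader.

```python
import re
from datetime import datetime, timedelta

FILENAME_PATTERN = re.compile(r"sec_bhavdata_full_(\d{8})\.csv$")


def dates_from_filenames(filenames):
    """Extract the DDMMYYYY date embedded in each matching file name."""
    found = set()
    for name in filenames:
        match = FILENAME_PATTERN.search(name)
        if match:
            found.add(datetime.strptime(match.group(1), "%d%m%Y").date())
    return found


def missing_trading_days(found, holidays=frozenset()):
    """List weekdays between the first and last found date that have no file."""
    if not found:
        return []
    missing = []
    current, last = min(found), max(found)
    while current <= last:
        is_trading_day = current.weekday() < 5 and current not in holidays
        if is_trading_day and current not in found:
            missing.append(current)
        current += timedelta(days=1)
    return missing


files = [
    "sec_bhavdata_full_01072025.csv",
    "sec_bhavdata_full_02072025.csv",
    "sec_bhavdata_full_07072025.csv",
]
print(missing_trading_days(dates_from_filenames(files)))
```

In this example the gap covers Thursday 3 July and Friday 4 July 2025; the weekend in between is not reported.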
The indian_holidays.py module manages NSE market holidays:
Primary Source: Loads from comprehensive CSV file with 667+ actual NSE holidays (1990-2024+)
- Path: nse_holidays.csv (in repository root)
- Includes all festival holidays, national holidays, and special market closures
Fallback: If CSV file is unavailable, uses basic recurring holidays:
- Republic Day - January 26
- Labour Day - May 1
- Independence Day - August 15
- Gandhi Jayanti - October 2
- Christmas - December 25
Both scripts automatically skip these dates as the market is closed.
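The primary-source-with-fallback behaviour described above might look roughly like this. It is a sketch under assumptions: `load_holidays` is a hypothetical name, and the CSV is assumed to have a `date` column with YYYY-MM-DD values, which the real nse_holidays.csv may not match.

```python
import csv
from datetime import date, datetime
from pathlib import Path

# Fallback recurring holidays as (month, day), taken from the list above
RECURRING_HOLIDAYS = [(1, 26), (5, 1), (8, 15), (10, 2), (12, 25)]


def load_holidays(csv_path, years, date_column="date"):
    """Return a set of holiday dates.

    Reads `date_column` (assumed name and YYYY-MM-DD format) from the CSV.
    If the file is missing, falls back to the recurring holidays for each
    year in `years`.
    """
    path = Path(csv_path)
    if path.is_file():
        with path.open(newline="", encoding="utf-8") as handle:
            return {
                datetime.strptime(row[date_column], "%Y-%m-%d").date()
                for row in csv.DictReader(handle)
            }
    return {date(year, month, day) for year in years for month, day in RECURRING_HOLIDAYS}


# A path that does not exist, to demonstrate the fallback branch
holidays = load_holidays("no_such_file.csv", years=[2025])
print(sorted(holidays))
```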
The script creates the following:
- Downloaded CSV files: sec_bhavdata_full_DDMMYYYY.csv
- download_log_YYYYMMDD_YYYYMMDD.txt: Detailed log of all operations
- download_summary_YYYYMMDD_YYYYMMDD.csv: Summary table with columns:
  - Date
  - Weekday
  - Status (Success/Failed/Skipped)
  - Filename
  - Error (if any)
  - File_Size_KB
  - Rows (from pandas shape analysis)
  - Columns (from pandas shape analysis)
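To make the summary columns concrete, the sketch below builds one summary row for a given date. The helper names are hypothetical, and the standard-library csv module stands in for the pandas shape analysis so the example has no third-party dependencies; the Skipped status for weekends and holidays is not covered here.

```python
import csv
from datetime import date
from pathlib import Path


def expected_filename(day):
    """Build the sec_bhavdata_full_DDMMYYYY.csv name for a date."""
    return f"sec_bhavdata_full_{day:%d%m%Y}.csv"


def summarize_download(output_dir, day):
    """Return one summary row (as a dict) for the file expected on `day`."""
    path = Path(output_dir) / expected_filename(day)
    row = {
        "Date": day.isoformat(),
        "Weekday": day.strftime("%A"),
        "Status": "Failed",
        "Filename": path.name,
        "Error": "",
        "File_Size_KB": None,
        "Rows": None,
        "Columns": None,
    }
    if not path.is_file():
        row["Error"] = "file not found"
        return row
    with path.open(newline="", encoding="utf-8") as handle:
        records = list(csv.reader(handle))
    row["Status"] = "Success"
    row["File_Size_KB"] = round(path.stat().st_size / 1024, 2)
    row["Rows"] = max(len(records) - 1, 0)  # data rows, header excluded
    row["Columns"] = len(records[0]) if records else 0
    return row


print(expected_filename(date(2025, 7, 1)))
print(summarize_download("data/202507", date(2025, 7, 1)))
```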
You can modify the following variables in the script:
- SLEEP_MIN and SLEEP_MAX: Sleep interval range between downloads (default: 3-7 seconds)
- USER_AGENTS: List of browser user agents to rotate
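For orientation, these variables could be used along the following lines. The user agent strings are placeholders, `pick_delay` is a hypothetical helper, and round-robin rotation is an assumption (the script may instead pick agents at random).

```python
import itertools
import random

SLEEP_MIN = 3  # seconds
SLEEP_MAX = 7  # seconds
USER_AGENTS = [  # placeholder values for illustration
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ExampleBrowser/2.0",
]


def pick_delay():
    """Return a random delay in seconds within [SLEEP_MIN, SLEEP_MAX]."""
    return random.uniform(SLEEP_MIN, SLEEP_MAX)


agents = itertools.cycle(USER_AGENTS)  # assumed round-robin rotation

for session in range(3):
    user_agent = next(agents)
    delay = pick_delay()
    print(f"session {session}: {user_agent} (would sleep {delay:.1f}s)")
    # In real use, call time.sleep(delay) between downloads
```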
- The script uses Selenium WebDriver with Chrome to automate browser interactions
- Direct URL navigation to archives page: https://www.nseindia.com/all-reports#cr_equity_archives
- Calendar interaction using gj-picker class for date selection
- Weekly browser session batching: new session starts Monday, reused through Friday
- First day of week performs full navigation, search, and checkbox selection
- Subsequent days in the same week skip directly to date selection for speed
- Weekends (Saturday/Sunday) are automatically skipped with logged entries
- Public holidays are automatically skipped with logged entries (see indian_holidays.py)
- User agents are rotated for each weekly session
- Sleep intervals are randomized to avoid triggering rate limits
- Failed downloads are logged with their weekday for reference
- Download waits up to 60 seconds with 2-second retry interval
- File analysis performed after all downloads complete using pandas
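The 60-second download wait with a 2-second retry interval can be sketched as a simple polling loop. `wait_for_download` is a hypothetical name, and treating "file exists and is non-empty" as the completion signal is an assumption; the script may detect completion differently.

```python
import tempfile
import time
from pathlib import Path


def wait_for_download(path, timeout=60.0, interval=2.0):
    """Poll until `path` exists and is non-empty, or until `timeout` elapses.

    Returns True if the file appeared in time, otherwise False.
    """
    target = Path(path)
    deadline = time.monotonic() + timeout
    while True:
        if target.is_file() and target.stat().st_size > 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Short timeouts keep the demonstration fast
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sec_bhavdata_full_01072025.csv"
    print(wait_for_download(sample, timeout=0.5, interval=0.1))  # file absent
    sample.write_text("SYMBOL,SERIES\n")
    print(wait_for_download(sample, timeout=0.5, interval=0.1))  # file present
```

Using `time.monotonic()` for the deadline keeps the timeout unaffected by system clock adjustments during the wait.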