GitHub - earlblamier/pdf-organizer: PDF Organizer is a tool for managing multiple PDFs efficiently. It counts records per file, processes and organizes multiple PDFs, merges files, generates reports, and counts files in folders. Ideal for streamlining document workflows and keeping PDFs well-organized. 🚀

PDF Organizer

Version: 2.4.0-RC
Release Date: April 15, 2024
Author: Earl Lamier ([email protected])

Description
PDF Organizer is a Python-based tool designed to process, organize, and manage PDF files. It automates tasks such as adding sequence numbers to pages, grouping pages, merging PDFs, and generating detailed reports. This tool is ideal for efficiently handling large batches of PDF files.

Features

Adds sequence numbers to the lower-left corner of PDF pages.
Counts records per PDF file.
Processes multiple PDF files and organizes them into folders.
Merges PDF files into a single document.
Generates detailed reports for processed PDFs.
Counts and organizes files into folders based on specific criteria.
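
The file-counting feature can be illustrated with a short standard-library sketch. It is not taken from the script: the function name `count_pdfs_per_folder` is hypothetical, and the real tool may apply different criteria.

```python
from collections import Counter
from pathlib import Path


def count_pdfs_per_folder(root):
    """Return {folder relative to root: number of PDF files directly inside it}."""
    root = Path(root)
    counts = Counter()
    for path in root.rglob("*"):
        # Compare the extension case-insensitively so "SCAN.PDF" is counted too.
        if path.is_file() and path.suffix.lower() == ".pdf":
            counts[path.parent.relative_to(root).as_posix()] += 1
    return dict(counts)
```

PDFs that sit directly in `root` are reported under the key `"."`.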

Requirements

Python 3.9 or higher
Libraries:
- arrow
- csv (part of the Python standard library; no installation needed)
- fitz (PyMuPDF)
- matplotlib
- numpy
- pandas
- PyPDF2
- reportlab
- pdfminer.six
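
Before running the tool, it can help to confirm that the libraries above are importable. The sketch below is one way to do that and is not part of the project; `REQUIRED_MODULES` and `missing_modules` are hypothetical names. Two import names differ from the package names: PyMuPDF is imported as `fitz`, and pdfminer.six as `pdfminer`.

```python
from importlib.util import find_spec

# Import names, not package names: PyMuPDF -> "fitz", pdfminer.six -> "pdfminer".
# "csv" is left out because it ships with Python.
REQUIRED_MODULES = [
    "arrow", "fitz", "matplotlib", "numpy",
    "pandas", "PyPDF2", "reportlab", "pdfminer",
]


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]
```

Calling `missing_modules(REQUIRED_MODULES)` returns an empty list when every dependency is available.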

Installation

Clone the repository:
(( git clone https://github.com/earlblamier/pdf-organizer.git ))
(( cd pdf-organizer )) # Navigate to the project directory
Install the required Python libraries:
(( pip install -r requirements.txt )) # Install dependencies from requirements.txt

Usage

Place the PDF files you want to process in the same directory as the script.
Run the script:
(( python PDF_Organizer_App_Product_v2.4.0-RC.py )) # Run the main Python script
Follow the prompts to enter the operator name and work order number.
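
The prompts for operator name and work order number could be handled along these lines. This is a sketch under assumptions: the helper `prompt_nonempty` is hypothetical, and whether the actual script rejects blank input is not stated in this README.

```python
def prompt_nonempty(label, read=input):
    """Ask until a non-blank answer is given; `read` can be swapped out in tests."""
    while True:
        answer = read(f"{label}: ").strip()
        if answer:
            return answer
        print(f"{label} cannot be empty.")
```

Typical use would be `operator = prompt_nonempty("Operator name")` followed by `work_order = prompt_nonempty("Work order number")`.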

Outputs

Processed PDFs: Organized into folders based on page groups.
Reports:
- Data_log_<date>.csv: Logs details of processed pages.
- Report_<date>.csv: Summary of processed PDFs, including record counts and total images.
- Group_page_counts_<date>.csv: Grouped page counts with total images and records.
Merged PDFs: Combined PDFs for each group.
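
To illustrate the report naming pattern, here is a minimal sketch that builds a dated file name and writes a summary CSV. The ISO date format, the column names, and the helpers `report_path` and `write_summary` are all assumptions; the script's actual date format and columns may differ.

```python
import csv
from datetime import date
from pathlib import Path


def report_path(prefix, day, folder="."):
    """Build a path such as Report_2024-04-15.csv (ISO date format is assumed)."""
    return Path(folder) / f"{prefix}_{day.isoformat()}.csv"


def write_summary(path, rows):
    """Write one row per processed PDF; the column names are illustrative only."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["file", "records", "images"])
        writer.writeheader()
        writer.writerows(rows)
```

For example, `write_summary(report_path("Report", date.today()), rows)` writes the summary into the current directory.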

Key Functions

pageData
Extracts and groups pages from PDF files into a DataFrame and organizes them into folders.
extractPages
Processes PDF files, groups pages, and creates blank PDFs for odd-numbered groups.
processFilesPdf
Processes PDF files and organizes them into folders based on page tags.
createReport
Generates a CSV report summarizing the processed PDFs.
mergePdf
Merges grouped PDF files into a single PDF for each group.
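
The grouping step that these functions share can be sketched without any PDF library. The version below uses plain dictionaries rather than the pandas DataFrame that `pageData` builds, and both the input shape, `(file name, page number, tag)` tuples, and the name `group_pages_by_tag` are hypothetical.

```python
from collections import defaultdict


def group_pages_by_tag(pages):
    """Map each tag to its (file name, page number) pairs, preserving input order."""
    groups = defaultdict(list)
    for file_name, page_number, tag in pages:
        groups[tag].append((file_name, page_number))
    return dict(groups)
```

Each resulting group would then correspond to one output folder and, later, one merged PDF.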

Example Workflow

Place your PDF files in the directory.
Run the script and provide the required inputs.
The script will:
- Extract and group pages.
- Create blank pages for odd-numbered groups.
- Organize PDFs into folders.
- Generate reports.
- Merge grouped PDFs into single files.
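
The blank-page step can be read as padding any group with an odd page count to an even one, which matches the revision note about 5-page PDFs. That reading is an interpretation, not something this README states outright. A minimal sketch, where `BLANK` and `pad_to_even` are hypothetical names:

```python
BLANK = None  # hypothetical placeholder standing in for a blank page


def pad_to_even(pages):
    """Return a copy of `pages` with one blank appended when the count is odd."""
    padded = list(pages)
    if len(padded) % 2 == 1:
        padded.append(BLANK)
    return padded
```

A 5-page group becomes 6 entries, while groups that already have an even number of pages are returned unchanged.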

Known Issues

Blank PDF Creation: A race condition may occur when creating the first blank PDF file during long processes.
File Locking: Ensure CSV files are not open in another program while running the script.
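
One way to detect the file-locking problem before a long run is to try opening each report file for appending first. This is a sketch and not part of the script; `can_write` is a hypothetical helper. Opening in append mode creates the file if it does not exist yet.

```python
def can_write(path):
    """Return True if `path` can be opened for appending right now."""
    try:
        with open(path, "a", encoding="utf-8"):
            return True
    except OSError:
        # PermissionError, a subclass of OSError, is what Windows typically
        # raises when another program holds the file open.
        return False
```

If `can_write` returns `False` for a report file, close it in the other program and run the script again.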

Revision History
2.4.0-RC (April 15, 2024)

Fixed bugs related to merging 5-page PDFs with blanks.
Created blank pages for odd-numbered groups.

2.2

Fixed merge sort order.

2.0

Switched from PyPDF2 to PyMuPDF and pdfminer.six for text extraction.

1.11

Added user input for operator name and work order.
Updated logging and bar chart generation.

License
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).
You can find the full license text in the LICENSE file.

Author
Developed by Earl Lamier
Contact: [email protected]

Acknowledgments
Special thanks to the Python community for providing the libraries used in this project.

💖 Support my Projects

If you find my projects useful, consider supporting me by buying me a coffee or a meal.
