PyMarkup

A Python toolkit for estimating firm-level markups using production function-based marginal cost recovery.

Installation

```
git clone https://github.com/immortalsRDJ/PyMarkup
cd PyMarkup
uv sync --python 3.10
```

For WRDS data downloads, add the wrds extra:

```
uv sync --extra wrds
```

Quick Start

Option 1: Run Everything in One Go

The easiest way to use PyMarkup is `run_all()`, which handles the entire pipeline:

```
from PyMarkup import MarkupPipeline, PipelineConfig

config = PipelineConfig(
    compustat_path="Input/DLEU/Compustat_annual.csv",
    macro_vars_path="Input/DLEU/macro_vars_new.xlsx",
    fred_api_key="your-fred-api-key",      # Or set FRED_API_KEY env var
    data_dir="Input",
)

pipeline = MarkupPipeline(config)
results = pipeline.run_all(
    download=True,           # Download data from WRDS/FRED/BLS
    skip_compustat=True,     # Skip if you already have Compustat data
    generate_figures=True,   # Generate output figures
)
results.save(output_dir="Output/", format="csv")
```

Option 2: Command Line

```
# Full pipeline with config file
pymarkup run-all --config config.yaml

# Skip download step (use existing data)
pymarkup run-all --config config.yaml --skip-download

# Skip only Compustat download (no WRDS credentials needed)
pymarkup run-all --config config.yaml --skip-compustat
```

Option 3: Step by Step

If you prefer more control, run each step separately:

```
from PyMarkup import MarkupPipeline, PipelineConfig, EstimatorConfig

config = PipelineConfig(
    compustat_path="Input/DLEU/Compustat_annual.csv",
    macro_vars_path="Input/DLEU/macro_vars_new.xlsx",
    estimator=EstimatorConfig(method="wooldridge_iv"),
)

pipeline = MarkupPipeline(config)
results = pipeline.run()  # Runs data prep -> estimation -> markup calculation
results.save(output_dir="Output/", format="csv")
```

Configuration

Setting Up Credentials

Copy the example config file:
```
cp config.example.yaml config.yaml
```

Edit `config.yaml` with your credentials:
```
fred_api_key: "your-fred-api-key"
wrds_username: "your-wrds-username"
```

Alternatively, set the environment variables `FRED_API_KEY` and `WRDS_USERNAME`.
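
As a rough illustration of how the two credential sources could be combined, the sketch below looks up a value in the environment first and falls back to the config file. The helper `resolve_credential` is hypothetical, and the precedence (environment over file) is an assumption, not documented PyMarkup behavior.

```python
import os
from typing import Mapping, Optional


def resolve_credential(
    name: str,
    file_config: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the environment value if set, otherwise the config-file value.

    Hypothetical helper: the environment-first precedence is an assumption.
    """
    env = os.environ if env is None else env
    env_value = env.get(name.upper())  # e.g. fred_api_key -> FRED_API_KEY
    if env_value:
        return env_value
    return file_config.get(name.lower())


# Stand-ins for a parsed config.yaml and for os.environ.
file_config = {"fred_api_key": "key-from-yaml", "wrds_username": "user-from-yaml"}
fake_env = {"FRED_API_KEY": "key-from-env"}

fred_key = resolve_credential("fred_api_key", file_config, env=fake_env)
wrds_user = resolve_credential("wrds_username", file_config, env=fake_env)
```

Passing `env` explicitly keeps the lookup testable without touching the real process environment.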

Data Requirements

| Data Source | Credentials | How to Get |
| --- | --- | --- |
| Compustat (WRDS) | WRDS account | Register at WRDS |
| CPI (FRED) | FRED API key | Free at FRED |
| PPI (BLS) | None | Public data |
| Macro variables | N/A | Included in repo: `Input/DLEU/macro_vars_new.xlsx` |
| NAICS descriptions | N/A | Included in repo: `Input/Other/NAICS_2D_Description.xlsx` |

Pipeline Overview

Download -> Data Preparation -> Elasticity Estimation -> Markup Calculation -> Figures

1. Data Download

Downloads raw data from external sources:

```
from PyMarkup.data import download_compustat, download_cpi, download_ppi, load_config

config = load_config("config.yaml")
download_ppi(config)        # No credentials needed
download_cpi(config)        # Requires FRED API key
download_compustat(config)  # Requires WRDS credentials
```

2. Data Preparation

Cleans and prepares the Compustat panel:

Deduplicates firm-year observations
Extracts NAICS industry codes
Deflates monetary values by the GDP deflator
Computes market shares
Trims outliers
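
The first four steps can be sketched on a toy panel with plain Python. This is only an illustration under stated assumptions: the column names follow Compustat conventions (`gvkey`, `fyear`, `naics`, `sale`), the deflator values are made up, and defining market share within a two-digit NAICS sector and year is an assumption about what the package does. The trimming rule is not specified above, so the sketch stops at market shares.

```python
from collections import defaultdict

# Toy firm-year rows; values and the deflator are made up for illustration.
rows = [
    {"gvkey": "001", "fyear": 2000, "naics": "334110", "sale": 120.0},
    {"gvkey": "001", "fyear": 2000, "naics": "334110", "sale": 120.0},  # duplicate
    {"gvkey": "002", "fyear": 2000, "naics": "334290", "sale": 80.0},
    {"gvkey": "003", "fyear": 2000, "naics": "445110", "sale": 50.0},
]
deflator = {2000: 0.8}  # hypothetical GDP deflator, base year = 1.0

# 1. Deduplicate: keep the first row seen for each (firm, year).
unique = {}
for row in rows:
    unique.setdefault((row["gvkey"], row["fyear"]), dict(row))
panel = list(unique.values())

# 2. Extract the two-digit NAICS sector, and 3. convert sales to real terms.
for row in panel:
    row["naics2"] = row["naics"][:2]
    row["sale_real"] = row["sale"] / deflator[row["fyear"]]

# 4. Market share of each firm within its sector-year (assumed definition).
totals = defaultdict(float)
for row in panel:
    totals[(row["naics2"], row["fyear"])] += row["sale_real"]
for row in panel:
    row["share"] = row["sale_real"] / totals[(row["naics2"], row["fyear"])]
```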

3. Elasticity Estimation

Estimates output elasticity of variable inputs (θ) at the industry-year level:

| Method | Class | Use Case |
| --- | --- | --- |
| Wooldridge IV | `WooldridgeIVEstimator` | Main method, addresses endogeneity via IV/2SLS |
| Cost Share | `CostShareEstimator` | Fast baseline, no regression needed |
| ACF | `ACFEstimator` | Robustness, two-stage GMM with control function |

```
from PyMarkup.estimators import WooldridgeIVEstimator

estimator = WooldridgeIVEstimator(specification="spec2")
elasticities = estimator.estimate_elasticities(panel_data)
```
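
To make the "no regression needed" idea behind the cost-share baseline concrete, here is a minimal sketch that takes θ to be the median cost share within each industry-year. Using the median, the column names, and the function `cost_share_elasticities` are all assumptions for illustration; `CostShareEstimator` may aggregate differently.

```python
from collections import defaultdict
from statistics import median

# Toy firm-year observations; cogs and capital_expense are illustrative values.
observations = [
    {"naics2": "33", "fyear": 2000, "cogs": 60.0, "capital_expense": 40.0},
    {"naics2": "33", "fyear": 2000, "cogs": 80.0, "capital_expense": 20.0},
    {"naics2": "33", "fyear": 2000, "cogs": 70.0, "capital_expense": 30.0},
    {"naics2": "44", "fyear": 2000, "cogs": 90.0, "capital_expense": 10.0},
]


def cost_share_elasticities(rows):
    """Median of COGS / (COGS + capital expense) per (industry, year)."""
    groups = defaultdict(list)
    for row in rows:
        share = row["cogs"] / (row["cogs"] + row["capital_expense"])
        groups[(row["naics2"], row["fyear"])].append(share)
    return {key: median(shares) for key, shares in groups.items()}


theta = cost_share_elasticities(observations)
```

The result is a dictionary keyed by `(naics2, fyear)`, which mirrors the industry-year level at which the elasticities are described above.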

4. Markup Calculation

Computes firm-level markups:

```
markup = θ / cost_share
where cost_share = COGS / (COGS + capital_expense)
```
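
The formula as written above translates directly into a few lines of Python. The function name `firm_markup` and the input checks are illustrative assumptions, not part of the PyMarkup API.

```python
def firm_markup(theta: float, cogs: float, capital_expense: float) -> float:
    """Markup = theta / cost_share, with cost_share = COGS / (COGS + capital expense)."""
    total_cost = cogs + capital_expense
    if cogs <= 0 or total_cost <= 0:
        raise ValueError("cogs and total cost must be positive")
    cost_share = cogs / total_cost
    return theta / cost_share


# Illustrative numbers: theta = 0.85 and a cost share of 60 / (60 + 40) = 0.6.
markup = firm_markup(theta=0.85, cogs=60.0, capital_expense=40.0)
```

Rejecting non-positive costs up front avoids a division by zero and flags rows that should have been removed during data preparation.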

5. Figures

| Figure | Function | Description |
| --- | --- | --- |
| Aggregate Markup | `plot_aggregate_markup()` | Time series of aggregate markups |
| PPI vs Markup | `plot_markup_vs_ppi()` | Scatter plot with weighted OLS regression |
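
The quantities behind these two figures can be sketched without any plotting. The example below assumes the aggregate markup is a sales-weighted mean per year and that the regression weights are supplied by the caller; both choices, and the helper names `aggregate_markup` and `weighted_ols`, are assumptions rather than the package's documented behavior.

```python
from collections import defaultdict


def aggregate_markup(rows):
    """Sales-weighted mean markup per year (assumed weighting scheme)."""
    num = defaultdict(float)
    den = defaultdict(float)
    for row in rows:
        num[row["fyear"]] += row["sale"] * row["markup"]
        den[row["fyear"]] += row["sale"]
    return {year: num[year] / den[year] for year in sorted(den)}


def weighted_ols(x, y, w):
    """Intercept and slope of y on x minimizing sum(w * residual**2)."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Toy data: three firm-years and a perfectly linear relation y = 1 + 2x.
firms = [
    {"fyear": 2000, "sale": 100.0, "markup": 1.2},
    {"fyear": 2000, "sale": 300.0, "markup": 1.6},
    {"fyear": 2001, "sale": 200.0, "markup": 1.5},
]
series = aggregate_markup(firms)
intercept, slope = weighted_ols([1.0, 2.0, 3.0], [3.0, 5.0, 7.0], [1.0, 2.0, 1.0])
```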

6. Decomposition (Optional)

Dynamic Olley-Pakes decomposition of aggregate markup changes:

```
from PyMarkup.decomposition import OlleyPakesDecomposition, plot_decomposition

op = OlleyPakesDecomposition()
decomp_results = op.decompose(firm_markups)
plot_decomposition(decomp_results, output_path="Output/decomposition.pdf")
```
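
For intuition, the sketch below implements a two-period dynamic Olley-Pakes decomposition in the form proposed by Melitz and Polanec: the change in the share-weighted aggregate is split into a within-survivor term, a reallocation (covariance) term among survivors, and entry and exit contributions. Whether `OlleyPakesDecomposition` uses exactly this form is an assumption; the function names and toy data are hypothetical, and the sketch requires at least one surviving firm.

```python
def group_stats(firms, period):
    """Return (group share, share-weighted mean, unweighted mean) for a firm set."""
    if not firms:
        return 0.0, 0.0, 0.0
    share = sum(period[f][0] for f in firms)
    weighted = sum(period[f][0] * period[f][1] for f in firms) / share
    unweighted = sum(period[f][1] for f in firms) / len(firms)
    return share, weighted, unweighted


def dynamic_op(period1, period2):
    """Decompose the change in the aggregate markup between two periods.

    Each period maps firm id -> (market share, markup); shares sum to 1.
    """
    survivors = period1.keys() & period2.keys()
    exiters = period1.keys() - survivors
    entrants = period2.keys() - survivors

    _, agg_s1, mean_s1 = group_stats(survivors, period1)
    _, agg_s2, mean_s2 = group_stats(survivors, period2)
    share_x1, agg_x1, _ = group_stats(exiters, period1)
    share_e2, agg_e2, _ = group_stats(entrants, period2)

    return {
        "within": mean_s2 - mean_s1,
        # Covariance term: weighted mean minus unweighted mean among survivors.
        "reallocation": (agg_s2 - mean_s2) - (agg_s1 - mean_s1),
        "entry": share_e2 * (agg_e2 - agg_s2),
        "exit": share_x1 * (agg_s1 - agg_x1),
    }


# Toy example: firm C exits after period 1, firm D enters in period 2.
period1 = {"A": (0.5, 1.2), "B": (0.3, 1.5), "C": (0.2, 1.1)}
period2 = {"A": (0.4, 1.3), "B": (0.4, 1.6), "D": (0.2, 2.0)}
components = dynamic_op(period1, period2)

aggregate1 = sum(s * m for s, m in period1.values())
aggregate2 = sum(s * m for s, m in period2.values())
```

The four components sum to `aggregate2 - aggregate1`, which is a useful sanity check for any implementation of this decomposition.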

CLI Reference

```
# Run full pipeline
pymarkup run-all --config config.yaml [OPTIONS]
  --skip-download      Skip data download step
  --skip-compustat     Skip Compustat download only
  --skip-cpi           Skip CPI download only
  --skip-ppi           Skip PPI download only
  --no-figures         Skip figure generation
  --output PATH        Output directory (default: Output/)

# Run estimation only (requires existing data)
pymarkup estimate --config config.yaml

# Download data only
pymarkup download all --config config.yaml
pymarkup download ppi                        # PPI only, no credentials
pymarkup download cpi --config config.yaml   # CPI only

# Validate input data
pymarkup validate Input/DLEU/Compustat_annual.csv
```
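
What `pymarkup validate` actually checks is not described here. As a rough idea of the kind of checks an input validator might run on a Compustat extract, the sketch below verifies that a few columns exist, that firm-years are unique, and that monetary fields are numeric. The required column list and the function `validate_rows` are assumptions for illustration, not PyMarkup's schema.

```python
import csv
import io

# Assumed minimal column set (Compustat-style names), not PyMarkup's schema.
REQUIRED_COLUMNS = ("gvkey", "fyear", "sale", "cogs")


def validate_rows(handle):
    """Return a list of human-readable problems found in a CSV file handle."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]

    problems = []
    seen = set()
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        key = (row["gvkey"], row["fyear"])
        if key in seen:
            problems.append(f"line {line_no}: duplicate firm-year {key}")
        seen.add(key)
        for column in ("sale", "cogs"):
            try:
                float(row[column])
            except ValueError:
                problems.append(f"line {line_no}: {column} is not numeric")
    return problems


# In-memory sample with one duplicate firm-year and one non-numeric value.
sample = io.StringIO(
    "gvkey,fyear,sale,cogs\n"
    "001,2000,120.0,60.0\n"
    "001,2000,120.0,60.0\n"
    "002,2000,abc,40.0\n"
)
problems = validate_rows(sample)
```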

Project Structure

```
src/PyMarkup/
├── core/              # Data preparation, markup calculation, figures
├── data/              # Data downloaders and loaders
├── estimators/        # WooldridgeIV, CostShare, ACF estimators
├── pipeline/          # MarkupPipeline orchestrator, config
├── decomposition/     # Dynamic Olley-Pakes decomposition
├── io/                # I/O schemas (Pydantic)
└── cli/               # CLI commands

Input/                 # Raw data (not version controlled)
Intermediate/          # Generated datasets, theta estimates
Output/                # Figures and tables
```

License

MIT License
