IEEE C37.118.2 microPMU data frame toolkit for synchrophasor data conversion, generation, inspection, and compression verification.
Produces exact TCP wire-format IEEE C37.118.2 binary streams from multiple data sources: LBNL microPMU event CSVs, LBNL continuous archive channel files, and a built-in synthetic generator with 23 composable power system scenarios. Parses and inspects binary streams, exports to CSV, and verifies compression integrity with bit-exact or lossy tolerance comparison.
- Convert LBNL microPMU CSV files to IEEE C37.118.2 binary format
- Batch convert entire event libraries with preserved directory structure
- Generate synthetic PMU streams with 23 composable power system scenarios (voltage, current, frequency, fault, status, timing, encoding edge cases)
- Inspect binary files with hex dumps, metadata summaries, and CSV export
- Verify compression integrity (bit-exact or lossy tolerance comparison)
- Import LBNL continuous archive data (streaming gzip channel files) with time slicing and chunking
- Download LBNL archive channel files with progress bars and resume support
- Generate a diverse test corpus covering the full IEEE C37.118.2 parameter space
Output files are raw TCP wire-format -- byte-for-byte identical to what a PDC would receive from a real microPMU over TCP port 4712:
```text
[CFG-2 frame][Data frame @ t=0][Data frame @ t=8.33ms][Data frame @ t=16.67ms]...
```
With session framing flags (--header-text, --include-cfg1, --include-cfg3, --include-commands), the full IEEE C37.118.2 session structure is produced:
```text
[HDR][CFG-1][CFG-2][CFG-3][CMD:TurnOn][Data][Data]...[CFG-2 retransmit]...[CMD:TurnOff]
```
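Because the output is a plain concatenation of frames, a reader can walk it using only the first four bytes of each frame: the 2-byte SYNC word and the 2-byte big-endian FRAMESIZE. The sketch below illustrates that idea. `FrameKind` and `split_frames` are hypothetical names chosen for this example, not the toolkit's parser API. The frame-type mapping follows bits 6-4 of the second SYNC byte as defined by IEEE C37.118.2, and CRCs are deliberately not checked here.

```rust
/// Frame kind encoded in bits 6-4 of the second SYNC byte.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FrameKind {
    Data,
    Header,
    Cfg1,
    Cfg2,
    Command,
    Cfg3,
    Unknown,
}

/// Hypothetical helper: splits a concatenated stream into whole frames
/// using only SYNC and FRAMESIZE. CRCs are not validated.
pub fn split_frames(stream: &[u8]) -> Result<Vec<(FrameKind, &[u8])>, String> {
    let mut frames = Vec::new();
    let mut pos = 0;
    while pos < stream.len() {
        let rest = &stream[pos..];
        if rest.len() < 4 {
            return Err(format!("truncated frame header at offset {pos}"));
        }
        if rest[0] != 0xAA {
            return Err(format!("missing SYNC byte 0xAA at offset {pos}"));
        }
        let kind = match (rest[1] >> 4) & 0x07 {
            0 => FrameKind::Data,
            1 => FrameKind::Header,
            2 => FrameKind::Cfg1,
            3 => FrameKind::Cfg2,
            4 => FrameKind::Command,
            5 => FrameKind::Cfg3,
            _ => FrameKind::Unknown,
        };
        // FRAMESIZE counts the whole frame, SYNC through CRC (big-endian).
        let size = usize::from(u16::from_be_bytes([rest[2], rest[3]]));
        // 14-byte common header + 2-byte CRC is the smallest possible frame.
        if size < 16 || size > rest.len() {
            return Err(format!("invalid FRAMESIZE {size} at offset {pos}"));
        }
        frames.push((kind, &rest[..size]));
        pos += size;
    }
    Ok(frames)
}
```

Calling `split_frames` on the bytes of a `.c37` file would be expected to yield a `Cfg2` frame first, followed by `Data` frames, matching the layout shown above.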
- Rust 1.85+ (Edition 2024)
- Cargo
```bash
cargo build --release
```

The optimized binary is at target/release/upmu-dataframes.
```bash
upmu-dataframes convert \
  --input event_data.csv \
  --output event_data.c37
```

```bash
# Random mix of scenarios (deterministic with seed)
upmu-dataframes generate \
  --output synthetic.c37 \
  --duration 60 \
  --scenario random_mix \
  --seed 42

# Specific scenarios
upmu-dataframes generate \
  --output synthetic.c37 \
  --duration 30 \
  --scenario sag,motor_start,freq_event
```

```bash
upmu-dataframes inspect --input event_data.c37
upmu-dataframes inspect --input event_data.c37 --hexdump --max-frames 10
upmu-dataframes inspect --input event_data.c37 --csv exported.csv
```

```bash
# Bit-exact comparison
upmu-dataframes verify \
  --original original.c37 \
  --decompressed decompressed.c37

# Lossy comparison with tolerances
upmu-dataframes verify \
  --original original.c37 \
  --decompressed decompressed.c37 \
  --mode lossy \
  --mag-tolerance 0.01 \
  --angle-tolerance 0.001 \
  --compressed compressed.bin \
  --json
```

```bash
upmu-dataframes batch \
  --input-dir /path/to/lbnl-events \
  --output-dir /path/to/output
```

```bash
# Download archive files for a location
upmu-dataframes download-archive \
  --location a6_bus1 \
  --output-dir ./vendor/lbnl_archive

# Import 5 minutes of archive data starting 1 hour in
upmu-dataframes import-archive \
  --input-dir ./vendor/lbnl_archive \
  --location a6_bus1 \
  --output archive.c37 \
  --offset 3600 \
  --duration 300
```

Convert a single LBNL microPMU CSV to IEEE C37.118.2 binary.
| Flag | Default | Description |
|---|---|---|
| `--input` | (required) | LBNL CSV input file |
| `--output` | (required) | Output .c37 binary file |
| `--station-name` | `LBNL-uPMU` | PMU station identifier (max 16 chars) |
| `--idcode` | `1` | PMU stream ID (1-65534) |
| `--format` | `polar` | Phasor notation: polar or rect |
| `--encoding` | `float32` | Data encoding: float32 or int16 |
| `--data-rate` | `120` | Reporting rate in frames per second |
Convert all CSV files in an LBNL event library directory tree.
| Flag | Default | Description |
|---|---|---|
| `--input-dir` | (required) | Root of LBNL event library |
| `--output-dir` | (required) | Output directory for .c37 files |
| `--format` | `polar` | Phasor notation: polar or rect |
| `--encoding` | `float32` | Data encoding: float32 or int16 |
Generate synthetic IEEE C37.118.2 data with composable power system scenarios.
| Flag | Default | Description |
|---|---|---|
| `--output` | (required) | Output .c37 binary file |
| `--duration` | `60` | Stream duration in seconds |
| `--rate` | `120` | Reporting rate in frames per second |
| `--encoding` | `float32` | Data encoding: float32 or int16 |
| `--voltage-ln` | `7200` | Base line-to-neutral voltage (Vrms) |
| `--current` | `300` | Base current (Arms) |
| `--scenario` | none | Comma-separated scenario names or preset (see below) |
| `--seed` | none | RNG seed for deterministic output |
| `--inject-sag` | off | Legacy flag (equivalent to `--scenario sag`) |
| `--station-name` | `SYNTH-PMU` | Station identifier |
| `--phasor-count` | `8` | Number of phasors (1-16) |
| `--nominal-freq` | `60` | Nominal grid frequency (50 or 60 Hz) |
| `--notation` | `polar` | Phasor notation: polar or rect |
| `--phasor-encoding` | (from `--encoding`) | Phasor data type: float32 or int16 |
| `--analog-encoding` | (from `--encoding`) | Analog data type: float32 or int16 |
| `--freq-encoding` | (from `--encoding`) | Frequency data type: float32 or int16 |
| `--analog-count` | `0` | Number of analog channels |
| `--analog-preset` | none | Analog preset: substation |
| `--digital-count` | `0` | Number of digital words |
| `--digital-preset` | none | Digital preset: breaker |
| `--pmu-count` | `1` | Number of PMUs in aggregated stream (1-256) |
| `--time-base` | `1000000` | TIME_BASE for FRACSEC (1-16777215) |
| `--cfg2-interval` | none | Re-emit CFG-2 every N seconds |
| `--config-count` | `1` | CFGCNT field in CFG-2 |
| `--header-text` | none | Prepend HDR frame with ASCII station description |
| `--include-cfg1` | off | Prepend CFG-1 capabilities frame before CFG-2 |
| `--include-cfg3` | off | Include CFG-3 extended config frame after CFG-2 |
| `--include-commands` | off | Wrap data with CMD TurnOnData/TurnOffData |
Scenario names: sag, swell, cap_switching, motor_start, pv_cloud, freq_event, angle_jump, fault_lg, fault_ll, fault_llg, near_zero_current, int16_boundary, timestamp_rollover, sync_loss, config_change, data_quality, trigger, gps_unlock, leap_second, invalid_measurement, missed_frames, duplicate_frames, timing_jitter
Presets: random_mix (3-6 random scenarios with randomized parameters), all (all scenario types placed sequentially)
Parse and display IEEE C37.118.2 binary file contents.
| Flag | Default | Description |
|---|---|---|
| `--input` | (required) | Input .c37 binary file |
| `--hexdump` | off | Print hex + ASCII dump of frames |
| `--max-frames` | all | Limit display to first N frames |
| `--csv` | none | Export parsed data frames as CSV |
Compare original vs. decompressed C37.118.2 streams.
| Flag | Default | Description |
|---|---|---|
| `--original` | (required) | Original .c37 file (pre-compression) |
| `--decompressed` | (required) | Decompressed .c37 file |
| `--compressed` | none | Compressed file (for ratio calculation) |
| `--mode` | `exact` | Verification mode: exact or lossy |
| `--mag-tolerance` | `0.01` | Magnitude tolerance (lossy mode) |
| `--angle-tolerance` | `0.001` | Angle tolerance in radians (lossy mode) |
| `--freq-tolerance` | `0.001` | Frequency tolerance in Hz (lossy mode) |
| `--json` | off | Output report as JSON |
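To make the lossy mode concrete, here is a minimal sketch of a per-phasor tolerance check. It assumes the tolerances are absolute (not relative) and that angle differences are measured the short way around the circle, so that +π and -π compare as equal. Neither assumption is confirmed by the text above, and `angle_diff` and `phasor_within` are hypothetical names, not the toolkit's API.

```rust
use std::f64::consts::PI;

/// Smallest absolute difference between two angles in radians.
/// Wrap-aware: angles just either side of +/-pi are treated as close.
pub fn angle_diff(a: f64, b: f64) -> f64 {
    let d = (a - b).rem_euclid(2.0 * PI);
    if d > PI { 2.0 * PI - d } else { d }
}

/// Returns true when a decompressed polar phasor (magnitude, angle) is
/// within the given absolute tolerances of the original.
pub fn phasor_within(
    original: (f64, f64),
    decompressed: (f64, f64),
    mag_tolerance: f64,
    angle_tolerance: f64,
) -> bool {
    let mag_ok = (original.0 - decompressed.0).abs() <= mag_tolerance;
    let angle_ok = angle_diff(original.1, decompressed.1) <= angle_tolerance;
    mag_ok && angle_ok
}
```

With the default tolerances from the table, `phasor_within((7200.0, 3.1414), (7200.005, -3.1414), 0.01, 0.001)` passes in this sketch, because the two angles are less than 0.001 rad apart across the wrap point.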
Import LBNL continuous archive gzip channel files into IEEE C37.118.2 binary. Data is streamed, so whole files are never loaded into memory.
| Flag | Default | Description |
|---|---|---|
| `--input-dir` | (required) | Directory containing .gz channel files |
| `--output` | (required) | Output .c37 file path |
| `--location` | (required) | Location: a6_bus1, bank_514, grizzly_bus1_2 |
| `--prefix` | none | Custom file prefix (overrides `--location`) |
| `--station-name` | `LBNL-ARCHIVE` | Station name in CFG-2 |
| `--idcode` | `1` | PMU stream ID (1-65534) |
| `--format` | `polar` | Phasor notation: polar or rect |
| `--encoding` | `float32` | Data encoding: float32 or int16 |
| `--data-rate` | `120` | Reporting rate in frames per second |
| `--offset` | none | Skip this many seconds from start |
| `--duration` | none | Process this many seconds |
| `--chunk-duration` | none | Split output into files of this duration |
Generate a diverse test corpus covering the full IEEE C37.118.2 parameter space.
| Flag | Default | Description |
|---|---|---|
| `--output-dir` | (required) | Output directory for .c37 files and manifest.json |
| `--seed` | `42` | Random seed for deterministic corpus |
| `--preset` | `medium` | Corpus size: small, medium, large |
Download LBNL continuous archive channel files from powerdata-download.lbl.gov.
| Flag | Default | Description |
|---|---|---|
| `--output-dir` | `./vendor/lbnl_archive` | Directory to save .gz files |
| `--location` | (required) | Location: a6_bus1, bank_514, grizzly_bus1_2 |
| `--channels` | `all` | Channel set: all, voltage, current |
| `--force` | off | Re-download existing files |
Each output .c37 file contains:
- 1+ CFG-2 frames -- configuration metadata (station name, phasor names, data rate, encoding); optionally retransmitted periodically via `--cfg2-interval`
- N data frames -- one per sample at the configured reporting rate
Each data frame contains one data block per PMU. A single-PMU frame with 8 float32 polar phasors, 0 analog channels, and 0 digital words is 90 bytes (14 + 2 + 64 + 4 + 4 + 2):
| Field | Bytes | Description |
|---|---|---|
| Common header | 14 | SYNC + FRAMESIZE + IDCODE + SOC + FRACSEC |
| Per PMU (repeated for each PMU in multi-PMU streams): | | |
| STAT | 2 | Status word (data error, sync, time quality, trigger reason) |
| Phasors | 8×N (float32) or 4×N (int16) | N phasors, polar (mag+angle) or rectangular (real+imag) |
| FREQ | 4 or 2 | Frequency deviation from nominal (Hz) |
| DFREQ | 4 or 2 | Rate of change of frequency (Hz/s) |
| Analogs | 4×A (float32) or 2×A (int16) | A analog measurement channels |
| Digitals | 2×D | D digital status words (16 bits each) |
| CRC | 2 | CRC-16/IBM-3740 checksum |
Frame size varies with configuration. Encoding can be set independently for phasors, analogs, and frequency.
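The size arithmetic in the table can be written out as a small function. This is a sketch for illustration only: `Encoding`, `PmuLayout`, and `data_frame_size` are hypothetical names, and the formula simply sums the field widths listed above (the 14-byte header and 2-byte CRC once per frame, everything else once per PMU).

```rust
/// Width selector for one field family (phasors, analogs, or frequency).
#[derive(Debug, Clone, Copy)]
pub enum Encoding {
    Float32,
    Int16,
}

impl Encoding {
    /// Bytes per scalar value.
    fn width(self) -> usize {
        match self {
            Encoding::Float32 => 4,
            Encoding::Int16 => 2,
        }
    }
}

/// Channel layout of one PMU block (hypothetical struct for this sketch).
pub struct PmuLayout {
    pub phasors: usize,
    pub analogs: usize,
    pub digitals: usize,
    pub phasor_encoding: Encoding,
    pub analog_encoding: Encoding,
    pub freq_encoding: Encoding,
}

/// Total data frame size in bytes, following the field table above.
pub fn data_frame_size(pmus: &[PmuLayout]) -> usize {
    let blocks: usize = pmus
        .iter()
        .map(|p| {
            2                                            // STAT
                + p.phasors * 2 * p.phasor_encoding.width() // two values per phasor
                + 2 * p.freq_encoding.width()            // FREQ + DFREQ
                + p.analogs * p.analog_encoding.width()
                + p.digitals * 2
        })
        .sum();
    14 + blocks + 2 // common header + PMU blocks + CRC
}
```

For the default layout (one PMU, 8 float32 phasors, no analogs or digitals) this gives 90 bytes; switching all three encodings to int16 gives 54 bytes.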
Protocol:
- CRC-16/IBM-3740 (poly=0x1021, init=0xFFFF) -- canonical test vector: `"123456789"` -> `0x29B1`
- Big-endian (network byte order) throughout
- SYNC words: Data=`0xAA02`, CFG-2=`0xAA32`
- Station names null-padded to 16 bytes per spec (not space-padded)
- SOC is 32-bit Unix timestamp; FRACSEC scaled by TIME_BASE
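The CRC parameters above fully determine the checksum, so it can be sketched in a few lines. This is a plain bitwise version written for clarity rather than speed. The function names are hypothetical, and the toolkit's own implementation in `common.rs` may differ (for example, by using a lookup table).

```rust
/// CRC-16/IBM-3740: poly 0x1021, init 0xFFFF, no reflection, no final XOR.
pub fn crc16_ibm3740(data: &[u8]) -> u16 {
    let mut crc: u16 = 0xFFFF;
    for &byte in data {
        crc ^= u16::from(byte) << 8;
        for _ in 0..8 {
            crc = if crc & 0x8000 != 0 {
                (crc << 1) ^ 0x1021
            } else {
                crc << 1
            };
        }
    }
    crc
}

/// Checks a whole frame whose last two bytes hold the big-endian CRC
/// of everything before them.
pub fn frame_crc_ok(frame: &[u8]) -> bool {
    if frame.len() < 2 {
        return false;
    }
    let (body, chk) = frame.split_at(frame.len() - 2);
    crc16_ibm3740(body) == u16::from_be_bytes([chk[0], chk[1]])
}
```

`crc16_ibm3740(b"123456789")` returns `0x29B1`, the canonical test vector quoted above.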
Configuration (CFG-2):
- Variable phasor count (1-16) with per-channel PHUNIT type (voltage/current) and scale
- Variable analog channel count with per-channel ANUNIT type and scale
- Variable digital word count with DIGUNIT normal-status/valid-bits masks
- Multi-PMU config frames (NUM_PMU 1-256, each with independent channel layout)
- Independent encoding per type: phasors, analogs, and frequency can each be float32 or int16
- Polar or rectangular phasor notation
- 50 Hz or 60 Hz nominal frequency (FNOM)
- Positive and negative DATA_RATE (fps or seconds-per-frame)
- Configurable TIME_BASE (default 1,000,000; supports any value 1-16,777,215)
- CONFIG_COUNT (CFGCNT) version counter
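Two of the timing fields above are easy to misread, so a short sketch may help. A positive DATA_RATE is frames per second, while a negative one is seconds per frame. The lower 24 bits of FRACSEC divided by TIME_BASE give the fraction of a second to add to SOC (the top byte carries time quality flags). Both function names are hypothetical and only illustrate the arithmetic.

```rust
/// Seconds between consecutive frames for a CFG-2 DATA_RATE value.
/// Positive: frames per second. Negative: seconds per frame. Zero is invalid.
pub fn frame_interval_secs(data_rate: i16) -> Option<f64> {
    match data_rate {
        0 => None,
        r if r > 0 => Some(1.0 / f64::from(r)),
        r => Some(-f64::from(r)),
    }
}

/// Combines SOC and FRACSEC into seconds since the Unix epoch.
/// Only the low 24 bits of FRACSEC are the fraction count; the top
/// 8 bits (time quality and leap second flags) are masked off.
pub fn timestamp_secs(soc: u32, fracsec: u32, time_base: u32) -> Option<f64> {
    if time_base == 0 {
        return None;
    }
    let fraction = fracsec & 0x00FF_FFFF;
    Some(f64::from(soc) + f64::from(fraction) / f64::from(time_base))
}
```

At the default rate of 120 with TIME_BASE 1,000,000, the second frame of a stream would carry a fraction count of about 8333, which `timestamp_secs` turns back into roughly 8.33 ms past the SOC second.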
Data frames:
- Full 16-bit StatWord: data error, PMU sync, data sorting, trigger detected, config change pending, data modified, time quality indicator, unlocked time, trigger reason
- FRACSEC time quality byte: 8 quality levels (Locked through Unreliable) + leap second pending/occurred flags
- Positive sequence derived via Fortescue transformation
- Round-trip validated: serialize → parse → compare field-by-field
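As an illustration of the StatWord fields listed above, the following sketch unpacks the 16-bit STAT word using the bit positions given in IEEE C37.118.2-2011. The `Stat` struct and `decode_stat` function are hypothetical names for this example and are not taken from the toolkit's `types` module.

```rust
/// Decoded STAT word (field names are illustrative).
#[derive(Debug, PartialEq, Eq)]
pub struct Stat {
    pub data_error: u8,          // bits 15-14
    pub pmu_sync_lost: bool,     // bit 13 (1 = not synchronized)
    pub sorted_by_arrival: bool, // bit 12 (0 = by timestamp)
    pub trigger_detected: bool,  // bit 11
    pub config_change: bool,     // bit 10
    pub data_modified: bool,     // bit 9
    pub time_quality: u8,        // bits 8-6
    pub unlocked_time: u8,       // bits 5-4
    pub trigger_reason: u8,      // bits 3-0
}

/// Splits a STAT word into its fields by shifting and masking.
pub fn decode_stat(word: u16) -> Stat {
    let bit = |n: u16| (word >> n) & 1 == 1;
    Stat {
        data_error: ((word >> 14) & 0b11) as u8,
        pmu_sync_lost: bit(13),
        sorted_by_arrival: bit(12),
        trigger_detected: bit(11),
        config_change: bit(10),
        data_modified: bit(9),
        time_quality: ((word >> 6) & 0b111) as u8,
        unlocked_time: ((word >> 4) & 0b11) as u8,
        trigger_reason: (word & 0b1111) as u8,
    }
}
```

A word of `0x0000` decodes to a fully healthy measurement, while `0x2000` sets only the sync-lost flag.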
Input CSVs follow the Lawrence Berkeley National Lab microPMU event library format:
- 120 Hz reporting rate (8.333 ms between samples)
- Columns: timestamp (ns), 3-phase voltage/current (angle + magnitude), power, sag/swell flags
- 8 phasors derived: VA, VB, VC, V+ (Fortescue), IA, IB, IC, I+
- Frequency derived from voltage phase-A angle rate of change
- ROCOF derived from frequency deviation rate of change
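The two derivations in this list can be sketched as follows. The positive-sequence phasor is the standard Fortescue component (Va + a·Vb + a²·Vc)/3 with a = 1∠120°, and the frequency deviation is the wrapped phase-angle step divided by 2π·Δt. The function names and the polar `(magnitude, angle)` tuple convention are assumptions for this example. The code in `phasor_math.rs` may be organized differently, in particular in how it handles angle wrapping and smoothing.

```rust
use std::f64::consts::PI;

/// Positive-sequence component of three polar phasors (magnitude, angle in rad).
/// Multiplying by a = 1 at 120 degrees is done by adding 2*pi/3 to the angle.
pub fn positive_sequence(a: (f64, f64), b: (f64, f64), c: (f64, f64)) -> (f64, f64) {
    let shift = 2.0 * PI / 3.0;
    let terms = [(a.0, a.1), (b.0, b.1 + shift), (c.0, c.1 + 2.0 * shift)];
    let re = terms.iter().map(|&(m, th)| m * th.cos()).sum::<f64>() / 3.0;
    let im = terms.iter().map(|&(m, th)| m * th.sin()).sum::<f64>() / 3.0;
    (re.hypot(im), im.atan2(re))
}

/// Frequency deviation from nominal (Hz) from two consecutive phase angles
/// taken `dt` seconds apart. The step is wrapped into [-pi, pi) first.
pub fn freq_deviation_hz(prev_angle: f64, angle: f64, dt: f64) -> f64 {
    let delta = (angle - prev_angle + PI).rem_euclid(2.0 * PI) - PI;
    delta / (2.0 * PI * dt)
}
```

For a balanced a-b-c set of 7200 V phasors at 0, -120, and +120 degrees, `positive_sequence` returns approximately 7200 V at 0 rad. The same magnitudes in reversed phase order give a positive-sequence magnitude near zero.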
See docs/reference/lbnl_pmu_event_library_README.md for the full LBNL format specification.
The LBNL continuous archive (powerdata-download.lbl.gov) provides ~11.6 days of real-world 120 Hz data from 3 distribution locations. Unlike the event library, the archive stores each measurement channel as a separate gzip-compressed file of `timestamp_ns,value` rows. Each location has 12 channels (3-phase voltage and current, each with magnitude and angle).
Use download-archive to fetch the files, then import-archive to convert them. The import pipeline streams data through gzip decompression without loading whole files into memory, so it can handle the full ~3 GB per channel.
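The streaming approach can be illustrated with a line-by-line reader over any `BufRead` source. This sketch covers only the `timestamp_ns,value` parsing step. It assumes the rows have no header line, and it leaves gzip decoding to whatever decompressing reader wraps the file (the decoder used by the toolkit is not named in this document). `for_each_sample` is a hypothetical helper name.

```rust
use std::io::BufRead;

/// Reads `timestamp_ns,value` rows one at a time and passes each to `f`.
/// Only one line is held in memory at once. Returns the number of samples.
/// Assumes no header row; blank lines are skipped.
pub fn for_each_sample<R: BufRead>(
    reader: R,
    mut f: impl FnMut(u64, f64),
) -> Result<usize, String> {
    let mut count = 0;
    for (index, line) in reader.lines().enumerate() {
        let line = line.map_err(|e| format!("line {}: {e}", index + 1))?;
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        let (ts, value) = line
            .split_once(',')
            .ok_or_else(|| format!("line {}: expected timestamp_ns,value", index + 1))?;
        let ts: u64 = ts
            .trim()
            .parse()
            .map_err(|e| format!("line {}: bad timestamp: {e}", index + 1))?;
        let value: f64 = value
            .trim()
            .parse()
            .map_err(|e| format!("line {}: bad value: {e}", index + 1))?;
        f(ts, value);
        count += 1;
    }
    Ok(count)
}
```

Because the function is generic over `BufRead`, the same code works on an in-memory byte slice in tests and on a buffered, decompressing file reader in a real pipeline.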
- Architecture -- module layout, data flows, design decisions
- IEEE C37.118.2 Reference -- protocol quick reference
- Usage Guide -- detailed workflows and examples
```bash
cargo test
cargo clippy --all-targets -- -D warnings
```

Tests cover CRC computation, frame serialization/deserialization round-trips, CSV parsing, Fortescue transformation, synthetic scenario behavior, and end-to-end CLI workflows.
```text
src/
├── main.rs # CLI entry point (clap)
├── lib.rs # Module declarations
├── c37118/ # IEEE C37.118.2 core
│ ├── types/ # Domain types (format, status, phasor, time)
│ ├── common.rs # CRC-16 + common header serializer
│ ├── config_frame.rs # CFG-2 frame serializer
│ ├── data_frame.rs # Data frame serializer
│ ├── scales.rs # Int16 scaling constants
│ └── parser/ # Binary deserializer
├── cli/ # Subcommand handlers
├── csv_input/ # LBNL CSV parser + enrichment
├── archive/ # LBNL continuous archive streaming reader
├── synthetic/ # Synthetic waveform generator
├── phasor_math.rs # Fortescue transform + frequency derivation
└── verify.rs # Compression verification framework
docs/
└── reference/ # IEEE standards, LBNL docs, reference implementations
```
BSD 3-Clause