Data Loading
Module: equser.data
Dependencies: base (numpy, pyarrow)
Load CPOW and PMon Parquet files with automatic scaling and timestamp parsing.
CPOW data
load_cpow_scaled(file_path) -> dict
Load a CPOW Parquet file and return scaled voltage/current arrays.
Handles both formats automatically:
- int32 (current): raw ADC counts scaled by vscale/iscale from Parquet user metadata.
- float (legacy): values already in V/A; scaling factors are 1.0.
Returns a dict with:
| Key | Type | Description |
|---|---|---|
| table | pa.Table | Raw PyArrow Table |
| VA, VB, VC | np.ndarray | Scaled voltage arrays (float64) |
| IA, IB, IC, IN | np.ndarray | Scaled current arrays (float64) |
| vscale | float | Voltage scaling factor applied |
| iscale | float | Current scaling factor applied |
| start_time | datetime or None | Parsed from metadata |
| sample_rate | int | Always 32000 |
from equser.data import load_cpow_scaled
result = load_cpow_scaled('20250623_075056.parquet')
print(f"Peak voltage A: {result['VA'].max():.1f} V")
print(f"Samples: {len(result['VA'])}")
load_cpow(file_path) -> pa.Table
Load a CPOW Parquet file as a raw PyArrow Table with no scaling applied. Use this when you need the raw integer ADC values or want to handle scaling yourself.
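If you handle scaling yourself, the following is a minimal sketch of what that could look like. It is not the library's implementation. It assumes the factors are stored under the metadata keys b"vscale" and b"iscale" (PyArrow exposes schema metadata as a bytes-to-bytes dict), that scaling is a plain multiplication, and that every current channel, including IN, uses iscale. The helper name scale_channels and the sample factors are hypothetical.

```python
import numpy as np

def scale_channels(raw, metadata):
    """Scale raw channel arrays to volts/amps (illustrative only).

    raw: dict mapping channel name -> NumPy array of raw values.
    metadata: dict of bytes -> bytes, the shape PyArrow uses for
        schema metadata. The key names are assumptions.
    """
    # Fall back to 1.0 when a factor is absent (the legacy float case).
    vscale = float(metadata.get(b"vscale", b"1.0").decode())
    iscale = float(metadata.get(b"iscale", b"1.0").decode())

    scaled = {}
    for name, values in raw.items():
        # Voltage channels start with "V"; everything else is a current.
        factor = vscale if name.startswith("V") else iscale
        scaled[name] = np.asarray(values, dtype=np.float64) * factor
    return scaled, vscale, iscale

# Made-up raw counts and scaling factors, for demonstration only.
raw = {
    "VA": np.array([1000, -2000, 3000], dtype=np.int32),
    "IA": np.array([10, -20, 30], dtype=np.int32),
}
metadata = {b"vscale": b"0.01", b"iscale": b"0.001"}

scaled, vscale, iscale = scale_channels(raw, metadata)
print(scaled["VA"], scaled["IA"])
```

With a real file, raw would come from the columns of the table returned by load_cpow and metadata from table.schema.metadata; check the actual key names in your files before relying on them.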
Constants
| Constant | Value | Description |
|---|---|---|
| SAMPLE_RATE_HZ | 32000 | CPOW sample rate |
| CHANNELS | ['VA', 'VB', 'VC', 'IA', 'IB', 'IC', 'IN'] | Channel names |
| NEUTRAL_CT_RATIO | 30 | Neutral CT sensitivity ratio vs. phase CTs |
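The sample rate is enough to turn a sample index into a time. The sketch below assumes that start_time marks the first sample and that sampling is uniform; the helper names sample_time and samples_per_cycle are hypothetical and not part of equser.data.

```python
from datetime import datetime, timedelta, timezone

SAMPLE_RATE_HZ = 32000  # value from the constants table

def sample_time(start_time, index, sample_rate=SAMPLE_RATE_HZ):
    """Return the wall-clock time of sample `index` (0-based)."""
    return start_time + timedelta(seconds=index / sample_rate)

def samples_per_cycle(line_freq_hz, sample_rate=SAMPLE_RATE_HZ):
    """Number of samples in one cycle at the given line frequency."""
    return sample_rate / line_freq_hz

start = datetime(2025, 6, 23, 7, 50, 56, tzinfo=timezone.utc)
print(sample_time(start, 32000))  # one second after start
print(samples_per_cycle(50))      # samples in one 50 Hz cycle
```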
PMon data
load_pmon(file_path) -> pa.Table
Load a PMon Parquet file as a PyArrow Table.
PMon files contain 10/12-cycle RMS measurements (10 cycles for 50 Hz grids, 12 cycles for 60 Hz). Common columns include:
| Column | Description |
|---|---|
| time_us | Timestamp in microseconds |
| FREQ | Line frequency (Hz) |
| AVRMS, BVRMS, CVRMS | Phase RMS voltage |
| AIRMS, BIRMS, CIRMS | Phase RMS current |
| NIRMS | Neutral RMS current |
| AWATT, BWATT, CWATT | Phase active power |
from equser.data import load_pmon
table = load_pmon('20250623_0750.parquet')
freq = table.column('FREQ').to_numpy()
print(f"Mean frequency: {freq.mean():.3f} Hz")
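The 10/12-cycle convention gives windows of the same nominal length on both grid types. The arithmetic below illustrates this at nominal frequency only; it says nothing about the actual row spacing in a given file, and the helper name rms_window_seconds is hypothetical.

```python
def rms_window_seconds(nominal_freq_hz):
    """Nominal length of one 10/12-cycle RMS window, in seconds."""
    # 10 cycles on 50 Hz grids, 12 cycles on 60 Hz grids.
    cycles = {50: 10, 60: 12}[nominal_freq_hz]
    return cycles / nominal_freq_hz

for freq in (50, 60):
    window_ms = rms_window_seconds(freq) * 1000
    print(f"{freq} Hz grid: {window_ms:.0f} ms per window")
```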
Timestamp parsing
parse_start_time(s) -> datetime
Parse an ISO 8601 timestamp string from CPOW metadata. Handles nanosecond precision by truncating to microseconds (Python datetime limit).
from equser.data import parse_start_time
dt = parse_start_time("2025-06-23T07:50:56.123456789Z")
print(dt) # 2025-06-23 07:50:56.123456+00:00
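The truncation step can be sketched with the standard library alone. This is an illustration of the technique, not the code inside parse_start_time; the function name parse_iso_ns is hypothetical.

```python
import re
from datetime import datetime

def parse_iso_ns(s):
    """Parse an ISO 8601 string, truncating fractional seconds to 6 digits."""
    # Keep at most six fractional digits and drop the rest
    # (truncation, not rounding).
    s = re.sub(r"(\.\d{6})\d+", r"\1", s)
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11,
    # so normalize it to an explicit UTC offset first.
    if s.endswith("Z"):
        s = s[:-1] + "+00:00"
    return datetime.fromisoformat(s)

dt = parse_iso_ns("2025-06-23T07:50:56.123456789Z")
print(dt)
```

Truncating rather than rounding means a parsed timestamp never lands later than the recorded one.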
parse_filename_timestamp(filename) -> datetime | None
Extract the timestamp encoded in an EQ data filename.
Supports:
- YYYYMMDD_HHMM (PMon files)
- YYYYMMDD_HHMMSS (CPOW files)
from equser.data import parse_filename_timestamp
dt = parse_filename_timestamp("20250623_075056.parquet")
print(dt) # 2025-06-23 07:50:56
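A rough sketch of how the two filename patterns could be matched with the standard library follows. It is not the library's implementation: the function name filename_timestamp and the choice to return None for invalid dates are assumptions.

```python
import re
from datetime import datetime

# Eight date digits, an underscore, then six (HHMMSS) or four (HHMM)
# time digits, with no further digits on either side.
_STAMP = re.compile(r"(?<!\d)(\d{8})_(\d{6}|\d{4})(?!\d)")

def filename_timestamp(filename):
    """Return the datetime encoded in `filename`, or None if absent."""
    match = _STAMP.search(filename)
    if match is None:
        return None
    date_part, time_part = match.groups()
    fmt = "%Y%m%d%H%M%S" if len(time_part) == 6 else "%Y%m%d%H%M"
    try:
        return datetime.strptime(date_part + time_part, fmt)
    except ValueError:
        # Digits matched the shape but not a real date/time.
        return None

print(filename_timestamp("20250623_075056.parquet"))  # CPOW-style name
print(filename_timestamp("20250623_0750.parquet"))    # PMon-style name
print(filename_timestamp("notes.txt"))                # no timestamp
```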