Data Loading

Module: equser.data
Dependencies: base (numpy, pyarrow)

Load CPOW and PMon Parquet files with automatic scaling and timestamp parsing.

CPOW data

load_cpow_scaled(file_path) -> dict

Load a CPOW Parquet file and return scaled voltage/current arrays.

Handles both formats automatically:

  • int32 (current format): raw ADC counts, scaled by the vscale/iscale factors stored in the Parquet user metadata.
  • float (legacy format): values are already in V/A, so both scaling factors are 1.0.
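
The two cases can be illustrated with a minimal sketch. It is not the library's implementation: the helper name `scale_channel`, the metadata key names (`vscale`, `iscale`), and the example values are assumptions chosen for illustration. The metadata is modelled as a bytes-to-bytes mapping, which is how PyArrow exposes schema metadata.

```python
import numpy as np

def scale_channel(raw, metadata, key):
    """Return (float64 array in physical units, factor applied). Hypothetical helper."""
    raw = np.asarray(raw)
    if np.issubdtype(raw.dtype, np.integer):
        # int32 format: multiply ADC counts by the factor stored in the metadata.
        scale = float(metadata[key].decode())
    else:
        # Legacy float format: values are already in V or A.
        scale = 1.0
    return raw.astype(np.float64) * scale, scale

# Assumed key names and values, for illustration only.
metadata = {b"vscale": b"0.01", b"iscale": b"0.001"}

counts = np.array([-32500, 0, 32500], dtype=np.int32)
va, vscale = scale_channel(counts, metadata, b"vscale")

legacy = np.array([-325.0, 0.0, 325.0])
va_legacy, legacy_scale = scale_channel(legacy, metadata, b"vscale")
```

Returning the applied factor alongside the array mirrors the vscale/iscale entries in the result dict, so a caller can recover the raw counts if needed.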

Returns a dict with:

| Key | Type | Description |
|---|---|---|
| table | pa.Table | Raw PyArrow Table |
| VA, VB, VC | np.ndarray | Scaled voltage arrays (float64) |
| IA, IB, IC, IN | np.ndarray | Scaled current arrays (float64) |
| vscale | float | Voltage scaling factor applied |
| iscale | float | Current scaling factor applied |
| start_time | datetime or None | Parsed from metadata |
| sample_rate | int | Always 32000 |

from equser.data import load_cpow_scaled

result = load_cpow_scaled('20250623_075056.parquet')
print(f"Peak voltage A: {result['VA'].max():.1f} V")
print(f"Samples: {len(result['VA'])}")

load_cpow(file_path) -> pa.Table

Load a CPOW Parquet file as a raw PyArrow Table with no scaling applied. Use this when you need the raw integer ADC values or want to handle scaling yourself.

Constants

| Constant | Value | Description |
|---|---|---|
| SAMPLE_RATE_HZ | 32000 | CPOW sample rate |
| CHANNELS | ['VA', 'VB', 'VC', 'IA', 'IB', 'IC', 'IN'] | Channel names |
| NEUTRAL_CT_RATIO | 30 | Neutral CT sensitivity ratio vs. phase CTs |
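
Because the sample rate is fixed, the time of any sample follows from its index. The sketch below assumes the SAMPLE_RATE_HZ value listed above, redefined locally so the snippet runs without equser installed. The helper `sample_time` and the start time are hypothetical.

```python
from datetime import datetime, timedelta, timezone

SAMPLE_RATE_HZ = 32000  # value listed above, redefined locally for this sketch

def sample_time(start_time, index):
    """Timestamp of sample `index`, counted from `start_time` (hypothetical helper)."""
    return start_time + timedelta(seconds=index / SAMPLE_RATE_HZ)

start = datetime(2025, 6, 23, 7, 50, 56, tzinfo=timezone.utc)  # example start time

print(sample_time(start, 16000))   # half a second after the start
print(640_000 / SAMPLE_RATE_HZ)    # duration in seconds of a 640000-sample record
```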

PMon data

load_pmon(file_path) -> pa.Table

Load a PMon Parquet file as a PyArrow Table.

PMon files contain 10/12-cycle RMS measurements (10 cycles for 50 Hz grids, 12 cycles for 60 Hz). Common columns include:

| Column | Description |
|---|---|
| time_us | Timestamp in microseconds |
| FREQ | Line frequency (Hz) |
| AVRMS, BVRMS, CVRMS | Phase RMS voltage |
| AIRMS, BIRMS, CIRMS | Phase RMS current |
| NIRMS | Neutral RMS current |
| AWATT, BWATT, CWATT | Phase active power |

from equser.data import load_pmon

table = load_pmon('20250623_0750.parquet')
freq = table.column('FREQ').to_numpy()
print(f"Mean frequency: {freq.mean():.3f} Hz")
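
The 10/12-cycle convention gives the same window length on both grid types at nominal frequency, which a short calculation confirms. The helper name `window_seconds` is an illustrative assumption and not part of equser.

```python
def window_seconds(cycles, nominal_hz):
    """Duration of an RMS window spanning `cycles` cycles at the nominal frequency."""
    return cycles / nominal_hz

print(window_seconds(10, 50))  # 50 Hz grid
print(window_seconds(12, 60))  # 60 Hz grid
```

Both calls give 0.2 s. This holds only at the nominal frequency; when the actual line frequency deviates, a fixed number of cycles takes slightly more or less time.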

Timestamp parsing

parse_start_time(s) -> datetime

Parse an ISO 8601 timestamp string from CPOW metadata. Nanosecond-precision timestamps are truncated to microseconds, the finest resolution that Python's datetime supports.

from equser.data import parse_start_time

dt = parse_start_time("2025-06-23T07:50:56.123456789Z")
print(dt)  # 2025-06-23 07:50:56.123456+00:00
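
The truncation step can be sketched with the standard library alone. This is one possible approach, not necessarily how equser implements it, and the function name `parse_iso_truncating` is hypothetical.

```python
import re
from datetime import datetime, timezone

def parse_iso_truncating(s):
    """Parse an ISO 8601 string, dropping fractional digits beyond microseconds."""
    # Keep at most six fractional digits and discard the rest (truncate, not round).
    s = re.sub(r"(\.\d{6})\d+", r"\1", s)
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11.
    if s.endswith("Z"):
        s = s[:-1] + "+00:00"
    return datetime.fromisoformat(s)

dt = parse_iso_truncating("2025-06-23T07:50:56.123456789Z")
print(dt)  # 2025-06-23 07:50:56.123456+00:00
```

Truncating rather than rounding means the result never lands in a later microsecond than the original instant.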

parse_filename_timestamp(filename) -> datetime | None

Extract a timestamp from a filename that follows one of the EQ data naming patterns.

Supports:

  • YYYYMMDD_HHMM (PMon files)
  • YYYYMMDD_HHMMSS (CPOW files)

from equser.data import parse_filename_timestamp

dt = parse_filename_timestamp("20250623_075056.parquet")
print(dt)  # 2025-06-23 07:50:56
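
Both filename patterns can be matched with a single regular expression. The sketch below is an assumption about how such a parser might work, not the library's code: the name `filename_timestamp` is hypothetical, and returning a naive datetime is inferred only from the example output above.

```python
import re
from datetime import datetime

# Eight date digits, an underscore, then four or six time digits,
# not embedded in a longer run of digits.
_PATTERN = re.compile(r"(?<!\d)(\d{8})_(\d{4}(?:\d{2})?)(?!\d)")

def filename_timestamp(filename):
    """Return a naive datetime from YYYYMMDD_HHMM[SS], or None if no valid match."""
    m = _PATTERN.search(filename)
    if m is None:
        return None
    date_part, time_part = m.groups()
    fmt = "%Y%m%d%H%M%S" if len(time_part) == 6 else "%Y%m%d%H%M"
    try:
        return datetime.strptime(date_part + time_part, fmt)
    except ValueError:
        # Digits matched the shape but not a real date or time (e.g. month 13).
        return None

print(filename_timestamp("20250623_075056.parquet"))  # CPOW-style name
print(filename_timestamp("20250623_0750.parquet"))    # PMon-style name
print(filename_timestamp("notes.txt"))                # no timestamp present
```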