Personal Project Writeup: Grid Intelligence Platform

Wuhao Xia

Grid Intelligence Platform began as a policy and data question: how can someone understand the behavior of the U.S. electricity grid without paying for expensive proprietary market intelligence tools? The proposal was to build an open-source dashboard that combines public operational data from the Energy Information Administration with weather, interconnection queue, solar resource, and selected market price data. Instead of treating these sources as separate charts, the project asks a more practical question: what would a small retail electricity provider, virtual power plant operator, clean energy developer, or policy analyst need to see in order to understand operational risk and transition opportunity for a balancing authority?

The project evolved from a data visualization exercise into a decision-oriented investigation tool. Early versions focused on loading EIA demand, forecast, generation, and interchange data and turning them into useful charts. As we tested the app, we realized that users needed interpretation, not just access to more plots. That led us to reorganize the app around six connected modules: Executive Briefing, Anomaly Detection, Arbitrage Signals, Transition Scoring, Compliance Reports, and About. We added a shared investigation context using Streamlit session state so a user could move across modules while keeping the same balancing authority, route, or ISO location in focus. We also added peer-median comparisons, “Why this matters” notes, a policy recommendation card, and a transparent contradiction detector that flags tensions such as strong transition potential but weak queue activity.
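The shared investigation context described above can be sketched as a small helper around a dict-like session store. This is a minimal illustration, not the app's actual code: the key names (`balancing_authority`, `route`, `iso_location`) and the helper functions are hypothetical, and in the Streamlit app the `state` argument would be `st.session_state`, which behaves like a mutable mapping.

```python
from collections.abc import MutableMapping

# Hypothetical key names for the shared investigation context;
# the real app's keys may differ.
CONTEXT_KEYS = ("balancing_authority", "route", "iso_location")

def set_context(state: MutableMapping, **updates) -> None:
    """Store the user's current focus (BA, route, ISO location) so
    every module reads the same selection when the user changes pages."""
    ctx = state.setdefault("investigation_context", {})
    for key, value in updates.items():
        if key in CONTEXT_KEYS:  # ignore keys outside the shared context
            ctx[key] = value

def get_context(state: MutableMapping, key: str, default=None):
    """Read one field of the shared context, e.g. from another module."""
    return state.get("investigation_context", {}).get(key, default)

# In the app, `state` would be st.session_state; a plain dict behaves
# the same way for illustration.
state = {}
set_context(state, balancing_authority="CISO")
print(get_context(state, "balancing_authority"))  # → CISO
```

Keeping all context reads and writes behind two small functions is what lets a module like Arbitrage Signals open with the same balancing authority the user was inspecting in Anomaly Detection.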

On the code side, the platform uses a two-stage architecture. The ETL layer in load_to_bigquery.py pulls data from external APIs and writes raw and aggregated tables to BigQuery. The dashboard in app.py then loads those tables once with Streamlit caching and serves all pages from in-memory DataFrames, which makes page navigation fast after startup. The analytical logic lives mostly in data_processing.py: forecast error calculations, BA anomaly status, interchange route scoring, LMP anomaly detection, transition scoring, queue summaries, and compliance-style reporting. We also included validation.py for Pandera schemas and tests for important processing behavior. This separation made the project easier to reason about because data fetching, validation, transformation, and presentation each had a defined role.
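As one example of the analytical logic that lives in data_processing.py, a per-BA forecast error summary might look like the sketch below. The column names (`ba`, `demand_mwh`, `forecast_mwh`) and metric choices are assumptions for illustration; the actual module may compute different or additional statistics.

```python
import pandas as pd

def forecast_error_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Per-balancing-authority day-ahead forecast error metrics.

    Assumes hourly rows with columns `ba`, `demand_mwh`, and
    `forecast_mwh` (hypothetical names for this sketch).
    """
    out = df.copy()
    # Signed error reveals systematic over/under-forecasting;
    # absolute percentage error measures overall accuracy.
    out["error_mwh"] = out["forecast_mwh"] - out["demand_mwh"]
    out["abs_pct_error"] = out["error_mwh"].abs() / out["demand_mwh"] * 100
    return (
        out.groupby("ba")
        .agg(mean_error_mwh=("error_mwh", "mean"),
             mape_pct=("abs_pct_error", "mean"))
        .reset_index()
    )

hourly = pd.DataFrame({
    "ba": ["CISO", "CISO", "ERCO", "ERCO"],
    "demand_mwh": [20000.0, 25000.0, 40000.0, 50000.0],
    "forecast_mwh": [21000.0, 24000.0, 42000.0, 49000.0],
})
print(forecast_error_summary(hourly))
```

Because transformations like this operate on DataFrames already cached in memory, every page of the dashboard can recompute them cheaply without touching BigQuery again.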

One of my main takeaways was that public data can be powerful, but only if the surrounding engineering makes it usable. Working with EIA and related grid datasets required careful handling of timestamps, balancing authority codes, missing values, inconsistent source coverage, and the difference between operational data and true market data. I also learned that a useful policy dashboard should be honest about uncertainty. For example, the arbitrage module does not claim to prove a tradable spread when LMP coverage is missing; it presents persistent physical flow patterns as a signal worth investigating. Similarly, the transition score is a transparent composite, not a black-box prediction.
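A transparent composite like the transition score can be sketched as an explicit weighted average. The component names and weights below are hypothetical, not the project's actual formula; the point is that every input and weight is visible, so any score can be traced back to its parts.

```python
# Hypothetical components and weights for illustration; the real
# transition score uses its own inputs and weighting.
WEIGHTS = {
    "renewable_share": 0.4,
    "queue_activity": 0.3,
    "solar_resource": 0.3,
}

def transition_score(components: dict) -> float:
    """Weighted average of 0-100 component scores. Explicit weights
    make the composite auditable rather than a black-box prediction."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)

score = transition_score(
    {"renewable_share": 80.0, "queue_activity": 20.0, "solar_resource": 60.0}
)
print(round(score, 1))  # → 56.0
```

A structure like this also makes the contradiction detector straightforward: a high overall score paired with a low `queue_activity` component is exactly the "strong potential, weak queue" tension the dashboard flags.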

This project also changed how I think about the role of software in policy analysis. A notebook can answer one question, but a dashboard has to support repeated exploration by someone who may not know the data structure in advance. That pushed us to build explanatory interface elements, preserve context across pages, and translate raw metrics into decision language. The final product is still a student project with real limitations, but it demonstrates a full pipeline from public data ingestion to cloud storage, analytical scoring, interactive visualization, and policy-relevant interpretation.

App Screenshots


Project README