Ohio Districts

About Ohio Transparency Maps

Overview

Ohio Transparency Maps tracks how money flows through Ohio state politics. The site combines campaign finance filings, legislative records, election results, and machine learning analytics into an interactive dashboard covering 132 state legislative districts (33 Senate + 99 House).

Pages

Page | What it shows
Home | Interactive choropleth map, KPI cards, top recipients, compare views, donation flow arcs, and analytics highlights
Dashboard | Full analysis with filters, detail panel, and 12 chart types across contributions, networks, industry, and spending
Network | WebGL graph of 2,500+ donors and candidates, donor co-funding projections, industry influence map, GNN predictions
Legislation | Bills, vote alignment heatmaps, co-sponsorship networks, and donation-legislation correlation
Analysis | ML-powered insights: SHAP explainability, UMAP voting clusters, DIME ideology scores, PAC alignment, association rules
Explore | Drag-and-drop dashboard builder with configurable chart widgets
Playground | SQL console, chart builder, and D3 sketch pad for ad-hoc exploration

Data Sources

Methodology

Contribution Tiers

Districts are colored by total campaign contributions received:

Tier | Range | Color
None | $0 | Light gray
Low | $1 -- $100K | Light blue
Medium | $100K -- $300K | Medium blue
High | $300K -- $600K | Dark blue
Very High | $600K+ | Darkest blue
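
The tier assignment can be sketched as a threshold lookup. The table does not say which tier a total exactly on a boundary (for example $100K) falls into, so this sketch assumes it belongs to the higher tier. The function name `contribution_tier` is hypothetical and is not taken from the site's code.

```python
from bisect import bisect_right

# Lower bounds (in dollars) of the Low, Medium, High, and Very High tiers.
# Assumption: a total exactly on a boundary belongs to the higher tier.
TIER_BOUNDS = [1, 100_000, 300_000, 600_000]
TIER_NAMES = ["None", "Low", "Medium", "High", "Very High"]


def contribution_tier(total: float) -> str:
    """Return the tier name for a district's total contributions."""
    # bisect_right counts how many lower bounds the total has reached.
    return TIER_NAMES[bisect_right(TIER_BOUNDS, total)]
```

Under these assumptions, `contribution_tier(250_000)` falls in the Medium tier and `contribution_tier(0)` in None.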

Donor Name Normalization

Organizations often appear under multiple name variants in filings. We maintain a curated alias table that maps roughly 300 raw names to canonical forms (e.g., "OHIO EDUCATION ASSN" and "OEA" both map to "Ohio Education Association").
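
A minimal sketch of an alias lookup of this kind follows. The two entries come from the example above; the uppercase and whitespace cleanup before lookup is an assumption, and `ALIASES` and `normalize_donor` are hypothetical names.

```python
# Hypothetical excerpt of the alias table; the real one holds roughly 300 entries.
ALIASES = {
    "OHIO EDUCATION ASSN": "Ohio Education Association",
    "OEA": "Ohio Education Association",
}


def normalize_donor(raw_name: str) -> str:
    """Map a raw filing name to its canonical form, if one is known."""
    # Assumption: lookups ignore case and repeated or surrounding whitespace.
    cleaned = " ".join(raw_name.split())
    return ALIASES.get(cleaned.upper(), cleaned)
```

Names without an alias entry pass through unchanged apart from whitespace cleanup, so unknown donors are never dropped.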

Donor Industry Classification

Donor organizations are classified into 16 industry sectors (Energy, Healthcare, Labor/Unions, Real Estate, Legal, Finance & Insurance, Construction, Manufacturing, Technology, Agriculture, Education, Transportation, Telecom, Retail & Services, Lobbying & Gov Affairs, Other). Classification was performed using an LLM (Claude) based on organization names, then manually reviewed for accuracy.

Expenditure Category Classification

Campaign expenditure purposes are classified into 12 spending categories (Media & Advertising, Consulting, Events & Fundraising, Travel, Staff & Payroll, Office & Operations, Legal & Compliance, Printing & Mail, Polling & Research, Direct Voter Contact, Contributions to Others, Other). Classification was performed using an LLM (Claude) based on expenditure purpose descriptions, then manually reviewed.

Interactive Filtering

The dashboard sidebar supports real-time filtering across all visualizations. Filters include donor name search, contribution amount range, date range, party, and industry sector. When filters are active, charts re-aggregate from raw contribution and expenditure records (~28K contributions + ~16K expenditures) in the browser via DuckDB-WASM. Filter state is synced to the URL hash for shareable filtered views.
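
The round trip between filter state and URL hash can be illustrated as below. The site does this in the browser; this Python sketch only shows the idea. The filter keys (`party`, `sector`, `min_amount`) and both function names are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode


def filters_to_hash(filters: dict) -> str:
    """Serialize active filters to a URL fragment, skipping unset values."""
    # Sorting the keys makes the same filter state always yield the same URL.
    active = {k: v for k, v in sorted(filters.items()) if v not in (None, "")}
    return "#" + urlencode(active)


def hash_to_filters(fragment: str) -> dict:
    """Parse a URL fragment back into a filter dict (all values are strings)."""
    return dict(parse_qsl(fragment.lstrip("#")))
```

Because values come back as strings, a real implementation would need to convert numeric and date filters after parsing.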

Network Analysis

The force-directed network, Sankey diagram, and chord diagram all show the top 20 donors (by total amount) connected to legislative candidates. Links represent aggregated donation totals across all filings.
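
The aggregation behind these three charts can be sketched as follows, assuming contribution rows arrive as `(donor, candidate, amount)` tuples. That row shape, the alphabetical tie-break, and the name `top_donor_links` are assumptions made for illustration.

```python
from collections import defaultdict


def top_donor_links(contributions, n=20):
    """Aggregate (donor, candidate, amount) rows into links for the top-n donors."""
    donor_totals = defaultdict(float)
    link_totals = defaultdict(float)
    for donor, candidate, amount in contributions:
        donor_totals[donor] += amount
        link_totals[(donor, candidate)] += amount

    # Rank donors by total amount given; ties are broken alphabetically.
    ranked = sorted(donor_totals, key=lambda d: (-donor_totals[d], d))
    top = set(ranked[:n])
    # Keep only the links whose donor made the top-n cut.
    return {pair: total for pair, total in link_totals.items() if pair[0] in top}
```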

The full network explorer (Network page) renders 2,500+ nodes using Sigma.js (WebGL). Build-time graph metrics are pre-computed with NetworkX: Louvain community detection for color grouping, PageRank for node sizing, and betweenness centrality for identifying bridge nodes. Bipartite projections show donor co-funding patterns and candidate co-donor networks.
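
To illustrate what a donor co-funding projection contains, here is a small pure-Python sketch. The site computes its projections with NetworkX at build time; this stand-in only shows the underlying idea. Weighting each donor pair by the number of shared candidates is an assumption.

```python
from collections import defaultdict
from itertools import combinations


def donor_projection(edges):
    """Project donor-candidate edges onto donors.

    Two donors are linked when they funded at least one common candidate;
    the weight is the number of candidates they share.
    """
    donors_by_candidate = defaultdict(set)
    for donor, candidate in edges:
        donors_by_candidate[candidate].add(donor)

    weights = defaultdict(int)
    for donors in donors_by_candidate.values():
        # Sorting gives each unordered donor pair a single canonical key.
        for a, b in combinations(sorted(donors), 2):
            weights[(a, b)] += 1
    return dict(weights)
```

Swapping the roles of donors and candidates in the same routine gives the candidate co-donor projection.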

Graph Neural Network Analytics

A 2-layer heterogeneous GraphSAGE encoder is trained on the donor-candidate bipartite graph to learn node embeddings that capture structural patterns in campaign finance flows. The model operates on a HeteroData graph with two node types (donors and candidates) and bidirectional donation edges.

Node features:

Training: The model is trained with a link prediction objective (binary cross-entropy) using an 80/10/10 train/val/test split via RandomLinkSplit. Training runs for 100 epochs with the Adam optimizer on a single GPU, and the best checkpoint is selected by validation loss.

Link prediction: After training, all non-existing donor-candidate pairs are scored by dot-product similarity in embedding space. The top 500 predicted links represent the most likely future donation connections. GNN scores are supplemented by three classical baselines: Jaccard coefficient, Adamic-Adar index, and preferential attachment.
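
The scoring step can be sketched without any ML library, assuming embeddings are available as plain lists of floats keyed by node id. The function name and the data layout are hypothetical; the actual pipeline presumably operates on tensors.

```python
import heapq


def top_predicted_links(donor_emb, cand_emb, existing, k=500):
    """Score unconnected donor-candidate pairs by embedding dot product.

    donor_emb and cand_emb map node ids to equal-length vectors; `existing`
    is a set of (donor, candidate) pairs that already have a donation edge.
    """
    scored = (
        (sum(x * y for x, y in zip(dv, cv)), d, c)
        for d, dv in donor_emb.items()
        for c, cv in cand_emb.items()
        if (d, c) not in existing
    )
    # nlargest keeps only k entries in memory instead of sorting every pair.
    return [(d, c, s) for s, d, c in heapq.nlargest(k, scored)]
```

The result is a list of `(donor, candidate, score)` tuples ordered from highest to lowest score.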

Anomaly detection: An Isolation Forest (100 estimators, 5% contamination) is fitted on the 32-dimensional GNN embeddings to flag structurally unusual nodes, that is, donors or candidates whose network neighborhood patterns are statistical outliers.

Visualization: Node embeddings are projected to 2D using UMAP (15 neighbors) for the embedding scatter plot. Anomalies are highlighted in red; other nodes are colored by Louvain community assignment.

Legislative Data

Legislative bill, vote, and sponsorship data is sourced from the LegiScan API via bulk dataset downloads. Legislators are matched to campaign finance candidates by last name, district number, and chamber. The Legislation page shows vote alignment, co-sponsorship networks, and correlations between campaign donations and bill sponsorship.
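
The matching rule can be sketched as a join on a three-part key. The record field names (`last_name`, `district`, `chamber`) and the case-insensitive comparison are assumptions; the source only states which attributes are compared.

```python
def match_key(last_name: str, district, chamber: str) -> tuple:
    """Build a join key from last name, district number, and chamber."""
    # Assumption: names and chambers are compared case-insensitively.
    return (last_name.strip().lower(), int(district), chamber.strip().lower())


def match_legislators(legislators, candidates):
    """Pair each legislator record with the candidate record sharing its key."""
    by_key = {
        match_key(c["last_name"], c["district"], c["chamber"]): c
        for c in candidates
    }
    # Unmatched legislators are paired with None rather than dropped.
    return [
        (leg, by_key.get(match_key(leg["last_name"], leg["district"], leg["chamber"])))
        for leg in legislators
    ]
```

Matching on last name alone would conflate legislators who share a surname; including district and chamber in the key keeps them apart.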

Bill Similarity

Bill similarity is computed using Jaccard indices on two dimensions: shared subject tags and shared sponsors. The combined similarity score is a weighted average of subject similarity (0.6) and sponsor similarity (0.4).
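
The combined score can be written out as below. The weights come from the text above; treating two empty sets as having similarity 0.0 is an assumption, as are the function names.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index: size of the intersection over size of the union."""
    union = a | b
    # Assumption: two empty sets are treated as having similarity 0.0.
    return len(a & b) / len(union) if union else 0.0


def bill_similarity(subjects_a, subjects_b, sponsors_a, sponsors_b,
                    w_subject=0.6, w_sponsor=0.4) -> float:
    """Weighted average of subject and sponsor Jaccard similarity."""
    return (w_subject * jaccard(set(subjects_a), set(subjects_b))
            + w_sponsor * jaccard(set(sponsors_a), set(sponsors_b)))
```

For example, two bills that share one of two subject tags (Jaccard 0.5) and have identical sponsors (Jaccard 1.0) score 0.6 × 0.5 + 0.4 × 1.0 = 0.7.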

SHAP Explainability

SHAP (SHapley Additive exPlanations) values are computed for a vote prediction model to identify which features most influence how legislators vote. Global importance shows the top features across all legislators; per-legislator breakdowns show individual vote drivers.

DIME Ideology Scores

CFscores from the Database on Ideology, Money in Politics, and Elections (DIME) provide campaign finance-based ideology estimates for Ohio legislators. Scores are displayed as a beeswarm plot on a liberal-to-conservative spectrum.

Technical Stack

Source Code

GitHub Repository