AI Engineer · Computational Modeling · Interpretability
Experience
Shipped LangGraph and OpenAI API tooling to financial-institution clients on GCP. Owned LLMOps across model versioning, multi-environment deployment, and Airflow-orchestrated pipelines. Designed scalable ETL and streaming systems on Pub/Sub and BigQuery.
Delivered predictive well-selection models for an energy-sector client across 30+ operational variables. Built SHAP-driven feature importance outputs formatted for engineering stakeholders.
Shipped a knowledge-graph-augmented LLM system for physics question answering (~200% improvement over baseline), contributing to three published papers. Coordinated three cross-functional ML sub-teams.
Current Work
A production evaluation server for open-weight LLMs paired with a mechanistic interpretability layer. Nine behavioral probe suites surface trust-critical failure modes in agentic and tool-calling deployments.
Agent-based simulation of AI policy discourse across 1,000+ agents and 30 seeds. Examines how opinion-space geometry shapes simulated policy discourse, with implications for the design of deliberative discourse platforms.
Building an NLA utility that explores the thoughts of small language models and evaluates their suitability for downstream tasks.
Skills & Projects
Publications