Python, SQL, R, JavaScript, Stan, Rust, Lua
AWS (EC2, S3, Batch, Lambda), GCP
BigQuery, Redshift, PostgreSQL, Parquet
Experimental Design, Time Series, Causal Inference, Survival Analysis
Top 10% of test-takers
Issued Aug 2020; No Expiration Date
Pandas, D3, dplyr, ggplot2, Keras, TensorFlow
- Helped lift revenue by over $5 million by optimizing regional marketing using experiments, causal inference, and MMM
- Wrote over 100 articles and presented to the CEO and Management Team weekly, clearly explaining insights about products, promotions, and marketing
- Helped speed up financial reporting, from monthly to daily, by writing a program to combine several cost and revenue datasets
- Made accurate company data easily available to an organization of over 100 people by building data pipelines and reporting systems
- Prioritized projects for the 4-person Data Science team and helped team members grow their skills
- Created and maintained projects by writing over 1,000 commits to the company's Git repositories
Techniques used: Causal Inference, Media Mix Modeling with Shape and Carryover Effects, Design of Experiments
- Helped improve the conversion rate, leading to an increase of more than $20 million in annual revenue, by writing web experiment analysis software using Stan, R, and Bandit Algorithms
- Managed our machine learning process by configuring a containerized batch system on AWS that orchestrates over 36 hours of jobs a day
- Produced daily sales forecasts and helped explain revenue changes with Bayesian time series analysis
Techniques used: Time Series Analysis, Hierarchical / Multilevel Models, Bayesian State Space Models, Survival Analysis, Probabilistic Programming
- Wrote software to catch errors in large amounts of text
- Built a system to make generated text more variable using word embeddings
- Built a system to automatically optimize written content for conversion rates
Techniques used: Natural Language Processing (NLP), Contextual Bayesian Bandit Algorithms, Word Vectors, Python
- Wrangled, analyzed, and communicated insights from data on patient care, multinational financial flow, and TV viewership
- Identified meaningful anomalies using time series analysis
- Co-wrote a program that produced over 6,000 natural language medical clinic reports
Techniques used include: ARIMA Models for Prediction and Anomaly Detection in Time Series, Random Forest Supervised Learning Models, Proportional-Odds Cumulative Logistic Regression, Visualizations with D3 and R's ggplot2