John Geer

Raleigh-Durham-Chapel Hill Area, NC

Technical Skills

Programming Languages

Python, SQL, R, JavaScript, Stan, Rust, Lua

Cloud Platforms

AWS (EC2, S3, Batch, Lambda), GCP

Data Storage

BigQuery, Redshift, PostgreSQL, Parquet

Statistical Methods

Experimental Design, Time Series, Causal Inference, Survival Analysis

Triplebyte Certified Data Scientist

Top 10% of test-takers
Issued Aug 2020; No Expiration Date

Libraries

Pandas, D3, dplyr, ggplot2, Keras, TensorFlow

Experience

Data Science Lead
Tuft & Needle
2018 to Present
  • Helped lift revenue by over $5 million by optimizing regional marketing using experiments, causal inference, and media mix modeling (MMM)
  • Wrote over 100 articles and presented to the CEO and Management Team weekly, clearly explaining insights about products, promotions, and marketing
  • Helped speed up financial reporting, from monthly to daily, by writing a program to combine several cost and revenue datasets
  • Made accurate company data easily available across the 100+ person organization by building data pipelines and reporting systems
  • Prioritized projects for the 4-person Data Science team and helped team members grow their skills
  • Created and maintained projects, contributing over 1,000 commits to the company's Git repositories

Techniques used: Causal Inference, Media Mix Modeling with Shape and Carryover Effects, Design of Experiments

Data Scientist
Tuft & Needle
2016 to 2018
  • Helped improve the conversion rate, leading to an increase of more than $20 million in annual revenue, by writing web experiment analysis software using Stan, R, and bandit algorithms
  • Managed our machine learning process by configuring a containerized batch system on AWS that orchestrates over 36 hours of jobs a day
  • Produced daily sales forecasts and helped explain revenue changes with Bayesian time series analysis

Techniques used: Time Series Analysis, Hierarchical / Multilevel Models, Bayesian State Space Models, Survival Analysis, Probabilistic Programming

Data Scientist, Research Squad
Automated Insights
Durham, NC
2015 to 2016
  • Wrote software to catch errors in large amounts of text
  • Built a system to make generated text more variable using word embeddings
  • Built a system to automatically optimize written content for conversion rates

Techniques used: Natural Language Processing (NLP), Contextual Bayesian Bandit Algorithms, Word Vectors, Python

Data Scientist
Automated Insights
Durham, NC
2014 to 2015
  • Wrangled, analyzed, and communicated insights from data on patient care, multinational financial flow, and TV viewership
  • Identified meaningful anomalies using time series analysis
  • Co-wrote a program that produced over 6,000 natural language medical clinic reports

Techniques used: ARIMA Models for Prediction and Anomaly Detection in Time Series, Random Forest Supervised Learning Models, Proportional-Odds Cumulative Logistic Regression, Visualizations with D3 and R's ggplot2

Web Developer
Flying Apricot
2005 to 2012
  • Programmed web apps and websites with Python and JavaScript
  • Improved site content with split tests

Education

Master of Applied Statistics
Pennsylvania State University
4.0 GPA
2012 to 2014
  • Focused on Predictive Analytics and Data Mining
  • Thesis: Built a predictive model of the number of views a TED talk will receive

Bachelor of Arts in Philosophy
Davidson College
2001 to 2005
  • Thesis: "Skepticism Regarding the External World"
  • Meaning: "Are we sure we know what's going on?"

Awards

Google & Eyebeam's Data Visualization Challenge
  • Received the "Deep Thought Badge" and Honorable Mention
  • Visualization of the connections in the US federal budget
  • Created in collaboration with Catherine Jahnes