John Geer


I help solve problems with statistics, code, and prose.



Head of Data Science at Tuft & Needle

  • Lead the data science team, prioritizing high-impact projects and helping team members grow their skills
  • Helped optimize localized marketing, leading to millions in sales, using experiments, Bayesian marketing mix modeling, and hierarchical time series models
  • Architected and wrote an ETL system to make accurate company data easily available
  • Over 1,000 commits to the company's git repositories
  • Techniques used include: Marketing mix modeling with shape and carryover effects, TensorFlow, product vectors



Data Scientist at Tuft & Needle

  • Wrote software that produces daily sales forecasts for finance and supply chain use
  • Set up a data science processing system on AWS to manage our machine learning jobs
    • Spins up and shuts down spot instances as needed
    • Runs dockerized jobs
    • Securely shares visualizations and data inside the company
  • Designed and analyzed experiments
  • Techniques used include: Probabilistic Programming with Stan, Hierarchical / Multilevel Models, Bayesian State Space Models, Survival Analysis, Design of Experiments



Lead Data Scientist, Research Squad at Automated Insights

  • Prototyped new product features and tools
  • Built a system that tests and chooses written content to increase conversion rates
  • Wrote software in Python to detect duplicates and catch errors in large amounts of text using Natural Language Processing (NLP)
  • Techniques used include: Natural Language Processing (NLP), Contextual Bayesian Bandit Algorithms, Word Vectors, Probabilistic Programming



Data Scientist at Automated Insights

  • Wrangled, analyzed, and communicated insights from data on patient care, multinational financial flow, TV viewership, and many other topics
  • Created sample analyses to help clients identify the most useful insights
  • Built predictive models and anomaly detectors using R
  • Co-wrote a program that produced over 6,000 natural language medical clinic reports
    • Over 80% of clinic managers responded that the reports made understanding the data easier
  • Techniques used include: ARIMA Models for Prediction and Anomaly Detection in Time Series, Random Forest Supervised Learning Models, Proportional-Odds Cumulative Logistic Regression, Locally Weighted Regression (loess), Visualizations with D3, R's ggplot2, and SVG Vector Graphics



Flying Apricot Web Development

  • Programmed web apps and websites with Python, HTML, and CSS
  • Helped small businesses get noticed online
  • Split-tested web pages to improve performance




Master of Applied Statistics

  • Pennsylvania State University
    • Focus on Predictive Analytics and Data Mining
    • Thesis: Built a predictive model in R for the number of views a TED talk will receive



Non-degree Graduate Student in Mathematics

  • University of Illinois at Urbana-Champaign
    • Linear Algebra, Probability Theory and Statistics, Single and Multivariate Calculus, Combinatorics


Bachelor of Arts in Philosophy

  • Davidson College
    • Thesis: "Skepticism Regarding the External World"
    • Meaning: "Are we sure we know what's going on?"


Relevant Award

Google & Eyebeam's Data Visualization Challenge


  • Received the "Deep Thought Badge" and Honorable Mention
  • Visualization of the connections in the US federal budget
  • Created in collaboration with Catherine Jahnes

Technical Skills

Programming Languages

  • R, Python, SQL, JavaScript, Stan, Rust, Lua

Relevant Applications & Libraries

  • dplyr, Pandas, Keras, D3, Minitab, Mathematica

Markup Languages


Operating Systems

  • Linux, OS X

Irrelevant Skills

  • Can juggle 5 balls, 4 rings, and 3 clubs
  • Helped build a ranch house as a worker on the Bar X Bar Ranch in Pecos, NM
  • Mediocre at unicycling