Colin Carroll


I am a software engineer and mathematician in Somerville, MA, working on machine learning, often from the perspective of Bayesian statistics. It is easiest to get in touch via GitHub, Bluesky, or even LinkedIn.

There is a lot of material here! I have marked with a ★ the items that I think are most interesting to read as of Spring 2025.

Blog

29 November, 2019 - PyMC3 v3.8

★ 29 November, 2019 - Unbiased MCMC with couplings

18 August, 2019 - Very Parallel MCMC Sampling

23 July, 2019 - A Tour of Probabilistic Programming Languages

★ 28 April, 2019 - Choice of Symplectic Integrator in Hamiltonian Monte Carlo

★ 21 April, 2019 - Step size adaptation in Hamiltonian Monte Carlo

★ 11 April, 2019 - Hamiltonian Monte Carlo from scratch

6 April, 2019 - Exercises in automatic differentiation using autograd and JAX

29 January, 2019 - Notes on Differential Geometry

24 November, 2018 - Animated MCMC

20 October, 2018 - Simulation Based Calibration in PyMC3

20 July, 2018 - Why I'm Excited about PyMC3 v3.5

28 January, 2018 - That was cool when the Mercury News did it 8 years ago

20 January, 2018 - A summary of "Generalizing Hamiltonian Monte Carlo with Neural Networks"

1 January, 2018 - Why you should not use Metropolis-Hastings

7 December, 2017 - Does this convince you that self-driving cars are safe?

14 October, 2017 - Handling multiple python versions

10 June, 2017 - Releasing a project on PyPI

20 May, 2017 - Setting the matplotlib backend

28 September, 2015 - Cross Country Predictions

Selected Talks

29 July, 2024 - The State of Bayesian Workflows in JAX — Similar to the talk below, given at PyData Vermont and targeted at working data scientists.

20 June, 2022 - Scalable Bayesian Workflows in JAX — A talk at "Bayesian Deep Learning for Cosmology and Time Domain Astrophysics" at Université Paris Cité for graduate students and researchers.

9 November, 2022 - JAX for Bayes — A talk at PyData NYC 2022 on doing Bayesian statistics with GPUs to an audience of data scientists and engineers.

12 July, 2021 - Adopting static typing in scientific projects — With Predrag Gruevski. A talk given virtually at the SciPy 2021 conference. Also available as a video.

★ November, 2019 - yourplotlib: Best practices for domain-specific matplotlib libraries — With Hannah Aizenman and Thomas Caswell. A talk at PyData NYC 2019.

12 July, 2019 - Intro to Bayesian Model Evaluation, Visualization, & Comparison Using ArviZ — With Ravin Kumar. A tutorial given at the SciPy 2019 conference in Austin, TX.

★ 18 June, 2019 - Pragmatic Probabilistic Programming — A talk at the Probabilistic and Differentiable Programming Summit in Menlo Park, CA for an audience of researchers and engineers.

17 October, 2018 - Tidy and beautiful: Visualizing Bayesian models with xarray and ArviZ — Talk at PyData NYC introducing ArviZ to an audience of working data scientists.

5 October, 2018 - ArviZ: a unified library for Bayesian model criticism and visualization in Python — A poster introducing ArviZ presented at PROBPROG: The International Conference on Probabilistic Programming at MIT.

11 May, 2018 - Fighting Gerrymandering with PyMC3 — With Dr. Karin Knudson. A talk at PyCon 2018 in Cleveland, OH. Also available as a video.

12 January, 2018 - Two Years of Open Source — A talk at Phillips Academy in Andover, MA for the course "The Open Source Movement".

27 December, 2017 - A Working Knowledge of Machine Learning in 45 Minutes — Lecture for MAS 500 at the MIT Media Lab, "Hands on Foundations in Media Technology".

★ 29 November, 2017 - Two views on regression with PyMC3 and scikit-learn — A talk at PyData NYC comparing the Bayesian PyMC3 approach to the scikit-learn approach.

★ 18 June, 2017 - Slides from "Hamiltonian Monte Carlo in PyMC3" — From a "Boston Bayesians" meetup, for a general audience. There is also a GitHub repo with all the talk materials in it.

17 September, 2016 - Build You A Machine Learning — Talk given at the Kensho Machine Learning Seminar in Cambridge, MA.

13 May, 2016 - Finding and Using Data — A talk for high school students at Phillips Academy in Andover, MA for their Data Science course.

10 September, 2014 - A Bayesian Approach to Regularization — A talk at the Rice Geometry Analysis Seminar in Houston, TX.

7 April, 2014 - Classification Algorithms — A survey of classification algorithms presented in Austin, TX after placing 12th in a Kaggle March Madness competition.

31 March, 2011 - The History and Mathematics of Area Minimizing Surfaces — A talk at Rice University aimed at undergraduates. Introduces problems in geometric measure theory.

13 January, 2011 - Currents and Differential Forms in Metric Spaces — A talk at the School on Analysis in Metric Spaces and Geometric Measure Theory at Centro di Ricerca Matematica Ennio De Giorgi.

10 February, 2009 - A Brief History of Infinity — A talk at Rice University aimed at undergraduates.

Research and Open Source Work

These projects are all implicitly ★, and are very collaborative efforts.

PyMC

(2016 - ) I have been a contributor to PyMC, a probabilistic programming library for Bayesian inference, for many years. I have also served on the governing council and helped mentor Google Summer of Code students. We wrote a paper about the software.

ArviZ

(2018 - ) I helped start the ArviZ project, a visualization library for Bayesian inference, and have continued to serve on the governing council. We wrote a paper about the software.

Bayeux

(2023 - ) A personal project I have open sourced. bayeux lets you write a probabilistic model in JAX and immediately have access to state-of-the-art inference methods for MCMC, optimization, and variational inference. It is what I use now (Spring 2025) for Bayesian tasks.

Running Markov Chain Monte Carlo on Modern Hardware and Software

(November 2024, with Pavel Sountsov and Matthew Hoffman) A chapter for the 2nd edition of the "Handbook of Markov Chain Monte Carlo" on efficiently running MCMC on GPUs and TPUs.

Scalable spatiotemporal prediction with Bayesian neural fields

(September 2024, with Feras Saad, Jacob Burnim, Brian Patton, Urs Köster, Rif A. Saurous, and Matthew Hoffman) A paper in Nature Communications introducing Bayesian neural fields for state-of-the-art spatiotemporal prediction with uncertainty. Includes an open source implementation.

AutoBNN: Probabilistic time series forecasting with compositional Bayesian neural networks

(March 2024, with Urs Köster and Thomas Colthurst) Automated time series forecasting using interpretable Bayesian neural nets, with an open source implementation. Announced on the Google Research blog.

TensorFlow Probability

(2020 - ) I help maintain TFP, a library for Bayesian research using TensorFlow or JAX, and have spent time in the past improving the MCMC stack.

Side projects

All except for Oryx and Bambi are built and maintained by me.