Two views on regression with PyMC3 and scikit-learn

Abstract

Python has become one of the most popular languages for machine learning, due in no small part to excellent numeric libraries (NumPy, SciPy) that act as building blocks for higher-level data and machine learning libraries (pandas, scikit-learn). A side effect of the recent rise of deep learning frameworks (Theano, TensorFlow, PyTorch) has been to enable efficient sampling from complex statistical models, which in turn serves as a building block for probabilistic modeling libraries like PyMC3 and Edward.
In this talk, we will review how to solve regression problems using scikit-learn, and then show how to implement the same models in PyMC3. We will then extend these models to include regularization in both libraries, and discuss the geometric and statistical assumptions we make in each approach. Finally, we will reflect on why the existence of these two viewpoints is both algorithmically and mathematically beautiful.
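As a rough illustration of the two viewpoints, the sketch below is not taken from the talk and uses plain NumPy rather than either library, so that it stays self-contained. It fits the same regularized linear regression twice: once as penalized least squares (the objective scikit-learn's `Ridge` minimizes when `fit_intercept=False`), and once as the posterior mode of a Bayesian model with a Gaussian prior on the weights and Gaussian noise, which is the kind of model one could write in PyMC3. The synthetic data, the noise scale `sigma`, and the prior scale `tau` are all assumptions made for the example.

```python
import numpy as np

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.5, -2.0, 0.5])
sigma = 0.5  # assumed known noise standard deviation
y = X @ w_true + rng.normal(scale=sigma, size=n)

tau = 1.0  # assumed prior standard deviation on each weight

# Geometric view: minimize ||y - Xw||^2 + alpha * ||w||^2.
# The penalty strength that matches the Bayesian model is sigma^2 / tau^2.
alpha = sigma**2 / tau**2
w_ridge = np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

# Statistical view: w ~ Normal(0, tau^2 I), y | w ~ Normal(Xw, sigma^2 I).
# The posterior is Gaussian, so its mode (MAP estimate) equals its mean.
posterior_precision = X.T @ X / sigma**2 + np.eye(d) / tau**2
w_map = np.linalg.solve(posterior_precision, X.T @ y / sigma**2)

print("ridge estimate:", w_ridge)
print("MAP estimate:  ", w_map)
print("estimates agree:", np.allclose(w_ridge, w_map))
```

The two estimates coincide because multiplying the negative log posterior by `sigma**2` (up to a constant factor and additive terms) gives the ridge objective with `alpha = sigma**2 / tau**2`. The Bayesian formulation additionally yields a full posterior distribution over the weights, which a sampler would explore rather than reducing to a single point estimate.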

Date

Wed, Nov 29, 2017

Event

PyData New York City 2017

Location

New York City, NY

Links

Slides