As machine learning grows in prominence, adoption in high-impact use cases such as anti-fraud and network security is growing. Having a high-performing statistical model in these areas is critical: a false positive error leads to unnecessary work, while a false negative error increases exposure to potential threats. Since there are no perfect machine learning models, our task as data scientists is first to convince ourselves, and then to convince others, that we have a statistical model worthy of production. Persuasion, though, can be difficult because many of the steps and assumptions that go into training a statistical model from data are difficult, if not impossible, to accurately share with the ultimate consumers of the model.
Drawing on ideas from the philosophy of science such as falsifiability and counterfactuals, we present a framework for triangulating the performance of machine learning models using a series of questions that help establish the validity of performance claims. In navigation, triangulation determines one's current location from the angles and distances to landmarks with known positions. We believe triangulation of a different sort is necessary to determine the performance of machine learning models. Each of the steps that go into making a machine learning model, including input data selection, sampling, outcome variable selection, feature creation, model selection, and evaluation criteria, shapes the final model and provides necessary context for interpreting performance results. Our framework highlights ways to uncover assumptions hidden in those choices and to identify higher-performing models.
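As a small illustration of how one hidden evaluation choice can shape reported performance, the sketch below computes precision and recall for the same set of model scores at two different decision thresholds. The scores, labels, and thresholds are purely hypothetical and are not drawn from the talk; the point is only that a single unstated assumption can move the headline numbers.

```python
# Illustrative sketch (hypothetical data): the same model scores yield
# different headline metrics depending on one hidden evaluation choice,
# the decision threshold.

def precision_recall(scores, labels, threshold):
    """Precision and recall when scores >= threshold are predicted positive."""
    preds = [s >= threshold for s in scores]
    tp = sum(p and y for p, y in zip(preds, labels))        # true positives
    fp = sum(p and not y for p, y in zip(preds, labels))    # false positives
    fn = sum((not p) and y for p, y in zip(preds, labels))  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Made-up scores and ground-truth labels for eight examples.
scores = [0.95, 0.90, 0.80, 0.60, 0.55, 0.40, 0.30, 0.20]
labels = [1,    1,    0,    1,    0,    0,    1,    0]

for threshold in (0.5, 0.7):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

A consumer shown only "precision 0.67" has no way to recover the threshold choice behind it, which is exactly the kind of hidden assumption the questioning framework is meant to surface.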
To sign up, you need a ticket from Eventbrite:
Applied Machine Learning Conference.