Real-ish Time Predictive Analytics with Spark Structured Streaming

Understanding your data is the key to learning from the potentially massive amounts of it you store every single day. Learn how to learn from your data and build a streaming Spark application.

About This Session

In this workshop we will dive deep into what it takes to build and deliver an always-on "real-ish time" predictive analytics pipeline with Spark Structured Streaming.

The core focus of the workshop material is a common, complex problem: we have an unbounded time-series dataset with no labeled data, and we need to understand the substructure of said chaos in order to apply common supervised and statistical modeling techniques to that data in a streaming fashion.

The example problem for the workshop comes from the telecommunications space, but the skills you leave with can be applied to almost any domain, as long as you sprinkle in a little creativity and inject a bit of domain knowledge.

Skills Acquired:
1. Structured Streaming experience with Apache Spark.
2. An understanding of how to use supervised modeling techniques on unlabeled data (caveat: requires some domain knowledge and the good ol' human touch).
3. Scala Ninjary.

Time: 9:30 AM Saturday
Room: Fireside D

The Speaker(s)

Scott Haines

Principal Software Engineer, Mr.

I work at Twilio on a huge distributed voice analytics system.
