Tuesday, November 19 • 10:55am - 11:30am
Running Apache Samza on Kubernetes - Weiqing Yang, LinkedIn Corporation

Apache Samza is a distributed stream processing framework that lets you process and analyze data in real time. It has been widely used at LinkedIn and other companies at large scale. Recently, we added Kubernetes as a new scheduler backend for running Samza in distributed mode. In this talk, we will dive deep into the technical details of how Samza runs natively on Kubernetes by leveraging the primitives Kubernetes provides for scheduling, storage, etc. We will also compare running Samza on Kubernetes with existing alternatives such as YARN and standalone mode. Finally, we will share some practices for using Kubernetes as a container orchestration framework for other big data processing engines.

Weiqing Yang

Software Engineer, LinkedIn
Weiqing has been working on big data computation frameworks since 2015 and is an Apache Spark/HBase/Hadoop/Samza contributor. She is currently a software engineer on the streaming infrastructure team at LinkedIn, working on Samza, Brooklin, etc. Before that, she worked on the Spark team at...

Tuesday November 19, 2019 10:55am - 11:30am PST
Room 1AB - San Diego Convention Center Upper Level
  Machine Learning + Data