Thursday, November 21 • 11:50am - 12:25pm
Kubernetizing Big Data and ML Workloads at Uber - Mayank Bansal & Min Cai, Uber

Uber relies on Big Data and ML to make business-critical decisions such as pricing and trip ETA. Today, these workloads, such as Hive and Spark, run on YARN. To save millions of dollars through more efficient use of cluster resources, Uber is planning to use Kubernetes to co-locate BigData/ML and micro-service workloads.

Kubernetes is the de facto standard for running micro-services. However, compared to YARN, it still lacks many features, such as hierarchical resource pools, elastic resource sharing, and gang scheduling. To bridge this gap, we have re-architected Peloton as a set of Kubernetes scheduler and controller plugins so that we can provide feature parity with YARN.
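
For readers unfamiliar with gang scheduling, the idea can be sketched as all-or-nothing placement: a group of tasks starts only if every task in the group fits at once. The snippet below is an illustrative sketch only, not Peloton's or Kubernetes' implementation; the `Node` class, the `gang_schedule` function, and the first-fit policy are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical node model: a name and a count of free CPUs.
    name: str
    free_cpus: int


def gang_schedule(nodes, task_cpus):
    """Place every task in the gang, or none of them.

    Returns a dict mapping task index to node name, or None if the
    whole gang cannot be placed. Node capacity is only updated when
    all tasks fit.
    """
    # Work on a tentative copy so a failed attempt leaves nodes untouched.
    free = {n.name: n.free_cpus for n in nodes}
    placement = {}
    for i, cpus in enumerate(task_cpus):
        # First-fit: pick the first node with enough free CPUs.
        target = next((name for name, avail in free.items() if avail >= cpus), None)
        if target is None:
            return None  # one task does not fit, so reject the whole gang
        free[target] -= cpus
        placement[i] = target
    # Every task fits: commit the reservations.
    for n in nodes:
        n.free_cpus = free[n.name]
    return placement
```

A scheduler without this property could start some tasks of a distributed job and leave the rest pending, holding resources without making progress.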

This talk will cover:
- Lessons from running large-scale BigData/ML on Kubernetes with Peloton
- Colocation of mixed workloads
- Federation across zones
- Feature and API parity with YARN

Speakers
Min Cai

Sr. Staff Engineer, Uber
Min Cai is a Sr. Staff Engineer on the Compute Platform team at Uber, working on all-active datacenters, cluster management, and micro-service deployment systems. He received his Ph.D. in Computer Science from the University of Southern California. Before joining Uber, he was a Sr. Staff...
Mayank Bansal

Staff Engineer, Uber
Mayank Bansal is a Staff Engineer on the data infrastructure team at Uber. He is a co-author of Peloton, an Apache Hadoop committer, and an Oozie PMC member and committer. Previously, he worked on the Hadoop platform team at eBay, leading the YARN and MapReduce effort. Prior to...



Thursday November 21, 2019 11:50am - 12:25pm PST
Room 15AB - San Diego Convention Center Mezzanine Level
  Machine Learning + Data