Room 6C - San Diego Convention Center
Monday, November 18

9:00am PST

ServiceMeshCon hosted by CNCF (Additional Registration + Fee Required)
ServiceMeshCon is a vendor-neutral conference on service mesh technologies, featuring maintainers across different service mesh projects and also showcasing the lessons learned from running service meshes in production.

How to register: Pre-registration is required. To register for ServiceMeshCon, add it on during your KubeCon + CloudNativeCon registration.

For questions regarding this event, please reach out to events@cncf.io.

Monday November 18, 2019 9:00am - 4:30pm PST
Room 6C - San Diego Convention Center Upper Level
Tuesday, November 19

10:55am PST

Making the Most Out of Kubernetes Audit Logs - Laurent Bernaille & Robert Boll, Datadog
The Kubernetes audit logs are a rich source of information: all of the calls made to the API server are stored, along with additional metadata such as usernames, timings, and source IPs. They help to answer questions such as “What is overloading my control plane?” or “Which sequence of events led to this problematic situation?”. These questions are hard to answer otherwise—especially in large clusters. At Datadog, we have been running clusters with 1000+ nodes for more than a year and during that time, the audit logs have proved invaluable.

In this talk, we will first introduce the audit logs, explain how they are configured, and review the type of data they store. We will then demo a functioning setup and show a few different types of analysis techniques. Finally, we will describe in detail several scenarios where they have helped us to diagnose complex problems.
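A question such as "What is overloading my control plane?" can be approached by counting audit events per caller. The sketch below is only an illustration, not the speakers' setup: it assumes the audit log is available as JSON lines in the `audit.k8s.io/v1` Event format, and the helper name `top_callers` and the sample events are hypothetical.

```python
import json
from collections import Counter


def top_callers(lines, n=3):
    """Count audit events per (username, verb, resource), most frequent first."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # A single request can be logged at several stages; count it once.
        if event.get("stage") != "ResponseComplete":
            continue
        user = event.get("user", {}).get("username", "unknown")
        verb = event.get("verb", "unknown")
        # objectRef is absent for non-resource requests such as /healthz.
        resource = (event.get("objectRef") or {}).get("resource", "unknown")
        counts[(user, verb, resource)] += 1
    return counts.most_common(n)


# Hypothetical sample events, not real cluster data.
CRAWLER = "system:serviceaccount:default:crawler"
SAMPLE = [
    json.dumps({"stage": "RequestReceived", "verb": "list",
                "user": {"username": CRAWLER},
                "objectRef": {"resource": "pods"}}),
    json.dumps({"stage": "ResponseComplete", "verb": "list",
                "user": {"username": CRAWLER},
                "objectRef": {"resource": "pods"}}),
    json.dumps({"stage": "ResponseComplete", "verb": "list",
                "user": {"username": CRAWLER},
                "objectRef": {"resource": "pods"}}),
    json.dumps({"stage": "ResponseComplete", "verb": "get",
                "user": {"username": "system:node:node-1"},
                "objectRef": {"resource": "nodes"}}),
]

if __name__ == "__main__":
    for key, count in top_callers(SAMPLE):
        print(count, key)
```

Grouping by source IP or by time window instead of by caller would follow the same pattern, using the `sourceIPs` and `requestReceivedTimestamp` fields of each event.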


Laurent Bernaille

Staff Engineer, Datadog
Laurent Bernaille worked several years as a consultant specializing in cloud, containers, and automation and helped organizations migrate to the public cloud and adopt containers. He is now Principal Engineer at Datadog and works closely with infrastructure teams, which are responsible...

Robert Boll

Senior Director of Engineering, Datadog

Tuesday November 19, 2019 10:55am - 11:30am PST
Room 6C - San Diego Convention Center Upper Level

11:50am PST

eBay Search On K8s - Mohnish Kodnani & Yashwanth Vempati, eBay
eBay currently has billions of items available for search. At any given time, the search engine handles on the order of hundreds of thousands of queries per second against this inventory.
To support this scale of traffic and the size of the inventory, we need thousands of servers. The inventory is sharded and then replicated across these servers to handle the traffic. In this talk, we will go through how we migrated the application to Kubernetes and describe its deployment architecture, which meets business requirements for resiliency and availability. We will also go through our index distribution architecture, which leverages Kubernetes principles. Finally, we will share the challenges we faced and the lessons we learned while deploying the application on Kubernetes.

Mohnish Kodnani

Sr MTS, Software Engineer, eBay
Mohnish works on eBay Search Engine’s Indexing and Data Acquisition domains. He is currently in charge of migrating the Search Engine’s deployment onto k8s. In his spare time he loves to travel, rock climb, and spend time with his 5-year-old son.

Yashwanth Vempati

MTS 1, Software Engineer, eBay
Yashwanth is a passionate engineer interested in solving complex business problems. Right now he is working on moving the majority of traditional applications to cloud native. He is also working on storing data from Kubernetes clusters and using it for monitoring and machine learning...

Tuesday November 19, 2019 11:50am - 12:25pm PST
Room 6C - San Diego Convention Center Upper Level
  Application + Development

2:25pm PST

Russian Doll: Extending Containers with Nested Processes - Christie Wilson & Jason Hall, Google
Kubernetes extensibility has gone mainstream. From CRDs to admission controllers to custom schedulers, as a platform builder you have access to a powerful toolbox! But what about the humble Pod and its hardworking containers? What if you want to extend them? What tools are at your disposal?

In this talk you’ll learn how to extend a container by overriding its binary. This inventive approach is used by Prow (the CI/CD system that tests Kubernetes itself) and systems built on Tekton Pipelines (a Kubernetes based CI/CD platform) like Jenkins X and OpenShift Pipelines.

You’ll see how you can control the order of container execution within a Pod, stream logs to a persistent store at scale, and gracefully handle the appearance and lifecycle of injected sidecars. You’ll learn some of the benefits and drawbacks, as well as how to overcome the hurdles.
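One way to picture ordering containers by overriding the binary is a small wrapper that waits for a signal file, runs the real command, and then writes a signal file for the next container. The sketch below is a rough illustration under that assumption, not the actual Prow or Tekton code; `run_step`, the file paths, and the timeout values are hypothetical.

```python
import os
import subprocess
import sys
import time


def run_step(wait_file, post_file, command, poll_seconds=0.05, timeout=5.0):
    """Wait for wait_file, run command, then create post_file on success."""
    deadline = time.monotonic() + timeout
    # Block until the previous step signals completion via a shared volume.
    while not os.path.exists(wait_file):
        if time.monotonic() > deadline:
            raise TimeoutError(f"{wait_file} never appeared")
        time.sleep(poll_seconds)

    # Run the container's original command as a nested process.
    result = subprocess.run(command)

    # Signal the next step only if this one succeeded.
    if result.returncode == 0:
        with open(post_file, "w"):
            pass
    return result.returncode


if __name__ == "__main__":
    # Usage: wrapper.py <wait_file> <post_file> <command> [args...]
    sys.exit(run_step(sys.argv[1], sys.argv[2], sys.argv[3:]))
```

In a Pod, each container's entrypoint would be replaced by such a wrapper, with the signal files placed on a volume shared by all containers.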

Jason Hall

Software Engineer, Google
Jason Hall (he/him) is a software engineer at Google, currently working on the Tekton project. Before Tekton, he helped launch Google Cloud Build (formerly Google Cloud Container Builder), and before that helped launch Google Cloud Source Repositories.

Christie Wilson

Software Engineer, Google
Christie Wilson (she/her) is a software engineer at Google and co-creator of the Tekton project. Over the past decade+ she has worked in the mobile, financial and video game industries. Prior to working at Google she built load testing tools for AAA video game titles, and founded...

Tuesday November 19, 2019 2:25pm - 3:00pm PST
Room 6C - San Diego Convention Center Upper Level

3:20pm PST

Kubernetes in Your 4x4 – Continuous Deployment Directly to the Car - Rafal Kowalski, Grape Up
The automotive industry is becoming increasingly digitalized. Vehicles are no longer only a means of transportation; they aim to be the driver's control center, with multiple software components onboard. To keep pace with evolving customer expectations and the newest technological solutions, vehicle software requires frequent updates. However, the delivery process in a scaled-up environment is not straightforward. Developers and operators face challenges that are unusual in the typical cloud native world. Even basic service deployment may be complicated by network performance or geographical considerations. During this talk, Rafał will show how to use Kubernetes, KubeEdge, k3s, Jenkins, and RSocket to build continuous deployment pipelines that ship software directly to the car and deal with rollbacks and connectivity issues.

Rafał Kowalski

Cloud Solution Architect, Grape Up
Rafał Kowalski is a Cloud Solution Architect at Grape Up and a PhD student at the Complex Theory System Department at the Institute of Nuclear Physics Polish Academy of Science. His professional career, as well as scientific work, is related to delivering robust, scalable cloud-based...

Tuesday November 19, 2019 3:20pm - 3:55pm PST
Room 6C - San Diego Convention Center Upper Level

4:25pm PST

Measuring and Optimizing Kubeflow Clusters at Lyft - Konstantin Gizdarski, Lyft & Richard Liu, Google
Machine learning workloads are often resource-intensive operations. As companies adopt more of these workloads, tracking resource consumption and optimizing spending becomes more challenging.

At Lyft, we developed a system which scrapes metrics from Kubernetes clusters and persists them in data warehouses. We then built a pipeline that transforms snapshots into cluster utilization metrics along the dimensions of CPU, memory, and GPU. Finally we join these metrics into our cost and usage dataset, so teams can budget resources accordingly and reduce spending.

In this talk, we will give an overview of Infraspend - our infrastructure for tracking Kubernetes usage. Attendees will learn how the data we collected helped Lyft reduce spending for Kubeflow clusters. The audience will also gain insights into how Kubernetes clusters can be optimized without performance or stability compromises.
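The step from snapshots to utilization metrics can be pictured as aggregating used and requested resources per team and dividing one by the other. The sketch below is a guess at the general shape of such a calculation, not Infraspend itself; the snapshot layout, the `utilization` helper, and the numbers are all hypothetical.

```python
from collections import defaultdict

RESOURCES = ("cpu", "memory", "gpu")


def utilization(snapshots):
    """Return used/requested per (team, resource), summed over all snapshots."""
    requested = defaultdict(float)
    used = defaultdict(float)
    for snap in snapshots:
        for resource in RESOURCES:
            key = (snap["team"], resource)
            requested[key] += snap["requested"].get(resource, 0.0)
            used[key] += snap["used"].get(resource, 0.0)
    # Skip dimensions with nothing requested to avoid dividing by zero.
    return {
        key: used[key] / requested[key]
        for key in requested
        if requested[key] > 0
    }


# Hypothetical snapshots: CPU in cores, GPU in devices.
SNAPSHOTS = [
    {"team": "ml-train",
     "requested": {"cpu": 8.0, "gpu": 2.0},
     "used": {"cpu": 2.0, "gpu": 1.0}},
    {"team": "ml-train",
     "requested": {"cpu": 8.0, "gpu": 2.0},
     "used": {"cpu": 6.0, "gpu": 1.0}},
]

if __name__ == "__main__":
    for key, ratio in sorted(utilization(SNAPSHOTS).items()):
        print(key, f"{ratio:.0%}")
```

Joining these ratios with a cost dataset would then show how much of the spend on each team's requests goes to capacity that is actually used.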

Richard Liu

Senior Software Engineer, Google
Richard Liu is a Senior Software Engineer at Google Cloud. He is currently an owner and maintainer of the TensorFlow operator and Katib projects in Kubeflow. Previously he had worked as a software developer at Microsoft Azure.

Konstantin Gizdarski

Software Engineer, Lyft
Konstantin Gizdarski is a Software Engineer at Lyft, where he has been working on — among other things — surfacing the utilization and efficiency of Kubernetes infrastructure. Previously, he has worked on machine learning and product at both Facebook and Stripe.

Tuesday November 19, 2019 4:25pm - 5:00pm PST
Room 6C - San Diego Convention Center Upper Level
  Machine Learning + Data
Wednesday, November 20

10:55am PST

Performance Tuning and Day 2 Operations - Goutham Veeramachaneni, Grafana Labs
Cortex is a distributed version of Prometheus with a lot of moving parts. We have a pretty good getting-started guide with enough information to get a working Cortex cluster that can ingest data and answer queries. But there is limited material on day 2 operations: capacity planning, query performance debugging, and general health monitoring. In this talk, we will take you through the debugging workflow, the typical knobs that should be tweaked for optimal performance, the Cortex mixin that covers the dashboards and alerts, and how to approach debugging and maintaining an existing Cortex cluster in general.

Goutham Veeramachaneni

Senior Software Engineer, Grafana Labs
Goutham is a developer from India who started his journey as an infra intern at a large company where he worked on deploying Prometheus. After the initial encounter, he started contributing to Prometheus and interned with CoreOS, working on Prometheus's new storage engine. He is now...

Wednesday November 20, 2019 10:55am - 11:30am PST
Room 6C - San Diego Convention Center Upper Level
  Maintainer Track Sessions

11:50am PST

Case Study: AI-as-a-Service on Kubernetes at Scale and In Production - Itay Gabbay, Israel Ministry of Defense (MOD) & Tushar Katarki, Red Hat
AI is popular and yet faces two big challenges in the industry: 1) self-service and automation, and 2) use in real production.

At the Israel Ministry of Defense we are taking on these challenges with containers and Kubernetes. We have built AI-as-a-service with open source tools and Kubernetes. Our data scientists use the service for data, experimentation, and delivering models into production iteratively with self-service and automation.

Using Kubernetes, we are able to run massive machine learning pipelines automatically, and improve our machine learning models. We implemented several principles of AutoML - a wide research area nowadays. Using AutoML & Kubernetes, we can further improve our machine learning models and pipelines - automatically.

Come find out how we built our AI service on Kubernetes, issues we ran into and best practices with a live demo and supporting slides.

Tushar Katarki

Product Manager, Red Hat
Tushar Katarki is a senior technology professional with experience in cloud architecture, product management and engineering. He is currently at Red Hat as a product manager for OpenShift with a focus on AI/ML on OpenShift. Tushar is involved with several open source projects around...

Itay Gabbay

Machine Learning Engineer, MOD Israel
Itay Gabbay is a software engineer specializing in machine learning and AutoML. He is currently at the Israeli Ministry of Defense, responsible for a machine learning platform he designed and implemented, based on OpenShift.

Wednesday November 20, 2019 11:50am - 12:25pm PST
Room 6C - San Diego Convention Center Upper Level
  Case Studies

2:25pm PST

Beyond Getting Started: Using OpenTelemetry to Its Full Potential - Sergey Kanzhelev, Microsoft & Morgan McLean, Google
OpenTelemetry is a cloud-native set of APIs and libraries used to generate, collect, and export telemetry from distributed systems. This session goes beyond a basic introduction, and demonstrates how you can customize OpenTelemetry’s components and architecture for the unique needs of your app. Attendees will learn how to set up and configure built-in data collectors, how to write their own instrumentation, how to extend and enrich automatically collected telemetry with app-specific information, and how to send this data to Prometheus and Jaeger for analysis.

Morgan McLean

Product Manager, Google
Morgan is a co-founder of OpenCensus and OpenTelemetry, and has spent much of his career as an engineer and product manager working on distributed systems and developer tools. Morgan is responsible for Google's distributed tracing, profiling, and debugging tools, including Stackdriver...

Sergey Kanzhelev

Staff Software Engineer, Google
Sergey Kanzhelev is a seasoned open source and cloud native maintainer working actively on Kubernetes. Sergey serves as co-chair of SIG Node and is one of the OpenTelemetry founders. He works on the engineering aspects of software and its practical application. With the Kubernetes...

Wednesday November 20, 2019 2:25pm - 3:00pm PST
Room 6C - San Diego Convention Center Upper Level

3:20pm PST

Panel: Improving and Managing Kubernetes at Scale - Xiang Li, Alibaba; Corin Dwyer, Netflix; Amit Bose, Uber; June Liu & Harry Zhang, Pinterest
Companies like Alibaba, Uber, and Pinterest manage huge fleets of machines with demanding and complicated workloads. To evolve our infrastructure and adopt Kubernetes, we faced many challenges around scalability, reliability, flexibility, and operability. Today, after overcoming those difficulties, we are running some of the largest Kubernetes clusters in the world.

In this panel, we would like to share our real-world experience of improving and managing Kubernetes under harsh requirements. We believe the stories are interesting in themselves, and many of the lessons we learned also apply to operators and users of small and mid-size clusters.


Amit Bose

Senior Software Engineer II, Uber

June Liu

Staff Software Engineer, Pinterest Inc
After spending years in large organizations, June joined Pinterest to explore the vast ocean of open source and startup spirit. Her interests focus on container orchestration, large scale cluster operations and developer tools. She currently works on the compute platform team at Pinterest...

Xiang Li

Senior Staff Engineer, Alibaba
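Xiang Li is a senior staff engineer at Alibaba Cloud Intelligence, responsible for Alibaba's large-scale cluster scheduling and management systems. He helped Alibaba complete the initial transformation of its infrastructure through cloud native technology, substantially improving resource utilization and the efficiency of software development and deployment, while also supporting the technical evolution of its cloud products. CNCF...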

Harry Zhang

Software Engineer, Pinterest
Harry is a Software Engineer at Pinterest working on its Kubernetes-based next-generation container cloud. Harry is interested in large scale cluster management solutions and related technologies. Harry is currently a Kubernetes contributor and a CNCF Certified Kubernetes Administrator...

Corin Dwyer

Senior Software Engineer, Netflix
Corin Dwyer is a senior software engineer within the Netflix compute platform development team. Before working on Titus, Netflix's container platform, he worked on infrastructure engineering for the Netflix content organization and before that in healthcare. He has worked across the...

Wednesday November 20, 2019 3:20pm - 3:55pm PST
Room 6C - San Diego Convention Center Upper Level

4:25pm PST

Kubernetes Storage Cheat Sheet for VM Administrators - Manu Batra & Jing Xu, Google
Getting started with containers and Kubernetes can be daunting, especially when coming from the world of virtual machines. The differences in storage models add to the confusion. This session will explain the storage and data management differences between virtual machines and containers. Specifically, we will focus on:

- Translating the VM terminology and challenges to the Kubernetes container world.
- Drawing architectural parallels between the two approaches including storage operations and communication fundamentals.
- Discouraging the impulse to tackle storage problems the same way on Kubernetes as was done in the VM world.

You will leave this talk with an understanding of how storage works in the Kubernetes ecosystem, with parallels to VM/host storage terminology, architecture, and operations.


Jing Xu

Software Engineer, Google
Jing Xu obtained her Ph.D. from the Electrical and Computer Engineering Department, University of Florida in May 2011. After graduation, she was a lecturer in the School of Computer Science at Florida International University for about 4 years. She moved to the Bay Area in late 2014 and...

Manu Batra

Product Manager, Google
Manu Batra is a Product Manager at Google driving product strategy and delivery for Anthos, Kubernetes Storage and Container Data Protection. In prior roles he has worked across startup and enterprise companies building storage & infrastructure management software.

Wednesday November 20, 2019 4:25pm - 5:00pm PST
Room 6C - San Diego Convention Center Upper Level

5:20pm PST

Thanos Deep Dive: Inside a Distributed Monitoring System - Bartlomiej Plotka & Frederic Branczyk, Red Hat
Thanos is an open-source CNCF Sandbox project that builds upon Prometheus components to create a global-scale, highly available monitoring system. It seamlessly extends Prometheus in a few simple steps and is already used in production by dozens of companies that need multi-cloud scale for metrics while keeping maintenance costs low. During this talk, Frederic Branczyk and Bartek Plotka, core maintainers of the Thanos and Prometheus projects, will explain advanced concepts behind the Thanos project. This talk will cover:

- Possible deployment models
- Integration points with other systems
- Important advanced features, e.g., discovery, multi-label HA, and query load balancing
- Example solutions for multi-tenancy, authentication, and cross-cluster communication in Thanos

Join this session to learn about advanced concepts and operational models of Thanos!

Bartłomiej Płotka

Mr, Google
Bartek Płotka is a Senior Software Engineer at Google. SWE by heart, with an SRE background, currently working on Cloud Observability. Previously Principal Software Engineer at Red Hat. Author of "Efficient Go" book with O'Reilly. As the co-founder of the CNCF Thanos project and...

Frederic Branczyk

Founder, Polar Signals
Frederic is the founder of Polar Signals. Before, he was a senior principal engineer and the main architect for all things Observability at Red Hat, which he joined through the CoreOS acquisition. Frederic is a Prometheus and Thanos maintainer and tenured as the tech lead for...

Wednesday November 20, 2019 5:20pm - 5:55pm PST
Room 6C - San Diego Convention Center Upper Level
  Maintainer Track Sessions
Thursday, November 21

10:55am PST

Building a Dev/Test Loop for a Kubernetes Edge Gateway with Envoy Proxy - Flynn, Datawire
As we worked with the community to build the open source Ambassador API gateway on top of Envoy Proxy, we learned a bunch of lessons about our dev/test loop. One of the more unpleasant realities that we’ve had to come to terms with is that writing code is easy. What's hard is making sure it's working, and making sure that it keeps working as changes are made.

Over the life of Ambassador we've gone through multiple cycles of adding tests to increase confidence, from simple unit tests to larger integration suites, such as our Kubernetes Acceptance Test (KAT) framework. Several times these tests have become too slow, and then we had to work to speed them up so our velocity doesn't suffer.

Join Flynn to learn what we would do again in regard to our dev/test loop if we chose to build another open source tool, and also (more critically), what we would change.

Flynn


Technical Evangelist, Buoyant
Flynn is a technology evangelist at Buoyant, spreading the good word and educating developers about the Linkerd service mesh, Kubernetes, and cloud-native development in general. He has spent four decades in software engineering from the kernel up through distributed applications...

Thursday November 21, 2019 10:55am - 11:30am PST
Room 6C - San Diego Convention Center Upper Level

11:50am PST

Linux Distribution Build Tools for Custom Container Images - Nisha Kumar & Joshua Lock, VMware
A typical container image builder takes a base OS from somewhere, runs scripts to add and modify all the things needed for an app to run, then deploys to a registry. This leads to large images which developers try to trim down by using multistage builds, removing files and squashing filesystem layers. Building container images in this way makes it difficult if not impossible to ascertain the license and security implications of using these images.

How do we generate app specific build and runtime images without having to maintain our own base OS images and build machinery?

Fortunately, this is a problem that has been solved in the Linux distribution world for some time. This talk will outline some popular tools and compare them against the requirements for a declarative and reproducible container OS builder which is not reliant on any external infrastructure.


Nisha Kumar

Security Engineer, Oracle
Nisha is a Security Engineer at Oracle. She has been a DevOps engineer for embedded systems and a Radio Frequency Engineer in semiconductor manufacturing. She has been involved in Open Source for more than 15 years. You can follow her work on Twitter at @_ctlfsh

Joshua Lock

Open Source Architect, Verizon
Joshua is Open Source Architect in Verizon's Open Source Program Office where he leads efforts to improve consistency around how Verizon uses open source. As part of his work at Verizon he works upstream on software supply chain security standards and tools; he is a steering committee...

Thursday November 21, 2019 11:50am - 12:25pm PST
Room 6C - San Diego Convention Center Upper Level
  Application + Development

2:25pm PST

Securing Communication Between Meshes and Beyond with SPIFFE Federation - Evan Gilman, Scytale & Oliver Liu, Google
One of the hottest features that Istio brings to the table is transparent, mutually-authenticated TLS between all workloads running on it. Under the covers, it relies on SPIFFE to provide the cryptographic identity that is used to perform this mutual authentication.

SPIFFE relies on an authority to issue identity. In an Istio mesh, Istio Citadel (CA) issues certificates to workloads by default... but, what happens when you have more than one Istio mesh, and hence more than one Citadel? Or Istio workloads talking to external services?

Enter SPIFFE federation. It allows SPIFFE identity issuers to peer with each other, enabling workloads in disparate domains to securely authenticate and communicate with each other. In this talk, we will describe the challenges involved and how SPIFFE addresses them, and demonstrate SPIFFE federation between an Istio mesh and SPIRE.

Evan Gilman

Staff Engineer, VMware
Evan Gilman is an engineer with a background in computer networks. With roots in academia, and currently working on the SPIFFE project, he has been building and operating systems in hostile environments his entire professional career. An open source contributor, speaker, and author...

Oliver Liu

Senior Software Engineer, Google
Dr. Oliver (Yonggang) Liu is a senior software engineer at Google. He is one of the early developers and core engineers of Istio. Oliver has 10 years of experience in research and development of distributed systems and service mesh. Oliver received his PhD degree from University of...

Thursday November 21, 2019 2:25pm - 3:00pm PST
Room 6C - San Diego Convention Center Upper Level

3:20pm PST

CoreDNS: Beyond the Basics - Cricket Liu, Infoblox & John Belamaric, Google
This session will cover aspects of CoreDNS's configuration beyond the basics, including signing DNS data with DNSSEC, supporting DNS over TLS (DoT), manipulating queries and responses, managing zone data with Git, running a full recursive DNS server with the unbound plugin, and configuring CoreDNS to perform multi-cluster service discovery. The session is intended for people with a solid understanding of basic CoreDNS configuration who wish to support more advanced use cases or to extend CoreDNS's functionality.

Cricket Liu

Chief DNS Architect, Infoblox
Cricket Liu is an authority on the Domain Name System and the co-author of all of O'Reilly Media’s books on DNS, including the classic DNS and BIND. As Infoblox’s Chief DNS Architect, Cricket guides the development of Infoblox’s product and business strategy, and serves as a...

John Belamaric

Senior Staff Software Engineer, Google
John Belamaric is a Senior Staff Software Engineer at Google with over 25 years of software design and development experience. As a co-chair of Kubernetes SIG Architecture, he provides leadership on production readiness, conformance, and overall software architecture for the Kubernetes...

Thursday November 21, 2019 3:20pm - 3:55pm PST
Room 6C - San Diego Convention Center Upper Level

4:25pm PST

Deep Dive Into the Latest Kubernetes Scheduler Features - Abdullah Gharaibeh, Google Inc.
Kubernetes Scheduler is the component of Kubernetes that assigns pods to nodes based on the configured scheduling requirements. Users can choose to run their clusters with high resource efficiency, high reliability, or other custom policies. The scheduler also implements a number of critical Kubernetes features, such as "Node Affinity", "Inter-pod affinity and anti-affinity" and the new "Even pod spreading" feature. This talk will provide information on recent SIG Scheduling projects and features, including the scheduling framework and even pod spreading. We will dedicate about half of the presentation time to audience questions and user feedback.

Abdullah Gharaibeh

Staff Software Engineer, Google
Abdullah is a staff software engineer at Google and co-chair of SIG Scheduling and the Batch Working Group. He works on Kubernetes and Google Kubernetes Engine, focusing on scheduling and batch workloads.

Thursday November 21, 2019 4:25pm - 5:00pm PST
Room 6C - San Diego Convention Center Upper Level
  Maintainer Track Sessions

5:20pm PST

Release the Kraken: Bring Sidecar Containers to Next Level - Di Xu, Ant Financial & Xiaoyu Zhang, Alibaba
Sidecar containers are well accepted and widely used nowadays. Sidecars are coupled with normal containers by sharing the same lifecycle and provide accessory features. This is a good pattern to enable applications to be composed of heterogeneous components and technologies by reducing coupling.

Demand for sidecar containers in production environments is rapidly increasing, although sidecars have not been formally identified. More and more issues and discussions have cropped up in the Kubernetes community and Slack channels.

Thus, we need a fine-grained way to manage sidecars, including their starting and terminating order and their lifecycle. Pre and post steps are also introduced to better control the sidecars. Moreover, we will present some usage scenarios showing how we maximize the power of sidecars at large scale in Alibaba Group and Ant Financial.

Di Xu

Senior Engineer, Tencent
Currently, he is working at Tencent as a staff engineer, leading a small team working on open source cloud native projects and distributed cloud platform development. He is also a top 50 code contributor in the Kubernetes community. He has spoken many times at open source conferences...

Xiaoyu Zhang

Senior Engineer, Alibaba
Xiaoyu Zhang is a senior software engineer at Alibaba Group. He's a member of the Kubernetes organization. He mainly works on the Kubernetes project and focuses on the docs, kubectl, controller-manager, storage, and runtime areas. He has given multiple talks at Cloud Native End User Conference...

Thursday November 21, 2019 5:20pm - 5:55pm PST
Room 6C - San Diego Convention Center Upper Level
