November 13–15, 2018 - Shanghai, China
Simultaneous translation will be provided for all keynote and breakout sessions.
Thursday, November 15 • 16:45 - 17:20
Apache Spark on Kubernetes: A Technical Deep Dive - Yinan Li, Google

Apache Spark is currently the most popular open-source large-scale data processing framework. Previously, users could run Spark applications on standalone, YARN, and Mesos clusters. In the Spark 2.3.0 release, Kubernetes became a new scheduler backend for Spark. This backend enables Spark applications to run natively on Kubernetes by using the Kubernetes scheduler to schedule and run Spark drivers and executors. In this talk, we will dive deep into the technical details of the Kubernetes scheduler backend and explore the new capabilities that this native Kubernetes integration brings to Apache Spark. We will also go over the roadmap and the features that the Kubernetes community has planned for the scheduler backend over the next several releases of Spark.


Yinan Li

Software Engineer, Google
Yinan Li is currently a Software Engineer at Google. He focuses on work that enables large-scale data processing on Kubernetes, including the Kubernetes scheduler backend for Apache Spark. Yinan is a co-chair of Kubernetes SIG Big Data. He is also an Apache Spark contributor and...

2F Room 1
