Apache Spark on Kubernetes—Lessons Learned from Launching Millions of Spark Executors (Databricks Data+AI Summit 2022)

This article summarizes a session presented by Zhou Jiang and Aaruna Godthi from Apple at Data+AI Summit 2022. In the session, Zhou and Aaruna described how they built a centralized Apache Spark cluster on Kubernetes that processes 380K+ Spark jobs per day to support analytics workflows and data scientists' experimentation at Apple.

Understanding Spark Connect (Reynold Xin’s keynote at Data+AI Summit 2022)

Spark Connect introduces a "thin client" that brings Spark query capability to low-compute devices. It is backed by a re-architected Spark driver that works around some shortcomings of the monolithic driver and better supports multi-tenancy.