Apache Spark 2.4 comes packed with a lot of new functionalities and improvements, including the new barrier execution mode, flexible streaming sink, the native AVRO data source, PySpark’s eager evaluation mode, Kubernetes support, higher-order functions, Scala 2.12 support, and more.
Xiao Li and Wenchen Fan offer an overview of the
major features and enhancements in Apache Spark 2.4. Along the way, you’ll learn about the design and implementation of V2 of theData Source API and catalog federation in the upcoming Spark release. Then you’ll get the chance to ask all your burning Spark questions.