Video recording and post-production by the OpenStack Foundation.
Apache Spark is a fast, general-purpose engine for large-scale data processing, with efficient support for ETL, streaming, and machine learning workloads. For in-memory workloads, it can run analytic jobs up to 100x faster than Hadoop MapReduce. In this talk, we introduce Spark and the various storage systems it can operate on.
In particular, we focus on the integration between Spark and OpenStack Swift, and demonstrate advanced models for executing Spark jobs directly on Swift objects.