Video recording and production done by the OpenStack Foundation.
One of the most recent additions to Sahara (the OpenStack Data-Processing-as-a-Service component) was the Storm plugin. Up to the Juno release, OpenStack Sahara included only batch processing alternatives, whereas Storm is one of the most popular open-source tools for real-time data analytics and stream processing.
Implementing real-time data processing differs from Hadoop and other batch processing approaches because the data cannot be stored first and processed afterwards. Instead, the processing takes place while the data traverses the system. As a consequence, sub-second processing latencies can be achieved. This plugin therefore enables new types of applications to be executed in Sahara.
Real-time data processing is increasingly popular, and many such applications are now part of our daily routine. One example is online data summarization, where a high-volume data feed (e.g., logs from a large cluster or sensor network) is summarized so that only relevant events are stored in a database. Other common examples, with varying performance and scalability requirements, include fraud detection, trending topics, and high-frequency trading.
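To make the online summarization example concrete, the following is a minimal sketch in plain Python. It is not Storm or Sahara code: the `LEVEL message` log format, the `RELEVANT_LEVELS` set, and the `parse` and `summarize` helpers are all assumptions made for illustration. It only shows the general idea of handling each record as it arrives, keeping running counts, and passing on just the relevant events for storage.

```python
from collections import Counter
from typing import Iterable, Iterator

# Hypothetical severity levels treated as "relevant" (an assumption for this sketch).
RELEVANT_LEVELS = {"ERROR", "CRITICAL"}


def parse(line: str) -> tuple[str, str]:
    """Split an assumed 'LEVEL message' log line into (level, message)."""
    level, _, message = line.partition(" ")
    return level, message


def summarize(feed: Iterable[str], counts: Counter) -> Iterator[tuple[str, str]]:
    """Handle each record as it arrives: update the running counts and
    yield only the relevant events, which a caller could then persist."""
    for line in feed:
        level, message = parse(line)
        counts[level] += 1
        if level in RELEVANT_LEVELS:
            yield level, message


# A tiny in-memory stand-in for a high-volume feed.
feed = [
    "INFO service started",
    "ERROR disk full on node-3",
    "INFO heartbeat ok",
    "CRITICAL node-7 unreachable",
]

counts: Counter = Counter()
stored = list(summarize(iter(feed), counts))
```

In this sketch only two of the four records end up in `stored`, while `counts` keeps a compact summary of everything that passed through. In a real deployment the feed would be unbounded and the work would be distributed across a cluster.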
In this talk, we will present the Storm plugin for Sahara and guide the user through the essential steps to set up a scalable real-time data processing application. We will also share our plans for improving real-time data processing in OpenStack.