At the core of many of LinkedIn's analytics applications is a real-time data pipeline built on top of Apache Kafka. This system handles over 10 billion message writes per day across thousands of production processes. This talk will cover some of the challenges of building and scaling this data pipeline for log data, system metrics, and other high-volume data streams. It will also cover some details of the design of Kafka, as well as some of the particular requirements of Hadoop data loads and real-time processing applications.
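
To give a concrete sense of the publish side of such a pipeline, here is a minimal sketch of a Kafka producer using the standard Java client. The broker address, topic name, key, and JSON payload are illustrative assumptions, not details from LinkedIn's deployment; the point is simply that applications write events to a topic, and downstream systems such as Hadoop loaders or stream processors consume the same stream independently.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class LogEventProducer {
    public static void main(String[] args) {
        // Broker address and serializers; values here are placeholders.
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Publish one log event to a hypothetical "page-views" topic.
            // Consumers (batch loaders, real-time processors) read this
            // topic at their own pace without coordinating with the producer.
            producer.send(new ProducerRecord<>("page-views",
                    "member-123", "{\"page\":\"/home\",\"ts\":1700000000}"));
        }
    }
}
```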