MongoDB has been designed for versatility, but the techniques you might use to build, say, an analytics engine or a hierarchical data store might not be obvious. In this talk, we’ll learn about MongoDB in practice by looking at four hypothetical application designs (based on real-world designs, of course). Topics to be covered include schema design, indexing, transactions (gasp!), trees, what’s fast, and what’s not. Sprinkled with tips, tricks, chutes, ladders, and trap doors, you’re guaranteed to learn something new in this interdisciplinary talk.
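As a taste of the schema-design tradeoffs the talk covers, here is a minimal sketch of the classic embed-versus-reference choice, using plain Python dicts as stand-ins for BSON documents (the blog-post example and all field names are hypothetical, not from the talk itself):

```python
# Two ways to model a blog post and its comments in a document store.

# Embedding: comments live inside the post document; one read fetches everything.
post_embedded = {
    "_id": 1,
    "title": "Intro to MongoDB",
    "comments": [
        {"author": "ann", "text": "Great post!"},
        {"author": "bob", "text": "Thanks for sharing."},
    ],
}

# Referencing: comments are separate documents that point back at the post.
# Better when comments are unbounded or need to be queried on their own.
post_referenced = {"_id": 1, "title": "Intro to MongoDB"}
comments = [
    {"_id": 10, "post_id": 1, "author": "ann", "text": "Great post!"},
    {"_id": 11, "post_id": 1, "author": "bob", "text": "Thanks for sharing."},
]

def comments_for(post_id, comment_docs):
    """Simulate the second lookup the referenced design requires."""
    return [c for c in comment_docs if c["post_id"] == post_id]

print(len(post_embedded["comments"]))   # embedded: already in hand
print(len(comments_for(1, comments)))   # referenced: extra query
```

The embedded form trades document growth for read locality; the referenced form trades an extra query for independent access to comments.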
A recommendation engine can deliver relevant and engaging personalized content to your users. This presentation (followed by a 2-hour lab) will demonstrate how to write a graph-based recommendation algorithm using the graph database Neo4j. We will be using the publicly available MovieLens dataset, but you’ll also learn how to import your own data into Neo4j.
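The core idea, stripped of Neo4j itself, is a two-hop graph traversal: find users who like what you like, then recommend what else they like. Here is a minimal pure-Python sketch of that shape (the toy "likes" data is hypothetical; in Cypher this traversal would be a pattern along the lines of `(u)-[:LIKES]->(m)<-[:LIKES]-(other)-[:LIKES]->(rec)`):

```python
from collections import Counter

# Toy likes graph: user -> set of liked movies (a stand-in for MovieLens ratings).
likes = {
    "alice": {"Matrix", "Inception", "Memento"},
    "bob":   {"Matrix", "Inception", "Heat"},
    "carol": {"Heat", "Casino"},
}

def recommend(user, likes):
    """Score movies liked by users who share at least one liked movie
    with `user`, excluding movies the user has already liked."""
    seen = likes[user]
    scores = Counter()
    for other, their_movies in likes.items():
        if other == user or not (seen & their_movies):
            continue  # no shared movie means no two-hop path to this user
        for movie in their_movies - seen:
            scores[movie] += 1  # one vote per connected neighbor
    return [movie for movie, _ in scores.most_common()]

print(recommend("alice", likes))  # -> ['Heat']
```

A graph database like Neo4j makes this kind of neighborhood traversal a first-class query instead of a hand-rolled loop, which is the point of the lab.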
What is sharding? How does sharding work within MongoDB? Jesse Davis of 10gen answers these questions in this 20-minute talk.
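For readers who want the one-paragraph version before the talk: sharding partitions a collection across machines by a shard key, so each document has exactly one home and targeted queries touch one node. The sketch below illustrates the routing idea with a stable hash (MongoDB's hashed sharding uses a different hash function and also supports range-based partitioning; this is just the concept, not its implementation):

```python
import hashlib

# A pretend cluster of three shards, each holding a list of documents.
SHARDS = {0: [], 1: [], 2: []}

def shard_for(key, n_shards=3):
    """Deterministically map a shard-key value to a shard."""
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return int(digest, 16) % n_shards

def insert(doc):
    """Route a document to its shard by its shard key (here, user_id)."""
    SHARDS[shard_for(doc["user_id"])].append(doc)

for uid in range(100):
    insert({"user_id": uid, "data": f"payload-{uid}"})

# The same key always routes to the same shard, so lookups by shard key
# can be sent straight to one node instead of broadcast to all of them.
assert shard_for(42) == shard_for(42)
print([len(shard) for shard in SHARDS.values()])
```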
This session will begin with a very quick overview of Couchbase, its NoSQL ecosystem, and what it means to developers, architects, and implementers. We will then demonstrate the different language APIs with liberal but simple code examples on Couchbase to drive home these concepts. We will cover a simple application as well as some real-life case studies from big social gaming applications.
Hadoop is a general-purpose Big Data platform that offers distributed scalability for data storage and flexible options for working with data. In this session, we’ll learn how various Hadoop tools and techniques address particular data scenarios, including traditional reporting, NoSQL data management, and Machine Learning. In the lab, we’ll use Hive, the SQL engine for Hadoop, to query data at scale, as if Hadoop were a traditional data warehouse.
We will spend the 45-minute talk laying out the Hadoop landscape and how you would approach different kinds of problems with different tools. The 2-hour lab will focus on one tool, Hive, the SQL engine for using Hadoop as a data warehouse. The goal is to show that Hadoop is a flexible environment for all kinds of data needs, some well outside the traditional realms of data “solutions” (for example, Machine Learning), and that it supports NoSQL options. The lab shows that the traditional approach of querying data with SQL still works with Hadoop when that’s the most appropriate tool for particular needs. In fact, you get most of what SQL traditionally provides; we’ll highlight what’s different.
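To preview the kind of query the lab exercises, here is a minimal sketch using Python’s built-in sqlite3 as a small-scale stand-in for Hive (the table and data are hypothetical; HiveQL is close to standard SQL for aggregate queries like this one, though its DDL, type system, and execution model differ):

```python
import sqlite3

# An in-memory table standing in for a Hive table over files in HDFS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ratings (user_id INTEGER, movie TEXT, stars INTEGER)")
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?)",
    [(1, "Heat", 5), (2, "Heat", 4), (1, "Casino", 3), (3, "Casino", 5)],
)

# Aggregate reporting queries like this are Hive's bread and butter;
# Hive compiles them into distributed jobs instead of a local scan.
rows = conn.execute(
    """SELECT movie, COUNT(*) AS n, AVG(stars) AS avg_stars
       FROM ratings
       GROUP BY movie
       ORDER BY avg_stars DESC"""
).fetchall()

for movie, n, avg_stars in rows:
    print(movie, n, avg_stars)  # Heat 2 4.5 / Casino 2 4.0
```

The point the session makes is that the query stays familiar; what changes underneath is the scale and the execution engine.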