PhillyETE Screencast #30 - Druid - Real-Time Queries Meet Real-Time Data - Eric Tschetter

Podcast duration: 59:18 (195.0MB)

From the abstract: “This talk will focus on the design considerations and architecture of Druid, an open-source, distributed, column-oriented analytical data store. Druid is an open source distributed system in use at Metamarkets (http://www.metamarkets.com) to facilitate rapid exploration of high dimensional spaces. We use Druid to expose impression monetization data to ad tech companies along any arbitrary combination of demographic, content and sales-based dimensions. One Druid cluster currently exposes a data set of >40 billion rows of data representing >2 trillion impressions in hypercubes of varying dimensionality (largest is 30+ dimensions) while allowing for exploration using top lists and timeseries in sub-second latencies. There will be a particular focus on how Druid can be used to ingest data in real-time on the write side and provide real-time access to data on the read side.

The Druid code can be found at http://www.github.com/metamx/druid.”
