map reduce

Data I/O 2013 – Web-scale Data Processing: Practical approaches for low-latency and batch – Edward Capriolo

Podcast (Duration: 59:47, 137.9MB)

In this talk, Hive and Cassandra author (and Hive committer and PMC member) Edward Capriolo will discuss common big-data software challenges and how they can be solved using both batch and stream processing. The technology focus will primarily be on Apache Kafka for publish-subscribe messaging, Storm for stream processing, and Apache Cassandra as a NoSQL data store.

TechCast #13 – Toby DiPasquale on Google, Map-Reduce, Hadoop, Amazon EC2 and more

Podcast (Duration: 48:32, 44.4MB)

This week we feature an interview with Toby DiPasquale of Invite Media. Toby and I discuss the Map-Reduce algorithm, which is the engine that powers Google’s indexing and data processing systems. We start off by discussing how Google started indexing pages, using traditional methods such as C/C++ routines. Quickly this became unmanageable, as the amount …