ETE 2012 – Nathan Marz on Storm


From the abstract:

Storm makes it easy to write and scale complex realtime computations on a cluster of computers, doing for realtime processing what Hadoop did for batch processing. Storm guarantees that every message will be processed. And it’s fast – you can process millions of messages per second with a small cluster. Best of all, you can write Storm topologies using any programming language. Storm was open-sourced by Twitter in September of 2011 and has since been adopted by numerous companies around the world.

Storm provides a small set of simple, easy to understand primitives. These primitives can be used to solve a stunning number of realtime computation problems, from stream processing to continuous computation to distributed RPC. In this talk you’ll learn:
* The concepts of Storm: streams, spouts, bolts, and topologies
* Developing and testing topologies using Storm’s local mode
* Deploying topologies on Storm clusters
* How Storm achieves fault-tolerance and guarantees data processing
* Computing intense functions on the fly in parallel using Distributed RPC
* Making realtime computations idempotent using transactional topologies
* Examples of production usage of Storm
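
To make the first bullet concrete, here is a minimal single-process sketch of how streams, spouts, bolts, and topologies relate. It is not Storm's API: every name in it (`sentence_spout`, `split_bolt`, `count_bolt`, `run_topology`) is a hypothetical stand-in, and it assumes a finite input and a linear chain of bolts, whereas a real Storm topology is a graph of components running in parallel across a cluster.

```python
from collections import Counter
from typing import Callable, Iterable, Iterator, List

# Toy model only: these names are illustrative and are NOT part of Storm.
# A "stream" is modelled as a Python iterator of tuples.
Stream = Iterable[tuple]


def sentence_spout() -> Iterator[tuple]:
    """Spout: the source of a stream. Here it emits a small, fixed input."""
    for sentence in ("the cow jumped", "the moon rose"):
        yield (sentence,)


def split_bolt(stream: Stream) -> Iterator[tuple]:
    """Bolt: consumes a stream of sentences and emits one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)


def count_bolt(stream: Stream) -> Iterator[tuple]:
    """Bolt: keeps a running count and emits (word, count) on every update."""
    counts: Counter = Counter()
    for (word,) in stream:
        counts[word] += 1
        yield (word, counts[word])


def run_topology(
    spout: Callable[[], Stream],
    bolts: List[Callable[[Stream], Stream]],
) -> List[tuple]:
    """Topology: a spout wired to a chain of bolts (a linear chain is assumed)."""
    stream = spout()
    for bolt in bolts:
        stream = bolt(stream)
    return list(stream)


results = run_topology(sentence_spout, [split_bolt, count_bolt])
for word, count in results:
    print(word, count)
```

Each stage only sees the tuples emitted by the stage before it, which is the property that lets a system like Storm spread spouts and bolts over many machines.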