spark

Philly ETE 2015 – Helena Edelson – Streaming Big Data with Spark, Spark Streaming, Kafka, Cassandra and Akka

This talk presents Apache Spark, Spark Streaming, Apache Kafka, Apache Cassandra and Akka as technologies that support the Lambda Architecture, in the context of a fault-tolerant, streaming big data pipeline. We will walk through the fault-tolerance story of building applications with these technologies, and show how to implement and integrate them in a Scala and Akka application for real-time delivery of meaning at high velocity, in highly distributed and concurrent environments.

Philly ETE #36 – Why Spark Is the Next Top (Compute) Model – Dean Wampler

From the abstract: Spark is an open-source computation platform for Big Data. Leaders in the Hadoop community, such as Cloudera, have embraced Spark as a replacement for MapReduce, the venerable standard for writing Hadoop jobs. This talk explores why this change is needed. Spark provides two important benefits compared to MapReduce. First, its performance is …

Philly ETE 2014 – Dean Wampler – Why Spark Is the Next Top (Compute) Model

Spark provides two important benefits compared to MapReduce. First, its performance is significantly better than that of MapReduce. We’ll discuss why. Second, because Spark is implemented in Scala and rooted in the world of functional programming, it provides better, more composable primitives that make it easier for developers to create a wide variety of high-performance applications. We’ll discuss these primitives and look at some example applications.