big data

Philly ETE 2014 – Dean Wampler – Why Spark Is the Next Top (Compute) Model

Spark provides two important benefits compared to MapReduce. First, its performance is significantly better; we’ll discuss why. Second, because Spark is implemented in Scala and rooted in the world of functional programming, it provides better, more composable primitives that make it easier for developers to create a wide variety of high-performance applications. We’ll discuss these primitives and look at some example applications.

The DataPhilly Meetups

Sujan Kapadia writes: “This year I’ve started going to the DataPhilly meetups, and I think I’m hooked. The bottom line is DataPhilly talks are very intriguing, expose you to topics you don’t encounter every day, and give you the chance to meet ‘non-traditional’ developers (scientists and statisticians), whose ranks are rapidly growing.”

DevNews #67 – Monoliths begone, lock free APIs, Bunnies and RabbitMQ, and computer viruses by air

Links: "Five tips for big software projects":http://blog.chariotsolutions.com/2013/10/5-tips-for-big-software-projects.html and "Dismantling the monoliths":https://engineering.groupon.com/2013/misc/i-tier-dismantling-the-monoliths/ (Rails apps converting to Node.js at Groupon). I’m taking a stab at lock-free this week. First, my reading took me to Mechanical Sympathy (which we’ve discussed before), and now that there is a JSR for some new constructions (StampedLock), this site has …

PhillyETE Screencast #31 – Going Big with Big Data – One Step at a Time – Anita Garimella Andrews

From the abstract: “Big Data is almost scary nowadays. Some small, young companies are so advanced in their use of data – but their datasets are small, so statistical validity constantly comes up. Some Fortune 100 companies haven’t even started. And other large companies have such a morass of badly integrated, inaccurate or unused data …

Chariot DevNews Episode #48 – Big Data all over the place

It’s the big return of the regular DevNews this week. My co-host Joel Confino and I discuss lots of big data stuff, including: “They hype it, then they try to kill it”; why Big Data is not truth – just using Big Data techniques doesn’t make it easy to select good data to begin …

Druid: Real-time Queries Meet Real-time Data

This talk covers the design considerations and architecture of Druid, an open-source, distributed, column-oriented analytical data store, with particular attention to how Druid ingests data in real time on the write side and provides real-time access to that data on the read side.