big data

Twenty Years of Big Data

More, cheaper, faster: our own Keith Gregory recounts the changes in big data, data storage, and data engineering over the last two decades.

Rightsizing Data for Athena

Amazon Athena is a service that lets you run SQL queries against structured data files stored in S3. It takes a “divide and conquer” approach, spinning up parallel query execution engines that each examine only a portion of your data. The performance of these queries, however, depends on how you consolidate and partition your data. In this post I compare query times for a moderately large dataset, looking for the “sweet spot” between the number of files and individual file size.

Philly ETE 2020 – Dan Pilone – Looking over the edge: Bridging the gaps between geospatial data, cloud computing, and local disaster response organizations

Check out our YouTube playlist to watch all the talks from Emerging Technologies for the Enterprise 2020. Abstract: In this talk we look at the challenges of making geospatial data accessible and rapidly consumable in disaster response scenarios. The wide variety and large volume of commercial and public data available in AWS coupled with scalable …

That’s not a Data Lake, THIS is a Data Lake – IoT on AWS – A Philly Cloud Computing Event

This talk will review two common use cases for captured metric data: 1) real-time analysis, visualization, and quality assurance, and 2) ad-hoc analysis. The most common open source streaming options will be mentioned; however, this talk will be concerned with Apache Flink specifically. A brief discussion of Apache Beam will also be included in the context of the larger discussion of a unified data processing model.

PHLAI – Comcast's Artificial Intelligence Conference

I was lucky enough last week to attend PHLAI, a Comcast-sponsored conference on machine learning and artificial intelligence. The dreary weather did not dampen our spirits as practitioners and business stakeholders met to discuss one of the most important trends in our lifetime.