big data

Small Data: a pipeline for low-latency decision support

In my last post, I said that I didn’t think Postgres was a good choice for a decision support database, versus a task-specific DBMS such as Redshift. In this post I’m going to take the opposite stand, and say that there are cases where Postgres is appropriate: namely, low-latency systems that contain a limited amount of data.

TechChat Tuesdays #65: Redshift Execution Plans with Keith Gregory

In this week’s TechChat, we welcome Keith Gregory, our Cloud & Data Engineering Practice Lead here at Chariot. Keith is a prolific writer, both on the Chariot blog and on his own, and is a wealth of knowledge on all things AWS. We touch on Redshift execution plans, how to appropriately size Redshift … Read More

A Deep Dive on Redshift Execution Plans

In this post I walk through several execution plans, explain what Redshift is doing in each, and highlight the parts of plans that indicate problems.

Twenty Years of Big Data

More, cheaper, faster: our own Keith Gregory recounts the changes in big data, data storage, and data engineering over the last two decades.

Craft a Data Strategy “Mindset” for 2022

A well-designed data strategy is critical to success. Here are 3 philosophies to help you design an optimal data strategy for your business.

Rightsizing Data for Athena

Amazon Athena is a service that lets you run SQL queries against structured data files stored in S3. It takes a “divide and conquer” approach, spinning up parallel query execution engines that each examine only a portion of your data. The performance of these queries, however, depends on how you consolidate and partition your data. In this post I compare query times for a moderately large dataset, looking for the “sweet spot” between number of files and individual file size.
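As a rough illustration of the consolidation side of that trade-off, here is a minimal sketch that plans how many small files would be merged into each larger output file. The function name `plan_consolidation` and the 128 MiB target are assumptions made for the example, not figures from the post; the post's measurements are what identify the actual sweet spot.

```python
from typing import List

MIB = 1024 * 1024


def plan_consolidation(file_sizes: List[int], target_bytes: int) -> List[List[int]]:
    """Greedily group input file sizes into batches of at most target_bytes.

    Hypothetical helper: each returned batch stands for one consolidated
    output file. A single input larger than the target gets its own batch.
    """
    batches: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for size in file_sizes:
        # Close the current batch if adding this file would exceed the target.
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        batches.append(current)
    return batches


# Assumed workload: 100 files of 8 MiB each, merged toward 128 MiB outputs.
small_files = [8 * MIB] * 100
batches = plan_consolidation(small_files, target_bytes=128 * MIB)
print(f"{len(small_files)} input files -> {len(batches)} consolidated files")
```

Fewer, larger files mean fewer objects for the query engines to open, while too few files limit how much of the work can run in parallel. The target size is the knob you would vary when running a comparison like the one in the post.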

15 Minutes With: Leslie Richards on the Extensive Role of Data at SEPTA

Our CMO Tracey Welson-Rossman sits down with Leslie Richards, the General Manager of SEPTA, to discuss the extensive role of data in public transit.

15 Minutes With: Lanaya Nelson on Big Data and Auto Insurance

In this interview, Lanaya Nelson from Motion Insurance discusses how harnessing drivers’ telematics and GPS data is disrupting the auto industry.

Philly ETE 2020 – Matthew Hanson – Open Standards and Open Software for Geospatial Imagery

Check out our YouTube playlist to watch all the talks from Emerging Technologies for the Enterprise 2020. Abstract: As massive amounts of new geospatial data are collected, it is increasingly challenging to search and find data of interest. Upcoming NASA missions, such as NISAR and SWOT, will be generating tens of terabytes a day, … Read More

Philly ETE 2020 – Dan Pilone – Looking over the edge: Bridging the gaps between geospatial data, cloud computing, and local disaster response organizations

Check out our YouTube playlist to watch all the talks from Emerging Technologies for the Enterprise 2020. Abstract: In this talk we look at the challenges of making geospatial data accessible and rapidly consumable in disaster response scenarios. The wide variety and large volume of commercial and public data available in AWS coupled with scalable … Read More