15 Minutes With: Andrew Ganim on Building a Data Pipeline

In today’s episode of 15 Minutes With, Keith Gregory, Chariot’s AWS Practice Lead, talks to Andrew Ganim, one of Chariot’s experienced software consultants.   

Andrew’s most recent project was to help a multinational company better analyze its data by building a more robust data pipeline. He was brought in to clean up both the existing pipeline code and the incoming data. The project involved restructuring, joining, and aggregating transactional data (purchases) and behavioral data (clicks, add-to-carts, etc.) from both brick-and-mortar stores and online sales.

Handling terabyte-scale data, international sales channels, data privacy laws from multiple countries, and the many ways to represent retail transactions (such as clearances and discounts) is extremely complex. So how did Andrew account for edge cases without leaving behind messy code? How did he partition all this data? What tools did he use?

Andrew and Keith discuss.

Is there anything you’d like us to discuss in this series? Email info@chariotsolutions.com, or leave a thought below.