
IoT on AWS – Coping with Aging (Data)

Data has different purposes over time: when fresh, it can be used for real-time decision-making; as it ages, it becomes useful for analytics; eventually, it becomes a record, useful or perhaps not. Each of these stages requires a different approach to storage and management, and this talk looks at appropriate ways to work with your data at the different stages of its life.
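
On AWS, this kind of staging is often expressed as an S3 lifecycle policy that moves objects into cheaper storage classes as they age. The snippet below is a minimal sketch using boto3; the bucket name, prefix, and day thresholds are illustrative assumptions, not details from the talk.

    import boto3

    # Sketch: transition aging telemetry to cheaper storage classes, matching the
    # stages described above (fresh -> analytics -> record). Bucket name, prefix,
    # and thresholds are hypothetical.
    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket="example-iot-telemetry",            # hypothetical bucket
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "age-out-telemetry",
                    "Filter": {"Prefix": "telemetry/"},
                    "Status": "Enabled",
                    "Transitions": [
                        {"Days": 30, "StorageClass": "STANDARD_IA"},  # analytics tier
                        {"Days": 365, "StorageClass": "GLACIER"},     # archival record
                    ],
                }
            ]
        },
    )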

By Keith Gregory, AWS Practice Lead at Chariot Solutions

IoT on AWS – That’s Not A Data Lake…

This talk will review two common use cases for captured metric data: 1) real-time analysis, visualization, and quality assurance, and 2) ad-hoc analysis. To support these use cases, metric data must be ingested properly using a robust, fault-tolerant streaming framework. The most common open-source streaming options will be mentioned; however, this talk will focus on Apache Flink specifically. A brief discussion of Apache Beam will also be included as part of the larger discussion of a unified data processing model.
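
As a rough illustration of that kind of ingestion pipeline, here is a minimal PyFlink sketch; the sensor tuples, filter bounds, and job name are illustrative assumptions, and a real deployment would read from a source such as Kinesis or Kafka rather than a static collection.

    from pyflink.datastream import StreamExecutionEnvironment

    # Sketch: ingest a handful of in-memory sensor readings, drop implausible
    # values (a stand-in for quality assurance), and print the result.
    env = StreamExecutionEnvironment.get_execution_environment()
    env.set_parallelism(1)

    readings = env.from_collection([
        ("sensor-1", 21.5),
        ("sensor-2", 998.0),   # implausible reading, filtered out below
        ("sensor-3", 19.8),
    ])

    readings \
        .filter(lambda r: -40.0 < r[1] < 85.0) \
        .map(lambda r: f"{r[0]}: {r[1]}") \
        .print()

    env.execute("metric-ingest-sketch")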

Best practices around data persistence will be discussed, with the aim of eliminating confusion about the format data should take when it is ‘at rest’. Different serialization formats will be compared in the context of the most typical analysis use cases. Finally, fully managed solutions such as AWS Data Lake will be mentioned briefly, along with their relative advantages and disadvantages.
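
To make the serialization question concrete, the sketch below writes the same small metric set as row-oriented JSON Lines and as columnar Parquet using pandas; the column names and file paths are illustrative assumptions, not formats prescribed by the talk.

    import pandas as pd

    # Sketch: the same data in two common at-rest formats. Columnar formats such
    # as Parquet generally compress better and let analytic engines read only the
    # columns a query touches; JSON Lines stays human-readable but is verbose.
    df = pd.DataFrame({
        "device_id": ["sensor-1", "sensor-2", "sensor-3"],
        "timestamp": pd.to_datetime(["2023-01-01", "2023-01-01", "2023-01-01"]),
        "temperature": [21.5, 19.8, 22.1],
    })

    df.to_json("metrics.jsonl", orient="records", lines=True)  # row-oriented, verbose
    df.to_parquet("metrics.parquet", compression="snappy")     # columnar, compact (needs pyarrow)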

By Eric Snyder, Software Architect at Chariot Solutions