tips and tricks

Aggregating Files in your Data Lake – Part 2

When I ran the Lambda from my previous post against Chariot’s CloudTrail repository, it took almost four minutes to process a single day’s worth of data. That seems like a long time, and as a developer I want to optimize everything I write. In this post I analyze the current runtime and look at options for improving it.

Client-side Data Persistence with IndexedDB

A Deep Dive into IndexedDB: In a previous article, I compared client-side storage solutions (localStorage, sessionStorage, and cookies) and touched briefly on IndexedDB. In the vast ecosystem of web storage solutions, IndexedDB stands out as a powerful, low-level API for client-side storage of significant amounts of structured data. While cookies, localStorage, and sessionStorage are suited for … Read More

Client-side Data Persistence for Web Applications

In the realm of web development, ensuring data persistence is a top priority. Web developers face the challenge of storing data seamlessly and effectively, often juggling various storage solutions to meet specific needs. One of the leading, well-supported options is IndexedDB, a powerful client-side database solution. Though sessionStorage, localStorage, and cookies have their places, … Read More

Electron, not a walk in the park

Recently, a project I worked on was considering using Electron as a fallback technology for an initial Progressive Web Application. At the time, the assumption was that since Electron uses Chromium, a browser, it should allow application developers to not only use the features of a PWA but also gain native access to technologies, such … Read More

How to run Apple OS X Sonoma Developer Beta on UTM from OS X Ventura

If you want to run OS X Sonoma, but can’t dedicate a computer to it, you could always install it on the UTM virtual machine engine. This allows you to test out beta OS X features without taking over your primary machine’s OS. Pre-requisites: a Mac with Apple Silicon running OS X Ventura; an Apple … Read More

Beyond the Bastion: Connecting to Your Resources in AWS

In a perfect world, there would never be a need to connect to your resources running on AWS. In the real world, it’s sometimes necessary to get your hands dirty and look at what’s happening on the actual machine, especially during development. This post dives into a few ways to connect your workstation to resources running inside a VPC. It started out as a how-to for using bastion hosts, but quickly expanded to look beyond the bastion.

Unbalanced Data in Redshift

Decision support databases have a number of quirks that are not obvious to the casual user, particularly someone coming from an OLTP background. In this post I look at how unbalanced distributions can impact your query performance, how you can identify imbalances, and what you can do to fix them.

Analyzing Glue Jobs with AWS X-Ray

It’s possible to analyze your Glue jobs using just the logs they produce. Possible. But it’s not a pleasant task: your log messages are buried in messages from the framework, and in the case of a distributed PySpark job they’ll be spread amongst multiple CloudWatch log streams. In this post I look at an alternative: AWS X-Ray, which captures and aggregates “trace segments” that monitor specific sections of your code. With X-Ray, you can easily see where your jobs are spending their time, and compare different runs.

Limiting Cross-stack References in CDK

Several years ago I wrote CloudFormation Tips and Tricks, in which I gave the advice to “use outputs lavishly, exports sparingly.” The reason is that when you export a value from one stack and import it into another you bind those stacks tightly together, and can’t change that exported value. For example, you might create … Read More