tips and tricks

How to run Apple OS X Sonoma Developer Beta on UTM from OS X Ventura

If you want to run OS X Sonoma, but can’t dedicate a computer to it, you could always install it on the UTM virtual machine engine. This allows you to test out Beta OS X features without taking over your primary machine’s OS. Pre-requisites: A Mac with Apple Silicon running OS X Ventura An Apple … Read More

Beyond the Bastion: Connecting to Your Resources in AWS

In a perfect world, there would never be a need to connect to your resources running on AWS. In the real world, it’s sometimes necessary to get your hands dirty and look at what’s happening on the actual machine, especially during development. This post dives into a few ways to connect your workstation to resources running inside a VPC. It started out as a how-to for using bastion hosts, but quickly expanded to look beyond the bastion.

Unbalanced Data in Redshift

Decision support databases have a number of quirks that are not obvious to the casual user, particularly someone coming from an OLTP background. In this post I look at how unbalanced distributions can impact your query performance, how you can identify imbalances, and what you can do to fix them.

Analyzing Glue Jobs with AWS X-Ray

It’s possible to analyze your Glue jobs using just the logs they produce. Possible. But it’s not a pleasant task: your log messages are buried in messages from the framework, and in the case of a distributed PySpark job they’ll be spread amongst multiple CloudWatch log streams. In this post I look at an alternative: AWS X-Ray, which captures and aggregates “trace segments” that monitor specific sections of your code. With X-Ray, you can easily see where your jobs are spending their time, and compare different runs.

Limiting Cross-stack References in CDK

Several years ago I wrote CloudFormation Tips and Tricks, in which I gave the advice to “use outputs lavishly, exports sparingly.” The reason is that when you export a value from one stack and import it into another you bind those stacks tightly together, and can’t change that exported value. For example, you might create … Read More

Rightsizing Data for Athena

Amazon Athena is a service that lets you run SQL queries against structured data files stored in S3. It takes a “divide and conquer” approach, spinning up parallel query execution engines that each examine only a portion of your data. The performance of these queries, however, depends on how you consolidate and partition your data. In this post I compare query times for a moderately large dataset, looking for the “sweet spot” between number of files and individual file size.

Three Approaches to Deploying Lambdas

“Traditional” deployment patterns separate the application from its infrastructure. Lambda deployments turn this model on its head, binding the infrastructure tightly to the running code. This can be a challenge, especially when developing in a team: it is all too easy for one developer to accidentally overwrite another’s work. In this post I look at several deployment options, and how they impact a development team.

Android background processing – a decision guide

If you’ve ever done anything in Android that takes more than 16 milliseconds to process then you’ve probably heard of Android services and background processing in general. Why 16 milliseconds? At Android’s refresh rate of 60 frames per second, the main thread can not be delayed by more than 16 milliseconds to avoid frame skipping … Read More