tips and tricks

Unbalanced Data in Redshift

Decision support databases have a number of quirks that are not obvious to the casual user, particularly someone coming from an OLTP background. In this post I look at how unbalanced distributions can impact your query performance, how you can identify imbalances, and what you can do to fix them.

Analyzing Glue Jobs with AWS X-Ray

It’s possible to analyze your Glue jobs using just the logs they produce. Possible. But it’s not a pleasant task: your log messages are buried in messages from the framework, and in the case of a distributed PySpark job they’ll be spread amongst multiple CloudWatch log streams. In this post I look at an alternative: AWS X-Ray, which captures and aggregates “trace segments” that monitor specific sections of your code. With X-Ray, you can easily see where your jobs are spending their time, and compare different runs.

Limiting Cross-stack References in CDK

Several years ago I wrote CloudFormation Tips and Tricks, in which I gave the advice to “use outputs lavishly, exports sparingly.” The reason is that when you export a value from one stack and import it into another you bind those stacks tightly together, and can’t change that exported value. For example, you might create … Read More

Rightsizing Data for Athena

Amazon Athena is a service that lets you run SQL queries against structured data files stored in S3. It takes a “divide and conquer” approach, spinning up parallel query execution engines that each examine only a portion of your data. The performance of these queries, however, depends on how you consolidate and partition your data. In this post I compare query times for a moderately large dataset, looking for the “sweet spot” between number of files and individual file size.

Three Approaches to Deploying Lambdas

“Traditional” deployment patterns separate the application from its infrastructure. Lambda deployments turn this model on its head, binding the infrastructure tightly to the running code. This can be a challenge, especially when developing in a team: it is all too easy for one developer to accidentally overwrite another’s work. In this post I look at several deployment options, and how they impact a development team.

Android background processing – a decision guide

If you’ve ever done anything in Android that takes more than 16 milliseconds to process then you’ve probably heard of Android services and background processing in general. Why 16 milliseconds? At Android’s refresh rate of 60 frames per second, the main thread can not be delayed by more than 16 milliseconds to avoid frame skipping … Read More

Hunting the Wolf: App UX and Database Review

The Wolf Golf Scorecard app for Android, by Rod Biresch of Chariot Solutions, is a record-keeping application for a classic four-player golf game. Recently, Chariot had the opportunity to redesign and refresh the look-and-feel and usability of the application (first released in 2016) through a complete design deep-dive process.