Blog

Strategies for Addressing Tech Debt

Tracey Welson-Rossman | April 11, 2024

Ignoring tech debt because it’s expensive doesn’t make the problem go away — it only kicks the can down the road. Businesses should approach technical debt as a routine, scheduled part of their development process.

Technology Trends in Gaming & Sports Betting

Michael Rappaport | April 3, 2024

One of the more remarkable turnarounds in recent American culture has been the embrace of sports betting and online gaming. What was once considered a vice and relegated to relatively few in-person jurisdictions has quickly exploded into the mainstream and online. As with any growth industry, this has had cascading consequences for others that work in the sector. This includes technology providers and partners that have had to rush to address the many challenges of such meteoric growth within a…

Perils of Partitioning

Keith Gregory | March 22, 2024

Partitioning is one of the easiest ways to improve the performance of your data lake, because it reduces the amount of data scanned. But implementing partitions can be surprisingly challenging, as can their effective use. In this post I look at several of the issues that you should consider when partitioning your data.

Large Language Model (LLM) Coding Assistance

John Shepard | March 19, 2024 | May 30, 2024

Note: It has been about three months since this was originally written, so some of the information is out of date. See the addendum for updated information. With all the hype surrounding Generative AI/LLM, and all the hallucinations mentioned in the news, what are these actually good for? As it turns out, LLMs trained for code generation are helpful. But what if you don’t want your code going to some cloud provider? The following is a…

Transforming Data with Amazon Athena

Keith Gregory | March 15, 2024

My prior posts used Lambda to do data transformation. But what if we could use a non-programmatic tool, in keeping with the Extract-Load-Transform mindset of the modern data pipeline? As it turns out, we can: Amazon Athena can write data as well as query it. There are, of course, a few stumbles along the way. In this blog post I walk through the process of aggregating CloudTrail data using SQL.

From RAGs to Riches – Adding Context to Your LLM

Steve Wood | March 8, 2024 | March 7, 2024

In my previous post, Experiences in Fine-Tuning LLMs: Time + Power = Potato?, I covered my experiences around trying to fine-tune an LLM (large language model) with a dataset, which gave me less-than-stellar results. Ultimately, fine-tuning is best for a use case where additional reasoning and logic need to be added to an LLM, but it’s subpar for adding information. However, if you’re trying to get an LLM to answer questions using your data, then retrieval augmented generation (RAG)…

Aggregating Files in your Data Lake – Part 3

Keith Gregory | February 29, 2024 | March 27, 2024

In this final part of a three-part series, I add another aggregation step to combine a month’s worth of data and write it as Parquet.

Experiences in Fine-Tuning LLMs: Time + Power = Potato?

Steve Wood | February 27, 2024 | February 26, 2024

Embarking on the journey to fine-tune large language models (LLMs) can often feel like setting sail into uncharted waters, armed with hope and a map of best practices. Yet, despite meticulous planning and execution, the quest for improved performance doesn’t always lead to the treasure trove of success one might anticipate. And I know you may be wondering how potatoes come into play here, but I promise that we’ll get to it. From the challenges of data scarcity to resource…

Apple Silicon GPUs, Docker and Ollama: Pick two.

Ken Rimple | February 26, 2024 | April 3, 2024

If you’ve tried to use Ollama with Docker on an Apple GPU lately, you might have found that the GPU is not supported. But you can get Ollama to run with GPU support on a Mac. This article will explain the problem, how to detect it, and how to get your Ollama workflow running with all of your VRAM (which, on a Mac, is your DRAM too)!

Getting started with LLM in the Cloud with Amazon DLAMI EC2 Instances

Ken Rimple | February 21, 2024 | April 3, 2024

So you want to execute some custom CUDA-based AI processing on a GPU, but don’t have the hardware? Have an AWS account? Try using the DLAMI machine instances. This article explains how to get started if you need OS-level access.

