Experiences in Fine-Tuning LLMs: Time + Power = Potato?

Embarking on the journey to fine-tune large language models (LLMs) can often feel like setting sail into uncharted waters, armed with hope and a map of best practices. Yet, despite meticulous planning and execution, the quest for improved performance doesn’t always lead to the treasure trove of success one might anticipate. And I know you … Read More

Apple Silicon GPUs, Docker and Ollama: Pick two.

As part of our research on LLMs, we started working on a chatbot project using RAG, Ollama and Mistral. Our developer hardware varied between Macbook Pros (M1 chip, our developer machines) and one Windows machine with a "Superbad" GPU running WSL2 and Docker on WSL. All hail the desktop with the big GPU. We planned … Read More

Getting started with LLM in the Cloud with Amazon DLAMI EC2 Instances

Large Language Model (LLM) chatbots like ChatGPT are all the rage these days. You may be experimenting with building one of your own using a model runtime engine like Ollama, possibly accessing it with the LangChain API, maybe integrating it with a Vector Database for your custom data and using Retrieval Augmented Generation (RAG), or … Read More

Using the JetBrains AI Assistant from WebStorm

Generative AI in WebStorm For those of you who don’t know me, I’ve been working as a developer since the 1990s, and so have experienced a ton of different technologies, APIs, and languages. That said, my memory isn’t perfect, and I don’t remember deep details of every library or tool I’ve used, even recently! (Context … Read More