Runtipi is an open-source personal homeserver orchestrator designed to make self-hosting easy for everyone. It runs on top of Docker and provides a friendly web interface so you can install and manage multiple services on a single server without wrestling with complex configurations or networking setup. The philosophy behind Runtipi is to democratize self-hosting – even beginners can spin up their own apps and services with minimal technical hassle. Instead of manually writing Docker Compose files or managing reverse proxies, Runtipi handles those details for you out of the box. You focus on what to host, and Tipi takes care of Docker containers, Traefik proxy, SSL certificates, and service orchestration behind the scenes.
Why use Runtipi? For a weekend tech warrior or intermediate user, Runtipi hits the sweet spot between power and simplicity. It comes with an extensive built-in app library (an App Store of sorts) featuring nearly 300 popular self-hosted applications – from media servers and home automation to developer tools and AI apps. These can be installed in one click via the web UI, with sane default configs and auto-discovery on your network. No need to manually configure Nginx or port-forward every new app – Runtipi’s integrated Traefik reverse proxy automatically routes traffic and can obtain HTTPS certificates for your apps using Let’s Encrypt. In short, Runtipi provides a “home cloud” experience: all your self-hosted services neatly managed in one place, with minimal fuss.
Runtipi’s web dashboard makes it easy to manage your self-hosted apps. In this example, several services (Nextcloud, Portainer, VaultWarden, etc.) have been installed and can be controlled from the My Apps screen. New apps can be added via the App Store with a single click.
## **Features at a glance:**
Quick Setup: Install Runtipi with a single shell command, and it bootstraps itself on any modern 64-bit Linux server. Within minutes, you’ll have a running web UI and all necessary backend components (Docker, database, proxy) configured for you.
User-Friendly Dashboard: A clean web interface lets you start/stop apps, adjust settings, view logs, and manage updates or backups without digging into the command line. You get an at-a-glance view of system status and installed services.
Massive App Library: The built-in app store offers hundreds of pre-packaged services ready to deploy. Want a Nextcloud file server, a PhotoPrism gallery, or an AI chatbot? Just find it in the list and hit install – Runtipi will pull the Docker images and set up the containers automatically.
Safe and Modular: Each app is deployed as its own Docker Compose project on a shared network. They’re isolated for security but can talk to each other if needed. Runtipi itself uses modern, scalable components (NestJS server, PostgreSQL, RabbitMQ, Traefik) under the hood, but you don’t need to manage those directly. Everything runs on your server (or even a Raspberry Pi), keeping you in control of your data.
Backup and Update Friendly: Runtipi includes features to back up app data and update apps with one click, helping protect your data when upgrading containers. It’s designed so that even if you decide to move away, you could still run the same Docker containers manually – the configurations are standard Docker Compose format, just managed for you.
In summary, Runtipi aims to give you “home server management made easy”. It’s not a full Linux distro or OS; rather, it’s a layer on top of your OS that greatly simplifies running self-hosted apps. This was a perfect starting point for our self-hosting adventure. We set out to install Runtipi on our server and see how quickly we could get various services (including some AI tools) up and running.
## **Installing and Setting Up Runtipi**
Before installing, you’ll need a suitable host machine. Runtipi works on any 64-bit Linux (Ubuntu 18.04 or newer is recommended) and supports both x86_64/amd64 and arm64 architectures. That means you can install it on a home lab PC, a virtual private server, or devices like a Raspberry Pi 4 (which is arm64) – though note that not all apps in the library have arm64 Docker images, so Runtipi will hide unsupported apps if you’re on, say, a Pi. The hardware requirements are modest: at least a 64-bit CPU, 4 GB RAM, and ~10 GB free disk for a basic setup. Of course, heavier apps (databases, AI models, etc.) will benefit from more RAM and CPU. We used a machine with 8 GB RAM and an SSD, which is around the recommended spec for comfortable performance.
**Step 1:** Install Docker (if not already). Runtipi itself will install Docker for you if it’s missing. If you already have Docker, ensure the Compose plugin is installed (docker compose version should return a result). Also ensure your Docker is a recent version; outdated Docker can cause issues.
**Step 2:** Run the Runtipi installer script. The project provides a convenient install script accessible via cURL. On your Linux server, simply run:
curl -L https://setup.runtipi.io | bash
This one-liner downloads and executes the installer, which pulls the Runtipi Docker image and sets up the necessary configuration and system service. In case this direct script doesn’t work (due to cURL issues or firewall), they also offer a GitHub raw link alternative. After a few moments, the script should complete, indicating Runtipi is installed.
**Step 3:** Access the Web Dashboard. Once installed, Runtipi will be running its web UI (and Traefik proxy) on your server. By default, it may be reachable at http://tipi.local on your local network (mDNS name) or simply via the server’s IP address (e.g., http://192.168.x.x). The first time, you’ll set an admin login (or use a default provided and change it). After logging in, you’ll see the Dashboard and App Store sections. From here, you can start installing apps with a couple of clicks. For example, to install Nextcloud: you’d find Nextcloud in the App Store list, click Install, provide any required parameters (like setting an admin password for Nextcloud itself), and then Runtipi will pull the Docker image and deploy the container. In a minute or two, Nextcloud will appear in your My Apps list and you can open its web interface right from Runtipi’s dashboard. It’s that straightforward to get services running. 🚀
Using the CLI (optional): Runtipi also comes with a command-line interface runtipi-cli for advanced management tasks. Most day-to-day actions (start/stop services, installing apps) are easier through the web UI, especially for beginners. But the CLI is handy for scripting or troubleshooting. For instance, running sudo ./runtipi-cli installed will list all installed apps via the terminal. You can also update Runtipi itself via CLI (sudo ./runtipi-cli update latest to upgrade to the newest version), or reset the admin password if you get locked out (./runtipi-cli reset-password). The CLI mirrors many functions of the UI, so it’s nice for those who prefer or require shell access, but it’s entirely optional for normal use.
Security Note (if using a VPS): If you install Runtipi on a cloud VPS that has a public IP, be aware that out-of-the-box it will expose the dashboard and any installed apps to the internet, which could be risky. The documentation strongly advises locking down access – e.g. enable a firewall to block all external access except through a VPN, or use Cloudflare Tunnels for remote access. In our self-hosting journey, we stuck to a home network environment (accessible only within our private LAN and via VPN), which is the safer approach for newcomers. If you do go the VPS route, make sure to follow best practices: e.g., using Tailscale or WireGuard to create a secure private network for your services, and never exposing raw ports to the open internet without protections.
## Deploying Self-Hosted Apps with Runtipi
With Runtipi up and running, the fun part is exploring and deploying applications. The App Store (accessible via the dashboard’s top menu) contains hundreds of pre-configured services sorted by category and name. You can browse or search by name or category tags (like “Media”, “Security”, “AI”, etc.). Each app entry has a short description and often some tags indicating its type (for example, Nextcloud might be tagged “Productivity”, VaultWarden as “Security”, etc.).
Installing an app is typically as simple as clicking the Install button and filling out any basic configuration the app might need. Runtipi will download the necessary Docker image and set up a container (or multiple containers if the app requires, say, a database). It uses a declarative JSON format for app definitions behind the scenes, but you normally don’t have to interact with that directly – just use the UI. The app will then appear on your My Apps page, showing its status (running/stopped) and offering quick actions. You can usually click the app’s name to open its web interface via a friendly URL. By default, Runtipi uses Traefik to give each service a URL like http://appname.tipi (or a subdomain of your chosen domain if you set one up) and manages internal routing. If you configured a domain name for your Tipi instance, it can even auto-generate valid HTTPS certificates for each app via Let’s Encrypt – no manual certificate handling needed.
For example, after installing Nextcloud, we were able to navigate to our server’s nextcloud URL and complete Nextcloud’s setup (creating users, etc.) entirely through the browser. Runtipi had already taken care of launching the Nextcloud container and a database container it depends on, wiring them together on the Docker network, and exposing the service on the proper port with Traefik. Similarly, we installed VaultWarden (a self-hosted password manager) and it was up and running in seconds, accessible via a web URL with HTTPS, all automatically configured by Runtipi.
Most apps require little to no configuration to get started. However, Runtipi does allow customization if needed. Each app’s configuration (its Docker Compose spec, environment variables, volumes, etc.) can be overridden by the user via override files or the Custom Settings in the UI. This ensures that even when Runtipi updates an app definition, your custom changes persist. In our experience, the defaults usually work out-of-the-box, but advanced users can tweak things like attaching an external storage volume (for example, point Nextcloud’s data directory to an external drive) by editing the app’s config. This might involve SSHing into the server and editing a JSON or compose snippet in the Tipi config folder – a bit advanced, but doable. (The documentation has a guide on how to customize app configs and compose files if needed.)
One great aspect of Runtipi is that it is extensible. If an app you want isn’t in the official library, you aren’t entirely out of luck. You have a couple of options: you can manually run a Docker container alongside Runtipi (traditional way), or you can leverage Runtipi’s ability to add custom apps. The project provides a way to create your own app store or add community app stores. Essentially, you can write a JSON definition for your desired app (or use their online Docker-Compose-to-JSON converter to generate it from a standard docker-compose.yml). Then you can use the runtipi-cli appstore add command to include that definition. In our journey, we tried this for an app that wasn’t in the library, and it took only a few minutes to get it running via Tipi’s management. The Tipi forums and Discord are active with community members sharing custom app definitions, so over time the library is growing. (For beginners, this might be beyond the scope initially, but it’s nice to know that as you get more comfortable, you’re not limited to the built-in catalog – you can run anything in a container via Runtipi.)
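To give a flavor of the idea, a standard compose file can be dumped as JSON with a couple of lines of Python as a starting point. To be clear, this is not Runtipi’s official converter, and the real app definition schema adds metadata (form fields, exposed ports, app info) that you’d fill in afterwards following the docs:

```python
# Rough sketch: dump a docker-compose.yml as JSON to use as the starting point
# for a custom app definition. This is NOT Runtipi's schema -- the official
# converter and docs describe the extra fields (metadata, form inputs, ports).
import json
import yaml  # pip install pyyaml

with open("docker-compose.yml") as f:
    compose = yaml.safe_load(f)

print(json.dumps(compose, indent=2))
```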
So far, we’ve successfully self-hosted a variety of services using Runtipi: from media servers like Jellyfin to dev tools like Gitea and utility apps like Uptime Kuma. The ease of use has been a game-changer. Instead of spending an entire weekend configuring Docker and troubleshooting, we got these services running in minutes and could spend our time actually using them. This set the stage for the next part of our experiment – bringing AI into the mix on our self-hosted platform.
## Self-Hosting AI: Running Local LLMs on Your Server
One of our goals was to self-host an AI assistant – basically running a large language model (LLM) locally on our own hardware, rather than relying on cloud APIs. There are a few reasons you might want to do this as a self-hoster: you get privacy (your data never leaves your server), you avoid API costs or limits, and you have full control over the AI model’s behavior and uptime. Thanks to some recent open-source projects, running a chatty LLM on consumer hardware is now quite feasible. We explored two popular tools for local LLM inference: LocalAI and Ollama. Each provides a way to run language models and expose them via an API (so they can be used similarly to OpenAI’s GPT API, but all local). We’ll go through both setups and our experiences integrating them into the homelab.
## Option 1: LocalAI – Open-Source OpenAI Alternative
LocalAI is a free, open-source project that acts as a nearly drop-in replacement for the OpenAI API, but using local models. In other words, it provides a RESTful API on your server that speaks the same protocol as OpenAI’s ChatGPT API, except the actual inference is done by open models (like LLaMA, GPT4All, etc.) running on your hardware. LocalAI was originally a weekend project by developer Ettore “mudler” Di Giacinto that exploded in popularity – by late 2023 it grew into a community-driven platform with dozens of contributors and many new features. It now has over 40k stars on GitHub and is under active development, which speaks to its usefulness in the self-hosted AI space.
**The key features of LocalAI include:**
**OpenAI Compatibility:** It mirrors the OpenAI API schema, so existing apps or SDKs that work with OpenAI can be pointed to your LocalAI server by just changing the endpoint URL. This makes integration super easy.
**Multi-Model Support:** LocalAI isn’t limited to one model – you can run a variety of model families. It supports multiple backend engines like llama.cpp (for LLaMA/alpaca type models), GPT4All, and more, which means it can handle different model formats (GGML, GGUF, GPTQ, ONNX, HuggingFace transformers, etc.). It’s not just for text either; LocalAI can also run certain image generation or voice models, making it a versatile AI runtime.
**No GPU Required:** Importantly, you don’t need a fancy GPU (though it can use one if available). LocalAI is optimized to run on CPU-only systems, leveraging quantized models to make that feasible. Our homelab machine has no dedicated GPU, and we were still able to run a decent 7B-13B parameter model under the 8 GB RAM limit. (LocalAI’s documentation suggests ~10GB RAM is a comfortable minimum for LLMs.)
**Privacy and Offline:** Everything runs locally – no data ever leaves your server by design. You can even run it completely offline. This is great for privacy or for environments without reliable internet.
**Extendable Ecosystem:** The project has sister components like LocalAGI (for autonomous agent loops) and LocalRecall (for semantic search/embedding storage). We mostly focused on the core LLM functionality, but it’s good to know you can add vector database-backed memory or agent capabilities if you want to get fancy.
**How we set it up:** LocalAI provides a Docker image, which made deployment on Runtipi straightforward. At the time of our setup, LocalAI wasn’t in the Runtipi app store by default, so we used a custom approach. We took the official docker-compose.yml from LocalAI’s repo and used Runtipi’s dynamic compose feature to run it (alternatively, one could run it outside of Tipi using plain Docker). The compose setup basically runs a container that listens on port 8080 (by default) for API requests. We allocated a volume for model files so that downloaded models persist across restarts. Here’s a high-level summary of the steps:
**Deployed the LocalAI container:** We created a small compose file to run the localai/localai:latest image. In Runtipi, we added this as a custom app. It was configured to map host port 8080 to container port 8080 and to mount a models/ directory as a volume. Upon starting, this gave us a running LocalAI server at http://<server-ip>:8080.
**Downloaded an AI model:** LocalAI doesn’t ship with large models (since models are big and you choose what you need). We decided to download a 7-billion-parameter LLaMA-2 variant. The LocalAI docs recommend some curated models. For example, the project’s README shows using a LLaMA2 Uncensored model:
# Example: clone LocalAI repo and download a model
git clone https://github.com/go-skynet/LocalAI.git
cd LocalAI
wget https://huggingface.co/TheBloke/Luna-AI-Llama2-Uncensored-GGUF/resolve/main/luna-ai-llama2-uncensored.Q4_0.gguf -O models/luna-ai-llama2.gguf
cp prompt-templates/getting_started.tmpl models/luna-ai-llama2.tmpl
In our case, since we already had the container running, we placed the model file in the mounted models/ directory (you can docker exec into the container or mount a host folder). We also added a simple prompt template (which helps format chat prompts for better results). This step can take a while – downloading a model might be a few GB in size. But it’s a one-time thing.
**Started the LocalAI API server:** If using docker-compose, docker compose up -d will launch the server with the model ready. In Runtipi’s case, we started the custom app. Once running, we tested that the model is recognized by calling the API. For example, running:
curl http://localhost:8080/v1/models | jq
returned a list including our luna-ai-llama2 model (meaning the server sees the model file). We then did a quick test query: a POST to http://localhost:8080/v1/chat/completions with a JSON body instructing the model (in OpenAI format). For instance:
{
  "model": "luna-ai-llama2",
  "messages": [{ "role": "user", "content": "Hello, what is the meaning of life?" }]
}
The response came back with a completion from the model after a short wait! We effectively had our own mini-ChatGPT running locally. The speed wasn’t instant (these smaller models run slower than big cloud GPUs, of course), but it was acceptable for non-real-time use. On our 8GB RAM machine, a 7B model in int4 quantization was able to generate a response in a few seconds. Larger models (13B, 30B) might need more memory and would be slower on CPU, so one has to balance model size with hardware capabilities.
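For anyone who prefers scripting the same call, here’s a minimal Python sketch, assuming the LocalAI container is reachable at localhost:8080 and the luna-ai-llama2 model from above is loaded:

```python
# Minimal sketch: query a LocalAI server through its OpenAI-style chat endpoint.
# Assumes LocalAI is listening on localhost:8080 and the model name matches the
# file we placed in the models/ directory.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "luna-ai-llama2",
        "messages": [{"role": "user", "content": "Hello, what is the meaning of life?"}],
    },
    timeout=300,  # CPU inference can be slow, so allow plenty of time
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```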
At this point, anything that can speak to an OpenAI API could be pointed to our LocalAI server. This is where Runtipi’s other apps came in handy: we installed a web UI called ChatGPT UI (one of the apps available) which provides a nice chat interface in the browser. We configured ChatGPT UI to use our LocalAI’s API endpoint as the “OpenAI API” (since it’s a drop-in replacement). This allowed us to chat with the local model through a user-friendly web chat screen, just like we would with the real ChatGPT – except no internet involved, and running on our own hardware!
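The same drop-in trick works from code: any OpenAI client library can be pointed at LocalAI just by overriding the base URL. A small sketch with the official openai Python package (the API key is a dummy value, since LocalAI ignores it unless you have configured authentication):

```python
# Sketch: use the official OpenAI Python SDK against our LocalAI server by
# overriding the base URL. The api_key is a placeholder; LocalAI does not
# check it unless authentication has been set up.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

chat = client.chat.completions.create(
    model="luna-ai-llama2",
    messages=[{"role": "user", "content": "Give me one tip for new self-hosters."}],
)
print(chat.choices[0].message.content)
```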
**Experience with LocalAI:** Overall, we found LocalAI to be a powerful solution, but one that requires a bit of tinkering. Managing model files and prompt templates is a manual process (you need to find the model weights you want and ensure they’re compatible). The upside is flexibility: you can choose from many community models. Running on pure CPU also means you won’t get the speed of OpenAI’s cloud – long answers took tens of seconds up to a minute. But for our use (non-critical, fun Q&A and experimentation), this was fine. We also tried enabling partial GPU acceleration on another machine with an older NVIDIA card, and LocalAI supported it (with the appropriate build flags or container images, it can offload some of the computation to the GPU).
LocalAI’s project is evolving quickly, and they are even working on an upcoming version 2 with more features. It felt satisfying to have this level of AI capability entirely self-hosted – no keys, no fees, and complete control.
## Option 2: Ollama – Simplified Local LLM Runner
The second tool we explored is Ollama, which takes a slightly different approach. Ollama is an open-source inference framework that also aims to make running LLMs locally dead-simple. Think of it as a one-stop LLM engine: you install Ollama on your machine, and it provides commands to download models and run chats, as well as an API similar to OpenAI’s. One way to see Ollama is as a polished productized version of the local LLM experience, with a focus on ease-of-use and cross-platform support (it works on macOS, Windows, and Linux). In fact, by late 2025 Ollama had become extremely popular – the project boasts over 150k GitHub stars and a huge community of contributors, making it one of the most established local LLM solutions.
Why Ollama? The value proposition of Ollama is:
Privacy – like LocalAI, it runs entirely on your hardware, so data stays local.
Massive Accessibility – it simplifies setup a lot. For example, getting a model running with plain llama.cpp can be a multi-hour ordeal of compiling code and converting model files; with Ollama, it’s often a one-line command.
Model Management – Ollama comes with its own model registry and format. You can pull models by name straight from its library, list what’s installed, and remove models you no longer need, all with short commands.
Hardware Flexibility – Ollama supports various hardware through model quantization. It can run on CPU, but also can utilize NVIDIA GPUs, AMD GPUs (via ROCm), and Apple Silicon (Metal acceleration) – basically covering all the bases. The ability to run quantized models means even weaker machines or those with limited VRAM can still run models by using 4-bit or 5-bit precision etc. We liked this because it meant Ollama could adapt to whatever machine we installed it on.
We integrated Ollama via Runtipi as well. Conveniently, the Runtipi app store included Ollama – in fact, it had three variants: Ollama – CPU, Ollama – Nvidia, and Ollama – AMD, each preconfigured for those hardware scenarios. Essentially, these are Dockerized versions of Ollama. We chose the CPU one for our server (since we didn’t have a GPU). Installing it through Tipi was one-click, similar to other apps. Under the hood, this pulled the ollama/ollama Docker image. The container by default exposes port 11434, which is Ollama’s API port. Runtipi’s template also mounted a volume to persist model data (/root/.ollama in the container holds downloaded models).
Once the Ollama service was up, we proceeded to use it. Using Ollama feels a bit different than LocalAI: you typically interact with it through its CLI or client libraries initially. For example, to download a model, you might run an Ollama CLI command. We opened a shell into the container (or you could use Runtipi’s console if available) and ran something like:
ollama pull llama2
(Ollama’s model library includes shorthand names; this is just an example – one might specify a particular variant like ollama pull llama2:7b, or choose from the models listed on the Ollama website.) The pull command fetched the model from Ollama’s registry. After pulling, we could run queries. For instance, ollama run llama2 would drop us into an interactive chat with the model in the terminal. More interestingly, Ollama also exposes a local REST API: send a POST request to http://<server-ip>:11434/api/generate (or /api/chat) with a model name and prompt in the JSON body, and you get the completion back over plain HTTP.
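A quick sketch of that API call from Python, assuming the llama2 model pulled above and the default port 11434 (setting stream to false returns a single JSON reply instead of a token stream):

```python
# Sketch: call Ollama's native REST API. With "stream": False the server
# returns one JSON object rather than a stream of partial tokens.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama2",
        "prompt": "Explain what a reverse proxy does, in two sentences.",
        "stream": False,
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```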
To integrate with a UI, we again used the ChatGPT UI web interface, but pointed it to the Ollama API endpoint (port 11434). It worked just as with LocalAI – since Ollama’s API is also OpenAI-compatible, the UI could communicate without issues. We chatted with a LLaMA2 model running via Ollama and got pretty good responses. The performance was in the same order of magnitude as LocalAI for the same model (both ultimately use llama.cpp under the hood for CPU inference). However, we did notice that Ollama’s ease of use was superior. For example, there was no need to hunt down model files manually – the pull command handled it, and it even configured the right prompt template internally. Model management is also built in: ollama list shows your installed models, ollama rm removes ones you no longer need, and so on. It feels like a polished product.
Another neat feature: Ollama supports streaming responses and some advanced capabilities like “thinking” (it can return tokens about its reasoning progress), embeddings generation, and even has an optional web UI if you run it in certain modes. We didn’t deeply explore those, but it’s good to know Ollama is quite feature-rich.
Docker & Hardware specifics: Since we ran via Docker, a quick note – if you have a GPU and want to use it with Ollama’s container, you need to follow additional steps (like installing the NVIDIA Container Toolkit on the host and running the container with --gpus=all for Nvidia, or using the ollama/ollama:rocm image for AMD ROCm support). Runtipi’s separate app entries likely handle those flags for you (the Nvidia variant presumably includes the --gpus option and pre-install steps). On our CPU instance, none of that was needed; it just worked out of the box.
In summary, LocalAI vs Ollama: Both allowed us to run local LLMs successfully. LocalAI felt more like a “DIY toolkit” – very flexible and extensible (with agents, vector memory add-ons, etc.), and it aligns closely with the open-source Hugging Face ecosystem (you directly manage model files and can use many formats). Ollama felt more like a “product” – extremely user-friendly and opinionated about how models are packaged (it uses its own registry and Modelfile format behind the scenes), but that simplicity is a huge plus when you just want it to work. For intermediate users, we’d actually recommend trying Ollama first for a smoother start, and then moving to more complex setups like LocalAI if you have specific needs. In fact, Runtipi’s inclusion of Ollama in the app store shows how it’s become a go-to solution for self-hosted AI. As one article put it, “Ollama is one of the simplest and most efficient ways to run and develop with LLMs locally” – our experience echoed that sentiment. With either option, though, we achieved the core goal: a private ChatGPT-like AI running on our homelab.
## Building an Interactive Self-Hosting Adventure with Cheshire Cat AI
After setting up various services and a local AI, we thought of a fun project to tie it all together: what if we create an interactive “choose your own adventure” game that teaches people about self-hosting? The idea was to leverage our self-hosted LLM to guide users through scenarios (e.g. setting up a server, troubleshooting an app) in a story-like format – a bit like a text adventure game that’s also informative. To do this, we turned to Cheshire Cat AI, an open-source framework for building custom AI chat agents.
Cheshire Cat AI is a framework that lets you create AI agents with memory, tools, and custom behaviors, packaged as a microservice. It’s designed to be production-ready and highly extensible via plugins. Notably, it has built-in support for Retrieval-Augmented Generation (RAG) using a vector database (Qdrant) for memory. This means your agent can have long-term memory or knowledge base documents it can refer to when responding. In our case, we wanted the agent to “know” about self-hosting topics (like all the content from our blog posts and documentation we gathered) so it could present information accurately during the game. Cheshire Cat made this straightforward: you can upload documents (PDFs, text, Markdown, web pages) directly into its memory store via the admin interface. It will index those and use them to give more factually grounded answers. We prepared a “library” of our notes, how-tos, and even excerpts from Runtipi’s docs – essentially a knowledge base about self-hosting – and fed it into the Cheshire Cat agent’s memory.
On the deployment side, Cheshire Cat runs as a single Docker container (very convenient for our Runtipi server). We used the official image ghcr.io/cheshire-cat-ai/core:latest. It includes a web admin panel (by default on port 1865) where you can chat with the agent live and manage settings. We exposed that through our Runtipi’s Traefik so that we could access the “Cheshire Cat” interface from our browser. In the admin panel, we configured the underlying LLM backend to use our local Ollama instance (Cheshire Cat can interface with external LLMs via Langchain, and Ollama was one of the supported options). This way, when the agent needs to generate a response, it asks our Ollama server to do the heavy lifting of actual text generation, using whichever model we have loaded. We also pointed it to use Qdrant (which runs inside the container) as the vector DB.
Now, creating the adventure game logic leveraged Cheshire Cat’s plugin system. A plugin in Cheshire Cat is basically a Python module where you can hook into the AI’s behavior or add tools. We developed a custom plugin called, say, “SelfHostingAdventure”. In it, we used hooks to set the stage and guide the conversation. For example, we wrote a hook to override the agent’s initial system prompt to establish a persona/storyline: “You are a wise but whimsical guide leading the user through a choose-your-own-adventure in building their first self-hosted server. You start by greeting the user as they enter a mysterious server room… [etc].” This ensured the AI maintained a narrative style (slightly dramatic and fun) rather than a dry instructional tone. Cheshire Cat’s hook API let us easily set this kind of custom prompt prefix in a few lines of code. We also used tool functions as needed – for instance, a tool to output a code snippet in response to certain triggers (like if the user chooses an action that requires running a command, the agent can call a tool that formats the correct command from our knowledge base). Tools in Cheshire Cat are functions you define that the LLM can invoke (similar to OpenAI function calling). We defined one such tool to fetch relevant reference info from memory – essentially performing a vector search in Qdrant for a given query and returning a summary. This way, whenever the story hit a technical detail (say, “the user wants to install Docker – what do they do?”), the agent could use the tool to retrieve the exact command or steps from the documentation we loaded, and present it in the narrative.
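Here’s a trimmed-down sketch of what that plugin looked like. The hook and tool names follow Cheshire Cat’s plugin conventions (the agent_prompt_prefix hook and the @tool decorator); the hard-coded command lookup below is a stand-in for the vector-memory search we actually used:

```python
# Simplified sketch of the "SelfHostingAdventure" plugin. The decorators come
# from Cheshire Cat's plugin system; in the real plugin the tool pulled answers
# from the Cat's vector memory instead of this small lookup table.
from cat.mad_hatter.decorators import hook, tool

PERSONA = (
    "You are a wise but whimsical guide leading the user through a "
    "choose-your-own-adventure about building their first self-hosted server. "
    "Narrate in second person and end every scene with two or three numbered choices."
)

@hook
def agent_prompt_prefix(prefix, cat):
    # Replace the default system prompt with our storyteller persona.
    return PERSONA

# Stand-in reference card; the real plugin searched the uploaded docs instead.
COMMANDS = {
    "install runtipi": "curl -L https://setup.runtipi.io | bash",
    "pull a model": "ollama pull llama2",
}

@tool
def lookup_command(step, cat):
    """Return the exact shell command for a self-hosting step, e.g. 'install runtipi'."""
    return COMMANDS.get(step.lower().strip(), "No command found for that step.")
```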
Cheshire Cat’s plugin approach was refreshingly straightforward: just create a folder and drop in a Python file with your hook and tool definitions. When we saved our plugin, the Cheshire Cat container auto-detected it (hot reload) and the changes took effect immediately. We iterated by chatting with the agent in the admin UI, testing different branches of the “adventure”, and refining our prompts. For example, at one branch the user (player) might choose: “Install Runtipi on a Raspberry Pi” or “Use a cloud VM”. We had the agent ask this as a multiple-choice question. Depending on the answer, we had conditional logic in the plugin (the hook can inspect user messages and set the next system prompt or tool invocation accordingly). This gave the feel of branching paths like a real choose-your-own adventure book. And because the underlying LLM is generative, the responses weren’t strictly scripted – the AI would embellish the story, react to user input (even off the rails input), making each play-through a bit unique while still conveying the core lessons.
One of the most powerful aspects was that the agent had access to factual data from our documents. Cheshire Cat’s built-in RAG + our uploaded docs meant that when the user encountered, for example, an error in the story (“Oh no, Docker isn’t installed! What now?”), the agent could actually provide the real solution (like the exact curl … | bash command or a troubleshooting tip) pulling straight from Runtipi’s docs or our blog posts, but phrased in a narrative way. This made the game educational. We essentially created a dynamic documentation in the form of a game, which can be a more engaging way for newcomers to learn. And if the user asks a random question or deviates, the AI can still answer based on the general knowledge base or its pretrained model, so it’s forgiving.
To deploy this on our website, we are using Cheshire Cat’s API. The plan is to embed a chat interface on the site that connects to our self-hosted Cheshire Cat service. Thanks to Cheshire Cat’s support for WebSocket chat and a simple REST API, embedding an interactive chat window that speaks to the agent is relatively straightforward. We just ensure our server is online and reachable (through a domain and maybe a tunnel for external access). The end result: visitors to our blog can play the “Self-Hosting Choose Your Own Adventure”, chatting with the Cheshire Cat-guided AI, which uses all the knowledge we collected to help them virtually navigate setting up their own homelab. And since it’s powered by an LLM, the dialogue can adapt and even answer questions outside the scripted path – far more flexible than a static text or decision tree.
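As a rough sketch of what the embedded chat does behind the scenes, here’s a tiny Python client. We assume the default port (1865) and the /ws endpoint we used; the exact path and message schema may differ between Cheshire Cat versions, so check the docs for yours:

```python
# Rough sketch: send one message to the Cat over its WebSocket chat channel and
# print whatever comes back. Port 1865 and the /ws path are the defaults we
# used; adjust the URL for your own deployment.
import asyncio
import json

import websockets  # pip install websockets

async def chat_once(text: str, url: str = "ws://localhost:1865/ws"):
    async with websockets.connect(url) as ws:
        await ws.send(json.dumps({"text": text}))
        reply = await ws.recv()  # the Cat may send several frames when streaming
        print(json.loads(reply))

asyncio.run(chat_once("I step into the dark server room. What do I see?"))
```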
In essence, Cheshire Cat AI allowed us to turn documentation into an interactive, gamified experience. It serves as a bridge between the raw knowledge (stored in vectors) and a conversational, fun presentation layer. The fact that it’s all containerized and ran seamlessly on the same Runtipi server (right alongside our Ollama and other apps) is a testament to how far self-hosting has come – we truly have our own little cloud, complete with an AI storyteller in residence!
## Conclusion and Next Steps
Our journey began with exploring Runtipi as a means to simplify self-hosting, and it ended with a creative experiment to use AI in a novel way on our homelab. Along the way, we set up a robust homeserver that can one-click deploy myriad applications, learned how to run large language models locally (comparing two leading tools), and leveraged an AI framework to build an interactive learning game. For fellow enthusiasts and weekend warriors looking to dive into self-hosting, we hope this multi-part adventure demonstrates that the barrier to entry has never been lower. With projects like Runtipi, you don’t need to be a DevOps expert to self-host your own cloud services. And with the likes of LocalAI and Ollama, you can even host your own “AI assistant” on hardware you own, keeping control of your data and experimenting with cutting-edge AI in your garage or living room.
In upcoming posts, we plan to provide detailed guides for specific parts of this journey, such as “Step-by-Step: Self-Hosting ChatGPT with LocalAI (no cloud required)” and “How We Built a Choose-Your-Own-Adventure AI Game with Cheshire Cat”. We’ll include code snippets and configuration files for those who want to replicate or modify what we built. The beauty of this being self-hosted and open-source is that you can customize every piece: want to swap Runtipi for a different orchestrator? You can. Prefer a different LLM model or a totally different game storyline? Go for it. We encourage you to tinker and make it your own.
For now, if you’re new to self-hosting, give Runtipi a try and spin up something fun – maybe your own photo gallery or music streaming server – to get the hang of it. If you’re feeling adventurous, try deploying LocalAI or Ollama on it and marvel at the fact that you have a personal AI running in your homelab. And if you do, why not have a chat with our Cheshire Cat on the blog? You might just learn something new about self-hosting, and have a good laugh or two as you play through the adventure. Happy self-hosting! 😸🎉
