Saturday, March 22, 2025
First things first: why? Why did I decide to do this? It all started with the latest hosting provider bill, which was quite high... Read "very high" or "too high" for what I was doing. I had started receiving emails about needing to upgrade because of the resources my site was using.
It was a WordPress site, goddammit! There was nothing there! I mean, yeah, a couple of articles, but nothing that was "using up resources", as they put it.
This got me toying with the idea of leaving the hosting provider and WordPress. I wanted to host my website myself - nowadays, we have pretty stable connections and electricity, plus I have a couple of idle PCs lying around.
Also - and this is a key factor - I didn't want to use an existing CMS. I wanted something I built, controlled, and owned. So I started with a couple of questions I had to answer before beginning; here's roughly the Q&A as it progressed:
Q: Why do you want to create your own CMS?
A: I want to understand how it works under the hood so I can appreciate the existing ones. Plus, there are a few things lacking in existing CMSs, and I can't extend them myself without learning stuff I don't necessarily want to learn at this point.
Q: So what DO you want to learn?
A: I want to learn more about AI, RAG, and everything required in between. I'm always interested in "trying out" stacks too.
Q: Given what you want to learn and your CMS objective, how can you achieve both in one shot?
A: That's it! Let's implement a CMS with AI behind it. Maybe something that can leverage the CMS data to answer questions... That would mean questions about me.
Q: When would anyone want to know anything about you?
A: Well, maybe if they know me, are professional contacts, or are potential recruiters; they might be interested in playing with my new "toy"?
Q: So what will the Web app be about then?
A: About my career and the projects I work on or have worked on... A portfolio! Yes, that would be awesome. Let's build a portfolio where you can chat with my "assistant" and ask questions about my projects and career. That sounds like a great plan.
I had an idea. Now I needed a plan. What did I need? I wrote a 1st version of the user requirements with the help of OpenAI's ChatGPT. Then I came up with a 2nd version and eventually a 3rd one.
At that point, I had a clear idea of what I wanted to accomplish. I even had an idea of the stack by then:
But then I was kind of at a loss for the AI part. What did I need? How does it work? I ended up doing tons of reading (there's a ton of literature on OpenAI's website as well as Anthropic's), and the most eye-opening read was everything Martin Fowler has written on the topic. He's been doing research on it lately, and I learned a lot about
Again, since I wanted to run everything locally (at least for now), I had to find the necessary tools for these... So I ended up setting up:
At that point, I was all ready for implementation!
I started. Everything runs on CI/CD with GitHub, a local Jenkins, and my portfolio repository. It is private for now, but I will switch it to public very soon, once I've added more test coverage and cleaned things up a bit.
I had a development plan that I started with. Of course, I created a 2nd version of the plan... and then a 3rd version!!
I need to go back to these, keep only the "final" one, and update the statuses; I haven't yet. But long story short, everything is mostly there; it can be deployed and used (although it's not perfect yet).
This is a work in progress. What I learned throughout this journey, though, especially about AI, is the following:
A single GPU won't cope in a real-world scenario; it is too slow. Buy some credits with OpenAI, Anthropic, or anyone else... For a small-scale app like mine, it will take years before you reach the cost tipping point, if you ever reach it! For example, my RTX 3060 Ti cost me $500 a few years ago... Fine, I already have it. But if I pay $10 to OpenAI, I'll be good for a few months before the credits run out. Do the math; using OpenAI brings responses down to near-instant, whereas my GPU can take up to a minute to produce a less-than-ideal answer.
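To make "do the math" concrete, here's a quick back-of-the-envelope sketch. The per-token prices and token counts below are illustrative assumptions (check the providers' current pricing), not numbers from my actual bill:

```python
# Back-of-the-envelope cost estimate for an API-based generation step.
# Prices and token counts are assumptions for illustration only.
PRICE_INPUT_PER_M = 0.15   # assumed $ per 1M prompt tokens
PRICE_OUTPUT_PER_M = 0.60  # assumed $ per 1M completion tokens

tokens_in_per_turn = 2_000   # retrieved context + question + instructions
tokens_out_per_turn = 400    # a typical answer

cost_per_turn = (tokens_in_per_turn * PRICE_INPUT_PER_M
                 + tokens_out_per_turn * PRICE_OUTPUT_PER_M) / 1_000_000

budget = 10.0  # dollars of credits
print(f"~${cost_per_turn:.4f} per chat turn, ~{budget / cost_per_turn:,.0f} turns on ${budget:.0f}")
```

With those assumptions, $10 buys tens of thousands of chat turns, which is why a small portfolio site like mine may never hit the point where the GPU pays for itself.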
There's always something more to do with RAG. You'll want to categorize entries into collections. You'll need a very robust set of metadata. You'll realize you should break some entries down into separate embeddings. Etc., etc. For this reason alone, I've been adding "new features" almost daily just to improve the embeddings and retrieval.
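To give an idea of what that looks like in practice, here's a rough sketch of chunking a CMS entry into several embeddings with per-chunk metadata, using ChromaDB as a stand-in vector store; the collection name and metadata fields are just examples, not my actual schema:

```python
import chromadb

# Stand-in vector store; collection and metadata names are examples only.
client = chromadb.PersistentClient(path="./vector_store")
projects = client.get_or_create_collection(name="projects")

def index_entry(entry_id: str, title: str, body: str, category: str) -> None:
    """Break a CMS entry into paragraph-sized chunks, each with its own metadata."""
    chunks = [p.strip() for p in body.split("\n\n") if p.strip()]
    projects.add(
        ids=[f"{entry_id}-{i}" for i in range(len(chunks))],
        documents=chunks,
        metadatas=[
            {"entry_id": entry_id, "title": title, "category": category, "chunk": i}
            for i in range(len(chunks))
        ],
    )

# Retrieval can then filter on metadata instead of searching everything at once.
results = projects.query(
    query_texts=["What did you build for the portfolio backend?"],
    n_results=3,
    where={"category": "project"},
)
```

A metadata filter like the `where` clause above is the kind of small retrieval improvement that keeps coming up: it restricts the similarity search to the chunks that can actually be relevant.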
Prompt engineering requires tuning. This may sound obvious, but nothing works better than trial and error. I ended up building a "prompt management" system in my app so admins can "tune" the prompts.
It is hard to strike a good balance between "good context" and "good instructions", but that's what we need to aim for.
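The "prompt management" idea boils down to storing prompt templates as data instead of hard-coding them, so an admin can tweak the wording (and the context/instructions balance) without redeploying. Here's a minimal sketch of the concept; the class names and storage are hypothetical, not my actual implementation:

```python
from dataclasses import dataclass

@dataclass
class PromptTemplate:
    """An editable prompt, versioned so a bad tweak can be rolled back."""
    name: str
    text: str
    version: int = 1

class PromptStore:
    """In-memory stand-in for the database table an admin screen would edit."""
    def __init__(self) -> None:
        self._prompts: dict[str, PromptTemplate] = {}

    def set(self, name: str, text: str) -> None:
        current = self._prompts.get(name)
        version = current.version + 1 if current else 1
        self._prompts[name] = PromptTemplate(name, text, version)

    def render(self, name: str, **variables: str) -> str:
        return self._prompts[name].text.format(**variables)

store = PromptStore()
store.set(
    "answer_question",
    "You are my portfolio assistant. Use only the context below.\n"
    "Context:\n{context}\n\nQuestion: {question}",
)
prompt = store.render("answer_question", context="...retrieved chunks...", question="What stack do you use?")
```

Because every edit bumps the version, trial-and-error tuning becomes a history you can roll back instead of a pile of hard-coded strings.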
Pipeline. Yeah, we need a pipeline for the AI to work well. For example, mine has the following flow:
All these steps can take some time. But from what I see, the "slow" one for me is really the GPT part: generating the response. As such, my plan is to switch that part to OpenAI or Anthropic (I will test both) so that I can leverage my hardware for the "prep" and then rely on an external service for the "generation".
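Here's roughly what that split looks like in code: the "prep" (retrieval) stays on my own hardware, and only the generation call goes out to an external API. This is a sketch of the idea rather than my actual pipeline; the model name and collection are examples:

```python
import chromadb
from openai import OpenAI

# Local "prep": retrieval runs against a vector store on my own hardware.
vector_store = chromadb.PersistentClient(path="./vector_store")
projects = vector_store.get_or_create_collection(name="projects")

# External "generation": reads OPENAI_API_KEY from the environment.
openai_client = OpenAI()

def answer(question: str) -> str:
    # 1. Retrieve the most relevant chunks locally.
    hits = projects.query(query_texts=[question], n_results=4)
    context = "\n\n".join(hits["documents"][0])

    # 2. Hand only the final generation step to the external service.
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",  # example model; the plan is to compare providers
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```

Keeping the generation call behind a small function like this also makes it easy to A/B the OpenAI and Anthropic options I want to compare.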