Visualizing Embeddings: Or How to Improve Retrieval-Augmented Generation
Wednesday, March 26, 2025
Lately, I've been toying with this portfolio, which is supposed to be AI-powered. Sort of.
In the world of AI and machine learning, Retrieval-Augmented Generation (RAG) promises to bridge the gap between raw language models and contextually rich, accurate responses. My recent experience revealed a rather critical flaw: not all embeddings are created equal. Given my past experience with visualization, I thought: who knows, maybe visualization is the key to understanding what's going wrong?
The Initial Challenge
Like many developers exploring RAG, I started with off-the-shelf solutions. I was using Ollama and Mistral models, expecting seamless performance. The reality? Disappointingly generic responses that missed the nuanced context of my specific document collections. Hallucinations? You bet!
The Embedding Visualization Breakthrough
Frustrated by mediocre results, I began developing a custom embedding visualization tool. The goal is simple yet theoretically powerful: understand how my embeddings actually represent my documents.
What is an Embedding?
At its core, an embedding is a dense vector representation of text. Each document becomes a point in a high-dimensional space, where semantic similarities create clusters and relationships. I recommend reading up on it; the internet is full of interesting content on the subject.
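To make that concrete, here is roughly what fetching one of those vectors can look like with the tooling mentioned above. This is a minimal sketch that assumes a local Ollama instance and an embedding model such as nomic-embed-text; the model name and endpoint are assumptions on my part, so swap in whatever you actually run.
// Sketch: fetch an embedding vector from a local Ollama instance.
// Assumes Ollama listens on localhost:11434 and that an embedding model
// (here "nomic-embed-text") has been pulled; adjust to your setup.
const getEmbedding = async (text: string): Promise<number[]> => {
  const response = await fetch("http://localhost:11434/api/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "nomic-embed-text", prompt: text }),
  });
  const data = await response.json();
  // Each document becomes one dense vector (hundreds of numbers).
  return data.embedding as number[];
};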
The Wave Visualization Approach
I created what I call a "wave visualization": embedding dimensions run along the horizontal axis, and each point aggregates part of a document's embedding, with its value plotted on the Y axis. A rough sketch of that mapping follows the list below. The visualization reveals:
- Semantic similarities between documents
- How different documents cluster
- Potential areas of embedding weakness
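To give an idea of that mapping, here is a simplified sketch of how one document's embedding can be turned into wave points. The bucketing step is my interpretation of the "aggregate" mentioned above, and the width/height parameters are placeholders for whatever the plotting surface uses.
// Sketch: aggregate an embedding into buckets and map each bucket to an
// (x, y) point of the wave. Bucketing (averaging groups of dimensions)
// is one way to read the "aggregate" step described above.
const toWavePoints = (
  embedding: number[],
  buckets: number,
  width: number,
  height: number
): { x: number; y: number }[] => {
  const size = Math.ceil(embedding.length / buckets);
  return Array.from({ length: buckets }, (_, b) => {
    const slice = embedding.slice(b * size, (b + 1) * size);
    // Average the dimensions that fall into this bucket.
    const mean = slice.length
      ? slice.reduce((sum, v) => sum + v, 0) / slice.length
      : 0;
    return {
      x: ((b + 0.5) / buckets) * width,
      // Assumes values roughly in [-1, 1]; center the wave vertically.
      y: height / 2 - mean * (height / 2),
    };
  });
};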
Key Insights from Visualization
- Dimensional Variations: Not all embedding dimensions are equally informative
- Clustering Patterns: Some documents sit much closer together in the embedding space than others; which topics do those clusters correspond to?
- Quality of Representation: Weak embeddings can lead to poor retrieval results
Technical Implementation
The visualization uses:
- Cosine similarity calculations
- Dimension normalization
- Color-coded representation of document relationships
// Simplified cosine similarity calculation between two embedding vectors
const calculateCosineSimilarity = (vec1: number[], vec2: number[]): number => {
  // Dot product of the two vectors
  const dotProduct = vec1.reduce((sum, val, i) => sum + val * vec2[i], 0);
  // Euclidean length of each vector
  const magnitude1 = Math.sqrt(vec1.reduce((sum, val) => sum + val * val, 0));
  const magnitude2 = Math.sqrt(vec2.reduce((sum, val) => sum + val * val, 0));
  // Guard against zero-length vectors to avoid dividing by zero
  if (magnitude1 === 0 || magnitude2 === 0) return 0;
  return dotProduct / (magnitude1 * magnitude2);
};
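The dimension normalization mentioned above keeps one dominant dimension from flattening the rest of the wave. A minimal min-max version, one reasonable choice among several (the function name is mine, not a library call), could look like this:
// Sketch: min-max normalize every dimension across all document embeddings,
// so each dimension lands in [0, 1] before being drawn.
const normalizeDimensions = (embeddings: number[][]): number[][] => {
  const dims = embeddings[0].length;
  const mins = new Array(dims).fill(Infinity);
  const maxs = new Array(dims).fill(-Infinity);
  for (const vec of embeddings) {
    vec.forEach((v, i) => {
      mins[i] = Math.min(mins[i], v);
      maxs[i] = Math.max(maxs[i], v);
    });
  }
  return embeddings.map((vec) =>
    vec.map((v, i) => {
      const range = maxs[i] - mins[i];
      // Constant dimensions would divide by zero; map them to 0 instead.
      return range === 0 ? 0 : (v - mins[i]) / range;
    })
  );
};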
Ongoing Improvements
My current approach is a work in progress. Next steps include:
- Fine-tuning embedding models
- Experimenting with different embedding techniques
- Using visualization insights to improve document chunking strategies
- The holy grail: plot a prompt's embedding on the same map, to get an idea of which documents it will most likely touch (see the sketch below)
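That last point mostly boils down to embedding the prompt with the same model and dropping it onto the same map. The retrieval half of it fits in a handful of lines; this sketch reuses calculateCosineSimilarity from above and assumes a hypothetical array of { id, embedding } document records.
// Sketch: rank stored documents by cosine similarity to a prompt embedding.
type EmbeddedDoc = { id: string; embedding: number[] };

const nearestDocuments = (
  promptEmbedding: number[],
  documents: EmbeddedDoc[],
  topK = 5
): { id: string; score: number }[] =>
  documents
    .map((doc) => ({
      id: doc.id,
      score: calculateCosineSimilarity(promptEmbedding, doc.embedding),
    }))
    // Highest similarity first, keep only the top matches.
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);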
Conclusion
Embedding visualization isn't just a debugging tool; it's a window into the semantic understanding of my AI system. By seeing how documents are represented, I can (theoretically) make more informed decisions about my RAG pipeline.
Stay tuned for more updates on this embedding exploration journey!