Skip to content

Building Your Own Vector Database

Many working versions of MySQL for WordPress do not support the vector field.

However, we can create our version, limited for large datasets but useful for up to 10,000 documents or rows.

The Magic Trick: Using LONGTEXT Column

Since MySQL doesn't have a vector field, we'll hack it by storing vectors as text in a LONGTEXT column!

┌─────┬───────────────┬──────────────────────────────────┐
│ id  │     text      │       vector (LONGTEXT)          │
├─────┼───────────────┼──────────────────────────────────┤
│  1  │  "pizza"      │  "[0.9,0.1,0.2]"                 │
│  2  │  "pasta"      │  "[0.8,0.2,0.3]"                 │
│  3  │  "car"        │  "[0.1,0.8,0.9]"                 │
└─────┴───────────────┴──────────────────────────────────┘

Step 1: Create the Database Table

CREATE TABLE vector_items (
    id INT AUTO_INCREMENT PRIMARY KEY,
    text VARCHAR(500) NOT NULL,
    embedding LONGTEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_text (text)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

Limitations & When to Upgrade

This Approach is Slow When:

  • You have 10,000+ vectors (loads all into PHP memory).
  • You need sub-second search times.
  • Vector dimensions are very large (1000+). This is true in our case as we use 1536 dimensions but we have a small dataset so it balances out.
  • As we move past 205, more applications of MySQL will have the version that supports the vector field along with indexing.

When to Use Real Vector DB:

  • Production apps with millions of vectors
  • Real-time search requirements
  • Use: Pinecone, Weaviate, Milvus, or pgvector (PostgreSQL extension)

his Approach Works Great For:

  • Small to medium datasets (< 10,000 items)
  • Learning and prototyping
  • Side projects and MVPs
  • When you already have MySQL and don't want extra infrastructure

Summary

You just learned how to:

  • ✅ Store vectors as comma-separated text in LONGTEXT
  • ✅ Calculate cosine similarity in pure PHP
  • ✅ Build a working vector search engine
  • ✅ Integrate with OpenAI embeddings
  • ✅ Optimize with caching and filtering

The Magic:

MySQL stores the vectors as text, PHP does the math to find similar items.

Simple but effective!