MTEB (Massive Text Embedding Benchmark) is a comprehensive evaluation framework designed to test how well text embedding models perform across a wide range of natural language processing tasks.
## Best Sentence-to-Vector-to-Sentence (S-V-S) Pairings by Dimension
### 384D: Efficiency-Focused

**Best Pairing:** all-MiniLM-L6-v2 + vec2text-base

```python
# Sentence → Vector (384D)
encoder = "sentence-transformers/all-MiniLM-L6-v2"
```

- MTEB Avg: 63.05
- Speed: 14,200 sentences/sec on CPU
- Size: 80MB

**Vector → Sentence**

```python
decoder = "jxm/vec2text-base-embeddings-384d"
```

- ROUGE-L: ~0.42
- Cosine recovery: ~0.85
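"Cosine recovery" here means the cosine similarity between the original embedding and the embedding of the reconstructed sentence. A minimal sketch of that metric with NumPy, using toy vectors in place of real encoder output:

```python
import numpy as np

def cosine_recovery(original_emb: np.ndarray, reconstructed_emb: np.ndarray) -> float:
    """Cosine similarity between the original embedding and the embedding
    of the reconstructed sentence; 1.0 means perfect recovery."""
    a = original_emb / np.linalg.norm(original_emb)
    b = reconstructed_emb / np.linalg.norm(reconstructed_emb)
    return float(a @ b)

# Toy 384-d vectors standing in for encoder output
rng = np.random.default_rng(0)
orig = rng.normal(size=384)
recon = orig + 0.1 * rng.normal(size=384)  # a slightly perturbed "reconstruction"
print(round(cosine_recovery(orig, recon), 3))
```

In practice you would encode both the source sentence and the decoder's output with the same model (e.g. all-MiniLM-L6-v2) and compare those two vectors.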
### 768D: Balanced Performance (Your Current Target)

**Best Pairing:** GTR-T5-base + vec2text-gtr-base

```python
# Sentence → Vector (768D)
encoder = "sentence-transformers/gtr-t5-base"
```

- MTEB Avg: 66.13
- Particularly strong on retrieval tasks
- Size: 218MB

**Vector → Sentence**

```python
decoder = "jxm/vec2text-gtr-base"
```

- ROUGE-L: ~0.68
- Cosine recovery: ~0.91
- BERTScore: ~0.89
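ROUGE-L scores a reconstruction by the longest common subsequence (LCS) it shares with the reference. A self-contained sketch of the token-level F-measure variant, with no external libraries:

```python
def lcs_len(a: list, b: list) -> int:
    """Length of the longest common subsequence, via dynamic programming."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def rouge_l(reference: str, candidate: str) -> float:
    """ROUGE-L F1 between whitespace-tokenized reference and candidate."""
    ref, cand = reference.split(), candidate.split()
    lcs = lcs_len(ref, cand)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)

print(rouge_l("the cat sat on the mat", "the cat sat on a mat"))
```

Published numbers usually come from the `rouge-score` package with stemming, so expect small differences from this bare-bones version.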
### 1024D: High Performance

**Best Pairing:** E5-large-v2 + custom vec2text

```python
# Sentence → Vector (1024D)
encoder = "intfloat/e5-large-v2"
```

- MTEB Avg: 69.78
- Excellent multilingual support
- Size: 1.34GB

**Vector → Sentence**

Note: there is no official vec2text model for 1024D yet. Two options:

1. Project to 768D and use vec2text-gtr-base
2. Train a custom vec2text model

```python
# Pseudocode: `learn_projection` is a placeholder, not a library call
projection_matrix = learn_projection(1024, 768)
```
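One way to realize the `learn_projection` placeholder is ordinary least squares on paired embeddings: embed the same texts with both the 1024-d encoder and a 768-d encoder, then solve for the matrix mapping one space onto the other. A sketch with synthetic stand-ins for the paired embeddings (the helper name and shapes are assumptions, not an existing API):

```python
import numpy as np

def learn_projection(src: np.ndarray, tgt: np.ndarray) -> np.ndarray:
    """Least-squares linear map W such that src @ W ≈ tgt.
    src: (n, 1024) source embeddings; tgt: (n, 768) target embeddings."""
    W, *_ = np.linalg.lstsq(src, tgt, rcond=None)
    return W  # shape (1024, 768)

# Synthetic stand-ins for paired embeddings of the same sentences
rng = np.random.default_rng(42)
src = rng.normal(size=(2000, 1024))
true_map = rng.normal(size=(1024, 768)) / 32
tgt = src @ true_map + 0.01 * rng.normal(size=(2000, 768))

W = learn_projection(src, tgt)
projected = src @ W  # 1024-d vectors projected into the 768-d space
print(projected.shape)  # (2000, 768)
```

How well vec2text-gtr-base decodes such projected vectors depends on how closely the two embedding spaces align; expect lower fidelity than a natively trained decoder.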
### 1536D: Premium Quality

**Best Pairing:** an open-source alternative to OpenAI ada-002

Option 1: GTR-T5-XL

```python
encoder = "sentence-transformers/gtr-t5-xl"
```

- MTEB Avg: 68.42
- Size: 1.24GB

Option 2: Cohere Embed v3 (if an API is acceptable)

- MTEB Avg: 64.47

**Vector → Sentence**

There is currently no vec2text model for 1536D. Recommendation: train a custom decoder using your approach.
### 2048D: Maximum Expressiveness

**Best Pairing:** custom ensemble approach

No standard models exist at 2048D; the recommendation is to concatenate several encoders and truncate:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

class Ensemble2048D:
    def __init__(self):
        # 768 + 1024 + 1024 = 2816 dims before truncation to 2048
        self.models = [
            SentenceTransformer("sentence-transformers/gtr-t5-xl"),  # 768D
            SentenceTransformer("intfloat/e5-large-v2"),             # 1024D
            SentenceTransformer("BAAI/bge-large-en"),                # 1024D
        ]

    def encode(self, text):
        embeddings = [model.encode(text) for model in self.models]
        return np.concatenate(embeddings)[:2048]
```
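The concatenate-and-truncate step is easy to sanity-check with stub encoders (random vectors standing in for real models), which keeps the dimension arithmetic visible: 768 + 1024 + 1024 = 2816 dims, truncated to 2048:

```python
import numpy as np

rng = np.random.default_rng(7)

# Stub encoders returning fixed-dimension random vectors, standing in for
# gtr-t5-xl (768D), e5-large-v2 (1024D), and bge-large-en (1024D)
def stub_encoder(dim):
    return lambda text: rng.normal(size=dim)

models = [stub_encoder(768), stub_encoder(1024), stub_encoder(1024)]

def encode_2048(text: str) -> np.ndarray:
    embeddings = [encode(text) for encode in models]
    combined = np.concatenate(embeddings)  # 2816 dims
    return combined[:2048]                 # truncation drops 768 dims of the last model

vec = encode_2048("hello world")
print(vec.shape)  # (2048,)
```

Note that truncation silently discards most of the last model's embedding; a learned 2816→2048 projection would preserve more information at the cost of a training step.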
## Comprehensive Benchmark Comparison

_(Values estimated based on model architecture.)_
## Key Findings for Your VMM Architecture
## Implementation for Multi-Dimensional Support

```python
class MultiDimS2V2S:
    """Support multiple embedding dimensions, with a projection fallback
    for dimensions that lack a native decoder."""

    def __init__(self):
        self.encoders = {
            384: "all-MiniLM-L6-v2",
            768: "gtr-t5-base",
            1024: "e5-large-v2",
            1536: "gtr-t5-xl",
        }
        self.decoders = {
            384: "vec2text-base-384d",
            768: "vec2text-gtr-base",
            # Fallback for higher dims: project down, then decode at 768D
            1024: self.project_and_decode,
            1536: self.project_and_decode,
        }
        # Learned projection matrices, keyed like "1024_to_768"
        self.projections = {}

    def project_and_decode(self, embedding, target_dim=768):
        """Project the embedding to 768D, then decode with the 768D decoder."""
        projection = self.projections[f"{len(embedding)}_to_{target_dim}"]
        projected = projection @ embedding
        # The named 768D decoder then inverts the projected vector
        return self.decoders[768], projected
```
## Recommendation for Your Project
Given your focus on 768D and need for high-quality reconstruction:
The GTR-T5 family provides the best balance of reconstruction quality (ROUGE-L ≈ 0.68, cosine recovery ≈ 0.91), benchmark performance (66.13 MTEB avg, especially strong on retrieval), and model size (218MB).
## 🧪 What MTEB Measures

MTEB evaluates models on 40+ tasks (in English v2) across categories such as retrieval, classification, clustering, reranking, pair classification, semantic textual similarity (STS), and summarization.

Each model gets a composite score (MTEB avg) based on its performance across these tasks. This score helps compare generalist embedders like NV-Embed-v2 or STELLA across use cases.
## 🏆 Why It Matters
You can explore the official MTEB leaderboard to see how models stack up.
† MPS notes (Apple GPU on Mac): PyTorch’s MPS backend accelerates Transformers on Apple silicon using Metal. Most HF/SBERT models “just work” on mps, but a few ops may fall back to CPU (set PYTORCH_ENABLE_MPS_FALLBACK=1). Conversion to Core ML is also possible (via Optimum exporters) if you want ANE/GPU deployment.
## Extra notes you’ll care about
Want this as a CSV (with fp32/fp16/int8 memory columns) or filtered to 384 / 768 / 1024 only? I can generate it directly.
### High-confidence pairs (evaluated in papers)

- Vec2Text (EMNLP’23) shows very strong reconstructions (e.g., ~92% exact match on 32-token snippets with sequence-beam search); GTR-T5 models output 768-d vectors. (ACL Anthology; Hugging Face)
- The same paper reports solid exact-match rates on ada-002; the legacy ada-002 uses 1536 dims. (ACL Anthology; OpenAI Community)
- ZSInvert evaluates on Contriever and recovers semantically faithful text (F1 > 50, cosine > 0.90) without per-encoder training; Contriever exports 768-d vectors. (ar5iv; Hugging Face)
- The same ZSInvert study includes GTR: zero-shot, high cosine similarity. (ar5iv)
- GTE is one of the evaluated encoders; large-en-v1.5 outputs 1024-d vectors. (ar5iv; Hugging Face)
- Also explicitly listed among ZSInvert’s encoders; the model card notes 1536-d embeddings. (ar5iv; Hugging Face)
- ALGEN trains a local decoder (a FLAN-T5 decoder) and aligns it to victim embedders; with ~1k alignment samples it gets strong ROUGE/cosine on T5 embeddings. Sentence-T5 uses 768-d vectors. (ACL Anthology; Hugging Face)
- Reported ROUGE-L ≈ 38 and cosine ≈ 0.89 with 1k leaked samples. (ACL Anthology)
- ROUGE-L ≈ 43, cosine ≈ 0.94 at 1k samples. (ACL Anthology)
- ROUGE-L ≈ 40, cosine ≈ 0.92 at 1k samples. (ACL Anthology)
- ROUGE-L ≈ 41 and cosine ≈ 0.93 at 1k samples. (ACL Anthology)
- ROUGE-L ≈ 41 and cosine ≈ 0.91 at 1k samples; text-embedding-3-large uses 3072 dims. (ACL Anthology; OpenAI Platform)
- GEIA reconstructs ordered sequences across the SBERT family (the paper evaluates SBERT/SimCSE/ST5/MPNet). (ACL Anthology)
- The same GEIA paper shows good lexical overlap (ROUGE-1 ≈ 0.59–0.72; BLEU-1 ≈ 0.35–0.46 across victims). SimCSE-RoBERTa exports 768-d embeddings. (ACL Anthology; Hugging Face)
- MPNet is one of GEIA’s evaluated victims; all-mpnet-base-v2 is 768-d. (ACL Anthology; Hugging Face)
- Also explicitly evaluated as a victim model in GEIA. (ACL Anthology)
### Solid “works in practice” pairs (generalizable decoders, widely used encoders)

- ZSInvert is universal and tested across BERT/T5-style encoders; E5-large-v2 is a popular 1024-d BERT-style encoder, making it a good match in the same family. (ar5iv; Hugging Face)
- ALGEN’s method is encoder-agnostic once you align a small leaked set; encoders like Contriever are typical BERT-style and align well under ALGEN’s linear map (ALGEN framework + Contriever dim). (ACL Anthology; Hugging Face)
- ZSInvert is embedding-agnostic; STELLA 400M v5 offers multiple output dims (e.g., 768/1024), so you can pick a dimension that matches your store. (ar5iv; Hugging Face)
- GEIA generalizes across SBERT/SimCSE-style encoders; all-MiniLM-L6-v2 (384-d) is a compact SBERT model often used in the wild. (ACL Anthology; Hugging Face)
### Quick pairing tips
- Vec2Text → best when you can afford per-encoder training; it yields _exact_ matches on some encoders/datasets. ACL Anthology
- ZSInvert → best when you want _zero-shot_ support across many encoders with high semantic fidelity. ar5iv
- ALGEN → best when you can get ~1–1000 leaked (text,embedding) pairs for the victim and want strong results fast via linear alignment + a single local decoder. ACL Anthology
- GEIA → good general generative baseline across classic SBERT/SimCSE/T5/MPNet families. ACL Anthology
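The alignment step in the ALGEN tip above can be sketched in a few lines. This is a simplified orthogonal-Procrustes variant with synthetic vectors, not ALGEN's actual linear map or its FLAN-T5 decoder: given leaked (text, embedding) pairs, embed the same texts with your local encoder, then solve for the rotation carrying victim embeddings into the local decoder's space.

```python
import numpy as np

def align_procrustes(victim: np.ndarray, local: np.ndarray) -> np.ndarray:
    """Orthogonal Procrustes: rotation R minimizing ||victim @ R - local||_F.
    Assumes both spaces share the same dimensionality (e.g., 768)."""
    U, _, Vt = np.linalg.svd(victim.T @ local)
    return U @ Vt

rng = np.random.default_rng(1)
local = rng.normal(size=(1000, 768))         # local embeddings of ~1k leaked texts
true_rot = np.linalg.qr(rng.normal(size=(768, 768)))[0]
victim = local @ true_rot.T                  # victim embeddings: a rotated copy

R = align_procrustes(victim, local)
aligned = victim @ R                         # now in the local decoder's space
print(np.allclose(aligned, local, atol=1e-6))
```

Real embedders differ by more than a rotation, so a general least-squares map (as ALGEN uses) typically fits better than the orthogonal constraint shown here.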