Vector Store Adapters¶
VectorPin ships thin adapters for the major vector databases. Adapters do two things:
- Walk records: iterate the collection, yielding (id, vector, metadata, pin) tuples for verification.
- Attach pins: write a pin into the record's metadata in whichever shape the backend prefers.
The adapter protocol lives at src/vectorpin/adapters/base.py and is intentionally thin. Community contributions for new backends are welcome.
Status¶
| Backend | Status | Install | Notes |
|---|---|---|---|
| LanceDB (default) | Alpha | pip install 'vectorpin[default]' | Embedded, file-based, no daemon. Recommended. |
| Chroma | Alpha | pip install 'vectorpin[chroma]' | Both persistent and HTTP modes. |
| Qdrant | Alpha | pip install 'vectorpin[qdrant]' | Server-side payload filtering. |
| Pinecone | Alpha | pip install 'vectorpin[pinecone]' | Hosted only. |
| pgvector | Planned | — | |
| FAISS | Planned | — | Use LanceDBAdapter (embedded, has metadata column natively). |
All adapters present the same iter_records() / attach_pin() interface. The backend differences are limited to where the pin physically lives in the underlying record.
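Because every adapter exposes the same iteration interface, audit code can be written once and pointed at any backend. The sketch below is a minimal illustration, not part of VectorPin: `RecordSource`, `FakeRecord`, `FakeAdapter`, and `pin_coverage` are hypothetical names chosen so the example runs without a database. Only `iter_records()` and the `pin` attribute come from the interface described above.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Iterator, Optional, Protocol


class RecordSource(Protocol):
    """Hypothetical structural type: anything with iter_records()."""

    def iter_records(self) -> Iterator[Any]: ...


def pin_coverage(adapter: RecordSource) -> dict:
    """Count pinned and unpinned records without knowing the backend."""
    pinned = unpinned = 0
    for record in adapter.iter_records():
        if record.pin is None:
            unpinned += 1
        else:
            pinned += 1
    return {"pinned": pinned, "unpinned": unpinned}


# Stand-ins for a real adapter, for illustration only.
@dataclass
class FakeRecord:
    id: str
    pin: Optional[str] = None


class FakeAdapter:
    def __init__(self, records: Iterable[FakeRecord]) -> None:
        self._records = list(records)

    def iter_records(self) -> Iterator[FakeRecord]:
        return iter(self._records)


report = pin_coverage(FakeAdapter([FakeRecord("a", pin="{}"), FakeRecord("b")]))
print(report)
```

With a real adapter, the same `pin_coverage` call should work unchanged, since it touches nothing backend-specific.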
Storage Convention¶
By convention, pins are stored under the metadata key vectorpin. Specifically:
| Backend | Pin lives at |
|---|---|
| LanceDB | A typed schema column literally named vectorpin (string-valued, holding the pin JSON). |
| Chroma | The metadata dict, under key vectorpin. |
| Qdrant | The payload dict, under key vectorpin. |
| Pinecone | The metadata dict, under key vectorpin. |
Backends without free-form metadata fields are out of scope — provenance must travel with the data, not in a sidecar.
LanceDB (default)¶
LanceDB is the recommended default: embedded, file-based, no daemon, with a typed schema column that holds the Pin natively. It matches the Symbiont runtime's default vector backend.
Pin a corpus¶
```python
from vectorpin import Signer
from vectorpin.adapters import LanceDBAdapter

adapter = LanceDBAdapter.connect("./data/vector_db", "rag-corpus")
signer = Signer.generate(key_id="prod-2026-05")

# Sign each record's source text and vector, then store the pin with the record.
for record in adapter.iter_records():
    pin = signer.pin(
        source=record.metadata["text"],
        model="text-embedding-3-large",
        vector=record.vector,
    )
    adapter.attach_pin(record.id, pin)
```
Verify a corpus¶
```python
from vectorpin import Verifier
from vectorpin.adapters import LanceDBAdapter

adapter = LanceDBAdapter.connect("./data/vector_db", "rag-corpus")

# public_key_bytes: the public key for "prod-2026-05", loaded from
# wherever you distribute keys (not shown here).
verifier = Verifier({"prod-2026-05": public_key_bytes})

failed = 0
for record in adapter.iter_records():
    if record.pin is None:
        continue  # unpinned records are skipped, not counted as failures
    result = verifier.verify(
        record.pin,
        source=record.metadata["text"],
        vector=record.vector,
    )
    if not result.ok:
        print(f"FAIL {record.id} [{result.error.value}] {result.detail}")
        failed += 1

assert failed == 0, f"{failed} records failed verification"
```
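For large corpora, a tally by failure kind is often easier to act on than one line per record. The sketch below assumes only what the loop above already uses: a result with `.ok` and `.error.value`. `FakeError`, `FakeResult`, and `summarize_failures` are hypothetical names, and the error strings are placeholders rather than VectorPin's actual error codes.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class FakeError:
    value: str  # placeholder for the error enum's value


@dataclass
class FakeResult:
    ok: bool
    error: Optional[FakeError] = None


def summarize_failures(results: Iterable[Tuple[str, FakeResult]]) -> Counter:
    """Tally failed verifications by error value."""
    tally: Counter = Counter()
    for _record_id, result in results:
        if not result.ok:
            tally[result.error.value] += 1
    return tally


results = [
    ("a", FakeResult(ok=True)),
    ("b", FakeResult(ok=False, error=FakeError("example_error_1"))),
    ("c", FakeResult(ok=False, error=FakeError("example_error_1"))),
    ("d", FakeResult(ok=False, error=FakeError("example_error_2"))),
]
summary = summarize_failures(results)
print(dict(summary))
```

In a real audit, the `(record.id, result)` pairs would come from the verification loop instead of a hand-built list.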
Connection options¶
LanceDBAdapter.connect accepts a URI (directory path, s3://, gs://, or LanceDB Cloud connection string), a table name, and optional column overrides:
```python
adapter = LanceDBAdapter.connect(
    uri="s3://my-bucket/vector_db",
    table_name="rag-corpus",
    id_column="id",          # default: "id"
    vector_column="vector",  # default: "vector"
)
```
Symbiont schema¶
For Symbiont deployments: Symbiont's source text lives in the content column. Symbiont's column literally named source is upstream provenance (a URL), not VectorPin's source argument. Pass source=record.metadata["content"] when calling signer.pin. See tests/test_adapter_lancedb_symbiont.py for an end-to-end example.
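The column mix-up described above is easy to guard against with a small helper. This is a sketch under the assumption that a Symbiont row reaches you as a plain dict with `content` and `source` keys; `symbiont_source_text` is a hypothetical name, not part of VectorPin or Symbiont.

```python
def symbiont_source_text(metadata: dict) -> str:
    """Return the text to sign for a Symbiont-shaped record.

    Reads the "content" column. The "source" column is a provenance URL
    and must not be passed as VectorPin's source argument.
    """
    try:
        return metadata["content"]
    except KeyError:
        # Fail loudly instead of silently signing the URL.
        raise KeyError("record has no 'content' column") from None


row = {"content": "The quick brown fox.", "source": "https://example.com/doc"}
text = symbiont_source_text(row)
# text would then be passed as signer.pin(source=text, ...)
```

Raising on a missing `content` column is a deliberate choice here: signing the URL by mistake would produce a pin that verifies against the wrong text.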
Chroma¶
Chroma offers both an embedded persistent client and a remote HTTP client. The adapter supports both.
Persistent (embedded)¶
```python
from vectorpin.adapters import ChromaAdapter

adapter = ChromaAdapter.connect_persistent("./chroma_db", "my-rag")
```
HTTP¶
```python
adapter = ChromaAdapter.connect_http(
    host="chroma.internal",
    port=8000,
    collection_name="my-rag",
    ssl=False,
)
```
Pinning¶
```python
# signer: a Signer created as in the LanceDB example.
for record in adapter.iter_records():
    pin = signer.pin(
        source=record.metadata["text"],
        model="text-embedding-3-large",
        vector=record.vector,
    )
    adapter.attach_pin(record.id, pin)
```
The pin is stored as a JSON string under metadata["vectorpin"]. Chroma metadata values must be scalars (dict[str, str | int | float | bool]), so the pin is serialized to a single JSON string, which round-trips without loss.
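The round trip can be illustrated with the standard library alone. In this sketch the pin fields are placeholders invented for the example; they are not VectorPin's wire format (see the Pin Protocol page for that).

```python
import json

# Placeholder pin fields for illustration; not VectorPin's wire format.
pin_fields = {
    "key_id": "prod-2026-05",
    "model": "text-embedding-3-large",
    "sig": "base64-placeholder",
}

# Serialize to one string so it fits a scalar-only metadata dict.
metadata = {"text": "hello", "vectorpin": json.dumps(pin_fields, sort_keys=True)}

# Every value is a scalar, as Chroma requires.
assert all(isinstance(v, (str, int, float, bool)) for v in metadata.values())

# Parsing the string recovers the original structure.
restored = json.loads(metadata["vectorpin"])
print(restored == pin_fields)
```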
Qdrant¶
Qdrant supports both local and Qdrant Cloud deployments. Pins are written into the payload dict.
```python
from vectorpin.adapters import QdrantAdapter

adapter = QdrantAdapter.connect(
    url="http://localhost:6333",
    collection_name="my-rag",
    api_key=None,  # set for Qdrant Cloud
)

# signer: a Signer created as in the LanceDB example.
for record in adapter.iter_records(batch_size=256):
    pin = signer.pin(
        source=record.metadata["text"],
        model="text-embedding-3-large",
        vector=record.vector,
    )
    adapter.attach_pin(record.id, pin)
```
Qdrant's payload filtering means you can query for unpinned records server-side:
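```python
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")
# Sketch; exact API depends on qdrant-client version. is_empty matches
# points whose "vectorpin" key is missing, null, or an empty list.
unpinned, next_offset = client.scroll(
    collection_name="my-rag",
    scroll_filter=models.Filter(
        must=[models.IsEmptyCondition(is_empty=models.PayloadField(key="vectorpin"))]
    ),
)
```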
Pinecone¶
Pinecone is hosted-only. Pins are stored under metadata["vectorpin"] as a JSON string.
```python
from vectorpin.adapters import PineconeAdapter

adapter = PineconeAdapter.connect(
    api_key="...",
    index_name="my-rag",
)

# signer: a Signer created as in the LanceDB example.
for record in adapter.iter_records():
    pin = signer.pin(
        source=record.metadata["text"],
        model="text-embedding-3-large",
        vector=record.vector,
    )
    adapter.attach_pin(record.id, pin)
```
Pinecone limits metadata to 40 KiB per record. A VectorPin pin is typically well under 1 KiB, so the pin alone will not approach the limit. If you also store large payloads in metadata, check the total size before upserting.
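A pre-flight size check can catch oversized records before they reach the API. The sketch below is an approximation: it measures metadata as compact UTF-8 JSON, which is an assumption about how the size is counted, and `metadata_size_bytes` and `fits_in_pinecone` are hypothetical helpers. The 40 KiB constant is the figure quoted above.

```python
import json

# 40 KiB, the per-record figure quoted above.
PINECONE_METADATA_LIMIT_BYTES = 40 * 1024


def metadata_size_bytes(metadata: dict) -> int:
    """Approximate metadata size as compact UTF-8 JSON."""
    encoded = json.dumps(metadata, separators=(",", ":"), ensure_ascii=False)
    return len(encoded.encode("utf-8"))


def fits_in_pinecone(metadata: dict, headroom_bytes: int = 1024) -> bool:
    """True if metadata plus headroom for a pin stays within the limit."""
    return metadata_size_bytes(metadata) + headroom_bytes <= PINECONE_METADATA_LIMIT_BYTES


small = {"text": "hello"}
large = {"text": "x" * 50_000}
print(fits_in_pinecone(small), fits_in_pinecone(large))
```

The default headroom of 1 KiB reserves space for the pin itself, based on the typical pin size mentioned above.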
Choosing a Backend¶
| If you... | Use |
|---|---|
| Just want pinning without standing up a server | LanceDB (default) |
| Already run Chroma | Chroma |
| Need server-side payload filtering | Qdrant |
| Are on Pinecone today | Pinecone |
| Run Symbiont | LanceDB (matches Symbiont's default backend) |
LanceDB also gives you a typed vectorpin column, which is more grep-able than a JSON blob in a metadata dict — useful when reasoning about partial backfills.
Writing a New Adapter¶
The adapter protocol is two methods plus a record dataclass. Sketch:
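```python
from __future__ import annotations  # allows "Pin | None" on Python < 3.10

from dataclasses import dataclass
from typing import Any, Iterator

import numpy as np

from vectorpin import Pin


@dataclass
class PinnedRecord:
    id: str
    vector: np.ndarray
    metadata: dict
    pin: Pin | None


class MyBackendAdapter:
    @classmethod
    def connect(cls, *args: Any, **kwargs: Any) -> "MyBackendAdapter":
        """Open a connection; parameters are backend-specific."""
        ...

    def iter_records(self, batch_size: int = 256) -> Iterator[PinnedRecord]:
        """Yield every record in the collection."""
        ...

    def attach_pin(self, record_id: str, pin: Pin) -> None:
        """Write the pin into the record's metadata."""
        ...
```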
See src/vectorpin/adapters/base.py for the canonical protocol and the existing adapters for working examples.
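To make the shape concrete, here is a toy in-memory backend that follows the same two-method pattern. It is a sketch for illustration, not one of the shipped adapters: `InMemoryAdapter` and its `add` method are hypothetical, `Pin` is replaced by a plain string so the example runs without VectorPin installed, and vectors are plain lists instead of NumPy arrays. Stripping the `vectorpin` key out of the yielded metadata is a choice made for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Optional

# Placeholder for vectorpin.Pin so the sketch runs standalone.
Pin = str


@dataclass
class PinnedRecord:
    id: str
    vector: List[float]
    metadata: dict
    pin: Optional[Pin]


class InMemoryAdapter:
    """Toy backend: rows in a dict, pin under metadata["vectorpin"]."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}

    def add(self, record_id: str, vector: List[float], metadata: dict) -> None:
        self._rows[record_id] = {"vector": list(vector), "metadata": dict(metadata)}

    def iter_records(self, batch_size: int = 256) -> Iterator[PinnedRecord]:
        # batch_size is accepted for interface parity; a dict needs no paging.
        for record_id, row in list(self._rows.items()):
            meta = dict(row["metadata"])
            pin = meta.pop("vectorpin", None)
            yield PinnedRecord(record_id, row["vector"], meta, pin)

    def attach_pin(self, record_id: str, pin: Pin) -> None:
        self._rows[record_id]["metadata"]["vectorpin"] = pin


adapter = InMemoryAdapter()
adapter.add("a", [0.1, 0.2], {"text": "hello"})
adapter.add("b", [0.3, 0.4], {"text": "world"})
adapter.attach_pin("a", '{"placeholder": true}')

# Records still waiting for a pin, e.g. to resume a partial backfill.
unpinned = [r.id for r in adapter.iter_records() if r.pin is None]
print(unpinned)
```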
See Also¶
- CLI Guide — Command-line equivalents to programmatic auditing
- Getting Started — End-to-end pinning + verification walkthrough
- Pin Protocol — Wire format and verification order