langchain-cockroachdb

LangChain integration for CockroachDB with native vector support

Supports Python 3.9+.


Overview

This package provides LangChain abstractions backed by CockroachDB, leveraging CockroachDB's native VECTOR type and C-SPANN (CockroachDB SPANN) indexes for vector search at scale in a distributed, horizontally scalable database.

Key Features

🎯 Vector Store

  • Native Vector Support: Uses CockroachDB's native VECTOR type (not pgvector)
  • C-SPANN Indexes: CockroachDB's vector index algorithm optimized for distributed systems
  • Advanced Filtering: Rich metadata filtering with $and/$or/$gt/$in operators
  • Hybrid Search: Combine full-text search (TSVECTOR) with vector similarity
  • Query-Time Tuning: Adjust beam size for accuracy/speed tradeoff
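
To make the filter operators concrete, here is a minimal sketch of how a filter built from $and/$or/$gt/$in could be evaluated against a document's metadata. It is a pure-Python illustration of the semantics, not the library's implementation (which is not shown in this README); the `matches` helper, the sample filter, and the metadata keys `year`, `source`, and `pinned` are all hypothetical.

```python
from typing import Any, Dict


def matches(metadata: Dict[str, Any], flt: Dict[str, Any]) -> bool:
    """Return True if `metadata` satisfies `flt` (illustrative semantics only)."""
    for key, cond in flt.items():
        if key == "$and":
            # Every sub-filter must match.
            if not all(matches(metadata, sub) for sub in cond):
                return False
        elif key == "$or":
            # At least one sub-filter must match.
            if not any(matches(metadata, sub) for sub in cond):
                return False
        elif isinstance(cond, dict):
            # Field-level operators such as {"year": {"$gt": 2020}}.
            value = metadata.get(key)
            for op, operand in cond.items():
                if op == "$gt":
                    if value is None or not value > operand:
                        return False
                elif op == "$in":
                    if value not in operand:
                        return False
                else:
                    raise ValueError(f"unsupported operator: {op}")
        else:
            # A bare value means equality.
            if metadata.get(key) != cond:
                return False
    return True


# Hypothetical filter: newer than 2020 AND (from docs/blog OR pinned).
example_filter = {
    "$and": [
        {"year": {"$gt": 2020}},
        {"$or": [{"source": {"$in": ["docs", "blog"]}}, {"pinned": True}]},
    ]
}
```

Calling `matches({"year": 2023, "source": "docs"}, example_filter)` checks one metadata dictionary against the nested filter; a real vector store would apply the same logic on the database side rather than in Python.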

🏗️ Reliability Features

  • Automatic Retry Logic: Handles 40001 serialization errors with exponential backoff
  • SERIALIZABLE Isolation: Built for CockroachDB's default isolation level
  • Multi-Tenancy: Index prefix columns for efficient tenant isolation
  • Connection Pooling: Configurable connection pools with health checks
  • Horizontal Scalability: Designed for distributed deployments
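
The retry behaviour described above can be sketched as a generic wrapper that re-runs a transaction when it fails with SQLSTATE 40001 and waits exponentially longer between attempts. This is an assumption-laden illustration, not the package's own code: `SerializationFailure`, `run_with_retry`, and all parameter names and defaults are hypothetical.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver error carrying SQLSTATE 40001."""

    sqlstate = "40001"


def run_with_retry(
    txn: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 0.1,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `txn`, retrying only on 40001 errors with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return txn()
        except Exception as exc:
            retryable = getattr(exc, "sqlstate", None) == "40001"
            if not retryable or attempt == max_retries:
                raise
            # Double the delay on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% jitter so concurrent clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
    raise RuntimeError("unreachable")
```

The `sleep` parameter is injectable so the backoff schedule can be checked in tests without waiting. Errors other than 40001 are re-raised immediately, since retrying them would not help.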

💬 Chat History

  • Persistent Storage: Store conversation history in CockroachDB
  • Session Management: Organize by session/thread ID
  • LangChain Integration: Drop-in replacement for other chat history implementations

🔄 Async & Sync APIs

  • Async-First: High-performance async operations for I/O concurrency
  • Sync Wrapper: Simple synchronous API for scripts and batch jobs
  • Connection Pooling: Efficient connection reuse across async operations
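
As a rough picture of the async-first design with a sync wrapper, the sketch below shows a synchronous class delegating to an async one, plus concurrent calls via `asyncio.gather`. The classes `AsyncStore` and `SyncStore` and the helper `add_concurrently` are hypothetical stand-ins that keep data in memory; they are not the package's classes, and the package may implement its sync API differently.

```python
import asyncio
from typing import List


class AsyncStore:
    """Hypothetical async-first component that keeps texts in memory."""

    def __init__(self) -> None:
        self._texts: List[str] = []

    async def aadd_texts(self, texts: List[str]) -> int:
        await asyncio.sleep(0)  # stands in for a network round trip
        self._texts.extend(texts)
        return len(self._texts)


class SyncStore:
    """Hypothetical sync wrapper: each call drives the coroutine to completion."""

    def __init__(self) -> None:
        self._inner = AsyncStore()

    def add_texts(self, texts: List[str]) -> int:
        # asyncio.run starts an event loop, runs the coroutine, and closes the loop.
        return asyncio.run(self._inner.aadd_texts(texts))


async def add_concurrently(store: AsyncStore, batches: List[List[str]]) -> List[int]:
    """Issue several inserts at once so their I/O waits can overlap."""
    return await asyncio.gather(*(store.aadd_texts(b) for b in batches))
```

One caveat of this simplified design: `asyncio.run` cannot be called from code that is already running inside an event loop, so a wrapper like this suits scripts and batch jobs, while async applications would use the async API directly.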

Quick Example

import asyncio
from langchain_cockroachdb import AsyncCockroachDBVectorStore, CockroachDBEngine
from langchain_openai import OpenAIEmbeddings

async def main():
    # Initialize engine
    engine = CockroachDBEngine.from_connection_string(
        "cockroachdb://user:pass@host:26257/db?sslmode=verify-full"
    )

    # Create table
    await engine.ainit_vectorstore_table(
        table_name="documents",
        vector_dimension=1536,
    )

    # Initialize vector store
    vectorstore = AsyncCockroachDBVectorStore(
        engine=engine,
        embeddings=OpenAIEmbeddings(),
        collection_name="documents",
    )

    # Add documents
    await vectorstore.aadd_texts([
        "CockroachDB is a distributed SQL database",
        "LangChain makes it easy to build LLM applications",
    ])

    # Search
    results = await vectorstore.asimilarity_search(
        "Tell me about databases",
        k=2
    )

    print(results)
    await engine.aclose()

asyncio.run(main())

Why CockroachDB?

  • Distributed by Design: Scale horizontally across regions
  • Native Vector Support: First-class VECTOR type and C-SPANN indexes
  • SERIALIZABLE by Default: Transactions run at SERIALIZABLE isolation, the strongest SQL isolation level
  • Cloud Native: Deploy anywhere (AWS, GCP, Azure, on-prem)
  • PostgreSQL Compatible: Familiar SQL and the PostgreSQL wire protocol on a distributed database

Getting Started

Choose your path:

  • Quick Start: Get up and running in 5 minutes
  • Guides: Learn key concepts and patterns
  • API Reference: Detailed API documentation
  • Examples: Working code examples

Community & Support

Contributing

Contributions are welcome! This is an open-source project built for the community. See the Contributing Guide to get started.

License

Apache License 2.0 - see LICENSE for details.


Built with ❤️ for the CockroachDB and LangChain communities