SKILL.md

name: pinecone
description: Managed vector database for production AI applications. Fully managed, auto-scaling, with hybrid search (dense + sparse), metadata filtering, and namespaces. Low latency (<100ms p95). Use for production RAG, recommendation systems, or semantic search at scale. Best for serverless, managed infrastructure.
version: 1.0.0
author: Orchestra Research
license: MIT
tags: [RAG, Pinecone, Vector Database, Managed Service, Serverless, Hybrid Search, Production, Auto-Scaling, Low Latency, Recommendations]
dependencies: [pinecone-client]

Pinecone - Managed Vector Database

The vector database for production AI applications.

When to use Pinecone

Use when:

  • Need managed, serverless vector database
  • Production RAG applications
  • Auto-scaling required
  • Low latency critical (<100ms)
  • Don't want to manage infrastructure
  • Need hybrid search (dense + sparse vectors)

Key characteristics:

  • Fully managed SaaS
  • Auto-scales to billions of vectors
  • p95 latency <100ms
  • 99.9% uptime SLA

Consider these alternatives instead:

  • Chroma: Self-hosted, open-source
  • FAISS: Offline, pure similarity search
  • Weaviate: Self-hosted with more features

Quick start

Installation

pip install pinecone-client  # newer releases of the Python SDK are also published as "pinecone"

Basic usage

from pinecone import Pinecone, ServerlessSpec

# Initialize
pc = Pinecone(api_key="your-api-key")

# Create index
pc.create_index(
    name="my-index",
    dimension=1536,   # Must match embedding dimension
    metric="cosine",  # or "euclidean", "dotproduct"
    spec=ServerlessSpec(cloud="aws", region="us-east-1")
)

# Connect to index
index = pc.Index("my-index")

# Upsert vectors
index.upsert(vectors=[
    {"id": "vec1", "values": [0.1, 0.2, ...], "metadata": {"category": "A"}},
    {"id": "vec2", "values": [0.3, 0.4, ...], "metadata": {"category": "B"}}
])

# Query
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    include_metadata=True
)
print(results["matches"])
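The placeholder vectors above stand in for real embeddings. A minimal sketch of upserting an actual embedding, assuming an OpenAI API key is set in the environment and that text-embedding-3-small (1536 dimensions, matching the index above) is the embedding model:

from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI()              # reads OPENAI_API_KEY from the environment
pc = Pinecone(api_key="your-api-key")
index = pc.Index("my-index")

text = "Pinecone is a managed vector database."

# Embed the text (text-embedding-3-small returns 1536-dimensional vectors)
embedding = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=text
).data[0].embedding

# Upsert the real embedding, storing the source text as metadata
index.upsert(vectors=[
    {"id": "doc-1", "values": embedding, "metadata": {"text": text}}
])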

Core operations

Create index

# Serverless (recommended)
pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",  # or "gcp", "azure"
        region="us-east-1"
    )
)

# Pod-based (for consistent performance)
from pinecone import PodSpec

pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",
    spec=PodSpec(
        environment="us-east1-gcp",
        pod_type="p1.x1"
    )
)

Upsert vectors

# Single upsert
index.upsert(vectors=[
    {
        "id": "doc1",
        "values": [0.1, 0.2, ...],  # 1536 dimensions
        "metadata": {
            "text": "Document content",
            "category": "tutorial",
            "timestamp": "2025-01-01"
        }
    }
])

# Batch upsert (recommended)
vectors = [
    {"id": f"vec{i}", "values": embedding, "metadata": metadata}
    for i, (embedding, metadata) in enumerate(zip(embeddings, metadatas))
]
index.upsert(vectors=vectors, batch_size=100)

Query vectors

# Basic query
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=10,
    include_metadata=True,
    include_values=False
)

# With metadata filtering
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    filter={"category": {"$eq": "tutorial"}}
)

# Namespace query
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    namespace="production"
)

# Access results
for match in results["matches"]:
    print(f"ID: {match['id']}")
    print(f"Score: {match['score']}")
    print(f"Metadata: {match['metadata']}")

Metadata filtering

# Exact match
filter = {"category": "tutorial"}

# Comparison ($gt, $gte, $lt, $lte, $ne)
filter = {"price": {"$gte": 100}}

# Logical operators (also: $or)
filter = {
    "$and": [
        {"category": "tutorial"},
        {"difficulty": {"$lte": 3}}
    ]
}

# In operator
filter = {"tags": {"$in": ["python", "ml"]}}

Namespaces

# Partition data by namespace
index.upsert(
    vectors=[{"id": "vec1", "values": [...]}],
    namespace="user-123"
)

# Query specific namespace
results = index.query(
    vector=[...],
    namespace="user-123",
    top_k=5
)

# List namespaces
stats = index.describe_index_stats()
print(stats['namespaces'])

Hybrid search (dense + sparse)

# Upsert with sparse vectors
index.upsert(vectors=[
    {
        "id": "doc1",
        "values": [0.1, 0.2, ...],  # Dense vector
        "sparse_values": {
            "indices": [10, 45, 123],  # Token IDs
            "values": [0.5, 0.3, 0.8]  # TF-IDF scores
        },
        "metadata": {"text": "..."}
    }
])

# Hybrid query
results = index.query(
    vector=[0.1, 0.2, ...],
    sparse_vector={
        "indices": [10, 45],
        "values": [0.5, 0.3]
    },
    top_k=5
)
# Note: query() has no alpha parameter; dense/sparse weighting is applied
# client-side by scaling the dense and sparse values before querying
# (alpha=0 -> sparse only, alpha=1 -> dense only, alpha=0.5 -> balanced).
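Sparse indices and values are usually produced by an encoder rather than by hand. A short sketch using the BM25 encoder from the separate pinecone-text package (an assumption; any encoder that emits {"indices": [...], "values": [...]} dictionaries works):

# pip install pinecone-text
from pinecone_text.sparse import BM25Encoder

corpus = [
    "Pinecone is a managed vector database.",
    "Hybrid search combines dense and sparse retrieval.",
]

# Fit BM25 term statistics on the corpus, then encode documents and queries
bm25 = BM25Encoder()
bm25.fit(corpus)

doc_sparse = bm25.encode_documents(corpus)                 # list of {"indices": [...], "values": [...]}
query_sparse = bm25.encode_queries("managed vector database")

# Use doc_sparse[i] as "sparse_values" when upserting, and query_sparse
# as the sparse_vector argument in index.query().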

LangChain integration

from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings

# Create vector store
vectorstore = PineconeVectorStore.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    index_name="my-index"
)

# Query
results = vectorstore.similarity_search("query", k=5)

# With metadata filter
results = vectorstore.similarity_search(
    "query",
    k=5,
    filter={"category": "tutorial"}
)

# As retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 10})

LlamaIndex integration

from pinecone import Pinecone
from llama_index.vector_stores.pinecone import PineconeVectorStore

# Connect to Pinecone
pc = Pinecone(api_key="your-key")
pinecone_index = pc.Index("my-index")

# Create vector store
vector_store = PineconeVectorStore(pinecone_index=pinecone_index)

# Use in LlamaIndex
from llama_index.core import StorageContext, VectorStoreIndex

storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
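To query the resulting index, a LlamaIndex query engine can be layered on top; a brief usage sketch (responses depend on the LLM configured in your LlamaIndex settings):

# Retrieve the top 5 most similar chunks and synthesize an answer
query_engine = index.as_query_engine(similarity_top_k=5)
response = query_engine.query("What is Pinecone used for?")
print(response)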

Index management

# List indices
indexes = pc.list_indexes()

# Describe index
index_info = pc.describe_index("my-index")
print(index_info)

# Get index stats
stats = index.describe_index_stats()
print(f"Total vectors: {stats['total_vector_count']}")
print(f"Namespaces: {stats['namespaces']}")

# Delete index
pc.delete_index("my-index")

Delete vectors

# Delete by ID
index.delete(ids=["vec1", "vec2"])

# Delete by metadata filter (pod-based indexes)
index.delete(filter={"category": "old"})

# Delete all vectors in a namespace
index.delete(delete_all=True, namespace="test")

# Delete all vectors in the default namespace
index.delete(delete_all=True)
# To remove the index itself, use pc.delete_index("my-index")

Best practices

  1. Use serverless - Auto-scaling, cost-effective
  2. Batch upserts - More efficient (100-200 vectors per batch; see the sketch after this list)
  3. Add metadata - Enable filtering
  4. Use namespaces - Isolate data by user/tenant
  5. Monitor usage - Check Pinecone dashboard
  6. Optimize filters - Index frequently filtered fields
  7. Test with free tier - 1 index, 100K vectors free
  8. Use hybrid search - Better retrieval quality for keyword-sensitive queries
  9. Set appropriate dimensions - Match embedding model
  10. Regular backups - Export important data
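
A minimal sketch combining several of these practices (batched upserts, a dimension check, metadata, and a per-tenant namespace); the helper name, batch size, and tenant values are illustrative, not part of the Pinecone API:

EXPECTED_DIM = 1536  # must match the index dimension

def upsert_in_batches(index, vectors, namespace, batch_size=100):
    """Hypothetical helper: validate dimensions, then upsert in batches."""
    for vec in vectors:
        if len(vec["values"]) != EXPECTED_DIM:
            raise ValueError(f"{vec['id']} has {len(vec['values'])} dims, expected {EXPECTED_DIM}")
    for start in range(0, len(vectors), batch_size):
        index.upsert(vectors=vectors[start:start + batch_size], namespace=namespace)

# Usage: isolate each tenant's data in its own namespace
vectors = [
    {"id": f"doc-{i}", "values": emb, "metadata": {"tenant": "acme", "category": "tutorial"}}
    for i, emb in enumerate(embeddings)
]
upsert_in_batches(index, vectors, namespace="tenant-acme")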

Performance

Operation       | Latency   | Notes
Upsert          | ~50-100ms | Per batch
Query (p50)     | ~50ms     | Depends on index size
Query (p95)     | ~100ms    | SLA target
Metadata filter | +10-20ms  | Additional overhead

Pricing (as of 2025)

Serverless:

  • $0.096 per million read units
  • $0.06 per million write units
  • $0.06 per GB storage/month
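
A rough back-of-envelope sketch of monthly storage cost at these rates, assuming float32 values (4 bytes per dimension) and ignoring metadata overhead and read/write units:

# Hypothetical workload: 1M vectors at 1536 dimensions
num_vectors = 1_000_000
dimension = 1536
bytes_per_value = 4  # float32

storage_gb = num_vectors * dimension * bytes_per_value / 1e9   # ~6.1 GB
storage_cost = storage_gb * 0.06                               # $0.06 per GB-month
print(f"~{storage_gb:.1f} GB stored, ~${storage_cost:.2f}/month for storage")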

Free tier:

  • 1 serverless index
  • 100K vectors (1536 dimensions)
  • Great for prototyping

Resources

GitHub Repository: davila7/claude-code-templates