pinecone-vector-db-mcp-server

by zx8086

MCP Pinecone Vector Database Server

This project implements a Model Context Protocol (MCP) server for reading vectorized data from, and writing it to, a Pinecone vector database. It is designed to work with both RAG-processed PDF data and Confluence data.

Features

  • Search for similar documents using text queries
  • Add new vectors to the database with custom metadata
  • Process and upload Confluence data in batch
  • Retrieve database statistics
  • Delete vectors by ID
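
To make the similarity search feature concrete, here is a minimal in-memory sketch of what a top-K lookup does conceptually. It assumes cosine similarity; the metric actually configured on the Pinecone index is not stated in this README, and the real server delegates the search to Pinecone rather than computing it locally. The names `StoredVector`, `cosineSimilarity`, and `topK` are illustrative, not part of this project.

```typescript
// Illustrative only: the real server delegates search to Pinecone.
interface StoredVector {
  id: string;
  values: number[];
}

interface ScoredMatch {
  id: string;
  score: number;
}

// Cosine similarity of two equal-length vectors; returns 0 if either has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored vectors by similarity to the query and keep the best k.
function topK(query: number[], stored: StoredVector[], k: number): ScoredMatch[] {
  return stored
    .map((v) => ({ id: v.id, score: cosineSimilarity(query, v.values) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

In the actual workflow the query vector would come from an embedding of the text query, which is why an OpenAI API key is listed as a prerequisite.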

Prerequisites

  • Node.js 18+ or Bun runtime
  • Pinecone API key
  • OpenAI API key (for generating embeddings)

Installation

  1. Clone this repository:

    git clone https://github.com/yourusername/mcp-pinecone.git
    cd mcp-pinecone
    
  2. Install dependencies:

    npm install
    # or with Bun
    bun install
    
  3. Create a .env file based on the provided .env.example:

    cp .env.example .env
    
  4. Edit the .env file with your API keys:

    PINECONE_API_KEY=your-pinecone-api-key
    OPENAI_API_KEY=your-openai-api-key
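
A missing key tends to surface later as a confusing API error, so it can help to check both variables at startup. The following is a minimal sketch of such a check; `loadApiKeys` is a hypothetical helper, not something this project is known to export, and it takes the environment as a parameter so it stays easy to test.

```typescript
// Hypothetical helper: verifies that the keys from the .env file are present.
const REQUIRED_KEYS = ["PINECONE_API_KEY", "OPENAI_API_KEY"] as const;

type RequiredKey = (typeof REQUIRED_KEYS)[number];
type ApiKeys = Record<RequiredKey, string>;

function loadApiKeys(env: Record<string, string | undefined>): ApiKeys {
  // Treat unset and whitespace-only values as missing.
  const missing = REQUIRED_KEYS.filter((key) => !env[key]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    PINECONE_API_KEY: env["PINECONE_API_KEY"] as string,
    OPENAI_API_KEY: env["OPENAI_API_KEY"] as string,
  };
}
```

In a running process you would typically pass `process.env` once the `.env` file has been loaded. Bun loads `.env` automatically; with Node.js a loader such as `dotenv` is a common choice.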
    

Usage

Building the Project

npm run build
# or with Bun
bun run build

Running the MCP Server

npm start
# or with Bun
bun start

Running in Development Mode

npm run dev
# or with Bun
bun dev

Processing Confluence Data

To process Confluence data from a JSON file:

npm run process-confluence -- path/to/vector_content.json [namespace]
# or with Bun
bun run process-confluence path/to/vector_content.json [namespace]

If the namespace is not specified, it will default to "capella-document-search".
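
The argument handling described above can be sketched as follows. Only the default namespace value comes from this README; the function name, the returned shape, and the usage message are assumptions made for illustration.

```typescript
// Default taken from the text above; everything else here is illustrative.
const DEFAULT_NAMESPACE = "capella-document-search";

interface ProcessConfluenceArgs {
  filePath: string;
  namespace: string;
}

// `args` are the positional arguments after the script name,
// e.g. process.argv.slice(2) in Node.js or Bun.
function parseProcessConfluenceArgs(args: string[]): ProcessConfluenceArgs {
  const [filePath, namespace] = args;
  if (!filePath) {
    throw new Error("Usage: process-confluence <path/to/file.json> [namespace]");
  }
  // Fall back to the default when the namespace is absent or blank.
  return { filePath, namespace: namespace?.trim() || DEFAULT_NAMESPACE };
}
```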

MCP Integration

This server implements the Model Context Protocol, making it compatible with Claude and other MCP-enabled systems. Here's a sample connection snippet:

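import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Client name and version are arbitrary identifiers for this example.
const client = new Client({ name: "example-client", version: "1.0.0" });

// Assumption: the server communicates over stdio.
const transport = new StdioClientTransport({
  command: "node",
  args: ["path/to/server-entry.js"], // placeholder: point this at the built server
});
await client.connect(transport);

// Search for similar documents
const searchResult = await client.callTool({
  name: "search-vectors",
  arguments: { query: "Digital Media Assets", topK: 5 },
});

console.log(searchResult);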
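
MCP tool results carry their payload in a `content` array, so logging the raw result is rarely what a caller wants. Below is a small sketch for pulling out the text parts. The local types mirror only the text portion of the MCP result shape, and the exact payload this server returns from `search-vectors` is not documented here, so treat the helper as a starting point.

```typescript
// Minimal local types mirroring the MCP tool-result shape (text items only).
interface ContentItem {
  type: string;
  text?: string;
}

interface ToolResult {
  content: ContentItem[];
  isError?: boolean;
}

// Join all text items; throw if the server flagged the call as failed.
function extractText(result: ToolResult): string {
  const text = result.content
    .filter((item) => item.type === "text" && typeof item.text === "string")
    .map((item) => item.text as string)
    .join("\n");
  if (result.isError) {
    throw new Error(`Tool call failed: ${text}`);
  }
  return text;
}
```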

Available Tools

The server provides the following tools:

  1. search-vectors - Search for similar documents
  2. add-vector - Add a single document to the database
  3. process-confluence - Process and upload Confluence data
  4. get-stats - Get database statistics
  5. delete-vectors - Delete vectors by ID
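
On the client side, the tool list above can be captured as a typed constant so that typos in tool names are caught early. This is a sketch only: the tool names come from the list, while the argument shape for `search-vectors` is inferred from the sample call (`query`, `topK`) and may not match the server's actual input schema.

```typescript
// Tool names copied from the list above.
const TOOL_NAMES = [
  "search-vectors",
  "add-vector",
  "process-confluence",
  "get-stats",
  "delete-vectors",
] as const;

type ToolName = (typeof TOOL_NAMES)[number];

// Type guard: narrows an arbitrary string to a known tool name.
function isToolName(value: string): value is ToolName {
  return TOOL_NAMES.some((name) => name === value);
}

// Assumed argument shape for search-vectors (query text plus result count).
interface SearchVectorsArgs {
  query: string;
  topK: number;
}

// Reject obviously invalid arguments before they reach the server.
function buildSearchArgs(query: string, topK: number): SearchVectorsArgs {
  const trimmed = query.trim();
  if (trimmed.length === 0) {
    throw new Error("query must not be empty");
  }
  // `% 1 !== 0` also rejects NaN and Infinity.
  if (topK % 1 !== 0 || topK < 1) {
    throw new Error("topK must be a positive integer");
  }
  return { query: trimmed, topK };
}
```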

Database Structure

The server is configured to work with the following Pinecone setup:

  • Host: platform-engineering-rag-sdpgq06.svc.aped-4627-b74a.pinecone.io
  • Index: platform-engineering-rag
  • Default namespace: capella-document-search

Metadata Schema

The server maintains a consistent metadata schema to ensure compatibility between different data sources:

PDF Documents (Existing)

ID: [document-id]-page[page-number]-chunk[chunk-index]
author: [author-name]
chunkIndex: [index]
chunkSize: [size]
collection: "documentation"
contentType: "pdf"
creationDate: [date]
...
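
The PDF ID pattern above can be built and parsed with a pair of small helpers. This sketch assumes page numbers and chunk indexes are written as plain integers without zero padding, which the schema does not specify; the function and type names are hypothetical.

```typescript
interface PdfChunkRef {
  documentId: string;
  pageNumber: number;
  chunkIndex: number;
}

// Build an ID of the form [document-id]-page[page-number]-chunk[chunk-index].
function buildPdfChunkId(ref: PdfChunkRef): string {
  return `${ref.documentId}-page${ref.pageNumber}-chunk${ref.chunkIndex}`;
}

// Parse an ID back into its parts; returns null if it does not match the pattern.
// The greedy first group lets document IDs contain hyphens themselves.
function parsePdfChunkId(id: string): PdfChunkRef | null {
  const match = /^(.+)-page(\d+)-chunk(\d+)$/.exec(id);
  if (!match) return null;
  return {
    documentId: match[1],
    pageNumber: Number(match[2]),
    chunkIndex: Number(match[3]),
  };
}
```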

Confluence Documents (New)

ID: confluence-[page-id]-[item-id]
title: [title]
pageId: [page-id]
spaceKey: [space-key]
type: [type]
content: [text-content]
author: [author-name]
source: "confluence"
collection: "documentation"
scope: "media_assets"
...
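
As a rough illustration of the Confluence schema, the sketch below maps one input item to a record with the listed ID format and metadata, then splits records into batches for upload. The field names follow the schema above; the input type, the function names, and the batch size are assumptions, and the trailing "..." in the schema means the real metadata may contain more fields than shown here.

```typescript
// Input fields mirror the schema above; the real input may carry more.
interface ConfluenceItem {
  pageId: string;
  itemId: string;
  title: string;
  spaceKey: string;
  type: string;
  content: string;
  author: string;
}

interface ConfluenceRecord {
  id: string;
  metadata: {
    title: string;
    pageId: string;
    spaceKey: string;
    type: string;
    content: string;
    author: string;
    source: "confluence";
    collection: "documentation";
    scope: string;
  };
}

// Build the ID as confluence-[page-id]-[item-id] and attach the fixed fields.
function toConfluenceRecord(item: ConfluenceItem, scope = "media_assets"): ConfluenceRecord {
  return {
    id: `confluence-${item.pageId}-${item.itemId}`,
    metadata: {
      title: item.title,
      pageId: item.pageId,
      spaceKey: item.spaceKey,
      type: item.type,
      content: item.content,
      author: item.author,
      source: "confluence",
      collection: "documentation",
      scope,
    },
  };
}

// Split records into fixed-size batches; the size is the caller's choice.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (batchSize % 1 !== 0 || batchSize < 1) {
    throw new Error("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```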

Contributing

  1. Fork the repository
  2. Create your feature branch: git checkout -b feature/my-new-feature
  3. Commit your changes: git commit -am 'Add some feature'
  4. Push to the branch: git push origin feature/my-new-feature
  5. Submit a pull request

License

MIT
