Vector and Graph Storage

Papr Memory's hybrid approach combines vector similarity search with graph relationships to create a powerful, context-aware memory system. This page explains how these two paradigms work together behind the scenes while keeping the developer experience simple.

How Vector Embeddings Work

At the core of Papr Memory is a vector embedding system that transforms content into numerical representations:

"Meeting notes from our product planning session" → [0.123, -0.456, 0.789, ..., -0.321]

These high-dimensional vectors (typically 1536 dimensions) capture the semantic meaning of content, enabling powerful similarity search.

Vector search allows you to find relevant content based on meaning, not just keywords:

  • "Product roadmap planning" can match "Our Q3 feature timeline discussion"
  • "Customer feedback issues" can match "User complaints about the login page"
  • "Budget allocation for marketing" can match "Q2 spending plan for advertising campaigns"

Graph Structure

While vector search finds semantically similar content, graph relationships add essential context:

  • Temporal connections: What happened before or after?
  • Contextual connections: What belongs together?
  • Hierarchical relationships: What is part of what?
  • Reference relationships: What refers to what?

This creates a rich, interconnected knowledge graph of your memories.
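
Conceptually, you can picture the graph as memories (nodes) connected by typed relationships (edges). The sketch below is just an illustration of that idea, not Papr's internal data model:

# Toy illustration of a memory graph: nodes are memories, edges are typed relationships.
# This is a conceptual sketch, not Papr's internal storage format.
memories = {
    "mem_1": "Kickoff meeting notes for the mobile app redesign",
    "mem_2": "Follow-up: timeline agreed for the redesign",
    "mem_3": "Design spec for the new login screen",
}

edges = [
    ("mem_1", "mem_2", "temporal"),      # mem_2 happened after mem_1
    ("mem_2", "mem_3", "hierarchical"),  # mem_3 is part of the work planned in mem_2
    ("mem_3", "mem_1", "reference"),     # mem_3 refers back to the kickoff decisions
]

def neighbors(memory_id):
    """Return memories directly connected to the given memory."""
    return [(dst, kind) for src, dst, kind in edges if src == memory_id]

print(neighbors("mem_1"))  # [('mem_2', 'temporal')]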

How Papr Memory Uses Both

Papr Memory automatically combines these approaches:

  1. Vector search finds semantically relevant memories
  2. Graph traversal explores connections between memories
  3. Hybrid ranking merges results for the most relevant information

The best part? You don't need to manage any of this complexity. Papr handles it all for you.
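
If you're curious what hybrid ranking means in practice, here is a simplified sketch of the idea: blend each memory's vector-similarity score with a boost for being closely connected in the graph. The weights and formula are illustrative only, not Papr's actual ranking:

# Simplified sketch of hybrid ranking: blend a vector-similarity score with a
# graph-connectivity boost. The weights are illustrative, not Papr's real values.
def hybrid_score(vector_score, graph_hops, vector_weight=0.7, graph_weight=0.3):
    # Memories reached in fewer graph hops from a strong match get a larger boost.
    graph_boost = 1.0 / (1 + graph_hops) if graph_hops is not None else 0.0
    return vector_weight * vector_score + graph_weight * graph_boost

candidates = [
    {"id": "mem_1", "vector_score": 0.92, "graph_hops": None},  # found by vector search alone
    {"id": "mem_2", "vector_score": 0.55, "graph_hops": 1},     # one hop from a strong match
]

ranked = sorted(candidates,
                key=lambda c: hybrid_score(c["vector_score"], c["graph_hops"]),
                reverse=True)
print([c["id"] for c in ranked])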

Real-World Example: Multi-Hop Retrieval

Consider a scenario where you need to plan dinner for a guest. Your memory system contains several disconnected pieces of information:

Memory 1: "Pen is coming over next weekend."
          metadata: { eventId: "evt_001", guest: "Pen", date: "2025-05-03" }

Memory 2: "Pen loves eating pasta."
          metadata: { userId: "Pen", likes: ["pasta"] }

Memory 3: "Margherita Pizza, Caesar Salad, Spaghetti Carbonara, Tiramisu"
          metadata: { restaurantId: "R1", city: "Seattle", cuisine: "Italian" }

Challenge: "What to order for the weekend?"

Let's see how different search approaches handle this question:

A traditional vector search might find Memory 1 because it mentions "weekend," but it doesn't connect that event to Pen's food preferences. Even a more targeted query only gets partway there:

Query: "Pen is coming, what does she like?"
Result: "Pen loves eating pasta." (this finds the preference, but still fails to connect it to actual menu items)

Papr Memory's hybrid approach:

  1. Identifies that Pen is the weekend guest (from Memory 1)
  2. Discovers Pen's food preference for pasta (from Memory 2)
  3. Connects this preference to a specific menu item (from Memory 3)
  4. Returns a complete answer: "Pen is coming and she likes pasta. Order the Spaghetti Carbonara."

This multi-hop retrieval would be nearly impossible with vector search alone, but becomes natural when combining semantic search with graph relationships.
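
To make the hops concrete, here is a toy walkthrough of the same chain in plain Python. It only illustrates the reasoning pattern; Papr performs this traversal for you behind the API, and the hard-coded "semantic match" below stands in for real similarity search:

# Toy walkthrough of the multi-hop chain above; not Papr's implementation.
memories = {
    "mem_1": {"text": "Pen is coming over next weekend.", "guest": "Pen"},
    "mem_2": {"text": "Pen loves eating pasta.", "user": "Pen", "likes": ["pasta"]},
    "mem_3": {"items": ["Margherita Pizza", "Caesar Salad", "Spaghetti Carbonara", "Tiramisu"]},
}

# Hop 1: who is the weekend guest?
guest = memories["mem_1"]["guest"]

# Hop 2: what does that guest like?
preference = next(m["likes"][0] for m in memories.values() if m.get("user") == guest)

# Hop 3: which menu item matches the preference? A hard-coded mapping stands in here
# for the semantic match Papr would make between "pasta" and "Spaghetti Carbonara".
semantic_match = {"pasta": "Spaghetti Carbonara"}
dish = semantic_match[preference]

print(f"{guest} is coming and likes {preference}. Order the {dish}.")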

What makes this powerful:

  • No explicit coding of these connections is required
  • Context is maintained across multiple memories
  • The answer is constructed from multiple independent sources
  • Information is connected even when stored in different formats

Using the API

Adding Memories

When you add a memory, Papr automatically:

  • Generates vector embeddings
  • Extracts and stores metadata
  • Creates graph relationships
  • Indexes everything for fast retrieval
  • Updates existing memories that are similar

# Add a simple memory
memory = client.memory.add(
    content="Meeting notes from project kickoff",
    type="text"
)

# Add memory with relationships through metadata
memory = client.memory.add(
    content="Meeting notes from our Q2 planning session. We discussed the roadmap for our mobile app redesign.",
    type="text",
    metadata={
        "topics": "planning, roadmap, mobile, design",
        "hierarchical_structures": "Projects/MobileApp/Planning",
        "conversationId": "conv-123"
    }
)

Searching Memories

When you search, Papr combines vector similarity with graph traversal automatically:

# Search with a detailed query
results = client.memory.search(
    query="Find our discussion about the mobile app redesign timeline and resource allocation from last month's planning meeting. I need to review the estimated completion dates we agreed on."
)

# Add parameters for more control
results = client.memory.search(
    query="Find all discussions related to customer feedback about the checkout process in the last quarter. I specifically need information about payment-related issues that multiple customers reported.",
    max_memories=20,
    max_nodes=10,
    rank_results=True,
    headers={"Accept-Encoding": "gzip"}  # For better performance with large responses
)
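
The exact shape of the response object depends on your SDK version. As a sketch, assuming the response exposes the matched memories and graph nodes (the attribute names below are illustrative assumptions, so check your SDK's response model), you might consume results like this:

# Sketch of consuming search results; attribute names are assumptions, not a
# documented contract. Check your SDK version's response model for the exact fields.
for memory in results.memories:
    print(memory.content)

for node in results.nodes:
    print(node.label)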

Best Practices

Writing Effective Queries

For best results:

  1. Be specific and detailed

    // Good query:
    "Find our discussion about API rate limiting from the backend planning meeting last week. I need the specific limits we agreed on for free vs. paid tiers."
    
    // Less effective query:
    "API limits"
  2. Include context

    // Good query:
    "Find notes from the marketing strategy sessions where we discussed social media campaigns for the product launch."
    
    // Less effective query:
    "social media"
  3. Specify time frames when relevant

    // Good query:
    "Find budget discussions from Q1 planning meetings where we allocated resources for the new product development."
    
    // Less effective query:
    "budget"

Optimizing Performance

  1. Use rich metadata when adding memories

    • Add detailed topics
    • Structure information with hierarchical_structures
    • Include temporal and contextual information
  2. Enable compression for large responses

    • Add Accept-Encoding: gzip in your API request headers
    • Configure your client to use compression
  3. Tune result parameters

    • Adjust max_memories and max_nodes based on your needs
    • Use rank_results for specialized ranking scenarios
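
Putting these tips together, a tuned search call might look like the sketch below. It reuses the parameters shown earlier on this page; the specific values are just illustrative starting points to adjust for your workload:

# Illustrative starting point combining the tips above; tune values for your workload.
results = client.memory.search(
    query="Find payment-related checkout complaints reported by multiple customers last quarter.",
    max_memories=15,                       # cap on memories returned
    max_nodes=5,                           # cap on graph nodes returned
    rank_results=True,                     # enable specialized ranking
    headers={"Accept-Encoding": "gzip"},   # compress large responses
)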

Behind the Scenes

While you don't need to manage this complexity, here's what happens when you search:

Your Query → Query Processing → Vector Search → Graph Traversal → Result Ranking → Final Results

The search process:

  1. Analyzes your query to understand intent and context
  2. Performs vector similarity search to find semantically relevant memories
  3. Traverses the knowledge graph to find connected memories
  4. Ranks combined results based on relevance to your query
  5. Returns the most relevant memories
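
As a conceptual outline only (none of these functions are part of the Papr SDK; the real pipeline runs server-side), the steps above map roughly to:

# Conceptual outline of the search pipeline. These placeholder functions mirror the
# steps described above; they are not part of the Papr SDK.
def analyze_query(query):
    return {"text": query}                              # 1. understand intent and context

def vector_search(intent):
    return [{"id": "mem_1", "score": 0.9}]              # 2. semantically relevant memories (stubbed)

def traverse_graph(vector_hits):
    return [{"id": "mem_2", "score": 0.6}]              # 3. memories connected to those hits (stubbed)

def rank_results(hits):
    return sorted(hits, key=lambda h: h["score"], reverse=True)  # 4. rank by relevance

def search_pipeline(query):
    intent = analyze_query(query)
    vector_hits = vector_search(intent)
    graph_hits = traverse_graph(vector_hits)
    return rank_results(vector_hits + graph_hits)       # 5. return the most relevant memories

print(search_pipeline("mobile app redesign timeline"))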

Next Steps