Papr Memory API

The memory layer that turns AI agents from forgetful assistants into intelligent systems.

TL;DR: Store with POST /v1/memory. Retrieve with POST /v1/memory/search.
Ranked #1 on Stanford's STaRK benchmark with 91%+ accuracy and <100ms retrieval. Our Predictive Memory Graph connects context across sources and anticipates what users need before they ask.

Why Papr?

Agents hallucinate because they can't recall the right context. Conversation history lives in one place, documents in another, structured data in databases—none of it connected. Traditional retrieval returns fragments, not understanding.

We rebuilt the memory layer from scratch as a new type of database for AI that connects context across sources, predicts what users want, and powers agents to surface insights before users ask.

Key Capabilities

  • Predictive Memory Graph – Maps real relationships across all your data. A line of code → support ticket → AI conversation → Slack thread → design decision. Your knowledge becomes one connected story.
  • #1 Retrieval Accuracy – 91%+ on Stanford's STaRK benchmark. Gets better as your memory grows, not worse.
  • Under 100ms Response – Predictive caching anticipates what users will ask next and pre-loads context.
  • Retrieval → Understanding → Insight – Query with natural language or GraphQL. Go beyond text retrieval to surface actual insights.
  • Private by Design – Built-in ACLs, namespace boundaries, and permission management. Data never leaks across users.
  • Open by Default – Run fully open-source self-hosted, or use managed cloud. Same API, full control.

How It Works

Papr unifies RAG + memory in one API. Store memories with POST /v1/memory. Retrieve with POST /v1/memory/search. Query insights with GraphQL or natural language.

Under the hood, the Predictive Memory Graph connects context across all your data sources:

Three Input Paths

  1. Documents (POST /v1/document) - Upload PDFs or Word docs; the system analyzes them and selectively creates memories.
  2. Messages/Chat - Send conversation history; the system analyzes it and extracts the important information.
  3. Direct Memory (POST /v1/memory) - Explicitly create memories with full control. Perfect for agent self-documentation.
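As a rough sketch, the two endpoints named above can sit behind a single ingest helper. The messages/chat path is left out because its endpoint isn't listed on this page, and the request shape is an assumption:

```python
def ingest_request(kind: str, payload: dict) -> dict:
    """Map an input path to its documented endpoint.

    'document' -> POST /v1/document, 'memory' -> POST /v1/memory.
    Messages/chat is not routed here because its endpoint is not
    listed on this page.
    """
    endpoints = {"document": "/v1/document", "memory": "/v1/memory"}
    if kind not in endpoints:
        raise ValueError(f"unknown input path: {kind!r}")
    return {"method": "POST", "url": endpoints[kind], "json": payload}


# Agent self-documentation goes through the direct memory path:
note = ingest_request("memory", {"content": "Cache warm-up cut latency 40%"})
```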

Predictive Memory Engine

  • Vector Index – Semantic similarity across all content types
  • Knowledge Graph – Real relationships between entities, not just similar text
  • Predictive Layer – Anticipates what users will ask and pre-caches context for <100ms retrieval
  • Connected Context – Links things like code → tickets → conversations → decisions across time

Two Query Modes

  1. Natural Language Search (POST /v1/memory/search) - Ask questions in plain language and get relevant, re-ranked memories plus graph entities, combining semantic and graph retrieval.
  2. GraphQL (POST /v1/graphql) - Run structured queries for analytics, aggregations, and relationship analysis.

[Architecture diagram: Input Layer (3 paths) — Documents (PDFs, Word) and Messages/Chat (conversations) via intelligent analysis, plus the Direct Memory API (explicit data) — feeds the Memory Engine (vector embeddings, predictive models, knowledge graphs, dual memory: user + agent), which serves the Query Layer (2 modes): Natural Language Search returning memories + entities, and GraphQL Analytics returning structured insights.]

Deployment Options

Papr Memory is available in three deployment modes:

☁️ Papr Cloud

Fully managed service. Get started in 5 minutes with zero infrastructure management.

Get Started → | Learn More →

🔒 Hybrid Cloud

Managed service in your cloud (AWS/Azure/GCP). Data stays in your environment, we handle operations.

Enterprise → | Talk to Sales →

🐳 Self-Hosted

Run the open-source version on your own infrastructure with complete control.

Setup Guide → | GitHub →

All options use identical APIs - code written for one works with all three. Compare deployment options →
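Because the three deployments expose an identical API, only the base URL changes between them. A minimal client sketch (the class shape and auth header are assumptions, not an official SDK):

```python
class PaprClient:
    """Minimal client sketch: the same code targets Papr Cloud, Hybrid
    Cloud, or a self-hosted instance by changing only `base_url`."""

    def __init__(self, base_url: str, api_key: str):
        self.base_url = base_url.rstrip("/")
        self.headers = {"Authorization": f"Bearer {api_key}"}

    def endpoint(self, path: str) -> str:
        """Absolute URL for an API path like '/v1/memory'."""
        return f"{self.base_url}{path}"


# Identical code, different deployments -- only the constructor arg differs:
cloud = PaprClient("https://api.papr.ai", "KEY")        # hypothetical cloud URL
selfhosted = PaprClient("http://localhost:8080", "KEY")  # self-hosted instance
```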

Dual Memory Types

Papr supports two types of memories, enabling comprehensive AI capabilities:

  • User Memories - Information about users: preferences, history, context, conversations. Enables personalization.
  • Agent Memories - Agent documents its own workflows, learnings, reasoning patterns. Enables self-improvement.

Both are stored and queried the same way, so agents can not only personalize for users but also learn and improve their own capabilities over time.
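Since both memory types share one shape, a single builder can produce either; the `kind` metadata field used as the discriminator here is an illustrative assumption, not the documented field name:

```python
def make_memory(content: str, kind: str) -> dict:
    """Build a memory payload; user and agent memories share one shape.

    The `kind` metadata field is an assumption -- check the API
    reference for the real discriminator.
    """
    if kind not in ("user", "agent"):
        raise ValueError("kind must be 'user' or 'agent'")
    return {"content": content, "metadata": {"kind": kind}}


# Personalization: remember something about the user.
user_mem = make_memory("Prefers concise answers with code examples", "user")

# Self-improvement: the agent documents its own working pattern.
agent_mem = make_memory("Smaller batch sizes fixed the timeout errors", "agent")
```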

Next Steps