Papr Memory API
The memory layer that turns AI agents from forgetful assistants into intelligent systems.
TL;DR: Store with `POST /v1/memory`. Retrieve with `POST /v1/memory/search`.
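A minimal sketch of those two calls in Python with `requests`. Only the endpoint paths come from this page; the base URL, auth header, and payload fields such as `content` and `query` are assumptions, so check the API reference for exact shapes:

```python
import os
import requests

BASE = "https://memory.papr.ai"  # assumed base URL; substitute your real endpoint
HEADERS = {"X-API-Key": os.environ["PAPR_API_KEY"]}  # assumed auth scheme

# Store a memory (payload fields are illustrative)
requests.post(f"{BASE}/v1/memory", headers=HEADERS, json={
    "content": "User prefers dark mode and concise answers.",
}).raise_for_status()

# Retrieve it later with natural-language search
resp = requests.post(f"{BASE}/v1/memory/search", headers=HEADERS, json={
    "query": "What UI preferences does this user have?",
})
resp.raise_for_status()
print(resp.json())
```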
Ranked #1 on Stanford's STaRK benchmark with 91%+ accuracy and <150ms retrieval (when cached). Our Predictive Memory Graph connects context across sources and anticipates what users need before they ask.
Why Papr?
Agents hallucinate because they can't recall the right context. Conversation history lives in one place, documents in another, structured data in databases—none of it connected. Traditional retrieval returns fragments, not understanding.
We rebuilt the memory layer from scratch as a new type of database for AI that connects context across sources, predicts what users want, and powers agents to surface insights before users ask.
What You'd Build Yourself — But Unified
Most teams start with simple memory (SQLite + keyword search). It works for demos, then breaks in production: vocabulary mismatch, memory drift, context explosion, no cross-session coherence.
Papr gives you everything you'd build (event logs, keyword search, semantic embeddings, knowledge graphs, consolidation, ACLs) as a single API instead of 6 systems you orchestrate manually.
- ✅ Start simple: `POST /v1/messages` works like SQLite event storage
- ✅ Add intelligence: Enable `enable_agentic_graph=true` for hybrid retrieval (keyword + vector + graph); see the sketch below
- ✅ Scale without rewrites: Same API from prototype to millions of users
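A hedged sketch of that progression. The endpoint paths and the `enable_agentic_graph` flag name come from this page; the base URL, auth header, payload shapes, and the flag's placement in the request body are assumptions:

```python
import os
import requests

BASE = "https://memory.papr.ai"  # assumed base URL
HEADERS = {"X-API-Key": os.environ["PAPR_API_KEY"]}  # assumed auth scheme

# Step 1 - simple event storage: log raw conversation turns
requests.post(f"{BASE}/v1/messages", headers=HEADERS, json={
    "messages": [  # payload shape is illustrative
        {"role": "user", "content": "Our invoices must support net-60 terms."},
        {"role": "assistant", "content": "Noted, I'll default to net-60."},
    ],
}).raise_for_status()

# Step 2 - add intelligence: hybrid retrieval (keyword + vector + graph)
resp = requests.post(f"{BASE}/v1/memory/search", headers=HEADERS, json={
    "query": "What payment terms does this customer need?",
    "enable_agentic_graph": True,  # flag name from this page; placement here is assumed
})
resp.raise_for_status()
print(resp.json())
```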
See Why Papr for detailed comparison with DIY approaches.
Key Capabilities
- Predictive Memory Graph – Maps real relationships across all your data. A line of code → support ticket → AI conversation → Slack thread → design decision. Your knowledge becomes one connected story.
- #1 Retrieval Accuracy – 91%+ on Stanford's STaRK benchmark. Gets better as your memory grows, not worse.
- Under 150ms Response – Predictive caching anticipates what users will ask next and pre-loads context; cached queries return in under 150ms.
- Hybrid Retrieval – Combines keyword search (like BM25), semantic vectors, and graph relationships in one query. No manual fusion logic.
- Continuous Innovation – We ship the latest retrieval advances as they land. You get improvements automatically, not frozen in time.
- Full Flexibility – Open source, customizable via schemas, self-hostable. You keep control while we handle maintenance.
- Private by Design – Built-in ACLs, namespace boundaries, and permission management. Data never leaks across users.
Why Teams Choose Papr Over DIY
DIY gets you: Basic RAG with standard performance
Papr gets you: State-of-the-art that stays state-of-the-art
- Performance: 91%+ accuracy vs. ~70-80% typical, <150ms.
- Innovation: Latest advances deployed automatically
- Control: Open source + custom schemas vs. locked into what you built
- Simplicity: A quick path to production vs. months of building
Start Here
Start with one of these two tracks:
Evaluate Fit
- Decision Tree - Quick decision guide (Should I use Papr?)
- Why Papr - Detailed comparison with code examples
Start Building
- Capability Matrix - Map jobs-to-be-done to exact API capabilities
- Use Cases - See what you can build by capability pattern
- Quick Start - Ship a working prototype quickly
- Golden Paths - Four canonical integration paths
- Agent Integration Pack - Deterministic integration docs for AI coding agents
How It Works
Papr unifies RAG + memory in one API. Store memories with `POST /v1/memory`. Retrieve with `POST /v1/memory/search`. Query insights with GraphQL or natural language.
Under the hood, the Predictive Memory Graph connects context across all your data sources:
Three Input Paths
- Documents (`POST /v1/document`) - Upload PDFs or Word docs. System analyzes and selectively creates memories (sketched below).
- Messages/Chat - Send conversation history. System analyzes and extracts important information.
- Direct Memory (`POST /v1/memory`) - Explicitly create memories with full control. Perfect for agent self-documentation.
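A sketch of the document and direct-memory paths. The endpoints are from the list above; the multipart upload format, the `metadata` field, and the auth details are assumptions:

```python
import os
import requests

BASE = "https://memory.papr.ai"  # assumed base URL
HEADERS = {"X-API-Key": os.environ["PAPR_API_KEY"]}  # assumed auth scheme

# Path 1 - document ingestion: upload a PDF; the system selectively creates memories
with open("q3-report.pdf", "rb") as f:
    requests.post(
        f"{BASE}/v1/document",
        headers=HEADERS,
        files={"file": f},  # multipart upload is an assumption
    ).raise_for_status()

# Path 3 - direct memory: explicit creation, e.g. agent self-documentation
requests.post(f"{BASE}/v1/memory", headers=HEADERS, json={
    "content": "Retry with exponential backoff fixed the flaky export job.",
    "metadata": {"source": "agent", "topic": "workflow-learning"},  # illustrative fields
}).raise_for_status()
```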
Predictive Memory Engine
- Vector Index – Semantic similarity across all content types
- Knowledge Graph – Real relationships between entities, not just similar text
- Predictive Layer – Anticipates what users will ask and pre-caches context, enabling <150ms retrieval on cached queries
- Connected Context – Links things like code → tickets → conversations → decisions across time
Two Query Modes
- Natural Language Search (`POST /v1/memory/search`) - Ask questions, get relevant & re-ranked memories + graph entities. Semantic + graph combined.
- GraphQL (`POST /v1/graphql`) - Run structured queries for analytics, aggregations, and relationship analysis (example below).
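A sketch of a structured query against `POST /v1/graphql`. The GraphQL schema shown here (entity types, field names) is hypothetical; introspect your actual graph schema before writing queries like this:

```python
import os
import requests

BASE = "https://memory.papr.ai"  # assumed base URL
HEADERS = {"X-API-Key": os.environ["PAPR_API_KEY"]}  # assumed auth scheme

# Hypothetical schema: entities of type "Ticket" connected to a service
query = """
{
  entities(type: "Ticket", relatedTo: "checkout-service") {
    name
    connections { type target }
  }
}
"""
resp = requests.post(f"{BASE}/v1/graphql", headers=HEADERS, json={"query": query})
resp.raise_for_status()
print(resp.json())
```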
What You Can Build
- Personal AI Assistant – Store/retrieve conversations across sessions
- Document Q&A – Build intelligent document chat
- Customer Experience – Answer FAQs and resolve multi-step tickets
- Enterprise SaaS – Multi-tenant knowledge management
- Document Intelligence – Process contracts and reports with auto extraction
- Domain Knowledge Graphs – Custom ontologies for specialized domains
- Graph Analytics – Query insights with GraphQL
Deployment Options
Papr Memory is available in three deployment modes:
- Managed Cloud – Fully managed service; we run the infrastructure and operations.
- Hybrid Cloud – Managed service in your cloud (AWS/Azure/GCP). Data stays in your environment, we handle operations.
- Self-Hosted – Open source; run it on your own infrastructure and keep full control.
All options use identical APIs - code written for one works with all three. Compare deployment options →
Dual Memory Types
Papr supports two types of memories, enabling comprehensive AI capabilities:
- User Memories - Information about users: preferences, history, context, conversations. Enables personalization.
- Agent Memories - Agent documents its own workflows, learnings, reasoning patterns. Enables self-improvement.
Both are stored and queried the same way, so agents can not only personalize for users but also learn and improve their own capabilities over time.
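A sketch of storing both memory types through the same endpoint. Distinguishing user from agent memories via `metadata` keys is an assumption for illustration, not a documented contract:

```python
import os
import requests

BASE = "https://memory.papr.ai"  # assumed base URL
HEADERS = {"X-API-Key": os.environ["PAPR_API_KEY"]}  # assumed auth scheme

def store(content: str, metadata: dict) -> None:
    """Store one memory; typing memories via metadata is an assumption."""
    requests.post(f"{BASE}/v1/memory", headers=HEADERS,
                  json={"content": content, "metadata": metadata}).raise_for_status()

# User memory: powers personalization
store("Prefers weekly summaries over daily digests.",
      {"kind": "user", "user_id": "user_123"})  # hypothetical fields

# Agent memory: the agent documents its own learnings
store("Chunking contracts by clause improved extraction accuracy.",
      {"kind": "agent", "agent_id": "contracts-bot"})  # hypothetical fields
```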