Papr Memory API
The memory layer that turns AI agents from forgetful assistants into intelligent systems.
TL;DR: Store with POST /v1/memory. Retrieve with POST /v1/memory/search.
Ranked #1 on Stanford's STaRK benchmark with 91%+ accuracy and <100ms retrieval. Our Predictive Memory Graph connects context across sources and anticipates what users need before they ask.
Why Papr?
Agents hallucinate because they can't recall the right context. Conversation history lives in one place, documents in another, structured data in databases—none of it connected. Traditional retrieval returns fragments, not understanding.
We rebuilt the memory layer from scratch as a new type of database for AI that connects context across sources, predicts what users want, and powers agents to surface insights before users ask.
Key Capabilities
- Predictive Memory Graph – Maps real relationships across all your data. A line of code → support ticket → AI conversation → Slack thread → design decision. Your knowledge becomes one connected story.
- #1 Retrieval Accuracy – 91%+ on Stanford's STaRK benchmark. Gets better as your memory grows, not worse.
- Under 100ms Response – Predictive caching anticipates what users will ask next and pre-loads context.
- Retrieval → Understanding → Insight – Query with natural language or GraphQL. Go beyond text retrieval to surface actual insights.
- Private by Design – Built-in ACLs, namespace boundaries, and permission management. Data never leaks across users.
- Open by Default – Run fully open-source self-hosted, or use managed cloud. Same API, full control.
How It Works
Papr unifies RAG + memory in one API. Store memories with POST /v1/memory. Retrieve with POST /v1/memory/search. Query insights with GraphQL or natural language.
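The two core calls can be sketched in Python. The request-body field names (content, query, max_results) and the base URL are illustrative assumptions, not the documented schema; only the endpoint paths come from this page.

```python
import json

BASE_URL = "https://api.example.com"  # placeholder, not the real Papr endpoint


def store_memory(content, metadata=None):
    """Build a POST /v1/memory request (field names are illustrative)."""
    return {
        "url": f"{BASE_URL}/v1/memory",
        "json": {"content": content, "metadata": metadata or {}},
    }


def search_memory(query, max_results=5):
    """Build a POST /v1/memory/search request (field names are illustrative)."""
    return {
        "url": f"{BASE_URL}/v1/memory/search",
        "json": {"query": query, "max_results": max_results},
    }


# Store a fact once, then retrieve it later by meaning rather than keywords.
store = store_memory("User prefers concise answers", {"source": "chat"})
search = search_memory("how does this user like replies?")
print(json.dumps(store["json"], indent=2))
```

In a real integration these dicts would be passed to an HTTP client along with your API credentials.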
Under the hood, the Predictive Memory Graph connects context across all your data sources:
Three Input Paths
- Documents (POST /v1/document) – Upload PDFs or Word docs. The system analyzes them and selectively creates memories.
- Messages/Chat – Send conversation history. The system analyzes it and extracts important information.
- Direct Memory (POST /v1/memory) – Explicitly create memories with full control. Perfect for agent self-documentation.
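As a sketch, the three paths differ mainly in endpoint and payload shape. All field names below are illustrative assumptions (and the chat-ingestion endpoint is not specified here); only /v1/document and /v1/memory come from this page.

```python
# 1. Documents: upload a file, the system selectively creates memories.
document_request = {
    "endpoint": "/v1/document",
    "payload": {"filename": "q3-report.pdf"},  # hypothetical field
}

# 2. Messages/Chat: send conversation history for analysis and extraction.
chat_payload = {
    "messages": [
        {"role": "user", "content": "Ship the beta on Friday"},
        {"role": "assistant", "content": "Noted: beta ships Friday."},
    ]
}

# 3. Direct memory: explicit creation with full control, e.g. an agent
#    documenting its own workflow.
direct_request = {
    "endpoint": "/v1/memory",
    "payload": {"content": "Retry failed uploads with exponential backoff"},
}

for name, req in [("document", document_request), ("direct", direct_request)]:
    print(name, "->", req["endpoint"])
```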
Predictive Memory Engine
- Vector Index – Semantic similarity across all content types
- Knowledge Graph – Real relationships between entities, not just similar text
- Predictive Layer – Anticipates what users will ask and pre-caches context for <100ms retrieval
- Connected Context – Links things like code → tickets → conversations → decisions across time
Two Query Modes
- Natural Language Search (POST /v1/memory/search) – Ask questions, get relevant, re-ranked memories plus graph entities. Semantic and graph retrieval combined.
- GraphQL (POST /v1/graphql) – Run structured queries for analytics, aggregations, and relationship analysis.
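The two modes can be contrasted with hypothetical request bodies. The search field names and the GraphQL schema (memories, relatedEntities) are illustrative assumptions, not the documented types.

```python
# Natural-language search: one question, semantic + graph retrieval,
# re-ranked results.
search_body = {"query": "Why did we choose Postgres for the billing service?"}

# GraphQL: structured analytics over the same memory graph
# (schema shown is invented for illustration).
graphql_body = {
    "query": """
    {
      memories(topic: "billing") {
        content
        relatedEntities { name }
      }
    }
    """
}

print("search keys:", sorted(search_body))
print("graphql keys:", sorted(graphql_body))
```

The rule of thumb: reach for search when you want relevant context, and for GraphQL when you want aggregates or relationship traversals.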
What You Can Build
- 💬 Personal AI Assistant – Store and retrieve conversations across sessions
- 📄 Document Q&A – Build intelligent document chat
- 📊 Customer Experience – Answer FAQs and resolve multi-step tickets
- 🏢 Enterprise SaaS – Multi-tenant knowledge management
- 📑 Document Intelligence – Process contracts and reports with automatic extraction
- 🧠 Domain Knowledge Graphs – Custom ontologies for specialized domains
- 📈 Graph Analytics – Query insights with GraphQL
Deployment Options
Papr Memory is available in three deployment modes:
☁️ Managed Cloud
Fully managed service. The fastest way to get started; we run everything.
🔒 Hybrid Cloud
Managed service in your cloud (AWS/Azure/GCP). Data stays in your environment, we handle operations.
🔓 Self-Hosted
Run the fully open-source stack yourself. Same API, full control.
All options use identical APIs, so code written for one works with all three.
Dual Memory Types
Papr supports two types of memories, enabling comprehensive AI capabilities:
- User Memories - Information about users: preferences, history, context, conversations. Enables personalization.
- Agent Memories - Agent documents its own workflows, learnings, reasoning patterns. Enables self-improvement.
Both are stored and queried the same way, allowing agents not just to personalize for users, but to learn and improve their own capabilities over time.
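A minimal sketch of the symmetry: both memory types share the endpoint from this page, and a hypothetical kind field (an assumption, not the documented schema) distinguishes them.

```python
# Illustrative payloads; the "kind" field is invented for this sketch.
user_memory = {
    "kind": "user",
    "content": "Ana prefers summaries under 100 words",
}

agent_memory = {
    "kind": "agent",
    "content": "Chunking PDFs at 500 tokens improved my retrieval hit rate",
}


def endpoint_for(memory):
    # Stored and queried identically, regardless of kind.
    return "/v1/memory"


assert endpoint_for(user_memory) == endpoint_for(agent_memory)
print("both types use", endpoint_for(user_memory))
```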