Papr Memory API

The memory layer that turns AI agents from forgetful assistants into intelligent systems.

TL;DR: Store with POST /v1/memory. Retrieve with POST /v1/memory/search.
Ranked #1 on Stanford's STaRK benchmark with 91%+ accuracy and <150ms retrieval (when cached). Our Predictive Memory Graph connects context across sources and anticipates what users need before they ask.

Why Papr?

Agents hallucinate because they can't recall the right context. Conversation history lives in one place, documents in another, structured data in databases—none of it connected. Traditional retrieval returns fragments, not understanding.

We rebuilt the memory layer from scratch as a new type of database for AI that connects context across sources, predicts what users want, and powers agents to surface insights before users ask.

What You'd Build Yourself — But Unified

Most teams start with simple memory (SQLite + keyword search). It works for demos, then breaks in production: vocabulary mismatch, memory drift, context explosion, no cross-session coherence.

Papr gives you everything you'd build (event logs, keyword search, semantic embeddings, knowledge graphs, consolidation, ACLs) as a single API instead of 6 systems you orchestrate manually.

  • Start simple: POST /v1/messages works like SQLite event storage
  • Add intelligence: Set enable_agentic_graph=true for hybrid retrieval (keyword + vector + graph)
  • Scale without rewrites: Same API from prototype to millions of users
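As a minimal sketch of that progression (the payload field names other than enable_agentic_graph are assumptions; consult the API reference for the real schema):

```python
import json
from typing import Optional

PAPR_BASE_URL = "https://api.papr.ai"  # assumed base URL; check your dashboard

def build_store_request(content: str, metadata: Optional[dict] = None) -> dict:
    """Body for POST /v1/memory. Field names are illustrative, not the official schema."""
    return {"content": content, "metadata": metadata or {}}

def build_search_request(query: str, agentic_graph: bool = False) -> dict:
    """Body for POST /v1/memory/search."""
    body = {"query": query}
    if agentic_graph:
        # Opt in to hybrid retrieval (keyword + vector + graph) with one flag
        body["enable_agentic_graph"] = True
    return body

# Same payloads from prototype to production; only volume changes.
store = build_store_request("User prefers dark mode", {"source": "settings"})
search = build_search_request("What theme does the user prefer?", agentic_graph=True)
print(json.dumps(search))
```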

See Why Papr for detailed comparison with DIY approaches.

Key Capabilities

  • Predictive Memory Graph – Maps real relationships across all your data. A line of code → support ticket → AI conversation → Slack thread → design decision. Your knowledge becomes one connected story.
  • #1 Retrieval Accuracy – 91%+ on Stanford's STaRK benchmark. Gets better as your memory grows, not worse.
  • Under 150ms Response – Predictive caching anticipates what users will ask next and pre-loads context (when cached).
  • Hybrid Retrieval – Combines keyword search (like BM25), semantic vectors, and graph relationships in one query. No manual fusion logic.
  • Continuous Innovation – We stay on the cutting edge, so you get the latest advances automatically instead of being frozen in time.
  • Full Flexibility – Open source, customizable via schemas, self-hostable. You keep control while we handle maintenance.
  • Private by Design – Built-in ACLs, namespace boundaries, and permission management. Data never leaks across users.
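For contrast, the "manual fusion logic" a DIY stack needs is typically something like reciprocal rank fusion, merging separately ranked keyword and vector result lists. A small sketch of that DIY step (not Papr code):

```python
def reciprocal_rank_fusion(result_lists, k: int = 60):
    """Merge ranked ID lists: each hit adds 1/(k + rank) to its document's score."""
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. BM25 order
vector_hits = ["doc_b", "doc_d", "doc_a"]   # e.g. embedding-similarity order
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

With Papr, this merging (plus graph signals and re-ranking) happens server-side in a single search call.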

Why Teams Choose Papr Over DIY

DIY gets you: Basic RAG with standard performance
Papr gets you: State-of-the-art that stays state-of-the-art

  • Performance: 91%+ accuracy vs. ~70-80% typical, <150ms retrieval
  • Innovation: Latest advances deployed automatically
  • Control: Open source + custom schemas vs. locked into what you built
  • Simplicity: A quick path to production vs. months to build

See detailed comparison →

Start Here

Start with one of these two tracks:

Evaluate Fit

  • Decision Tree - Quick decision guide (Should I use Papr?)
  • Why Papr - Detailed comparison with code examples

Start Building

How It Works

Papr unifies RAG + memory in one API. Store memories with POST /v1/memory. Retrieve with POST /v1/memory/search. Query insights with GraphQL or natural language.

Under the hood, the Predictive Memory Graph connects context across all your data sources:

Three Input Paths

  1. Documents (POST /v1/document) - Upload PDFs or Word docs. System analyzes and selectively creates memories.
  2. Messages/Chat - Send conversation history. System analyzes and extracts important information.
  3. Direct Memory (POST /v1/memory) - Explicitly create memories with full control. Perfect for agent self-documentation.
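The three paths can be sketched as follows (endpoint names come from the list above; the request field names are assumptions for illustration):

```python
from typing import Dict, List, Optional

def document_request(filename: str) -> Dict:
    """Path 1: POST /v1/document. Real uploads are multipart; this sketches the metadata."""
    return {"endpoint": "/v1/document", "filename": filename}

def messages_request(messages: List[Dict]) -> Dict:
    """Path 2: send raw conversation turns; the system extracts what matters."""
    return {"endpoint": "/v1/messages", "messages": messages}

def memory_request(content: str, metadata: Optional[Dict] = None) -> Dict:
    """Path 3: POST /v1/memory with explicit content, full control over what is stored."""
    return {"endpoint": "/v1/memory", "content": content, "metadata": metadata or {}}

# Path 1: let the system analyze a PDF
doc = document_request("design-spec.pdf")
# Path 2: hand over a chat transcript
chat = messages_request([
    {"role": "user", "content": "We decided on Postgres for the event store."},
])
# Path 3: an agent documents its own reasoning
note = memory_request(
    "Chose Postgres for event store: team expertise, managed offering.",
    {"type": "agent", "topic": "architecture-decision"},
)
```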

Predictive Memory Engine

  • Vector Index – Semantic similarity across all content types
  • Knowledge Graph – Real relationships between entities, not just similar text
  • Predictive Layer – Anticipates what users will ask and pre-caches context for <150ms retrieval (when cached)
  • Connected Context – Links things like code → tickets → conversations → decisions across time

Two Query Modes

  1. Natural Language Search (POST /v1/memory/search) - Ask questions, get relevant, re-ranked memories plus graph entities, combining semantic and graph retrieval.
  2. GraphQL (POST /v1/graphql) - Run structured queries for analytics, aggregations, and relationship analysis.
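The two modes differ only in the request body. A sketch of each (the GraphQL schema shown is hypothetical; introspect your deployment for the actual types and fields):

```python
import json
from typing import Optional

def search_body(question: str) -> dict:
    """Mode 1: POST /v1/memory/search with a natural-language question."""
    return {"query": question}

def graphql_body(query: str, variables: Optional[dict] = None) -> dict:
    """Mode 2: POST /v1/graphql using the standard GraphQL envelope."""
    return {"query": query, "variables": variables or {}}

# Mode 1: conversational retrieval
nl = search_body("What did we decide about the event store?")

# Mode 2: structured analytics over the same memories
gql = graphql_body(
    """
    query DecisionsByTopic($topic: String!) {
      memories(filter: { topic: $topic }) { id content createdAt }
    }
    """,
    {"topic": "architecture-decision"},
)
print(json.dumps(gql["variables"]))
```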

[Architecture diagram: the Input Layer's 3 paths (Documents: PDFs, Word → Intelligent Analysis; Messages/Chat: Conversations → Intelligent Analysis; Direct Memory API: Explicit Data) feed the Memory Engine (Vector Embeddings, Knowledge Graphs, Predictive Models, Dual Memory: User + Agent), which serves the Query Layer's 2 modes (Natural Language Search → Memories + Entities; GraphQL Analytics → Structured Insights).]

What You Can Build

Deployment Options

Papr Memory is available in three deployment modes:

Papr Cloud

Fully managed service. Get started in 5 minutes with zero infrastructure management.

Get Started → | Learn More →

Hybrid Cloud

Managed service in your cloud (AWS/Azure/GCP). Data stays in your environment, we handle operations.

Enterprise → | Talk to Sales →

Self-Hosted

Run the open-source version on your own infrastructure with complete control.

Setup Guide → | GitHub →

All options use identical APIs - code written for one works with all three. Compare deployment options →
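Because the APIs are identical, switching deployments is just a base-URL change. A sketch (the hybrid and self-hosted URLs are placeholders for your own environment; the cloud URL is an assumption, so check your dashboard):

```python
import os

# Identical client code across deployments; only the base URL changes.
DEPLOYMENTS = {
    "cloud": "https://api.papr.ai",             # assumed managed endpoint
    "hybrid": "https://papr.internal.example.com",
    "self-hosted": "http://localhost:8080",
}

def memory_url(deployment: str) -> str:
    """Resolve the store endpoint; PAPR_BASE_URL overrides for local testing."""
    base = os.environ.get("PAPR_BASE_URL", DEPLOYMENTS[deployment])
    return f"{base}/v1/memory"

print(memory_url("self-hosted"))
```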

Dual Memory Types

Papr supports two types of memories, enabling comprehensive AI capabilities:

  • User Memories - Information about users: preferences, history, context, conversations. Enables personalization.
  • Agent Memories - Agent documents its own workflows, learnings, reasoning patterns. Enables self-improvement.

Both stored and queried the same way, allowing agents to not just personalize for users, but to learn and improve their own capabilities over time.
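Since both types share the same endpoints, a tag in the payload is enough to distinguish them. A sketch, assuming a metadata field for the memory kind (the exact field names are not the official schema):

```python
def user_memory(user_id: str, content: str) -> dict:
    """A memory about a user: preferences, history, context."""
    return {"content": content, "metadata": {"kind": "user", "user_id": user_id}}

def agent_memory(agent_id: str, content: str) -> dict:
    """A memory the agent writes about its own workflows and learnings."""
    return {"content": content, "metadata": {"kind": "agent", "agent_id": agent_id}}

# Both go through the same POST /v1/memory and the same search endpoint;
# only the metadata distinguishes personalization from self-improvement.
pref = user_memory("u_42", "Prefers concise answers with code samples.")
lesson = agent_memory("support-bot", "Escalating billing issues early reduces churn.")
```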

Next Steps