Knowledge Graph Generation

Control how knowledge graphs are created from your memories, using either auto or manual mode.

Overview

When you add memories to Papr, the system can automatically build a knowledge graph by:

Extracting entities from content
Identifying relationships between entities
Connecting related memories
Using custom schemas (if provided) to guide structure

Two modes control this process: auto mode (the AI handles extraction) and manual mode (you specify exactly what to create).

How Graph Generation Works

Content added - You add a memory via document upload, message, or direct API call
Schema selected - The system uses your custom schema if provided, otherwise the system schema
Entities extracted - Predictive models identify entities in the content
Relationships mapped - The system identifies how entities relate to each other
Graph built - Entities and relationships are added to the knowledge graph
Predictive connections - Additional connections are made based on patterns

Auto Mode (Recommended)

Auto mode lets the AI analyze content and automatically extract entities and relationships. This is the recommended approach for most use cases.

Basic Auto Mode

from papr_memory import Papr
import os

client = Papr(x_api_key=os.environ.get("PAPR_MEMORY_API_KEY"))

response = client.memory.add(
    content="Meeting with Jane Smith from Acme Corp about Q4 project timeline. She mentioned they need delivery by December 15th.",
    graph_generation={
        "mode": "auto"
    }
)

The system will automatically:

Extract entities (Jane Smith, Acme Corp, Q4 project)
Identify relationships (Jane works for Acme Corp, project has deadline)
Build appropriate graph structure

Auto Mode with Custom Schema

Provide a schema_id to guide entity extraction for your domain:

response = client.memory.add(
    content="Meeting with Jane Smith from Acme Corp about Q4 project timeline",
    graph_generation={
        "mode": "auto",
        "auto": {
            "schema_id": "crm_schema"  # Your custom CRM schema
        }
    }
)

The system uses your CRM schema to type the extracted entities: Jane Smith becomes a "Contact", Acme Corp becomes a "Company", and so on.

Auto Mode with Simple Schema Mode

For consistency, use simple_schema_mode, which limits extraction to the system schema plus one custom schema:

response = client.memory.add(
    content="Customer Jane Doe purchased iPhone 15 Pro for $999",
    graph_generation={
        "mode": "auto",
        "auto": {
            "schema_id": "ecommerce_schema",
            "simple_schema_mode": True  # Recommended for production
        }
    }
)

This ensures consistent results between document processing and direct memory creation.

Auto Mode with Property Overrides

Use property overrides to ensure consistent entity IDs across your graph:

response = client.memory.add(
    content="Jane Smith from Acme Corp mentioned interest in Enterprise plan",
    graph_generation={
        "mode": "auto",
        "auto": {
            "schema_id": "crm_schema",
            "simple_schema_mode": True,
            "property_overrides": [
                {
                    "nodeLabel": "Contact",
                    "match": {"name": "Jane Smith"},
                    "set": {
                        "id": "contact_jane_smith",
                        "role": "decision_maker",
                        "source": "direct_outreach"
                    }
                },
                {
                    "nodeLabel": "Company",
                    "match": {"name": "Acme Corp"},
                    "set": {
                        "id": "company_acme",
                        "industry": "technology",
                        "size": "enterprise"
                    }
                }
            ]
        }
    }
)

Property overrides ensure that:

Jane Smith always maps to the same entity ID
You can add consistent metadata across mentions
Entity resolution works reliably across documents
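
One way to keep overrides consistent across many calls is to define each one once and reuse it. The sketch below is illustrative only: `OverrideRegistry` is a hypothetical helper, not part of the Papr SDK, and it simply produces dictionaries in the nodeLabel / match / set shape shown above. The substring match used to decide which overrides apply is a deliberate simplification.

```python
from typing import Any, Dict, List, Tuple


class OverrideRegistry:
    """Keeps one property override per (node label, name) pair.

    Hypothetical helper, not part of the Papr SDK. It only builds dicts
    in the nodeLabel / match / set shape used in this guide.
    """

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], Dict[str, Any]] = {}

    def register(self, node_label: str, name: str, entity_id: str, **props: Any) -> None:
        # Mapping the same entity to a second ID would defeat the purpose
        # of the override, so treat it as an error.
        key = (node_label, name)
        existing = self._entries.get(key)
        if existing is not None and existing["set"]["id"] != entity_id:
            raise ValueError(f"{name!r} is already mapped to {existing['set']['id']!r}")
        self._entries[key] = {
            "nodeLabel": node_label,
            "match": {"name": name},
            "set": {"id": entity_id, **props},
        }

    def overrides_for(self, content: str) -> List[Dict[str, Any]]:
        # Naive substring match: include an override only when the
        # entity's name appears in the content being added.
        return [o for (_, name), o in self._entries.items() if name in content]


registry = OverrideRegistry()
registry.register("Contact", "Jane Smith", "contact_jane_smith", role="decision_maker")
registry.register("Company", "Acme Corp", "company_acme", industry="technology")

# Only Jane Smith is mentioned, so only her override is returned.
overrides = registry.overrides_for("Jane Smith asked for a revised quote")
```

The returned list can be passed as the "property_overrides" value of an auto-mode request, so every mention of Jane Smith carries the same ID.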

Manual Mode (Exact Control)

Manual mode lets you specify exactly which entities and relationships to create. Use it when you have structured data or need precise control.

Basic Manual Mode

response = client.memory.add(
    content="Signed service agreement with Acme Corp for $50,000 annually",
    graph_generation={
        "mode": "manual",
        "manual": {
            "nodes": [
                {
                    "id": "contract_acme_2024",
                    "label": "Contract",
                    "properties": {
                        "title": "Service Agreement 2024",
                        "value": 50000,
                        "term": "annual",
                        "status": "active",
                        "signed_date": "2024-03-15"
                    }
                },
                {
                    "id": "company_acme",
                    "label": "Company",
                    "properties": {
                        "name": "Acme Corp",
                        "industry": "Technology",
                        "size": "Enterprise"
                    }
                }
            ],
            "relationships": [
                {
                    "source_node_id": "company_acme",
                    "target_node_id": "contract_acme_2024",
                    "relationship_type": "HAS_CONTRACT",
                    "properties": {
                        "signed_date": "2024-03-15",
                        "start_date": "2024-04-01"
                    }
                }
            ]
        }
    }
)

Manual Mode with Schema Validation

Even in manual mode, specify a schema to ensure your nodes match your domain model:

response = client.memory.add(
    content="New product launch: Widget Pro at $299",
    graph_generation={
        "mode": "manual",
        "manual": {
            "nodes": [
                {
                    "id": "product_widget_pro",
                    "label": "Product",  # Must match schema node type
                    "properties": {
                        "name": "Widget Pro",
                        "price": 299.00,
                        "category": "electronics",
                        "in_stock": True,
                        "sku": "SKU-001"
                    }
                }
            ],
            "relationships": []
        }
    },
    metadata={
        "schema_id": "ecommerce_schema"  # Validates against this schema
    }
)
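
Because each node label must match a node type in the schema, it can help to catch mismatches before sending the request. The following is a minimal client-side sketch, not the validation the service performs: `find_unknown_labels` is a hypothetical helper, and `ECOMMERCE_LABELS` is an assumed, hand-maintained copy of the node types in your schema.

```python
from typing import Any, Dict, Iterable, List


def find_unknown_labels(
    nodes: Iterable[Dict[str, Any]], allowed_labels: Iterable[str]
) -> List[str]:
    """Return the IDs of nodes whose label is not in allowed_labels.

    Hypothetical pre-check; allowed_labels is a copy of the node types
    defined in your schema, maintained by you.
    """
    allowed = set(allowed_labels)
    return [node["id"] for node in nodes if node.get("label") not in allowed]


# Assumed node types for an e-commerce schema (illustrative only).
ECOMMERCE_LABELS = {"Product", "Customer", "Order"}

nodes = [
    {"id": "product_widget_pro", "label": "Product", "properties": {"name": "Widget Pro"}},
    {"id": "promo_spring", "label": "Promotion", "properties": {"name": "Spring Sale"}},
]

unknown = find_unknown_labels(nodes, ECOMMERCE_LABELS)
print(unknown)  # ['promo_spring']
```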

Agent Memory Example

Agents can document their own workflows, learnings, and reasoning patterns as memories:

# Agent documents its refund workflow
response = client.memory.add(
    content="""
    Refund Request Workflow:
    1. Verify customer account status and history
    2. Check purchase date and confirm within return window
    3. Review product condition requirements
    4. Apply refund policy based on timeframe:
       - Within 30 days: Full refund
       - 31-60 days: Store credit only
       - After 60 days: Case-by-case basis
    5. Document reason for refund
    6. Process refund through accounting system
    """,
    metadata={
        "role": "assistant",  # This is an agent memory
        "category": "learning",
        "workflow_type": "customer_service",
        "topics": ["refunds", "customer_service", "policy"]
    },
    graph_generation={
        "mode": "auto",
        "auto": {
            "simple_schema_mode": True
        }
    }
)

# Agent documents a successful resolution pattern
response = client.memory.add(
    content="When customers express frustration about delayed shipping, immediately: 1) Acknowledge their concern, 2) Check tracking details, 3) Offer specific ETA, 4) Provide discount code for inconvenience. This approach has an 85% satisfaction rate.",
    metadata={
        "role": "assistant",
        "category": "learning",
        "success_metric": 0.85,
        "topics": ["customer_service", "shipping", "escalation"]
    },
    graph_generation={
        "mode": "auto",
        "auto": {"simple_schema_mode": True}
    }
)

These agent memories enable the AI to:

Remember successful workflows and apply them to similar situations
Learn from past interactions
Improve responses based on what worked before
Build institutional knowledge that persists across sessions

Choosing Between Auto and Manual

Use Auto Mode When:

Processing natural language content (conversations, documents)
You want the AI to identify entities and relationships
Content structure is not rigidly defined
You have a custom schema to guide extraction
You want the system to make predictive connections

Examples:

Customer conversations
Meeting notes
Email content
Document analysis
Social media posts

Use Manual Mode When:

You have structured data with known entities
Entity IDs and relationships are critical and must be exact
Importing from existing databases
Building specific graph patterns
Testing schema designs

Examples:

Database imports
API integrations
CRM sync
Financial transactions
Audit logs

Hybrid Approach

Start with auto mode and add property overrides for critical entities:

response = client.memory.add(
    content="Jane Smith called about upgrading from Basic to Pro plan",
    graph_generation={
        "mode": "auto",
        "auto": {
            "schema_id": "saas_schema",
            "simple_schema_mode": True,
            "property_overrides": [
                {
                    "nodeLabel": "Customer",
                    "match": {"name": "Jane Smith"},
                    "set": {
                        "id": "cust_jane_smith",
                        "current_plan": "basic",
                        "account_value": 99,
                        "contact_method": "phone"
                    }
                }
            ]
        }
    }
)

This gives you AI-powered extraction with manual control over key entities.
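
The auto-mode variants in this guide differ only in which keys appear under "auto". A small helper can assemble that dictionary so each call site states only what it needs. This is a sketch under that assumption: `build_auto_config` is a hypothetical name, not an SDK function, and it uses only the keys shown in this guide.

```python
from typing import Any, Dict, List, Optional


def build_auto_config(
    schema_id: Optional[str] = None,
    simple_schema_mode: bool = False,
    property_overrides: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Assemble a graph_generation dict for auto mode.

    Hypothetical helper: it only arranges the keys shown in this guide
    and does not call the Papr API.
    """
    auto: Dict[str, Any] = {}
    if schema_id is not None:
        auto["schema_id"] = schema_id
    if simple_schema_mode:
        auto["simple_schema_mode"] = True
    if property_overrides:
        auto["property_overrides"] = list(property_overrides)

    config: Dict[str, Any] = {"mode": "auto"}
    if auto:  # omit the "auto" key entirely when no options were set
        config["auto"] = auto
    return config


basic = build_auto_config()
hybrid = build_auto_config(
    schema_id="saas_schema",
    simple_schema_mode=True,
    property_overrides=[
        {
            "nodeLabel": "Customer",
            "match": {"name": "Jane Smith"},
            "set": {"id": "cust_jane_smith"},
        }
    ],
)
```

The result is meant to be passed as the graph_generation argument of client.memory.add.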

TypeScript Examples

Auto Mode

import Papr from '@papr/memory';

const client = new Papr({
  xAPIKey: process.env.PAPR_MEMORY_API_KEY
});

const response = await client.memory.add({
  content: "Meeting with Jane Smith from Acme Corp about Q4 project",
  graph_generation: {
    mode: "auto",
    auto: {
      schema_id: "crm_schema",
      simple_schema_mode: true,
      property_overrides: [
        {
          nodeLabel: "Contact",
          match: { name: "Jane Smith" },
          set: {
            id: "contact_jane_smith",
            role: "decision_maker"
          }
        }
      ]
    }
  }
});

Manual Mode

const response = await client.memory.add({
  content: "New contract signed with Acme Corp",
  graph_generation: {
    mode: "manual",
    manual: {
      nodes: [
        {
          id: "contract_acme_2024",
          label: "Contract",
          properties: {
            title: "Service Agreement 2024",
            value: 50000,
            status: "active"
          }
        }
      ],
      relationships: []
    }
  }
});

Best Practices

1. Use Simple Schema Mode in Production

graph_generation={
    "mode": "auto",
    "auto": {
        "simple_schema_mode": True  # Consistency across operations
    }
}

This ensures consistent entity extraction whether you're uploading documents or adding memories directly.

2. Define Property Overrides for Key Entities

For entities that appear frequently, use property overrides:

property_overrides=[
    {
        "nodeLabel": "Customer",
        "match": {"email": "jane@example.com"},
        "set": {"id": "cust_jane", "segment": "enterprise"}
    }
]

3. Start with Auto, Switch to Manual for Critical Data

Use auto mode for general content, manual mode for critical structured data:

# Auto for conversational content
client.memory.add(content=conversation, graph_generation={"mode": "auto"})

# Manual for financial transactions
client.memory.add(content=transaction, graph_generation={"mode": "manual", "manual": {...}})
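
The choice between the two modes can be centralized in one function. The sketch below assumes a simple rule: if the caller already has an explicit graph, use manual mode, otherwise fall back to auto. `graph_generation_for` is a hypothetical helper, and defaulting "relationships" to an empty list mirrors the manual examples in this guide rather than a documented requirement.

```python
from typing import Any, Dict, Optional


def graph_generation_for(manual_graph: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Pick manual mode when an explicit graph is supplied, else auto.

    Hypothetical helper; it builds the request fragment only and does
    not call the Papr API.
    """
    if manual_graph is None:
        return {"mode": "auto"}
    if not manual_graph.get("nodes"):
        raise ValueError("manual mode needs at least one node")
    return {
        "mode": "manual",
        "manual": {
            "nodes": manual_graph["nodes"],
            # The manual examples in this guide always send the key,
            # so default to an empty list when none are given.
            "relationships": manual_graph.get("relationships", []),
        },
    }


conversation_config = graph_generation_for()
transaction_config = graph_generation_for(
    {"nodes": [{"id": "txn_1001", "label": "Transaction", "properties": {"amount": 250}}]}
)
```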

4. Use Descriptive Node IDs

Create human-readable IDs that make graph debugging easier:

nodes=[
    {"id": "customer_jane_smith", ...},  # Good
    {"id": "cust_12345", ...}  # Less readable
]
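
Readable IDs are easier to keep consistent when they are derived rather than typed by hand. A minimal sketch, assuming IDs of the form label plus name in lowercase with underscores; `make_node_id` is a hypothetical helper and the format is a convention, not something the API requires.

```python
import re
import unicodedata


def make_node_id(label: str, name: str) -> str:
    """Build a readable, deterministic ID such as 'customer_jane_smith'."""
    # Strip accents, then collapse every run of characters outside
    # a-z and 0-9 into a single underscore.
    ascii_text = (
        unicodedata.normalize("NFKD", f"{label} {name}")
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    slug = re.sub(r"[^a-z0-9]+", "_", ascii_text.lower()).strip("_")
    if not slug:
        raise ValueError("label and name produced an empty ID")
    return slug


print(make_node_id("Customer", "Jane Smith"))  # customer_jane_smith
print(make_node_id("Company", "Acme Corp."))   # company_acme_corp
```

Two different people with the same name would collide under this scheme, so when names are not unique, append a stable key such as an account number.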

5. Leverage Custom Schemas

Always provide a custom schema for domain-specific extraction:

graph_generation={
    "mode": "auto",
    "auto": {
        "schema_id": "your_domain_schema",  # Critical for accuracy
        "simple_schema_mode": True
    }
}

6. Document Agent Learnings

Have agents document their successful workflows:

# Good practice: Agent documents what works
client.memory.add(
    content="Workflow for handling X: step 1, step 2, step 3. Success rate: 90%",
    metadata={"role": "assistant", "category": "learning"}
)

Common Patterns

CRM Contact Tracking

response = client.memory.add(
    content="Call with John at TechCorp - interested in Enterprise plan upgrade",
    graph_generation={
        "mode": "auto",
        "auto": {
            "schema_id": "crm_schema",
            "property_overrides": [
                {
                    "nodeLabel": "Contact",
                    "match": {"name": "John"},
                    "set": {"company": "TechCorp", "interest_level": "high"}
                }
            ]
        }
    }
)

E-commerce Order

response = client.memory.add(
    content="Order #12345 from Jane Doe: 2x Widget Pro ($299 each)",
    graph_generation={
        "mode": "manual",
        "manual": {
            "nodes": [
                {
                    "id": "order_12345",
                    "label": "Order",
                    "properties": {"order_id": "12345", "total": 598}
                },
                {
                    "id": "customer_jane_doe",
                    "label": "Customer",
                    "properties": {"name": "Jane Doe"}
                },
                {
                    "id": "product_widget_pro",
                    "label": "Product",
                    "properties": {"name": "Widget Pro", "price": 299}
                }
            ],
            "relationships": [
                {
                    "source_node_id": "customer_jane_doe",
                    "target_node_id": "order_12345",
                    "relationship_type": "PLACED"
                },
                {
                    "source_node_id": "order_12345",
                    "target_node_id": "product_widget_pro",
                    "relationship_type": "CONTAINS",
                    "properties": {"quantity": 2}
                }
            ]
        }
    }
)
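
When orders come from an existing system, the nodes and relationships above can be generated from the order record instead of written by hand. The sketch below assumes a particular input layout (the `order` dictionary is hypothetical) and produces the same node and relationship keys used in the example above. `order_to_manual_graph` is not an SDK function.

```python
from typing import Any, Dict, List


def order_to_manual_graph(order: Dict[str, Any]) -> Dict[str, Any]:
    """Turn a structured order record into nodes and relationships.

    Hypothetical helper; the input record layout is an assumption.
    """
    order_node_id = f"order_{order['order_id']}"
    customer_node_id = order["customer"]["id"]

    # The order total is the sum of quantity * unit price over all lines.
    total = sum(line["quantity"] * line["price"] for line in order["lines"])

    nodes: List[Dict[str, Any]] = [
        {"id": order_node_id, "label": "Order",
         "properties": {"order_id": order["order_id"], "total": total}},
        {"id": customer_node_id, "label": "Customer",
         "properties": {"name": order["customer"]["name"]}},
    ]
    relationships: List[Dict[str, Any]] = [
        {"source_node_id": customer_node_id, "target_node_id": order_node_id,
         "relationship_type": "PLACED"},
    ]
    for line in order["lines"]:
        nodes.append({"id": line["product_id"], "label": "Product",
                      "properties": {"name": line["name"], "price": line["price"]}})
        relationships.append({
            "source_node_id": order_node_id,
            "target_node_id": line["product_id"],
            "relationship_type": "CONTAINS",
            "properties": {"quantity": line["quantity"]},
        })
    return {"nodes": nodes, "relationships": relationships}


order = {
    "order_id": "12345",
    "customer": {"id": "customer_jane_doe", "name": "Jane Doe"},
    "lines": [{"product_id": "product_widget_pro", "name": "Widget Pro",
               "price": 299, "quantity": 2}],
}
manual_graph = order_to_manual_graph(order)
print(manual_graph["nodes"][0]["properties"]["total"])  # 598
```

The returned dictionary is intended as the value of the "manual" key in graph_generation.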

Project Management

response = client.memory.add(
    content="New task: Implement user authentication. Assigned to: Alex. Due: 2024-04-15",
    graph_generation={
        "mode": "auto",
        "auto": {
            "schema_id": "project_schema",
            "property_overrides": [
                {
                    "nodeLabel": "Task",
                    "set": {
                        "status": "open",
                        "priority": "high",
                        "project_id": "proj_auth"
                    }
                }
            ]
        }
    }
)

Troubleshooting

Entities Not Being Extracted

Check that the custom schema is properly defined
Verify that property descriptions are clear and LLM-friendly
Try manual mode to see whether schema validation passes
Review required properties, which may be too strict

Duplicate Entities Created

Use property overrides with consistent IDs
Check unique_identifiers in schema definition
Consider using enums for exact matching on key fields

Relationships Not Forming

Verify relationship types are defined in schema
Check allowed_source_types and allowed_target_types
Ensure entities are created before relationships reference them
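
For manual-mode payloads, the last check can be run locally before the request is sent. The sketch below assumes that every relationship endpoint must be declared as a node in the same payload, which may be stricter than what the API actually requires. `check_manual_graph` is a hypothetical helper.

```python
from collections import Counter
from typing import Any, Dict, List


def check_manual_graph(manual: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a manual-mode payload.

    Hypothetical pre-check: flags duplicate node IDs and relationships
    whose source or target is not declared in the same payload.
    """
    problems: List[str] = []
    id_counts = Counter(node["id"] for node in manual.get("nodes", []))

    for node_id, count in id_counts.items():
        if count > 1:
            problems.append(f"duplicate node id: {node_id}")

    for rel in manual.get("relationships", []):
        for key in ("source_node_id", "target_node_id"):
            if rel.get(key) not in id_counts:
                problems.append(
                    f"{rel.get('relationship_type')}: {key} {rel.get(key)!r} is not a declared node"
                )
    return problems


manual = {
    "nodes": [{"id": "company_acme", "label": "Company", "properties": {}}],
    "relationships": [
        {
            "source_node_id": "company_acme",
            "target_node_id": "contract_acme_2024",  # never declared above
            "relationship_type": "HAS_CONTRACT",
        }
    ],
}
problems = check_manual_graph(manual)
```

An empty list means the payload passed both checks.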

Graph Structure Inconsistent

Enable simple_schema_mode for consistency
Use same schema_id across related operations
Add property overrides for key entities
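
To find out whether related operations really share the same settings, you can collect the auto-mode options used across a batch of requests. A minimal sketch, assuming you have the graph_generation dictionaries at hand; `auto_settings_in_use` is a hypothetical helper, and treating a missing simple_schema_mode as False is an assumption about the default.

```python
from typing import Any, Dict, Iterable, Set, Tuple


def auto_settings_in_use(configs: Iterable[Dict[str, Any]]) -> Set[Tuple[Any, bool]]:
    """Collect the distinct (schema_id, simple_schema_mode) pairs used."""
    seen: Set[Tuple[Any, bool]] = set()
    for config in configs:
        if config.get("mode") != "auto":
            continue  # manual payloads carry no auto settings
        auto = config.get("auto", {})
        seen.add((auto.get("schema_id"), bool(auto.get("simple_schema_mode", False))))
    return seen


configs = [
    {"mode": "auto", "auto": {"schema_id": "crm_schema", "simple_schema_mode": True}},
    {"mode": "auto", "auto": {"schema_id": "crm_schema"}},
    {"mode": "manual", "manual": {"nodes": [], "relationships": []}},
]
settings = auto_settings_in_use(configs)
is_consistent = len(settings) <= 1
```

More than one pair in the result, as in this example, means some requests use different extraction settings than others.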

Next Steps

Custom Schemas - Define domain ontologies
GraphQL Analysis - Query your knowledge graph
Document Processing - Extract from documents
API Reference - Complete endpoint documentation