Knowledge Graphs

Knowledge graphs provide a structured representation of information by connecting entities (nodes) through relationships (edges). In the Papr Memory API, knowledge graphs are built automatically through predictive models and can be customized with domain-specific schemas.

Understanding Knowledge Graphs in Papr

Unlike traditional knowledge graph systems where you manually create nodes and relationships, Papr uses predictive models to automatically:

  • Extract entities from your content
  • Identify relationships between entities
  • Connect related memories
  • Build graph structure that enhances retrieval

This approach combines the power of knowledge graphs with the simplicity of a memory API.

How Knowledge Graphs Are Built

Automatic Graph Generation

When you add memories or upload documents, Papr's predictive models analyze the content and automatically:

  1. Identify Entities - People, companies, products, concepts, etc.
  2. Extract Relationships - How entities connect to each other
  3. Connect Memories - Link related information across your memory store
  4. Structure Knowledge - Build a queryable graph representation

Example: Adding a memory about a customer interaction

from papr_memory import Papr
import os

client = Papr(x_api_key=os.environ.get("PAPR_MEMORY_API_KEY"))

response = client.memory.add(
    content="Met with Jane Smith from Acme Corp. She's interested in upgrading to our Enterprise plan for her team of 50 users. Budget approved, targeting Q2 implementation.",
    graph_generation={"mode": "auto"}
)

The system automatically extracts:

  • Entities: Jane Smith (Contact), Acme Corp (Company), Enterprise plan (Product)
  • Relationships: Jane works_for Acme Corp, Acme interested_in Enterprise plan
  • Attributes: Team size (50), Timeline (Q2), Budget status (approved)
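To make the extraction result concrete, the sketch below models the entities and relationships listed above as plain Python data. The `Node` and `Edge` classes, the identifiers, and the choice of which node carries the attributes are illustrative assumptions, not the Papr SDK's response types.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A graph entity (hypothetical shape, not an SDK type)."""
    id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A directed relationship between two node ids (hypothetical shape)."""
    source: str
    type: str
    target: str


# Entities from the example memory. Attaching the attributes to the
# company node is an assumption made purely for illustration.
nodes = [
    Node("jane_smith", "Contact", {"name": "Jane Smith"}),
    Node("acme_corp", "Company", {"name": "Acme Corp", "team_size": 50,
                                  "timeline": "Q2", "budget_status": "approved"}),
    Node("enterprise_plan", "Product", {"name": "Enterprise plan"}),
]
edges = [
    Edge("jane_smith", "works_for", "acme_corp"),
    Edge("acme_corp", "interested_in", "enterprise_plan"),
]


def describe(edges, nodes):
    """Render each edge as 'Source name -type-> Target name'."""
    names = {n.id: n.properties["name"] for n in nodes}
    return [f"{names[e.source]} -{e.type}-> {names[e.target]}" for e in edges]


for line in describe(edges, nodes):
    print(line)
```

Thinking of the result as a list of nodes plus a list of typed edges is usually enough to reason about what a later search or GraphQL query can reach.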

Custom Schemas Guide Extraction

Define your domain ontology to guide what entities and relationships the system extracts:

# First, create a custom schema
schema = client.schemas.create(
    name="CRM Schema",
    description="Customer relationship management schema",
    node_types={
        "Contact": {
            "name": "Contact",
            "label": "Contact",
            "properties": {
                "name": {"type": "string", "required": True},
                "role": {"type": "string", "required": False},
                "email": {"type": "string", "required": False}
            },
            "required_properties": ["name"],
            "unique_identifiers": ["name"]
        },
        "Company": {
            "name": "Company",
            "label": "Company",
            "properties": {
                "name": {"type": "string", "required": True},
                "industry": {"type": "string", "required": False},
                "size": {"type": "string", "enum_values": ["startup", "smb", "enterprise"]}
            },
            "required_properties": ["name"],
            "unique_identifiers": ["name"]
        }
    },
    relationship_types={
        "WORKS_FOR": {
            "name": "WORKS_FOR",
            "allowed_source_types": ["Contact"],
            "allowed_target_types": ["Company"]
        },
        "INTERESTED_IN": {
            "name": "INTERESTED_IN",
            "allowed_source_types": ["Company"],
            "allowed_target_types": ["Product"]
        }
    }
)

# Use the schema when adding memories
response = client.memory.add(
    content="Met with Jane Smith from Acme Corp about Enterprise plan",
    graph_generation={
        "mode": "auto",
        "auto": {
            "schema_id": schema.data.id,
            "simple_schema_mode": True
        }
    }
)

Now the system uses your schema to guide entity extraction and ensure consistent graph structure.
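The `allowed_source_types` and `allowed_target_types` fields express which node labels a relationship may connect. The following is a minimal local sketch of that constraint, useful for sanity-checking a schema before creating it. The `is_allowed` helper is hypothetical and says nothing about how Papr enforces the schema internally.

```python
# Relationship constraints copied from the CRM schema example
RELATIONSHIP_TYPES = {
    "WORKS_FOR": {
        "allowed_source_types": ["Contact"],
        "allowed_target_types": ["Company"],
    },
    "INTERESTED_IN": {
        "allowed_source_types": ["Company"],
        "allowed_target_types": ["Product"],
    },
}


def is_allowed(rel_type: str, source_label: str, target_label: str) -> bool:
    """Return True if the schema permits rel_type from source_label to target_label."""
    rule = RELATIONSHIP_TYPES.get(rel_type)
    if rule is None:
        return False  # relationship type not declared in the schema
    return (
        source_label in rule["allowed_source_types"]
        and target_label in rule["allowed_target_types"]
    )


print(is_allowed("WORKS_FOR", "Contact", "Company"))  # True
print(is_allowed("WORKS_FOR", "Company", "Contact"))  # False
```

Note that direction matters: a Contact works for a Company, but the reverse edge is rejected.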

Dual Memory Types in Graphs

Knowledge graphs can represent two types of memories:

User Memories

Information about users: preferences, history, interactions, context.

client.memory.add(
    content="User prefers email notifications and dark mode UI",
    metadata={
        "role": "user",  # User memory
        "user_id": "user_123",
        "category": "preferences"
    },
    graph_generation={"mode": "auto"}
)

Agent Memories

Information the agent records about its own workflows, learnings, and reasoning patterns.

client.memory.add(
    content="When handling upgrade requests: 1) Verify account standing, 2) Check usage metrics, 3) Confirm budget approval, 4) Schedule implementation call. Success rate: 85%",
    metadata={
        "role": "assistant",  # Agent memory
        "category": "learning",
        "workflow_type": "sales"
    },
    graph_generation={"mode": "auto"}
)

Both types are stored in the same graph, enabling agents to:

  • Learn from past interactions
  • Improve their reasoning over time
  • Build institutional knowledge

Querying Knowledge Graphs

Natural Language Search with Graph Context

The search endpoint automatically leverages the knowledge graph:

search_response = client.memory.search(
    query="What companies are interested in Enterprise plan?",
    user_id="user_123",
    enable_agentic_graph=True,  # Enable graph-enhanced search
    max_memories=20,
    max_nodes=15  # Include graph entities in results
)

# Results include both memories and related graph entities
for memory in search_response.data.memories:
    print(f"Memory: {memory.content}")

for node in search_response.data.nodes:
    print(f"Entity: {node.label} - {node.properties['name']}")

The enable_agentic_graph parameter activates advanced graph traversal that:

  • Resolves ambiguous entity references
  • Finds multi-hop connections
  • Enriches context with related entities
  • Improves retrieval accuracy
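As an illustration of what "multi-hop connections" means, the sketch below runs a breadth-first search over a toy adjacency list and reports how many hops separate each entity from a starting point. It is a generic graph technique on made-up data, not a description of Papr's internal traversal. The function name `neighbors_within` is hypothetical.

```python
from collections import deque


def neighbors_within(graph: dict, start: str, max_hops: int) -> dict:
    """Breadth-first search: map each node reachable within max_hops to its hop count."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    del distances[start]  # the start node is not its own neighbor
    return distances


# Undirected toy graph based on the CRM example
graph = {
    "Jane Smith": ["Acme Corp"],
    "Acme Corp": ["Jane Smith", "Enterprise plan"],
    "Enterprise plan": ["Acme Corp"],
}

print(neighbors_within(graph, "Jane Smith", 1))
print(neighbors_within(graph, "Jane Smith", 2))
```

With one hop only the employer is reachable from the contact. With two hops the product the employer is interested in also comes into view, which is the kind of context a purely text-based match could miss.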

GraphQL for Structured Queries

Use GraphQL to query your knowledge graph directly for analytics and insights:

# Find all companies and their contacts
response = client.graphql.query(
    query="""
    query GetCompanies {
      companies {
        name
        size
        contacts {
          name
          role
        }
        interested_in_products {
          name
          pricing_tier
        }
      }
    }
    """
)

for company in response.data['companies']:
    print(f"Company: {company['name']} ({company['size']})")
    print(f"  Contacts: {len(company['contacts'])}")
    print(f"  Interested in: {len(company['interested_in_products'])} products")

Multi-Hop Queries

Traverse relationships across multiple hops:

# Find customers interested in products purchased by other premium customers
response = client.graphql.query(
    query="""
    query RelatedCustomers($customerId: ID!) {
      customer(id: $customerId) {
        name
        tier
        purchased_products {
          name
          also_purchased_by(where: { tier: "premium" }) {
            name
            email
            company {
              name
              industry
            }
          }
        }
      }
    }
    """,
    variables={"customerId": "cust_123"}
)
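Nested multi-hop results are often easier to work with once flattened into rows. The sketch below assumes the `customer` object comes back shaped like the query above. The `flatten_related` helper and the sample payload are made up for illustration.

```python
def flatten_related(customer: dict) -> list:
    """Turn the nested customer -> product -> buyer shape into flat rows."""
    rows = []
    for product in customer.get("purchased_products", []):
        for buyer in product.get("also_purchased_by", []):
            company = buyer.get("company") or {}  # company may be null
            rows.append({
                "product": product["name"],
                "buyer": buyer["name"],
                "company": company.get("name"),
            })
    return rows


# Made-up payload shaped like the RelatedCustomers query result
sample = {
    "name": "Example Customer",
    "tier": "premium",
    "purchased_products": [
        {
            "name": "Enterprise plan",
            "also_purchased_by": [
                {"name": "Buyer A", "email": "a@example.com",
                 "company": {"name": "Company A", "industry": "technology"}},
                {"name": "Buyer B", "email": "b@example.com", "company": None},
            ],
        }
    ],
}

rows = flatten_related(sample)
for row in rows:
    print(row)
```

Each row corresponds to one two-hop path (customer to product to other buyer), so the row count is the number of paths the query found.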

Aggregations and Analytics

Extract insights from your graph:

response = client.graphql.query(
    query="""
    query SalesAnalytics {
      companies_by_size: companies {
        size
        count: aggregate {
          count
        }
        avg_team_size: aggregate {
          avg(field: "team_size")
        }
        revenue: opportunities_aggregate {
          sum {
            value
          }
        }
      }
    }
    """
)

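# A GraphQL response mirrors the query shape, so each aliased aggregate is a nested object
for segment in response.data['companies_by_size']:
    print(f"{segment['size']}: {segment['count']['count']} companies")
    print(f"  Avg team size: {segment['avg_team_size']['avg']}")
    print(f"  Total revenue: ${segment['revenue']['sum']['value']}")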
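If the aggregate fields you need are not available in your schema, the same kind of summary can be computed client-side from a plain list of company records. This is a minimal sketch on made-up data. The `summarize_by_size` helper and the `team_size` field on each record are assumptions.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_size(companies: list) -> dict:
    """Group company records by 'size' and compute count and average team size."""
    groups = defaultdict(list)
    for company in companies:
        groups[company["size"]].append(company["team_size"])
    return {
        size: {"count": len(team_sizes), "avg_team_size": mean(team_sizes)}
        for size, team_sizes in groups.items()
    }


# Made-up records for illustration only
companies = [
    {"name": "A", "size": "startup", "team_size": 10},
    {"name": "B", "size": "startup", "team_size": 20},
    {"name": "C", "size": "enterprise", "team_size": 50},
]

summary = summarize_by_size(companies)
for size, stats in summary.items():
    print(f"{size}: {stats['count']} companies, avg team size {stats['avg_team_size']}")
```

Server-side aggregation is preferable for large graphs, since this approach requires fetching every record first.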

Property Overrides for Consistent Entities

Ensure entities are consistently identified across multiple mentions:

response = client.memory.add(
    content="Spoke with Jane from Acme about deployment timeline",
    graph_generation={
        "mode": "auto",
        "auto": {
            "schema_id": "crm_schema",
            "property_overrides": [
                {
                    "nodeLabel": "Contact",
                    "match": {"name": "Jane"},
                    "set": {
                        "id": "contact_jane_smith",
                        "full_name": "Jane Smith",
                        "company": "Acme Corp"
                    }
                },
                {
                    "nodeLabel": "Company",
                    "match": {"name": "Acme"},
                    "set": {
                        "id": "company_acme",
                        "full_name": "Acme Corp",
                        "industry": "technology"
                    }
                }
            ]
        }
    }
)

This ensures that "Jane", "Jane Smith", and "Jane from Acme" all map to the same entity in your graph.
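One way to read the `match` / `set` pairs is as a rule: when an extracted node has the given label and its properties equal everything in `match`, merge in the values from `set`. The sketch below implements that reading locally. It is an interpretation for illustration, not Papr's actual resolution logic, and the `apply_overrides` helper is hypothetical.

```python
def apply_overrides(node: dict, overrides: list) -> dict:
    """Return a copy of node with 'set' values merged in for every matching override."""
    original = node["properties"]
    properties = dict(original)
    for override in overrides:
        if override["nodeLabel"] != node["label"]:
            continue
        # An override without "match" applies to every node with that label (assumption)
        match = override.get("match", {})
        if all(original.get(key) == value for key, value in match.items()):
            properties.update(override["set"])
    return {"label": node["label"], "properties": properties}


overrides = [
    {
        "nodeLabel": "Contact",
        "match": {"name": "Jane"},
        "set": {"id": "contact_jane_smith", "full_name": "Jane Smith"},
    },
    {
        "nodeLabel": "Company",
        "match": {"name": "Acme"},
        "set": {"id": "company_acme", "full_name": "Acme Corp"},
    },
]

extracted = {"label": "Contact", "properties": {"name": "Jane"}}
resolved = apply_overrides(extracted, overrides)
print(resolved["properties"]["id"])  # contact_jane_smith
```

Because the override pins a stable `id`, later mentions that resolve to the same rule end up on the same node rather than creating near-duplicates.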

Document Processing and Graphs

Upload documents and extract structured knowledge graphs:

# Upload a contract with schema-guided extraction.
# The with-block closes the file handle once the upload call returns.
with open("service_agreement.pdf", "rb") as contract_file:
    response = client.document.upload(
        file=contract_file,
        schema_id="legal_contract_schema",
        simple_schema_mode=True,
        hierarchical_enabled=True,
        property_overrides=[
            {
                "nodeLabel": "Contract",
                "set": {"status": "active", "department": "legal"}
            }
        ]
    )

The system analyzes the document and extracts:

  • Contract entities (title, value, dates)
  • Party entities (companies, individuals, roles)
  • Obligation entities (responsibilities, deadlines)
  • Relationships between all entities

Manual Graph Generation

For critical structured data, specify exact graph structure:

response = client.memory.add(
    content="Q4 2024 Enterprise deal with Acme Corp: $50,000 annual contract",
    graph_generation={
        "mode": "manual",
        "manual": {
            "nodes": [
                {
                    "id": "company_acme",
                    "label": "Company",
                    "properties": {
                        "name": "Acme Corp",
                        "size": "enterprise"
                    }
                },
                {
                    "id": "deal_q4_2024",
                    "label": "Deal",
                    "properties": {
                        "value": 50000,
                        "term": "annual",
                        "quarter": "Q4 2024",
                        "status": "closed"
                    }
                }
            ],
            "relationships": [
                {
                    "source_node_id": "company_acme",
                    "target_node_id": "deal_q4_2024",
                    "relationship_type": "SIGNED",
                    "properties": {
                        "date": "2024-12-15",
                        "contract_id": "CNT-2024-456"
                    }
                }
            ]
        }
    }
)
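When writing a manual graph by hand, a mistyped node id in a relationship is an easy error to make. The sketch below checks a `manual` payload locally before it is sent. The `find_dangling_relationships` helper is hypothetical, and whether the API itself rejects such references is not covered here.

```python
def find_dangling_relationships(manual: dict) -> list:
    """Return relationships whose source or target id is not among the declared nodes."""
    node_ids = {node["id"] for node in manual.get("nodes", [])}
    return [
        rel for rel in manual.get("relationships", [])
        if rel["source_node_id"] not in node_ids
        or rel["target_node_id"] not in node_ids
    ]


manual = {
    "nodes": [
        {"id": "company_acme", "label": "Company", "properties": {"name": "Acme Corp"}},
        {"id": "deal_q4_2024", "label": "Deal", "properties": {"value": 50000}},
    ],
    "relationships": [
        {"source_node_id": "company_acme", "target_node_id": "deal_q4_2024",
         "relationship_type": "SIGNED"},
        # Deliberately broken reference to show what the check catches
        {"source_node_id": "company_acme", "target_node_id": "deal_q1_2025",
         "relationship_type": "SIGNED"},
    ],
}

dangling = find_dangling_relationships(manual)
print([rel["target_node_id"] for rel in dangling])
```

An empty result means every relationship points at nodes declared in the same payload.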

Visualizing Your Knowledge Graph

Papr doesn't provide built-in visualization, but you can query your graph with GraphQL and render the results with tools such as:

  • Neo4j Browser - Connect to your Neo4j instance directly
  • D3.js - Build custom web visualizations
  • Cytoscape.js - Interactive graph visualization
  • Gephi - Advanced network analysis

Example: Export graph data for visualization

# Get all nodes and relationships
response = client.graphql.query(
    query="""
    query ExportGraph {
      nodes: all_entities {
        id
        label
        properties
      }
      edges: all_relationships {
        source_id
        target_id
        type
        properties
      }
    }
    """
)

# Convert to visualization format (e.g., D3.js force layout)
graph_data = {
    "nodes": response.data['nodes'],
    "links": [
        {
            "source": edge['source_id'],
            "target": edge['target_id'],
            "type": edge['type']
        }
        for edge in response.data['edges']
    ]
}
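The same exported data can be reshaped for Cytoscape.js, which expects a list of elements where each entry wraps its fields in a `data` object and edges carry `source` and `target`. The sketch below assumes nodes and edges shaped like the `ExportGraph` query result. The sample records and the edge id format are made up for illustration.

```python
import json


def to_cytoscape_elements(nodes: list, edges: list) -> list:
    """Convert exported nodes and edges into Cytoscape.js 'elements' entries."""
    elements = [
        {"data": {"id": node["id"], "label": node["label"]}}
        for node in nodes
    ]
    elements += [
        {"data": {
            # Edge ids only need to be unique; this format is an arbitrary choice
            "id": f"{edge['source_id']}-{edge['type']}-{edge['target_id']}",
            "source": edge["source_id"],
            "target": edge["target_id"],
            "label": edge["type"],
        }}
        for edge in edges
    ]
    return elements


# Made-up export shaped like the ExportGraph query result
nodes = [
    {"id": "company_acme", "label": "Company", "properties": {"name": "Acme Corp"}},
    {"id": "deal_q4_2024", "label": "Deal", "properties": {"value": 50000}},
]
edges = [
    {"source_id": "company_acme", "target_id": "deal_q4_2024",
     "type": "SIGNED", "properties": {}},
]

elements = to_cytoscape_elements(nodes, edges)
print(json.dumps(elements, indent=2))
```

The printed JSON can be saved to a file and passed as the `elements` option when initializing a Cytoscape.js instance in the browser.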

Best Practices

1. Define Custom Schemas for Your Domain

Guide extraction by defining your domain's entities and relationships:

schema = client.schemas.create(
    name="Your Domain Schema",
    node_types={...},
    relationship_types={...}
)

2. Use Simple Schema Mode for Consistency

graph_generation={
    "mode": "auto",
    "auto": {
        "simple_schema_mode": True  # System + one custom schema
    }
}

3. Property Overrides for Key Entities

Ensure consistent entity IDs for frequently mentioned entities:

property_overrides=[
    {
        "nodeLabel": "Customer",
        "match": {"name": "Important Customer"},
        "set": {"id": "cust_important", "vip": True}
    }
]

4. Enable Agentic Graph Search

For complex queries, enable graph-enhanced search:

search_response = client.memory.search(
    query="...",
    enable_agentic_graph=True,
    max_nodes=15
)

5. Use GraphQL for Analytics

For structured insights and aggregations, use GraphQL:

response = client.graphql.query(
    query="""
    query Analytics {
      # Your GraphQL query
    }
    """
)

6. Document Agent Learnings in the Graph

Have agents document their workflows and learnings:

client.memory.add(
    content="Workflow for X: step 1, step 2, step 3...",
    metadata={"role": "assistant", "category": "learning"}
)

Common Use Cases

CRM and Sales

  • Track customer interactions and relationships
  • Identify upsell opportunities through connections
  • Analyze deal pipelines and conversion paths
  • Document successful sales workflows (agent memory)

Legal and Contracts

  • Extract entities from contracts (parties, obligations, terms)
  • Track relationships between documents and clauses
  • Analyze contract networks and dependencies
  • Document legal reasoning patterns (agent memory)

Product and Engineering

  • Map code dependencies and relationships
  • Track bug relationships and root causes
  • Document system architecture
  • Store engineering best practices (agent memory)

Research and Knowledge Management

  • Connect research papers and citations
  • Build concept maps and ontologies
  • Track hypothesis testing and results
  • Document research methodologies (agent memory)
