Messages Management

Store and manage conversation history with automatic memory creation and intelligent session compression.

Overview

The Messages API enables you to:

  • Store chat messages with session grouping
  • Automatically create memories from conversations
  • Retrieve conversation history with pagination
  • Compress long conversations for LLM context
  • Track processing status for message sessions

Messages vs Direct Memories

When to use Messages:

  • Chat applications and conversational AI
  • You want automatic analysis and memory creation
  • Message history needs to be preserved
  • Session-based context is important

When to use Direct Memory API:

  • Structured data with known entities
  • Explicit control over what gets stored
  • No conversation context needed
  • You've already processed/analyzed the content

Basic Usage

Storing Messages

Store a chat message with optional AI processing:

from papr_memory import Papr
import os

client = Papr(x_api_key=os.environ.get("PAPR_MEMORY_API_KEY"))

# Store a user message
response = client.messages.store(
    content="I need help planning my Q4 product launch. Budget is $50k.",
    role="user",
    session_id="session_123",
    title="Q4 Product Launch Planning",  # Optional: session title
    external_user_id="user_456",
    process_messages=True  # Enable AI analysis and memory creation
)

print(f"Message stored: {response.message_id}")

Processing Flow

When process_messages=True:

  1. Message is immediately stored
  2. Background processing analyzes the message
  3. If memory-worthy, creates a memory with role-based categorization
  4. Links the message to the created memory

Role-Based Memory Categories:

  • User messages → categories: preference, task, goal, facts, context
  • Assistant messages → categories: skills, learning
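
After processing runs, each message carries the created_memory_id field shown later on this page, so you can scan a session's history to see which messages actually produced memories. The Message class below is a minimal stand-in for the SDK's response objects (field names follow this page's examples), just for illustration:

```python
from dataclasses import dataclass
from typing import Optional

# Minimal stand-in for the message objects returned by get_history;
# role/content/created_memory_id mirror the fields used in this page.
@dataclass
class Message:
    role: str
    content: str
    created_memory_id: Optional[str] = None

def memories_created(messages):
    """Return (role, memory_id) pairs for messages that produced a memory."""
    return [(m.role, m.created_memory_id) for m in messages if m.created_memory_id]

msgs = [
    Message("user", "My goal is to ship the beta by March", "mem_1"),
    Message("assistant", "Noted, I'll help you plan for March."),
    Message("user", "thanks"),  # trivial messages are filtered, so no memory
]
print(memories_created(msgs))  # [('user', 'mem_1')]
```

With the real client, you would pass history.messages from get_history instead of the stand-in list.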

Retrieving Conversation History

Get message history for a session with pagination:

# Get conversation history
history = client.messages.get_history(
    session_id="session_123",
    limit=50,  # Max 100
    skip=0     # For pagination
)

print(f"Total messages: {history.total_count}")

# Display messages
for message in history.messages:
    print(f"{message.role}: {message.content}")
    if message.created_memory_id:
        print(f"  → Created memory: {message.created_memory_id}")

# Check if summaries are available
if history.summaries:
    print(f"Short-term summary: {history.summaries.short_term}")
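
Because limit caps at 100, long sessions need multiple calls. A paging helper, sketched against the same get_history signature used above (limit, skip, total_count), might look like:

```python
def fetch_all_messages(client, session_id, page_size=100):
    """Page through a session's history with limit/skip until exhausted."""
    messages, skip = [], 0
    while True:
        page = client.messages.get_history(
            session_id=session_id,
            limit=page_size,
            skip=skip,
        )
        messages.extend(page.messages)
        skip += len(page.messages)
        # Stop once we've seen everything (or the server returns an empty page)
        if skip >= page.total_count or not page.messages:
            break
    return messages
```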

Session Compression

For long conversations, use session compression to get hierarchical summaries optimized for LLM context.

# Compress conversation for LLM context
compressed = client.messages.compress_session(
    session_id="session_123"
)

# Get summaries at different levels
print(f"Short-term (recent): {compressed.summaries.short_term}")
print(f"Medium-term (key points): {compressed.summaries.medium_term}")
print(f"Long-term (overview): {compressed.summaries.long_term}")

# Use context_for_llm in your prompts
context = compressed.context_for_llm
print(f"Compressed context ({len(context)} chars): {context}")

Compression Details:

  • Automatically generated every 15 messages
  • Available on-demand via compress endpoint
  • Returns hierarchical summaries (short/medium/long-term)
  • Includes context_for_llm field optimized for LLM prompts
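
One way to use the three tiers is to fall back from the detailed recent summary to the terser overview when your prompt budget is tight. This sketch assumes short_term is the most detailed tier and long_term the most condensed, per the labels above:

```python
def pick_summary(summaries, budget_chars):
    """Return the most detailed summary tier that fits within budget_chars."""
    for text in (summaries.short_term, summaries.medium_term, summaries.long_term):
        if text and len(text) <= budget_chars:
            return text
    # Nothing fits: return the tersest overview anyway
    return summaries.long_term
```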

Session Status

Check session metadata and processing status:

# Get session status
status = client.messages.get_session_status(
    session_id="session_123"
)

print(f"Title: {status.title}")
print(f"Message count: {status.message_count}")
print(f"Created memories: {status.memories_created}")
print(f"Last activity: {status.last_activity_at}")

Batch Processing

Trigger batch processing of session messages into memories:

# Process all messages in a session
response = client.messages.process_session(
    session_id="session_123"
)

print(f"Processing status: {response.status}")
print(f"Messages queued: {response.queued_count}")

Complete Chat Application Example

Here's a complete example of building a chat application with messages:

from papr_memory import Papr
import os

class ChatApplication:
    def __init__(self):
        self.client = Papr(x_api_key=os.environ.get("PAPR_MEMORY_API_KEY"))
    
    def handle_user_message(self, session_id: str, user_id: str, message: str, title: str | None = None):
        """Store user message and create memories"""
        # Store user message
        user_msg = self.client.messages.store(
            content=message,
            role="user",
            session_id=session_id,
            title=title,  # Set once per session
            external_user_id=user_id,
            process_messages=True  # Enable memory creation
        )
        
        return user_msg.message_id
    
    def handle_assistant_response(self, session_id: str, response: str):
        """Store assistant response and capture learnings"""
        # Store assistant message
        assistant_msg = self.client.messages.store(
            content=response,
            role="assistant",
            session_id=session_id,
            process_messages=True  # Capture skills and learnings
        )
        
        return assistant_msg.message_id
    
    def get_conversation_context(self, session_id: str, max_messages: int = 20):
        """Get recent conversation context"""
        history = self.client.messages.get_history(
            session_id=session_id,
            limit=max_messages
        )
        
        # Build context string for LLM
        context = []
        for msg in history.messages:
            context.append(f"{msg.role}: {msg.content}")
        
        return "\n".join(context)
    
    def get_compressed_context(self, session_id: str):
        """Get compressed context for long conversations"""
        compressed = self.client.messages.compress_session(
            session_id=session_id
        )
        
        # Use pre-compressed context
        return compressed.context_for_llm
    
    def search_relevant_memories(self, session_id: str, query: str):
        """Search memories relevant to current conversation"""
        # Get user from session
        history = self.client.messages.get_history(session_id=session_id, limit=1)
        user_id = history.messages[0].external_user_id if history.messages else None
        
        # Search memories
        results = self.client.memory.search(
            query=query,
            external_user_id=user_id,
            enable_agentic_graph=True
        )
        
        return results.data.memories

# Usage
chat = ChatApplication()

# New conversation
session_id = "conv_product_launch_2024"
user_id = "user_sarah"

# User asks question
chat.handle_user_message(
    session_id=session_id,
    user_id=user_id,
    message="What were the key takeaways from our Q3 planning meeting?",
    title="Q4 Planning Discussion"
)

# Search relevant memories
memories = chat.search_relevant_memories(
    session_id=session_id,
    query="Q3 planning meeting takeaways"
)

# Generate a response using the retrieved memories
assistant_response = (
    f"Based on our Q3 planning: {memories[0].content}"
    if memories
    else "I couldn't find any notes from our Q3 planning."
)

# Store assistant response
chat.handle_assistant_response(
    session_id=session_id,
    response=assistant_response
)

# After many messages, get compressed context
context = chat.get_compressed_context(session_id)
print(f"Compressed context for LLM: {context}")

Best Practices

1. Use Consistent Session IDs

Generate predictable session IDs for easier management:

# Good: Descriptive session IDs
import time
timestamp = int(time.time())
session_id = f"user_{user_id}_conv_{timestamp}"

# Also good: UUID for uniqueness
import uuid
session_id = str(uuid.uuid4())

2. Set Session Title Early

Set the title parameter in your first message to make sessions discoverable:

client.messages.store(
    content="First message",
    session_id="new_session",
    title="Product Launch Discussion",  # Set this early
    role="user"
)

3. Enable Processing Selectively

Not every message needs to become a memory:

# Important conversation - create memories
client.messages.store(
    content="Decision: We'll launch in Q4 with $50k budget",
    role="user",
    session_id=session_id,
    process_messages=True  # Create memory
)

# Casual chat - just store
client.messages.store(
    content="Thanks!",
    role="user",
    session_id=session_id,
    process_messages=False  # Skip memory creation
)

4. Compress Long Conversations

After ~20+ messages, use compression to reduce token usage:

if message_count > 20:
    # Use compressed context instead of full history
    context = client.messages.compress_session(session_id).context_for_llm
else:
    # Use full history for short conversations
    history = client.messages.get_history(session_id)
    context = "\n".join(f"{m.role}: {m.content}" for m in history.messages)

5. Combine Recent and Historical Context

Use messages for recent context and memory search for historical context:

# Recent conversation context
recent = client.messages.get_history(session_id, limit=10)

# Historical relevant memories
historical = client.memory.search(
    query=user_question,
    external_user_id=user_id,
    max_memories=5
)

# Combine both for comprehensive context
memory_text = "\n".join(m.content for m in historical.data.memories)
recent_text = "\n".join(f"{m.role}: {m.content}" for m in recent.messages)
full_context = f"{memory_text}\n\nRecent conversation:\n{recent_text}"

When to Use Each Endpoint

Endpoint and when to use it:

  • POST /v1/messages: storing every chat message
  • GET /v1/messages/sessions/{id}: full conversation history needed
  • GET /v1/messages/sessions/{id}/compress: long conversations (>20 messages)
  • GET /v1/messages/sessions/{id}/status: checking session metadata
  • POST /v1/messages/sessions/{id}/process: batch processing stored messages

Advanced Patterns

Multi-User Conversations

Track group conversations with multiple participants:

# Store messages from different users in same session
client.messages.store(
    content="I think we should prioritize mobile first",
    role="user",
    session_id="team_meeting_123",
    external_user_id="user_alice"
)

client.messages.store(
    content="Agreed, but we need to consider desktop users too",
    role="user",
    session_id="team_meeting_123",
    external_user_id="user_bob"
)

Context Window Management

For very long conversations, combine compression with pagination:

def get_llm_context(session_id, max_tokens=4000):
    # Try the compressed version first
    compressed = client.messages.compress_session(session_id)
    context = compressed.context_for_llm
    
    # Rough budget check: ~4 characters per token
    if len(context) < max_tokens * 4:
        return context
    
    # Still too long: fall back to the most recent messages only
    history = client.messages.get_history(session_id, limit=10)
    return "\n".join(f"{m.role}: {m.content}" for m in history.messages)

Hybrid Memory Strategy

Use messages for chat, direct memories for structured data:

# Chat messages - automatic analysis
client.messages.store(
    content="User said they prefer blue theme",
    role="user",
    session_id=session_id,
    process_messages=True
)

# Structured preference - direct memory
client.memory.add(
    content="User UI preferences",
    metadata={
        "external_user_id": user_id,
        "preference_type": "ui",
        "theme": "blue",
        "font_size": "large"
    }
)

Troubleshooting

Messages Not Creating Memories

Issue: process_messages=True but no memories created

Solutions:

  • Check if message content is substantive (system filters trivial messages like "ok", "thanks")
  • Verify authentication is correct
  • Check background processing isn't disabled server-side
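
A quick diagnostic is to compare message_count against memories_created from the session status shown earlier, then re-trigger batch processing when nothing was created. The helper below is a sketch over those two fields:

```python
def needs_reprocessing(status):
    """True when a session has messages but processing created no memories."""
    return status.message_count > 0 and status.memories_created == 0

# Usage sketch with the real client:
# status = client.messages.get_session_status(session_id="session_123")
# if needs_reprocessing(status):
#     client.messages.process_session(session_id="session_123")
```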

Missing Summaries

Issue: summaries field is null in history response

Solutions:

  • Summaries are generated automatically every 15 messages; shorter sessions may not have one yet
  • Use the compress_session endpoint to generate summaries on demand
  • Check that the session has at least 5 messages (the minimum for compression)
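
The first two fallbacks can be combined: prefer summaries already attached to the history response, and only call the compress endpoint when they are missing. A sketch using the calls shown on this page:

```python
def get_summaries(client, session_id):
    """Use summaries from history when present; otherwise compress on demand."""
    history = client.messages.get_history(session_id=session_id, limit=1)
    if history.summaries:
        return history.summaries
    return client.messages.compress_session(session_id=session_id).summaries
```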

High Token Usage

Issue: Conversation history too large for LLM context

Solutions:

  • Use compress_session instead of full history
  • Implement pagination with limit parameter
  • Combine compressed summaries with recent messages only

Next Steps