Advanced Search

The Papr Memory API supports complex search patterns and filters. This guide covers the advanced techniques in depth, including vector and hybrid search, multi-stage retrieval, faceting, time weighting, personalization, metadata filtering, and retrieval for RAG.

While basic search provides good results for simple queries, advanced search techniques can significantly improve retrieval accuracy for complex use cases.

Leverage embedding-based similarity search:

// Perform vector search with custom embeddings
const results = await papr.memory.vectorSearch({
  embedding: [0.1, 0.2, 0.3, /* ... */], // Your custom embedding vector
  dimensions: 1536, // Number of dimensions in your embedding
  similarityMetric: "cosine",
  threshold: 0.75, // Minimum similarity threshold
  limit: 10
});

// Generate embeddings on-the-fly and search
const textResults = await papr.memory.vectorSearch({
  text: "Customer complained about checkout process",
  embeddingModel: "text-embedding-3-large", // OpenAI model used
  similarityMetric: "cosine",
  limit: 10
});
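
To make the `similarityMetric: "cosine"` and `threshold` parameters concrete, here is a minimal local sketch of cosine-similarity ranking. It illustrates the general technique only and is not how the Papr service is implemented; `cosineSimilarity` and `rankByCosine` are hypothetical helpers, and the three-dimensional vectors are toy data.

```javascript
// Hypothetical helpers for illustration; not part of the Papr SDK.

// Cosine similarity of two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same number of dimensions");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every item, drop those below the threshold, and keep the top `limit`.
function rankByCosine(queryEmbedding, items, { threshold = 0.75, limit = 10 } = {}) {
  return items
    .map((item) => ({ ...item, score: cosineSimilarity(queryEmbedding, item.embedding) }))
    .filter((item) => item.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}

const sampleItems = [
  { id: "a", embedding: [1, 0, 0] },
  { id: "b", embedding: [0.9, 0.1, 0] },
  { id: "c", embedding: [0, 1, 0] },
];
const ranked = rankByCosine([1, 0, 0], sampleItems, { threshold: 0.75, limit: 2 });
console.log(ranked.map((r) => r.id)); // ids "a" then "b"; "c" falls below the threshold
```

Raising the threshold trades recall for precision: fewer results come back, but each is closer to the query vector.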

Hybrid Search Strategies

Combine semantic, keyword, and vector search in a single request, and weight how much each method contributes to the final ranking:

// Advanced hybrid search with custom weights
const results = await papr.memory.hybridSearch({
  query: "payment processing errors",
  hybridStrategy: {
    semantic: {
      weight: 0.6,
      model: "text-embedding-3-large"
    },
    keyword: {
      weight: 0.3,
      fields: ["content", "metadata.title"]
    },
    vector: {
      weight: 0.1,
      embedding: customEmbedding,
      dimensions: 1536
    }
  },
  limit: 10
});
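
The weights in `hybridStrategy` can be read as a weighted sum of per-method scores. The sketch below shows that reading locally, assuming each method returns scores already normalized to the range 0 to 1. The actual fusion formula used by the service is not specified in this guide, so treat `fuseScores` as a hypothetical illustration.

```javascript
// Hypothetical helper: combine per-method scores into one weighted score.
// `scoresByMethod` maps a method name to { memoryId: score in [0, 1] }.
function fuseScores(scoresByMethod, weights) {
  const totalWeight = Object.values(weights).reduce((sum, w) => sum + w, 0);
  if (totalWeight <= 0) throw new Error("Weights must sum to a positive number");

  const fused = new Map();
  for (const [method, weight] of Object.entries(weights)) {
    const scores = scoresByMethod[method] ?? {};
    for (const [id, score] of Object.entries(scores)) {
      // A memory that a method did not return contributes 0 for that method.
      fused.set(id, (fused.get(id) ?? 0) + (weight / totalWeight) * score);
    }
  }
  return [...fused.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

const fusedResults = fuseScores(
  {
    semantic: { m1: 0.9, m2: 0.4 },
    keyword: { m1: 0.2, m2: 1.0 },
    vector: { m1: 0.5 },
  },
  { semantic: 0.6, keyword: 0.3, vector: 0.1 }
);
console.log(fusedResults); // m1 (about 0.65) ranks above m2 (about 0.54)
```

Dividing by the total weight keeps the fused score in the same 0 to 1 range even when the configured weights do not sum to exactly 1.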

Implement multi-stage retrieval pipelines:

// First stage: broad recall using keyword search
const firstStageResults = await papr.memory.search({
  query: "customer refund policy",
  limit: 50 // Get more results for the first stage
});

// Second stage: rerank using semantic similarity
const secondStageResults = await papr.memory.rerank({
  query: "customer refund policy",
  memories: firstStageResults,
  model: "rerank-multilingual-v1",
  limit: 10 // Narrow down to top results
});

Implement faceted search to allow users to refine results:

// Perform faceted search
const results = await papr.memory.facetedSearch({
  query: "product feedback",
  facets: [
    {
      field: "metadata.category",
      limit: 5
    },
    {
      field: "metadata.department",
      limit: 5
    },
    {
      field: "metadata.sentiment",
      limit: 3
    }
  ],
  limit: 20
});

// Access facet counts
console.log(results.facets);
// Example output:
// {
//   "metadata.category": [
//     { value: "feedback", count: 12 },
//     { value: "bug_report", count: 5 },
//     ...
//   ],
//   "metadata.department": [
//     { value: "engineering", count: 8 },
//     { value: "product", count: 7 },
//     ...
//   ]
// }
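
For intuition about what the facet counts represent, the following sketch computes the same kind of structure from an in-memory list. `getPath` and `computeFacets` are hypothetical helpers written for this example, and the sample records are made up.

```javascript
// Hypothetical helper: read a dotted path such as "metadata.category".
function getPath(obj, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), obj);
}

// Count values for each facet field, keeping the `limit` most frequent values.
function computeFacets(items, facets) {
  const output = {};
  for (const { field, limit } of facets) {
    const counts = new Map();
    for (const item of items) {
      const value = getPath(item, field);
      if (value === undefined || value === null) continue; // skip missing fields
      counts.set(value, (counts.get(value) ?? 0) + 1);
    }
    output[field] = [...counts.entries()]
      .map(([value, count]) => ({ value, count }))
      .sort((a, b) => b.count - a.count)
      .slice(0, limit);
  }
  return output;
}

const facetSample = [
  { metadata: { category: "feedback", department: "product" } },
  { metadata: { category: "feedback", department: "engineering" } },
  { metadata: { category: "bug_report", department: "engineering" } },
];
const facetCounts = computeFacets(facetSample, [
  { field: "metadata.category", limit: 5 },
  { field: "metadata.department", limit: 5 },
]);
console.log(facetCounts["metadata.category"]);
```

Each facet's `limit` caps how many distinct values are reported, which is separate from the top-level `limit` on returned results.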

Implement recency bias in search results:

// Search with time-weighting
const results = await papr.memory.search({
  query: "marketing campaign ideas",
  timeWeighting: {
    field: "context.timestamp",
    decay: 0.5,
    scale: "30d" // Half the relevance score after 30 days
  },
  limit: 10
});
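
One common way to read `decay` and `scale` is exponential decay: a memory's score is multiplied by `decay` once per `scale` period of age. The sketch below assumes that interpretation. The service's exact formula is not documented here, and `timeWeightedScore` is a hypothetical helper.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: multiply the score by `decay` for every `scaleDays` of age.
function timeWeightedScore(
  score,
  timestamp,
  { decay = 0.5, scaleDays = 30, now = Date.now() } = {}
) {
  // Timestamps in the future are treated as age 0 rather than boosted.
  const ageDays = Math.max(0, (now - new Date(timestamp).getTime()) / MS_PER_DAY);
  return score * Math.pow(decay, ageDays / scaleDays);
}

const referenceNow = Date.parse("2024-03-01T00:00:00Z");
const fresh = timeWeightedScore(0.8, "2024-03-01T00:00:00Z", { now: referenceNow });
const monthOld = timeWeightedScore(0.8, "2024-01-31T00:00:00Z", { now: referenceNow });
console.log(fresh, monthOld); // the 30-day-old memory scores half as much as the fresh one
```

With `decay: 0.5`, the `scale` value acts as a half-life, so a shorter scale produces a stronger recency bias.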

Customize search results based on user profiles:

// Personalized search
const results = await papr.memory.personalizedSearch({
  query: "product recommendations",
  userId: "user-123",
  userProfile: {
    interests: ["photography", "technology", "travel"],
    recentPurchases: ["camera", "tripod"],
    preferences: {
      priceRange: "premium",
      brands: ["Canon", "Sony"]
    }
  },
  personalizationStrength: 0.7, // How much to bias towards user profile
  limit: 10
});

Find memories based on location data:

// Geospatial search for memories near a location
const results = await papr.memory.geoSearch({
  location: {
    lat: 37.7749,
    lon: -122.4194
  },
  radius: "10km",
  locationField: "metadata.location",
  query: "coffee shop",
  limit: 10
});
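
A radius filter like `radius: "10km"` is typically based on great-circle distance. As a rough local illustration, the sketch below applies the haversine formula to in-memory items. `haversineKm` and `withinRadius` are hypothetical helpers, and the assumption that each item carries a `location` object with `lat` and `lon` mirrors the request shape above rather than a documented storage format.

```javascript
const EARTH_RADIUS_KM = 6371; // mean Earth radius

// Great-circle distance between two { lat, lon } points, in kilometers.
function haversineKm(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Keep items within `radiusKm` of `center`, nearest first.
function withinRadius(center, items, radiusKm) {
  return items
    .map((item) => ({ ...item, distanceKm: haversineKm(center, item.location) }))
    .filter((item) => item.distanceKm <= radiusKm)
    .sort((x, y) => x.distanceKm - y.distanceKm);
}

const sanFrancisco = { lat: 37.7749, lon: -122.4194 };
const places = [
  { id: "nearby-cafe", location: { lat: 37.7849, lon: -122.4094 } },
  { id: "oakland-cafe", location: { lat: 37.8044, lon: -122.2712 } },
];
const nearby = withinRadius(sanFrancisco, places, 10);
console.log(nearby.map((p) => p.id)); // only "nearby-cafe" is within 10 km
```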

Query Expansion

Improve recall by automatically expanding queries:

// Search with query expansion
const results = await papr.memory.search({
  query: "laptop problems",
  queryExpansion: {
    enabled: true,
    method: "thesaurus",
    expansionTerms: 3 // Add up to 3 related terms
  },
  limit: 10
});

// Custom query expansion
const expandedQuery = await papr.memory.expandQuery({
  query: "laptop problems",
  expansionMethods: ["synonyms", "related-concepts", "common-issues"],
  limit: 5 // Number of expansion terms
});

console.log(expandedQuery);
// Example output: "laptop problems notebook issues computer malfunctions device errors hardware failures"
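
To show what thesaurus-style expansion does to a query, here is a small local sketch. The thesaurus contents and the `expandWithThesaurus` helper are made up for illustration, and the service may select and order expansion terms differently.

```javascript
// Hypothetical in-memory thesaurus; a real one would be far larger.
const THESAURUS = new Map([
  ["laptop", ["notebook", "computer"]],
  ["problems", ["issues", "errors", "failures"]],
]);

// Append up to `maxTerms` synonyms that are not already in the query.
function expandWithThesaurus(query, thesaurus, maxTerms = 3) {
  const words = query.toLowerCase().split(/\s+/).filter(Boolean);
  const seen = new Set(words);
  const added = [];
  for (const word of words) {
    for (const synonym of thesaurus.get(word) ?? []) {
      if (added.length >= maxTerms) break;
      if (!seen.has(synonym)) {
        seen.add(synonym);
        added.push(synonym);
      }
    }
  }
  return [...words, ...added].join(" ");
}

console.log(expandWithThesaurus("laptop problems", THESAURUS, 3));
// → "laptop problems notebook computer issues"
```

Capping the number of added terms matters because each extra term widens recall but can also pull in loosely related results.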

Contextual Filters

Filter results by the context of the request, such as the user's department, access level, or current project:

// Apply contextual filters
const results = await papr.memory.search({
  query: "project timeline",
  contextualFilters: {
    "user.department": "engineering",
    "user.accessLevel": { $gte: 3 },
    "session.projectId": "proj-123"
  },
  limit: 10
});

Custom Metadata Filtering

Use custom metadata fields to create precise filters:

// Search with custom metadata filters
const results = await papr.memory.search({
  query: "customer feedback about mobile app",
  metadata: {
    // Standard metadata fields
    topics: ["mobile", "feedback", "customer"],
    location: "US",
    // Custom metadata fields for precise filtering
    customMetadata: {
      app_version: "2.1.0",
      customer_tier: "enterprise",
      priority: "high",
      department: "mobile_team",
      issue_category: "performance",
      platform: "iOS",
      sentiment_score: { $gte: 0.7 }  // Using comparison operators
    }
  },
  enable_agentic_graph: true,
  limit: 15
});

// Complex custom metadata filtering with multiple conditions
const complexResults = await papr.memory.search({
  query: "project blockers and technical debt",
  metadata: {
    hierarchical_structures: "Engineering/Projects",
    customMetadata: {
      // Multiple conditions
      status: ["blocked", "at_risk"],
      priority: "critical",
      team: "backend",
      // Numeric comparisons
      effort_estimate: { $lte: 40 },  // Less than or equal to 40 hours
      impact_score: { $gte: 8 },      // Impact score 8 or higher
      // Date ranges
      created_after: "2024-01-01",
      // Boolean conditions
      requires_architecture_review: true,
      has_security_implications: false
    }
  },
  limit: 20
});
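
The filter shapes above mix exact values, arrays, and comparison operators such as `$gte`. The sketch below evaluates filters of that shape against a plain object so the matching rules are easy to reason about. It assumes an array means "any of these values" and that every condition must hold. Both are assumptions for this example, not documented service behavior, and `matchesFilter` is a hypothetical helper.

```javascript
// Hypothetical operator table; only the comparisons used in this guide plus $gt.
const OPERATORS = new Map([
  ["$gte", (actual, expected) => actual >= expected],
  ["$lte", (actual, expected) => actual <= expected],
  ["$gt", (actual, expected) => actual > expected],
  ["$lt", (actual, expected) => actual < expected],
]);

function matchesCondition(actual, condition) {
  // Assumption: an array condition matches when the value equals any element.
  if (Array.isArray(condition)) return condition.includes(actual);
  if (condition !== null && typeof condition === "object") {
    return Object.entries(condition).every(([op, expected]) => {
      const compare = OPERATORS.get(op);
      if (!compare) throw new Error(`Unsupported operator: ${op}`);
      return actual !== undefined && compare(actual, expected);
    });
  }
  return actual === condition; // exact match for strings, numbers, booleans
}

// Assumption: all conditions in the filter must hold (logical AND).
function matchesFilter(customMetadata, filter) {
  return Object.entries(filter).every(([field, condition]) =>
    matchesCondition(customMetadata[field], condition)
  );
}

const blockerFilter = {
  status: ["blocked", "at_risk"],
  priority: "critical",
  effort_estimate: { $lte: 40 },
  impact_score: { $gte: 8 },
  requires_architecture_review: true,
};
const candidate = {
  status: "blocked",
  priority: "critical",
  effort_estimate: 30,
  impact_score: 9,
  requires_architecture_review: true,
};
console.log(matchesFilter(candidate, blockerFilter)); // true
```

Storing numbers as numbers matters here: a string such as `"30"` would be compared by JavaScript's coercion rules, which can give surprising results.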

Custom Metadata Best Practices

  1. Consistent Field Names: Use consistent naming conventions across your application
  2. Appropriate Data Types: Store numbers as numbers, booleans as booleans
  3. Indexable Values: Keep custom metadata values reasonably sized and structured
  4. Hierarchical Organization: Use nested objects for related custom fields

// Good custom metadata structure
const wellStructuredMetadata = {
  customMetadata: {
    // Project-related fields
    project: {
      id: "proj-123",
      name: "Mobile App Redesign",
      phase: "development"
    },
    // Quality metrics
    metrics: {
      complexity_score: 7.5,
      test_coverage: 0.85,
      performance_rating: "good"
    },
    // Business context
    business: {
      customer_impact: "high",
      revenue_impact: 150000,
      compliance_required: true
    },
    // Technical context
    technical: {
      framework: "React Native",
      requires_migration: false,
      api_version: "v2"
    }
  }
};

Search with RAG

Implement Retrieval Augmented Generation (RAG) for AI responses:

// RAG implementation
const question = "How do I reset my password?";

const memories = await papr.memory.search({
  query: question,
  limit: 5
});

// Format retrieved memories for RAG
// (optional chaining guards against memories without metadata)
const context = memories.map(mem =>
  `Source: ${mem.metadata?.title || 'Unknown'}\n${mem.content}`
).join('\n\n');

// Use context in AI response generation
const completion = await papr.ai.complete({
  prompt: `
    Use the following information to answer the question.

    Context:
    ${context}

    Question: ${question}

    Answer:`,
  model: "gpt-4",
  temperature: 0.3,
  maxTokens: 500
});
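
When retrieved memories are long, the assembled context can exceed what the model accepts. One simple mitigation is to cap the context size before building the prompt. The sketch below uses a character budget as a stand-in for a token budget; `buildContext` and the default limit are hypothetical choices for this example.

```javascript
// Hypothetical helper: join memories into a context string, stopping before
// the total length would exceed `maxChars`. Memories are assumed to arrive
// in relevance order, so the least relevant ones are dropped first.
function buildContext(memories, maxChars = 2000) {
  const parts = [];
  let used = 0;
  for (const mem of memories) {
    const block = `Source: ${mem.metadata?.title || "Unknown"}\n${mem.content}`;
    const cost = block.length + (parts.length > 0 ? 2 : 0); // 2 for the "\n\n" separator
    if (used + cost > maxChars) break;
    parts.push(block);
    used += cost;
  }
  return parts.join("\n\n");
}

const sampleMemories = [
  { metadata: { title: "Account Help" }, content: "Use the reset link on the sign-in page." },
  { content: "Password resets expire after one hour." },
];
console.log(buildContext(sampleMemories, 2000));
```

A character count only approximates token usage, so leave headroom or swap in a tokenizer for the model you use.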

Building Custom Relevance Models

Develop custom models to rank search results:

// Train a custom relevance model
await papr.search.trainRelevanceModel({
  name: "customer-support-relevance",
  trainingData: {
    positiveExamples: [/* ... */], // Array of relevant query/memory pairs
    negativeExamples: [/* ... */]  // Array of irrelevant query/memory pairs
  },
  modelType: "neural-network",
  hyperParameters: {
    epochs: 10,
    learningRate: 0.01,
    batchSize: 32
  }
});

// Use custom relevance model
const results = await papr.memory.search({
  query: "How do I cancel my subscription?",
  relevanceModel: "customer-support-relevance",
  limit: 10
});

Query Understanding

Extract intent and entities from search queries:

// Analyze query intent and entities
const queryAnalysis = await papr.search.analyzeQuery({
  query: "Show me red Nike running shoes under $100",
  extractors: ["intent", "entities", "filters"]
});

console.log(queryAnalysis);
// Example output:
// {
//   intent: "product_search",
//   entities: [
//     { type: "brand", value: "Nike" },
//     { type: "product_type", value: "running shoes" },
//     { type: "color", value: "red" },
//     { type: "price_range", value: "under $100" }
//   ],
//   filters: {
//     "metadata.brand": "Nike",
//     "metadata.product_type": "running shoes",
//     "metadata.color": "red",
//     "metadata.price": { $lt: 100 }
//   }
// }

// Search using analyzed query
const results = await papr.memory.search({
  analyzedQuery: queryAnalysis,
  limit: 10
});

Best Practices

  1. Combine Search Strategies: Use hybrid approaches for complex queries
  2. Optimize for Your Domain: Train relevance models on your specific data
  3. Monitor Search Performance: Track metrics like precision and recall
  4. Use Context Effectively: Incorporate user context for personalized results
  5. Test with Real Queries: Evaluate search performance with actual user queries
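
As a starting point for monitoring, precision and recall at a cutoff `k` can be computed from a list of returned memory IDs and a hand-labeled set of relevant IDs. This is a minimal sketch that assumes the returned IDs are unique; `precisionRecallAtK` and the sample IDs are hypothetical.

```javascript
// Hypothetical helper: precision@k and recall@k for one query.
// precision = relevant results in the top k / results in the top k
// recall    = relevant results in the top k / all relevant items
function precisionRecallAtK(retrievedIds, relevantIds, k) {
  const relevant = new Set(relevantIds);
  const topK = retrievedIds.slice(0, k);
  const hits = topK.filter((id) => relevant.has(id)).length;
  return {
    precision: topK.length === 0 ? 0 : hits / topK.length,
    recall: relevant.size === 0 ? 0 : hits / relevant.size,
  };
}

const metrics = precisionRecallAtK(["m1", "m7", "m3", "m9"], ["m1", "m3", "m5"], 3);
console.log(metrics); // precision and recall are both 2/3 for this sample
```

Averaging these values over a set of real user queries gives a baseline to compare against after changing weights, filters, or relevance models.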

Next Steps