
Advanced Search

The Papr Memory API supports complex search patterns and filters beyond basic queries. This guide covers advanced techniques, including vector and hybrid search, multi-stage retrieval, faceting, time weighting, personalization, geospatial search, query expansion, contextual filters, RAG, custom relevance models, and query understanding.

Basic search works well for simple queries. For complex use cases, the techniques below can improve retrieval accuracy.

Leverage embedding-based similarity search:

// Perform vector search with custom embeddings
const results = await papr.memory.vectorSearch({
  embedding: [0.1, 0.2, 0.3 /* ... remaining values */], // Your custom embedding vector
  dimensions: 1536, // Number of dimensions in your embedding
  similarityMetric: "cosine",
  threshold: 0.75, // Minimum similarity threshold
  limit: 10
});

// Generate embeddings on the fly and search
// (named textResults so it can coexist with the snippet above)
const textResults = await papr.memory.vectorSearch({
  text: "Customer complained about checkout process",
  embeddingModel: "text-embedding-3-large", // OpenAI model used to embed `text`
  similarityMetric: "cosine",
  limit: 10
});
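To make the `similarityMetric: "cosine"` and `threshold` settings concrete, here is a minimal local sketch of cosine similarity with a cutoff. The helper names `cosineSimilarity` and `filterByThreshold` are illustrative and not part of the Papr SDK, and the service's own scoring may differ.

```javascript
// Illustrative helpers, not part of the Papr SDK.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same number of dimensions");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vector: treat similarity as 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score candidates against the query, drop those below the threshold,
// and return the best `limit` matches in descending score order.
function filterByThreshold(queryEmbedding, candidates, threshold, limit) {
  return candidates
    .map(c => ({ ...c, score: cosineSimilarity(queryEmbedding, c.embedding) }))
    .filter(c => c.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

A higher threshold trades recall for precision, so it is usually worth tuning against real queries rather than fixing it at a single value.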

Hybrid Search Strategies

Combine different search methods for better results:

// Advanced hybrid search with custom weights
const results = await papr.memory.hybridSearch({
  query: "payment processing errors",
  hybridStrategy: {
    semantic: {
      weight: 0.6,
      model: "text-embedding-3-large"
    },
    keyword: {
      weight: 0.3,
      fields: ["content", "metadata.title"]
    },
    vector: {
      weight: 0.1,
      embedding: customEmbedding,
      dimensions: 1536
    }
  },
  limit: 10
});
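The weights in the example above sum to 1.0. One common way to fuse per-strategy scores is a weighted linear combination, sketched below. This is an assumption about how such weights could be applied, not a description of Papr's internal fusion, and `combineScores` is a hypothetical helper. It also assumes each strategy's score is already normalized to the range 0 to 1.

```javascript
// Hypothetical fusion: weighted sum of per-strategy scores in [0, 1].
function combineScores(scores, weights) {
  let total = 0;
  let weightSum = 0;
  for (const [strategy, weight] of Object.entries(weights)) {
    total += weight * (scores[strategy] ?? 0); // a missing score counts as 0
    weightSum += weight;
  }
  // Dividing by the weight sum means the weights need not add up to exactly 1.
  return weightSum > 0 ? total / weightSum : 0;
}

const weights = { semantic: 0.6, keyword: 0.3, vector: 0.1 };
const fused = combineScores({ semantic: 0.9, keyword: 0.4, vector: 0.7 }, weights);
console.log(fused.toFixed(2));
```

Because the semantic weight dominates here, a memory with a strong semantic match still ranks well even when its keyword score is low.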

Implement multi-stage retrieval pipelines:

// First stage: broad recall using keyword search
const firstStageResults = await papr.memory.search({
  query: "customer refund policy",
  limit: 50 // Get more results for the first stage
});

// Second stage: rerank using semantic similarity
const secondStageResults = await papr.memory.rerank({
  query: "customer refund policy",
  memories: firstStageResults,
  model: "rerank-multilingual-v1",
  limit: 10 // Narrow down to top results
});

Implement faceted search to allow users to refine results:

// Perform faceted search
const results = await papr.memory.facetedSearch({
  query: "product feedback",
  facets: [
    {
      field: "metadata.category",
      limit: 5
    },
    {
      field: "metadata.department",
      limit: 5
    },
    {
      field: "metadata.sentiment",
      limit: 3
    }
  ],
  limit: 20
});

// Access facet counts
console.log(results.facets);
// Example output:
// {
//   "metadata.category": [
//     { value: "feedback", count: 12 },
//     { value: "bug_report", count: 5 },
//     ...
//   ],
//   "metadata.department": [
//     { value: "engineering", count: 8 },
//     { value: "product", count: 7 },
//     ...
//   ]
// }
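The shape of `results.facets` can be reproduced locally to see what the counts mean. The sketch below counts the most frequent values of a dotted field path over an in-memory array. `getPath` and `countFacet` are illustrative names, and the assumption that facet values are sorted by descending count is ours.

```javascript
// Resolve a dotted path such as "metadata.category" against a nested object.
function getPath(obj, path) {
  return path.split(".").reduce((cur, key) => (cur == null ? undefined : cur[key]), obj);
}

// Illustrative only: return up to `limit` { value, count } pairs, most frequent first.
function countFacet(memories, field, limit) {
  const counts = new Map();
  for (const memory of memories) {
    const value = getPath(memory, field);
    if (value === undefined || value === null) continue; // skip memories without the field
    counts.set(value, (counts.get(value) || 0) + 1);
  }
  return [...counts.entries()]
    .map(([value, count]) => ({ value, count }))
    .sort((a, b) => b.count - a.count)
    .slice(0, limit);
}
```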

Implement recency bias in search results:

// Search with time-weighting
const results = await papr.memory.search({
  query: "marketing campaign ideas",
  timeWeighting: {
    field: "context.timestamp",
    decay: 0.5,
    scale: "30d" // Halve the relevance score after 30 days
  },
  limit: 10
});
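The comment in the example suggests a half-life reading of `decay` and `scale`: a memory that is one `scale` old keeps `decay` times its score. A minimal sketch of that interpretation follows. The exponential formula is an assumption, since the exact decay function used by the API is not stated here, and `timeWeightedScore` is a hypothetical helper.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Assumed formula: score * decay^(age / scale). Not confirmed by the Papr docs.
function timeWeightedScore(score, timestamp, now, decay = 0.5, scaleDays = 30) {
  // Clamp at 0 so timestamps in the future are not boosted above the base score.
  const ageDays = Math.max(0, (now - timestamp) / MS_PER_DAY);
  return score * Math.pow(decay, ageDays / scaleDays);
}
```

Under this reading, a memory 60 days old keeps a quarter of its score, so old but highly relevant memories can still outrank recent weak matches.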

Customize search results based on user profiles:

// Personalized search
const results = await papr.memory.personalizedSearch({
  query: "product recommendations",
  userId: "user-123",
  userProfile: {
    interests: ["photography", "technology", "travel"],
    recentPurchases: ["camera", "tripod"],
    preferences: {
      priceRange: "premium",
      brands: ["Canon", "Sony"]
    }
  },
  personalizationStrength: 0.7, // How much to bias towards user profile
  limit: 10
});

Find memories based on location data:

// Geospatial search for memories near a location
const results = await papr.memory.geoSearch({
  location: {
    lat: 37.7749,
    lon: -122.4194
  },
  radius: "10km",
  locationField: "metadata.location",
  query: "coffee shop",
  limit: 10
});
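A radius filter such as `radius: "10km"` is typically evaluated with a great-circle distance. The sketch below uses the haversine formula on `{ lat, lon }` pairs in degrees. It is a local illustration with hypothetical helper names, and it does not claim to match how the API computes distance.

```javascript
const EARTH_RADIUS_KM = 6371; // mean Earth radius

// Great-circle distance in kilometers between two { lat, lon } points (degrees).
function haversineKm(a, b) {
  const toRad = deg => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  // Math.min guards against rounding pushing the square root slightly above 1.
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1, Math.sqrt(h)));
}

function withinRadius(center, point, radiusKm) {
  return haversineKm(center, point) <= radiusKm;
}
```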

Query Expansion

Improve recall by automatically expanding queries:

// Search with query expansion
const results = await papr.memory.search({
  query: "laptop problems",
  queryExpansion: {
    enabled: true,
    method: "thesaurus",
    expansionTerms: 3 // Add up to 3 related terms
  },
  limit: 10
});

// Custom query expansion
const expandedQuery = await papr.memory.expandQuery({
  query: "laptop problems",
  expansionMethods: ["synonyms", "related-concepts", "common-issues"],
  limit: 5 // Number of expansion terms
});

console.log(expandedQuery);
// Example output: "laptop problems notebook issues computer malfunctions device errors hardware failures"
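To show what thesaurus-based expansion does in principle, the sketch below appends up to a fixed number of synonyms to a query. The thesaurus contents and the `expandWithThesaurus` helper are made up for illustration; the API's `thesaurus` method may select and order terms differently.

```javascript
// Tiny hand-written thesaurus for illustration; a real one would be much larger.
const THESAURUS = new Map([
  ["laptop", ["notebook", "computer"]],
  ["problems", ["issues", "errors"]],
]);

// Append up to `maxTerms` synonyms that are not already in the query.
function expandWithThesaurus(query, thesaurus, maxTerms) {
  const original = query.toLowerCase().split(/\s+/).filter(Boolean);
  const seen = new Set(original);
  const added = [];
  for (const word of original) {
    for (const synonym of thesaurus.get(word) || []) {
      if (added.length >= maxTerms) break;
      if (!seen.has(synonym)) {
        seen.add(synonym);
        added.push(synonym);
      }
    }
  }
  return [...original, ...added].join(" ");
}

console.log(expandWithThesaurus("laptop problems", THESAURUS, 3));
// prints "laptop problems notebook computer issues"
```

Capping the number of added terms matters: each extra term raises recall but can dilute the original intent of the query.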

Contextual Filters

Filter results based on context:

// Apply contextual filters
const results = await papr.memory.search({
  query: "project timeline",
  contextualFilters: {
    "user.department": "engineering",
    "user.accessLevel": { $gte: 3 },
    "session.projectId": "proj-123"
  },
  limit: 10
});
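The filter object above mixes plain equality with a `$gte` comparison. The following sketch shows one way such filters could be evaluated against a context object. It is a hypothetical local evaluator supporting only a handful of operators, not the API's filter engine, and names such as `matchesFilters` are illustrative.

```javascript
// Hypothetical evaluator: supports plain equality plus a few comparison operators.
const OPERATORS = new Map([
  ["$eq", (actual, expected) => actual === expected],
  ["$gt", (actual, expected) => actual > expected],
  ["$gte", (actual, expected) => actual >= expected],
  ["$lt", (actual, expected) => actual < expected],
  ["$lte", (actual, expected) => actual <= expected],
]);

// Resolve a dotted path such as "user.accessLevel" against a nested object.
function resolvePath(obj, path) {
  return path.split(".").reduce((cur, key) => (cur == null ? undefined : cur[key]), obj);
}

// Return true only if every filter condition holds for the given context.
function matchesFilters(context, filters) {
  return Object.entries(filters).every(([path, condition]) => {
    const actual = resolvePath(context, path);
    if (condition !== null && typeof condition === "object") {
      return Object.entries(condition).every(([op, expected]) => {
        const compare = OPERATORS.get(op);
        if (!compare) throw new Error(`Unsupported operator: ${op}`);
        return compare(actual, expected);
      });
    }
    return actual === condition; // a plain value means equality
  });
}
```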

Search with RAG

Implement Retrieval Augmented Generation (RAG) for AI responses:

// RAG implementation
const question = "How do I reset my password?";

// Retrieve the memories most relevant to the question
const memories = await papr.memory.search({
  query: question,
  limit: 5
});

// Format retrieved memories for RAG (fall back to 'Unknown' when no title is set)
const context = memories.map(mem =>
  `Source: ${mem.metadata?.title || 'Unknown'}\n${mem.content}`
).join('\n\n');

// Use context in AI response generation
const completion = await papr.ai.complete({
  prompt: `
    Use the following information to answer the question.

    Context:
    ${context}

    Question: ${question}

    Answer:`,
  model: "gpt-4",
  temperature: 0.3, // Low temperature keeps the answer close to the retrieved context
  maxTokens: 500
});

Building Custom Relevance Models

Develop custom models to rank search results:

// Train a custom relevance model
await papr.search.trainRelevanceModel({
  name: "customer-support-relevance",
  trainingData: {
    positiveExamples: [ /* ... */ ], // Array of relevant query/memory pairs
    negativeExamples: [ /* ... */ ], // Array of irrelevant query/memory pairs
  },
  modelType: "neural-network",
  hyperParameters: {
    epochs: 10,
    learningRate: 0.01,
    batchSize: 32
  }
});

// Use custom relevance model
const results = await papr.memory.search({
  query: "How do I cancel my subscription?",
  relevanceModel: "customer-support-relevance",
  limit: 10
});

Query Understanding

Extract intent and entities from search queries:

// Analyze query intent and entities
const queryAnalysis = await papr.search.analyzeQuery({
  query: "Show me red Nike running shoes under $100",
  extractors: ["intent", "entities", "filters"]
});

console.log(queryAnalysis);
// Example output:
// {
//   intent: "product_search",
//   entities: [
//     { type: "brand", value: "Nike" },
//     { type: "product_type", value: "running shoes" },
//     { type: "color", value: "red" },
//     { type: "price_range", value: "under $100" }
//   ],
//   filters: {
//     "metadata.brand": "Nike",
//     "metadata.product_type": "running shoes",
//     "metadata.color": "red",
//     "metadata.price": { $lt: 100 }
//   }
// }

// Search using analyzed query
const results = await papr.memory.search({
  analyzedQuery: queryAnalysis,
  limit: 10
});

Best Practices

  1. Combine Search Strategies: Use hybrid approaches for complex queries
  2. Optimize for Your Domain: Train relevance models on your specific data
  3. Monitor Search Performance: Track metrics like precision and recall
  4. Use Context Effectively: Incorporate user context for personalized results
  5. Test with Real Queries: Evaluate search performance with actual user queries
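
For the monitoring practice above, precision and recall can be computed per query from a ranked list of result IDs and a set of IDs judged relevant. The sketch below is a minimal local helper with a hypothetical name, and it assumes result IDs are unique within a ranking.

```javascript
// retrieved: ranked array of result IDs; relevant: IDs judged relevant for the query.
function precisionRecallAtK(retrieved, relevant, k) {
  const topK = retrieved.slice(0, k);
  const relevantSet = new Set(relevant);
  const hits = topK.filter(id => relevantSet.has(id)).length;
  return {
    precision: topK.length > 0 ? hits / topK.length : 0, // share of returned results that are relevant
    recall: relevantSet.size > 0 ? hits / relevantSet.size : 0, // share of relevant items that were returned
  };
}
```

Averaging these values over a fixed set of real user queries gives a baseline for comparing search configurations.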

Next Steps