Observability

Monitor prompt usage, track analytics, and analyze scoring data across all experiments

What is Observability?

Observability in LaikaTest provides complete visibility into how your prompts are performing in production. It combines:

  • Usage Analytics: Track how often each prompt and version is used
  • Score Analytics: Monitor performance metrics across experiments
  • Experiment Results: Compare variants and determine winners
  • Time-Series Data: See trends and patterns over time

Analytics Dashboard

The dashboard provides real-time insights into your prompts and experiments:

Project Overview
  • Total prompts and versions
  • Active experiments
  • API usage and rate limits
  • Recent activity timeline
Prompt Analytics
  • Usage count per prompt
  • Version distribution
  • Average scores by version
  • Error rates and failures
Experiment Analytics
  • Variant performance comparison
  • Traffic distribution across buckets
  • Score aggregations by variant
  • Statistical significance indicators

Retrieving Score Data

Get Scores for an Experiment

TypeScript
const response = await fetch(
  `https://api.laikatest.com/api/v1/scores/experiments/${experimentId}`,
  {
    headers: {
      'Authorization': `Bearer ${userToken}`
    }
  }
);

const { data } = await response.json();

// Returns an array of individual score records
data.forEach(score => {
  console.log('Bucket:', score.bucket_id);
  console.log('Scores:', score.scores);
  console.log('Submitted:', score.created_at);
});

Get Aggregated Scores

TypeScript
const response = await fetch(
  `https://api.laikatest.com/api/v1/scores/experiments/${experimentId}/aggregate`,
  {
    headers: {
      'Authorization': `Bearer ${userToken}`
    }
  }
);

const { data } = await response.json();

// Returns aggregated metrics by variant
data.forEach(variant => {
  console.log('Variant:', variant.bucket_name);
  console.log('Average Scores:', variant.avg_scores);
  console.log('Total Samples:', variant.count);
});

Filter by Time Range

TypeScript
const response = await fetch(
  `https://api.laikatest.com/api/v1/scores/experiments/${experimentId}?` +
    new URLSearchParams({
      start_date: '2025-01-01',
      end_date: '2025-01-31'
    }),
  {
    headers: {
      'Authorization': `Bearer ${userToken}`
    }
  }
);

Score Name Discovery

Find all unique score names used in an experiment:

TypeScript
const response = await fetch(
  `https://api.laikatest.com/api/v1/scores/experiments/${experimentId}/score-names`,
  {
    headers: {
      'Authorization': `Bearer ${userToken}`
    }
  }
);

const { data } = await response.json();
console.log('Score names:', data.score_names);
// Example: ['rating', 'helpful', 'response_time_ms', 'user_feedback']

Tip: Use score name discovery to dynamically build analytics dashboards without hardcoding metric names.
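For example, discovered score names can drive a generic metrics table. The sketch below pairs each discovered name with the per-variant averages returned by the aggregate endpoint; `buildMetricsTable` is a hypothetical helper, not part of the LaikaTest SDK, though the data shapes mirror the responses shown above.

```typescript
// Build rows for a metrics table from discovered score names and
// aggregated per-variant data, without hardcoding metric names.
interface VariantAggregate {
  bucket_name: string;
  avg_scores: Record<string, number>;
  count: number;
}

function buildMetricsTable(
  scoreNames: string[],
  variants: VariantAggregate[]
): Array<Record<string, string | number>> {
  return variants.map(variant => {
    const row: Record<string, string | number> = {
      variant: variant.bucket_name,
      samples: variant.count,
    };
    for (const name of scoreNames) {
      // Fall back to NaN when a variant has no samples for a metric
      row[name] = variant.avg_scores[name] ?? NaN;
    }
    return row;
  });
}
```

Because the column set comes from the `/score-names` response, the same dashboard code keeps working when new metrics start being submitted.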

Event Tracking

LaikaTest tracks events for every experiment interaction. Events capture:

  • Assignment Events: When a user is assigned to a variant
  • Outcome Events: Success or failure of the interaction
  • Context Data: User context variables used in assignment
  • Metadata: Request/response details, timing, etc.

Retrieving Events

TypeScript
const response = await fetch(
  `https://api.laikatest.com/api/v1/events/experiment/${experimentId}`,
  {
    headers: {
      'Authorization': `Bearer ${userToken}`
    }
  }
);

const { data } = await response.json();

// Returns detailed event history
data.forEach(event => {
  console.log('Variant:', event.variant_name);
  console.log('Outcome:', event.outcome);
  console.log('Score:', event.score);
  console.log('Context:', event.context);
  console.log('Timestamp:', event.created_at);
});

Metrics You Can Track

User Engagement
  • Click-through rates
  • Time spent on page
  • Interaction depth
  • Bounce rates
Business Outcomes
  • Conversion rates
  • Revenue per user
  • Cart abandonment
  • Purchase completion
Quality Metrics
  • User satisfaction scores
  • Helpfulness ratings
  • Sentiment analysis
  • Error rates
Performance Metrics
  • Response times
  • Token usage
  • API latency
  • Cache hit rates

Analyzing Experiment Results

To determine which variant is performing better:

  1. Collect scores for a statistically significant sample size
  2. Compare average scores across variants
  3. Look for consistent patterns over time
  4. Consider both primary and secondary metrics
  5. Validate results with user feedback
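Steps 1 and 2 can be sketched as a simple client-side comparison of the aggregate results. `compareVariants` below is a hypothetical helper, not a LaikaTest API; it withholds a verdict until every variant has reached a minimum sample size, per step 1.

```typescript
interface VariantAggregate {
  bucket_name: string;
  avg_scores: Record<string, number>;
  count: number;
}

// Return the variant with the highest average for the primary metric,
// or null until every variant has reached the minimum sample size.
function compareVariants(
  variants: VariantAggregate[],
  primaryMetric: string,
  minSamples: number
): string | null {
  if (variants.some(v => v.count < minSamples)) return null;
  const best = variants.reduce((a, b) =>
    (b.avg_scores[primaryMetric] ?? -Infinity) >
    (a.avg_scores[primaryMetric] ?? -Infinity) ? b : a
  );
  return best.bucket_name;
}
```

A null result means "keep collecting data", which is usually the right default: declaring a winner early is the most common way experiments go wrong.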

Statistical Significance: LaikaTest provides statistical indicators in the dashboard to help you determine when you have enough data to make confident decisions.

Real-Time Monitoring

The dashboard updates in real-time as new scores and events are recorded. You can:

  • See live traffic distribution across variants
  • Monitor score submissions as they happen
  • Track experiment progress toward end criteria
  • Receive alerts for anomalies or performance issues
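Alerting itself is configured in the dashboard, but the underlying idea is a threshold check. As an illustration only (`isAnomalous` is hypothetical, not a LaikaTest API), here is the kind of comparison involved: flag a variant whose recent average falls well below its baseline.

```typescript
// Flag an anomaly when the recent average for a metric falls more than
// `tolerance` (fractional, e.g. 0.2 = 20%) below the baseline average.
function isAnomalous(
  baselineAvg: number,
  recentAvg: number,
  tolerance: number
): boolean {
  if (baselineAvg <= 0) return false; // no meaningful baseline yet
  return recentAvg < baselineAvg * (1 - tolerance);
}
```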

Best Practices

  • Define Success Metrics Early: Decide what you're measuring before launching experiments
  • Track Multiple Metrics: Don't rely on a single metric; monitor primary and secondary indicators
  • Set Sample Size Goals: Determine how many samples you need for statistical significance
  • Monitor Regularly: Check analytics frequently to catch issues early
  • Document Insights: Record why certain variants performed better for future reference
  • Compare Apples to Apples: Ensure variants have similar traffic and time periods when comparing
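For "Set Sample Size Goals", a rough per-variant estimate when comparing two proportions (e.g. conversion rates) is the standard formula n ≈ 2(z_alpha + z_beta)^2 · p(1 − p) / delta^2. A sketch with the usual z-values for 95% confidence and 80% power baked in (`sampleSizePerVariant` is an illustrative helper, not part of LaikaTest):

```typescript
// Approximate per-variant sample size for detecting an absolute lift
// `delta` over a baseline rate `p`, at 95% confidence and 80% power.
// Uses the standard two-proportion formula with z = 1.96 and z = 0.84.
function sampleSizePerVariant(p: number, delta: number): number {
  const zAlpha = 1.96; // two-sided 95% confidence
  const zBeta = 0.84;  // 80% power
  const pBar = p + delta / 2; // pooled rate estimate
  const n = (2 * (zAlpha + zBeta) ** 2 * pBar * (1 - pBar)) / delta ** 2;
  return Math.ceil(n);
}
```

The takeaway is the inverse-square relationship: halving the lift you want to detect roughly quadruples the samples you need per variant.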

Exporting Data

You can export analytics data via the API for custom analysis:

TypeScript
import fs from 'node:fs';

// Export all scores for an experiment
const response = await fetch(
  `https://api.laikatest.com/api/v1/scores/experiments/${experimentId}?` +
    new URLSearchParams({
      limit: '1000',
      offset: '0'
    }),
  {
    headers: {
      'Authorization': `Bearer ${userToken}`
    }
  }
);

const { data } = await response.json();

// Process in your own analytics tools, e.g. dump to CSV
// (convertToCSV is your own helper; a minimal sketch is shown here)
const convertToCSV = (rows: any[]) =>
  rows.map(row => Object.values(row).join(',')).join('\n');

const csv = convertToCSV(data);
fs.writeFileSync('experiment-results.csv', csv);

Next Steps