Observability

Monitor prompt usage, track analytics, and analyze scoring data across all experiments

What is Observability?

Observability in LaikaTest provides complete visibility into how your prompts are performing in production. It combines:

  • Usage Analytics: Track how often each prompt and version is used
  • Score Analytics: Monitor performance metrics across experiments
  • Experiment Results: Compare variants and determine winners
  • Time-Series Data: See trends and patterns over time

Analytics Dashboard

The dashboard provides real-time insights into your prompts and experiments:

Project Overview
  • Total prompts and versions
  • Active experiments
  • API usage and rate limits
  • Recent activity timeline
Prompt Analytics
  • Usage count per prompt
  • Version distribution
  • Average scores by version
  • Error rates and failures
Experiment Analytics
  • Variant performance comparison
  • Traffic distribution across buckets
  • Score aggregations by variant
  • Statistical significance indicators

Retrieving Score Data

Get Scores for an Experiment

TypeScript
const response = await fetch(
  `https://api.laikatest.com/api/v1/scores/experiments/${experimentId}`,
  {
    headers: {
      'Authorization': `Bearer ${userToken}`
    }
  }
);

const { data } = await response.json();

// Returns an array of individual score records
data.forEach(score => {
  console.log('Bucket:', score.bucket_id);
  console.log('Scores:', score.scores);
  console.log('Submitted:', score.created_at);
});

Get Aggregated Scores

TypeScript
const response = await fetch(
  `https://api.laikatest.com/api/v1/scores/experiments/${experimentId}/aggregate`,
  {
    headers: {
      'Authorization': `Bearer ${userToken}`
    }
  }
);

const { data } = await response.json();

// Returns aggregated metrics by variant
data.forEach(variant => {
  console.log('Variant:', variant.bucket_name);
  console.log('Average Scores:', variant.avg_scores);
  console.log('Total Samples:', variant.count);
});

Filter by Time Range

TypeScript
const response = await fetch(
  `https://api.laikatest.com/api/v1/scores/experiments/${experimentId}?` +
    new URLSearchParams({
      start_date: '2025-01-01',
      end_date: '2025-01-31'
    }),
  {
    headers: {
      'Authorization': `Bearer ${userToken}`
    }
  }
);

Score Name Discovery

Find all unique score names used in an experiment:

TypeScript
const response = await fetch(
  `https://api.laikatest.com/api/v1/scores/experiments/${experimentId}/score-names`,
  {
    headers: {
      'Authorization': `Bearer ${userToken}`
    }
  }
);

const { data } = await response.json();
console.log('Score names:', data.score_names);
// Example: ['rating', 'helpful', 'response_time_ms', 'user_feedback']

Tip: Use score name discovery to dynamically build analytics dashboards without hardcoding metric names.
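For example, discovered score names can drive a generic metrics table. The sketch below pairs each discovered name with the per-variant averages returned by the aggregate endpoint; `buildMetricsTable` is a hypothetical helper, not part of the LaikaTest SDK, though the data shapes mirror the responses shown above.

```typescript
// Build rows for a metrics table from discovered score names and
// aggregated per-variant data, without hardcoding metric names.
interface VariantAggregate {
  bucket_name: string;
  avg_scores: Record<string, number>;
  count: number;
}

function buildMetricsTable(
  scoreNames: string[],
  variants: VariantAggregate[]
): Array<Record<string, string | number>> {
  return variants.map(variant => {
    const row: Record<string, string | number> = {
      variant: variant.bucket_name,
      samples: variant.count,
    };
    for (const name of scoreNames) {
      // Fall back to NaN when a variant has no samples for a metric
      row[name] = variant.avg_scores[name] ?? NaN;
    }
    return row;
  });
}
```

Because the column set comes from the `/score-names` response, the same dashboard code keeps working when new metrics start being submitted.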

Event Tracking

LaikaTest tracks events for every experiment interaction. Events capture:

  • Assignment Events: When a user is assigned to a variant
  • Outcome Events: Success or failure of the interaction
  • Context Data: User context variables used in assignment
  • Metadata: Request/response details, timing, etc.

Retrieving Events

TypeScript
const response = await fetch(
  `https://api.laikatest.com/api/v1/events/experiment/${experimentId}`,
  {
    headers: {
      'Authorization': `Bearer ${userToken}`
    }
  }
);

const { data } = await response.json();

// Returns detailed event history
data.forEach(event => {
  console.log('Variant:', event.variant_name);
  console.log('Outcome:', event.outcome);
  console.log('Score:', event.score);
  console.log('Context:', event.context);
  console.log('Timestamp:', event.created_at);
});

Metrics You Can Track

User Engagement
  • Click-through rates
  • Time spent on page
  • Interaction depth
  • Bounce rates
Business Outcomes
  • Conversion rates
  • Revenue per user
  • Cart abandonment
  • Purchase completion
Quality Metrics
  • User satisfaction scores
  • Helpfulness ratings
  • Sentiment analysis
  • Error rates
Performance Metrics
  • Response times
  • Token usage
  • API latency
  • Cache hit rates

Analyzing Experiment Results

To determine which variant is performing better:

  1. Collect scores for a statistically significant sample size
  2. Compare average scores across variants
  3. Look for consistent patterns over time
  4. Consider both primary and secondary metrics
  5. Validate results with user feedback
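Steps 1 and 2 can be sketched as a simple client-side comparison of the aggregate results. `compareVariants` below is a hypothetical helper, not a LaikaTest API; it withholds a verdict until every variant has reached a minimum sample size, per step 1.

```typescript
interface VariantAggregate {
  bucket_name: string;
  avg_scores: Record<string, number>;
  count: number;
}

// Return the variant with the highest average for the primary metric,
// or null until every variant has reached the minimum sample size.
function compareVariants(
  variants: VariantAggregate[],
  primaryMetric: string,
  minSamples: number
): string | null {
  if (variants.some(v => v.count < minSamples)) return null;
  const best = variants.reduce((a, b) =>
    (b.avg_scores[primaryMetric] ?? -Infinity) >
    (a.avg_scores[primaryMetric] ?? -Infinity) ? b : a
  );
  return best.bucket_name;
}
```

A null result means "keep collecting data", which is usually the right default: declaring a winner early is the most common way experiments go wrong.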

Statistical Significance: LaikaTest provides statistical indicators in the dashboard to help you determine when you have enough data to make confident decisions.

Real-Time Monitoring

The dashboard updates in real-time as new scores and events are recorded. You can:

  • See live traffic distribution across variants
  • Monitor score submissions as they happen
  • Track experiment progress toward end criteria
  • Receive alerts for anomalies or performance issues
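Alerting itself is configured in the dashboard, but the underlying idea is a threshold check. As an illustration only (`isAnomalous` is hypothetical, not a LaikaTest API), here is the kind of comparison involved: flag a variant whose recent average falls well below its baseline.

```typescript
// Flag an anomaly when the recent average for a metric falls more than
// `tolerance` (fractional, e.g. 0.2 = 20%) below the baseline average.
function isAnomalous(
  baselineAvg: number,
  recentAvg: number,
  tolerance: number
): boolean {
  if (baselineAvg <= 0) return false; // no meaningful baseline yet
  return recentAvg < baselineAvg * (1 - tolerance);
}
```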

Best Practices

  • Define Success Metrics Early: Decide what you're measuring before launching experiments
  • Track Multiple Metrics: Don't rely on a single metric; monitor primary and secondary indicators
  • Set Sample Size Goals: Determine how many samples you need for statistical significance
  • Monitor Regularly: Check analytics frequently to catch issues early
  • Document Insights: Record why certain variants performed better for future reference
  • Compare Apples to Apples: Ensure variants have similar traffic and time periods when comparing
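For "Set Sample Size Goals", a rough per-variant estimate when comparing two proportions (e.g. conversion rates) is the standard formula n ≈ 2(z_alpha + z_beta)^2 · p(1 − p) / delta^2. A sketch with the usual z-values for 95% confidence and 80% power baked in (`sampleSizePerVariant` is an illustrative helper, not part of LaikaTest):

```typescript
// Approximate per-variant sample size for detecting an absolute lift
// `delta` over a baseline rate `p`, at 95% confidence and 80% power.
// Uses the standard two-proportion formula with z = 1.96 and z = 0.84.
function sampleSizePerVariant(p: number, delta: number): number {
  const zAlpha = 1.96; // two-sided 95% confidence
  const zBeta = 0.84;  // 80% power
  const pBar = p + delta / 2; // pooled rate estimate
  const n = (2 * (zAlpha + zBeta) ** 2 * pBar * (1 - pBar)) / delta ** 2;
  return Math.ceil(n);
}
```

The takeaway is the inverse-square relationship: halving the lift you want to detect roughly quadruples the samples you need per variant.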

Exporting Data

You can export analytics data via the API for custom analysis:

TypeScript
import fs from 'node:fs';

// Export all scores for an experiment
const response = await fetch(
  `https://api.laikatest.com/api/v1/scores/experiments/${experimentId}?` +
    new URLSearchParams({
      limit: '1000',
      offset: '0'
    }),
  {
    headers: {
      'Authorization': `Bearer ${userToken}`
    }
  }
);

const { data } = await response.json();

// Process in your own analytics tools, e.g. dump to CSV
// (convertToCSV is your own helper; a minimal sketch is shown here)
const convertToCSV = (rows: any[]) =>
  rows.map(row => Object.values(row).join(',')).join('\n');

const csv = convertToCSV(data);
fs.writeFileSync('experiment-results.csv', csv);

Next Steps