Why Measurement Matters

You can't improve what you can't measure. Without a consistent, multi-factor scoring system, brands are left guessing whether their efforts to appear in AI recommendations are actually working.

The challenge with measuring AI visibility is that AI responses are inherently variable. The same prompt can produce different results across models, across runs, and across time. A single observation tells you almost nothing about your actual visibility. You need aggregate data, weighted scoring, and statistical awareness of variance to draw meaningful conclusions.

Clarify's visibility scoring system combines multiple signals into a single 0-100 score that represents how visible your brand is across AI recommendation systems, while accounting for the natural variability of AI-generated responses.

Visibility Score Components

The visibility score is a composite metric, calculated as:

VisibilityScore = BaseVisibility × (1 + StrengthBonus + CitationBonus) × ModelWeight
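
Expressed as code, the calculation looks roughly like the sketch below. The clamp to 100 and the example input values are assumptions for illustration; only the formula itself comes from the description above.

```python
def visibility_score(base_visibility: float,
                     strength_bonus: float,
                     citation_bonus: float,
                     model_weight: float) -> float:
    """Composite visibility score on a 0-100 scale.

    base_visibility -- 0-100 visibility derived from mentions and rank
    strength_bonus  -- fractional bonus from recommendation strength (e.g. 0.15)
    citation_bonus  -- fractional bonus from citations (e.g. 0.10)
    model_weight    -- weight assigned to the model being scored
    """
    score = base_visibility * (1 + strength_bonus + citation_bonus) * model_weight
    return min(score, 100.0)  # assumed clamp so bonuses cannot push past 100


# A base of 62 with modest bonuses on a fully weighted model
print(visibility_score(62, 0.15, 0.10, 1.0))  # 77.5
```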

Mention Rate

Mention Rate measures the percentage of relevant prompts where an AI model includes your brand. If you track 20 prompts and the AI mentions you in 12 of them, your Mention Rate is 60%. Each prompt's mention includes a stability indicator: High (80%+ of runs), Medium (50-79% of runs), or Low (less than 50% of runs). A High-stability mention is worth more in the overall score.
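
A minimal sketch of how these numbers could be computed from raw run data is shown below. The stability thresholds come from the description above; treating a prompt as "mentioned" if the brand appears in at least one run is an assumption.

```python
def stability(mention_share: float) -> str:
    """Classify how consistently a brand appears across runs of one prompt."""
    if mention_share >= 0.80:
        return "High"
    if mention_share >= 0.50:
        return "Medium"
    return "Low"


def mention_rate(runs_by_prompt: dict[str, list[bool]]) -> float:
    """Share of tracked prompts where the brand appears in at least one run."""
    mentioned = sum(1 for runs in runs_by_prompt.values() if any(runs))
    return mentioned / len(runs_by_prompt)


runs = {
    "best crm for startups": [True, True, True, False, True],    # 80% of runs -> High
    "top project tools":     [True, False, True, False, False],  # 40% of runs -> Low
}
print(mention_rate(runs))  # 1.0 -- the brand shows up for both prompts
for prompt, results in runs.items():
    print(prompt, stability(sum(results) / len(results)))
```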

Rank & Top-3 Rate

When AI generates recommendation lists, position matters. Rank measures your average position across prompts where you appear. Top-3 Rate measures how often you appear in the first three positions. Rank is normalized across list lengths to ensure fair comparison.
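
One way to normalize rank across list lengths is to map each position onto a 0-1 scale, as in the sketch below; the exact normalization Clarify uses is not specified here, so this is an illustrative assumption.

```python
def normalized_rank(position: int, list_length: int) -> float:
    """Map a list position to 0-1 so that 3rd of 5 and 6th of 10 are comparable
    (0.0 = first item, 1.0 = last item)."""
    if list_length <= 1:
        return 0.0
    return (position - 1) / (list_length - 1)


def rank_metrics(appearances: list[tuple[int, int]]) -> tuple[float, float]:
    """appearances: (position, list_length) for each prompt where the brand appears.
    Returns (average normalized rank, Top-3 Rate)."""
    avg_rank = sum(normalized_rank(p, n) for p, n in appearances) / len(appearances)
    top3_rate = sum(1 for p, _ in appearances if p <= 3) / len(appearances)
    return avg_rank, top3_rate


print(rank_metrics([(1, 5), (4, 10), (2, 7)]))  # (~0.17, ~0.67)
```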

Recommendation Strength

Not all mentions are equal. Clarify classifies each mention's recommendation strength into one of four levels; the stronger the recommendation, the more it contributes to the score.
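
The sketch below shows how strength levels could translate into the StrengthBonus term of the composite formula. The level names and bonus values are placeholders, not Clarify's published labels.

```python
from enum import Enum


class Strength(Enum):
    # Placeholder labels and bonus values for illustration only
    PASSING_MENTION = 0.00
    LISTED_OPTION   = 0.05
    RECOMMENDED     = 0.10
    TOP_PICK        = 0.15


def strength_bonus(mentions: list[Strength]) -> float:
    """Average strength bonus across all mentions, fed into the composite score."""
    return sum(m.value for m in mentions) / len(mentions) if mentions else 0.0


print(strength_bonus([Strength.TOP_PICK, Strength.LISTED_OPTION]))  # 0.1
```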

AI Confidence

AI Confidence measures how reliable the visibility data itself is. It is computed as a weighted combination of four factors.
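
A weighted combination of this kind might look like the sketch below. The factor names and weights are illustrative assumptions; the four factors Clarify actually uses are not enumerated here.

```python
# Illustrative factor names and weights -- not Clarify's published definition
CONFIDENCE_WEIGHTS = {
    "run_count":      0.30,  # more runs per prompt -> more reliable data
    "run_agreement":  0.30,  # how consistent results are across runs
    "model_coverage": 0.20,  # how many models were scanned
    "data_recency":   0.20,  # how fresh the latest scan is
}


def ai_confidence(factors: dict[str, float]) -> float:
    """Combine 0-1 factor scores into a single 0-1 confidence value."""
    return sum(weight * factors.get(name, 0.0)
               for name, weight in CONFIDENCE_WEIGHTS.items())


print(ai_confidence({"run_count": 0.9, "run_agreement": 0.8,
                     "model_coverage": 1.0, "data_recency": 0.7}))  # 0.85
```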

Model Weights

Each model's visibility contributes to the overall score in proportion to its model weight. A brand that scores well on ChatGPT but poorly on Claude and Gemini will therefore have a moderate overall score. Cross-model visibility is more durable and more reflective of genuine authority.
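
A simple weighted average across models reproduces this behavior, as sketched below; the per-model weights shown are assumptions, since the actual weights are not listed here.

```python
# Assumed per-model weights for illustration
MODEL_WEIGHTS = {"chatgpt": 0.40, "claude": 0.30, "gemini": 0.30}


def overall_score(per_model: dict[str, float]) -> float:
    """Weighted average of per-model visibility scores (0-100)."""
    total_weight = sum(MODEL_WEIGHTS[m] for m in per_model)
    return sum(MODEL_WEIGHTS[m] * score for m, score in per_model.items()) / total_weight


# Strong on ChatGPT but weak on Claude and Gemini -> a moderate overall score
print(overall_score({"chatgpt": 85, "claude": 30, "gemini": 25}))  # 50.5
```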

Limitations & Variance

AI responses are inherently variable. Clarify runs multiple queries per prompt per scan cycle and aggregates across runs. Scores may fluctuate between scans — a difference of a few points may be within normal variance. Clarify flags statistically significant changes and distinguishes them from noise.
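
One way to separate real movement from noise is to compare the per-run scores of two scan cycles with a rough two-sample test, as in the sketch below. The threshold of about two standard errors is an illustrative choice, not Clarify's published method.

```python
from math import sqrt
from statistics import mean, stdev


def is_significant_change(prev_runs: list[float], curr_runs: list[float],
                          z_threshold: float = 2.0) -> bool:
    """Flag a change only when the gap between scan means is large
    relative to the run-to-run variance (rough two-sample z-test)."""
    diff = mean(curr_runs) - mean(prev_runs)
    std_err = sqrt(stdev(prev_runs) ** 2 / len(prev_runs) +
                   stdev(curr_runs) ** 2 / len(curr_runs))
    return std_err > 0 and abs(diff) / std_err > z_threshold


previous_scan = [61, 64, 62, 63, 60]
current_scan = [66, 63, 65, 64, 67]
print(is_significant_change(previous_scan, current_scan))  # True -- 3-point shift, low variance
```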

Factors Clarify cannot measure directly include personalization, real-time web access variability, and model update timing. The scoring is designed to be robust in the face of these limitations.