The current public benchmark profile is intentionally fixed so verified runs can be compared consistently.
Methodology
The public benchmark policy behind verified VRAM Check results.
This page explains the shared benchmark profile, score construction, trust states, class system, and leaderboard eligibility rules so users can understand exactly what a result means before they compare or share it.
Official releases can be verified locally with `vramcheck verify-release --file .\vramcheck-windows.exe`.
Score flow
How the public score is constructed.
A VRAM Check score is not a synthetic composite. It is a real inference result governed by a canonical profile and a public trust model.
VRAM Check uses one fixed TinyLlama 1.1B Q4_K_M profile so verified runs stay comparable across systems and over time.
Decode throughput is the canonical score. Prefill throughput and TTFT remain visible because they describe different parts of the inference experience.
The benchmark runs multiple passes and selects the representative pass closest to the multi-metric median.
Decode, prefill, and TTFT variance are measured so unstable or noisy runs do not enter the competitive board as if they were equally trustworthy.
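The pass selection and variance measurement described above can be sketched as follows. This is an illustration under stated assumptions, not VRAM Check's actual implementation: the field names (`decode_tps`, `prefill_tps`, `ttft_ms`), the summed relative distance to the per-metric medians, and the use of the coefficient of variation as the spread measure are all hypothetical choices.

```python
from statistics import mean, median, pstdev

# Hypothetical metric names; the real result schema is not documented here.
METRICS = ("decode_tps", "prefill_tps", "ttft_ms")


def representative_pass(passes):
    """Return the pass with the smallest summed relative distance
    to the per-metric medians (an assumed distance definition)."""
    medians = {m: median(p[m] for p in passes) for m in METRICS}

    def distance(p):
        # Dividing by the median puts all metrics on a comparable scale.
        return sum(abs(p[m] - medians[m]) / medians[m] for m in METRICS)

    return min(passes, key=distance)


def coefficient_of_variation(passes, metric):
    """Spread of one metric across passes, relative to its mean."""
    values = [p[metric] for p in passes]
    return pstdev(values) / mean(values)


# Illustrative numbers only, not measured results.
passes = [
    {"decode_tps": 41.8, "prefill_tps": 612.0, "ttft_ms": 94.0},
    {"decode_tps": 42.5, "prefill_tps": 630.0, "ttft_ms": 90.0},
    {"decode_tps": 39.9, "prefill_tps": 598.0, "ttft_ms": 101.0},
]
chosen = representative_pass(passes)
decode_spread = coefficient_of_variation(passes, "decode_tps")
```

Selecting one real pass, rather than averaging the passes, keeps the published decode, prefill, and TTFT figures consistent with each other because they all come from the same execution.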
Metrics
What each visible metric is trying to tell you.
Decode throughput is the primary leaderboard score. It reflects sustained token generation speed after the model is already running.
Prefill throughput measures how quickly the model processes prompt tokens before generation begins. This is critical for long prompts and larger contexts.
TTFT is the median time-to-first-token across the canonical prompts. Lower is better because it reflects interaction latency, not just throughput.
Total session time includes strict passes, warmup, and setup. Canonical pass time is only the selected scored pass used in the result.
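As a rough illustration of how these metrics relate to raw measurements, the sketch below uses the conventional definitions of tokens per second and a median over per-prompt latencies. The function and parameter names are hypothetical, and the exact timing boundaries VRAM Check uses are not specified on this page.

```python
from statistics import median


def decode_tps(generated_tokens, decode_seconds):
    """Sustained generation speed: tokens produced per second of decoding."""
    return generated_tokens / decode_seconds


def prefill_tps(prompt_tokens, prefill_seconds):
    """Prompt processing speed: prompt tokens consumed per second."""
    return prompt_tokens / prefill_seconds


def median_ttft_ms(ttft_samples_ms):
    """Median time-to-first-token across prompts; lower is better."""
    return median(ttft_samples_ms)


# Illustrative inputs only, not measured results.
example_decode = decode_tps(generated_tokens=128, decode_seconds=4.0)
example_prefill = prefill_tps(prompt_tokens=512, prefill_seconds=0.5)
example_ttft = median_ttft_ms([90.0, 94.0, 101.0])
```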
Trust states
Why not every useful run becomes a competitive rank.
Acceleration is active, the canonical profile ran, and the run carries the verification metadata expected by the backend.
The run is real and verified but did not satisfy the active leaderboard gate, usually because of stability or eligibility policy.
The run is useful for practical guidance but does not represent an officially ranked accelerated benchmark.
A fast exploratory fit result that never enters the leaderboard and should never be interpreted as verified throughput.
Leaderboard eligibility
What a run must satisfy before it can claim competitive position.
- Runtime acceleration must be active and the execution profile must be verified GPU.
- The canonical profile must complete successfully with the expected verification metadata.
- Strict stability checks must stay within the active decode, prefill, and TTFT policy thresholds.
- The backend must classify the run as eligible, not provisional or compatibility.
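The four conditions above can be read as a single conjunctive gate. The following is a minimal sketch of that logic, assuming a hypothetical `RunSummary` record and placeholder threshold values; the real policy thresholds and backend fields are not published on this page.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    # All field names are hypothetical, chosen to mirror the list above.
    acceleration_active: bool
    verified_gpu_profile: bool
    canonical_profile_completed: bool
    has_verification_metadata: bool
    decode_cv: float   # relative spread of decode throughput across passes
    prefill_cv: float  # relative spread of prefill throughput across passes
    ttft_cv: float     # relative spread of TTFT across passes
    backend_state: str  # e.g. "eligible", "provisional", "compatibility"


# Placeholder limits for illustration; not the active policy values.
STABILITY_LIMITS = {"decode_cv": 0.05, "prefill_cv": 0.10, "ttft_cv": 0.15}


def is_leaderboard_eligible(run):
    """True only when every eligibility condition holds."""
    stable = all(getattr(run, name) <= limit
                 for name, limit in STABILITY_LIMITS.items())
    return (
        run.acceleration_active
        and run.verified_gpu_profile
        and run.canonical_profile_completed
        and run.has_verification_metadata
        and stable
        and run.backend_state == "eligible"
    )
```

Because the gate is a conjunction, a run that is real and verified can still miss the leaderboard on a single failed stability check.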
Class system
Class describes capability. Rank describes field position.
Class is the absolute capability band of the machine under the active policy.
Current public scale: S, A, B, C, D, E.
Rank is competitive context inside the active leaderboard pool or board segment.
A machine can have a modest class and still place well in a narrow field, or a high class and a weaker competitive position in a crowded frontier board.
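The difference between class and rank can be made concrete with a small sketch. It assumes, purely for illustration, that class is derived from decode throughput using fixed floors and that rank is a position within a pool of scores; the floor values below are invented and are not the active policy.

```python
# Hypothetical decode-throughput floors (tokens/s), highest band first.
CLASS_FLOORS = [
    ("S", 120.0), ("A", 80.0), ("B", 50.0),
    ("C", 30.0), ("D", 15.0), ("E", 0.0),
]


def capability_class(decode_tps):
    """Absolute band: depends only on the machine's own score."""
    for label, floor in CLASS_FLOORS:
        if decode_tps >= floor:
            return label
    raise ValueError("decode_tps must be non-negative")


def rank_in_pool(decode_tps, pool):
    """Relative position (1 = best): depends on who else is in the pool."""
    return 1 + sum(1 for other in pool if other > decode_tps)


# A modest class can still lead a narrow field...
narrow = (capability_class(35.0), rank_in_pool(35.0, [35.0, 20.0, 18.0]))
# ...while a top class can sit lower in a crowded frontier board.
crowded = (capability_class(130.0),
           rank_in_pool(130.0, [150.0, 145.0, 130.0, 90.0]))
```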
Verification path
What users can validate locally before they trust the binary.
Download official release -> compare published .sha256 -> run `vramcheck verify-release --file .\vramcheck-windows.exe` -> benchmark
The publicly supported installation path is the official binary release. npm-style global installers are not offered yet because the product still prioritizes verified binary distribution and release clarity over package-manager surface area.
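The checksum comparison step in the verification path can also be done by hand. The sketch below uses Python's standard `hashlib` module and assumes the published `.sha256` file begins with the hex digest, optionally followed by a file name; that layout is an assumption, and the file paths passed in are up to the caller. It is a manual cross-check and does not replace `vramcheck verify-release`.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash the file in chunks so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(binary_path, checksum_path):
    """Compare the local digest with the first token of the .sha256 file.

    Assumes the checksum file starts with the hex digest (assumed layout).
    """
    published = Path(checksum_path).read_text().split()[0].lower()
    return sha256_of(binary_path) == published
```

A mismatch means the downloaded file differs from the published release and should not be run.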