> BROMANDER_LABS

CAN YOU RUN IT?

Honest, math-based estimates for running local LLMs on your hardware. VRAM fit, KV cache, and tokens-per-second — calculated the way llama.cpp actually runs them.

All math runs in your browser. We never see your specs.

01 — Hardware

GPU

System RAM

02 — Model

LLM

The "default" local model. Sweet spot for 8–16 GB VRAM.

System report

Verdict

S-TIER

38.4 tokens / sec

Fits comfortably in VRAM. You're running Llama 3.1 8B at full GPU speed.

Best fit: Q8_0 (auto-selected)

Memory budget

Memory required: 9.98 GB / 11.0 GB VRAM usable

Weights: 8.53 GB

KV cache: 0.54 GB

Overhead: 0.91 GB
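
The three components above add up to the "Memory required" figure. The following is a minimal sketch of how such a budget could be computed, not the calculator's actual code. Every input is an assumption that does not appear on the page: roughly 8.03 billion parameters for Llama 3.1 8B, about 8.5 bits per weight for Q8_0, 32 layers with 8 KV heads of dimension 128, an fp16 KV cache, a 4096-token context, and GB meaning 10^9 bytes. The overhead value is copied from the report rather than derived. The function names are hypothetical.

```python
# Sketch of a VRAM budget: weights + KV cache + overhead.
# All model parameters below are ASSUMPTIONS, not values stated on the page.

GB = 10**9  # assumption: the report uses decimal gigabytes


def weights_bytes(n_params: float, bits_per_weight: float) -> float:
    """Size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   n_ctx: int, bytes_per_elem: int = 2) -> int:
    """Size of the KV cache in bytes.

    The leading 2 accounts for one K and one V tensor per layer;
    bytes_per_elem=2 assumes an fp16 cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem


def fits(required_gb: float, usable_gb: float) -> bool:
    """True when the whole budget fits in usable VRAM."""
    return required_gb <= usable_gb


# Assumed Llama 3.1 8B shape and Q8_0 density (about 8.5 bits per weight).
weights_gb = weights_bytes(8.03e9, 8.5) / GB
kv_gb = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, n_ctx=4096) / GB
overhead_gb = 0.91  # copied from the report above, not derived here

required_gb = weights_gb + kv_gb + overhead_gb

print(f"Weights:  {weights_gb:.2f} GB")
print(f"KV cache: {kv_gb:.2f} GB")
print(f"Required: {required_gb:.2f} GB, fits in 11.0 GB: {fits(required_gb, 11.0)}")
```

Under these assumptions the sketch lands on the same rounded figures as the report. That suggests the assumptions are plausible, but it does not confirm the formula the page actually uses.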

VRAM: 12 GB

Bandwidth: 504 GB/s
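
Memory bandwidth matters because single-stream token generation is usually limited by how fast the GPU can read the weights and KV cache for each token. The sketch below shows that bandwidth-bound reasoning only. It is an assumption about the general approach, not the page's formula, and the function name and `efficiency` parameter are hypothetical. The inputs are the report's own numbers.

```python
# Bandwidth-bound estimate of decode speed.
# ASSUMPTION: each generated token reads all weights plus the KV cache once,
# so tokens/sec is at most bandwidth / bytes_read_per_token.


def decode_tokens_per_sec(bandwidth_gb_s: float, read_per_token_gb: float,
                          efficiency: float = 1.0) -> float:
    """Estimated tokens per second.

    `efficiency` is a hypothetical fudge factor between 0 and 1 for the gap
    between peak and achieved bandwidth; 1.0 gives the theoretical ceiling.
    """
    if read_per_token_gb <= 0:
        raise ValueError("read_per_token_gb must be positive")
    return efficiency * bandwidth_gb_s / read_per_token_gb


# Values taken from the report: 504 GB/s, 8.53 GB weights, 0.54 GB KV cache.
ceiling = decode_tokens_per_sec(504, 8.53 + 0.54)

# The report shows 38.4 tokens/sec; under this model that would correspond
# to the efficiency below. This is an inference, not a documented constant.
implied_efficiency = 38.4 / ceiling

print(f"Ceiling: {ceiling:.1f} tokens/sec")
print(f"Implied efficiency: {implied_efficiency:.2f}")
```

The ceiling is higher than the reported speed, which is what this model predicts, since real kernels rarely reach peak bandwidth.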

Context

Built by Bromander Studios