reads the header only · no upload

This tool opens a downloaded .gguf model in your browser and shows what's inside: the architecture (general.architecture, e.g. llama, qwen2, flux), the quantization type (Q4_K_M, Q5_K_M, Q8_0 and so on, derived from general.file_type), the total parameter count, the breakdown of tensor types, and metadata such as context length, embedding length and block count.

It's for when you grab a model from Hugging Face or ComfyUI-GGUF and can't remember whether it's Q4 or Q5, want to confirm how many billions of parameters it has or which architecture it uses, or just want to look inside before loading it into llama.cpp. The answers are read straight from the header embedded in the file itself.

The key feature is that it reads only the file header (the magic, the metadata KV pairs and the tensor info) and never loads the weights, so even a multi-GB quantized model opens almost instantly. All parsing happens locally in your browser, so the file is never uploaded and the model never leaves your machine. It's the GGUF counterpart of the safetensors metadata viewer.
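To make "reads only the header" concrete, here is a minimal Python sketch of reading the fixed part of a GGUF header. It assumes the version 2/3 layout (4-byte magic, uint32 version, uint64 tensor count, uint64 metadata KV count, all little-endian). The names GGUFHeader and read_gguf_header are made up for this illustration and are not this tool's actual code, which runs in the browser.

```python
import io
import struct
from typing import BinaryIO, NamedTuple


class GGUFHeader(NamedTuple):
    version: int
    tensor_count: int
    metadata_kv_count: int


def read_gguf_header(f: BinaryIO) -> GGUFHeader:
    """Read the 24-byte fixed header (assumes GGUF v2/v3, little-endian)."""
    raw = f.read(24)
    if len(raw) < 24 or raw[:4] != b"GGUF":
        raise ValueError("not a GGUF file (bad magic or truncated header)")
    # "<IQQ" = uint32 version, uint64 tensor count, uint64 KV count, no padding.
    version, tensor_count, kv_count = struct.unpack("<IQQ", raw[4:24])
    return GGUFHeader(version, tensor_count, kv_count)


# Usage with an in-memory stand-in for a file:
sample = b"GGUF" + struct.pack("<IQQ", 3, 291, 24)
header = read_gguf_header(io.BytesIO(sample))
```

Only 24 bytes are read here regardless of how large the file is; the metadata KV pairs and tensor info follow directly after.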

How to use

  1. Drop in a .gguf file or click to choose one (it parses the header only — the weights are never loaded).
  2. It automatically shows the architecture, quantization type, parameter count, tensor types, and model info.
  3. Use "Copy all" to keep the model info. The full metadata KV list is there too if you want it.

FAQ

Is the file uploaded to a server?

No. All parsing happens in your browser. The .gguf file is never uploaded, stored, or sent anywhere — it is read only on your device, so it's safe to inspect even an unreleased model.

Can it open multi-GB models?

Yes. It never loads the weights — it reads only the header at the start of the file (the metadata KV pairs and tensor info). That's why even a multi-GB quantized model opens instantly with no waiting.
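As a rough illustration of why the weights never need to be touched, the sketch below reads metadata KV pairs one at a time and stops as soon as they are consumed. It assumes the GGUF encoding of strings (uint64 length plus UTF-8 bytes) and the value type ids from the GGUF specification, and it handles only scalars and strings. Array values are left out to keep it short. The function names are hypothetical.

```python
import io
import struct
from typing import Any, BinaryIO, Tuple

# GGUF metadata value type id -> struct format (scalar types only).
_SCALAR_FORMATS = {
    0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
    6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d",
}
_STRING_TYPE = 8


def _read_exact(f: BinaryIO, n: int) -> bytes:
    data = f.read(n)
    if len(data) != n:
        raise EOFError("unexpected end of header")
    return data


def read_string(f: BinaryIO) -> str:
    """GGUF string: uint64 byte length followed by UTF-8 data."""
    (length,) = struct.unpack("<Q", _read_exact(f, 8))
    return _read_exact(f, length).decode("utf-8")


def read_kv(f: BinaryIO) -> Tuple[str, Any]:
    """Read one metadata pair: key string, uint32 value type, then the value."""
    key = read_string(f)
    (vtype,) = struct.unpack("<I", _read_exact(f, 4))
    if vtype == _STRING_TYPE:
        return key, read_string(f)
    fmt = _SCALAR_FORMATS.get(vtype)
    if fmt is None:
        # Arrays (type 9) and unknown types are out of scope for this sketch.
        raise NotImplementedError(f"value type {vtype} not handled")
    (value,) = struct.unpack(fmt, _read_exact(f, struct.calcsize(fmt)))
    return key, value
```

Because every read asks for an exact number of bytes, the file position after the last pair marks where the metadata ends, and nothing beyond it is ever requested.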

How is the quantization type (e.g. Q4_K_M) determined?

It maps the general.file_type (llama.cpp's ftype) in the metadata to a common quantization label. If that isn't recorded, it shows the most frequent ggml tensor type as the quantization.
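The lookup and its fallback can be sketched in a few lines of Python. The two tables below are partial and reflect the llama_ftype and ggml_type enum values in llama.cpp as I understand them, so check them against the current source before relying on them. The function name quant_label is hypothetical, and treating an unrecognized file_type the same as a missing one is a choice made for this sketch.

```python
from collections import Counter
from typing import Iterable, Optional

# Partial map: general.file_type (llama.cpp ftype) -> label. Assumed values.
FTYPE_LABELS = {
    0: "F32", 1: "F16", 2: "Q4_0", 3: "Q4_1", 7: "Q8_0", 8: "Q5_0",
    9: "Q5_1", 10: "Q2_K", 11: "Q3_K_S", 12: "Q3_K_M", 13: "Q3_K_L",
    14: "Q4_K_S", 15: "Q4_K_M", 16: "Q5_K_S", 17: "Q5_K_M", 18: "Q6_K",
}

# Partial map: per-tensor ggml type id -> label. Assumed values.
GGML_TYPE_LABELS = {
    0: "F32", 1: "F16", 2: "Q4_0", 3: "Q4_1", 6: "Q5_0", 7: "Q5_1",
    8: "Q8_0", 10: "Q2_K", 11: "Q3_K", 12: "Q4_K", 13: "Q5_K", 14: "Q6_K",
}


def quant_label(file_type: Optional[int], tensor_types: Iterable[int]) -> str:
    """Prefer general.file_type; otherwise use the most frequent tensor type."""
    if file_type is not None and file_type in FTYPE_LABELS:
        return FTYPE_LABELS[file_type]
    counts = Counter(tensor_types)
    if not counts:
        return "unknown"
    most_common_type, _ = counts.most_common(1)[0]
    return GGML_TYPE_LABELS.get(most_common_type, f"type {most_common_type}")
```

Note that the fallback is coarser than the primary path: per-tensor types cannot distinguish mixes such as Q4_K_S from Q4_K_M, so it can only report something like Q4_K.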

How is the parameter count calculated?

It multiplies the dimensions of each tensor listed in the header's tensor info to get that tensor's element count, then adds those counts up across all tensors. It never reads the weights, only the shape numbers, so the total is exact and fast to compute.
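In Python terms the calculation is a one-liner. This is a minimal sketch that assumes the shapes have already been read from the tensor info; count_parameters is a name chosen for the example.

```python
import math
from typing import Iterable, Sequence


def count_parameters(shapes: Iterable[Sequence[int]]) -> int:
    """Sum of per-tensor element counts, where each count is the product of dims."""
    return sum(math.prod(shape) for shape in shapes)


# Two tiny tensors: a 2x3 matrix (6 elements) and a length-4 vector.
total = count_parameters([(2, 3), (4,)])  # -> 10
```

Python integers do not overflow, so this stays exact for models with many billions of parameters. In a language with 64-bit floats as the only number type, the same sum would need care above 2^53.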

What's the difference from safetensors?

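GGUF is the single-file binary format used by llama.cpp and ComfyUI-GGUF, most often for quantized models. It bundles typed metadata KV pairs, tensor info and the tensor data into one file. A .safetensors file instead starts with a JSON header describing each tensor, followed by the raw tensor data. If you want to inspect a .safetensors file, use the sister tool, the safetensors metadata viewer.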

Only a little metadata shows up — why?

Some files simply record few KV pairs. The tensor count and parameter count are computed from the tensor info, and the quantization type falls back to the most frequent tensor type when general.file_type is missing, so you can still see those even when the metadata is sparse.
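For reference, here is a small Python sketch of reading one tensor info entry, which is where the shape and type numbers come from. It assumes the GGUF v2/v3 layout: a length-prefixed name, a uint32 dimension count, that many uint64 dimensions, a uint32 ggml type id, and a uint64 data offset. TensorInfo and read_tensor_info are illustrative names only.

```python
import io
import struct
from typing import BinaryIO, NamedTuple, Tuple


class TensorInfo(NamedTuple):
    name: str
    shape: Tuple[int, ...]
    ggml_type: int
    offset: int


def _read_exact(f: BinaryIO, n: int) -> bytes:
    data = f.read(n)
    if len(data) != n:
        raise EOFError("unexpected end of header")
    return data


def read_tensor_info(f: BinaryIO) -> TensorInfo:
    """Read one tensor info entry (assumes GGUF v2/v3, little-endian)."""
    (name_len,) = struct.unpack("<Q", _read_exact(f, 8))
    name = _read_exact(f, name_len).decode("utf-8")
    (n_dims,) = struct.unpack("<I", _read_exact(f, 4))
    shape = struct.unpack(f"<{n_dims}Q", _read_exact(f, 8 * n_dims))
    # uint32 ggml type id, then uint64 offset into the tensor data section.
    ggml_type, offset = struct.unpack("<IQ", _read_exact(f, 12))
    return TensorInfo(name, shape, ggml_type, offset)
```

Repeating this for every entry yields the tensor count, the shapes for the parameter total, and the type ids for the tensor type breakdown, all without relying on any metadata key.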