Free NVIDIA NIM API: the variety one

NVIDIA's build.nvidia.com is the free tier to use when you want to try lots of different models from one key. Its NIM catalog spans dozens of free models — Llama 3.3 70B, the NVIDIA Nemotron family, Moonshot's Kimi, Qwen, Mistral Small and more — all behind one OpenAI-compatible endpoint at https://integrate.api.nvidia.com/v1. Get a key free at build.nvidia.com. Free usage is metered by credits, so NVIDIA is best as a breadth provider in a pool rather than a high-volume primary — which is exactly what freellmpool makes easy.

What NVIDIA's free tier is good for

Model evaluation and variety. If you want to compare Nemotron against Llama against Kimi without creating a separate account for each, build.nvidia.com gives you all of them from one key. It's less about raw daily volume (credits run down) and more about access to a broad shelf — including some models you won't easily find free elsewhere.

A few of the free models

meta/llama-3.3-70b-instruct — strong general Llama.
nvidia/nemotron-3-super-120b-a12b — NVIDIA's own reasoning-tuned model.
moonshotai/kimi-k2.6 — large MoE, good at agentic tasks.
mistralai/mistral-small-4-119b-2603 — capable Mistral variant.

The full free list runs to dozens of models; list them with freellmpool models -p nvidia.
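With dozens of IDs, grouping them by vendor prefix makes the shelf easier to scan. The sketch below is an illustration, not part of freellmpool: it assumes you have the catalog as an OpenAI-style model listing (a dict with a `data` array of objects carrying an `id`), and `group_by_vendor` is a hypothetical helper name.

```python
from collections import defaultdict


def group_by_vendor(listing: dict) -> dict[str, list[str]]:
    """Group model IDs from an OpenAI-style listing by their vendor prefix.

    Assumes each entry in listing["data"] has an "id" like "vendor/name".
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for entry in listing.get("data", []):
        vendor, _, name = entry["id"].partition("/")
        groups[vendor].append(name)
    # Sort vendors and names so the output is stable between runs.
    return {vendor: sorted(names) for vendor, names in sorted(groups.items())}


# Illustrative input built from the IDs named on this page.
sample_listing = {
    "data": [
        {"id": "nvidia/nemotron-3-super-120b-a12b"},
        {"id": "meta/llama-3.3-70b-instruct"},
        {"id": "moonshotai/kimi-k2.6"},
    ]
}
by_vendor = group_by_vendor(sample_listing)
```

Here `by_vendor` maps each of `meta`, `moonshotai` and `nvidia` to a one-element list holding the model name without its prefix.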

Get a key and call it

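Create a key at build.nvidia.com, export it as NVIDIA_API_KEY, then send an OpenAI-style chat completion request:

curl https://integrate.api.nvidia.com/v1/chat/completions \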
  -H "Authorization: Bearer $NVIDIA_API_KEY" -H "Content-Type: application/json" \
  -d '{"model":"meta/llama-3.3-70b-instruct","messages":[{"role":"user","content":"Hi"}]}'
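The same call can be made from Python using only the standard library. This is a minimal sketch that mirrors the curl request above. The helper names `build_chat_request` and `ask` are illustrative, not part of any NVIDIA or freellmpool API, and the reply parsing assumes the usual OpenAI-style `choices[0].message.content` response shape.

```python
import json
import os
import urllib.request

BASE_URL = "https://integrate.api.nvidia.com/v1"  # endpoint given on this page


def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 60.0) -> str:
    """Send the request and return the reply text.

    Assumption: the response follows the OpenAI chat completion shape.
    Reads the key from the NVIDIA_API_KEY environment variable.
    """
    request = build_chat_request(model, prompt, os.environ["NVIDIA_API_KEY"])
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["choices"][0]["message"]["content"]
```

Because the model is just a string in the request body, comparing models means calling `ask` with the same prompt and a different ID each time, for example `meta/llama-3.3-70b-instruct` and then `nvidia/nemotron-3-super-120b-a12b`. Each call spends credits, so keep comparison prompts short.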

Limits and gotchas

Free access is credit-metered rather than capped at a flat daily request count, so heavy use exhausts your allowance sooner than it would on Cerebras or Groq.
The catalog churns — models are added and retired; pin IDs you depend on and expect occasional changes.
Model IDs are namespaced by vendor (e.g. meta/, nvidia/, mistralai/) — copy them exactly.
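The last two points can be checked mechanically before a deploy. A minimal sketch, assuming you already hold the current catalog as a list of ID strings (for example, copied from the output of freellmpool models -p nvidia); `split_model_id` and `missing_pins` are hypothetical helper names, not part of any library.

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'vendor/name' into (vendor, name); reject IDs with no namespace."""
    vendor, sep, name = model_id.partition("/")
    if not sep or not vendor or not name:
        raise ValueError(f"expected 'vendor/name', got {model_id!r}")
    return vendor, name


def missing_pins(pinned: list[str], catalog: list[str]) -> list[str]:
    """Return pinned IDs that are absent from the catalog.

    Matching is exact and case-sensitive, since IDs must be copied verbatim.
    """
    available = set(catalog)
    return [model_id for model_id in pinned if model_id not in available]
```

Running `missing_pins` against a fresh catalog listing on a schedule gives early warning when a model you depend on has been retired or renamed.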

Pool NVIDIA with other free tiers

The natural setup: NVIDIA for breadth and the occasional niche model, with faster/higher-volume tiers (Cerebras, Groq) carrying the bulk of traffic. freellmpool lets you pin NVIDIA when you want a specific model and pool everything otherwise:

pip install freellmpool
export NVIDIA_API_KEY=...                # plus other free keys
freellmpool models -p nvidia            # browse NVIDIA's free catalog
freellmpool ask -m nvidia/nemotron-3-super-120b-a12b "..."   # pin a specific model
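The pin-or-pool split described above comes down to a small routing rule. The sketch below is purely illustrative and is not freellmpool's implementation: the provider names come from this page, while `choose_providers`, the order of `BULK_PROVIDERS`, and the choice to try NVIDIA last for unpinned traffic are assumptions made for the example.

```python
from typing import Optional

BULK_PROVIDERS = ["cerebras", "groq"]  # higher-volume tiers carry most traffic
BREADTH_PROVIDER = "nvidia"            # credit-metered, so used sparingly


def choose_providers(pinned_model: Optional[str] = None) -> list[str]:
    """Return the providers to try, in order.

    A pinned model is routed to NVIDIA only. Unpinned traffic tries the
    bulk tiers first and reaches NVIDIA last, which conserves credits.
    """
    if pinned_model is not None:
        return [BREADTH_PROVIDER]
    return BULK_PROVIDERS + [BREADTH_PROVIDER]
```

Putting NVIDIA last for unpinned requests keeps its credits available for the cases where only its catalog has the model you need.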

FAQ

Is the NVIDIA NIM API free?

Yes. build.nvidia.com offers free, credit-metered access to a large model catalog via an OpenAI-compatible endpoint at integrate.api.nvidia.com/v1. Create a key at build.nvidia.com.

Why use NVIDIA's free tier?

Breadth — dozens of models (Llama, Nemotron, Kimi, Qwen, Mistral and more) from a single key, ideal for trying and comparing models without many signups.

What's the catch with NVIDIA's free tier?

It's metered by credits rather than a flat daily request cap, so high-volume use exhausts the free allowance sooner than it would on a request-capped tier like Cerebras.

Part of freellmpool (MIT, open source). Catalog and limits change — check NVIDIA's docs. Updated 2026-06-03.