freellmpool › guide
Free LLM tiers drift through the day — keys expire, providers go down, and daily caps fill up. freellmpool ships capacity tools that tell you, locally, which of your free providers are usable right now, which are near their quota, and which keys to add next. freellmpool is a free, open-source tool that pools the free tiers of 18 LLM providers behind one OpenAI-compatible endpoint; these commands keep that pool healthy.
pip install freellmpool
freellmpool capacity status --target 5
capacity status reads your provider catalog, your environment (which keys are set), and
your per-day usage counters, then labels every provider with a status. It never calls a provider, so
it's instant:
$ freellmpool capacity status --target 5
LLM capacity: 4/5 healthy providers
Action recommended: add 1 provider(s).
healthy groq Groq used=12/1000 models=9 key=GROQ_API_KEY
low_quota github GitHub Models used=130/150 models=35 key=GITHUB_TOKEN
healthy cerebras Cerebras used=0/14400 models=2 key=CEREBRAS_API_KEY
...
| Status | Meaning |
|---|---|
healthy | Configured and usable from local state. |
low_quota | Usage is above 80% of the daily request hint. |
exhausted | Usage reached the daily request hint. |
invalid_key | Your local key inventory says the key has expired. |
missing | The provider exists in the catalog but isn't configured. |
--target N flags when you have fewer than N healthy providers; --all also
lists missing providers and external-only catalog candidates you could add. By default it refreshes an
advisory external catalog over the network (a read-only metadata fetch); pass --no-catalog-sync
to keep it fully local.
capacity status reads local state; providers health goes one step further and
sends a tiny real request to each configured provider, so you can tell a missing key from a rate-limited
or down provider:
$ freellmpool providers health
provider/model status latency note
groq/llama-3.3-70b-versatile ok 237 ms 2 tok
mistral/mistral-small-latest ok 539 ms 2 tok
cerebras/gpt-oss-120b rate_limited - HTTP 429
2/3 providers ok
Use -p groq,cerebras to test a subset, --timeout to bound each call, and
-m <model> to pin a model.
To reach a target number of healthy providers, ask for a checklist of which keys to create:
$ freellmpool keys checklist --target 5
Manual key checklist to reach 5 healthy providers:
- cerebras: create a key manually, then set CEREBRAS_API_KEY
keys add then walks you through it — it writes the key to your config.toml and
records metadata (name, dates, notes) in an optional inventory at
~/.config/freellmpool/keys.toml. The inventory is metadata only; raw secrets stay in your
config or environment.
freellmpool keys add groq # configure a known provider
freellmpool keys add Hyperbolic # match & import from the external catalog
freellmpool keys add MyProvider --base-url https://api.example.com/v1 --yes
If the name isn't a local provider, keys add checks the synced external catalog
(mnfst/awesome-free-llm-apis), matching typos
and model names with a small fuzzy search, and can import the suggestion. Or it builds a minimal
OpenAI-compatible provider and autodiscovers its models from the GET /models endpoint.
providers.toml stays the source of truth for routing; freellmpool doesn't send traffic to a
discovered provider until you've imported it and set its key. Imported endpoints are validated (https,
no junk) before they're written.When the proxy is running (freellmpool proxy), open
http://127.0.0.1:8080/dashboard. Alongside request counts, cache hits, and estimated
savings, it shows a healthy-provider count and a per-provider capacity table — the same signal as
capacity status, in the browser.
Run freellmpool capacity status: it sorts your configured providers by health and
remaining capacity, so the top healthy rows are the ones to use. The pool also picks
automatically per request and fails over when one is rate-limited.
The provider statuses are computed from local state only. By default it also refreshes the advisory
external catalog over the network; pass --no-catalog-sync to skip that and stay fully
offline. Live provider probing is a separate command, providers health.
The key inventory holds metadata only (provider, env-var name, optional dates and notes), never raw
secrets. keys add writes the actual key to config.toml (chmod 600); you can
also just keep keys in environment variables.