LlamaEdge supports any llama2-based LLM, including your own fine-tuned
models!
How does LlamaEdge work?
LlamaEdge provides an all-in-one script that installs WasmEdge, downloads the
selected model file, downloads the portable inference app, and then runs it.
Does it support GPUs?
Yes. LlamaEdge will automatically take advantage of the hardware
accelerators (e.g., GPUs) available on your device.
Can I use it at work?
Of course you can. You can also contact technical support via this form.
What kind of tech stack do you use?
The source code for LlamaEdge is here. The tech stack is Rust + Wasm +
llama.cpp. It is lightweight, portable, high-performance, and container-ready.
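To give a rough sense of how a Rust + Wasm inference app fits together, here is a minimal sketch in the style of the WasmEdge wasi-nn GGML examples. It assumes the `wasmedge_wasi_nn` crate and a GGUF model preloaded by the runtime under the alias `default` (e.g., via WasmEdge's `--nn-preload` option); the exact crate API, alias, and prompt format are assumptions and may differ from what the LlamaEdge apps actually do.

```rust
// Minimal sketch of a wasi-nn based chat app compiled to Wasm.
// Assumption: the host runtime has preloaded a GGUF model under the
// alias "default" (e.g., `--nn-preload default:GGML:AUTO:model.gguf`).
use wasmedge_wasi_nn::{ExecutionTarget, GraphBuilder, GraphEncoding, TensorType};

fn main() {
    // Load the preloaded GGML/GGUF model. ExecutionTarget::AUTO lets the
    // runtime pick a GPU or other accelerator when one is available.
    let graph = GraphBuilder::new(GraphEncoding::Ggml, ExecutionTarget::AUTO)
        .build_from_cache("default")
        .expect("failed to load the preloaded model");
    let mut ctx = graph
        .init_execution_context()
        .expect("failed to create an execution context");

    // Feed the prompt as a UTF-8 byte tensor and run inference.
    let prompt = "[INST] What is WasmEdge? [/INST]";
    ctx.set_input(0, TensorType::U8, &[1], prompt.as_bytes())
        .expect("failed to set the prompt");
    ctx.compute().expect("inference failed");

    // Read back the generated text.
    let mut output = vec![0u8; 8192];
    let size = ctx.get_output(0, &mut output).expect("failed to read output");
    println!("{}", String::from_utf8_lossy(&output[..size]));
}
```

A program along these lines is built for the `wasm32-wasi` target, which is what makes the resulting app portable across OSes and hardware and easy to ship in containers.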