LlamaEdge
Run open source LLMs locally or on the edge
Plus: an OpenAI-compatible API server for open source LLMs
Start by running the following command in your terminal

bash <(curl -sSfL 'https://code.flows.network/webhook/iwYN1SdN3AmPgR5ao5Gt/run-llm.sh')

Deploys a portable LLM chat app that runs on Linux and macOS, on x86, Arm, and Apple Silicon CPUs as well as NVIDIA GPUs.
Supports all Llama2-series models in GGUF format
Powered by Rust & WasmEdge (a CNCF-hosted project)
Book a demo
Give us a star
Interact with the LLM via CLI locally
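For example, once WasmEdge and a model file are in place, a terminal chat session can be started roughly like this (the model filename and prompt-template name below are placeholders; adjust them to the model you downloaded):

# Start an interactive chat in the terminal (filenames are illustrative)
wasmedge --dir .:. --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf llama-chat.wasm -p llama-2-chat

The same command works with your own fine-tuned GGUF file; just point the path after AUTO: at your model.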
Interact with the LLM via a web interface locally
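As a sketch, the web interface is served by the API server app. Assuming default settings (port 8080 and an app named llama-api-server.wasm, both assumptions here), it might look like this:

# Start the OpenAI-compatible API server with the web UI (names and port are assumptions)
wasmedge --dir .:. --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf llama-api-server.wasm -p llama-2-chat

# Then open http://localhost:8080 in a browser, or call the OpenAI-style endpoint directly
curl -X POST http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "llama-2-chat", "messages": [{"role": "user", "content": "What is LlamaEdge?"}]}'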
FAQs
What kind of models does LlamaEdge support?
LlamaEdge supports any Llama2-based LLM, including your own fine-tuned models!
How does LlamaEdge work?
LlamaEdge provides an all-in-one script that installs WasmEdge, downloads the selected model file, downloads the portable inference app, and then runs it.
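In rough terms, the script automates steps like the following (the URLs and filenames are illustrative; the script resolves the actual download locations for you):

# 1. Install WasmEdge with the GGML (llama.cpp) plugin
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- --plugin wasi_nn-ggml

# 2. Download a GGUF model file
curl -LO https://huggingface.co/second-state/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_M.gguf

# 3. Download the portable inference app
curl -LO https://github.com/LlamaEdge/LlamaEdge/releases/latest/download/llama-chat.wasm

# 4. Run the app against the model
wasmedge --dir .:. --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf llama-chat.wasm -p llama-2-chat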
Does it support GPUs?
Yes, LlamaEdge will automatically take advantage of the hardware accelerators (e.g., GPUs) available on your device.
Can I use it at work?
Of course you can. You can also contact technical support via this form.
What tech stack do you use?
The source code for LlamaEdge is here. The tech stack is Rust + Wasm + llama.cpp. It is lightweight, portable, high-performance, and container-ready.
Stay in touch