LlamaEdge supports any llama2-based LLM, including your own fine-tuned
models!
How does LlamaEdge work?
LlamaEdge provides an all-in-one script that installs WasmEdge, downloads the
selected model file, downloads the portable inference app, and then runs it.
Does it support GPUs?
Yes. LlamaEdge will automatically take advantage of the hardware
accelerators (e.g., GPUs) available on your device.
Can I use it at work?
Of course you can. You can also contact technical support via this form.
What kind of tech stack do you use?
The source code for LlamaEdge is here. The tech stack is Rust + Wasm +
llama.cpp. It is lightweight, portable, high-performance, and container-ready.
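To give a rough sense of how a Rust + Wasm inference app fits together, here is a minimal sketch in the style of the WasmEdge wasi-nn GGML examples. It assumes the `wasmedge_wasi_nn` crate and a GGUF model preloaded by the runtime under the alias `default` (e.g., via WasmEdge's `--nn-preload` option); the exact crate API, alias, and prompt format are assumptions and may differ from what the LlamaEdge apps actually do.

```rust
// Minimal sketch of a wasi-nn based chat app compiled to Wasm.
// Assumption: the host runtime has preloaded a GGUF model under the
// alias "default" (e.g., `--nn-preload default:GGML:AUTO:model.gguf`).
use wasmedge_wasi_nn::{ExecutionTarget, GraphBuilder, GraphEncoding, TensorType};

fn main() {
    // Load the preloaded GGML/GGUF model. ExecutionTarget::AUTO lets the
    // runtime pick a GPU or other accelerator when one is available.
    let graph = GraphBuilder::new(GraphEncoding::Ggml, ExecutionTarget::AUTO)
        .build_from_cache("default")
        .expect("failed to load the preloaded model");
    let mut ctx = graph
        .init_execution_context()
        .expect("failed to create an execution context");

    // Feed the prompt as a UTF-8 byte tensor and run inference.
    let prompt = "[INST] What is WasmEdge? [/INST]";
    ctx.set_input(0, TensorType::U8, &[1], prompt.as_bytes())
        .expect("failed to set the prompt");
    ctx.compute().expect("inference failed");

    // Read back the generated text.
    let mut output = vec![0u8; 8192];
    let size = ctx.get_output(0, &mut output).expect("failed to read output");
    println!("{}", String::from_utf8_lossy(&output[..size]));
}
```

A program along these lines is built for the `wasm32-wasi` target, which is what makes the resulting app portable across OSes and hardware and easy to ship in containers.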