LLM

Tutorial: Integrating Locally-Run DeepSeek R1 Distilled Llama Model with Cursor

In this article, we'll explore how to integrate the DeepSeek R1 Distilled Llama-8B model with the highly rated AI code editor Cursor to create a private coding assistant. DeepSeek R1 is a powerful open-source language model with efficient, cost-effective inference, making it particularly suitable for developers and researchers. Cursor is a popular AI code editor that can rely on different LLMs to complete code assistance tasks. While large language models specifically trained for coding tasks have shown excellent results, we've found that the coding capabilities of the currently trending DeepSeek models are quite impressive and comparable to those of many programmers.…
LLM AI inference Rust WebAssembly
DeepSeek Topples OpenAI's O1: The Chinese Model Rewriting AI Economics at 1/70th the Cost

This is adapted from an interview from when DeepSeek V2 impressed many at its debut in July 2024. With DeepSeek R1 now out, we are seeing more heated discussions on X: “So the Chinese have open sourced a model that can out think any PhD I've ever met.” The original interview was conducted in Chinese by Sina Finance. For the full English translation of the…
LLM AI inference Rust WebAssembly
Run DeepSeek R1 on your own Devices in 5 mins

In our previous article, we got a peek at the latest interview with the founder of DeepSeek. DeepSeek R1 is a powerful and versatile open-source LLM that challenges established players like OpenAI with its advanced reasoning capabilities, cost-effectiveness, and open availability. While it has some limitations, its innovative approach and strong performance make it a valuable tool for developers, researchers, and businesses alike. For those interested in exploring its capabilities, the model and its distilled versions are readily accessible on platforms like Hugging Face and GitHub.…
LLM AI inference Rust WebAssembly
TikTok Refugees' Guide to RedNote

Getting Started: The Basics. What is RedNote? Think of RedNote as TikTok meets Instagram meets Pinterest. While TikTok focuses on short videos, RedNote (or “Little Red Book”) is more about lifestyle content with both photos and videos. It's like having a digital lifestyle magazine where you're both the reader and creator. Download & Registration Tips: Download from the App Store or Google Play (search for “…
LLM Translation TikTok RedNote
RustCoder: AI-Assisted Rust Learning

Rust has been voted the most beloved programming language by Stack Overflow users eight years in a row. It is already widely used in mission-critical software, including the Linux kernel. In February 2024, the U.S. government released an official report urging developers to adopt Rust over C/C++ in all government software because of its memory safety and performance. Rust is also the hottest language among innovative startups of all sizes. For example, Elon Musk’s xAI is using Rust for all its AI infrastructure, a clear signal of where the industry is headed.…
LLM AI inference Rust WebAssembly
WasmEdge at KubeCon NA 2024: AI-Driven Video Translation

Second State is thrilled to bring its open source work around LLMs, WasmEdge and Gaia, to KubeCon + CloudNativeCon North America 2024, November 12-15, 2024 in Salt Lake City, Utah. This year, Second State unveils VideoLangua.com—a groundbreaking platform using the open source WasmEdge and Gaia tech stack to deliver high-quality video translation, dubbing, and subtitling. This innovation highlights Second State’s commitment to democratizing access to global communication through advanced open-source solutions.…
LLM AI inference Rust WebAssembly
WebAssembly Devroom at FOSDEM 2025 – Call for Speakers Open

We're excited to announce that the WebAssembly Devroom will be held on 2nd February 2025 at FOSDEM 2025 in Brussels, Belgium. WebAssembly is expanding its use cases from browsers to the cloud, and this devroom is a fantastic opportunity for the community to meet and discuss the latest developments in the WebAssembly ecosystem. The devroom is co-hosted by engineers from WasmEdge and the NTT software innovation lab. About FOSDEM: FOSDEM is a free event for software developers to meet, share ideas, and collaborate.…
LLM AI inference Rust WebAssembly
Lightweight and cross-platform LLM agents on Ascend 910B

The Ascend 910B is a popular alternative to the Nvidia H100 in China. While it is a powerhouse for AI training workloads, we are mostly interested in its inference performance. That is especially relevant as new Ascend NPUs are released for edge devices. Recently, Huawei generously donated five bare-metal servers with 8x Ascend 910B each to support the GOSIM Super Agent hackathon event. Those machines are true beasts, costing well over $100k USD each.…
LLM AI inference Rust WebAssembly
Run FLUX.1 [schnell] on your MacBook

FLUX.1 is an open-source image generation model developed by Black Forest Labs, the creators of Stable Diffusion. They recently released FLUX.1 [schnell], a lightweight, high-speed variant designed for local use, ideal for personal projects, and licensed under Apache 2.0. With WasmEdge's release of version 0.14.1, which includes Stable Diffusion plugin support, you can use LlamaEdge (the Rust + Wasm stack) to run both the FLUX.1 [schnell] and Stable Diffusion models and generate images directly on your machine, without needing to install complex Python packages or C++ toolchains!…
LLM AI inference Rust WebAssembly
Getting started with Qwen2.5-14B

The Qwen 2.5 series includes models ranging from 0.5B to 110B parameters, optimized for diverse tasks like coding, logical reasoning, and natural language understanding. These models, including smaller ones (0.5B, 1.8B, 4B, 7B, 14B) for edge devices and larger ones (72B, 110B) for enterprise use, have seen significant improvements in instruction following, logic, and support for over 29 languages. They support long contexts (up to 128K input tokens and over 8K generated tokens) and can generate structured outputs like JSON.…
LLM AI inference Rust WebAssembly
