-
Getting started with OpenAI’s gpt-oss
OpenAI just got a lot more open, announcing two state-of-the-art open-weight language models: gpt-oss-120b and gpt-oss-20b. Both provide full chain-of-thought (CoT) reasoning and support structured outputs, tool use, and function calling. According to OpenAI, gpt-oss-120b matches the core reasoning performance of o4-mini while running efficiently on a single 80 GB GPU, while gpt-oss-20b delivers results comparable to o3-mini on standard benchmarks and runs on edge devices with just 16 GB of memory—making it well suited for on-device applications, local inference, and fast iteration without expensive infrastructure.…
-
Getting Started with SmolLM3‑3B‑GGUF for Long‑Context Multilingual Reasoning
SmolLM3 is a compact 3-billion-parameter transformer that delivers state-of-the-art performance at the 3B–4B scale, supporting six major languages and extended contexts of up to 128,000 tokens. Despite its small footprint, it offers capabilities comparable to 4B models, making it lightweight enough for edge devices. It excels at long-context reasoning over documents, transcripts, or logs, and its multilingual instruction tuning for English, French, Spanish, German, Italian, and Portuguese makes it a strong fit for global applications.…