A WASI-like extension for Tensorflow

Nov 14, 2020 • 5 minutes to read

AI inference is a computationally intensive task that could benefit greatly from the speed of Rust and WebAssembly. However, the standard WebAssembly sandbox provides very limited access to the native OS and to hardware such as multi-core CPUs, GPUs, and specialized AI inference chips. On its own, that makes it a poor fit for AI workloads.

The popular WebAssembly System Interface (WASI) provides a design pattern for sandboxed WebAssembly programs to securely access native host functions. The WasmEdge Runtime extends the WASI model to support access to native Tensorflow libraries from WebAssembly programs. This combines the security, portability, and ease of use of WebAssembly with native speed for Tensorflow.

A Rust example

Prerequisite

You need to install WasmEdge and Rust.

Build

$ rustup target add wasm32-wasi
$ cargo build --target wasm32-wasi --release

Run

The wasmedge-tensorflow-lite utility is the WasmEdge build that includes the Tensorflow and Tensorflow Lite extensions.

$ wasmedge-tensorflow-lite target/wasm32-wasi/release/classify.wasm < grace_hopper.jpg
It is very likely a <a href='https://www.google.com/search?q=military uniform'>military uniform</a> in the picture

Make it run faster

To make Tensorflow inference run much faster, you can AOT (ahead-of-time) compile the WebAssembly program to native machine code, and then run that native code inside the WasmEdge sandbox.

$ wasmedgec-tensorflow target/wasm32-wasi/release/classify.wasm classify.so
$ wasmedge-tensorflow-lite classify.so < grace_hopper.jpg
It is very likely a <a href='https://www.google.com/search?q=military uniform'>military uniform</a> in the picture

Code walkthrough

It is fairly straightforward to use the WasmEdge Tensorflow API. You can see the entire source code in main.rs.

First, it reads the trained TFLite model file (ImageNet) and its label file. The label file maps numeric output from the model to English names for the classified objects.

    let model_data: &[u8] = include_bytes!("models/mobilenet_v1_1.0_224/mobilenet_v1_1.0_224_quant.tflite");
    let labels = include_str!("models/mobilenet_v1_1.0_224/labels_mobilenet_quant_v1_224.txt");

Next, it reads the image from STDIN and converts it to the size and RGB pixel arrangement required by the Tensorflow Lite model.

    let mut buf = Vec::new();
    io::stdin().read_to_end(&mut buf).unwrap();

    let flat_img = wasmedge_tensorflow_interface::load_jpg_image_to_rgb8(&buf, 224, 224);
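The flat image is a plain byte buffer. As a rough sketch, assuming a row-major layout with interleaved R, G, B bytes (this matches the `[1, 224, 224, 3]` tensor shape passed to the session, but the source does not spell it out), a pixel could be located as follows. The helpers `rgb8_len` and `pixel_at` are hypothetical names for illustration and are not part of the WasmEdge API.

```rust
/// Expected byte length of a flat RGB8 image (3 bytes per pixel).
fn rgb8_len(width: usize, height: usize) -> usize {
    width * height * 3
}

/// Returns the [R, G, B] bytes at (x, y), assuming a row-major,
/// interleaved layout. Returns None if (x, y) falls outside the buffer.
fn pixel_at(flat: &[u8], width: usize, x: usize, y: usize) -> Option<[u8; 3]> {
    if x >= width {
        return None;
    }
    let start = (y * width + x) * 3;
    let px = flat.get(start..start + 3)?;
    Some([px[0], px[1], px[2]])
}

fn main() {
    // A tiny 2x2 stand-in image; the real input in this article is 224x224.
    let flat: Vec<u8> = vec![
        10, 11, 12, 20, 21, 22, // row 0
        30, 31, 32, 40, 41, 42, // row 1
    ];
    assert_eq!(flat.len(), rgb8_len(2, 2));
    println!("pixel (1, 1): {:?}", pixel_at(&flat, 2, 1, 1));
    println!("bytes for 224x224: {}", rgb8_len(224, 224));
}
```

Under this assumption, a 224 x 224 input occupies 150528 bytes, which is a cheap sanity check to run before handing the buffer to the model.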

Then, the program runs the TFLite model with its required input tensor (i.e., the flat image in this case), and receives the model output. In this case, the model output is an array of numbers. Each number corresponds to the probability of an object name in the label text file.

    let mut session = wasmedge_tensorflow_interface::Session::new(&model_data, wasmedge_tensorflow_interface::ModelType::TensorFlowLite);
    session.add_input("input", &flat_img, &[1, 224, 224, 3])
           .run();
    let res_vec: Vec<u8> = session.get_output("MobilenetV1/Predictions/Reshape_1");

Let's find the object with the highest probability, and then look up the name in the labels file.

    let mut i = 0;
    let mut max_index: i32 = -1;
    let mut max_value: u8 = 0;
    while i < res_vec.len() {
        let cur = res_vec[i];
        if cur > max_value {
            max_value = cur;
            max_index = i as i32;
        }
        i += 1;
    }

    let mut label_lines = labels.lines();
    for _i in 0..max_index {
      label_lines.next();
    }
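The search and lookup above could also be written with iterators. The following is a minimal sketch rather than the code from main.rs; `argmax` and `top_label` are hypothetical helper names. One deliberate difference: for an all-zero score vector this version returns index 0, whereas the loop above leaves `max_index` at -1.

```rust
/// Returns (index, score) of the highest score, or None for empty input.
/// `max_by_key` keeps the last maximum, so iterating in reverse makes the
/// first maximum win on ties, like the `cur > max_value` comparison.
fn argmax(scores: &[u8]) -> Option<(usize, u8)> {
    scores
        .iter()
        .copied()
        .enumerate()
        .rev()
        .max_by_key(|&(_, score)| score)
}

/// Looks up the label on the line whose number matches the best index.
/// Returns None if there are no scores or the label file is too short.
fn top_label<'a>(scores: &[u8], labels: &'a str) -> Option<(&'a str, u8)> {
    let (index, score) = argmax(scores)?;
    let name = labels.lines().nth(index)?;
    Some((name, score))
}

fn main() {
    // Made-up labels and scores, standing in for the model output.
    let labels = "background\ntench\ngoldfish";
    let scores = [3u8, 200, 17];
    println!("{:?}", top_label(&scores, labels));
}
```

Returning an `Option` avoids the `unwrap()` panic when the label file has fewer lines than the model has outputs.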

Finally, it prints the result to STDOUT.

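    let class_name = label_lines.next().unwrap().to_string();
    // `confidence` is not defined in this excerpt; the thresholds below are an assumed stand-in.
    let confidence = if max_value > 200 { "is very likely" } else if max_value > 125 { "is likely" } else { "could be" };
    if max_value > 50 {
      println!("It {} a <a href='https://www.google.com/search?q={}'>{}</a> in the picture", confidence, class_name, class_name);
    } else {
      println!("It does not appear to be any food item in the picture.");
    }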

A JavaScript example

Prerequisite

You need to install WasmEdge. You also need the QuickJS interpreter for WasmEdge. It is the qjs_tf.wasm file in the WasmEdge repo.

You can build your own qjs_tf.wasm from the wasmedge-quickjs project.

Run

The wasmedge-tensorflow-lite utility is the WasmEdge build that includes the Tensorflow and Tensorflow Lite extensions.

$ cd <WasmEdge>/tools/wasmedge/examples/js

# Download the Tensorflow example
$ wget https://raw.githubusercontent.com/second-state/wasmedge-quickjs/main/example_js/tensorflow_lite_demo/aiy_food_V1_labelmap.txt
$ wget https://raw.githubusercontent.com/second-state/wasmedge-quickjs/main/example_js/tensorflow_lite_demo/food.jpg
$ wget https://raw.githubusercontent.com/second-state/wasmedge-quickjs/main/example_js/tensorflow_lite_demo/lite-model_aiy_vision_classifier_food_V1_1.tflite
$ wget https://raw.githubusercontent.com/second-state/wasmedge-quickjs/main/example_js/tensorflow_lite_demo/main.js

$ wasmedge-tensorflow-lite --dir .:. qjs_tf.wasm main.js
label: Hot dog
confidence: 0.8941176470588236

Code walkthrough

It is fairly straightforward to use the WasmEdge JavaScript Tensorflow API. You can see the entire source code in main.js.

First, it reads the image from a file and converts it to the size and RGB pixel arrangement required by the Tensorflow Lite model.

let img = new Image('food.jpg')
let img_rgb = img.to_rgb().resize(192,192)
let rgb_pix = img_rgb.pixels()

Then, the program runs the TFLite model with its required input tensor (i.e., the pixel image in this case), and receives the model output. In this case, the model output is an array of numbers. Each number corresponds to the probability of an object name in the label text file.

let session = new TensorflowLiteSession('lite-model_aiy_vision_classifier_food_V1_1.tflite')
session.add_input('input',rgb_pix)
session.run()
let output = session.get_output('MobilenetV1/Predictions/Softmax');
let output_view = new Uint8Array(output)

Let's find the object with the highest probability, and then look up the name in the labels file.

let max = 0;
let max_idx = 0;
for (let i = 0; i < output_view.length; i++){
    let v = output_view[i]
    if(v>max){
        max = v;
        max_idx = i;
    }
}
let label_file = std.open('aiy_food_V1_labelmap.txt','r')
let label = ''
for(var i = 0; i <= max_idx; i++){
    label = label_file.getline()
}
label_file.close()

Finally, it prints the result to the console.

print('label:')
print(label)
print('confidence:')
print(max/255)
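The division by 255 rescales the quantized 8-bit score to a fraction between 0 and 1. As a small sketch of the same arithmetic in Rust (the function name `score_to_fraction` is hypothetical), a raw score of 228 would give the confidence value shown in the sample output above. That raw value is inferred from the printed result, not stated in the source.

```rust
/// Converts a quantized u8 score to a fraction in [0, 1] by dividing by 255,
/// mirroring `max/255` in the JavaScript example.
fn score_to_fraction(score: u8) -> f64 {
    f64::from(score) / 255.0
}

fn main() {
    // 228 is the raw score implied by the sample confidence (an inference).
    for &raw in &[0u8, 128, 228, 255] {
        println!("{:>3} -> {:.4}", raw, score_to_fraction(raw));
    }
}
```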

Deployment options

All the tutorials below use the WasmEdge Rust SDK for Tensorflow to create AI inference functions. Those Rust functions are then compiled to WebAssembly and deployed together with WasmEdge on the cloud. If you are not familiar with Rust, you can try our experimental AI inference DSL.

Serverless functions

The following tutorials showcase how to deploy WebAssembly programs (written in Rust) on public cloud serverless platforms. The WasmEdge Runtime runs inside a Docker container on those platforms. Each serverless platform provides APIs to get data into and out of the WasmEdge runtime through STDIN and STDOUT.
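The STDIN/STDOUT contract described here can be pictured as a function that reads the whole request and writes a reply. The sketch below is illustrative only: `handle` is a hypothetical name, the reply format is made up, and it is written against the generic `Read` and `Write` traits so the same logic can be exercised without a real serverless platform.

```rust
use std::io::{self, Read, Write};

/// Hypothetical handler: reads the whole request body from `input` and
/// writes a one-line reply to `output`. Returns the number of bytes read.
fn handle<R: Read, W: Write>(mut input: R, mut output: W) -> io::Result<usize> {
    let mut buf = Vec::new();
    input.read_to_end(&mut buf)?;
    // A real function would run inference on `buf` at this point.
    writeln!(output, "received {} bytes", buf.len())?;
    Ok(buf.len())
}

fn main() -> io::Result<()> {
    // In deployment, the platform supplies STDIN and collects STDOUT.
    handle(io::stdin(), io::stdout())?;
    Ok(())
}
```

Keeping the handler generic over the two traits means an in-memory byte slice and a `Vec<u8>` can stand in for the platform during local testing.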

Second State FaaS and Node.js

The following tutorials showcase how to deploy WebAssembly functions (written in Rust) on the Second State FaaS. Since the FaaS service is running on Node.js, you can follow the same tutorials for running those functions in your own Node.js server.

Service mesh

The following tutorials showcase how to deploy WebAssembly functions and programs (written in Rust) as sidecar microservices.

  • The Dapr template shows how to build and deploy Dapr sidecars in Go and Rust. The sidecars then use the WasmEdge SDK to start WebAssembly programs that process workloads for the microservices.

Data streaming framework

The following tutorials showcase how to deploy WebAssembly functions (written in Rust) as embedded handler functions in data streaming frameworks for AIoT.

  • The YoMo template starts the WasmEdge Runtime to process image data as the data streams in from a camera in a smart factory.