Deploying Tensorflow models in production with less than 50 lines of code

Serverless Tensorflow functions in public clouds

For software developers and students, artificial intelligence pays. In 2021, the average annual salary for Tensorflow (a popular AI framework) developers is $148,508. Skills in artificial intelligence are now mandatory even in entry-level programming jobs. In fact, it is quite easy to follow an online tutorial and train your own Tensorflow model for tasks such as image recognition and natural language processing. You only need some basic Python knowledge to do the training and then run the model for a demo. But just knowing how to use simple Python to train a model is not going to command a big salary.

However, it is much harder to make the model you just trained available to the rest of the world as a reliable web service. For developers, there are significant challenges in deploying Tensorflow models in production. Companies and employers pay top dollar for people who can overcome these challenges.

The Python language and frameworks are inefficient at preparing and processing input and output data for the model. It is estimated that 90% of the AI computation workload is spent on data preparation. The Python language is simply too heavy and slow for this work.
It is very hard to scale the service on demand. Due to the computational requirements of AI inference, a server machine could be temporarily blocked even when there are only a few requests. Scaling the number of servers up and down on demand is crucial.

Yet, there are simple solutions to these problems. With the Second State VM and Tencent Cloud, you can deploy Tensorflow models as a service in production with less than 50 lines of simple Rust code. Check out this. You will be able to deploy your production-ready models on Tencent Cloud for free!

If you go through the steps in this article and deploy a Tensorflow serverless function on Tencent Cloud, we’d like to send you a very cool “serverless” face mask. Fill out this form to claim yours!

How it works

Our choices of technologies to address the above challenges are as follows.

The Rust programming language is very fast and memory safe. It is an ideal choice for high performance data processing and model execution logic.
WebAssembly acts as the bridge between the modern Rust program, the latest Tensorflow library, and the hardened operating systems in public clouds. WebAssembly is the compatibility sandbox and is crucial for developer experience.
Tencent Cloud Serverless provides the scalable infrastructure to run the Rust and WebAssembly function for Tensorflow inference. Together with the serverless framework, Tencent Cloud Serverless provides a great developer experience in worry-free software deployment.

Now, let's see how it works. First of all, fork this template project from GitHub, and go through the prerequisites. You can use the GitHub Codespaces IDE or our Docker image, or install Rust, ssvmup, and the serverless framework on your own computer.

Here is an image recognition AI as a service. It uses a Tensorflow model trained to recognize food items in images. With less than 50 lines of simple Rust code, you can deploy it on Tencent Cloud Serverless, which scales on demand and only charges you for actual usage.

See it in Action

The code

The Rust code below loads a Tensorflow model and executes it against an input image to recognize what is in that image. This particular model is in Tensorflow Lite format and is trained to recognize food items in the input image.

use std::io::{self, Read};
use serde::Deserialize;

// Minimal shape of the JSON event from the API gateway.
// Only the base64-encoded body is used here; this struct is an assumed sketch.
#[derive(Deserialize)]
struct FaasInput {
    body: String,
}

// Load the model data from a file (embedded into the binary at compile time)
let model_data: &[u8] = include_bytes!("lite-model_aiy_vision_classifier_food_V1_1.tflite");

// Read the input image data from Tencent Cloud's API gateway
let mut buffer = String::new();
io::stdin().read_to_string(&mut buffer).expect("Error reading from STDIN");
let obj: FaasInput = serde_json::from_str(&buffer).unwrap();
let img_buf = base64::decode_config(&(obj.body), base64::STANDARD).unwrap();

// Resize the image to the size needed by the Tensorflow model
let flat_img = ssvm_tensorflow_interface::load_jpg_image_to_rgb8(&img_buf, 192, 192);

// Execute the model against the input image and get the result tensor value
let mut session = ssvm_tensorflow_interface::Session::new(&model_data, ssvm_tensorflow_interface::ModelType::TensorFlowLite);
session.add_input("input", &flat_img, &[1, 192, 192, 3]).run();
let res_vec: Vec<u8> = session.get_output("MobilenetV1/Predictions/Softmax");
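The shape [1, 192, 192, 3] passed to add_input means one image of 192 by 192 pixels with three color channels. As a rough sketch of what that implies for the flat input buffer, the helpers below compute the expected buffer length and the offset of a single channel value. The function names are hypothetical, and the row-major, interleaved RGB layout is an assumption based on the name load_jpg_image_to_rgb8, not something confirmed by the library documentation.

```rust
// Dimensions taken from the shape [1, 192, 192, 3] used in add_input.
const BATCH: usize = 1;
const HEIGHT: usize = 192;
const WIDTH: usize = 192;
const CHANNELS: usize = 3;

/// Number of bytes the flattened input tensor should contain.
fn expected_len() -> usize {
    BATCH * HEIGHT * WIDTH * CHANNELS
}

/// Offset of one channel value of pixel (row, col) in the flat buffer.
/// Assumes row-major order with interleaved R, G, B bytes (an assumption).
fn pixel_offset(row: usize, col: usize, channel: usize) -> usize {
    (row * WIDTH + col) * CHANNELS + channel
}

/// Returns true when the buffer length matches the declared tensor shape.
fn has_valid_shape(flat_img: &[u8]) -> bool {
    flat_img.len() == expected_len()
}
```

A check such as has_valid_shape(&flat_img) before calling add_input could catch a mismatch between the resize dimensions and the tensor shape early, instead of leaving it to the inference call.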

The res_vec vector contains one score for each object class the model knows about. Each score is an unsigned byte from 0 to 255 that represents a probability (e.g., a score of about 204 for a cake corresponds to a probability of roughly 0.8). The Rust code below reads in the labels for those classes, and prints out the label with the highest score from the Tensorflow model output.

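let labels = include_str!("aiy_food_V1_labelmap.txt");

// Find the index and value of the highest score in the output tensor
let mut max_index: i32 = -1;
let mut max_value: u8 = 0;
for (i, &cur) in res_vec.iter().enumerate() {
    if cur > max_value {
        max_value = cur;
        max_index = i as i32;
    }
}

// Map the score to a phrase for the message below.
// The thresholds 200 and 125 are illustrative assumptions.
let confidence = if max_value > 200 {
    "very likely"
} else if max_value > 125 {
    "likely"
} else {
    "possibly"
};

// Skip ahead to the label line that matches the winning index
let mut label_lines = labels.lines();
for _i in 0..max_index {
    label_lines.next();
}

let class_name = label_lines.next().unwrap().to_string();
if max_value > 50 && max_index != 0 {
    println!("The image {} contains a <a href='https://www.google.com/search?q={}'>{}</a>", confidence, class_name, class_name);
} else {
    println!("No food item is detected");
}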
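The same post-processing can be packaged as a small function that is easy to test on its own. The following is only a sketch: the Prediction struct and the top_prediction and approx_probability functions are hypothetical names, not part of the template project, and the linear 0 to 255 probability scale is an assumption about how the model quantizes its output.

```rust
/// The best class picked from a vector of quantized scores.
#[derive(Debug, PartialEq)]
struct Prediction {
    index: usize,
    score: u8,
    label: String,
}

/// Returns the highest-scoring class, or None when there are no scores
/// or the label text has no line for the winning index.
/// On ties, the first (lowest index) class wins.
fn top_prediction(scores: &[u8], labels: &str) -> Option<Prediction> {
    let mut best: Option<(usize, u8)> = None;
    for (i, &s) in scores.iter().enumerate() {
        // Replace the current best only when the new score is strictly higher.
        if best.map_or(true, |(_, b)| s > b) {
            best = Some((i, s));
        }
    }
    let (index, score) = best?;
    let label = labels.lines().nth(index)?.to_string();
    Some(Prediction { index, score, label })
}

/// Converts a quantized score to an approximate probability,
/// assuming a linear 0..=255 scale (an assumption about the model output).
fn approx_probability(score: u8) -> f32 {
    f32::from(score) / 255.0
}
```

In the function above, this would be called as top_prediction(&res_vec, labels). Returning an Option avoids the -1 sentinel index and the unwrap on the label lookup.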

Deploy your own

Under the hood, the Rust code is compiled into WebAssembly bytecode, and executed in the SSVM WebAssembly runtime. The SSVM is pre-configured to access the high-performance Tensorflow native library across many operating system environments, including the serverless containers at Tencent Cloud. Tencent Cloud Serverless, in turn, provides a simple solution for scaling the Tensorflow inference function.
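As the code in the previous section shows, the function reads its request from STDIN and prints its answer to STDOUT. A minimal sketch of that contract, written so that it can be exercised without any cloud runtime, might look like the following. The handle function and the infer closure are hypothetical names used only for illustration.

```rust
use std::io::{self, Read, Write};

/// Reads the whole request from `input`, passes it to `infer`,
/// and writes the reply followed by a newline to `output`.
fn handle<R: Read, W: Write>(
    mut input: R,
    mut output: W,
    infer: impl Fn(&str) -> String,
) -> io::Result<()> {
    let mut request = String::new();
    input.read_to_string(&mut request)?;
    let reply = infer(&request);
    writeln!(output, "{}", reply)
}
```

In a deployed function, input and output would be io::stdin() and io::stdout(), while a test can pass a byte slice and a Vec<u8> instead.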

Open a Terminal window in the Codespaces IDE, in the Docker container, or on your own computer, and run the following command to build your cloud function.

$ ssvmup build --enable-aot

In the Terminal window, run the following commands to deploy the Tensorflow cloud function to Tencent Cloud.

$ cp pkg/scf.so scf/

$ sls deploy
... ...
website: https://sls-website-ap-hongkong-kfdilz-1302315972.cos-website.ap-hongkong.myqcloud.com

Load the deployed URL in any web browser and have fun!

Conclusion

In this article, we discussed how to create simple, safe, and high-performance Rust functions to run Tensorflow models, and how to deploy those functions on public clouds as scalable and on-demand AI services.

Now, deploy your own Tensorflow serverless function on Tencent Cloud for free, and claim a cool “serverless” face mask.