Overview

YBPU Model Compiler is a web-based service that converts PyTorch and ONNX models into optimized, deployable libraries for embedded devices and x86 PC/Server.

Embedded Model

Model data is compiled directly into the library. No external files needed at runtime.

Auto Configuration

Input shape, normalization parameters, and model type are automatically detected.

Cross-Compilation

Cross-compiled for ARM targets (Raspberry Pi 3, 4, 5, generic ARM) and built natively for Linux x86_64 (PC/Server).

Quick Start Guide

Follow these simple steps to compile your model:

1. Create an Account

Register with your email address and verify your account.

2. Select Target Platform

Choose your target device (e.g., Linux x86_64, Raspberry Pi 4 64-bit).

3. Choose Precision

Select FP32 for accuracy, FP16 for balance, or INT8 for speed.

4. Upload Your Model

Upload a TorchScript (.pt) or ONNX (.onnx) model file.

5. Download Package

Wait for compilation, then download your ready-to-use library.

Single machine only. The downloaded library runs on one device only and cannot be used for multi-machine deployment.

Important: Your PyTorch model must be saved in TorchScript format:

import torch
model = YourModel()
model.eval()
traced = torch.jit.trace(model, torch.randn(1, 3, 224, 224))
traced.save("model.pt")

Supported Platforms

The compiler supports ARM-based embedded targets and Linux x86_64 (PC/Server):

Platform                   Chip               Architecture   Recommended For
Linux x86_64 (PC/Server)   Intel/AMD 64-bit   x86_64         Desktop, server, x86 dev machines
Raspberry Pi 5 (64-bit)    Cortex-A76         ARMv8-A        Newest Pi boards
Raspberry Pi 4 (64-bit)    Cortex-A72         ARMv8-A        Standard 64-bit Pi OS
Raspberry Pi 4 (32-bit)    Cortex-A72         ARMv7-A        Legacy 32-bit OS
Raspberry Pi 3 (64-bit)    Cortex-A53         ARMv8-A        Older devices
Raspberry Pi 3 (32-bit)    Cortex-A53         ARMv7-A        Legacy support
Generic ARM64 Linux        ARMv8              ARMv8-A        Other 64-bit ARM boards
Generic ARM32 Linux        ARMv7              ARMv7-A        Other 32-bit ARM boards
Linux x86_64 build prerequisites (server)

To compile for Linux x86_64, the server must have YBPU built for the host and OpenCV installed. Build YBPU once:

cd ybpu
mkdir -p build-host-gcc-linux && cd build-host-gcc-linux
cmake -DCMAKE_TOOLCHAIN_FILE=../toolchains/host.gcc.toolchain.cmake ..
make -j4

Install OpenCV:

sudo apt install libopencv-dev

Or ensure pkg-config opencv4 (or opencv) works.

Supported Models

Compatible with Most Open-Source Models

Deep-ET YBPU Compiler supports most mainstream open-source neural network models. Simply upload your PyTorch or ONNX model, and our intelligent system will automatically handle preprocessing, conversion, and optimization.

Accepted File Formats

.pt / .pth (PyTorch), .onnx (ONNX)

10 Model Categories

We provide optimized support for the following categories with automatic preprocessing and postprocessing:

1. Image Classification

e.g., YOLO11-cls, ResNet, MobileNet...

Image → Class labels

2. Object Detection

e.g., YOLO11, YOLOv8...

Image → Bounding boxes

3. Instance Segmentation

e.g., YOLO11-seg, YOLOv8-seg...

Image → Masks

4. Rotated Detection (OBB)

e.g., YOLO11-obb, YOLOv8-obb...

Image → Rotated boxes

5. Pose Estimation

e.g., YOLO11-pose, YOLOv8-pose...

Image → Keypoints

6. Face Detection

e.g., SCRFD, RetinaFace, ArcFace...

Image → Faces

7. Crowd Counting

e.g., P2PNet...

Image → Count

8. Video Matting

e.g., RVM...

Image → Alpha

9. OCR

e.g., PaddleOCR...

Image → Text

10. Speech Recognition & Synthesis

e.g., Whisper (ASR), Piper (TTS)...

Audio ↔ Text

Smart Model Recognition

Our compiler automatically detects your model architecture and applies optimal settings. For custom or unknown models, AI-assisted analysis ensures proper configuration.

File Size Limit: Maximum upload size is 500 MB.

Input Formats

The compiled library automatically adapts to different input formats. Just pass your data - preprocessing is handled automatically!

Image Input

For vision models, just pass an OpenCV Mat

cv::Mat image = cv::imread("photo.jpg");
auto results = model.detect(image);

Audio Input

For speech recognition models

std::vector<float> audio = load_wav("speech.wav");
auto result = model.transcribe(audio);
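
The load_wav helper above is a placeholder, not part of the package. As an illustration, here is a minimal Python sketch of the same conversion: decoding mono 16-bit PCM WAV samples and scaling them to floats in [-1.0, 1.0), the layout speech models typically expect. The function name and the assumption of 16-bit mono input are mine, not the library's.

```python
import struct
import wave

def wav_to_floats(path):
    """Read a mono 16-bit PCM WAV file and scale samples to [-1.0, 1.0)."""
    with wave.open(path, "rb") as wf:
        assert wf.getsampwidth() == 2, "expects 16-bit PCM"
        raw = wf.readframes(wf.getnframes())
    count = len(raw) // 2
    samples = struct.unpack("<%dh" % count, raw)  # little-endian int16
    return [s / 32768.0 for s in samples]
```

A C++ equivalent would read the samples into a std::vector<float> the same way before calling transcribe().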

Text Input

For TTS models

auto result = model.synthesize("Hello!");
// result.audio contains waveform

Automatic Preprocessing: The library handles all preprocessing internally - color conversion, normalization, resizing, padding, and more. Just pass your raw data!
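
For intuition, normalization is typically per-channel (value - mean) * scale, which is also the form the set_normalize(mean, scale) override in the FAQ suggests. A small illustrative Python sketch (the function name and example values are mine; actual means and scales are model-specific):

```python
def normalize_pixel(bgr, mean, scale):
    """Per-channel (value - mean) * scale, applied to one BGR pixel."""
    return [(v - m) * s for v, m, s in zip(bgr, mean, scale)]

# Illustrative: scale a BGR pixel to [0, 1] with zero mean and 1/255 scale
example = normalize_pixel([0, 128, 255], [0.0, 0.0, 0.0], [1 / 255.0] * 3)
```

The library applies this (plus resizing and color conversion) internally, so you never do it by hand.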

Precision Options

Choose the right precision level for your use case:

FP32

Single Precision Float

  • Highest accuracy
  • Largest file size
  • Best for development/testing

Accuracy: 100%
Size: 100%

FP16

Half Precision Float

  • Near-FP32 accuracy for most models
  • 50% smaller file size
  • Good balance of speed and size

Accuracy: ~99%
Size: 50%

INT8

8-bit Integer

  • Reduced accuracy
  • 75% smaller file size
  • Fastest inference

Accuracy: ~85%
Size: 25%
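
The INT8 trade-off comes from mapping float weights onto 8-bit integers. As a hedged illustration, here is a Python sketch of generic affine per-tensor quantization; YBPU's actual calibration scheme is not documented here, so treat this as the standard textbook technique, not the compiler's implementation:

```python
def quantize_int8(values):
    """Affine quantization: map the observed float range onto [-128, 127]."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255.0 or 1.0          # avoid zero scale for constant tensors
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize_int8(q, scale, zero_point):
    """Recover approximate floats; error is bounded by the scale step."""
    return [(x - zero_point) * scale for x in q]
```

Each weight shrinks from 4 bytes to 1 (hence "Size: 25%"), at the cost of rounding error per weight.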

Thread Count

Choose the number of CPU threads for inference based on your target device:

Raspberry Pi 4/5: use 4 threads for best performance.

Other ARM devices: use 2-4 threads, matching your CPU core count.

Tip: More threads = faster inference but higher power usage. For battery devices, use fewer threads.
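
The guidance above can be sketched as a small heuristic. This Python snippet is illustrative only (the function and its cap parameter are mine, not part of the YBPU API, which exposes set_num_threads in C++):

```python
import os

def pick_thread_count(battery_powered=False, cap=4):
    """Match the CPU core count, capped at 4 as suggested for a Pi 4/5;
    halve it on battery to trade speed for power draw."""
    cores = os.cpu_count() or 1
    threads = max(1, min(cores, cap))
    return max(1, threads // 2) if battery_powered else threads
```

In C++ you would pass the result to model.set_num_threads(n).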

Auto-Tune

The library automatically optimizes for your hardware on first run - no configuration needed!

1. First Run

Benchmarks ~100 configurations (5-10 seconds)

2. Cache

Saves optimal settings automatically

3. Run Fast

Uses cached settings instantly

Zero Configuration: Just run your model - the library handles all optimizations automatically!

On supported platforms (including Linux x86_64), the library automatically selects the best compute backend for your machine on first run. No manual configuration required.

Automatic Analysis

The compiler intelligently analyzes your model and handles all configuration automatically:

Input Shape
Model Type
Normalization
Preprocessing
Postprocessing

Just Upload: No manual configuration needed - our AI-powered analysis handles everything automatically!

Output Package Structure

After compilation, you'll receive a .tar.gz package containing 3 files:

ybpu_model_rpi4-64_fp32/
├── libybpu_model.a    # Static library (with embedded model)
├── ybpu_model.h       # C++ header file
└── model.lic          # License file (SHA-signed)

Key Features

  • Embedded Model: Model weights are compiled into the library - no external files needed
  • License Protection: Time-limited license based on your selected duration (1-365 days)
  • Simple Integration: Just link the .a file, include the .h header, and keep the .lic file

How to Use in Your Project

Project Structure

my_project/
├── CMakeLists.txt
├── main.cpp
├── lib/
│   └── libybpu_model.a    # Copy from downloaded package
├── include/
│   └── ybpu_model.h       # Copy from downloaded package
└── model.lic              # Copy to executable directory

CMakeLists.txt

cmake_minimum_required(VERSION 3.10)
project(my_inference_app)

set(CMAKE_CXX_STANDARD 11)

# Find OpenCV (required)
find_package(OpenCV REQUIRED)

# Include directories
include_directories(${CMAKE_SOURCE_DIR}/include)
include_directories(${OpenCV_INCLUDE_DIRS})

# Your application
add_executable(my_app main.cpp)

# Link with YBPU model library and OpenCV
target_link_libraries(my_app 
    ${CMAKE_SOURCE_DIR}/lib/libybpu_model.a
    ${OpenCV_LIBS}
    pthread
)

Build Commands

# On your target device (e.g., Raspberry Pi)

# 1. Install OpenCV if not already installed
sudo apt update
sudo apt install libopencv-dev

# 2. Create build directory and compile
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make

# 3. Copy license file to executable directory
cp ../model.lic .

# 4. Run your application
./my_app test_image.jpg

Direct g++ Compilation

Put libybpu_model.a at the end of the link command to avoid "undefined reference to YbpuModel::..." errors.

Method 1 — -L and -lybpu_model (library last):

g++ -O2 -o my_app main.cpp \
    -I./include \
    $(pkg-config --cflags --libs opencv4) \
    -lpthread \
    -L./lib -lybpu_model

Method 2 — Full path to .a file (recommended on Raspberry Pi):

g++ -O2 -o my_app main.cpp \
    -I./include \
    $(pkg-config --cflags --libs opencv4) \
    -lpthread \
    ./lib/libybpu_model.a

Copy model.lic to the same directory as the executable before running.

License System

The compiled library includes a time-limited license protection system.

How It Works

  • When you compile a model, you specify a license duration (1-365 days)
  • The compiler generates a SHA-signed license file (model.lic)
  • The library checks the license at runtime before loading the model
  • If the license has expired, the model will not load

Machine Binding (No Copy to Other Machines)

The library is bound to the first machine it runs on. On first run, it generates a binding (using the machine's MAC address and a SHA key) and stores it in a hidden file next to your executable. If you copy the application (including the library) to another machine, the library will not run there and will report an error.

  • First run: Binding is created automatically; no action needed.
  • Same machine: Runs normally on subsequent launches.
  • Different machine: Loading fails with "This library is bound to another machine and cannot run on this device." Use a new download for the other machine.
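
For intuition, a binding of this shape can be sketched in a few lines of Python. This is purely illustrative: the SECRET_KEY placeholder, function name, and exact input format are assumptions, since the library's real key and hashing details are not published.

```python
import hashlib
import uuid

SECRET_KEY = "example-key"   # stand-in; the real embedded key is not published

def machine_fingerprint():
    """Hash this machine's MAC address together with a secret key,
    in the spirit of the binding scheme described above."""
    mac = uuid.getnode()     # MAC address as a 48-bit integer
    return hashlib.sha256(f"{mac}:{SECRET_KEY}".encode()).hexdigest()
```

On first run the library would store such a fingerprint in a hidden file; on later runs it recomputes and compares, and a mismatch means a different machine.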

License File (model.lic)

The license file contains:

{
  "model_name": "your_model",
  "model_hash": "sha256...",
  "created_at": "2026-02-28",
  "expire_at": "2026-03-30",
  "valid_days": 30,
  "license_id": "ybpu-xxxx-xxxx",
  "signature": "sha256..."
}
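
The expiry check is simple date arithmetic on the expire_at field. A Python sketch of what get_license_days_remaining() conceptually computes (the function here is mine, for illustration only):

```python
import json
from datetime import date

def days_remaining(license_text, today=None):
    """Days until expire_at; negative once the license has expired."""
    lic = json.loads(license_text)
    expire = date.fromisoformat(lic["expire_at"])
    return (expire - (today or date.today())).days
```

The signature field would additionally be verified against the rest of the file before trusting these dates.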

Using License in C++

#include "ybpu_model.h"
#include <iostream>

int main() {
    // Check license status before loading
    if (!YbpuModel::is_license_valid()) {
        std::cerr << "License expired on: " 
                  << YbpuModel::get_license_expire_date() << std::endl;
        return 1;
    }
    
    std::cout << "License valid for " 
              << YbpuModel::get_license_days_remaining() 
              << " more days" << std::endl;
    
    // Create model (will fail if license expired)
    YbpuModel model;
    
    if (!model.is_loaded()) {
        // Error message will indicate license issue if applicable
        std::cerr << "Error: " << model.get_last_error() << std::endl;
        return 1;
    }
    
    // ... rest of your code
    return 0;
}

License API Reference

Static Method Description
YbpuModel::is_license_valid() Returns true if license is still valid
YbpuModel::get_license_days_remaining() Returns number of days until expiration
YbpuModel::get_license_expire_date() Returns expiration date string (YYYY-MM-DD)

Important: The model.lic file must be in the same directory as your executable, or point the YBPU_LICENSE_PATH environment variable at its location.

License Renewal

To renew an expired license:

  1. Re-upload your model to the compiler
  2. Select your desired license duration
  3. Download the new package with fresh model.lic
  4. Replace only the model.lic file (library remains the same)

Contact help@deep-et.com for enterprise licensing options.

C++ API Usage

Basic Usage

#include "ybpu_model.h"
#include <opencv2/opencv.hpp>

int main() {
    // Create model instance - model is already embedded!
    YbpuModel model;
    
    if (!model.is_loaded()) {
        std::cerr << "Failed to load model: " << model.get_last_error() << std::endl;
        return 1;
    }
    
    // Read image with OpenCV (BGR format)
    cv::Mat image = cv::imread("test.jpg");
    
    // Run inference - preprocessing is automatic!
    std::vector<float> output = model.inference(image);
    
    // Process output based on model type
    // Classification: find max probability
    auto max_it = std::max_element(output.begin(), output.end());
    int class_id = std::distance(output.begin(), max_it);
    float confidence = *max_it;
    
    std::cout << "Predicted class: " << class_id << std::endl;
    std::cout << "Confidence: " << confidence << std::endl;
    
    return 0;
}
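
The example treats the raw maximum score as the "confidence". If the model's final layer does not already apply softmax, converting the scores to probabilities first gives a more meaningful number. A language-agnostic sketch in Python (assumption: the output vector holds raw class scores):

```python
import math

def softmax(scores):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top1(scores):
    """Return (class_id, probability) for the highest-scoring class."""
    probs = softmax(scores)
    cls = max(range(len(probs)), key=probs.__getitem__)
    return cls, probs[cls]
```

The same two steps translate directly to C++ with std::exp and std::max_element over the vector returned by inference().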

API Reference

Method Description
YbpuModel() Constructor - loads the embedded model automatically
YbpuModel(param, bin) Load external model files (optional)
bool is_loaded() Check if model loaded successfully
int get_input_width() Get expected input width
int get_input_height() Get expected input height
int get_input_channels() Get expected input channels
vector<float> inference(cv::Mat) Run inference on an image
int get_num_threads() Get current thread count for inference
void set_num_threads(int) Set thread count for inference (1-16)
string get_last_error() Get last error message

Building Your Project

# On your target device (e.g., Raspberry Pi)

# 1. Extract the package
tar -xzf ybpu_model_rpi4-64_fp32.tar.gz
cd ybpu_model_rpi4-64_fp32

# 2. Set up your project with the lib and header files
mkdir -p my_project/{lib,include}
cp libybpu_model.a my_project/lib/
cp ybpu_model.h my_project/include/
cp model.lic my_project/

# 3. Build with CMake (see CMakeLists.txt example above)
cd my_project
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make

# 4. Don't forget the license file!
cp ../model.lic .

# 5. Run your application
./my_app test_image.jpg

CMake Integration

cmake_minimum_required(VERSION 3.10)
project(my_app)

set(CMAKE_CXX_STANDARD 11)
find_package(OpenCV REQUIRED)
find_package(OpenMP REQUIRED)

# Add YBPU model library
add_subdirectory(path/to/ybpu_model_package ybpu_model)

add_executable(my_app main.cpp)
target_link_libraries(my_app ybpu_model ${OpenCV_LIBS} OpenMP::OpenMP_CXX pthread)

Frequently Asked Questions

Q: Do I need to specify model files when using the library?

A: No! The model is embedded in the library. Just use YbpuModel model; - no file paths needed.

Q: What image format should I use?

A: Use OpenCV's default BGR format. The library handles all preprocessing (resize, color conversion, normalization) automatically.

Q: How do I save my PyTorch model correctly?

A: Use TorchScript tracing:

model.eval()
traced = torch.jit.trace(model, example_input)
traced.save("model.pt")

Q: Build fails with "OpenCV not found"

A: Install OpenCV on your target device:

sudo apt install libopencv-dev

Q: Can I use custom normalization?

A: Yes, you can override normalization with:

model.set_normalize({mean_b, mean_g, mean_r}, {scale_b, scale_g, scale_r});

Q: Why is inference slow?

A: Make sure to build with optimizations:

cmake -DCMAKE_BUILD_TYPE=Release ..

Q: What's the maximum model size?

A: Maximum upload size is 500 MB. For larger models, consider using FP16 or INT8 quantization.
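
The precision options scale file size predictably, since weight storage is roughly parameters times bytes per weight. A quick back-of-the-envelope helper (illustrative; real files add some overhead for graph structure and metadata):

```python
def model_size_mb(num_params, bits_per_weight):
    """Approximate weight storage: parameters x bytes per weight, in MiB."""
    return num_params * (bits_per_weight / 8.0) / (1024 ** 2)
```

So a 100M-parameter FP32 model (~381 MB) fits under the 500 MB limit, and FP16 or INT8 halves or quarters that.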