Guide to Deep-Learning Model Conversion in HarmonyOS Next

This article aims to deeply explore the technical details of deep-learning model conversion in Huawei's HarmonyOS Next system (up to API 12 at the time of writing) and summarize them based on actual development practice. It is mainly intended for technical sharing and communication; there may be mistakes and omissions, and colleagues are welcome to raise valuable opinions and questions so that we can make progress together. This article is original content, and any form of reprint must credit the source and the original author.

I. Model Conversion Requirements and Framework Support

(1) Necessity Analysis

In the development of intelligent applications for HarmonyOS Next, deep-learning model conversion is a crucial step. Due to differences in model structure, data format, and calculation methods among different deep-learning frameworks (such as TensorFlow and PyTorch), HarmonyOS Next devices require models in specific formats and specifications to run efficiently. Converting models trained in other frameworks into formats that HarmonyOS Next can recognize and run therefore makes full use of existing model resources, avoids redundant development, and ensures the compatibility and performance of models within the HarmonyOS Next ecosystem.

(2) Introduction to Supported Frameworks and Conversion Tools

  1. Supported Deep-learning Frameworks: HarmonyOS Next supports multiple mainstream deep-learning frameworks, including TensorFlow and PyTorch. These frameworks are widely used in academia and industry and offer rich model libraries and powerful functionality. For example, TensorFlow is favored for its efficient computational-graph construction and distributed-training capabilities, while PyTorch is popular with developers for its dynamic computational graph and concise code style. By supporting these frameworks, HarmonyOS Next gives developers more choice and flexibility.
  2. Overview of Conversion Tools: HarmonyOS Next provides corresponding model conversion tools for different frameworks. For TensorFlow models, the OMG offline model conversion tool can be used (as assumed from the documentation); for PyTorch models, there are likewise specific conversion methods and tools (such as the scripts referenced in the documentation). The main function of these tools is to convert model files from the original frameworks (such as TensorFlow's .pb files or PyTorch's .pt files) into formats that HarmonyOS Next devices can understand and run. During conversion, the model may also be optimized and adjusted to suit the HarmonyOS Next runtime environment.

(3) Comparison of Model Conversion Characteristics and Process Differences among Different Frameworks

  1. Characteristics and Process of TensorFlow Model Conversion
    • Characteristics: TensorFlow models are usually represented as computational graphs. During conversion, the nodes and operations in the computational graph must be transformed into forms that HarmonyOS Next can recognize. The model structure is relatively complex, containing a large number of nodes and edges, so the connection relationships and data flows between nodes need careful handling during conversion. For example, special conversion logic is required when dealing with variable sharing and control-flow operations in TensorFlow.
    • Process: First, prepare the original TensorFlow model file (in .pb format) and any relevant configuration files. Then use the OMG offline model conversion tool (as assumed above) and configure the conversion parameters, such as the model file path, output path, and input and output formats. The conversion tool parses the computational graph of the TensorFlow model, performs the conversion according to the configured parameters, and generates a model file that can run on HarmonyOS Next. During this process, the input and output nodes of the model may need to be specified explicitly to ensure that the converted model can correctly receive and output data.
  2. Characteristics and Process of PyTorch Model Conversion
    • Characteristics: PyTorch models are based on dynamic computational graphs, so the model definition is more flexible and intuitive. During conversion, the modules and layer structures of the PyTorch model must be transformed into the model representation used by HarmonyOS Next. Compared with TensorFlow, PyTorch conversion focuses more on turning a model structure defined in Python code into an efficient computational representation, while handling the post-conversion compatibility of certain PyTorch-specific operations (such as those tied to the automatic-differentiation mechanism).
    • Process: For PyTorch models, first ensure that the necessary dependencies are installed (such as the specific PyTorch version given in the documentation). Then, using the conversion script or tool provided in the documentation (assumed here to be a Python script), pass in the PyTorch model definition file (in .py format) and the model parameter file (in .pth format), together with the required conversion parameters, such as the output path and input shape. The script parses the structure and parameters of the PyTorch model and converts it into a format acceptable to HarmonyOS Next. During conversion, dynamic operations in the model (such as dynamically created layers) may need to be made static to meet the static model-deployment requirements of HarmonyOS Next, as the sketch after this list illustrates.
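
As a concrete illustration of this "static processing", here is a minimal sketch that uses PyTorch's torch.jit.trace to freeze a dynamic model into a static graph with a fixed input shape before handing it to a conversion tool. The network, file names, and input size are illustrative assumptions, not the required workflow of any specific HarmonyOS Next tool.

```python
import torch
import torch.nn as nn

# A small stand-in network; the layer sizes and input resolution are illustrative only.
class TinyNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(16, num_classes)

    def forward(self, x):
        x = self.pool(self.relu(self.conv(x)))
        return self.fc(x.flatten(1))

model = TinyNet()
# model.load_state_dict(torch.load("model.pth", map_location="cpu"))  # load real weights here (hypothetical path)
model.eval()

# Tracing freezes the dynamic graph into a static one with a fixed input shape,
# which is the kind of static representation a conversion tool typically expects.
example_input = torch.randn(1, 3, 416, 416)
traced = torch.jit.trace(model, example_input)
traced.save("model_traced.pt")  # hand this static artifact to the conversion tool
```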

II. Model Conversion Steps and Parameter Configuration

(1) Detailed Explanation of Conversion Steps

  1. Environment Setup and Dependency Installation
    • For the OMG offline model conversion tool (taking TensorFlow model conversion as an example), first ensure that a Java Runtime Environment (JRE) is installed on the system, since the tool may be implemented in Java. Depending on the tool's requirements, other dependent libraries or software packages may also be needed; for example, a specific version of the protobuf library may be required to handle serialization and deserialization of model files.
    • When converting PyTorch models, if using a specific Python script, ensure that the correct version of Python (such as Python 3.x as required in the document) and the PyTorch library (such as the specified PyTorch 1.11 version) are installed in the system. In addition, other relevant Python libraries may also need to be installed, such as numpy for data processing, which may be used during the model conversion process.
  2. Execution Process of Model Conversion
    • Taking TensorFlow model conversion as an example, after the environment and dependencies are ready, run the OMG offline model conversion tool. Specify, via command-line parameters, the model conversion mode (such as running mode 0 mentioned in the documentation, indicating no-training mode), the deep-learning framework type (TensorFlow here, corresponding to the parameter value 3), the path of the original model file (the --model parameter), the path of the calibration-method quantization configuration file (the --cal_conf parameter), the absolute path of the output model file (the --output parameter), and the input and output format parameters. The tool then reads the original model according to these parameters, performs the conversion, and saves the converted model to the specified output path. A hedged command sketch follows this list.
    • For PyTorch model conversion, execute the corresponding Python conversion script, passing in the PyTorch model definition file path, the model parameter file path, and other conversion parameters such as the input shape (the --input_shape parameter). The script loads the PyTorch model according to these parameters, performs the conversion, and generates a model file that can run on HarmonyOS Next.
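
For orientation only, the following sketch shows how such an OMG invocation might be assembled from Python. The binary name "omg", the exact flag spellings for the mode and framework type, and the name:shape form of --input_shape are assumptions; only the parameter values described above (mode 0, framework 3, --model, --cal_conf, --output, --input_format) come from this article, so check the official tool documentation before relying on it.

```python
import subprocess

# Hypothetical assembly of an OMG offline conversion command. Flag names for the
# mode and framework type are assumptions; verify them against the tool's docs.
cmd = [
    "omg",
    "--mode", "0",                                   # 0 = no-training (offline) mode
    "--framework", "3",                              # 3 = TensorFlow
    "--model", "/path/to/model.pb",                  # original TensorFlow model file
    "--cal_conf", "/path/to/calibration.prototxt",   # calibration-method quantization config
    "--output", "/path/to/output_model",             # absolute output path
    "--input_format", "NHWC",
    "--input_shape", "input:1,224,224,3",            # name:shape form is an assumption
]

result = subprocess.run(cmd, capture_output=True, text=True)
print(result.stdout)
if result.returncode != 0:
    print("Conversion failed:", result.stderr)
```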

(2) In-depth Explanation of Parameter Configuration and Example Interpretation

  1. Input and Output Format Parameters
    • During the model conversion process, input and output format parameters (such as --input_format) are crucial. The formats of input and output data can vary across frameworks and model types. For image data, for example, NHWC (batch size, height, width, channels) or NCHW (batch size, channels, height, width) may be supported. The appropriate format must be chosen according to the actual requirements of the model and what HarmonyOS Next devices support. If the model was trained with NHWC data but NCHW is incorrectly specified during conversion, data may be read incorrectly or produce abnormal calculation results at runtime.
    • Take an image-classification TensorFlow model as an example. Suppose the input data format during training was NHWC. When using the OMG offline model conversion tool, the --input_format parameter should be set to NHWC so that the converted model's input format is consistent with training. The output format likewise needs to be configured correctly according to the model's output requirements; for example, if the model outputs a class-probability vector, its data type and shape must be clearly defined so that the output can be correctly parsed and used in the HarmonyOS Next application.
  2. Model Structure Definition Parameters (such as Input Shape Parameters)
    • The input shape parameter (such as --input_shape) specifies the dimensional information of the model's input data. This is crucial because different models have strict requirements on input shape; a convolutional neural network may, for instance, require a 224x224-pixel, 3-channel input image. During conversion, this information must be passed accurately to the conversion tool through the --input_shape parameter. If it is configured incorrectly, the converted model may be unable to process the input data correctly, leading to wrong inference results or program crashes.
    • Suppose we have a PyTorch-based object-detection model whose input is an image. The --input_shape parameter must be set according to the model's actual input requirements. For example, if the model expects an input shape of (1, 3, 416, 416) (a batch size of 1, 3 channels, and a height and width of 416 pixels), then --input_shape should be set to that value in the conversion script so that the converted model can correctly receive and process input image data. The sketch after this list illustrates the relationship between these two parameters.
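
To make the layout and shape relationship concrete, here is a small, self-contained numpy sketch (illustrative values only, reusing the 416x416 example above) showing how NHWC and NCHW describe the same image data and why a mismatch between --input_format and --input_shape breaks inference.

```python
import numpy as np

# Illustrative only: a single 416x416 RGB image in NHWC layout
# (batch, height, width, channels), as produced by many image pipelines.
image_nhwc = np.zeros((1, 416, 416, 3), dtype=np.float32)

# If the converted model expects NCHW (batch, channels, height, width),
# the axes must be transposed before inference; mixing the two layouts up
# is exactly the kind of mismatch described above.
image_nchw = np.transpose(image_nhwc, (0, 3, 1, 2))

print(image_nhwc.shape)  # (1, 416, 416, 3) -> the NHWC form of the input shape
print(image_nchw.shape)  # (1, 3, 416, 416) -> matches the PyTorch example above
```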

(3) Common Problems and Solutions during the Conversion Process

  1. Model File Path Errors: Incorrect model file paths are a frequent problem during conversion, for example when the specified original model file does not exist or the path contains wrong characters. The solution is to carefully check the path, ensure that the file and folder names are spelled correctly and the file actually exists at that location, and make sure the path format matches the operating system (backslashes \ on Windows, forward slashes / on Linux).
  2. Incompatible Dependency-library Versions: The conversion tool may require a library version that differs from the one installed on the system, causing errors during conversion. The fix is to install the versions specified in the tool's documentation. Virtual environments (such as Python's venv or conda environments) can isolate the dependencies of different projects and avoid version conflicts. Also check the installation logs so that installation errors are discovered and resolved promptly.
  3. Parameter Configuration Errors: Common examples are wrong input or output format parameters, or input shape parameters that do not match the model. Read the conversion tool's documentation carefully, understand what each parameter means, and make sure the configuration matches the model's structure and data requirements. Refer to example code or tutorials to double-check and adjust the configuration; if you are unsure of a parameter's correct value, try the default or run simple tests to determine a suitable value. A small pre-flight check script is sketched below.
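
As a complement to these checks, here is a minimal Python sketch that verifies the model file path and the installed versions of key dependencies before launching a conversion. The path is a placeholder, and the required versions are assumptions (PyTorch 1.11 is the version mentioned earlier; the numpy version is purely illustrative).

```python
import os
import sys
from importlib import metadata

MODEL_PATH = "/path/to/expression_recognition.pb"   # hypothetical path

# 1. Catch the "model file path error" case before launching the converter.
if not os.path.isfile(MODEL_PATH):
    sys.exit(f"Model file not found: {MODEL_PATH}")

# 2. Catch obvious dependency-version mismatches early. The versions below are
#    placeholders; use the versions listed in the conversion tool's documentation.
REQUIRED = {"torch": "1.11", "numpy": "1.21"}
for pkg, wanted in REQUIRED.items():
    try:
        installed = metadata.version(pkg)
    except metadata.PackageNotFoundError:
        sys.exit(f"Missing dependency: {pkg} (expected {wanted}.x)")
    if not installed.startswith(wanted):
        print(f"Warning: {pkg} {installed} installed, docs expect {wanted}.x")
```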

III. Verification and Optimization of the Converted Model

(1) Explanation of Verification Methods

  1. Correctness Verification
    • Data Input-Output Verification: Input a set of known test data into the converted model and check whether the output is consistent with the expected results. For example, for an image-classification model, images with labeled categories can be input, and the classification results output by the model can be checked for correctness. If the output does not match the expected labels, there may be problems in the conversion process, such as a damaged model structure or parameter errors.
    • Model Structure Verification: Verify whether the structure of the converted model is consistent with the original model by viewing the structure information of the converted model. Some tools or methods can be used to visualize the model structure and check whether the layers, nodes, and connection relationships in the model are correctly converted. For example, after converting a TensorFlow model, tools like TensorBoard can be used to view the computational graph structure of the model to ensure that it matches the computational graph structure of the original model. For PyTorch models, the structure information of the model can be printed for verification.
  2. Performance Verification
    • Inference Speed Testing: On HarmonyOS Next devices, test the inference speed of the converted model. A timer can be used to measure the time the model needs to process a batch of data, and averaging multiple runs gives the average inference speed. Compare this with the original model (if possible) or other similar models to evaluate whether the converted model's performance meets the requirements. If inference is too slow, further optimization may be needed, or performance-limiting factors such as unreasonable computational-resource allocation or improper model parameter settings must be identified. A minimal timing sketch follows this list.
    • Resource Occupancy Evaluation: Monitor the model's resource usage at runtime, including memory occupancy and CPU/GPU usage. Use performance monitoring tools (such as the resource-monitoring tools built into HarmonyOS Next, or third-party performance monitoring software) to observe how much of the system's resources the model consumes while processing data. If the model occupies too much memory or CPU/GPU, it may affect other applications on the device; the model then needs to be optimized, for example by using model-compression techniques to reduce memory occupancy or by optimizing the allocation of computational tasks to improve resource utilization.
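
The timing idea above can be sketched in a framework-agnostic way. In the snippet below, run_inference is a hypothetical placeholder for whatever inference call the deployed HarmonyOS Next runtime actually exposes, and the warm-up and run counts are arbitrary defaults.

```python
import time
import numpy as np

def run_inference(model, batch):
    # Placeholder for the actual runtime-specific inference call;
    # substitute the real API used by your converted model.
    return model(batch)

def average_latency(model, batch, warmup: int = 5, runs: int = 50) -> float:
    # Warm-up iterations avoid counting one-off initialization costs.
    for _ in range(warmup):
        run_inference(model, batch)
    start = time.perf_counter()
    for _ in range(runs):
        run_inference(model, batch)
    return (time.perf_counter() - start) / runs

# Example usage with a dummy callable and a dummy 48x48 grayscale batch:
dummy_model = lambda x: x.mean()
dummy_batch = np.zeros((1, 48, 48, 1), dtype=np.float32)
print(f"Average inference time: {average_latency(dummy_model, dummy_batch) * 1000:.2f} ms")
```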

(2) Proposing Optimization Methods

  1. Application of Model Compression Technologies
    • Pruning Technology: Reduce the number of parameters and the computational complexity of the model by removing unimportant connections or neurons. In a convolutional neural network, for example, connections or neurons that contribute little to the output can be cut according to indicators such as neuron activity or weight magnitude. When applying pruning, choose an appropriate strategy and threshold to avoid over-pruning, which can degrade model performance; the model can be fine-tuned after pruning to recover some of the lost accuracy.
    • Quantization Technology: Convert the model's parameters from high-precision data types (such as 32-bit floating-point numbers) to low-precision data types (such as 8-bit integers), reducing the model's storage size and computational cost. Choose suitable quantization methods (uniform or non-uniform) and parameters (quantization range, bit width, etc.) according to the model's data distribution, and pay attention to the impact on accuracy; techniques such as quantization-aware training can reduce the accuracy loss. A hedged PyTorch sketch of these two techniques follows this list.
  2. Optimization and Adjustment of Model Structure
    • Layer Fusion: Consecutive layers (such as a convolutional layer and the activation layer that follows it) can be fused into a single layer to reduce the storage and computational overhead of intermediate results. For example, fusing a convolutional layer with a ReLU activation into one new layer applies the ReLU directly during the convolution computation, improving computational efficiency.
    • Model Simplification: Appropriately simplify the model according to the requirements of the application. For example, reduce the number of layers or neurons in the model. If the application scenario does not require extremely high model accuracy, some complex structures can be removed to reduce the computational complexity of the model and improve the running speed of the model. However, when simplifying the model, sufficient testing and evaluation should be carried out to ensure that the model still meets the basic requirements of the application after simplification.
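
Below is a minimal sketch, assuming PyTorch on the training side, of unstructured L1 pruning plus post-training dynamic quantization on a toy stand-in network. The architecture, the 30% pruning ratio, and the choice of dynamic quantization are illustrative assumptions, not a prescription of the HarmonyOS Next toolchain.

```python
import torch
import torch.nn as nn
from torch.nn.utils import prune

# A small stand-in model; in practice this would be the trained network
# before it is handed to the conversion tool.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(16 * 48 * 48, 7),   # 7 output classes, e.g. expression categories
)

# Unstructured L1 pruning: zero out the 30% smallest-magnitude weights of the
# convolution. The ratio is illustrative; tune it and fine-tune the model afterwards.
prune.l1_unstructured(model[0], name="weight", amount=0.3)
prune.remove(model[0], "weight")   # make the pruning permanent

# Post-training dynamic quantization of the Linear layer to 8-bit integers,
# reducing model size and compute at some cost in accuracy.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
print(quantized)
```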

(3) Demonstration of a Complete Case

  1. Case Background and Selection of the Original Model: Suppose we want to develop a smart photo-album application based on HarmonyOS Next. The application needs a model that can recognize the facial expressions (such as happy, sad, angry, etc.) of people in photos. We have selected a facial-expression recognition model trained in the TensorFlow framework. The model performs well in terms of accuracy, but it is large and computationally heavy, so it is not well suited to running directly on HarmonyOS Next devices.
  2. Model Conversion Process
    • First, set up the model conversion environment and install the required dependent libraries (such as the Java runtime environment, protobuf library, etc.) according to the TensorFlow model conversion steps mentioned above.
    • Then, prepare the original TensorFlow facial-expression recognition model file (assumed to be expression_recognition.pb) and the calibration dataset. The calibration dataset is used to analyze the distribution of model parameters during quantization; we selected some representative facial-expression photos as calibration data.
    • Run the OMG offline model conversion tool and configure the conversion parameters: set the running mode to 0 (no-training mode); the deep-learning framework type to 3 (TensorFlow); the original model file path to expression_recognition.pb; the calibration-method quantization configuration file path (assumed to be calibration_config.prototxt) according to the calibration dataset and quantization requirements; the absolute path of the output model file to harmonyos_expression_recognition.pb; the input format to NHWC, matching the model's image input; and the input shape to (1, 48, 48, 1), per the model's requirements (a batch size of 1, an image height and width of 48 pixels, and 1 channel because the images are grayscale).
    • After running the conversion tool, wait for the conversion process to complete. During the conversion process, carefully observe the log information output on the console to ensure that no errors occur. If an error occurs, check whether the parameter configuration, file path, etc. are correct according to the error message and make corresponding adjustments.
  3. Verification and Optimization of the Converted Model
    • Correctness Verification: Input a set of test facial-expression photos with known labels into the converted model and check whether the recognized expression categories match the expected results, as described in the verification methods above; a hedged verification sketch follows.
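
As one way to organize such a check, here is a minimal, hypothetical sketch. The label list and the run_model helper are placeholders for the actual expression categories and the real inference call of the deployed HarmonyOS Next model.

```python
import numpy as np

# Hypothetical label set; replace with the categories the model was trained on.
LABELS = ["happy", "sad", "angry", "surprised", "neutral", "fear", "disgust"]

def run_model(image: np.ndarray) -> np.ndarray:
    # Placeholder: should return a probability vector over LABELS
    # from the converted model's runtime.
    raise NotImplementedError("call the converted model here")

def verify(samples: list) -> None:
    """samples: list of (48x48x1 float32 image, expected label string) pairs."""
    correct = 0
    for image, expected in samples:
        probs = run_model(image[np.newaxis, ...])   # add batch dimension -> (1, 48, 48, 1)
        predicted = LABELS[int(np.argmax(probs))]
        correct += int(predicted == expected)
    print(f"Accuracy on labeled test photos: {correct}/{len(samples)}")
```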
