Setting up the AI Development Environment and Using Tools in HarmonyOS Next
This article explores the technical details of setting up the AI development environment and using the related tools in Huawei's HarmonyOS Next system (up to API 12 at the time of writing), summarized from actual development practice. It is mainly intended as a vehicle for technical sharing and communication. There may be mistakes and omissions; colleagues are welcome to raise valuable opinions and questions so that we can make progress together. This article is original content, and any form of reprint must indicate the source and the original author.
I. Overview of the AI Development Environment in HarmonyOS Next
(1) Introduction to Hardware and Software Environment Requirements
- Hardware Environment
- Processor: For AI development in HarmonyOS Next, it is recommended to use a processor with good performance, such as a multi-core CPU. For example, Intel Core i5 and above series processors can provide sufficient computing power to support complex computational tasks during model training and inference. In the model training stage, especially for the training of deep learning models, a large number of matrix operations require powerful CPU computing power for acceleration. If the processor performance is insufficient, the model training time will increase significantly.
- Memory: Sufficient memory is the key to ensuring the smooth progress of the development process. Generally, at least 8GB of memory is required. For the development and training of large models, 16GB or more memory is more appropriate. During the model training process, data loading, storage of intermediate calculation results, etc., all require memory. If the memory is insufficient, the system may frequently use virtual memory, making the development process slow or even resulting in out-of-memory errors.
- Storage: A certain amount of storage space is required to store development tools, model files, datasets, etc. It is recommended to use a solid-state drive (SSD) because it has a faster read and write speed compared to traditional mechanical hard drives, which can accelerate data reading and storage and improve development efficiency. For example, when dealing with large-scale image datasets, the fast read and write speed of the storage can reduce data loading time and make the model training process smoother.
- Software Environment
- Operating System: HarmonyOS Next itself serves as the development platform, and its operating system version needs to be compatible with AI development tools and libraries. Ensure that the latest version of the HarmonyOS Next system is installed, and update system patches in a timely manner to obtain the best stability and performance support. At the same time, some other basic software may need to be installed, such as the Java Runtime Environment (JRE), because some development tools may be developed based on Java.
- Development Toolkit: Install the specific AI development toolkit for HarmonyOS Next, such as the HiAI Foundation Kit. These toolkits provide a series of APIs and functions, making it convenient for developers to develop, train, and deploy AI models. For example, the HiAI Foundation Kit includes algorithm libraries for model training, tools for model optimization, and interfaces related to hardware acceleration.
- Dependent Libraries and Frameworks: According to development requirements, some additional dependent libraries and frameworks may need to be installed. For example, in deep learning development, popular deep learning frameworks such as TensorFlow or PyTorch may need to be installed. These frameworks provide rich functions for model construction and training, which can greatly improve development efficiency. At the same time, some mathematical calculation libraries (such as numpy), image processing libraries (such as the HarmonyOS-compatible library of OpenCV), etc., may also need to be installed to assist in data processing and model development.
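As a quick sanity check before starting development, a short script can verify that the common Python-side dependencies mentioned above are importable and report their versions. This is a minimal sketch; the package list is an example and should be adjusted to your project, and HarmonyOS-side kits such as the HiAI Foundation Kit are installed through the platform's own tooling rather than pip.

```python
# check_env.py - report which optional AI dependencies are available.
# Minimal sketch: adjust the package list to match your own project.
import importlib

PACKAGES = ["numpy", "tensorflow", "torch", "cv2"]  # cv2 = OpenCV bindings

for name in PACKAGES:
    try:
        module = importlib.import_module(name)
        version = getattr(module, "__version__", "unknown")
        print(f"{name}: OK (version {version})")
    except ImportError:
        print(f"{name}: NOT INSTALLED")
```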
(2) Differences in Environment Requirements for Different Development Scenarios
- Model Training Scenario
- Computational Resource Requirements: Model training requires a large amount of computational resources, especially when dealing with large-scale datasets and complex model structures. In addition to the high-performance processor and sufficient memory mentioned earlier, for deep learning model training, if possible, using a GPU or NPU for acceleration will significantly improve training efficiency. GPUs have powerful parallel computing capabilities and can accelerate matrix operations in neural networks. For example, when training a deep convolutional neural network model, using a GPU can reduce the training time from several days to several hours or even shorter.
- Data Storage and Processing Capabilities: The storage and processing of training data are important aspects of the model training scenario. Sufficient storage space is required to store a large amount of training data, and the ability to read and preprocess data quickly. For example, in an image recognition project, tens of thousands or even millions of image data may need to be stored, so a large-capacity hard drive is required. At the same time, efficient data processing capabilities are also important, including operations such as data cleaning, augmentation, and normalization. These operations need to be completed within a reasonable time to ensure the smooth progress of the training process.
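To make the data-processing point above concrete, here is a minimal, framework-agnostic sketch of the normalization and simple augmentation steps mentioned in this scenario, using numpy only; real projects would typically use the data pipeline of their chosen framework.

```python
import numpy as np

def preprocess(images: np.ndarray) -> np.ndarray:
    """Normalize a batch of uint8 images (N, H, W, C) to float32 in [0, 1]."""
    return images.astype(np.float32) / 255.0

def augment(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply two cheap augmentations: random horizontal flip and brightness jitter."""
    if rng.random() < 0.5:
        image = image[:, ::-1, :]  # horizontal flip
    image = np.clip(image + rng.uniform(-0.1, 0.1), 0.0, 1.0)  # brightness jitter
    return image

# Example usage with a fake batch of 4 RGB images.
rng = np.random.default_rng(0)
batch = preprocess(rng.integers(0, 256, size=(4, 224, 224, 3), dtype=np.uint8))
batch = np.stack([augment(img, rng) for img in batch])
print(batch.shape, batch.dtype)  # (4, 224, 224, 3) float32
```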
- Model Deployment Scenario
- Device Compatibility: When deploying a model, the hardware and software environment of the target device needs to be considered. There are many types of HarmonyOS Next devices, and the hardware configurations (such as CPU performance, memory size, whether an NPU is supported) and operating system versions of different devices may vary. Therefore, during development it is necessary to ensure that the model runs normally on different types of HarmonyOS Next devices. For example, for low-end devices, the model needs to be optimized to run efficiently with limited resources; for high-end devices, their hardware advantages can be fully utilized to achieve more advanced functions. A minimal variant-selection sketch follows this list.
- Runtime Environment Support: Model deployment requires a corresponding runtime environment to load and execute the model. Ensure that the necessary runtime libraries and interpreters are installed on the target device. For example, for a quantized model, a specific quantization inference engine may be required to run it. At the same time, consider the performance and resource occupancy of the runtime environment to avoid affecting the overall performance of the device due to an overly large or inefficient runtime environment.
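As an illustration of the device-compatibility point, the sketch below selects a model variant based on device capabilities. All names here (the `DeviceProfile` structure and the model file names) are hypothetical, purely to show the decision logic; on a real HarmonyOS Next device you would query capabilities through the platform's own APIs.

```python
from dataclasses import dataclass

@dataclass
class DeviceProfile:
    # Hypothetical capability description of a target device.
    has_npu: bool
    memory_mb: int

def select_model_variant(profile: DeviceProfile) -> str:
    """Pick the most capable model variant the device can handle (illustrative)."""
    if profile.has_npu and profile.memory_mb >= 4096:
        return "model_fp16_npu.om"   # full-featured variant for high-end devices
    if profile.memory_mb >= 2048:
        return "model_int8.om"       # quantized variant for mid-range devices
    return "model_int8_pruned.om"    # pruned + quantized variant for low-end devices

print(select_model_variant(DeviceProfile(has_npu=False, memory_mb=1024)))
```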
(3) Comparison of the Impact of Different Environment Configurations on Development Efficiency and Application Performance
- Impact of Hardware Environment Configuration
- A high-performance hardware configuration (such as a high-end CPU, large-capacity memory, and fast storage) can significantly improve development efficiency. During the model development process, a fast processor can accelerate code compilation, model training, and debugging. For example, when compiling an AI project with a large amount of code, a high-performance CPU can reduce the compilation time, allowing developers to modify and test the code more quickly. A large-capacity memory can avoid system freezes and errors caused by insufficient memory, making the development process smoother. Fast storage (SSD) can speed up data reading and storage, especially when dealing with large-scale datasets, reducing data loading time and improving development efficiency.
- In terms of application performance, a good hardware configuration also helps to improve the training and inference speed of the model. For example, using a GPU or NPU for model training can greatly shorten the training time, enabling the model to converge to better performance more quickly. After the model is deployed to the device, high-performance hardware can make the model process input data more quickly, provide a more timely response, and enhance the user experience. However, high-end hardware configurations usually come at a higher cost, which may not be economically feasible for some small projects or developers with limited resources.
- Impact of Software Environment Configuration
- Choosing the right development toolkit and dependent libraries can greatly improve development efficiency. For example, using a powerful and easy-to-use AI development toolkit (such as the HiAI Foundation Kit) can provide rich APIs and tools, reducing the workload of developers. At the same time, dependent libraries and frameworks that are highly compatible with the HarmonyOS Next system (such as well-adapted versions of TensorFlow or PyTorch) can avoid compatibility issues and make the development process smoother. Reasonable software environment configuration can also improve the maintainability and scalability of the code, facilitating subsequent development and optimization work.
- In terms of application performance, an optimized software environment can make the model run more efficiently at runtime. For example, by configuring appropriate quantization tools and libraries, the model can be quantized, reducing the model size and improving computational efficiency, thus achieving faster inference speed on the device. In addition, an optimized software environment can make better use of hardware resources. For example, by reasonably setting the parameters of the library, the GPU or NPU can give full play to its acceleration effect and improve the overall performance of the model.
II. Introduction and Use of Key Development Tools
(1) Functions and Usage Methods of Model Conversion Tools
- OMG Offline Model Conversion Tool (Assuming Mentioned in the Document)
- Functions: This tool is mainly used to convert models trained in different deep learning frameworks (such as TensorFlow, PyTorch, etc.) into formats that can be recognized and run by HarmonyOS Next devices. It can optimize and adjust the model to make it adapt to the operating environment of HarmonyOS Next, and may also support functions such as model quantization to reduce the model size and improve computational efficiency.
- Usage Methods:
- Preparation Work: First, ensure that the dependent environment required by the OMG offline model conversion tool, such as the Java Runtime Environment (JRE), has been installed. Then, prepare the original model file to be converted (such as a TensorFlow pb model or a PyTorch pt model) and the calibration dataset (if quantization is required).
- Parameter Configuration: The tool is configured through command-line parameters. For example, the `--mode` parameter specifies the running mode (such as running mode 0 mentioned in the document, indicating the no-training mode; currently this may be the only supported mode); the `--framework` parameter selects the deep learning framework type, for example, 3 represents TensorFlow and 5 represents PyTorch or ONNX; the `--model` parameter specifies the path of the original model file; the `--cal_conf` parameter sets the path of the calibration quantization configuration file (if quantization is carried out), which contains key configuration information for the quantization process, such as the choice of quantization algorithm and the setting of the quantization range; the `--output` parameter specifies the absolute path of the converted (quantized) model file; the `--input_shape` parameter sets the shape of the input data according to the model's input requirements, ensuring that it matches the actual input node shape. A command-assembly sketch follows this list.
- Execution of Conversion: After configuring the parameters, run the tool to start the conversion. The tool analyzes the original model against the calibration dataset (if any), determines the quantization parameters (if quantization is carried out), converts the model's parameters into low-precision data types (if quantized), and generates the converted model file. During conversion, watch the log output on the console to promptly discover and resolve problems such as data format mismatches and path errors.
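To make the parameter list above concrete, here is a sketch that assembles and launches such a conversion command from Python. The tool name `omg` and the exact flag spellings follow the description above but should be verified against the tool's own help output; all paths and the input shape are placeholders.

```python
import subprocess

# Hypothetical invocation of the OMG offline model conversion tool.
# Flag names follow the description above; verify them against the tool's help.
cmd = [
    "omg",
    "--mode", "0",                             # 0 = no-training mode
    "--framework", "3",                        # 3 = TensorFlow, 5 = PyTorch/ONNX
    "--model", "/path/to/frozen_model.pb",     # original model file
    "--cal_conf", "/path/to/calibration.cfg",  # quantization config (optional)
    "--output", "/path/to/output/model",       # converted model output path
    "--input_shape", "input:1,224,224,3",      # must match the model's input node
]

result = subprocess.run(cmd, capture_output=True, text=True)
print(result.stdout)
if result.returncode != 0:
    # Inspect the log output for problems such as path errors or format mismatches.
    print("Conversion failed:\n", result.stderr)
```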
- Other Model Conversion Tools (If Other Tools Are Mentioned in the Document)
- [Introduce the functions and usage methods of other model conversion tools in a similar structure]
(2) Functions and Usage Demonstration of Quantization Tools
- Function Introduction: The main function of the quantization tool is to convert the high-precision data types (such as 32-bit floating-point numbers) in the model into low-precision data types (such as 8-bit integers), thereby reducing the storage size of the model, lowering the computational complexity, and to a certain extent, improving the computational efficiency of the model, making it more suitable for running on HarmonyOS Next devices with limited resources. At the same time, the quantization tool will try to maintain the performance of the model during the conversion process, and reduce the impact of quantization on the model accuracy through some technical means (such as quantization-aware training).
- Example of the Usage Process (Taking the Relevant Tool Mentioned in the Document as an Example)
Suppose we use a certain quantization tool mentioned in the document to quantize a TensorFlow model.
- Preparation Stage: Ensure that the dependencies required by the quantization tool have been installed, and have the TensorFlow model file (.pb format) to be quantized and the corresponding calibration dataset. The calibration dataset is used to analyze the distribution of model parameters to determine appropriate quantization parameters.
- Parameter Configuration: Set the relevant parameters according to the requirements of the quantization tool. For example, it may be necessary to specify the model file path, the calibration dataset path, the quantization output path, the quantization method (such as uniform quantization or non-uniform quantization), the quantization range, and other parameters. The setting of these parameters will directly affect the performance and size of the quantized model.
- Execution of Quantization: Run the quantization tool. The tool reads the model file and the calibration dataset according to the configured parameters and quantizes the model. During this process, the tool determines the quantization intervals and mapping relationships from the data distribution, converts the model's parameters into low-precision data types, and generates the quantized model file. For example, when quantizing an image classification model, the tool analyzes the parameter distributions observed as the calibration images pass through each layer, determines an appropriate quantization range, and converts parameters represented as 32-bit floating-point numbers into 8-bit integers. This reduces the storage requirement by roughly a factor of four, and on some hardware platforms computational efficiency may improve significantly, because low-precision calculations are usually faster than high-precision ones.
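The document's quantization tool is not named, so as an illustrative stand-in the sketch below performs post-training int8 quantization of a TensorFlow model with the TensorFlow Lite converter, which follows the same calibrate-then-quantize flow described above. The model path is a placeholder, and the random representative dataset stands in for real calibration data.

```python
import numpy as np
import tensorflow as tf

# Illustrative stand-in: post-training int8 quantization with TensorFlow Lite.
converter = tf.lite.TFLiteConverter.from_saved_model("/path/to/saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]

def representative_dataset():
    # Calibration data: should mirror real inputs; random data is a placeholder.
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8    # quantize the input tensor too
converter.inference_output_type = tf.int8   # and the output tensor

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
```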
(3) Common Problems and Solutions in Tool Usage
- Common Problems and Solutions of Model Conversion Tools
- Incorrect Model File Path: An incorrectly specified model file path is a frequent problem, for example typos in the path, a wrong file extension, or a file that does not exist at the specified location. The solution is to carefully check the model file path, ensure that the file and folder names are spelled correctly, and confirm that the file actually exists. You can inspect the path with a file manager, or use the `ls` (Linux) or `dir` (Windows) command to check whether the file exists; a pre-flight check sketch follows this list.
- Incompatible Framework Version: If the version of the deep learning framework is not compatible with the model conversion tool, the conversion may fail; for example, the tool may only support specific versions of TensorFlow or PyTorch. The solution is to check the tool's documentation, determine the supported framework versions, and install a matching version. If upgrading or downgrading the framework is not possible, look for another conversion tool or method that is compatible with the current framework version.
- Calibration Dataset Problem: When performing model quantization conversion, problems with the calibration dataset may lead to poor quantization results or conversion failure. For example, the number of samples in the calibration dataset is too small, the data distribution is too different from the actual application data, or the data format is incorrect. The solution is to ensure that the calibration dataset is sufficiently representative and contains various possible input situations. You can increase the number of samples in the calibration dataset, or use data with a distribution similar to the actual application data for calibration. At the same time, check whether the data format of the calibration dataset is consistent with the requirements of the tool, such as whether the size and number of channels of the image meet the requirements.
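A small pre-flight check like the following can catch the path problems described above before the conversion tool is even launched; the expected extensions and the example path are placeholders and should match whatever framework you are converting from.

```python
from pathlib import Path

def validate_model_path(path_str: str, allowed=(".pb", ".pt", ".onnx")) -> Path:
    """Fail fast on common path mistakes before running the conversion tool."""
    path = Path(path_str).expanduser().resolve()
    if not path.exists():
        raise FileNotFoundError(f"Model file does not exist: {path}")
    if path.suffix not in allowed:
        raise ValueError(f"Unexpected extension {path.suffix!r}; expected one of {allowed}")
    return path

try:
    model_path = validate_model_path("~/models/frozen_model.pb")
    print("Model file looks OK:", model_path)
except (FileNotFoundError, ValueError) as err:
    print("Check your --model argument:", err)
```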
- Common Problems and Solutions of Quantization Tools
- Improper Setting of Quantization Parameters: The setting of quantization parameters directly affects the performance and size of the quantized model. If the quantization range is set unreasonably, it may lead to data overflow or excessive accuracy loss. For example, if the quantization range is too small, larger parameter values cannot be represented accurately, hurting model accuracy. The solution is to set the quantization range according to the distribution of the model parameters; an appropriate range can be determined by analyzing the minimum and maximum parameter values and the data distribution histogram. Some quantization tools can also determine the quantization range automatically, and these functions are worth trying. A worked min/max example follows this list.
- Excessive Accuracy Loss: During the quantization process, the problem of excessive accuracy loss of the model may occur, resulting in a decline in the performance of the quantized model in actual applications. This may be caused by improper selection of the quantization method, insufficient calibration dataset, or the model structure being more sensitive to quantization. The solution is to first check whether the calibration dataset is rich enough and representative. If the dataset is insufficient, you can increase the dataset or use data augmentation techniques to expand the dataset. Then, try different quantization methods, such as uniform quantization and non-uniform quantization, and compare their impacts on the model accuracy. For some models with high accuracy requirements, techniques such as quantization-aware training can be considered to take the impact of quantization into account during the training process and reduce accuracy loss. In addition, if the model structure allows, the model can be appropriately adjusted, such as reducing the number of layers or parameters, to make the model more robust to quantization.
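The "quantization range" discussion above boils down to choosing a scale and zero point from the observed minimum and maximum of a tensor. The sketch below shows the standard asymmetric 8-bit mapping as a worked example; production tools refine this with histograms or percentile clipping, as noted above, and the random weight tensor is a stand-in.

```python
import numpy as np

def compute_qparams(values: np.ndarray, num_bits: int = 8):
    """Derive (scale, zero_point) for asymmetric quantization from observed min/max."""
    qmin, qmax = 0, 2**num_bits - 1
    vmin, vmax = float(values.min()), float(values.max())
    vmin, vmax = min(vmin, 0.0), max(vmax, 0.0)  # range must include zero
    scale = (vmax - vmin) / (qmax - qmin)
    zero_point = int(round(qmin - vmin / scale))
    return scale, zero_point

def quantize(values: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    q = np.round(values / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

weights = np.random.randn(1000).astype(np.float32)  # stand-in for a weight tensor
scale, zp = compute_qparams(weights)
q = quantize(weights, scale, zp)
dequantized = (q.astype(np.float32) - zp) * scale
print("max abs error:", np.abs(weights - dequantized).max())  # bounded by ~scale/2
```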
III. Optimization and Expansion of the Development Environment and Tools
(1) Optimization Methods of the Development Environment
- Configuration of Acceleration Libraries
- Configuration of GPU Acceleration Libraries: If the development environment includes a GPU, configuring the corresponding GPU acceleration library can significantly improve the speed of model training and inference. For example, for deep learning frameworks such as TensorFlow or PyTorch, GPU acceleration libraries such as CUDA and cuDNN can be installed (assuming the GPU supports the CUDA computing platform). During installation, ensure that the CUDA and cuDNN versions are compatible with the GPU driver version. Once configured, the deep learning framework will automatically detect and use the GPU for computing acceleration. For example, when training a deep neural network model, using the GPU acceleration library can shorten the training time by several times or even dozens of times, greatly improving development efficiency. A quick verification sketch follows this list.
- Configuration of NPU Acceleration Libraries (if applicable): For devices or development environments that support NPU, configuring the NPU acceleration library can further enhance performance. According to the documentation and relevant guidelines of HarmonyOS Next, install the acceleration library corresponding to the NPU and perform the correct configuration. In this way, during the model training and inference process, the NPU can accelerate specific computational tasks, such as convolution operations in neural networks. For example, in some image processing and machine learning applications, using the NPU acceleration library can significantly improve the processing speed of the model and reduce energy consumption at the same time.
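After installing the acceleration libraries, it is worth confirming that the framework actually sees the accelerator. Here is a minimal sketch, assuming PyTorch with a CUDA build; the NPU check is framework- and device-specific and is not shown.

```python
import torch

# Verify that the CUDA/cuDNN stack installed above is actually usable.
if torch.cuda.is_available():
    print("CUDA device:", torch.cuda.get_device_name(0))
    print("CUDA runtime version:", torch.version.cuda)
    print("cuDNN version:", torch.backends.cudnn.version())
else:
    print("No CUDA device visible; training will fall back to CPU.")
```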
- Optimization of System Parameters
- Adjustment of File System Parameters: Optimizing file system parameters can improve data read and write speeds, thereby accelerating data loading and storage during development. For example, on Linux systems, parameters such as the file system cache size and read-ahead block size can be adjusted. Increasing the file system cache size reduces the number of times data must be read from disk, improving data reading speed.