Terrence Aluda

Adding a TFLite model to your Android app

Hello there. Perhaps you are wondering how to straightforwardly add some Machine Learning to your app. The documentation currently focuses on on-device training, and you are likely to run into data type incompatibility errors when trying to run inferences. In this tutorial, we are going to see how to add an already-trained model to your app and get predictions from it. The model will predict the probability of heart attack disease. Let's dig in.

Prerequisites

To follow along, you need hands-on knowledge of the following:

  • Python Programming
  • Machine Learning. Basic deep-learning knowledge is a plus.
  • Kotlin programming
  • Android development

You will also need the following tools installed.

  • TensorFlow. We will use this for creating the model.
  • We will be using this Kaggle dataset. Download and extract it. Our interest will be the heart.csv file.
  • Pandas. This will be used to help in the easy manipulation of the dataset.
  • Scikit-learn. It will be for preprocessing the data.
  • You also need Android Studio and Jupyter Notebook installed. If you don't have Jupyter Notebook, you can use plain Python scripts, but this article will use Jupyter Notebook.

What we will be doing

To achieve the goals, we will follow these steps:

  1. Create a TensorFlow model.
  2. Convert it to TFLite.
  3. Create an Android app and install the dependencies needed.
  4. Import the converted TFLite model.
  5. Add the code to access the model and run the inferences.

Creating the TensorFlow model

Be sure to start your Jupyter Notebook server before this.

In your working folder, create a folder called datasets and add the heart.csv file you extracted earlier to it. When done, move back to the root of your folder.

In a new cell, import the modules we need using the code below (in the steps that follow, add each snippet in a new cell).

from sklearn.model_selection import train_test_split
import tensorflow as tf
import pandas as pd

These were discussed in the prerequisites section. We will be using the Keras (tf.keras) API provided by TensorFlow.

We then read the dataset using pandas with this code.

file_path = 'datasets/heart.csv'
heart_data = pd.read_csv(file_path)
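
Optionally, you can take a quick look at the data to confirm it loaded as expected; it should show the 13 feature columns plus the output column.

# Optional: preview the loaded data.
print(heart_data.shape)
print(heart_data.head())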

Create a pipeline that will impute missing values and then scale the features.

from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

proc_pipeline = Pipeline([
    ('imputer', SimpleImputer(strategy="median")),
    ('std_scaler', StandardScaler()),
])

Remove the output column so that only the features go through the pipeline.

heart_data_no_output = heart_data.drop("output", axis=1)

Run the pipeline.

heart_data_imputed = proc_pipeline.fit_transform(heart_data_no_output)

Split the data into training, validation, and test sets, using a random seed of 42 for reproducibility.

X_train_full, X_test, y_train_full, y_test = train_test_split(heart_data_imputed, heart_data.output, random_state=42)

X_train, X_valid, y_train, y_valid = train_test_split(X_train_full, y_train_full, random_state=42)

Create a class called Model.

class Model(tf.Module):
    def __init__(self):
        input = tf.keras.layers.Input(shape=X_train.shape[1:])
        hidden1 = tf.keras.layers.Dense(50, activation="sigmoid")(input)
        hidden2 = tf.keras.layers.Dense(50, activation="sigmoid")(hidden1)
        output = tf.keras.layers.Dense(1)(hidden2)
        self.model = tf.keras.models.Model(inputs=[input], outputs=[output])
        opt = tf.keras.optimizers.SGD(learning_rate=0.01)
        self.model.compile(loss="mse", optimizer=opt, metrics=['accuracy'])
        self.model.fit(X_train, y_train, epochs=20,
                       validation_data=(X_valid, y_valid))

    @tf.function(input_signature=[
        tf.TensorSpec([None, 13], tf.float32),
    ])
    def predictor(self, x):
        predictions = self.model(x)
        return {
            "prediction": predictions
        }

We create the model (a deep learning model) in the constructor. The model contains 4 layers:

  • One input layer (input) that takes the number of training features as the shape parameter. That is 13, equating to 13 neurons.
  • Two hidden layers (hidden1 and hidden2), each with 50 neurons and a sigmoid activation function, and each taking the previous layer as its input. This follows the Keras Functional API style. We used the sigmoid activation function since we are trying to get a probability.
  • The output layer (output) has only 1 neuron since we are outputting a single value, the probability.

We use a Stochastic Gradient Descent optimizer with a learning rate of 0.01 when compiling the model. Mean Squared Error (mse) is used as the loss, and accuracy as the performance metric.

In fitting the model, we use 20 epochs while passing in the training and validation data.

After the constructor, we have a TensorFlow function called predictor. We pass the shape of the input data and its data type in the function's signature.

    @tf.function(input_signature=[
        tf.TensorSpec([None, 13], tf.float32),
    ])

The input is passed in as the argument x. The self.model(x) call returns the model's prediction, which we return in a map under the key prediction so we can retrieve it later.

    def predictor(self, x):
        predictions = self.model(x)
        return {
            "prediction": predictions
        }

Converting the model to TFLite

We will first initialize the model.

tcardio_model = Model()
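
Before converting, you can optionally sanity-check the trained network on the held-out test set (a quick check that is not required for the conversion):

# Optional: evaluate the trained Keras model on the test split.
loss, accuracy = tcardio_model.model.evaluate(X_test, y_test)
print(f"Test loss: {loss:.4f}, test accuracy: {accuracy:.4f}")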

We then save it in a directory called saved_model. Here is where we pass in the signature of our TensorFlow function. We give it the signature key, predictor. This key will be used by our TFLite model to identify the function.

SAVED_MODEL_DIR = "saved_model"

tf.saved_model.save(
    tcardio_model,
    SAVED_MODEL_DIR,
    signatures={
        'predictor':
            tcardio_model.predictor.get_concrete_function(),
    })

The next code block is where we create the TFLite model.

import os

converter = tf.lite.TFLiteConverter.from_saved_model(SAVED_MODEL_DIR)

converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS, 
    tf.lite.OpsSet.SELECT_TF_OPS 
]

converter.experimental_enable_resource_variables = True
tflite_model = converter.convert()

model_file_path = os.path.join('tcardio_model.tflite')
with open(model_file_path, 'wb') as model_file:
    model_file.write(tflite_model)

It starts by importing the os module since we are accessing the file system.

The next line creates a TFLite converter from the saved model.

In the converter.target_spec.supported_ops list, we add options that enable the built-in TFLite operations as well as select TensorFlow operations.

Finally, we write the TFLite model to the root path.

Run all cells. If everything completes successfully, you will see a file called tcardio_model.tflite in your working directory.
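
Before moving to Android, you can verify the converted model right in the notebook using TFLite's Python interpreter and the predictor signature we exported. Here is a minimal sketch that feeds in the first preprocessed row of the dataset:

import numpy as np

# Load the converted model and look up the exported signature.
interpreter = tf.lite.Interpreter(model_path=model_file_path)
predictor_fn = interpreter.get_signature_runner("predictor")

# Run one preprocessed row through the model (shape [1, 13], float32).
sample = heart_data_imputed[0].reshape(1, 13).astype(np.float32)
print(predictor_fn(x=sample)["prediction"])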

Creating the Android app and installing the dependencies needed

Follow the usual process of creating an Android app; you can choose the Empty Activity option. We won't modify the UI in any way.

Open the module-level build.gradle file and add the TFLite dependencies. Take note of the versions: there is a newly released version that isn't compatible with what we are going to do next. Stay tuned for the updated article.

implementation 'org.tensorflow:tensorflow-lite:2.9.0'
implementation 'org.tensorflow:tensorflow-lite-gpu:2.10.0'
implementation 'org.tensorflow:tensorflow-lite-support:0.4.2'

Sync Gradle.

Import the converted TFLite model

After the build is done, create a new assets folder (under src/main). Then copy the generated TFLite model into that directory.
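
One thing worth noting: FileUtil.loadMappedFile memory-maps the model from the assets, which requires the file to be stored uncompressed in the APK. Depending on your Android Gradle Plugin version, you may need to tell Gradle not to compress .tflite files; a typical addition to the module-level build.gradle looks like this.

android {
    // Keep .tflite assets uncompressed so they can be memory-mapped at runtime.
    aaptOptions {
        noCompress "tflite"
    }
}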

In the MainActivity class, we will add the code necessary for accessing the model and running the inferences. This is done in the next step.

Add the code to access the model and run the inferences

Open the MainActivity.kt file. In the file, create a companion object and add the model's path.

    companion object {
        private const val MODEL_PATH = "tcardio_model.tflite"
    }

Then declare a class-level variable for the TFLite interpreter using lazy delegation. We pass the TFLite file (as a memory-mapped ByteBuffer) to the Interpreter constructor.

    private val tflite by lazy {
        // Requires imports: org.tensorflow.lite.Interpreter and
        // org.tensorflow.lite.support.common.FileUtil
        Interpreter(FileUtil.loadMappedFile(this, MODEL_PATH))
    }


Due to compatibility issues between Kotlin's (Java's) Float and TensorFlow's float32 data type, we need a helper function to convert Kotlin Floats to float buffers before sending them to the TFLite model. We will use the method below.

    fun floatArrayToBuffer(floatArray: FloatArray): FloatBuffer? {
        val byteBuffer: ByteBuffer = ByteBuffer
            .allocateDirect(floatArray.size * 4)

        byteBuffer.order(ByteOrder.nativeOrder())

        val floatBuffer: FloatBuffer = byteBuffer.asFloatBuffer()

        floatBuffer.put(floatArray) 
        floatBuffer.position(0)
        return floatBuffer
    }

It first allocates a direct byte buffer of floatArray.size * 4 bytes, since each Java Float is 4 bytes in size.

NOTE: We refer to Java because Kotlin runs on the JVM and still uses some of Java's classes.

We then set the buffer's byte order to the native order of the underlying platform and use the asFloatBuffer() method to view the byte buffer as a float buffer. The method finishes by copying the passed-in float array into the float buffer and setting the buffer's position back to 0.

For performing the prediction, we will need a method called doInference().

    fun doInference(): Map<String, FloatBuffer?> {
        // The first row of the dataset (13 features).
        val input = floatArrayOf(63.0F, 1.0F, 3.0F, 145.0F, 233.0F, 1.0F, 0.0F, 150.0F, 0.0F, 2.3F, 0.0F, 0.0F, 1.0F)
        val inF = floatArrayToBuffer(input)

        // A one-element buffer for TFLite to write the prediction into.
        val outs = floatArrayOf(0.0F)
        val outF = floatArrayToBuffer(outs)

        val inputs: Map<String, FloatBuffer?> = mapOf("x" to inF)
        val outputs: Map<String, FloatBuffer?> = mutableMapOf("prediction" to outF)

        tflite.runSignature(inputs, outputs, "predictor")
        return outputs
    }

The method passes our input array to the floatArrayToBuffer method we have just finished creating. We do the same for the output variable; we need this because it is what TFLite will write the output into.

The maps match the inputs and the outputs to what was set in the TensorFlow model's function. Remember, we had the input as x and the output as prediction.

We use the Interpreter's runSignature method to get the prediction. The inputs, the outputs, and the signature key of the TensorFlow function are passed in as parameters.

To get the output, we add the following statements to the onCreate method.

        try {
            val pred = doInference()

            val cb = pred.get("prediction")

            Log.i("PREDICTION: ", cb?.get(0).toString())
        } catch (e: Exception) {
            Log.e("ERR: ", e.toString())
        }

We use a try-catch block to catch any exception that may occur. Since we said the TensorFlow function returns a map, we use the Map.get() method, passing in the key to retrieve the corresponding output. In this case, it is a float buffer. The prediction value is returned as the first value of the float buffer, so we read it with get(0), passing in the position.

On checking the Logcat, you will see an output like this.

PREDICTION:             com.bigdolphin.predtest    I  0.78253055

That means the model gave a 78.25% chance of a heart attack. Since the model was trained on 0/1 labels, scores above 0.5 are read as 1 and below as 0. The input data is from the first row of the dataset, whose output is 1.
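
If you want to turn the raw score into a readable label in the app, a simple 0.5 threshold works. This is a small illustrative addition (the label strings here are made up), continuing inside the try block:

            val score = cb?.get(0) ?: 0.0F
            val label = if (score > 0.5F) "High heart-attack risk" else "Low heart-attack risk"
            Log.i("RESULT: ", "$label ($score)")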

Conclusion

That is it. We have seen how to use a TFLite model in our apps. We did so by first creating a model from scratch, converting it, and then feeding it to the Android platform, where we ran an inference. I hope you gained some insights. In case of any questions, inaccuracies, or contributions, please feel free to leave a comment or email me at contact@terrence-aluda.com.

Have a good one.
