During the last Netguru team
For some time now I’ve been interested in machine learning and I thought of implementing this myself. To solve this problem I’ve used
Object Detection API SSD MultiBox model using
- The first step is to install all the necessary dependencies and clone the Object Detection API repository, which is available at https://github.com/tensorflow/models.
- Before we can start training the model, we need input data for training and evaluation in the format accepted by the Object Detection API: TFRecord. Additionally, we should specify a label map, which maps each class ID to a class name. Labels should be identical for the training and evaluation datasets. To accomplish this step I’ve used this script, which fetches a dataset of human hand pictures from http://www.robots.ox.ac.uk/~vgg/data/hands/index.html and creates the necessary files. Output of this script: hands_train.record, hands_val.record, hands_test.record and hands_label_map.pbtxt.
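To make the label map less abstract, here is a minimal sketch of how such a file could be generated. The helper name `build_label_map` and the class name `hand` are assumptions for illustration; the exact contents of the hands_label_map.pbtxt produced by the script above are not shown in this post.

```python
def build_label_map(class_names):
    """Return label map text: one `item` entry per class name."""
    items = []
    # IDs start at 1 because 0 is reserved for the background class.
    for class_id, name in enumerate(class_names, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (class_id, name))
    return "\n".join(items)


if __name__ == "__main__":
    # Hypothetical single-class map for hand detection.
    print(build_label_map(["hand"]))
```

With a single class the result is one `item` entry whose `id` is 1, which is the ID the detector reports for every detected hand.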
- When the input files are ready, we can start configuring our model.
The Object Detection API uses protobuf files to configure training and evaluation; more information about the configuration pipeline can be found here. The Object Detection API provides several sample configurations. They are a good starting point: with minimal effort you get a working configuration. As I wrote at the beginning of this post, I’ve used ssd_mobilenet_v1_coco.config. I’ve changed the following parameters:
- num_classes to 1, because I wanted to detect only one type of object: a hand.
- num_steps to 15000 because running locally can take forever :D
- fine_tune_checkpoint to the location of the previously downloaded model checkpoint, ssd_mobilenet_v1_coco_2017_11_17/model.ckpt.
- input_path and label_map_path of train_input_reader and eval_input_reader to the paths of the previously generated hands_train.record, hands_test.record and hands_label_map.pbtxt files.
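The scalar changes listed above can also be applied with a small script instead of by hand. The following is only a sketch: `set_field` is a hypothetical helper, `SAMPLE_CONFIG` is an abbreviated stand-in for the real sample file, and the text-based replacement is an assumption of mine rather than anything the Object Detection API provides. It deliberately leaves input_path and label_map_path alone, because those appear once per reader with different values and are easier to edit manually.

```python
import re

# Abbreviated stand-in for ssd_mobilenet_v1_coco.config (assumption, not the full file).
SAMPLE_CONFIG = """model {
  ssd {
    num_classes: 90
  }
}
train_config {
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
  num_steps: 200000
}
"""


def set_field(config_text, field, value):
    """Replace the value of every `field: <old>` entry with `value`."""
    # Matches either a quoted string or a bare token after the colon.
    pattern = re.compile(rf'(\b{re.escape(field)}\s*:\s*)("[^"]*"|\S+)')
    new_text, count = pattern.subn(lambda m: m.group(1) + value, config_text)
    if count == 0:
        raise KeyError("field not found in config: " + field)
    return new_text


if __name__ == "__main__":
    text = set_field(SAMPLE_CONFIG, "num_classes", "1")
    text = set_field(text, "num_steps", "15000")
    # String values need their own quotes in the protobuf text format.
    text = set_field(
        text,
        "fine_tune_checkpoint",
        '"ssd_mobilenet_v1_coco_2017_11_17/model.ckpt"',
    )
    print(text)
```

Raising an error for a missing field is a deliberate choice: a typo in a field name would otherwise leave the sample value in place without any warning.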
- I’ve trained the model on my local machine. To do this, I’ve used a script from the library:
python object_detection/train.py \
- The library also provides a script to evaluate the model during and after training:
python object_detection/eval.py \
- After the work is done we can freeze our trained model using the following script:
python object_detection/export_inference_graph.py \
--input_type image_tensor \
--pipeline_config_path object_detection/ssd_mobilenet_v1_hands.config \
--trained_checkpoint_prefix object_detection/training/model.ckpt-15000 \
--output_directory object_detection/frozen/
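The checkpoint prefix passed to the export script has to match a checkpoint that actually exists in the training directory. As a rough sketch, the latest prefix can be picked from the file names; `latest_checkpoint_prefix` is a hypothetical helper, and it assumes the usual TensorFlow 1.x naming in which each checkpoint writes a `model.ckpt-<step>.index` file.

```python
import re

# A checkpoint at step N is assumed to leave a file named model.ckpt-N.index.
_CKPT_INDEX = re.compile(r"^(model\.ckpt-(\d+))\.index$")


def latest_checkpoint_prefix(filenames):
    """Return the prefix with the highest step, or None if no checkpoint is found."""
    best = None  # (step, prefix)
    for name in filenames:
        match = _CKPT_INDEX.match(name)
        if match is None:
            continue
        step = int(match.group(2))
        if best is None or step > best[0]:
            best = (step, match.group(1))
    return None if best is None else best[1]


if __name__ == "__main__":
    import os

    # Directory name taken from the export command above.
    training_dir = "object_detection/training"
    if os.path.isdir(training_dir):
        print(latest_checkpoint_prefix(os.listdir(training_dir)))
```

Comparing step numbers as integers matters here: a plain string sort would rank model.ckpt-9000 above model.ckpt-15000.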
- Now it’s time to use our model to detect a hand in a mobile device’s camera preview. To implement this quickly, I’ve used a demo project from the TensorFlow repo. I’ve cloned it and imported the project from the examples/android directory.
The project can be built using different build systems: bazel, cmake, makefile, none. I’ve built the project using cmake; to do this, I’ve set the nativeBuildSystem variable in build.gradle to cmake (I had problems with the other build systems).
- In order to use the frozen model and labels, we need to put them in the assets directory, then assign the asset names to the TF_OD_API_MODEL_FILE and TF_OD_API_LABELS_FILE variables in the DetectorActivity class.
Additionally, I’ve changed the camera preview to use the front camera and implemented a toast message that pops up when a hand is detected.
- Results of my experiment:
Thanks to the