I was reading Practical Deep Learning for Cloud, Mobile, and Edge and I wanted to try a simple reverse image search on Caltech 256 with TensorFlow on Kaggle.
Idea
The idea is to take a pre-trained model (ResNet50) and remove the fully connected classification layer. The network's output then becomes the 2048-dimensional vector produced by the last convolutional stage (after global average pooling). This vector is called a feature vector or embedding.
Similar images also have similar embeddings. There are several methods to search by this similarity: here we use k-NN and, moreover, we compare its search speed with Annoy.
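A minimal sketch of this setup, assuming TensorFlow 2.x with Keras: `include_top=False` drops the fully connected classifier and `pooling="avg"` collapses the final feature map into a single vector.

```python
import tensorflow as tf

# ResNet50 pre-trained on ImageNet, without the fully connected top;
# global average pooling turns the last 7x7x2048 feature map into one
# 2048-dimensional embedding per image.
model = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, pooling="avg"
)
print(model.output_shape)  # (None, 2048)
```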
Dataset Preparation & model
I’ve used Kaggle with this dataset. The dataset is under ../input/caltech256/256_ObjectCategories
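Collecting the image paths can be sketched like this (the Kaggle mount path is as above; the helper name `list_images` is mine):

```python
from pathlib import Path

def list_images(root):
    """Return a sorted list of all .jpg files under root.

    Caltech 256 stores images in one sub-folder per category,
    e.g. 001.ak47/001_0001.jpg, so we search recursively."""
    return sorted(Path(root).rglob("*.jpg"))

# On a Kaggle kernel the dataset is mounted read-only here:
image_paths = list_images("../input/caltech256/256_ObjectCategories")
```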
Create the embedding for each image of the dataset
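The embedding step can be sketched as follows, assuming `model` is the headless ResNet50 described above; the function name `embed_images` and the batch size are my choices, not the book's.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.resnet50 import preprocess_input

def embed_images(paths, model, batch_size=64):
    """Return an (n_images, 2048) array of L2-normalised embeddings.

    `model` is a ResNet50 built with include_top=False, pooling="avg".
    """
    feats = []
    for i in range(0, len(paths), batch_size):
        batch = np.stack([
            tf.keras.utils.img_to_array(
                tf.keras.utils.load_img(p, target_size=(224, 224)))
            for p in paths[i:i + batch_size]
        ])
        feats.append(model.predict(preprocess_input(batch), verbose=0))
    feats = np.concatenate(feats)
    # L2-normalise so Euclidean distance behaves like cosine similarity.
    return feats / np.linalg.norm(feats, axis=1, keepdims=True)
```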
KNN and result
Here we select a specific image (IMG_INDEX) and set the number of neighbors. For this particular image, we retrieve the 6 nearest neighbors and plot the result.
Moreover, with %timeit, we measure how long the neighbor search takes:
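A minimal k-NN sketch using scikit-learn's brute-force `NearestNeighbors`; random vectors stand in for the real embedding matrix, and `IMG_INDEX` is an arbitrary query.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
embeddings = rng.standard_normal((1000, 2048)).astype("float32")  # stand-in

# Brute-force search over all embeddings, Euclidean distance.
knn = NearestNeighbors(n_neighbors=6, algorithm="brute").fit(embeddings)

IMG_INDEX = 42
distances, indices = knn.kneighbors(embeddings[IMG_INDEX:IMG_INDEX + 1])
# The query image comes back as its own nearest neighbour (distance 0),
# so the 5 remaining indices are the images to plot.
```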
237 ms ± 1.82 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Here is the result.
Time search improvement
Let’s try the Annoy library to see if we can reduce the search time.
With Annoy, the time for the search is
627 µs ± 34.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
A great improvement!
How to improve
- PCA on the embeddings
- a different model
- fine-tune the model
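The PCA idea can be sketched with scikit-learn; the target dimension of 100 is an assumption, and random vectors again stand in for the real embeddings.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
embeddings = rng.standard_normal((1000, 2048)).astype("float32")  # stand-in

# Project the 2048-d embeddings down to 100 dimensions: shorter vectors
# make both exact k-NN and Annoy queries faster, usually with little
# loss in retrieval quality.
pca = PCA(n_components=100)
reduced = pca.fit_transform(embeddings)
print(reduced.shape)  # (1000, 100)
```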