I had never used the YOLO network before, so I tested it on the NFL dataset.
Useful material
GitHub page and the official docs
Notebook
In the following sections I will post the most important parts of the notebook, some considerations, and the results.
How to train the network on a custom dataset
The most important thing when using YOLOv5 (for training on a custom dataset) is to understand how to set up the folder structure.
Here we have:
- parent_folder: the name says it all
- images: contains the images for the train, valid and test sets (the test set is not present in this case)
- labels: same structure as the images folder, but for the label txt files
- yolov5: the folder where we clone the GitHub repository
- data: already present inside the repo; here we put our custom yaml file (I called it data.yaml) for the custom training
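To make this concrete, here is a minimal sketch of the layout I mean (the valid folders and the exact file names are just examples):
parent_folder/
├── images/
│   ├── train/
│   │   ├── img0.jpg
│   │   └── img1.jpg
│   └── valid/
├── labels/
│   ├── train/
│   │   ├── img0.txt
│   │   └── img1.txt
│   └── valid/
└── yolov5/
    └── data/
        └── data.yaml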
Classes
In this example we have five different labels, namely
classes = image_labels['label'].unique().tolist()
# classes -> ['Helmet', 'Helmet-Blurred', 'Helmet-Difficult', 'Helmet-Sideline', 'Helmet-Partial']
Note that they have to be 0-indexed, so later, when we create the label txt files, Helmet becomes 0, Helmet-Blurred becomes 1, and so on…
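As a minimal sketch, assuming the classes list from the snippet above, the mapping can be built directly with enumerate:
class_to_id = {name: idx for idx, name in enumerate(classes)}
# class_to_id -> {'Helmet': 0, 'Helmet-Blurred': 1, 'Helmet-Difficult': 2, 'Helmet-Sideline': 3, 'Helmet-Partial': 4}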
How to map the images to the labels
As you can see from the image above (for simplicity I reported only two images), in the train folder we have img0.jpg and img1.jpg, and the corresponding label text files live in the labels folder.
What is the content of img0.txt?
Each file should contain one row for each object in the image with the following information:
class x_center y_center width height
See more info https://docs.ultralytics.com/tutorials/train-custom-datasets/#2-create-labels
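Note that the coordinates are normalized to [0, 1] by the image width and height. Here is a minimal sketch of how a row can be built, assuming the raw annotations give pixel-space boxes as (left, top, width, height) and a 1280x720 frame size (both are assumptions about the dataset):
IMG_W, IMG_H = 1280, 720  # assumed frame size, adjust to your images

def to_yolo_row(class_id, left, top, box_w, box_h, img_w=IMG_W, img_h=IMG_H):
    # YOLO format: class x_center y_center width height, all normalized to [0, 1]
    x_center = (left + box_w / 2) / img_w
    y_center = (top + box_h / 2) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {box_w / img_w:.6f} {box_h / img_h:.6f}"

# one txt file per image, one row per object (hypothetical box values)
with open('labels/train/img0.txt', 'w') as f:
    f.write(to_yolo_row(0, 100, 50, 30, 40) + "\n")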
Custom yaml file
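The yaml file tells YOLOv5 where the train and valid images are, the number of classes, and their names. A minimal sketch of how it can be written from the notebook, assuming the folder layout above (the paths are relative to the yolov5 directory, where train.py is launched):
import yaml

data = dict(
    train='../images/train',   # training images
    val='../images/valid',     # validation images
    nc=5,                      # number of classes
    names=['Helmet', 'Helmet-Blurred', 'Helmet-Difficult',
           'Helmet-Sideline', 'Helmet-Partial'],
)

with open('yolov5/data/data.yaml', 'w') as f:
    yaml.safe_dump(data, f, sort_keys=False)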
Wandb
YOLOv5 is integrated with Wandb; take a look here.
import wandb
from kaggle_secrets import UserSecretsClient

user_secrets = UserSecretsClient()
secret_value = user_secrets.get_secret("wandb-login")
wandb.login(key=secret_value)
Train
I trained the yolov5s model (the smallest one).
train.py is inside the yolov5 directory.
For the model selection refer here. The --project option is for Wandb, --img for the image size.
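For reference, a training call could look like the following (the batch size, image size and project name are my guesses; the weights and the number of epochs match what is described in this post; run it from inside the yolov5 folder):
!python train.py --img 640 --batch 16 --epochs 10 --data data.yaml --weights yolov5s.pt --project nfl-helmet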
Inference
This is the flow I used to generate the video (with the predictions) that you can see in the thumbnail; a sketch of the whole pipeline is shown after the steps below.
Get the bounding box coordinates
results now contains the detections: one row per predicted box, with its coordinates, confidence and class.
Let's iterate over them and plot the rectangles.
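A minimal sketch of the whole pipeline, assuming hypothetical paths for the trained weights and the input video (the model is loaded through torch.hub and the drawing is done with OpenCV):
import cv2
import torch

# assumed path to the fine-tuned weights
weights_path = 'yolov5/runs/train/exp/weights/best.pt'
model = torch.hub.load('ultralytics/yolov5', 'custom', path=weights_path)

cap = cv2.VideoCapture('input_video.mp4')   # hypothetical input clip
fps = cap.get(cv2.CAP_PROP_FPS)
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
out = cv2.VideoWriter('predictions.mp4', cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))

while True:
    ret, frame = cap.read()
    if not ret:
        break
    # the model expects RGB, OpenCV reads BGR
    results = model(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    # one row per box: xmin, ymin, xmax, ymax, confidence, class, name
    for _, det in results.pandas().xyxy[0].iterrows():
        x1, y1, x2, y2 = int(det['xmin']), int(det['ymin']), int(det['xmax']), int(det['ymax'])
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
    out.write(frame)

cap.release()
out.release()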
Result
The result is somewhat expected. If we order the labels by difficulty we obtain
['Helmet-Difficult', 'Helmet-Partial', 'Helmet-Blurred', 'Helmet-Sideline', 'Helmet']
This is reflected in the F1-score vs. confidence plot: the ordering of the classes is the same.
Can we improve the results with the yolov5 medium network? Yes, but it takes a lot of time to train on Kaggle. The models were trained for only 10 epochs.
Comparison
We should try to train the network for longer, changing some of its hyperparameters.