1. Overview
I tested training the YOLOv8 object detection network on the converted BIRDS 525 dataset described on this page, using the free version of Google Colab.
The free version of Google Drive is limited to 15 GB of storage, but the BIRDS 525 dataset is only 1.96 GB in its zipped form, so this operation can be run on the free version.
Training for 100 epochs with yolov8n, which has a relatively small network size, completed in about 8 minutes and 30 seconds.
2. Upload a compressed file of the BIRDS 525 dataset to Google Drive
2.1. Download BIRDS 525 SPECIES – IMAGE CLASSIFICATION dataset
Download the BIRDS 525 SPECIES – IMAGE CLASSIFICATION dataset from this page.
2.2. Upload archive.zip to Google Drive
Upload the archive.zip downloaded in 2.1. above to Google Drive.
3. Running commands on Google Colab
3.1. Mounting Google Drive
Write the following Python script in a code cell and run it. Google Drive will be mounted at /content/drive.
from google.colab import drive
drive.mount('/content/drive')
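To confirm that the mount succeeded, you can list the contents of your drive. This is just a quick check and is not required for the remaining steps.

!ls /content/drive/MyDrive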
3.2. Extracting the BIRDS 525 compressed file
Write the following in a code cell and execute it. The archive.zip file will be copied from Google Drive to Google Colab and extracted. Extracting the zip file took about 1 minute and 30 seconds.
%%bash
mkdir -p kaggle/birds525
cd kaggle/birds525
cp /content/drive/MyDrive/kaggle/birds525/archive.zip .
unzip archive.zip
cd /content
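After extraction, you can check that the dataset is in place. Assuming the archive has the usual BIRDS 525 layout, the listing should include the train, valid and test directories.

!ls /content/kaggle/birds525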
I have also tried extracting the files on Google Drive by writing the following script in a code cell. With this approach, the extracted files remain even if the Google Colab connection is lost. In this case, however, it took about 20 minutes to extract the zip file.
If the data is kept on Google Drive and referenced from Google Colab, training the neural network also seems to take a long time. Even though the data has to be copied and extracted again after reconnecting, it is better to copy it to Google Colab before training.
%%bash
cd /content/drive/MyDrive/kaggle/birds525/
unzip archive.zip
3.3. Installing Ultralytics YOLO
Write the following in a code cell and execute it.
%pip install ultralytics
import ultralytics
ultralytics.checks()
3.4. Creating datasets in Ultralytics YOLO format
I have prepared a Python script on this GitHub page that creates an Ultralytics YOLO format dataset from the BIRDS 525 dataset. Write the following in a code cell and execute it to clone the repository.
!git clone https://github.com/fukagai-takuya/birds525yolo.git
Next, write the following in a code cell and execute it. The script below creates a dataset in Ultralytics YOLO format from the BIRDS 525 dataset.
In the command below, /content/kaggle/birds525/ is the directory containing the extracted BIRDS 525 dataset, and /content/birds525-yolo-data is the output directory for the generated Ultralytics YOLO format dataset.
%%bash
mkdir birds525-yolo-data
cd birds525yolo
python3 ./create_yolo_dataset_from_birds525_limit_bird_species.py /content/kaggle/birds525/ /content/birds525-yolo-data
cd /content
The following log will be output. As described on this page, some images are excluded from the training data when generating the YOLO format dataset, for example when multiple birds are detected in an image that is supposed to contain only one bird.
When an excluded image is processed, a message beginning with Failed is printed. Many such messages will appear, but this is not a problem.
/content/kaggle/birds525/ satisfy the requirement of birds525/archive/
Downloading https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov9c.pt to 'yolov9c.pt'...
YOLOv9c summary: 618 layers, 25,590,912 parameters, 0 gradients, 104.0 GFLOPs
valid
Failed: len(boxes.cls):2, label_name:BLUE HERON, image_file:2.jpg
success_counter: 19
...
failure_counter_single_bird_multiple_objects: 1
train
Failed: number_of_birds:0, label_name:BLUE HERON, image_file:088.jpg
...
Failed: number_of_birds:2, label_name:ROCK DOVE, image_file:115.jpg
success_counter: 471
failure_counter_results_not_one: 0
failure_counter_no_birds: 11
failure_counter_multiple_birds: 65
failure_counter_single_bird_multiple_objects: 25
100%|██████████| 49.4M/49.4M [00:00<00:00, 362MB/s]
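For reference, the exclusion step works conceptually like the sketch below: a pretrained detector is run on each image, and the image is excluded unless exactly one bird is detected in it. This is only an illustration of the idea, not the actual script from the repository above; the class index 14 is the COCO 'bird' class, and the image path is a placeholder.

from ultralytics import YOLO

# Illustrative sketch only: run a pretrained detector on one image and
# decide whether the image contains exactly one detected bird.
detector = YOLO('yolov9c.pt')  # the log above shows yolov9c.pt being downloaded

def is_single_bird_image(image_path, bird_class_id=14):  # 14 = 'bird' in the COCO classes
    results = detector(image_path, verbose=False)
    detected_classes = results[0].boxes.cls.tolist()
    birds = [c for c in detected_classes if int(c) == bird_class_id]
    return len(birds) == 1  # keep the image only if exactly one bird was found

# Placeholder path; point this at an actual image from the extracted dataset.
print(is_single_bird_image('/content/kaggle/birds525/train/BLUE HERON/001.jpg'))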
3.5. Running the training command
Write the following in a code cell and execute it.
The yolov8n.pt specified in the model parameter contains the weights of a pretrained network. To keep training times short, yolov8n, which has a relatively small network size, is used. This network, pretrained on other data, serves as the initial network and is then trained on the dataset prepared here.
The data parameter points to the data.yaml file of the generated Ultralytics YOLO format dataset. The epochs parameter is the number of training epochs; 100 is specified in the example below, so the network is trained over 100 passes through the prepared training data.
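The data.yaml file is generated by the dataset creation script. Its exact contents depend on the script, but it should follow the usual Ultralytics layout, roughly like the following sketch (the class names match the four bird species that appear in the training log below):

path: /content/birds525-yolo-data
train: images/train
val: images/val
names:
  0: BLUE HERON
  1: EUROPEAN TURTLE DOVE
  2: MALLARD DUCK
  3: ROCK DOVE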
The images in the BIRDS 525 dataset are 224 x 224 pixels, and training with the larger default image size of 640 did not progress. Therefore, the image size is passed explicitly as a parameter with imgsz=224.
!yolo train model=yolov8n.pt data=/content/birds525-yolo-data/data.yaml epochs=100 imgsz=224
The training completed in about 8 minutes and 30 seconds, and the following messages were output to the log.
Downloading https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n.pt to 'yolov8n.pt'...
100% 6.25M/6.25M [00:00<00:00, 280MB/s]
Ultralytics YOLOv8.2.98 🚀 Python-3.10.12 torch-2.4.1+cu121 CUDA:0 (Tesla T4, 15102MiB)
engine/trainer: task=detect, mode=train, model=yolov8n.pt, data=/content/birds525-yolo-data/data.yaml, epochs=100, time=None, patience=100, batch=16, imgsz=224, save=True, save_period=-1, cache=False, device=None, workers=8, project=None, name=train, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, show_boxes=True, line_width=None, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=None, workspace=4, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, pose=12.0, kobj=1.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, bgr=0.0, mosaic=1.0, mixup=0.0, copy_paste=0.0, auto_augment=randaugment, erasing=0.4, crop_fraction=1.0, cfg=None, tracker=botsort.yaml, save_dir=runs/detect/train
Downloading https://ultralytics.com/assets/Arial.ttf to '/root/.config/Ultralytics/Arial.ttf'...
100% 755k/755k [00:00<00:00, 130MB/s]
Overriding model.yaml nc=80 with nc=4

                   from  n    params  module                                       arguments
  0                  -1  1       464  ultralytics.nn.modules.conv.Conv             [3, 16, 3, 2]
  ...
 22        [15, 18, 21]  1    752092  ultralytics.nn.modules.head.Detect           [4, [64, 128, 256]]
Model summary: 225 layers, 3,011,628 parameters, 3,011,612 gradients, 8.2 GFLOPs

Transferred 319/355 items from pretrained weights
TensorBoard: Start with 'tensorboard --logdir runs/detect/train', view at http://localhost:6006/
Freezing layer 'model.22.dfl.conv.weight'
AMP: running Automatic Mixed Precision (AMP) checks with YOLOv8n...
AMP: checks passed ✅
train: Scanning /content/birds525-yolo-data/labels/train... 496 images, 0 backgrounds, 0 corrupt: 100% 496/496 [00:00<00:00, 1829.14it/s]
train: New cache created: /content/birds525-yolo-data/labels/train.cache
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01, num_output_channels=3, method='weighted_average'), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))
/usr/lib/python3.10/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
  self.pid = os.fork()
val: Scanning /content/birds525-yolo-data/labels/val... 20 images, 0 backgrounds, 0 corrupt: 100% 20/20 [00:00<00:00, 1490.62it/s]
val: New cache created: /content/birds525-yolo-data/labels/val.cache
Plotting labels to runs/detect/train/labels.jpg...
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically...
optimizer: AdamW(lr=0.00125, momentum=0.9) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.0005), 63 bias(decay=0.0)
TensorBoard: model graph visualization added ✅
Image sizes 224 train, 224 val
Using 2 dataloader workers
Logging results to runs/detect/train
Starting training for 100 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
      1/100     0.386G     0.9005      3.037      1.234         41        224: 100% 31/31 [00:07<00:00, 4.10it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% 1/1 [00:01<00:00, 1.65s/it]
                   all         20         20     0.0162          1      0.489       0.44
...
      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
    100/100     0.348G     0.1748      0.178     0.8938         16        224: 100% 31/31 [00:03<00:00, 9.97it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% 1/1 [00:00<00:00, 10.73it/s]
                   all         20         20      0.988          1      0.995      0.945

100 epochs completed in 0.132 hours.
Optimizer stripped from runs/detect/train/weights/last.pt, 6.2MB
Optimizer stripped from runs/detect/train/weights/best.pt, 6.2MB

Validating runs/detect/train/weights/best.pt...
Ultralytics YOLOv8.2.98 🚀 Python-3.10.12 torch-2.4.1+cu121 CUDA:0 (Tesla T4, 15102MiB)
Model summary (fused): 168 layers, 3,006,428 parameters, 0 gradients, 8.1 GFLOPs
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% 1/1 [00:00<00:00, 9.44it/s]
                   all         20         20      0.984          1      0.995      0.954
            BLUE HERON          5          5      0.984          1      0.995        0.9
  EUROPEAN TURTLE DOVE          5          5      0.976          1      0.995      0.995
          MALLARD DUCK          5          5      0.991          1      0.995      0.926
             ROCK DOVE          5          5      0.984          1      0.995      0.995
Speed: 0.0ms preprocess, 0.6ms inference, 0.0ms loss, 1.1ms postprocess per image
Results saved to runs/detect/train
💡 Learn more at https://docs.ultralytics.com/modes/train
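If you prefer the Python API over the yolo command line, an equivalent training run can be written roughly as follows. This is a sketch of the same call; the CLI command above is what was actually used here.

from ultralytics import YOLO

# Load the pretrained yolov8n weights and fine-tune them on the generated dataset.
model = YOLO('yolov8n.pt')
model.train(
    data='/content/birds525-yolo-data/data.yaml',  # generated Ultralytics YOLO dataset
    epochs=100,
    imgsz=224,  # BIRDS 525 images are 224 x 224 pixels
)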
3.6. Running object detection with the trained network
Write the following in a code cell and execute it.
This runs object detection using the network trained in 3.5 above.
The model parameter specifies the weights file best.pt obtained from training.
The input image specified with the source parameter is an image from the test directory of the BIRDS 525 dataset and was not used for training.
!yolo predict model=/content/runs/detect/train/weights/best.pt source="/content/kaggle/birds525/test/EUROPEAN TURTLE DOVE/1.jpg"
It executed in about 7 seconds, and the following messages were output to the log.
Ultralytics YOLOv8.2.98 🚀 Python-3.10.12 torch-2.4.1+cu121 CUDA:0 (Tesla T4, 15102MiB)
Model summary (fused): 168 layers, 3,006,428 parameters, 0 gradients, 8.1 GFLOPs

image 1/1 /content/kaggle/birds525/test/EUROPEAN TURTLE DOVE/1.jpg: 224x224 1 EUROPEAN TURTLE DOVE, 20.2ms
Speed: 1.3ms preprocess, 20.2ms inference, 858.0ms postprocess per image at shape (1, 3, 224, 224)
Results saved to runs/detect/predict
💡 Learn more at https://docs.ultralytics.com/modes/predict
Double-click the detection result image /content/runs/detect/predict/1.jpg shown in the file browser on the left to see the detection result, as in the image below. The network has successfully detected a EUROPEAN TURTLE DOVE.
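Alternatively, the saved result can be displayed directly in the notebook instead of opening it from the file browser, for example:

from IPython.display import Image, display

# Show the saved prediction image inline in the Colab notebook.
display(Image(filename='/content/runs/detect/predict/1.jpg'))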