Yolov5: Question

Created on 21 Oct 2020 · 6 Comments · Source: ultralytics/yolov5

I have a dataset where the images vary in shape (width and height). So, while creating the labels:

```
image_id  object  width  height  x    y    w    h    x_center  y_center  area    labels
01        cat     1200   800     833  390  254  410  543.5     400.0     104140  1
02        dog     1500   600     901  284  117  111  509.0     197.5     12987   2
03        rat     200    300     909  241  101  46   505.0     143.5     4646    3
```

How do I normalize here?

```
row = rows[['labels','x_center','y_center','w','h']].astype(float).values
row = row/...?
```

Stale question

Lincoln93

Most helpful comment

@Lincoln93, one way:

We can achieve it as follows. Let's create a dummy data frame first.

import pandas as pd
import random

info = {
    'image_id': ['01', '01', '01', '02', '04', '04'],
    'x':random.sample(range(500, 600), 6),
    'y':random.sample(range(200, 500), 6),
    'w':random.sample(range(200, 300), 6),
    'h':random.sample(range(400, 600), 6),
    'x_center':random.sample(range(250, 460), 6),
    'y_center':random.sample(range(250, 460), 6),
    'img_height':random.sample(range(2100, 3000), 6), 
    'img_width':random.sample(range(1100, 4000), 6), 
    'labels':[0,0,0,1,2,2]
}

df = pd.DataFrame(data=info)
df.head()

--------------------------

    image_id    x    y   w  h   x_center  y_center  img_height  img_width   labels
0     01       561  435 290 449 303        318      2105        2806         0
1     01       583  447 265 427 394        421      2338        2047         0
2     01       520  417 262 592 429        395      2947        3388         0
3     02       516  415 214 470 455        319      2649        1594         1
4     04       522  386 204 514 343        394      2847        1770         2

Next, we group by the image id and iterate over each row.

df_image_id = df.groupby('image_id')  # group by id

for image_id, rows in df_image_id:
    for _, each in rows.iterrows():  # iterate over each sample within the same id
        img_w = each['img_width']
        img_h = each['img_height']

        # YOLO label order: class x_center y_center width height,
        # with the four box values normalized to the 0-1 range
        content = [
            int(each['labels']),
            each['x_center'] / img_w,
            each['y_center'] / img_h,
            each['w'] / img_w,
            each['h'] / img_h,
        ]

        # append one line per box to <image_id>.txt
        with open(f'{image_id}.txt', 'a') as f1:
            f1.write(" ".join(str(x) for x in content) + '\n')
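As a vectorized alternative to the row-by-row loop, the division asked about in the question (`row = row/...?`) can be done on whole columns at once. The following is only a minimal sketch under some assumptions: the sample values are made up, `x` and `y` are assumed to be the top-left corner of each box in pixels (so the center is `x + w/2`, `y + h/2`), and the names `to_yolo` and `label_text` are hypothetical helpers, not part of YOLOv5.

```python
import pandas as pd

# Made-up sample data in the question's layout (all values in pixels).
# Assumption: x, y are the top-left corner of the box; w, h are its size.
df = pd.DataFrame({
    'image_id':   ['01', '01', '02'],
    'labels':     [0, 1, 2],
    'x':          [100, 400, 50],
    'y':          [200, 100, 60],
    'w':          [200, 100, 100],
    'h':          [100, 300, 120],
    'img_width':  [1000, 1000, 500],
    'img_height': [800, 800, 600],
})


def to_yolo(df):
    """Return class plus normalized x_center, y_center, w, h (hypothetical helper)."""
    return pd.DataFrame({
        'image_id': df['image_id'],
        'labels':   df['labels'].astype(int),
        # horizontal values are divided by the image width
        'x_center': (df['x'] + df['w'] / 2) / df['img_width'],
        'w':        df['w'] / df['img_width'],
        # vertical values are divided by the image height
        'y_center': (df['y'] + df['h'] / 2) / df['img_height'],
        'h':        df['h'] / df['img_height'],
    })


yolo = to_yolo(df)

# Build the text of one label file per image, one box per line, in the
# order: class x_center y_center width height.
label_text = {}
for image_id, rows in yolo.groupby('image_id'):
    lines = [
        f"{int(r.labels)} {r.x_center:.6f} {r.y_center:.6f} {r.w:.6f} {r.h:.6f}"
        for r in rows.itertuples(index=False)
    ]
    label_text[image_id] = "\n".join(lines) + "\n"
```

Each entry of `label_text` can then be written to `<image_id>.txt`. If the `x_center` and `y_center` columns already hold pixel centers, the `+ w / 2` and `+ h / 2` terms should be dropped and those columns divided by the image width and height directly.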

innat on 26 Oct 2020

All 6 comments

https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data#2-create-labels

glenn-jocher on 21 Oct 2020

Also in the future, a more informative title might get you more responses.

glenn-jocher on 21 Oct 2020

https://github.com/ultralytics/yolov5/blob/master/utils/datasets.py#L367

How can I change this method?

It requires the labels and images to be in the same folder.

I want to put them in different places.

alicera on 22 Oct 2020

@alicera see https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data for dataset formatting instructions.

glenn-jocher on 22 Oct 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.