Yolov5: Question

Created on 21 Oct 2020 · 6 Comments · Source: ultralytics/yolov5

I have a dataset where the images come in arbitrary shapes (width and height). While creating the labels, I end up with a table like this:

```
image_id  object  width  height  x    y    w    h    x_center  y_center  area    labels
01        cat     1200   800     833  390  254  410  543.5     400.0     104140  1
02        dog     1500   600     901  284  117  111  509.0     197.5     12987   2
03        rat     200    300     909  241  101  46   505.0     143.5     4646    3
```

How do I normalize here?

```
row = rows[['labels','x_center','y_center','w','h']].astype(float).values
row = row/...?
```

Stale question

Most helpful comment

@Lincoln93, one way:

We can do it as follows. Let's create a dummy data frame first.

import pandas as pd
import random

info = {
    'image_id': ['01', '01', '01', '02', '04', '04'],
    'x':random.sample(range(500, 600), 6),
    'y':random.sample(range(200, 500), 6),
    'w':random.sample(range(200, 300), 6),
    'h':random.sample(range(400, 600), 6),
    'x_center':random.sample(range(250, 460), 6),
    'y_center':random.sample(range(250, 460), 6),
    'img_height':random.sample(range(2100, 3000), 6), 
    'img_width':random.sample(range(1100, 4000), 6), 
    'labels':[0,0,0,1,2,2]
}

df = pd.DataFrame(data=info)
df.head()

--------------------------

  image_id    x    y    w    h  x_center  y_center  img_height  img_width  labels
0       01  561  435  290  449       303       318        2105       2806       0
1       01  583  447  265  427       394       421        2338       2047       0
2       01  520  417  262  592       429       395        2947       3388       0
3       02  516  415  214  470       455       319        2649       1594       1
4       04  522  386  204  514       343       394        2847       1770       2
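The dummy frame takes `x_center` and `y_center` as given. If only the box corner and size are available, the centers could be derived first. This is a minimal sketch that assumes `x`, `y` are the top-left corner of the box and `w`, `h` are its width and height in pixels; the thread does not confirm that convention, and `add_centers` is a hypothetical helper name.

```python
import pandas as pd


def add_centers(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with pixel-space box centers added.

    Assumption: (x, y) is the top-left corner and (w, h) the box size.
    """
    out = df.copy()
    out["x_center"] = out["x"] + out["w"] / 2
    out["y_center"] = out["y"] + out["h"] / 2
    return out
```

If `x`, `y` already hold the box center in your annotations, this step should be skipped.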

Next, we will group by the image id and iterate over each row. Each output line follows the YOLO label order (class, x_center, y_center, width, height), with x values divided by the image width and y values by the image height.

df_image_id = df.groupby('image_id')  # group boxes by image id

for image_id, rows in df_image_id:
    # one label file per image; 'w' overwrites output from earlier runs
    with open(f'{image_id}.txt', 'w') as f1:
        for _, each in rows.iterrows():  # iterate over each box of this image
            img_w = each['img_width']
            img_h = each['img_height']

            # class x_center y_center width height, normalized to [0, 1]
            content = [
                int(each['labels']),
                each['x_center'] / img_w,
                each['y_center'] / img_h,
                each['w'] / img_w,
                each['h'] / img_h,
            ]
            f1.write(" ".join(str(x) for x in content) + '\n')
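The same normalization can also be done column-wise instead of row by row. The following is a minimal sketch, not the method from the thread: `normalize_boxes` and `write_label_files` are hypothetical helper names, the column names follow the dummy frame above, and the six-decimal formatting and the `labels` output directory are assumptions. Lines are written in the `class x_center y_center width height` order.

```python
from pathlib import Path

import pandas as pd


def normalize_boxes(df: pd.DataFrame) -> pd.DataFrame:
    """Scale pixel-space boxes to [0, 1] using each row's image size."""
    return pd.DataFrame({
        "image_id": df["image_id"],
        "labels": df["labels"].astype(int),
        "x_center": df["x_center"] / df["img_width"],
        "y_center": df["y_center"] / df["img_height"],
        "w": df["w"] / df["img_width"],
        "h": df["h"] / df["img_height"],
    })


def write_label_files(df: pd.DataFrame, out_dir: str = "labels") -> None:
    """Write one <image_id>.txt per image, one box per line."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    norm = normalize_boxes(df)
    for image_id, group in norm.groupby("image_id"):
        lines = [
            f"{r.labels} {r.x_center:.6f} {r.y_center:.6f} {r.w:.6f} {r.h:.6f}"
            for r in group.itertuples(index=False)
        ]
        # Overwrites any existing file for this image.
        (out / f"{image_id}.txt").write_text("\n".join(lines) + "\n")
```

Calling `write_label_files(df)` on the dummy frame would produce `labels/01.txt`, `labels/02.txt` and `labels/04.txt`. Dividing whole columns avoids `iterrows`, which tends to be slow on large annotation tables.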

All 6 comments

Also in the future, a more informative title might get you more responses.

https://github.com/ultralytics/yolov5/blob/master/utils/datasets.py#L367

How can I change this method?

It requires the labels and images to be in the same folder.

I want to put them in different places.

@alicera see https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data for dataset formatting instructions.
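As far as I understand the convention described in that guide, the label file for an image is located by replacing the last `/images/` directory component in the image path with `/labels/` and swapping the extension for `.txt`, so images and labels can live in parallel folders. The sketch below only illustrates that convention; `image_to_label_path` is a hypothetical helper, not a function from the yolov5 repository, and the exact lookup in your checkout may differ.

```python
import os


def image_to_label_path(image_path: str) -> str:
    """Map an image path to the label path expected under the assumed convention."""
    images_dir = f"{os.sep}images{os.sep}"
    labels_dir = f"{os.sep}labels{os.sep}"
    # Swap only the last "/images/" component, if there is one.
    head, sep, tail = image_path.rpartition(images_dir)
    swapped = head + labels_dir + tail if sep else image_path
    # Replace the image extension with .txt.
    return os.path.splitext(swapped)[0] + ".txt"
```

For an image stored under `data/images/train/`, this would point at a `.txt` file with the same stem under `data/labels/train/`.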

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
