Keras: Confusion regarding class_weight

Created on 2 Mar 2016 · 17 comments · Source: keras-team/keras

Hey there,

How does one actually use class_weight on model.fit?

I had originally written the following method to do this, but I'm not entirely sure whether it works or not.

def calculate_class_weights(train_label):
    labels = train_label.tolist()

    num_neg = labels.count(0)
    num_pos = labels.count(1)

    duplicate = num_pos / num_neg

    # note: num_neg * (num_pos / num_neg) simplifies back to num_pos,
    # so both classes end up with the same weight
    class_weights = {0: (num_neg * duplicate), 1: num_pos}
    return class_weights

This returns a dictionary of...

{0: 34, 1: 34}

Does anyone have a working example of how to balance 2 classes using the class_weights method?

Thanks,

Keiron.

All 17 comments

You should probably post this in the Keras Google group:
https://groups.google.com/forum/#!forum/keras-users

Stack Overflow would work too.

See also: https://groups.google.com/forum/#!topic/keras-users/MUO6v3kRHUw

train_generator = train_datagen.flow_from_directory(
    train_img_path,  # this is the target directory
    target_size=(img_rows, img_cols), 
    batch_size=batch_size,
    class_mode='binary',
    color_mode='grayscale',
    classes=['good', 'bad'],
    save_to_dir=generate_train_img_path) 

validation_generator = test_datagen.flow_from_directory(
    validation_img_path,
    target_size=(img_rows, img_cols),
    batch_size=batch_size,
    class_mode='binary',
    color_mode='grayscale',
    classes=['good', 'bad'],
    save_to_dir=generate_validation_img_path)

# 83% of the images are class 1 and 17% are class 0, so the minority class (0) gets the larger weight to balance the two classes.
class_weight = {0: 83, 1: 17}

for i in range(0, nb_epoch):
    print('epoch:{}'.format(i))
    if i == 0:
        print('epoch:{}'.format(i))
    else:
        # resume from the weights saved at the end of the previous epoch
        model.load_weights('{}.h5'.format(i - 1))
    model.fit_generator(
        train_generator,
        samples_per_epoch=1800,
        nb_epoch=1,
        validation_data=validation_generator,
        nb_val_samples=250,
        class_weight=class_weight)
    model.save('{}.h5'.format(i))

I have this simple function for computing the weights for each class:

from collections import Counter

def get_class_weights(y):
    counter = Counter(y)
    majority = max(counter.values())
    return {cls: float(majority) / count for cls, count in counter.items()}

What I do is pick the majority class as a reference and assign weights to the other classes relative to it. So if you have 3 classes with classA: 10%, classB: 50% and classC: 40%, then you get the weights:

{0: 5, 1: 1, 2: 1.25}

So this means that if you misclassify classA, the loss will be 5 times higher than for misclassifying classB, and so on...
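
For completeness, here is a minimal, self-contained usage sketch (my addition, not from the thread) showing the resulting dict being passed to model.fit. The toy data, the tiny two-layer model and the modern epochs argument are all assumptions for illustration only:

from collections import Counter
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

def get_class_weights(y):
    counter = Counter(y)
    majority = max(counter.values())
    return {cls: float(majority) / count for cls, count in counter.items()}

# toy imbalanced data: class 0 appears 5 times, class 1 once, class 2 four times
x_train = np.random.rand(10, 4)
y_train = np.array([0, 0, 0, 0, 0, 1, 2, 2, 2, 2])

weights = get_class_weights(y_train)   # {0: 1.0, 1: 5.0, 2: 1.25}

# the dict maps class index -> weight and is passed straight to fit()
model = Sequential([Dense(8, activation='relu', input_shape=(4,)),
                    Dense(3, activation='softmax')])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.fit(x_train, y_train, epochs=5, class_weight=weights, verbose=0)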

Seems like a useful utility function. Do you have any papers about the best choices of class weights and why this is the correct scaling? Maybe rename to something like balanced_class_weights and try to add it to np_utils.

Here is an example applying SegNet to the RoadScene dataset, where a class weight is given for each of the classes in the images:

class_weighting = [
 0.2595,
 0.1826,
 4.5640,
 0.1417,
 0.5051,
 0.3826,
 9.6446,
 1.8418,
 6.6823,
 6.2478,
 3.0,
 7.3614
]
# Fit the model
history = segnet_basic.fit(
    train_data, train_label,
    callbacks=callbacks_list,
    batch_size=batch_size,
    nb_epoch=nb_epoch,
    verbose=1,
    class_weight=class_weighting,
    validation_data=(test_data, test_label),
    shuffle=True)  # validation_split=0.33
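
If your version of Keras only accepts a dict for class_weight (the dict-vs-list question comes up again further down the thread), the same positional weights can be converted into an index-to-weight mapping. This is just a small sketch on top of the list above:

# class index i gets class_weighting[i], e.g. {0: 0.2595, 1: 0.1826, 2: 4.5640, ...}
class_weight_dict = dict(enumerate(class_weighting))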

Hi,
I'm confused about how to use class_weight, so I pasted a simple example here. In the example I fit the same input twice, once labelled as the 2nd class and once as the 4th class, so without weights the prediction for that input should be 50% for the 2nd class and 50% for the 4th class. I then set class_weight to mask out the 2nd class, but the prediction still gives the same 50/50 result. Am I doing something wrong?

from keras.models import Sequential
from keras.layers import Dense
from keras import optimizers
import numpy as np
model = Sequential()
model.add(Dense(4, input_shape=(2,)))
model.add(Dense(4, activation='softmax'))
model.compile(optimizer=optimizers.Adagrad(), loss='categorical_crossentropy')


x = np.array([[1,1], [1,1]])
y = np.array([[0,1,0,0], [0,0,0,1]])
weights_mask = np.array([1, 1])
class_weights = {
    0:0,
    1:0,
    2:0,
    3:10
}
# weights_mask = np.array([1])
model.fit(x,y, epochs=1000, sample_weight=weights_mask, class_weight=class_weights, validation_data=((x,y)))

ret = model.predict(x)

print(ret)

@0bserver07 How do you set the values in class_weight? Are there any papers to refer to?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

How do we use class_weight in the case of fit_generator? I mean, is there a way to do it on the fly for each training batch? I tried using a generator to return class_weight for each batch, but that gives me

TypeError: object of type 'generator' has no len()

I actually want to calculate class weights for each batch rather than for the entire dataset, and I am unable to find a way to do this with fit_generator without duplicating effort.
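
One hedged sketch of a workaround (my addition, not from the thread): fit_generator accepts generators that yield (inputs, targets, sample_weights) tuples, so per-batch weights can be attached by wrapping an existing (x, y) generator. The wrapper below assumes integer-encoded labels in each batch and a hypothetical train_generator:

from collections import Counter
import numpy as np

def with_batch_weights(batch_generator):
    # wrap an (x, y) generator and attach per-sample weights computed
    # from the class balance of each individual batch
    for x_batch, y_batch in batch_generator:
        counts = Counter(y_batch.tolist())
        majority = max(counts.values())
        per_class = {cls: float(majority) / count for cls, count in counts.items()}
        sample_weights = np.array([per_class[label] for label in y_batch])
        # fit_generator treats a 3-tuple as (inputs, targets, sample_weights)
        yield x_batch, y_batch, sample_weights

# model.fit_generator(with_batch_weights(train_generator),
#                     steps_per_epoch=100, epochs=10)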

@0bserver07 I am training a semantic segmentation network. When I try to pass a dict as the class_weight parameter to fit_generator, it complains that "ValueError: class_weight not supported for 3+ dimensional targets", but when I pass it a list like you did, it magically works! But the docs don't mention anything about passing lists to the class_weight parameter of fit or fit_generator. Could you please shed some light on how this is working? Thanks!

What would a class weight of 0 imply?

For example, suppose class_weights are {0 : 0.5, 1 : 0.5, 2 : 0.0}. Does this mean we're asking the model to consider classes 0 and 1 equally and ignore class 2 i.e. not have it contribute to the loss?

Yes.
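
To make that concrete, here is a small schematic illustration (my own sketch, not Keras internals): each sample's loss term is scaled by the weight of its true class, so a weight of 0 removes that class's samples from the loss.

import numpy as np

# hypothetical unweighted per-sample losses, one sample per class 0, 1, 2
per_sample_loss = np.array([0.3, 0.7, 2.0])
y_true = np.array([0, 1, 2])

class_weights = {0: 0.5, 1: 0.5, 2: 0.0}
w = np.array([class_weights[c] for c in y_true])

# the class-2 sample is multiplied by 0 and contributes nothing
weighted = w * per_sample_loss   # [0.15, 0.35, 0.0]
print(weighted.sum())            # 0.5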


@0bserver07 I am training a semantic segmentation network. When I try to pass a dict as the class_weight parameter to fit_generator, it complains that "ValueError: class_weight not supported for 3+ dimensional targets", but when I pass it a list like you did, it magically works! But the docs don't mention anything about passing lists to the class_weight parameter of fit or fit_generator. Could you please shed some light on how this is working? Thanks!

I have the same problem. Could you please let me know how you fixed it?

Check this out:
https://stackoverflow.com/questions/60408901/sklearn-utils-compute-class-weight-function-for-large-dataset

train_generator = train_datagen.flow_from_directory(
        'train_directory',
        target_size=(224, 224),
        batch_size=32,
        class_mode = "categorical"
        )
The class weights for the training set can then be computed like this:

from sklearn.utils import class_weight
import numpy as np

class_weights = class_weight.compute_class_weight(
    class_weight='balanced',
    classes=np.unique(train_generator.classes),
    y=train_generator.classes)
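
compute_class_weight returns a NumPy array ordered by np.unique(train_generator.classes), while Keras expects a dict mapping class index to weight, so as a small follow-up sketch (the model is assumed) convert it before fitting:

# map class index -> weight; flow_from_directory assigns indices 0..n-1 alphabetically
class_weight_dict = dict(enumerate(class_weights))

# model.fit_generator(train_generator, epochs=10, class_weight=class_weight_dict)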
