In the file TensorFlow-Examples/examples/2_BasicModels/linear_regression.py, at line 39, I tried using tf.reduce_mean instead:
cost = tf.reduce_mean(tf.pow(pred - Y, 2)) / 2
This is the original code:
cost = tf.reduce_sum(tf.pow(pred - Y, 2)) / (2 * n_samples)
But I found that the results are different.
Original (with tf.reduce_sum):
Epoch: 0050 cost= 0.078097254 W= 0.268404 b= 0.666144
Epoch: 0100 cost= 0.077967979 W= 0.267285 b= 0.674192
Epoch: 0150 cost= 0.077853732 W= 0.266233 b= 0.681762
Epoch: 0200 cost= 0.077752732 W= 0.265244 b= 0.688881
Epoch: 0250 cost= 0.077663496 W= 0.264313 b= 0.695577
Epoch: 0300 cost= 0.077584624 W= 0.263437 b= 0.701875
Epoch: 0350 cost= 0.077514939 W= 0.262614 b= 0.707798
Epoch: 0400 cost= 0.077453338 W= 0.26184 b= 0.713369
Epoch: 0450 cost= 0.077398948 W= 0.261111 b= 0.718609
Epoch: 0500 cost= 0.077350870 W= 0.260426 b= 0.723537
Epoch: 0550 cost= 0.077308379 W= 0.259782 b= 0.728173
Epoch: 0600 cost= 0.077270873 W= 0.259176 b= 0.732533
Epoch: 0650 cost= 0.077237763 W= 0.258606 b= 0.736633
Epoch: 0700 cost= 0.077208482 W= 0.25807 b= 0.740489
Epoch: 0750 cost= 0.077182651 W= 0.257565 b= 0.744117
Epoch: 0800 cost= 0.077159837 W= 0.257091 b= 0.747528
Epoch: 0850 cost= 0.077139676 W= 0.256645 b= 0.750738
Epoch: 0900 cost= 0.077121906 W= 0.256226 b= 0.753756
Epoch: 0950 cost= 0.077106208 W= 0.255831 b= 0.756594
Epoch: 1000 cost= 0.077092350 W= 0.25546 b= 0.759265
Optimization Finished!
Training cost= 0.0770923 W= 0.25546 b= 0.759265
Testing ...
Testing cost= 0.100858
Absolute mean square loss difference: 0.0112338
Now (with tf.reduce_mean):
Epoch: 0050 cost= 0.089926675 W= 0.229255 b= 0.785728
Epoch: 0100 cost= 0.089696057 W= 0.225371 b= 0.814913
Epoch: 0150 cost= 0.089644745 W= 0.224121 b= 0.8243
Epoch: 0200 cost= 0.089630596 W= 0.223719 b= 0.827321
Epoch: 0250 cost= 0.089626297 W= 0.22359 b= 0.828292
Epoch: 0300 cost= 0.089624956 W= 0.223548 b= 0.828604
Epoch: 0350 cost= 0.089624502 W= 0.223535 b= 0.828705
Epoch: 0400 cost= 0.089624360 W= 0.223531 b= 0.828736
Epoch: 0450 cost= 0.089624345 W= 0.22353 b= 0.828744
Epoch: 0500 cost= 0.089624323 W= 0.22353 b= 0.828746
Epoch: 0550 cost= 0.089624323 W= 0.22353 b= 0.828746
Epoch: 0600 cost= 0.089624323 W= 0.22353 b= 0.828746
Epoch: 0650 cost= 0.089624323 W= 0.22353 b= 0.828746
Epoch: 0700 cost= 0.089624323 W= 0.22353 b= 0.828746
Epoch: 0750 cost= 0.089624323 W= 0.22353 b= 0.828746
Epoch: 0800 cost= 0.089624323 W= 0.22353 b= 0.828746
Epoch: 0850 cost= 0.089624323 W= 0.22353 b= 0.828746
Epoch: 0900 cost= 0.089624323 W= 0.22353 b= 0.828746
Epoch: 0950 cost= 0.089624323 W= 0.22353 b= 0.828746
Epoch: 1000 cost= 0.089624323 W= 0.22353 b= 0.828746
Optimization Finished!
Training cost= 0.0896243 W= 0.22353 b= 0.828746
Testing ...
Testing cost= 0.100858
Absolute mean square loss difference: 0.0112338
You can see that the cost, W, and b are different.
Why? Could anyone explain?
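For what it's worth, the two expressions should give the same value when evaluated over the full batch, which is what confuses me. Here is a quick numpy check (the numbers are made up, just standing in for the example's data):

import numpy as np

pred = np.array([1.0, 2.0, 3.0, 4.0])  # hypothetical predictions
Y = np.array([1.1, 1.9, 3.2, 3.8])     # hypothetical targets
n_samples = len(Y)

cost_sum = np.sum(np.power(pred - Y, 2)) / (2 * n_samples)
cost_mean = np.mean(np.power(pred - Y, 2)) / 2
print(cost_sum, cost_mean)  # both print the same value (about 0.0125)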
Multiply the reduce_mean loss by the batch_size, and I bet you will get the same result.
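The two costs only differ by a scale factor once you look at what each training step actually sees. A rough sketch of the arithmetic (assuming, as I believe the example does, that the optimizer is run on one (x, y) pair per step, and that train_X has 17 points):

n_samples = 17                   # assumed number of training points in the example
e2 = 0.04                        # squared error of one hypothetical sample

cost_mean = e2 / 2               # tf.reduce_mean(...) / 2 seen on a single-sample feed
cost_sum = e2 / (2 * n_samples)  # tf.reduce_sum(...) / (2 * n_samples) on the same feed
print(cost_mean / cost_sum)      # ~17: the reduce_mean version takes ~17x larger gradient steps

So rescaling one cost by that factor (or shrinking the learning rate by the same amount) should make the two runs line up.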
I have a similar problem: when using tf.reduce_sum(tf.pow(pred - Y, 2)) / (2 * n_samples), everything is fine, but when I switch to reduce_mean, the loss stops decreasing after only a few epochs.
This odd behavior makes me wonder whether reduce_mean has some flaw.
I have a similar problem; reduce_mean does not fit!
I have a similar problem; the losses are different!