Hello Keras team,
I appreciate this useful and great wrapper, and I am building a network for a regression problem.
I saw that Keras reports an accuracy and a loss even for regression. How are these calculated?
The loss moves like MSE, but the two values differ from each other.
Thank you.
Keras can calculate a "regression accuracy", and it actually works, but the terminology does not really make sense mathematically. Regression is an error-minimization problem, and the appropriate regression metrics are the coefficient of determination (R^2), mean absolute error (MAE), mean squared error (MSE), and root mean squared error (RMSE). For regression it is best practice to use mean_squared_error as the loss function, but other loss functions may work as well; they need to be compared and benchmarked. For training curves one can follow R^2, MAE, and RMSE.
See the Keras accuracy/regression issue: https://github.com/keras-team/keras/issues/108
Further reading on regression:
http://www.statsoft.com/Textbook/Multiple-Regression
http://onlinestatbook.com/2/regression/regression.html
https://www.quantinsti.com/blog/polynomial-regression-adding-non-linearity-to-a-linear-model/
The functions below are Keras backend tensor functions and can be used for Keras loss functions, Keras metrics, and Keras learning curves. When calculating with scalar types such as float, double, or int, it is important to use normal math functions or NumPy math functions instead of the backend functions.
````
# root mean squared error (rmse) for regression (only for Keras tensors)
def rmse(y_true, y_pred):
    from keras import backend
    return backend.sqrt(backend.mean(backend.square(y_pred - y_true), axis=-1))

# mean squared error (mse) for regression (only for Keras tensors)
def mse(y_true, y_pred):
    from keras import backend
    return backend.mean(backend.square(y_pred - y_true), axis=-1)

# coefficient of determination (R^2) for regression (only for Keras tensors)
def r_square(y_true, y_pred):
    from keras import backend as K
    SS_res = K.sum(K.square(y_true - y_pred))
    SS_tot = K.sum(K.square(y_true - K.mean(y_true)))
    return 1 - SS_res/(SS_tot + K.epsilon())
````
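As a side note on the scalar remark above: here is a minimal NumPy sketch of the same three quantities for use outside the graph (e.g. on the output of model.predict). The `_np` names are only for illustration and are not part of Keras.

````
import numpy as np

def rmse_np(y_true, y_pred):
    # scalar RMSE on plain NumPy arrays (not Keras tensors)
    return np.sqrt(np.mean(np.square(y_pred - y_true)))

def mse_np(y_true, y_pred):
    # scalar MSE on plain NumPy arrays
    return np.mean(np.square(y_pred - y_true))

def r_square_np(y_true, y_pred):
    # coefficient of determination on plain NumPy arrays
    ss_res = np.sum(np.square(y_true - y_pred))
    ss_tot = np.sum(np.square(y_true - np.mean(y_true)))
    return 1 - ss_res / ss_tot
````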
The calling convention for these backend functions as Keras losses and metrics is:
````
# original Keras functions
model.compile(optimizer="Nadam", loss="mean_squared_error", metrics=["mean_squared_error"])
# custom function example
model.compile(optimizer="Nadam", loss=rmse, metrics=[r_square, rmse])
````
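For reference (assuming the same custom functions as above), the names that appear in the training history are derived from the function names, which is what the plotting code in the example further below relies on:

````
# assuming metrics=[r_square, rmse] were passed to model.compile(...)
result = model.fit(x_train, y_train, validation_data=(x_test, y_test))
print(result.history.keys())
# expected keys include 'loss', 'rmse', 'r_square', 'val_rmse', 'val_r_square'
````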
This is how a regression example would look in Keras and TF.
````
"""
Created on Wed Aug 15 18:44:28 2018
Simple regression example for Keras (v2.2.2) with Boston housing data
@author: tobigithub
"""
from tensorflow import set_random_seed
from keras.datasets import boston_housing
from keras.models import Sequential
from keras.layers.normalization import BatchNormalization
from keras.callbacks import EarlyStopping
from keras.layers import Dense
import matplotlib.pyplot as plt
import numpy as np

# -----------------------------------------------------------------------------
# Define custom loss functions for regression in Keras
# -----------------------------------------------------------------------------

# root mean squared error (rmse) for regression
def rmse(y_true, y_pred):
    from keras import backend
    return backend.sqrt(backend.mean(backend.square(y_pred - y_true), axis=-1))

# mean squared error (mse) for regression
def mse(y_true, y_pred):
    from keras import backend
    return backend.mean(backend.square(y_pred - y_true), axis=-1)

# coefficient of determination (R^2) for regression
def r_square(y_true, y_pred):
    from keras import backend as K
    SS_res = K.sum(K.square(y_true - y_pred))
    SS_tot = K.sum(K.square(y_true - K.mean(y_true)))
    return 1 - SS_res/(SS_tot + K.epsilon())

# (1 - R^2) as a loss
def r_square_loss(y_true, y_pred):
    from keras import backend as K
    SS_res = K.sum(K.square(y_true - y_pred))
    SS_tot = K.sum(K.square(y_true - K.mean(y_true)))
    return 1 - (1 - SS_res/(SS_tot + K.epsilon()))

# -----------------------------------------------------------------------------
# Start a simple Keras sequential model
# -----------------------------------------------------------------------------

# set the seeds for reproducible results with TF (won't work with GPU, only CPU)
np.random.seed(12345)
# set the TF seed
set_random_seed(12345)

# import data, assign seed for the same results, do train/test split 80/20
(x_train, y_train), (x_test, y_test) = boston_housing.load_data(seed=12345, test_split=0.2)

# build Keras sequential model
model = Sequential()
# add batch normalization
model.add(BatchNormalization())
# add layer to the MLP for data (404, 13)
model.add(Dense(units=300, activation='relu', input_dim=x_train.shape[1]))
# add output layer
model.add(Dense(units=1, activation='relu'))
# compile regression model, loss should be mean_squared_error
model.compile(optimizer="Nadam", loss="mean_squared_error", metrics=["mean_squared_error", rmse, r_square])
# enable early stopping based on mean_squared_error
earlystopping = EarlyStopping(monitor="mean_squared_error", patience=40, verbose=1, mode='auto')
# fit model
result = model.fit(x_train, y_train, epochs=240, batch_size=5, validation_data=(x_test, y_test), callbacks=[earlystopping])
# get predictions
y_pred = model.predict(x_test)

# -----------------------------------------------------------------------------
# Plot learning curves including R^2 and RMSE
# -----------------------------------------------------------------------------

# plot training curve for R^2 (beware of scale, starts very low negative)
plt.plot(result.history['r_square'])
plt.plot(result.history['val_r_square'])
plt.title('model R^2')
plt.ylabel('R^2')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

# plot training curve for rmse
plt.plot(result.history['rmse'])
plt.plot(result.history['val_rmse'])
plt.title('rmse')
plt.ylabel('rmse')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

# print the linear regression and display data points
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(y_test.reshape(-1, 1), y_pred)
y_fit = regressor.predict(y_pred)

reg_intercept = round(regressor.intercept_[0], 4)
reg_coef = round(regressor.coef_.flatten()[0], 4)
reg_label = "y = " + str(reg_coef) + "*x + " + str(reg_intercept)

plt.scatter(y_test, y_pred, color='blue', label='data')
plt.plot(y_pred, y_fit, color='red', linewidth=2, label='Linear regression\n' + reg_label)
plt.title('Linear Regression')
plt.legend()
plt.xlabel('observed')
plt.ylabel('predicted')
plt.show()

# -----------------------------------------------------------------------------
# print statistical figures of merit
# -----------------------------------------------------------------------------
import sklearn.metrics, math
print("\n")
print("Mean absolute error (MAE): %f" % sklearn.metrics.mean_absolute_error(y_test, y_pred))
print("Mean squared error (MSE): %f" % sklearn.metrics.mean_squared_error(y_test, y_pred))
print("Root mean squared error (RMSE): %f" % math.sqrt(sklearn.metrics.mean_squared_error(y_test, y_pred)))
print("R square (R^2): %f" % sklearn.metrics.r2_score(y_test, y_pred))
````
Finally, one can compare different loss functions; mean_squared_error performs best.
````
---------------------------------------------------------
Comparison of different loss functions during regression
---------------------------------------------------------

loss="mean_squared_error" (best)
Mean absolute error (MAE): 2.556359
Mean squared error (MSE): 15.532753
Root mean squared error (RMSE): 3.941161
R square (R^2): 0.831592

loss="poisson"
Mean absolute error (MAE): 2.883197
Mean squared error (MSE): 19.527428
Root mean squared error (RMSE): 4.418985
R square (R^2): 0.788282

loss=rmse
Mean absolute error (MAE): 2.736181
Mean squared error (MSE): 21.020702
Root mean squared error (RMSE): 4.584834
R square (R^2): 0.772091

loss="logcosh"
Mean absolute error (MAE): 2.752203
Mean squared error (MSE): 21.361542
Root mean squared error (RMSE): 4.621855
R square (R^2): 0.768396

loss="mean_absolute_error"
Mean absolute error (MAE): 2.854676
Mean squared error (MSE): 22.078403
Root mean squared error (RMSE): 4.698766
R square (R^2): 0.760624

loss=r_square_loss
Mean absolute error (MAE): 3.956800
Mean squared error (MSE): 40.298911
Root mean squared error (RMSE): 6.348142
R square (R^2): 0.563075

loss="kullback_leibler_divergence" (worst)
Mean absolute error (MAE): 19.958943
Mean squared error (MSE): 493.105668
Root mean squared error (RMSE): 22.205983
R square (R^2): -4.346304
````
Is there any pull request for implementing an R2 score natively in Keras, in order to have something like model.compile(optimizer="Nadam", loss="mean_squared_error", metrics=["r2_score"])?
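Not sure about the PR status, but until then a minimal workaround (a sketch, reusing the custom r_square tensor function defined earlier in this thread) is to pass the function object itself instead of a string name:

````
# sketch: pass the custom r_square function defined above rather than a built-in name
model.compile(optimizer="Nadam",
              loss="mean_squared_error",
              metrics=["mean_squared_error", r_square])
````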
The Keras docs about custom metrics say (emphasis mine):
> The function would need to take (y_true, y_pred) as arguments and return **a single tensor value**.
The metrics/loss functions given here do not seem to do that, e.g.:
````
def rmse(y_true, y_pred):
    from keras import backend
    return backend.sqrt(backend.mean(backend.square(y_pred - y_true), axis=-1))
````
Here, backend.mean() is applied along the last axis only, thus all axes but the last will stay intact. I would think there should be a backend.mean() over all axes around the return value, like so:
````
def rmse(y_true, y_pred):
    from keras import backend
    return backend.mean(
        backend.sqrt(backend.mean(backend.square(y_pred - y_true), axis=-1))
    )
````
Is that not needed?
Also, why is the backend imported inside each function? Wouldn't a single regular import at the top be sufficient?
EDIT: I just realized that Keras' built-in loss functions / metrics don't follow (my reading of) the docs either. Is the mean along the first axis (i.e., over the data points) performed implicitly somewhere? Maybe the docs could be clearer about what the interface for metrics should be...
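To illustrate the point with plain NumPy (a sketch of the shapes involved, not the actual Keras internals): with axis=-1 one value per sample survives, and only a further mean over the batch axis collapses the result to a single scalar.

````
import numpy as np

y_true = np.array([[1.0], [2.0], [3.0]])   # 3 samples, 1 output each
y_pred = np.array([[1.5], [2.0], [2.0]])

per_sample = np.sqrt(np.mean(np.square(y_pred - y_true), axis=-1))
print(per_sample.shape)          # (3,) -- one value per sample, not a single scalar
print(float(per_sample.mean()))  # averaging over the batch axis gives the scalar
````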
@WittmannF I would even like to have the choice to use (1 - R2) as the loss and R2 as the performance metric.
````
def r_square_loss(y_true, y_pred):
    from keras import backend as K
    SS_res = K.sum(K.square(y_true - y_pred))
    SS_tot = K.sum(K.square(y_true - K.mean(y_true)))
    return 1 - (1 - SS_res/(SS_tot + K.epsilon()))
````
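Note that the expression above simplifies algebraically: 1 - (1 - SS_res/SS_tot) is just SS_res/SS_tot, i.e. (1 - R^2). A slightly more direct sketch of the same loss:

````
def r_square_loss(y_true, y_pred):
    # equivalent simplified form: 1 - R^2 = SS_res / SS_tot
    from keras import backend as K
    SS_res = K.sum(K.square(y_true - y_pred))
    SS_tot = K.sum(K.square(y_true - K.mean(y_true)))
    return SS_res / (SS_tot + K.epsilon())
````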
Is this correct for data having multiple variables/attributes in time series forecasting?