I just realized that keras does not have a GELU activation function in activations.py. I request that it be added, because it has many applications in neural networks.
Note: I'll probably submit a pull request for it.
[y] Check that you are up-to-date with the master branch of Keras. You can update with:
pip install git+git://github.com/keras-team/keras.git --upgrade --no-deps
[y] Check that your version of TensorFlow is up-to-date. The installation instructions can be found here.
I don't think this should be merged into Keras.
Please submit your PR at keras-contrib.
This guy uses it, and he clearly knows what's going on:
https://github.com/borisbanushev/stockpredictionai
Keras code:
import numpy as np
import tensorflow as tf
from keras.layers import Activation, Dense
from keras.utils.generic_utils import get_custom_objects

def custom_gelu(x):
    # tanh approximation of GELU; np.sqrt keeps the constant a plain float so dtypes match
    return 0.5 * x * (1 + tf.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * tf.pow(x, 3))))

get_custom_objects().update({'custom_gelu': Activation(custom_gelu)})
fit1.add(Dense(1, activation=custom_gelu))
Something's wrong with that custom activation... I'm getting really strange predictions that never go below about -0.25.
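A hedged side note (not from the gist, assuming standalone Keras 2.x): if you register the plain function under a name instead of an Activation layer, the custom activation can then be referred to by its string name in ordinary layers. A minimal sketch:

import numpy as np
import tensorflow as tf
from keras.layers import Dense
from keras.models import Sequential
from keras.utils.generic_utils import get_custom_objects

def custom_gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1 + tf.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * tf.pow(x, 3))))

# register the function itself so it can be looked up by name
get_custom_objects().update({'custom_gelu': custom_gelu})

model = Sequential()
model.add(Dense(32, input_dim=8, activation='custom_gelu'))
model.add(Dense(1, activation='custom_gelu'))
model.compile(optimizer='adam', loss='mse')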
The GELU activation has started to pick up, and it was published a while ago (2016):
https://arxiv.org/abs/1606.08415
It has also been used in OpenAI's GPT-1 and GPT-2 and in Google's BERT papers. Would love to see this implemented in Keras activations.
Code from Google's BERT:
def gelu(x):
  """Gaussian Error Linear Unit.

  This is a smoother version of the RELU.
  Original paper: https://arxiv.org/abs/1606.08415
  Args:
    x: float Tensor to perform activation.

  Returns:
    `x` with the GELU activation applied.
  """
  cdf = 0.5 * (1.0 + tf.tanh(
      (np.sqrt(2 / np.pi) * (x + 0.044715 * tf.pow(x, 3)))))
  return x * cdf
Code from OpenAI's GPT-2:
def gelu(x):
return 0.5*x*(1+tf.tanh(np.sqrt(2/np.pi)*(x+0.044715*tf.pow(x, 3))))
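For reference (my own sketch, not part of either codebase): both snippets use a tanh approximation of the exact GELU, x * Phi(x), where Phi is the standard normal CDF. A quick NumPy/SciPy check of how close the two are:

import numpy as np
from scipy.special import erf

def gelu_exact(x):
    # exact GELU: x * Phi(x), with Phi the standard normal CDF
    return 0.5 * x * (1.0 + erf(x / np.sqrt(2.0)))

def gelu_tanh(x):
    # tanh approximation used in the BERT / GPT-2 snippets above
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

x = np.linspace(-6.0, 6.0, 2001)
print(np.max(np.abs(gelu_exact(x) - gelu_tanh(x))))  # stays small, on the order of 1e-3 or below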
It's not wrong that you are not getting predictions below -0.25; look at the graph of the function (its minimum is only about -0.17):
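To put a number on it (my own sketch, not from the thread): GELU has a single global minimum of roughly -0.17 near x ≈ -0.75, so a final layer with a GELU activation cannot output values much below that:

import math
import numpy as np

# exact GELU: x * Phi(x), with Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
def gelu(x):
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

xs = np.linspace(-5.0, 5.0, 20001)
ys = np.array([gelu(v) for v in xs])
print(ys.min(), xs[ys.argmin()])  # roughly -0.17 at x around -0.75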

I know this is starting to get confusing, but I need to make a cross-org reference: https://github.com/tensorflow/tensorflow/pull/33945
GELU is now in TensorFlow: https://github.com/tensorflow/tensorflow/pull/41178. You can close this.
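For anyone landing here later (hedged, assuming TensorFlow 2.4 or newer): the built-in GELU can be used roughly like this:

import tensorflow as tf

# the built-in activation is available by its string name in recent TF
layer = tf.keras.layers.Dense(64, activation="gelu")

# or call it directly; approximate=True matches the tanh formulation quoted above
x = tf.constant([-1.0, 0.0, 1.0])
print(tf.keras.activations.gelu(x, approximate=True))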
Thank you, @bhack! I will close this issue :)