Keras: Feature Request: Add GELU activation function

Created on 10 Dec 2018 · 8 comments · Source: keras-team/keras

I just realized that Keras does not have a GELU activation function in activations.py. I request that it be added, because it has many applications in neural networks.

Note: I'll probably submit a pull request for it.

  • [y] Check that you are up-to-date with the master branch of Keras. You can update with:
    pip install git+git://github.com/keras-team/keras.git --upgrade --no-deps

  • [y] Check that your version of TensorFlow is up-to-date. The installation instructions can be found here.
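
For reference, a minimal sketch of what such an activation could look like in activations.py, using the tanh approximation from the paper (the function name and backend calls below are my assumption, not an existing Keras implementation):

import numpy as np
from keras import backend as K

def gelu(x):
    """Gaussian Error Linear Unit (tanh approximation, https://arxiv.org/abs/1606.08415)."""
    # Hypothetical sketch only: a smooth, non-monotonic alternative to ReLU.
    return 0.5 * x * (1.0 + K.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * K.pow(x, 3))))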

feature

Most helpful comment

GELU activation has started to pick up, and it was published a while ago (2016):
https://arxiv.org/abs/1606.08415

It has also been used in OpenAI's GPT-1 and GPT-2 and in Google's BERT papers. I would love to see this implemented in Keras activations.

All 8 comments

I don't think this should be merged into Keras.

  • Not widely used
  • Not published yet

Please submit your PR at keras-contrib.

GELU activation has started to pick up, and it was published a while ago (2016):
https://arxiv.org/abs/1606.08415

It has also been used in OpenAI's GPT-1 and GPT-2 and in Google's BERT papers. I would love to see this implemented in Keras activations.

Code from Google's BERT:

import numpy as np
import tensorflow as tf

def gelu(x):
    """Gaussian Error Linear Unit.
    This is a smoother version of the ReLU.
    Original paper: https://arxiv.org/abs/1606.08415
    Args:
        x: float Tensor to perform activation.
    Returns:
        `x` with the GELU activation applied.
    """
    cdf = 0.5 * (1.0 + tf.tanh(
        (np.sqrt(2 / np.pi) * (x + 0.044715 * tf.pow(x, 3)))))
    return x * cdf

Code from OpenAI's GPT-2:

def gelu(x):
    return 0.5*x*(1+tf.tanh(np.sqrt(2/np.pi)*(x+0.044715*tf.pow(x, 3))))
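
Both snippets implement the same tanh approximation. As a quick sanity check (my own sketch, not from either codebase), it can be compared against the exact erf-based definition x * Phi(x); the two agree very closely:

import numpy as np
from scipy.special import erf

def gelu_exact(x):
    # Exact GELU: x times the standard normal CDF.
    return 0.5 * x * (1.0 + erf(x / np.sqrt(2.0)))

def gelu_tanh(x):
    # Tanh approximation used in the BERT and GPT-2 snippets above.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * np.power(x, 3))))

x = np.linspace(-5.0, 5.0, 1001)
print(np.max(np.abs(gelu_exact(x) - gelu_tanh(x))))  # tiny, well below 1e-2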

This guy uses it, and he clearly knows what's going on:
https://github.com/borisbanushev/stockpredictionai
keras code:

import numpy as np
import tensorflow as tf

from keras.layers import Activation, Dense
from keras.utils.generic_utils import get_custom_objects

def custom_gelu(x):
    # Tanh approximation of GELU (same formula as in the BERT/GPT-2 snippets above).
    return 0.5 * x * (1 + tf.tanh(tf.sqrt(2 / np.pi) * (x + 0.044715 * tf.pow(x, 3))))

get_custom_objects().update({'custom_gelu': Activation(custom_gelu)})
fit1.add(Dense(units=1, activation=custom_gelu))  # fit1 is the Sequential model built earlier

Something's wrong with that custom activation, though: I'm getting really strange predictions that never go below ~-0.25.

It's not wrong that you are not getting values below -0.25; look at the graph of the function:
[plot of the GELU function]
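
A quick numerical check (my own sketch using the tanh approximation from the snippets above, not from the thread) confirms this: the function has a global minimum of roughly -0.17 near x ≈ -0.75, so a layer with this activation cannot output values much below that:

import numpy as np

def gelu(x):
    # Same tanh approximation of GELU as in the snippets above.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * np.power(x, 3))))

x = np.linspace(-5.0, 5.0, 10001)
y = gelu(x)
print(y.min(), x[np.argmin(y)])  # approximately -0.17 at x close to -0.75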

I know that it is starting to get confusing, but I need to make a cross-org reference: https://github.com/tensorflow/tensorflow/pull/33945

GELU is in TensorFlow: https://github.com/tensorflow/tensorflow/pull/41178. You can close this.
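
Once that PR is in an installed TensorFlow release (TF 2.4 is my assumption for the first version carrying it), the built-in activation can be used directly, e.g.:

import tensorflow as tf

# Built-in GELU, assuming a TensorFlow release that already includes the PR above.
layer = tf.keras.layers.Dense(1, activation="gelu")
print(layer(tf.constant([[-1.0], [0.0], [1.0]])))

# Functional form; approximate=True selects the tanh approximation discussed above.
print(tf.nn.gelu(tf.constant([-1.0, 0.0, 1.0]), approximate=True))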

Thank you, @bhack! I will close this issue :)
