Deeplearning4j: Performs convolutions over the embedded word vectors using multiple filter sizes

Created on 29 Jan 2016 · 5 Comments · Source: eclipse/deeplearning4j

It would be useful to support something similar to the approach described in
http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/
and in the paper "Convolutional Neural Networks for Sentence Classification":
http://arxiv.org/abs/1408.5882

A description to make the request clearer:
"The first layers embeds words into low-dimensional vectors. The next layer performs convolutions over the embedded word vectors using multiple filter sizes. For example, sliding over 3, 4 or 5 words at a time. Next, we max-pool the result of the convolutional layer into a long feature vector, add dropout regularization, and classify the result using a softmax layer."

Thanks.
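
For readers who want to see the quoted architecture in concrete terms, here is a minimal sketch in plain Python of the forward pass: embed the words, convolve with filters spanning 3, 4 and 5 words, max-pool each filter's responses over the sentence, and classify with softmax. It is not DL4J code and not the code from the linked post. The embedding size, number of filters, class count, toy vocabulary and random weights are all assumptions made for illustration, and dropout is left out because it only applies during training.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the toy weights are reproducible

# All sizes below are illustrative assumptions, not values from the issue.
EMBED_DIM = 4
FILTER_WIDTHS = (3, 4, 5)   # window sizes named in the quoted description
FILTERS_PER_WIDTH = 2
NUM_CLASSES = 2


def rand_matrix(rows, cols):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


SENTENCE = "the movie was not bad at all".split()
vocab = {word: index for index, word in enumerate(SENTENCE)}
embedding = rand_matrix(len(vocab), EMBED_DIM)
filters = {
    width: [rand_matrix(width, EMBED_DIM) for _ in range(FILTERS_PER_WIDTH)]
    for width in FILTER_WIDTHS
}
out_weights = rand_matrix(NUM_CLASSES, len(FILTER_WIDTHS) * FILTERS_PER_WIDTH)


def conv_max_over_time(embedded, filt):
    """Slide one filter over consecutive word vectors, apply ReLU, keep the max."""
    width = len(filt)  # number of words the filter spans
    responses = []
    for start in range(len(embedded) - width + 1):
        window = embedded[start:start + width]
        # Dot product between the filter and the window of word vectors.
        total = sum(
            w * x
            for filt_row, vec in zip(filt, window)
            for w, x in zip(filt_row, vec)
        )
        responses.append(max(0.0, total))  # ReLU
    return max(responses)


def classify(words):
    """Return (pooled feature vector, class probabilities) for one sentence.

    The sentence must be at least as long as the widest filter.
    """
    embedded = [embedding[vocab[w]] for w in words]
    features = [
        conv_max_over_time(embedded, filt)
        for width in FILTER_WIDTHS
        for filt in filters[width]
    ]
    scores = [sum(w * x for w, x in zip(row, features)) for row in out_weights]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return features, [e / total for e in exps]


features, probs = classify(SENTENCE)
print(len(features), probs)
```

The point to notice is that the three filter widths all read the same embedded sentence in parallel and their pooled outputs are concatenated, which is the part that does not fit a simple stack of layers.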

Enhancement

zzyxzz

Most helpful comment

Right. That's what I was mentioning earlier. Mainly flagging for later.

On Fri, Jan 29, 2016 at 4:28 PM, Alex Black wrote:

Just a heads up: this sort of architecture should be relatively
straightforward with the computation graph architecture we've been working
on. The idea with that is that you define a network using an arbitrary
directed acyclic graph connection structure, instead of just a stack of
layers as in MultiLayerNetwork.
This should be merged to master soon.

agibsonccc on 30 Jan 2016

All 5 comments

Just a heads up: this sort of architecture should be relatively straightforward with the computation graph architecture we've been working on. The idea with that is that you define a network using an arbitrary directed acyclic graph connection structure, instead of just a stack of layers as in MultiLayerNetwork.
This should be merged to master soon.

AlexDBlack on 30 Jan 2016
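
To illustrate the idea of defining a network as a directed acyclic graph rather than a stack of layers, here is a small sketch in plain Python. It is not the DL4J computation graph API: the vertex names, the `run` helper and the toy `window_max` operation (a stand-in for a convolution followed by max pooling) are hypothetical and chosen only for this example. It requires Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter


def window_max(width):
    """Toy stand-in for 'convolve with this window width, then max-pool'."""
    def apply(seq):
        return max(sum(seq[i:i + width]) for i in range(len(seq) - width + 1))
    return apply


# Each vertex maps to (names of its input vertices, function of those inputs).
# Three branches read the same input and are merged afterwards, which a
# purely sequential stack of layers cannot express.
graph = {
    "conv3": (["input"], window_max(3)),
    "conv4": (["input"], window_max(4)),
    "conv5": (["input"], window_max(5)),
    "merge": (["conv3", "conv4", "conv5"], lambda a, b, c: [a, b, c]),
    "output": (["merge"], lambda feats: sum(feats)),
}


def run(graph, inputs):
    """Evaluate every vertex once, visiting predecessors before successors."""
    values = dict(inputs)
    dependencies = {name: set(deps) for name, (deps, _) in graph.items()}
    for name in TopologicalSorter(dependencies).static_order():
        if name in values:  # external input, already supplied by the caller
            continue
        deps, fn = graph[name]
        values[name] = fn(*(values[d] for d in deps))
    return values


values = run(graph, {"input": [1, 2, 3, 4, 5, 6]})
print(values["merge"], values["output"])
```

A stack of layers is the special case where every vertex has exactly one input, the vertex before it. Allowing several vertices to share an input and a later vertex to take several inputs is what makes the multi-filter-size architecture from the issue expressible.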
