I am trying to build a binary classification algorithm (output is 0 or 1) on a dataset that contains normal and malicious network packets. The dataset shape (after converting IP addresses and hexadecimal values to decimal) is:
Note: The final column is the output.
And the Keras model is:
from keras.models import Sequential
from keras.layers import Dense
from sklearn import preprocessing
import numpy
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
seed = 4
numpy.random.seed(seed)
dataset = numpy.loadtxt("NetworkPackets.csv", delimiter=",")
X = dataset[:, 0:11].astype(float)
Y = dataset[:, 11]
model = Sequential()
model.add(Dense(12, input_dim=11, kernel_initializer='normal', activation='relu'))
model.add(Dense(12, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal', activation='relu'))
model.compile(loss='binary_crossentropy', optimizer='Adam', metrics=['accuracy'])
model.fit(X, Y, epochs=100, batch_size=5)
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
However, I have tried different optimizers, activation functions, and numbers of layers, but the accuracy never exceeds 0.5:
I even tried a grid search for the best parameters, but the maximum is still 0.5. Does anyone know why the output is always like that, and how can I improve it? Thanks in advance!
The data also has to be standardized:
(x - x_mean) / x_std
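A minimal numpy sketch of that formula applied per feature column (the toy matrix here is illustrative, not the actual packet data):

```python
import numpy as np

# Toy feature matrix; each column is one packet feature.
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

# Standardize each column: (x - x_mean) / x_std
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Each column now has mean 0 and standard deviation 1.
print(X_std.mean(axis=0))
print(X_std.std(axis=0))
```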
Please ask questions on Stack Overflow. We already have so many issues here; many of them are still open.
Your issue is having a ReLU activation in the last layer. Use sigmoid!
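That means changing the final layer to `Dense(1, kernel_initializer='normal', activation='sigmoid')`. A quick numpy sketch of why ReLU fails as an output activation for `binary_crossentropy` (the pre-activation values are made up for illustration):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 2.0])  # example pre-activations of the output unit

# ReLU clamps every negative pre-activation to exactly 0, so the predicted
# "probability" saturates at 0 and log(0) inside binary_crossentropy blows up.
print(relu(z))     # [0. 0. 2.]

# Sigmoid always lies strictly in (0, 1), a valid probability.
print(sigmoid(z))
```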
@joelthchao Do you mean that the inputs must be normalized before being fed to the model? And if so, is there a method in Keras for doing that?
@StefanoD I used standardized_X = preprocessing.scale(X) and the result becomes:
Which is great!
But the question is: is that the right approach?
Because when I predict whether a packet is normal or malicious after training, the model predicts the wrong one :/
@myhussien I tried using that and the result becomes 0%
Your test data also has to be standardized before prediction.
@StefanoD When I standardize the data before prediction (see below), the output is all [[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]].
@Ahmid you have to use the same transformer that you fitted with the training data.
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(train_packet)  # fit on training data only
# <train here>
X_test_scaled = scaler.transform(test_packet)  # reuse the fitted statistics
preds = loaded_model.predict(X_test_scaled)
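Since `loaded_model` implies the model was saved and reloaded, the fitted scaler has to be persisted alongside it, or prediction-time code will standardize with the wrong statistics. A sketch using joblib (the filenames and the random packet arrays are illustrative assumptions):

```python
import numpy as np
import joblib
from sklearn.preprocessing import StandardScaler

# Hypothetical training/test packets with the same 11 features as the dataset.
train_packet = np.random.rand(100, 11)
test_packet = np.random.rand(10, 11)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(train_packet)

# Persist the fitted scaler next to the model weights (filename is illustrative)
# so prediction-time code can reuse the same mean/std statistics.
joblib.dump(scaler, 'scaler.pkl')

# Later, at prediction time:
scaler = joblib.load('scaler.pkl')
X_test_scaled = scaler.transform(test_packet)
```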
@avsolatorio Thank you! Solved my problem 👍
Does the data have to be standardized, normalized, or both?
What's the difference between standardized and normalized?
In the case of standardization we use the formula: _(x - mean) / standard_deviation_
While in the case of normalization we use the formula: _(x - xmin) / (xmax - xmin)_
I'm not sure there is a big difference, because the goals are similar; see Normalization on Wikipedia.
And I wouldn't mix different methods if I were you; I don't see the point. But this is just an opinion.
I would just test what works best.