MLP
A Multilayer Perceptron (MLP) is a type of feedforward artificial neural network, consisting of:
- fully connected neurons with a nonlinear activation function
- organised in at least three layers:
  - an input layer
  - one or more hidden layers
  - an output layer
MLP properties breakdown
- Neurons: The basic building blocks of an MLP are individual neurons, also known as perceptrons
  - Each neuron takes in multiple inputs, applies weights to them, and produces an output based on an activation function
- Fully connected: Each neuron in one layer is connected to every neuron in the next layer
- Nonlinear activation function: MLPs use nonlinear activation functions (e.g., ReLU, sigmoid, softmax), which introduce non-linearity into the network and enable it to learn and represent complex patterns in the data
- Layers: An MLP consists of three or more layers: an input layer, one or more hidden layers, and an output layer
  - The input layer receives the input data, the hidden layers perform computations, and the output layer produces the final output of the network
- Feedforward: MLPs are feedforward networks, meaning that information flows in one direction, from the input layer to the output layer, without any feedback loops
  - This architecture makes them suitable for tasks such as classification and regression, where the output depends solely on the input data
- Training: MLPs are trained using a combination of forward propagation, where input data is passed through the network to generate an output, and backpropagation, where the network’s weights are adjusted based on the difference between the actual output and the expected output (see the sketch after this list)
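To make the forward-propagation/backpropagation loop concrete, here is a minimal NumPy sketch of a 2-4-1 MLP trained with plain gradient descent on XOR. The data, layer sizes, learning rate, and step count are illustrative choices, not part of these notes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy problem: learn XOR with a 2 -> 4 (ReLU) -> 1 MLP and MSE loss
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)  # input -> hidden
W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)  # hidden -> output
lr = 0.1

for step in range(5000):
    # Forward propagation: affine transform + nonlinear activation per layer
    z1 = X @ W1 + b1
    h = np.maximum(z1, 0)        # ReLU hidden layer
    y_hat = h @ W2 + b2          # linear output

    # Backpropagation: chain rule from the MSE loss back to each weight
    grad_out = 2 * (y_hat - y) / len(X)  # dLoss/dy_hat
    grad_W2 = h.T @ grad_out
    grad_b2 = grad_out.sum(axis=0)
    grad_h = grad_out @ W2.T
    grad_z1 = grad_h * (z1 > 0)          # ReLU derivative
    grad_W1 = X.T @ grad_z1
    grad_b1 = grad_z1.sum(axis=0)

    # Gradient descent: adjust weights against the gradient
    W1 -= lr * grad_W1; b1 -= lr * grad_b1
    W2 -= lr * grad_W2; b2 -= lr * grad_b2

print(np.round(y_hat, 2))  # should approach [[0], [1], [1], [0]]
```

Each training step is exactly the two phases described above: a forward pass to compute predictions, then a backward pass that propagates the loss gradient to every weight.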
Regression MLPs
MLPs can be used for regression tasks.
- To predict a single value (e.g., the output of a sensor), you just need a single output neuron: its output is the predicted value
- For multivariate regression (i.e., to predict multiple values at once), you need one output neuron per output dimension
  - Example: to predict a position as 2D coordinates (latitude and longitude), you need two output neurons
> [!CAUTION]
> When building an MLP for regression, you do not need any activation function for the output neurons, so they are free to output any range of values.

- If you want to guarantee that the output will always be ≥ 0, you can use the ReLU activation function in the output layer
- If you want to guarantee that the predictions will fall within a given range of values, you can use the logistic function and scale the labels to its output range (0 to 1)
- The loss function to use during training is typically the mean squared error (MSE), as in the sketch below
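As a hedged Keras sketch of these choices (the feature count and layer widths are assumptions for the example), a regression MLP ends in a single `Dense` neuron with no activation and is trained with MSE:

```python
import tensorflow as tf

# Illustrative regression MLP: 8 input features -> one predicted value
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),                    # 8 features (assumed)
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),                      # no activation: any output range
])
model.compile(optimizer="adam", loss="mse")        # mean squared error

# To guarantee a non-negative output: Dense(1, activation="relu")
# To bound the output to a range:     Dense(1, activation="sigmoid"),
#                                     with labels scaled to [0, 1]
```

For multivariate regression (e.g., latitude and longitude), the output layer would simply become `Dense(2)`.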
Classification MLPs
MLPs can also be used for classification tasks.
- For a binary classification problem, you just need a single output neuron using the logistic activation function
  - the output will be a number between 0 and 1, which you can interpret as the estimated probability of the positive class
  - the estimated probability of the negative class is equal to one minus that number
- MLPs can also easily handle multilabel binary classification tasks (e.g., benign/malicious and incoming/outgoing traffic), using one logistic output neuron per label
- If each instance can belong only to a single class (e.g., classes 0 through 9 for network traffic classification), then you need one output neuron per class, and you should use the softmax activation function for the whole output layer, as shown in the sketch below.
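A minimal Keras sketch of a multiclass MLP (the feature count, layer width, and 10-class traffic example are illustrative assumptions):

```python
import tensorflow as tf

# Illustrative multiclass MLP: 20 input features, 10 traffic classes (assumed)
clf = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),  # one output neuron per class
])
clf.compile(optimizer="adam",
            loss="sparse_categorical_crossentropy",   # integer class labels
            metrics=["accuracy"])

# Binary classification: Dense(1, activation="sigmoid"), loss="binary_crossentropy"
# Multilabel:            Dense(n_labels, activation="sigmoid"), loss="binary_crossentropy"
```

Softmax makes the 10 outputs sum to 1, so they can be read as a probability distribution over mutually exclusive classes; the sigmoid variants in the closing comments keep each output independent, matching the binary and multilabel cases above.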