Architectures

A feedforward neural network (FNN) is a type of artificial neural network characterised by the unidirectional flow of information between its layers

  • In an FNN, information moves in only one direction: from the input nodes, through the hidden nodes (if any), to the output nodes, without any cycles or loops
  • This is in contrast to recurrent neural networks, which contain feedback connections and therefore cycles in their information flow
  • Training with backpropagation: Modern FNNs are typically trained using the backpropagation method, which involves iteratively adjusting the weights and biases of the network to minimise the difference between the predicted outputs and the true outputs
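The training loop above can be sketched in plain NumPy for a tiny network on the XOR toy problem (the layer sizes, learning rate, and task are illustrative choices, not a prescribed setup):

```python
import numpy as np

# Minimal sketch of backpropagation training for a small feedforward network
# on the XOR toy problem (all sizes and hyperparameters are illustrative).
rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs
y = np.array([[0], [1], [1], [0]], dtype=float)              # XOR targets

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Weights and biases for a 2 -> 4 -> 1 network
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

lr, losses = 1.0, []
for step in range(2000):
    # Forward pass: input -> hidden -> output (one direction only)
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    losses.append(np.mean((out - y) ** 2))

    # Backward pass: propagate the error gradient from output to input
    d_out = 2 * (out - y) / len(X) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # Iteratively adjust weights and biases to reduce the loss
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)
```

The loss recorded at the first iteration should be noticeably larger than the loss after training, illustrating the "iteratively adjusting the weights and biases" step.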

Types of FNNs

  • Multilayer Perceptron (MLP): a feedforward artificial neural network consisting of fully connected neurons with a nonlinear activation function, organised in at least three layers: an input layer, one or more hidden layers, and an output layer
  • Autoencoders: FNNs used for unsupervised learning, where the network learns to encode input data into a reduced-dimensional latent space and then reconstruct the input from the encoded representation
  • Convolutional Neural Networks (CNNs): Specialised for grid-like data, using convolutional and pooling layers for feature extraction from images, videos, and other spatial data

The main difference between CNNs and MLPs is that CNN layers arrange their neurons in three dimensions (width, height, and depth), and not all neurons in one layer are fully connected to the neurons in the next layer

  • Recurrent Neural Networks (RNNs): Designed for sequential data, RNNs have connections that form directed cycles, allowing information to persist, whereas in a feedforward network information travels in one direction: from the input layer through the hidden layers to the output layer
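The MLP structure described above (fully connected layers with a nonlinear activation) can be sketched as a single forward pass; the layer sizes below are illustrative:

```python
import numpy as np

# Sketch of an MLP forward pass: fully connected layers with a nonlinear
# activation, organised as input -> hidden -> output (sizes illustrative).
rng = np.random.default_rng(1)

x = rng.normal(size=3)                            # input layer: 3 features

W_h, b_h = rng.normal(size=(3, 5)), np.zeros(5)   # hidden layer: 5 neurons
W_o, b_o = rng.normal(size=(5, 2)), np.zeros(2)   # output layer: 2 neurons

h = np.maximum(0.0, x @ W_h + b_h)   # nonlinear activation (ReLU)
out = h @ W_o + b_o                  # information never flows backwards

print(out.shape)
```

Every neuron in each layer is connected to every neuron in the next layer, which is exactly the "fully connected" property that CNNs relax.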

Convolutional Neural Networks (CNNs)

A Convolutional Neural Network (CNN) is a specialised type of neural network designed for processing grid-like data, such as images and videos, as well as array-like representations of network traffic

  • The defining feature of CNNs is the convolution operation applied to the input data
  • Convolution involves sliding small filters (also known as kernels) over the input image to perform element-wise multiplications and summations
  • These operations enable CNNs to detect patterns, edges, textures, and more complex features in the input data
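The sliding-filter operation described above can be sketched directly; the tiny image and the edge-detecting kernel below are illustrative (no padding or striding, i.e. a "valid" convolution):

```python
import numpy as np

# Sketch of the convolution operation: slide a small filter (kernel) over
# the input and, at each position, take element-wise products and sum them.
def conv2d(image, kernel):
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1   # output height ("valid" mode)
    ow = image.shape[1] - kw + 1   # output width
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)  # multiply + sum
    return out

# A vertical-edge-detecting kernel applied to a tiny two-tone image
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
kernel = np.array([[1, -1],
                   [1, -1]], dtype=float)

result = conv2d(image, kernel)
print(result)
```

The output responds only at the boundary between the dark and bright regions, which is how convolution detects edges and other local patterns.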

Recurrent Neural Networks (RNNs)

A Recurrent Neural Network (RNN) is a type of artificial neural network designed to handle sequential data or data with a temporal aspect

  • Each neuron of an RNN layer receives both the input vector x(t) and the output of the previous time step y(t−1)
  • Since the output of a recurrent neuron is a function of all the inputs from previous time steps, we can say it has a form of memory
  • This makes them well-suited for tasks where the order of the data matters, such as time series prediction, speech recognition, language modelling, and translation
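The recurrence above can be sketched by unrolling a single recurrent layer over a short sequence; the sizes, tanh activation, and random data are illustrative:

```python
import numpy as np

# Sketch of a recurrent layer unrolled over time: at each step t the layer
# sees the current input x(t) AND its own previous output y(t-1), which
# acts as a simple form of memory (sizes and data are illustrative).
rng = np.random.default_rng(2)

W_x = rng.normal(size=(3, 4))   # input-to-layer weights
W_y = rng.normal(size=(4, 4))   # recurrent (previous-output) weights
b = np.zeros(4)

xs = rng.normal(size=(6, 3))    # a sequence of 6 input vectors
y = np.zeros(4)                 # y(0): no previous output yet

for x_t in xs:
    # y(t) depends on x(t) and y(t-1), hence on the whole input history
    y = np.tanh(x_t @ W_x + y @ W_y + b)

print(y.shape)
```

Because each output feeds into the next step, the final y carries information from every earlier input in the sequence, which is the "memory" the bullet points describe.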

Autoencoders

An Autoencoder is a type of neural network that is capable of learning the latent representation of the input data

  • At training time, an autoencoder learns to copy its input to its output
  • Since the internal representation has a lower dimensionality than the input data, the autoencoder cannot simply pass the input through the internal layers and produce an exact copy as the output
  • It is forced to learn the most important features and discard the unimportant ones
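The bottleneck idea above can be sketched with a linear autoencoder: the data below secretly has only 2 underlying factors, so a 2-dimensional latent space is enough to reconstruct the 4-dimensional inputs (all sizes, the learning rate, and the synthetic data are illustrative):

```python
import numpy as np

# Sketch of a linear autoencoder with a 2-dimensional bottleneck trained to
# copy 4-dimensional inputs to its output (data and sizes are illustrative).
rng = np.random.default_rng(3)

# Data with only 2 underlying factors, embedded in 4 dimensions
Z = rng.normal(size=(200, 2))
X = Z @ rng.normal(size=(2, 4))

W_enc = rng.normal(size=(4, 2)) * 0.1   # encoder: 4 -> 2 (the bottleneck)
W_dec = rng.normal(size=(2, 4)) * 0.1   # decoder: 2 -> 4

lr, losses = 0.01, []
for step in range(500):
    code = X @ W_enc           # lower-dimensional latent representation
    recon = code @ W_dec       # attempted copy of the input
    err = recon - X
    losses.append(np.mean(err ** 2))

    # Gradient descent on the reconstruction error
    d = 2 * err / len(X)
    grad_dec = code.T @ d
    grad_enc = X.T @ (d @ W_dec.T)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc
```

Since the 2-dimensional code cannot hold an exact copy of an arbitrary 4-dimensional input, training forces the encoder to keep the two directions that matter and discard the rest.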

Generative Adversarial Networks (GANs)

Generative Adversarial Networks (GANs) consist of two neural networks, a generator and a discriminator, which are trained simultaneously through adversarial training

  • The generator network takes random noise or a random vector as input and transforms it into data that is meant to resemble some specific kind of data (e.g., traffic flows, images, etc.)
    • Essentially, it generates new, synthetic data samples
    • The generator’s goal is to produce data points that are indistinguishable from real data
  • The discriminator network evaluates the authenticity of a given data sample, determining whether it is real (from the actual dataset) or fake (generated by the generator)
    • The discriminator’s goal is to correctly classify real and generated data
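The two adversarial objectives can be sketched for a single batch; here a linear map stands in for the generator and a logistic-regression scorer for the discriminator (both are illustrative stand-ins for real neural networks, and no training step is shown):

```python
import numpy as np

# Sketch of the two GAN objectives on one batch: the generator maps random
# noise to synthetic samples, and the discriminator scores each sample as
# real or fake (all components here are illustrative stand-ins).
rng = np.random.default_rng(4)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

G = rng.normal(size=(2, 3))          # "generator": noise -> sample
w_d, b_d = rng.normal(size=3), 0.0   # "discriminator" weights

real = rng.normal(loc=5.0, size=(8, 3))   # batch of real data
noise = rng.normal(size=(8, 2))           # random input vectors
fake = noise @ G                          # generated (synthetic) samples

eps = 1e-7  # numerical safety for the logarithms
p_real = np.clip(sigmoid(real @ w_d + b_d), eps, 1 - eps)
p_fake = np.clip(sigmoid(fake @ w_d + b_d), eps, 1 - eps)

# Discriminator wants p_real -> 1 and p_fake -> 0
d_loss = -np.mean(np.log(p_real) + np.log(1 - p_fake))
# Generator wants the discriminator fooled: p_fake -> 1
g_loss = -np.mean(np.log(p_fake))
```

Training alternates between lowering d_loss (updating the discriminator) and lowering g_loss (updating the generator), which is the adversarial game the bullets describe.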