Introduction to NN

Neural networks consist of interconnected nodes, or “neurons”, organized into layers. Information flows through the network, which learns to make decisions or predictions based on the input data.

  • The first layer is the input layer, which receives the raw data
  • The last layer is the output layer, which produces the final prediction or classification
  • Between the input and output layers, there can be one or more hidden layers where computation and learning occur
  • Deep learning refers to networks with many hidden layers, allowing them to learn intricate patterns

NN layers

Input Layer

The first layer of the network is the input layer.

  • This layer consists of nodes (also called neurons) that receive the initial input data
  • Each node in the input layer represents a feature or attribute of the input data

Hidden Layers

Each hidden layer contains nodes that perform computations on the input data.

These computations involve weighted sums of inputs followed by activation functions, allowing the network to learn complex patterns and representations of the input data.
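A minimal sketch of what one hidden layer computes: a weighted sum of its inputs plus a bias, passed through an activation function (ReLU here; all weights and inputs are made-up illustrative values).

```python
import numpy as np

def relu(z):
    """ReLU activation: max(0, z) applied element-wise."""
    return np.maximum(0, z)

# One hidden layer with 3 neurons fed by 2 input features.
x = np.array([0.5, -1.0])            # input vector (2 features)
W = np.array([[ 0.2, -0.4],
              [ 0.7,  0.1],
              [-0.3,  0.5]])         # weight matrix: 3 neurons x 2 inputs
b = np.array([0.1, 0.0, -0.2])       # one bias per neuron

z = W @ x + b                        # weighted sums, one per neuron
a = relu(z)                          # activations passed to the next layer
print(a)                             # [0.6, 0.25, 0.0]
```

Each row of `W` holds the weights of one neuron, so the whole layer is a single matrix–vector product.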

Output Layer

The nodes in the output layer produce the network’s predictions or classifications based on the computations performed in the hidden layers.

The number of nodes in the output layer depends on the nature of the task (e.g., binary or multi-class classification).
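For example, a binary classifier typically uses a single sigmoid output node, while a K-class classifier uses K softmax nodes (the score values below are arbitrary, for illustration only):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

# Binary classification: one output node; sigmoid gives the
# probability of the positive class.
p_binary = sigmoid(0.8)

# Multi-class classification: one node per class; softmax gives a
# probability distribution over the classes.
p_multi = softmax(np.array([2.0, 1.0, 0.1]))
print(p_binary, p_multi)
```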

Multi-Layer Perceptron

An MLP is a fully connected feedforward neural network:

  • Each neuron in a layer is connected to every neuron in the subsequent layer
  • The term multilayer signifies the presence of multiple hidden layers between the input and output layers

![[mlp.png]]
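The fully connected structure above can be sketched as a forward pass in which each layer is a dense weight matrix: every neuron receives input from every neuron in the previous layer. The layer sizes and random weights below are illustrative assumptions, not trained values.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def forward(x, layers):
    """Forward pass through dense layers: each (W, b) pair connects
    every neuron in one layer to every neuron in the next."""
    a = x
    for W, b in layers:
        a = relu(W @ a + b)
    return a

rng = np.random.default_rng(0)
# 4 input features -> two hidden layers of 8 neurons -> 3 outputs.
sizes = [4, 8, 8, 3]
layers = [(rng.normal(size=(n_out, n_in)), np.zeros(n_out))
          for n_in, n_out in zip(sizes, sizes[1:])]

y = forward(rng.normal(size=4), layers)
print(y.shape)  # (3,)
```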

Shallow vs Deep Neural Network

  • Shallow neural networks (e.g., logistic regression, which can be viewed as a network with no hidden layers) are relatively simple and have limited capacity to learn complex patterns and representations
  • Deep neural networks have a greater capacity to learn intricate features and hierarchical representations from the data, allowing them to capture more complex patterns and relationships
  • Feature engineering:
    • Shallow networks often require handcrafted features to be extracted from the input data before they can be fed into the model (e.g., rotated features for decision trees, polynomial features for linear models)
    • Deep neural networks can learn to perform a complex task directly from raw input data, without the need for manual feature engineering
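The feature-engineering point can be made concrete with XOR: it is not linearly separable in the raw inputs (x1, x2), but adding the handcrafted product feature x1·x2 makes a linear rule work. The weights below are hand-picked for illustration, not learned.

```python
# XOR truth table: label is 1 exactly when the inputs differ.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]

for (x1, x2), y in zip(X, labels):
    # Linear score over (x1, x2, x1*x2) with hand-picked weights
    # (1, 1, -2): with the engineered feature, XOR becomes separable.
    score = x1 + x2 - 2 * (x1 * x2)
    pred = 1 if score > 0.5 else 0
    assert pred == y
print("linearly separable with the engineered feature")
```

A deep network would instead learn an equivalent intermediate representation in its hidden layers, directly from the raw (x1, x2) inputs.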

Logistic regression vs a 2-layer MLP

![[logistic_vs_mlp.png]]

Example: The XOR operator

A logistic regression model, which is only able to learn linear decision boundaries, would not be able to correctly classify the XOR data.

A multilayer perceptron, however, can solve the problem. ![[xor.png]]

XOR classification problem and an MLP that solves it (source: Géron, A. (2022). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. O’Reilly Media, Inc.)
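One hand-crafted MLP that computes XOR (a common textbook construction; these are not necessarily the exact weights shown in the figure): two hidden neurons compute OR-like and NAND-like functions, and the output neuron ANDs them together.

```python
import numpy as np

def step(z):
    """Heaviside step activation (threshold at 0)."""
    return (z >= 0).astype(int)

def xor_mlp(x1, x2):
    x = np.array([x1, x2])
    # Hidden layer: first neuron fires when x1 OR x2,
    # second neuron fires when NOT (x1 AND x2).
    W1 = np.array([[ 1,  1],
                   [-1, -1]])
    b1 = np.array([-0.5, 1.5])
    h = step(W1 @ x + b1)
    # Output neuron: AND of the two hidden units gives XOR.
    return int(step(np.array([1, 1]) @ h - 1.5))

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, "->", xor_mlp(x1, x2))  # 0, 1, 1, 0
```

The hidden layer transforms the inputs into a space where the classes are linearly separable, which the single output neuron can then split — exactly what logistic regression alone cannot do.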