Davide Santangelo

Understanding Neural Networks with Ruby

Neural networks are a fundamental part of modern artificial intelligence and machine learning. They are inspired by the structure of the human brain and are used to recognize patterns, make predictions, and process complex data. These networks consist of interconnected layers of artificial neurons that process information in a way similar to biological neurons. They excel at solving problems in diverse fields such as computer vision, natural language processing, financial forecasting, and robotics. With advancements in computational power and algorithms, neural networks have become increasingly powerful, leading to breakthroughs in self-driving cars, real-time translation, and personalized recommendations.

In this article, we will explore how neural networks work and implement a simple one in Ruby.

What is a Neural Network?

A neural network consists of layers of neurons (also called nodes) that are connected by weights. It typically has three types of layers:

  1. Input Layer: Receives the raw data.
  2. Hidden Layers: Perform calculations using weighted connections and activation functions.
  3. Output Layer: Produces the final result.

Each neuron processes the input, applies an activation function, and passes the result to the next layer.
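
As a minimal sketch with made-up numbers, here is what a single neuron with two inputs computes: a weighted sum of its inputs plus a bias, squashed by an activation function (the sigmoid, introduced below):

inputs  = [0.5, 0.8]
weights = [0.4, -0.2]
bias    = 0.1

# Weighted sum of the inputs plus the bias: 0.5*0.4 + 0.8*(-0.2) + 0.1 = 0.14
sum = inputs[0] * weights[0] + inputs[1] * weights[1] + bias

# The sigmoid activation squashes the sum into the range (0, 1)
output = 1.0 / (1.0 + Math.exp(-sum))
puts output # => approximately 0.535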

What is an Activation Function?

An activation function is applied to the output of a neuron and determines whether the neuron should be "activated" or not. There are many different activation functions, each with its own advantages and disadvantages.

Characteristics of a Good Activation Function

A good activation function should be:

  • Non-linear: This is important because it allows the network to learn complex patterns in the data.
  • Differentiable: This is important for the training process, as it allows us to calculate the gradient of the error function (see the derivative sketch after this list).
  • Computationally efficient: This is important because neural networks can be very large, and we need to be able to compute the activation function quickly.
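
To make the "differentiable" requirement concrete, here is a minimal sketch of the derivatives of the three activations introduced in the next section (the ActivationDerivatives module is just an illustrative name, not part of any library):

module ActivationDerivatives
  # sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x))
  def self.sigmoid_prime(x)
    s = 1.0 / (1.0 + Math.exp(-x))
    s * (1 - s)
  end

  # tanh'(x) = 1 - tanh(x)^2
  def self.tanh_prime(x)
    1 - Math.tanh(x)**2
  end

  # ReLU'(x) = 1 for positive x, 0 otherwise (0 is a common choice at x = 0)
  def self.relu_prime(x)
    x > 0 ? 1.0 : 0.0
  end
end

Training with gradient descent multiplies these derivatives into the error signal when updating the weights.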

Popular Activation Functions

Sigmoid

  • Outputs a value between 0 and 1.
  • Often used in the output layer of a network for binary classification problems.
class Sigmoid
  def self.call(x)
    1.0 / (1.0 + Math.exp(-x))
  end
end

Tanh (Hyperbolic Tangent)

  • Outputs a value between -1 and 1.
  • Similar to Sigmoid but zero-centered, which often makes training easier.
class Tanh
  def self.call(x)
    Math.tanh(x)
  end
end

ReLU (Rectified Linear Unit)

  • Outputs 0 if the input is negative, and the input itself if the input is positive.
  • Often used in the hidden layers of a network to help prevent the vanishing gradient problem.
class ReLU
  def self.call(x)
    [0, x].max
  end
end

Softmax (For Multi-Class Classification)

  • Converts a vector of raw scores into probabilities that sum to 1.
  • Unlike the element-wise activations above, it depends on all outputs at once, which makes it slightly more expensive to compute.
# Softmax Activation Function
class Softmax
  def self.call(values)
    # Subtracting the maximum before exponentiating avoids overflow for large inputs
    max_value = values.max
    exp_values = values.map { |v| Math.exp(v - max_value) }
    sum_exp = exp_values.sum
    exp_values.map { |v| v / sum_exp }
  end
end

puts Softmax.call([2.0, 1.0, 0.1]).inspect # => approximately [0.659, 0.242, 0.099], summing to 1

Activation functions are a key part of neural networks, enabling them to learn complex relationships in data efficiently.

Implementing a Simple Neural Network in Ruby

Let's build a single-neuron, single-layer network that takes two inputs and predicts one output.

class SimpleNeuralNetwork
  attr_accessor :weights, :bias

  def initialize
    @weights = [rand, rand]  # Two random weights
    @bias = rand             # Random bias
  end

  def forward(inputs)
    sum = inputs[0] * @weights[0] + inputs[1] * @weights[1] + @bias
    Sigmoid.call(sum)  # Using Sigmoid activation
  end

  def train(inputs, target, learning_rate = 0.1)
    prediction = forward(inputs)
    error = target - prediction

    # Adjust weights and bias: a simplified gradient step where each parameter
    # moves in proportion to the error (and, for weights, the corresponding input)
    @weights[0] += learning_rate * error * inputs[0]
    @weights[1] += learning_rate * error * inputs[1]
    @bias += learning_rate * error
  end
end

# Example Usage
nn = SimpleNeuralNetwork.new
puts nn.forward([1, 0])  # Predict output
nn.train([1, 0], 1)       # Train with expected output 1
puts nn.forward([1, 0])  # Check new prediction

Explanation:

  1. We initialize the network with random weights and bias.
  2. The forward method computes the weighted sum and applies the Sigmoid activation function.
  3. The train method nudges the weights and bias in the direction that reduces the error (a short training-loop sketch follows below).
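
As a quick illustration, here is a minimal training-loop sketch using the classes above. It teaches the network an AND-style mapping; exact numbers vary from run to run because the weights start out random:

nn = SimpleNeuralNetwork.new

# Inputs paired with their expected outputs (AND-style targets)
dataset = [
  [[0, 0], 0],
  [[0, 1], 0],
  [[1, 0], 0],
  [[1, 1], 1]
]

# Repeat the single-step training rule over the whole dataset many times
1000.times do
  dataset.each { |inputs, target| nn.train(inputs, target) }
end

dataset.each do |inputs, target|
  puts "#{inputs.inspect} -> #{nn.forward(inputs).round(3)} (target #{target})"
end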

Simple Sentiment Analysis Neural Network in Ruby

In this tutorial, we'll build a neural network from scratch in Ruby that can analyze the sentiment of text inputs. Our network will be able to classify text as negative, neutral, or positive. While this is a simplified implementation, it provides a good foundation for understanding how sentiment analysis and neural networks work.

require 'matrix'
require 'set'

class SentimentNeuralNetwork
  def initialize
    @vocabulary = Set.new
    @word_to_index = {}
    @input_size = 0
    @hidden_size = 64
    @output_size = 3  # negative, neutral, positive

    # Weights and biases are initialized once the vocabulary has been built
    @weights1 = nil
    @weights2 = nil
    @bias1 = nil
    @bias2 = nil
  end

  def preprocess_text(text)
    # Convert to lowercase and split into words
    words = text.downcase.gsub(/[^a-z\s]/, '').split

    # Remove common stop words
    stop_words = Set.new(['the', 'a', 'an', 'and', 'or', 'but', 'in', 'on', 'at', 'to', 'for', 'of', 'with', 'by'])
    words.reject! { |word| stop_words.include?(word) }

    words
  end

  def build_vocabulary(training_texts)
    training_texts.each do |text|
      words = preprocess_text(text)
      @vocabulary.merge(words)
    end

    @vocabulary.each_with_index do |word, index|
      @word_to_index[word] = index
    end

    @input_size = @vocabulary.size
    initialize_weights
  end

  def initialize_weights
    # Initialize weights with Xavier/Glorot initialization
    xavier_init1 = Math.sqrt(6.0 / (@input_size + @hidden_size))
    xavier_init2 = Math.sqrt(6.0 / (@hidden_size + @output_size))

    @weights1 = Matrix.build(@input_size, @hidden_size) { rand(-xavier_init1..xavier_init1) }
    @weights2 = Matrix.build(@hidden_size, @output_size) { rand(-xavier_init2..xavier_init2) }
    @bias1 = Matrix.build(1, @hidden_size) { 0.0 }
    @bias2 = Matrix.build(1, @output_size) { 0.0 }
  end

  def text_to_vector(text)
    vector = Array.new(@input_size, 0)
    words = preprocess_text(text)

    words.each do |word|
      if @word_to_index.key?(word)
        vector[@word_to_index[word]] += 1
      end
    end

    # Normalize the vector
    sum = vector.sum.to_f
    sum = 1.0 if sum == 0
    vector.map! { |x| x / sum }

    vector
  end

  def sigmoid(x)
    1.0 / (1.0 + Math.exp(-x))
  end

  def softmax(x)
    exp_x = x.map { |val| Math.exp(val) }
    sum = exp_x.sum
    exp_x.map { |val| val / sum }
  end

  def forward(input_vector)
    input_matrix = Matrix[input_vector]

    # Hidden layer
    hidden = (input_matrix * @weights1 + @bias1).map { |x| sigmoid(x) }

    # Output layer
    output = (hidden * @weights2 + @bias2).to_a[0]
    softmax(output)
  end

  def analyze_sentiment(text)
    # Convert text to vector
    input_vector = text_to_vector(text)

    # Forward pass
    output = forward(input_vector)

    # Get prediction
    sentiment_index = output.index(output.max)
    sentiment = ['negative', 'neutral', 'positive'][sentiment_index]
    confidence = output[sentiment_index]

    {
      sentiment: sentiment,
      confidence: confidence,
      probabilities: {
        negative: output[0],
        neutral: output[1],
        positive: output[2]
      }
    }
  end

  # Pre-train the network with some basic examples
  def pretrain
    training_data = [
      ["I love this, it's amazing!", "positive"],
      ["This is great!", "positive"],
      ["What a wonderful day", "positive"],
      ["I don't like this at all", "negative"],
      ["This is terrible", "negative"],
      ["I hate this", "negative"],
      ["It's okay", "neutral"],
      ["This is fine", "neutral"],
      ["Not bad, not great", "neutral"]
    ]

    build_vocabulary(training_data.map(&:first))

    # Simple training loop (in a real implementation, this would be more sophisticated)
    training_data.each do |text, label|
      input_vector = text_to_vector(text)
      target = case label
               when "negative" then [1, 0, 0]
               when "neutral" then [0, 1, 0]
               when "positive" then [0, 0, 1]
               end

      # Compute the prediction error (a full implementation would use it to update the weights)
      output = forward(input_vector)
      error = target.zip(output).map { |t, o| t - o }

      # Backpropagation would go here in a full implementation
    end
  end
end

# Example usage
if __FILE__ == $0
  # Create and train the network
  network = SentimentNeuralNetwork.new
  network.pretrain

  # Test some examples
  test_texts = [
    "I really love this product!",
    "This is the worst experience ever",
    "It's an okay service, nothing special",
    "The quality is amazing",
    "I'm not sure how I feel about this"
  ]

  puts "\nSentiment Analysis Results:\n\n"
  test_texts.each do |text|
    result = network.analyze_sentiment(text)
    puts "Text: #{text}"
    puts "Sentiment: #{result[:sentiment]} (Confidence: #{(result[:confidence] * 100).round(2)}%)"
    puts "Probabilities:"
    result[:probabilities].each do |sentiment, prob|
      puts "  #{sentiment}: #{(prob * 100).round(2)}%"
    end
    puts "\n"
  end
end

Breaking Down the Implementation

1. Network Architecture

Our neural network has three layers:

def initialize
  @vocabulary = Set.new
  @word_to_index = {}
  @input_size = 0          # Will be set based on vocabulary size
  @hidden_size = 64        # 64 neurons in hidden layer
  @output_size = 3         # 3 possible outputs (negative, neutral, positive)

  # Weights and biases
  @weights1 = nil
  @weights2 = nil
  @bias1 = nil
  @bias2 = nil
end

  • Input Layer: Size depends on vocabulary size
  • Hidden Layer: 64 neurons with sigmoid activation
  • Output Layer: 3 neurons with softmax activation

2. Text Preprocessing

The text preprocessing step is crucial for converting raw text into a format our neural network can understand:

def preprocess_text(text)
  # Convert to lowercase and split into words
  words = text.downcase.gsub(/[^a-z\s]/, '').split

  # Remove common stop words
  stop_words = Set.new(['the', 'a', 'an', 'and', 'or', 'but', 'in', 
                       'on', 'at', 'to', 'for', 'of', 'with', 'by'])
  words.reject! { |word| stop_words.include?(word) }

  words
end

This method:

  1. Converts text to lowercase
  2. Removes punctuation using regex
  3. Splits into words
  4. Removes common stop words
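
For example, running the method on a made-up review (using the class defined above):

network = SentimentNeuralNetwork.new
p network.preprocess_text("The service was GREAT, and I loved it!")
# => ["service", "was", "great", "i", "loved", "it"]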

3. Vocabulary Building

The vocabulary system converts words into numerical vectors:

def build_vocabulary(training_texts)
  training_texts.each do |text|
    words = preprocess_text(text)
    @vocabulary.merge(words)
  end

  @vocabulary.each_with_index do |word, index|
    @word_to_index[word] = index
  end

  @input_size = @vocabulary.size
  initialize_weights
end

This creates a mapping between words and indices, which is used to create input vectors.
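
For instance, with a tiny made-up corpus of two reviews:

network = SentimentNeuralNetwork.new
network.build_vocabulary(["good movie", "bad movie"])

# Each distinct word receives a stable index (Ruby's Set preserves insertion order)
p network.instance_variable_get(:@word_to_index)
# => {"good"=>0, "movie"=>1, "bad"=>2}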

4. Weight Initialization

We use Xavier/Glorot initialization, which scales the random starting weights by the layer sizes so that signals neither vanish nor blow up as they pass through the network, giving better training stability:

def initialize_weights
  xavier_init1 = Math.sqrt(6.0 / (@input_size + @hidden_size))
  xavier_init2 = Math.sqrt(6.0 / (@hidden_size + @output_size))

  @weights1 = Matrix.build(@input_size, @hidden_size) { rand(-xavier_init1..xavier_init1) }
  @weights2 = Matrix.build(@hidden_size, @output_size) { rand(-xavier_init2..xavier_init2) }
  @bias1 = Matrix.build(1, @hidden_size) { 0.0 }
  @bias2 = Matrix.build(1, @output_size) { 0.0 }
end

5. Text to Vector Conversion

Converting text input into numerical vectors:

def text_to_vector(text)
  vector = Array.new(@input_size, 0)
  words = preprocess_text(text)

  words.each do |word|
    if @word_to_index.key?(word)
      vector[@word_to_index[word]] += 1
    end
  end

  # Normalize the vector
  sum = vector.sum.to_f
  sum = 1.0 if sum == 0
  vector.map! { |x| x / sum }

  vector
end
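
Continuing the tiny two-review example from the vocabulary section ("good"=>0, "movie"=>1, "bad"=>2), the result is a normalized bag-of-words vector:

p network.text_to_vector("good good movie")
# => [0.6666666666666666, 0.3333333333333333, 0.0]
# "good" appears twice, "movie" once, and each count is divided by the total word count (3)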

6. Forward Propagation

The forward pass through the network:

def forward(input_vector)
  input_matrix = Matrix[input_vector]

  # Hidden layer
  hidden = (input_matrix * @weights1 + @bias1).map { |x| sigmoid(x) }

  # Output layer
  output = (hidden * @weights2 + @bias2).to_a[0]
  softmax(output)
end
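
A quick sanity check on the same toy network: whatever the (still untrained) weights happen to be, softmax guarantees that the output is a valid probability distribution over the three classes:

probs = network.forward(network.text_to_vector("good movie"))
puts probs.length        # => 3
puts probs.sum.round(6)  # => 1.0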

7. Sentiment Analysis

The main method for analyzing text sentiment:

def analyze_sentiment(text)
  # Convert text to vector
  input_vector = text_to_vector(text)

  # Forward pass
  output = forward(input_vector)

  # Get prediction
  sentiment_index = output.index(output.max)
  sentiment = ['negative', 'neutral', 'positive'][sentiment_index]
  confidence = output[sentiment_index]

  {
    sentiment: sentiment,
    confidence: confidence,
    probabilities: {
      negative: output[0],
      neutral: output[1],
      positive: output[2]
    }
  }
end

Using the Network

Here's how to use the sentiment analysis network:

# Create and train the network
network = SentimentNeuralNetwork.new
network.pretrain

# Analyze some text
result = network.analyze_sentiment("I really love this product!")
puts result[:sentiment]           # => "positive"
puts result[:confidence]          # => confidence score
puts result[:probabilities]       # => Hash of all probabilities

Limitations and Possible Improvements

This implementation has several limitations:

  1. Simple Architecture: The network uses a basic feed-forward architecture. More complex architectures like LSTM or transformers would perform better.

  2. Limited Training: The pretrain method builds the vocabulary and computes errors, but never actually updates the weights. A production system would need:

    • Larger training dataset
    • Proper backpropagation (a minimal sketch of a single output-layer update follows this list)
    • Cross-validation
    • Learning rate optimization
  3. Basic Text Processing: The text preprocessing could be improved with:

    • Better tokenization
    • Lemmatization
    • N-gram support
    • Word embeddings
  4. No Context Understanding: The network doesn't understand context, sarcasm, or complex language patterns.
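
As referenced in the limitations above, here is a minimal sketch of what a single backpropagation step for the output layer could look like. It is not part of the article's class: it assumes access to the hidden-layer activations from the forward pass (as a plain array), and that the softmax output is paired with a cross-entropy loss, in which case the error signal at the output logits is simply output minus target:

require 'matrix'

# Hypothetical helper: one gradient step for the output-layer parameters,
# given the hidden activations and the softmax output from a forward pass.
def update_output_layer(hidden, output, target, weights2, bias2, learning_rate = 0.1)
  # With softmax + cross-entropy, the gradient at the output logits is (output - target)
  delta = output.zip(target).map { |o, t| o - t }

  # Outer product hidden^T * delta gives the weight gradient (hidden_size x output_size)
  grad_w2 = Matrix.build(hidden.size, delta.size) { |i, j| hidden[i] * delta[j] }
  grad_b2 = Matrix[delta]

  # Move the parameters against the gradient
  [weights2 - grad_w2 * learning_rate, bias2 - grad_b2 * learning_rate]
end

Extending this to the hidden layer would mean propagating delta back through weights2, multiplying by the sigmoid derivative, and repeating the whole loop over many epochs.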

This implementation provides a foundation for understanding how sentiment analysis works with neural networks. While it's not production-ready, it demonstrates the key concepts and can be extended for more sophisticated applications.

Remember that real-world sentiment analysis systems typically use more advanced techniques and pre-trained models, but building a simple version helps understand the fundamentals.

Conclusion

Neural networks are powerful tools for solving complex problems. In this article, we implemented a simple neural network in Ruby, explored different activation functions, and saw how learning happens. While Ruby is not the most common language for deep learning, this implementation provides a fundamental understanding of how neural networks operate under the hood.

