Neural networks are a fundamental part of modern artificial intelligence and machine learning. They are inspired by the structure of the human brain and are used to recognize patterns, make predictions, and process complex data. These networks consist of interconnected layers of artificial neurons that process information in a way similar to biological neurons. They excel at solving problems in diverse fields such as computer vision, natural language processing, financial forecasting, and robotics. With advancements in computational power and algorithms, neural networks have become increasingly powerful, leading to breakthroughs in self-driving cars, real-time translation, and personalized recommendations.
In this article, we will explore how neural networks work and implement a simple one in Ruby.
What is a Neural Network?
A neural network consists of layers of neurons (also called nodes) that are connected by weights. It typically has three types of layers:
- Input Layer: Receives the raw data.
- Hidden Layers: Perform calculations using weighted connections and activation functions.
- Output Layer: Produces the final result.
Each neuron processes the input, applies an activation function, and passes the result to the next layer.
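As a rough sketch of what a single neuron computes, here is a tiny Ruby function: a weighted sum of the inputs plus a bias, passed through an activation function (sigmoid here). The weights, bias, and inputs are made-up illustrative values:

def neuron(inputs, weights, bias)
  sum = inputs.zip(weights).map { |i, w| i * w }.sum + bias
  1.0 / (1.0 + Math.exp(-sum)) # sigmoid activation
end

puts neuron([0.5, 0.8], [0.4, -0.6], 0.1) # => a value between 0 and 1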
What is an Activation Function?
An activation function is applied to the output of a neuron and determines whether (and how strongly) the neuron is "activated". There are many activation functions, each with its own advantages and disadvantages.
Characteristics of a Good Activation Function
A good activation function should be:
- Non-linear: This is important because it allows the network to learn complex patterns in the data.
- Differentiable: This is important for the training process, as it allows us to calculate the gradient of the error function (see the short example after this list).
- Computationally efficient: This is important because neural networks can be very large, and we need to be able to compute the activation function quickly.
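As a small illustration of the "differentiable" requirement, the sigmoid function (introduced below) has a derivative that can be written in terms of its own output, s * (1 - s), which is exactly what training algorithms use when computing gradients. The input value 0.5 here is arbitrary:

def sigmoid(x)
  1.0 / (1.0 + Math.exp(-x))
end

s = sigmoid(0.5)
puts s * (1 - s) # derivative of sigmoid at x = 0.5, roughly 0.235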
Popular Activation Functions
Sigmoid
- Outputs a value between 0 and 1.
- Often used in the output layer of a network for binary classification problems.
class Sigmoid
def self.call(x)
1.0 / (1.0 + Math.exp(-x))
end
end
Tanh (Hyperbolic Tangent)
- Outputs a value between -1 and 1.
- Similar to Sigmoid but centered at 0, making it easier to train.
class Tanh
def self.call(x)
Math.tanh(x)
end
end
ReLU (Rectified Linear Unit)
- Outputs 0 if the input is negative, and the input itself if the input is positive.
- Often used in the hidden layers of a network to help prevent the vanishing gradient problem.
class ReLU
def self.call(x)
[0, x].max
end
end
Softmax (For Multi-Class Classification)
- Converts outputs into probabilities.
- More computationally expensive than element-wise activations such as ReLU, because it normalizes over all of the outputs.
# Softmax Activation Function
class Softmax
  def self.call(values)
    # Subtract the maximum value before exponentiating to avoid overflow
    max = values.max
    exp_values = values.map { |v| Math.exp(v - max) }
    sum_exp = exp_values.sum
    exp_values.map { |v| v / sum_exp }
  end
end
puts Softmax.call([2.0, 1.0, 0.1]).inspect # => approximately [0.659, 0.242, 0.099], summing to 1
Activation functions are a key part of neural networks, enabling them to learn complex relationships in data efficiently.
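To get a feel for how these functions differ, here is a quick comparison that calls the classes defined above on the same (arbitrarily chosen) input:

x = 0.5
puts "Sigmoid: #{Sigmoid.call(x)}" # ~0.6225
puts "Tanh:    #{Tanh.call(x)}"    # ~0.4621
puts "ReLU:    #{ReLU.call(x)}"    # 0.5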
Implementing a Simple Neural Network in Ruby
Let's build a minimal network consisting of a single neuron that takes two inputs and predicts one output.
class SimpleNeuralNetwork
attr_accessor :weights, :bias
def initialize
@weights = [rand, rand] # Two random weights
@bias = rand # Random bias
end
def forward(inputs)
sum = inputs[0] * @weights[0] + inputs[1] * @weights[1] + @bias
Sigmoid.call(sum) # Using Sigmoid activation
end
def train(inputs, target, learning_rate = 0.1)
prediction = forward(inputs)
error = target - prediction
    # Adjust weights and bias (simplified update: full gradient descent would also multiply by the sigmoid derivative)
@weights[0] += learning_rate * error * inputs[0]
@weights[1] += learning_rate * error * inputs[1]
@bias += learning_rate * error
end
end
# Example Usage
nn = SimpleNeuralNetwork.new
puts nn.forward([1, 0]) # Predict output
nn.train([1, 0], 1) # Train with expected output 1
puts nn.forward([1, 0]) # Check new prediction
Explanation:
- We initialize the network with random weights and bias.
- The forward method computes the weighted sum and applies the Sigmoid activation function.
- The train method nudges the weights and bias in the direction that reduces the error (a simplified update rule); repeating it many times, as in the loop below, moves the prediction toward the target.
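A single call to train only shifts the prediction slightly, so in practice we call it repeatedly. Here is a short sketch using the SimpleNeuralNetwork class above; the input [1, 0], the target 1, and the 1,000 iterations are arbitrary choices:

nn = SimpleNeuralNetwork.new
puts "Before training: #{nn.forward([1, 0])}"

1_000.times { nn.train([1, 0], 1) }

puts "After training:  #{nn.forward([1, 0])}" # should now be much closer to 1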
Simple Sentiment Analysis Neural Network in Ruby
In this section, we'll build a neural network from scratch in Ruby that can analyze the sentiment of text inputs. Our network will classify text as negative, neutral, or positive. While this is a simplified implementation, it provides a good foundation for understanding how sentiment analysis and neural networks work.
require 'matrix'
require 'set'
class SentimentNeuralNetwork
def initialize
@vocabulary = Set.new
@word_to_index = {}
@input_size = 0
@hidden_size = 64
@output_size = 3 # negative, neutral, positive
    # Weights are initialized after the vocabulary is built, since their sizes depend on it
@weights1 = nil
@weights2 = nil
@bias1 = nil
@bias2 = nil
end
def preprocess_text(text)
# Convert to lowercase and split into words
words = text.downcase.gsub(/[^a-z\s]/, '').split
# Remove common stop words
stop_words = Set.new(['the', 'a', 'an', 'and', 'or', 'but', 'in', 'on', 'at', 'to', 'for', 'of', 'with', 'by'])
words.reject! { |word| stop_words.include?(word) }
words
end
def build_vocabulary(training_texts)
training_texts.each do |text|
words = preprocess_text(text)
@vocabulary.merge(words)
end
@vocabulary.each_with_index do |word, index|
@word_to_index[word] = index
end
@input_size = @vocabulary.size
initialize_weights
end
def initialize_weights
# Initialize weights with Xavier/Glorot initialization
xavier_init1 = Math.sqrt(6.0 / (@input_size + @hidden_size))
xavier_init2 = Math.sqrt(6.0 / (@hidden_size + @output_size))
@weights1 = Matrix.build(@input_size, @hidden_size) { rand(-xavier_init1..xavier_init1) }
@weights2 = Matrix.build(@hidden_size, @output_size) { rand(-xavier_init2..xavier_init2) }
@bias1 = Matrix.build(1, @hidden_size) { 0.0 }
@bias2 = Matrix.build(1, @output_size) { 0.0 }
end
def text_to_vector(text)
vector = Array.new(@input_size, 0)
words = preprocess_text(text)
words.each do |word|
if @word_to_index.key?(word)
vector[@word_to_index[word]] += 1
end
end
# Normalize the vector
sum = vector.sum.to_f
sum = 1.0 if sum == 0
vector.map! { |x| x / sum }
vector
end
def sigmoid(x)
1.0 / (1.0 + Math.exp(-x))
end
  def softmax(x)
    # Subtract the maximum value before exponentiating to avoid overflow
    max = x.max
    exp_x = x.map { |val| Math.exp(val - max) }
    sum = exp_x.sum
    exp_x.map { |val| val / sum }
  end
def forward(input_vector)
input_matrix = Matrix[input_vector]
# Hidden layer
hidden = (input_matrix * @weights1 + @bias1).map { |x| sigmoid(x) }
# Output layer
output = (hidden * @weights2 + @bias2).to_a[0]
softmax(output)
end
def analyze_sentiment(text)
# Convert text to vector
input_vector = text_to_vector(text)
# Forward pass
output = forward(input_vector)
# Get prediction
sentiment_index = output.index(output.max)
sentiment = ['negative', 'neutral', 'positive'][sentiment_index]
confidence = output[sentiment_index]
{
sentiment: sentiment,
confidence: confidence,
probabilities: {
negative: output[0],
neutral: output[1],
positive: output[2]
}
}
end
# Pre-train the network with some basic examples
def pretrain
training_data = [
["I love this, it's amazing!", "positive"],
["This is great!", "positive"],
["What a wonderful day", "positive"],
["I don't like this at all", "negative"],
["This is terrible", "negative"],
["I hate this", "negative"],
["It's okay", "neutral"],
["This is fine", "neutral"],
["Not bad, not great", "neutral"]
]
build_vocabulary(training_data.map(&:first))
# Simple training loop (in a real implementation, this would be more sophisticated)
training_data.each do |text, label|
input_vector = text_to_vector(text)
target = case label
when "negative" then [1, 0, 0]
when "neutral" then [0, 1, 0]
when "positive" then [0, 0, 1]
end
# Update weights (simplified training)
output = forward(input_vector)
error = target.zip(output).map { |t, o| t - o }
# Backpropagation would go here in a full implementation
end
end
end
# Example usage
if __FILE__ == $0
# Create and train the network
network = SentimentNeuralNetwork.new
network.pretrain
# Test some examples
test_texts = [
"I really love this product!",
"This is the worst experience ever",
"It's an okay service, nothing special",
"The quality is amazing",
"I'm not sure how I feel about this"
]
puts "\nSentiment Analysis Results:\n\n"
test_texts.each do |text|
result = network.analyze_sentiment(text)
puts "Text: #{text}"
puts "Sentiment: #{result[:sentiment]} (Confidence: #{(result[:confidence] * 100).round(2)}%)"
puts "Probabilities:"
result[:probabilities].each do |sentiment, prob|
puts " #{sentiment}: #{(prob * 100).round(2)}%"
end
puts "\n"
end
end
Breaking Down the Implementation
1. Network Architecture
Our neural network has three layers:
def initialize
@vocabulary = Set.new
@word_to_index = {}
@input_size = 0 # Will be set based on vocabulary size
@hidden_size = 64 # 64 neurons in hidden layer
@output_size = 3 # 3 possible outputs (negative, neutral, positive)
# Weights and biases
@weights1 = nil
@weights2 = nil
@bias1 = nil
@bias2 = nil
end
- Input Layer: Size depends on vocabulary size
- Hidden Layer: 64 neurons with sigmoid activation
- Output Layer: 3 neurons with softmax activation
2. Text Preprocessing
The text preprocessing step is crucial for converting raw text into a format our neural network can understand:
def preprocess_text(text)
# Convert to lowercase and split into words
words = text.downcase.gsub(/[^a-z\s]/, '').split
# Remove common stop words
stop_words = Set.new(['the', 'a', 'an', 'and', 'or', 'but', 'in',
'on', 'at', 'to', 'for', 'of', 'with', 'by'])
words.reject! { |word| stop_words.include?(word) }
words
end
This method (a worked example follows this list):
- Converts text to lowercase
- Removes punctuation using regex
- Splits into words
- Removes common stop words
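For example, with an instance of the class defined above, punctuation is stripped and stop words such as "the" and "and" are dropped:

network = SentimentNeuralNetwork.new
p network.preprocess_text("The product was GREAT, and I loved it!")
# => ["product", "was", "great", "i", "loved", "it"]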
3. Vocabulary Building
The vocabulary system converts words into numerical vectors:
def build_vocabulary(training_texts)
training_texts.each do |text|
words = preprocess_text(text)
@vocabulary.merge(words)
end
@vocabulary.each_with_index do |word, index|
@word_to_index[word] = index
end
@input_size = @vocabulary.size
initialize_weights
end
This creates a mapping between words and indices, which is used to create input vectors.
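For instance, building the vocabulary from two short texts produces a word-to-index mapping like the one below (Ruby's Set preserves insertion order, so indices follow the order in which words are first seen; instance_variable_get is used only to peek at the internal hash for illustration):

network = SentimentNeuralNetwork.new
network.build_vocabulary(["good movie", "bad movie"])
p network.instance_variable_get(:@word_to_index)
# => {"good"=>0, "movie"=>1, "bad"=>2}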
4. Weight Initialization
We use Xavier/Glorot initialization, drawing each weight uniformly from the range ±√(6 / (fan_in + fan_out)), which helps keep the scale of activations stable across layers during training:
def initialize_weights
xavier_init1 = Math.sqrt(6.0 / (@input_size + @hidden_size))
xavier_init2 = Math.sqrt(6.0 / (@hidden_size + @output_size))
@weights1 = Matrix.build(@input_size, @hidden_size) { rand(-xavier_init1..xavier_init1) }
@weights2 = Matrix.build(@hidden_size, @output_size) { rand(-xavier_init2..xavier_init2) }
@bias1 = Matrix.build(1, @hidden_size) { 0.0 }
@bias2 = Matrix.build(1, @output_size) { 0.0 }
end
5. Text to Vector Conversion
Converting text input into numerical vectors:
def text_to_vector(text)
vector = Array.new(@input_size, 0)
words = preprocess_text(text)
words.each do |word|
if @word_to_index.key?(word)
vector[@word_to_index[word]] += 1
end
end
# Normalize the vector
sum = vector.sum.to_f
sum = 1.0 if sum == 0
vector.map! { |x| x / sum }
vector
end
6. Forward Propagation
The forward pass through the network:
def forward(input_vector)
input_matrix = Matrix[input_vector]
# Hidden layer
hidden = (input_matrix * @weights1 + @bias1).map { |x| sigmoid(x) }
# Output layer
output = (hidden * @weights2 + @bias2).to_a[0]
softmax(output)
end
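With the same tiny network, the forward pass produces three values that sum to (approximately) 1, thanks to the softmax on the output layer; the individual values depend on the random weights:

network = SentimentNeuralNetwork.new
network.build_vocabulary(["good movie", "bad movie"])
probs = network.forward(network.text_to_vector("good movie"))
p probs.map { |v| v.round(3) } # three probabilities for negative / neutral / positive
p probs.sum.round(6)           # => 1.0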
7. Sentiment Analysis
The main method for analyzing text sentiment:
def analyze_sentiment(text)
# Convert text to vector
input_vector = text_to_vector(text)
# Forward pass
output = forward(input_vector)
# Get prediction
sentiment_index = output.index(output.max)
sentiment = ['negative', 'neutral', 'positive'][sentiment_index]
confidence = output[sentiment_index]
{
sentiment: sentiment,
confidence: confidence,
probabilities: {
negative: output[0],
neutral: output[1],
positive: output[2]
}
}
end
Using the Network
Here's how to use the sentiment analysis network:
# Create and train the network
network = SentimentNeuralNetwork.new
network.pretrain
# Analyze some text
result = network.analyze_sentiment("I really love this product!")
puts result[:sentiment]     # => "positive", "neutral", or "negative" (results vary, since full training isn't implemented)
puts result[:confidence]    # => confidence score
puts result[:probabilities] # => Hash of all probabilities
Limitations and Possible Improvements
This implementation has several limitations:
- Simple Architecture: The network uses a basic feed-forward architecture. More complex architectures such as LSTMs or transformers would perform better.
- Limited Training: The pre-training is very basic. A production system would need:
  - A larger training dataset
  - Proper backpropagation (a minimal sketch follows this list)
  - Cross-validation
  - Learning rate optimization
- Basic Text Processing: The text preprocessing could be improved with:
  - Better tokenization
  - Lemmatization
  - N-gram support
  - Word embeddings
- No Context Understanding: The network doesn't understand context, sarcasm, or complex language patterns.
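As an illustration of the backpropagation point, here is a minimal, hypothetical sketch of a single gradient step for the output layer only, assuming a cross-entropy loss on the softmax output. The method name train_output_step is invented for this example, and hidden-layer updates and the full training loop are omitted:

class SentimentNeuralNetwork
  # Hypothetical helper: one gradient step for the output-layer weights only.
  # target is a one-hot array such as [0, 0, 1] for "positive".
  def train_output_step(text, target, learning_rate = 0.1)
    input_matrix = Matrix[text_to_vector(text)]
    hidden = (input_matrix * @weights1 + @bias1).map { |x| sigmoid(x) }
    output = softmax((hidden * @weights2 + @bias2).to_a[0])

    # With softmax + cross-entropy, the output-layer error is simply (output - target)
    delta = Matrix[output.zip(target).map { |o, t| o - t }] # 1 x output_size

    # Gradient of the loss w.r.t. @weights2 is hidden^T * delta
    @weights2 -= hidden.transpose * delta * learning_rate
    @bias2 -= delta * learning_rate
  end
end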
This implementation provides a foundation for understanding how sentiment analysis works with neural networks. While it's not production-ready, it demonstrates the key concepts and can be extended for more sophisticated applications.
Remember that real-world sentiment analysis systems typically use more advanced techniques and pre-trained models, but building a simple version helps understand the fundamentals.
Conclusion
Neural networks are powerful tools for solving complex problems. In this article, we implemented a simple neural network in Ruby, explored different activation functions, and saw how learning happens. While Ruby is not the most common language for deep learning, this implementation provides a fundamental understanding of how neural networks operate under the hood.