1. What is Docker, and Why is it Used?
Docker is an open-source containerization platform that allows developers to package applications and their dependencies into isolated environments called containers. These containers ensure that applications run consistently across different environments.
Real-Life Example:
Imagine you're developing a MERN stack web app. It works fine on your laptop, but when your teammate runs it, they get "version mismatch" errors.
With Docker, you create a consistent environment across all machines, preventing such issues.
Why Use Docker?
Docker is beneficial when you need:
- Portability – Works on any OS without compatibility issues
- Consistency – Eliminates "It works on my machine" problems
- Lightweight – Uses fewer system resources than virtual machines
- Scalability – Quickly scale applications with minimal overhead
2. Main Components of Docker
1. Docker Daemon (dockerd)
- The background process that manages Docker containers
- Listens for API requests and handles images, networks, and volumes
2. Docker CLI (Command-Line Interface)
- A tool to interact with the Docker Daemon
- Common commands:
docker ps # List running containers
docker run # Start a new container
docker stop # Stop a running container
3. Docker Images
- A read-only template containing the application, libraries, and dependencies
- Immutable – once built, images don't change
- Used to create containers
4. Docker Containers
- A running instance of a Docker image
- Isolated from the host system but can interact if needed (e.g., exposing ports)
5. Docker Hub
- A cloud-based registry where Docker images are stored and shared
6. Docker Volumes
- Used for persistent data storage outside of containers
(Illustration of Docker components omitted.)
3. How is Docker Different from Virtual Machines?
Example:
You're testing a React.js + Express.js app. Instead of running a full Ubuntu VM (which consumes high RAM & CPU), you start a lightweight container in seconds:
docker run -d -p 3000:3000 node:16
Unlike a VM, which takes minutes to boot, a container starts instantly.
Docker vs. Virtual Machines

| Feature | Docker (Containers) | Virtual Machines (VMs) |
|---|---|---|
| Boot Time | Seconds | Minutes |
| Size | MBs | GBs |
| Performance | Near-native speed | Slower due to hypervisor overhead |
| Isolation | Process-level isolation | Full OS-level isolation |
| Resource Efficiency | Shares the host OS kernel; lightweight | Requires a full guest OS; resource-intensive |
docker run vs. docker start vs. docker exec
- docker run: creates and starts a new container from an image
- docker start: restarts an existing, stopped container
- docker exec: runs a command inside a running container
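For instance, using the public nginx image purely as an illustration:

```bash
# Create and start a new container named "web" from the nginx image
docker run -d --name web -p 8080:80 nginx

# Stop the container, then start the same container again
docker stop web
docker start web

# Run a one-off command inside the running container
docker exec -it web ls /usr/share/nginx/html
```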
4. Popular and Useful Docker Commands
Here are some of the most commonly used Docker commands:
Container Management
# List all running containers
docker ps
# List all containers (including stopped ones)
docker ps -a
# Start a stopped container
docker start <container_id>
# Stop a running container
docker stop <container_id>
# Remove a container
docker rm <container_id>
Image Management
# List all available images
docker images
# Pull an image from Docker Hub
docker pull <image_name>
# Remove an image
docker rmi <image_name>
Build and Run Containers
# Build a Docker image from a Dockerfile
docker build -t <image_name> .
# Run a container from an image
docker run -d -p 8080:80 <image_name>
Volume Management
# List all Docker volumes
docker volume ls
# Create a new volume
docker volume create <volume_name>
# Remove a volume
docker volume rm <volume_name>
Docker Compose: docker-compose.yml
What is docker-compose.yml?
The docker-compose.yml file is used to define and run multi-container Docker applications. With Docker Compose, you can manage and orchestrate multiple services, including databases, backend APIs, and front-end applications, all in a single file.
It allows you to define services, networks, and volumes, making it easier to deploy and manage applications that require multiple services working together.
Why is docker-compose.yml Useful?
1. Simplifies Multi-Container Management: instead of managing each container manually, Docker Compose lets you define all services (frontend, backend, database, etc.) in one configuration file and launch them with a single command.
2. Networking and Dependency Management: Docker Compose automatically creates a network for your containers, allowing them to communicate with each other. Services can be referenced by their service name, so the backend can talk to the database without needing an IP address.
3. One Command to Start Everything: instead of running individual containers with complex docker run commands, Docker Compose lets you define the services and their dependencies in a YAML file and run everything with docker-compose up.
4. Simplified Development Environment: developers can easily replicate the production environment locally, using the same configuration for services like databases, backends, and frontends, without manually setting up each service.
5. Environment Variable Management: you can manage environment variables for each service within the docker-compose.yml file, making it easier to configure your application for different environments (development, testing, production).
Example of docker-compose.yml for a Web Application
Let's walk through an example where we have three services:
- Frontend: A React app running on port 3000.
- Backend: A Node.js API running on port 5000.
- Database: A MongoDB instance to store data.
version: '3.8'
services:
  frontend:
    build: ./frontend
    ports:
      - "3000:3000"
    volumes:
      - ./frontend:/app
    depends_on:
      - backend
  backend:
    build: ./backend
    ports:
      - "5000:5000"
    environment:
      - NODE_ENV=development
    depends_on:
      - database
  database:
    image: mongo
    volumes:
      - mongo-data:/data/db
    ports:
      - "27017:27017"
volumes:
  mongo-data:
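With this file in place, the whole stack can be managed with a couple of commands, for example:

```bash
# Build images and start all three services in the background
docker-compose up -d --build

# Follow the logs of one service
docker-compose logs -f backend

# Stop and remove the containers and the default network
docker-compose down
```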
Database Migrations
- Explain how you would design and manage a database schema using Sequelize, including the process of setting up migrations, handling model relationships, optimizing for performance, and managing database changes in a collaborative team environment.
Database Migration with Sequelize
Purpose
Database migrations allow you to safely update and manage your database schema over time. They help track changes to the schema in a version-controlled manner, making it easy to collaborate in teams.
Setting Up Migrations
- Initialize Sequelize with sequelize-cli to generate migration files.
- Migration files contain two primary methods:
  - up: applies changes (e.g., create tables, add columns).
  - down: rolls back changes (undoes what up applied).
Handling Schema Changes
- Creating Migrations: when you need to add, modify, or delete parts of the database schema (e.g., tables, columns), you create a new migration file, as sketched below.
- Applying Migrations: use npx sequelize-cli db:migrate to apply pending migrations to the database.
- Rolling Back Migrations: use npx sequelize-cli db:migrate:undo to undo the last applied migration.
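A minimal sketch of such a migration file, assuming a hypothetical Users table (file, table, and column names are illustrative):

```javascript
// migrations/20240101000000-create-users.js (illustrative file name)
'use strict';

module.exports = {
  // "up" applies the change: create the Users table
  async up(queryInterface, Sequelize) {
    await queryInterface.createTable('Users', {
      id: { type: Sequelize.INTEGER, autoIncrement: true, primaryKey: true },
      email: { type: Sequelize.STRING, allowNull: false, unique: true },
      createdAt: { type: Sequelize.DATE, allowNull: false },
      updatedAt: { type: Sequelize.DATE, allowNull: false },
    });
  },

  // "down" rolls the change back: drop the table
  async down(queryInterface) {
    await queryInterface.dropTable('Users');
  },
};
```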
Model Relationships
- Define associations (e.g., one-to-one, one-to-many, many-to-many) within your models using Sequelize methods: hasOne, hasMany, belongsTo, and belongsToMany (see the sketch below).
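A minimal sketch of how these associations are declared (model names are illustrative):

```javascript
// One-to-many: a User has many Posts; each Post belongs to a User
User.hasMany(Post, { foreignKey: 'userId' });
Post.belongsTo(User, { foreignKey: 'userId' });

// Many-to-many: Users and Projects linked through a join table
User.belongsToMany(Project, { through: 'UserProjects' });
Project.belongsToMany(User, { through: 'UserProjects' });
```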
Collaborative Workflow
- Migrations should be version-controlled using Git.
- Each team member works with migrations, and when schema changes are required, new migrations are created and applied across all environments (development, staging, production).
GitHub Actions
Steps to Deploy on AWS EC2
1. Launch EC2 Instance
2. Add Secret Variables in GitHub
- Go to GitHub Repo Settings → Secrets and variables → Actions → Add Secret
3. Connect to EC2 Instance
Install Docker on AWS EC2
sudo apt-get update
sudo apt-get install docker.io -y
sudo systemctl start docker
sudo chmod 666 /var/run/docker.sock
sudo systemctl enable docker
docker --version
docker ps
4. Create Two Runners on the Same EC2 Instance
- In the React app repository → Settings → Actions → Runners → New self-hosted runner
- Copy the download commands and run them in the EC2 instance terminal
- Install it as a service to keep it running in the background
sudo ./svc.sh install
sudo ./svc.sh start
- Do the same for the Node.js Runner
5. Create a Dockerfile for Node.js (Backend)
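A minimal Dockerfile sketch for a typical Node.js API (the port and entry file are assumptions – adjust to your project):

```dockerfile
# Small official Node.js base image
FROM node:16-alpine
WORKDIR /app

# Install dependencies first to take advantage of Docker layer caching
COPY package*.json ./
RUN npm ci --only=production

# Copy the source and expose the API port (assumed to be 5000)
COPY . .
EXPOSE 5000
CMD ["node", "server.js"]
```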
6. Create a GitHub Actions Workflow
Create a .github/workflows/cicd.yml file.
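A minimal sketch of such a workflow, assuming DockerHub credentials stored as secrets named DOCKER_USERNAME and DOCKER_PASSWORD and an illustrative image name:

```yaml
name: CICD

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Log in to DockerHub
        run: docker login -u ${{ secrets.DOCKER_USERNAME }} -p ${{ secrets.DOCKER_PASSWORD }}
      - name: Build and push the image
        run: |
          docker build -t ${{ secrets.DOCKER_USERNAME }}/node-backend .
          docker push ${{ secrets.DOCKER_USERNAME }}/node-backend

  deploy:
    needs: build
    runs-on: self-hosted   # the runner installed on the EC2 instance
    steps:
      - name: Pull and restart the container
        run: |
          docker pull ${{ secrets.DOCKER_USERNAME }}/node-backend
          docker rm -f node-backend || true
          docker run -d --name node-backend -p 5000:5000 ${{ secrets.DOCKER_USERNAME }}/node-backend
```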
7. Push Docker Images to DockerHub
8. Add Inbound/Outbound Rules on EC2 Instance
9. Access the Node.js Application
- Use EC2_PUBLIC_IP:PORT to access your application
Deploying React App
- Create a Dockerfile for React
- Follow the same process as above
What is GitHub Actions, and how does it work?
GitHub Actions is a CI/CD automation tool that allows you to define workflows in YAML to build, test, and deploy applications directly from GitHub repositories.
How do you trigger a GitHub Actions workflow?
Workflows can be triggered by events such as push, pull_request, schedule, workflow_dispatch, and repository_dispatch.
What are the key components of a GitHub Actions workflow?
Key components include:
- Workflows (YAML files in .github/workflows/)
- Jobs (independent execution units in a workflow)
- Steps (commands executed in a job)
- Actions (reusable units of functionality)
- Runners (machines that execute jobs)
What is the difference between jobs, steps, and actions?
- Jobs: Run in parallel or sequentially within a workflow.
- Steps: Individual tasks executed within a job.
- Actions: Pre-built reusable components within steps.
How do you use environment variables and secrets in GitHub Actions?
- Define environment variables using env:
env:
  NODE_ENV: production
- Store sensitive values in secrets:
env:
  API_KEY: ${{ secrets.API_KEY }}
What are self-hosted runners, and when should you use them?
Self-hosted runners are custom machines used to execute workflows instead of GitHub's hosted runners. Use them for private repositories, custom hardware, or specific dependencies.
How do you cache dependencies in GitHub Actions?
Use actions/cache@v3 to cache dependencies and speed up builds:
- uses: actions/cache@v3
  with:
    path: ~/.npm
    key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
    restore-keys: npm-${{ runner.os }}
How do you create a reusable workflow in GitHub Actions?
Define a workflow with on: workflow_call and call it from another workflow:
on: workflow_call
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - run: echo "Reusable workflow"
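The calling workflow then references the reusable one by its path (the file name here is an assumption):

```yaml
jobs:
  call-build:
    uses: ./.github/workflows/reusable.yml
```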
How do you set up a CI/CD pipeline using GitHub Actions?
Define a workflow that includes jobs for building, testing, and deploying:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - run: echo "Building..."
  test:
    runs-on: ubuntu-latest
    needs: build
    steps:
      - run: echo "Testing..."
  deploy:
    runs-on: ubuntu-latest
    needs: test
    steps:
      - run: echo "Deploying..."
What is the difference between workflow_dispatch, workflow_run, and schedule triggers?
- workflow_dispatch: manual trigger via the GitHub UI/API.
- workflow_run: triggered when another workflow finishes.
- schedule: runs workflows at specific times using cron syntax.
How do you debug a failing GitHub Actions workflow?
- Check logs in the GitHub Actions UI.
- Use set -x in bash scripts for verbose output.
- Add continue-on-error: true to isolate failing steps.
How do you run a GitHub Actions workflow locally?
Use act, a tool that simulates GitHub Actions on your local machine:
act
How do you optimize and speed up GitHub Actions workflows?
- Use caching (actions/cache@v3).
- Run jobs in parallel when possible.
- Use matrix builds for different environments.
- Limit workflow execution to necessary branches.
How do you manage permissions and security in GitHub Actions?
- Apply the principle of least privilege to tokens (GITHUB_TOKEN).
- Restrict secrets exposure to trusted workflows.
- Use branch protection rules to limit workflow execution.
Websockets & Multi-backend system
Why Do Backends Need to Talk to Each Other?
In a typical client-server architecture, communication happens between the browser (client) and the backend server. However, as applications grow, keeping everything on a single server exposed to the internet becomes inefficient and unscalable.
When designing a multi-backend system, you need to consider:
- If there are multiple services, how should they communicate when an event occurs?
- Should it be an immediate HTTP call?
- Should the event be sent to a queue?
- Should the services communicate via WebSockets?
- Should you use a Pub-Sub mechanism?
These decisions impact performance, scalability, and reliability.
Example: Payment Processing System
Let's consider a payment application. When a transaction occurs:
- The database update should happen immediately (synchronous).
- The notification (email/SMS) can be pushed to a queue (asynchronous).
Why not handle everything in the primary backend?
- If the email service is down, should the user be forced to wait after completing the transaction? No!
- Instead, we push the notification event to a queue.
- Even if the notification service is down, the queue retains the event and sends notifications once the service is back.
- This is why message queues (e.g., RabbitMQ, Kafka, AWS SQS) are better than HTTP for such tasks.
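As a sketch of the queue-based approach, using RabbitMQ through the amqplib package (the queue name, connection URL, and event shape are assumptions):

```javascript
const amqp = require('amqplib');

// Producer: the payment service publishes a notification event and moves on,
// without waiting for the notification service to be available.
async function publishNotification(event) {
  const conn = await amqp.connect('amqp://localhost');
  const channel = await conn.createChannel();
  await channel.assertQueue('notifications', { durable: true });
  // persistent: true keeps the message on disk until a consumer handles it
  channel.sendToQueue('notifications', Buffer.from(JSON.stringify(event)), {
    persistent: true,
  });
  await channel.close();
  await conn.close();
}

publishNotification({ type: 'email', to: 'user@example.com', txId: 'tx_123' });
```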
Types of Communication
1. Synchronous Communication
   - The system waits for a response from the other system.
   - Examples: HTTP requests, WebSockets (in some cases).
2. Asynchronous Communication
   - The system does not wait for a response.
   - Examples: message queues, Pub-Sub services.
Why WebSockets?
WebSockets provide persistent, full-duplex communication between client and server over a single TCP connection, established with one handshake.
Limitations of HTTP:
- In HTTP, the server cannot push events to the client on its own.
- The client (browser) can request, and the server can respond, but the server cannot initiate communication with the client.
WebSockets vs. HTTP for Real-Time Applications
Example: Stock Market Trading System
- Stock buying & selling generates millions of requests per second.
- With plain HTTP, each new connection requires a TCP three-way handshake (plus per-request headers), adding latency and overhead.
- With WebSockets, the handshake happens only once, and then the server and client can continuously exchange data.
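A minimal server-push sketch with Node.js and the ws package (the port and message shape are illustrative):

```javascript
const { WebSocketServer } = require('ws');

const wss = new WebSocketServer({ port: 8080 });

wss.on('connection', (socket) => {
  // After the single handshake, the server can push data at any time,
  // without waiting for a client request.
  const timer = setInterval(() => {
    socket.send(JSON.stringify({ symbol: 'ACME', price: 100 + Math.random() }));
  }, 1000);

  socket.on('close', () => clearInterval(timer));
});
```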
Alternative: Polling
If you still want to use HTTP for real-time updates, an alternative approach is polling.
- However, polling creates unnecessary load on the server by making frequent requests.
- WebSockets are a more efficient solution for real-time updates.
Some Basic Questions
Basic
What is Node.js?
Node.js is a runtime environment for executing JavaScript on the server side. It is not a framework or a language. A runtime is responsible for memory management and converting high-level code into machine code.
Examples:
- Java: JVM (Runtime) → Spring (Framework)
- Python: CPython (Runtime) → Django (Framework)
- JavaScript: Node.js (Runtime) → Express.js (Framework)
With Node.js, JavaScript can run outside the browser as well.
Runtime vs Frameworks
- Runtime: Focuses on executing code, handling memory, and managing I/O.
- Framework: Provides structured tools and libraries to simplify development.
What happens when you enter a URL in the browser and hit enter?
DNS Lookup
The browser checks if it already knows the IP address for www.example.com.
If not, it contacts a DNS (Domain Name System) server to get the IP address (e.g., 192.168.1.1).
Establishing Connection
The browser initiates a TCP connection with the web server using a process called the three-way handshake.
If the website uses HTTPS, a TLS handshake happens to encrypt the communication.
Sending HTTP Request
The browser sends an HTTP request to the server:
GET / HTTP/1.1
Host: www.example.com
Server Processing
The web server processes the request and may:
- Fetch data from a database
- Generate a response (HTML, JSON, etc.)
Receiving the Response
The server sends an HTTP response back to the browser:
HTTP/1.1 200 OK
Content-Type: text/html
Rendering the Page
The browser processes the HTML, CSS, and JavaScript and displays the webpage.
Difference Between Monolithic and Microservices Architecture
Monolithic Architecture
- All components (UI, DB, Auth, etc.) are tightly coupled.
- Single application handles everything.
Microservices Architecture
- Divided into small, independent services.
- Each service handles a specific function (Auth, Payments, etc.).
Pros:
- Scalable
- Services can use different tech stacks
Cons:
- More complex to manage
- Requires API communication
HTTP Status Codes
- 200 OK
- 201 Created
- 400 Bad Request
- 401 Unauthorized
- 402 Payment Required
- 404 Not Found
- 405 Method Not Allowed
- 500 Internal Server Error
What is CORS?
CORS stands for Cross-Origin Resource Sharing – a security feature built into browsers.
It blocks requests from one origin (domain, protocol, or port) to another origin unless the server explicitly allows them.
For example: your frontend is hosted at frontend.com and your backend at backend.com.
The browser treats these as different origins and blocks the request unless it is explicitly allowed.
Why does this happen?
CORS errors are triggered by the Same-Origin Policy, which prevents malicious websites from making unauthorized API calls using your credentials.
The browser isn't blocking the request – it's blocking the response, for security reasons.
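On an Express backend, a common way to allow a specific origin is the cors middleware package (the origin shown is an example):

```javascript
const express = require('express');
const cors = require('cors');

const app = express();

// Allow only the known frontend origin; credentials enables cookies
app.use(cors({ origin: 'https://frontend.com', credentials: true }));

app.get('/api/data', (req, res) => res.json({ ok: true }));
app.listen(5000);
```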
REST vs GraphQL
REST API:
"REST (Representational State Transfer) is an architectural style where data is fetched using multiple endpoints, and each request returns a fixed structure of data."
GraphQL:
"GraphQL is a query language for APIs that allows clients to request only the data they need, reducing overfetching and underfetching."
Key Points:
- REST APIs have multiple endpoints (/users, /orders), while GraphQL has a single endpoint (/graphql).
- GraphQL provides more flexibility by allowing clients to request exactly what they need in a single query.
- REST APIs return predefined responses and sometimes require multiple requests.
- If performance and flexibility are key concerns, GraphQL is often the better choice.
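For illustration, where REST might need both GET /users/1 and GET /users/1/orders, a single GraphQL query (field names assumed) fetches exactly the data required:

```graphql
query {
  user(id: "1") {
    name
    orders {
      total
    }
  }
}
```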
How Do You Design an API for a Large-Scale System?
- Use Microservices: Separate services (Auth, Payments, etc.).
- Load Balancers: Distribute traffic efficiently.
- Caching: Use Redis for frequently accessed data.
- Pagination: Send data in chunks.
- Rate Limiting: Prevent API abuse.
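As one concrete piece of such a design, here is a rate-limiting sketch for Express using the express-rate-limit package (the limits are illustrative):

```javascript
const express = require('express');
const rateLimit = require('express-rate-limit');

const app = express();

// Allow at most 100 requests per IP per 15-minute window
app.use(rateLimit({ windowMs: 15 * 60 * 1000, max: 100 }));

app.get('/api/users', (req, res) => res.json([]));
app.listen(3000);
```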
What is Pagination? How to Implement It?
Pagination breaks large datasets into smaller parts.
Implementation:
- Use limit and offset in database queries. Example:
  SELECT * FROM users LIMIT 10 OFFSET 20;
- Use cursor-based pagination for better performance (see the sketch below).
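Cursor-based pagination keys off the last row the client saw instead of an offset, so the database does not scan and skip earlier rows (column names assumed):

```sql
-- Fetch the next page after the last id the client received (e.g., 20)
SELECT * FROM users
WHERE id > 20
ORDER BY id
LIMIT 10;
```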
How Do You Handle File Uploads?
- Single file upload: use multipart/form-data with Express.js & Multer (see the sketch below).
- Large file handling: use chunked uploads.
- Storage options: store files on AWS S3, Google Cloud Storage, or a database.
- Server-side upload: the file is uploaded to your backend server first, and the server then sends it to S3 or Cloudinary.
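A minimal single-file upload sketch with Express and Multer (the field name and destination folder are assumptions):

```javascript
const express = require('express');
const multer = require('multer');

const app = express();
const upload = multer({ dest: 'uploads/' }); // uploaded files land in ./uploads

// The form field must be named "file" to match upload.single('file')
app.post('/upload', upload.single('file'), (req, res) => {
  res.json({ filename: req.file.originalname, size: req.file.size });
});

app.listen(5000);
```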
Explain the concept of statelessness in HTTP and how it impacts backend services
HTTP is stateless: each request is independent, and the server retains no memory of previous requests. Backend services therefore carry any needed context with every request – cookies, tokens such as JWTs, or session IDs backed by a shared store like Redis. Statelessness makes horizontal scaling simple, since any server can handle any request, but it pushes state management to the client or a shared store.
Intermediate
What is full-text search?
Full-text search indexes the words inside text fields so queries can match documents by keywords and relevance rather than exact string comparison. It is available in databases (e.g., PostgreSQL tsvector, MySQL FULLTEXT indexes, MongoDB text indexes) and in dedicated engines such as Elasticsearch.
What are serverful and serverless backends?
A serverful backend means you manage the entire server, while a serverless backend means you don't have to manage servers – your code runs only when needed on cloud platforms like AWS Lambda.
Example: Imagine you are building a food delivery app like Zomato or Uber Eats.
If you use a serverful backend:
You set up an Express.js server on AWS EC2.
The server is always running, handling all API requests like fetching restaurants, placing orders, and tracking deliveries.
You pay for the server 24/7, even when there are no active users.
If you use a serverless backend:
You use AWS Lambda functions to handle API requests.
When a user places an order, the function runs only for that request and then shuts down.
You only pay for execution time, making it cost-effective.
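For reference, a serverless handler is just an exported function invoked per request – a minimal AWS Lambda sketch in Node.js (the event shape is illustrative):

```javascript
// AWS Lambda invokes this function per request and bills only for execution time
exports.handler = async (event) => {
  const order = JSON.parse(event.body || '{}');
  // ... place the order, e.g., write it to a database ...
  return {
    statusCode: 200,
    body: JSON.stringify({ message: 'Order received', orderId: order.id }),
  };
};
```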
Can you explain single-threaded vs. multi-threaded processing?
Single-threaded programs execute one task at a time, while multi-threaded programs can execute multiple tasks in parallel. However, single-threaded systems can still be asynchronous using event loops, as in Node.js. If I were building a CPU-intensive app like a video editor, I'd go with multi-threading. But for an API server handling multiple users, I'd use a single-threaded, asynchronous model like Node.js to handle requests efficiently (see the sketch below).
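A small sketch of offloading CPU-heavy work with Node's built-in worker_threads module (the computation is illustrative):

```javascript
const { Worker, isMainThread, parentPort } = require('worker_threads');

if (isMainThread) {
  // The main thread stays free to serve requests while the worker computes
  const worker = new Worker(__filename);
  worker.on('message', (sum) => console.log('Result from worker:', sum));
} else {
  let sum = 0;
  for (let i = 0; i < 1e9; i++) sum += i; // CPU-intensive loop
  parentPort.postMessage(sum);
}
```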