Leapcell

AWK Basics Tutorial

Leapcell: The Next-Gen Serverless Platform for Web Hosting, Async Tasks, and Redis

A Concise AWK Tutorial

I. Basic Concepts

AWK is a text-processing tool available on virtually every Linux and Unix-like system, and it is well suited to structured text such as logs and CSV files. It reads input one line (record) at a time, splits each line into fields, and supports simple programming logic such as variables, conditions, and loops.

II. Basic Syntax

1. Fundamental Format

awk [options] 'pattern { action }' filename
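An AWK program is, in general, a list of `pattern { action }` rules: the action runs for every line the pattern matches, and the special patterns BEGIN and END run before the first line and after the last one. As a minimal sketch (the sample input and the variable names `input` and `summary` are made up for illustration):

```shell
# Hypothetical three-line input; names and numbers are illustrative only
input='apple 3
banana 5
cherry 7'

summary=$(printf '%s\n' "$input" | awk '
  BEGIN  { print "start" }           # runs once, before any input is read
  $2 > 4 { total += $2 }             # runs only for lines whose 2nd field exceeds 4
  END    { print "total:", total }   # runs once, after the last line
')
echo "$summary"
```

With this input the program prints `start` followed by `total: 12`, because only the banana and cherry lines satisfy the pattern.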

2. Simplest Examples

# Print entire file content
awk '{print $0}' demo.txt

# Process standard input via pipe
echo 'this is a test' | awk '{print $0}'

3. Field Handling

  • $1: First field
  • $2: Second field
  • $0: Entire line
  • NF: Total number of fields in current line
  • $NF: Last field
# Extract third field
echo 'this is a test' | awk '{print $3}'  # Output: a

# Extract second-to-last field
echo 'a,b,c,d' | awk -F ',' '{print $(NF-1)}'  # Output: c
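Because `$` accepts any expression, not just a literal number, NF can drive a loop over the fields of a line. The following sketch reverses the fields of each line; the sample text and the variable names `lines` and `reversed` are assumptions for illustration:

```shell
# Hypothetical input with a different number of fields on each line
lines='one two three
four five'

reversed=$(printf '%s\n' "$lines" | awk '{
  for (i = NF; i >= 1; i--) {
    # print field i, followed by a space, or by a newline after the first field
    printf "%s%s", $i, (i > 1 ? " " : "\n")
  }
}')
echo "$reversed"
```

The first line comes out as `three two one` and the second as `five four`.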

III. Core Functions

1. Field Separator

# Specify colon as separator
awk -F ':' '{print $1}' /etc/passwd

2. Built-in Variables

Variable  Description                                  Example
NR        Current record (line) number                 awk '{print NR}' file
FS        Input field separator (default: whitespace)  awk -v FS=: '{print $1}' file
OFS       Output field separator (default: one space)  awk -v OFS=, '{print $1,$2}' file
FILENAME  Current input file name                      awk '{print FILENAME}' file
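These variables are often combined. One detail worth knowing is that OFS only takes effect when AWK rebuilds the line, which happens when fields are printed separated by commas or when a field is assigned. A small sketch, assuming made-up passwd-style records and the illustrative variable names `records` and `converted`:

```shell
# Hypothetical colon-separated records
records='root:x:0
daemon:x:1'

# $1 = $1 changes nothing visible but forces $0 to be rebuilt with OFS
converted=$(printf '%s\n' "$records" | awk -v FS=: -v OFS=, '{ $1 = $1; print NR, $0 }')
echo "$converted"
```

This prints `1,root,x,0` and `2,daemon,x,1`. Without the `$1 = $1` assignment, `$0` would keep its original colons.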

IV. Advanced Operations

1. Conditional Filtering

# Regular expression match: print the first field of lines containing "usr"
awk -F ':' '/usr/ {print $1}' /etc/passwd

# Numeric comparison: print the first field of every line after line 3
awk -F ':' 'NR > 3 {print $1}' /etc/passwd

# Combined conditions: print whole lines where the user is root or the UID exceeds 1000
awk -F ':' '$1 == "root" || $3 > 1000' /etc/passwd
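Since /etc/passwd differs from machine to machine, it can help to try filters on a fixed sample first. The sketch below uses three invented passwd-style lines (the users and the variable names `passwd_sample` and `matches` are hypothetical) and combines a regex test on one field with a numeric test on another:

```shell
# Hypothetical sample: name:password:UID:GID:comment:home:shell
passwd_sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
alice:x:1001:1001:Alice:/home/alice:/bin/bash'

# Keep users whose shell ends in "bash" AND whose UID is at least 1000
matches=$(printf '%s\n' "$passwd_sample" | awk -F ':' '$7 ~ /bash$/ && $3 >= 1000 { print $1 }')
echo "$matches"
```

Only `alice` is printed: root has a bash shell but a UID of 0, and daemon fails both tests.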

2. Built-in Functions

Function   Purpose                                       Example
toupper()  Convert a string to uppercase                 awk '{print toupper($1)}' file
length()   Length of a string in characters              awk '{print length($1)}' file
substr()   Extract a substring (start position, length)  awk '{print substr($1,3,5)}' file
rand()     Random number in [0, 1); seed with srand()    awk 'BEGIN {srand(); print int(rand()*100)}'
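To see what these functions return, here is a short sketch run against a single fixed word. The input word and the variable names `word_info` and `roll` are chosen only for illustration:

```shell
# String functions applied to one word; substr counts positions from 1
word_info=$(echo 'leapcell' | awk '{ print toupper($1), length($1), substr($1, 3, 5) }')
echo "$word_info"

# rand() needs no input, so it can run in a BEGIN rule; srand() seeds it from the clock
roll=$(awk 'BEGIN { srand(); print int(rand() * 100) }')
echo "$roll"
```

The first command prints `LEAPCELL 8 apcel`. The second prints an integer from 0 to 99 that varies between runs.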

V. Control Statements

1. Single-line Conditions

# Process odd-numbered lines
awk 'NR % 2 == 1 {print "Line", NR}' file

# Field comparison
awk -F ':' '$3 > 1000 {print $1}' /etc/passwd

2. Multi-line Logic

# String comparison: label each username by whether it sorts after "m"
awk -F ':' '{
  if ($1 > "m") {
    print "High:", $1
  } else {
    print "Low:", $1
  }
}' /etc/passwd
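Control statements become more useful when they update variables that are reported once at the end. The following sketch counts accounts on either side of a UID threshold; the sample data, the threshold of 1000, and the names `uids`, `sys_count`, `user_count`, and `counts` are all assumptions made for the example:

```shell
# Hypothetical name:UID pairs
uids='root:0
alice:1001
bob:1002'

counts=$(printf '%s\n' "$uids" | awk -F ':' '
{
  if ($2 >= 1000) {
    user_count++      # uninitialized variables start at 0
  } else {
    sys_count++
  }
}
END { printf "system=%d regular=%d\n", sys_count, user_count }')
echo "$counts"
```

For this input the result is `system=1 regular=2`.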

VI. Practical Tips

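  • Formatted Output: Use printf instead of print when columns need alignment. Unlike print, printf does not add a newline, so include \n in the format string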
  awk -F ':' '{printf "%-10s %s\n", $1, $3}' /etc/passwd
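  • Large File Handling: AWK streams its input one line at a time, so memory use stays low even for very large files
  • Tool Integration: AWK reads standard input and writes standard output, so it combines naturally with grep, sed, and sort in a pipeline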
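The tips above can be combined in a single pipeline. This is a sketch only, assuming an invented access-log format of method, path, and status code; the variable names `log` and `report` are likewise illustrative:

```shell
# Hypothetical log lines: METHOD PATH STATUS
log='GET /index 200
POST /login 500
GET /about 200
GET /missing 404'

# grep selects the lines, awk formats two columns, sort orders the result
report=$(printf '%s\n' "$log" | grep '^GET' | awk '{ printf "%-8s %s\n", $2, $3 }' | sort)
echo "$report"
```

The `%-8s` format left-aligns the path in an 8-character column, so the status codes line up. The grep step could also be folded into AWK as a `/^GET/` pattern; keeping it separate here just shows how the tools chain together.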

VII. Quick Reference

# Common command combinations
awk -F ':' '/^root/ {print $1}' /etc/passwd       # Lines starting with "root"
awk -F ':' '!/nologin/ {print $1}' /etc/passwd    # Exclude lines containing "nologin"
awk -F ':' '$3 ~ /^[0-9]{4}$/' /etc/passwd        # Lines whose third field is exactly 4 digits
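When matching a field against a regular expression, anchors matter: without `^` and `$`, the pattern matches anywhere inside the field. The sketch below contrasts the two on made-up numbers (the names `ids`, `loose`, and `strict` are illustrative). It spells out four character classes instead of `{4}`, since some older awk builds do not support interval expressions:

```shell
# Hypothetical one-field input
ids='7
1234
123456'

# Unanchored: any field containing four consecutive digits
loose=$(printf '%s\n' "$ids" | awk '$1 ~ /[0-9][0-9][0-9][0-9]/')

# Anchored: the field must consist of exactly four digits
strict=$(printf '%s\n' "$ids" | awk '$1 ~ /^[0-9][0-9][0-9][0-9]$/')

echo "$loose"
echo "$strict"
```

The unanchored version keeps both `1234` and `123456`, while the anchored version keeps only `1234`.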

Finally, I recommend the best platform for deployment: Leapcell

1. Multi-Language Support

  • Develop with JavaScript, Python, Go, or Rust.

2. Deploy unlimited projects for free

  • Pay only for usage — no requests, no charges.

3. Unbeatable Cost Efficiency

  • Pay-as-you-go with no idle charges.
  • Example: $25 supports 6.94M requests at a 60ms average response time.

4. Streamlined Developer Experience

  • Intuitive UI for effortless setup.
  • Fully automated CI/CD pipelines and GitOps integration.
  • Real-time metrics and logging for actionable insights.

5. Effortless Scalability and High Performance

  • Auto-scaling to handle high concurrency with ease.
  • Zero operational overhead — just focus on building.

Explore more in the documentation!

Leapcell Twitter: https://x.com/LeapcellHQ