Shell Scripting
1. Variables: recommended naming conventions
- file="filename.txt" (no spaces allowed around '=')
- lowercase with underscores for normal variables: house_color, source_file
- constants written in UPPERCASE and marked read-only. Example: readonly FILE="filename.txt"
- to make a variable visible outside the current shell's scope (e.g. to child processes), use export
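A minimal sketch putting these conventions together; the names and values (app.conf, data.txt, APP_ENV) are just placeholders:
#!/bin/bash
readonly CONFIG_FILE="app.conf"   # constant: UPPERCASE + readonly
source_file="data.txt"            # normal variable: lowercase with underscores
export APP_ENV="dev"              # exported so child processes can see it
bash -c 'echo "child process sees APP_ENV=$APP_ENV"'
echo "constant: ${CONFIG_FILE}, file: ${source_file}"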
2. Functions:
Functions can be declared with or without the function keyword:
calculate_area() {            # without the function keyword
echo "calculating area..."
}
function clone_repo() {       # with the function keyword
echo "cloning repo..."
}
Expanding Variables
Both versions below work the same:
#!/bin/bash
var="value of var"
echo ${var}
#!/bin/bash
var="value of var"
echo $var
The difference shows up when appending an additional string right after the variable. Use the ${var} form whenever the variable name would otherwise run into the surrounding text.
#!/bin/bash
height=70
echo "Your height is = $heightcm" # throws error
echo "Your height is = ${height}cm" # works good
Without double quotes, the expanded string is split on the field separators (here, spaces).
#!/bin/bash
string="One Two Three"
for element in ${string}; do
echo ${element} # or echo $element
done
O/P:
One
Two
Three
Now try enclosing it in double quotes. The string is not split and is treated as a single element.
#!/bin/bash
string="One Two Three"
for element in "${string}"; do
echo ${element} # or echo $element
done
O/P:
One Two Three
In the scenario below we want the server names separately, so the unquoted expansion is what we want.
#!/bin/bash
readonly SERVERS="server1 server2 server3"
for server in ${SERVERS}; do
echo "${server}.worker-node"
done
O/P:
server1.worker-node
server2.worker-node
server3.worker-node
It is recommended to use double quotes when expanding variables for:
- Directory paths, filenames
- URLs assigned to variables
- When in doubt, use double quotes :)
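A quick sketch of why the quotes matter for paths with spaces (the directory name is made up):
#!/bin/bash
dir="My Project Files"   # a path containing spaces
mkdir -p "${dir}"        # quoted: creates a single directory
ls -d "${dir}"           # quoted: passed as one argument
# ls -d ${dir}           # unquoted: splits into 3 arguments and fails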
Let's revisit the basic concepts: Linux commands, loops, and performing tasks. Shell scripting is much more powerful than running single commands. A script is read top-down, and functions are executed only when they are called.
# basic commands
ls      # list items
pwd     # print the working directory
cd      # change directory
mkdir   # make a directory
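A tiny sketch of the top-down point above: a function body runs only when the function is called, not when it is defined (greet is a made-up name):
#!/bin/bash
echo "step 1: runs immediately"
greet() {
echo "step 3: runs only when greet is called"
}
echo "step 2: the function above was only defined, not executed yet"
greet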
Let's say some command throws an error and we want the script to stop right there: declare "set -e" at the top of the script. ("set -x" is related but different: it prints each command as it runs, which is useful for debugging.)
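A minimal sketch of both options; the failing ls is just for illustration:
#!/bin/bash
set -e                      # exit immediately if any command fails
set -x                      # optional: print each command before running it (debug trace)
echo "this runs"
ls /nonexistent-directory   # this command fails, so set -e stops the script here
echo "this line is never reached"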
Usage of [[ ]] for conditional tests.
#!/bin/bash
if [[ 3 -gt 4 ]]; then   # use -gt for numeric comparison; '>' inside [[ ]] compares strings
echo "three is greater than four"
fi
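A small sketch contrasting string and numeric comparison inside [[ ]] (the values are placeholders):
#!/bin/bash
a="apple"
b="banana"
if [[ "$a" < "$b" ]]; then    # '<' and '>' inside [[ ]] compare strings lexicographically
echo "$a sorts before $b"
fi
if [[ 10 -gt 9 ]]; then       # -gt/-lt/-ge/-le/-eq/-ne compare numbers
echo "10 is greater than 9"
fi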
Shell scripting has three main loop types:
- while
- for
- until (loops until the condition becomes true, i.e. an inverted while)
While loop
#!/bin/bash
i=1
while [[ $i -le 3 ]]; do
echo "Iteration $i"
i=$((i+1))
done
For loop
for i in {1..3}; do
# alternative: for i in $(seq 1 3); do -- seq runs in a sub-shell and its output becomes the list of values for i
echo "Iteration $i"
done
Until Loop
#!/bin/bash
i=3
until [[ $i -eq 0 ]]; do
echo "Iteration $i"
i=$((i-1))
done
Importing another script file
Sometimes we need to import variables from other (.sh or conf) files:
# conf file
name="Ubuntu"
Example
#!/bin/bash
source .conf # sourcing the conf file
echo "${name}"
What if the conf file doesn't exist? We need to check for that in the script:
#!/bin/bash
readonly CONF_FILE=".conf"
if [[ -f ${CONF_FILE} ]]; then
source "${CONF_FILE}"
else
name="Bob"
fi
echo "${name}"
exit 0
Functions
Functions encapsulate code and allow re-usability.
Here is a script that takes a backup without a function:
#!/bin/bash
mkdir backup
cp -r "file.txt" backup/   # copy first, while file.txt is still in the current directory
cd backup
tar -czvf backup.tar.gz *
echo "Backup complete!!"
exit 0
Now convert it to a function:
perform_backup() {
mkdir backup
cp -r "${1}" backup/
cd backup
tar -czvf backup.tar.gz *
echo "Backup complete!!"
}
perform_backup "${1}"
exit 0
Cloning a git repo
#!/bin/bash
git_url=${1}
clone_git() {
git clone "${1}"
}
find_files() {
find . -type f | wc -l
}
clone_git "${git_url}"
find_files
Declaring variables local to a function. Here var1 is declared with local, so it is scoped to my_function and not visible outside it.
#!/bin/bash
my_function() {
local var1="Hello"
echo "${var1}"
}
my_function
Command Line Arguments
Pass values that change from run to run as arguments instead of hard-coding them:
#!/bin/bash
git clone "${1}"
find . -type f | wc -l
Usage: ./clone_project.sh [git-url]
Sometimes keeping track of the positional arguments gets hard.
Use the "shift" builtin: each time it runs, the arguments move left by one, so the next argument becomes $1.
#!/bin/bash
echo "First arg: $1"
shift
echo "Second arg: $1"
shift
echo "Third arg: $1"
./shift-example.sh arg1 arg2 arg3
O/P:
First arg: arg1
Second arg: arg2
Third arg: arg3
Each process running in a terminal is associated with that tty session's PID. If you want to keep a program running even after closing the terminal, run it with "nohup ./script.sh &".
Each tab in the terminal creates its own session, with the tty's PID as the root PID.
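A quick sketch of keeping a script alive after the terminal closes (script.sh and out.log are placeholder names):
nohup ./script.sh > out.log 2>&1 &   # nohup makes the process ignore the hangup (SIGHUP) signal
echo "started with PID $!"           # $! holds the PID of the last background job
# without an explicit redirect, nohup appends the output to nohup.out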
Killing processes by PID:
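For the killing part itself, a quick sketch (99838 is a placeholder PID):
kill 99838      # sends SIGTERM, asking the process to exit cleanly
kill -9 99838   # sends SIGKILL if it refuses to stop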
To trace the system calls made by a process, given its PID:
strace -Tfp 99838
Options:
-T - show the time spent in each system call
-f - follow child processes (forks)
-p - attach to the process with the given PID
Built-in Commands
Built-in commands run inside the shell itself, so they don't spawn a new process (no new PID).
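You can check whether a command is a builtin with type (exact output wording varies by shell and system):
type cd   # "cd is a shell builtin"
type ls   # prints a path such as /usr/bin/ls, an external program that gets its own PID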
Some Exercise Questions:
Looping 1 to 99 odd numbers
#!/bin/bash
for((i=1;i<100;i+=2)); do
echo $i
done
Reading one line of string
#!/bin/bash
read name;
echo "Welcome ${name}";
Arithmetic operations
#!/bin/bash
read x
read y
if [[ $x -ge -100 && $x -le 100 && $y -ge -100 && $y -le 100 ]]; then
echo "$((x+y))"
echo "$((x-y))"
echo "$((x*y))"
if [[ $y -ne 0 ]]; then
echo "$((x/y))"
else
echo "Not possible"
fi
else
echo "No correct"
exit 1
fi
Comparing Two numbers
#!/bin/bash
read x
read y
if [[ $x -gt $y ]]; then
echo "X is greater than Y"
fi
if [[ $x -lt $y ]]; then
echo "X is less than Y"
fi
if [[ $x -eq $y ]]; then
echo "X is equal to Y"
fi
Comparing characters
#!/bin/bash
read ch
if [[ $ch == "y" || $ch == "Y" ]]; then
echo "YES";
fi
if [[ $ch == "N" || $ch == "n" ]]; then
echo "NO";
fi
OR
#!/bin/bash
read ch
case $ch in
[yY]) echo "YES" ;;
[nN]) echo "NO" ;;
*) echo "No input" ;;
esac
Evaluating an arithmetic expression with floating point, printed to 3 decimal places
#!/bin/bash
read expression
result=$(echo "scale=4; $expression" | bc -l)
#for getting 4 decimal places scale is used
result=$(printf "%.3f" $result) # for 3 decimal places rounded
echo $result
Avg of column values in a file
#!/bin/bash
# Compute the average using awk
avg=$(awk '{sum+=$1} END {print sum/NR}' log.txt)
echo "Average: $avg"
OR
#!/bin/bash
# Initialize sum and count
sum=0
count=0
# Read each line in the file
while read -r number; do
sum=$((sum + number)) # Add the current number to sum
count=$((count + 1)) # Increment the count
done < log.txt
# Calculate the average
if [ $count -gt 0 ]; then
avg=$((sum / count))
echo "Average: $avg"
else
echo "No numbers in the file"
fi
Average of N numbers
#!/bin/bash
read n
result=0
for ((i = 0; i < n; i++)); do
read num
result=$(echo " $result + $num "|bc)
done
r=$(echo "$result / $n" | bc -l)
printf "%0.3f" "$r"
Print 3rd char of each line
#!/bin/bash
while read line; do
echo "${line}" | cut -c3
done
Print 2nd position till 7th position of each line
#!/bin/bash
while read line; do
echo "${line}" | cut -c 2-7
done
Print 2nd position and 7th position of each line of input
#!/bin/bash
while read line; do
echo "${line}" | cut -c 2,7
done
Print first 4 characters of each line of string
#!/bin/bash
while read line; do
echo "${line}" | cut -c 0-4
done
Print characters from 13th position to till the end
#!/bin/bash
while read line; do
echo "${line}" | cut -c 13-
done
Print 4th word of each line
#!/bin/bash
while read line; do
echo "${line}" | cut -d ' ' -f 4
done
Print first 3 words in each line
#!/bin/bash
while read line; do
echo "${line}" | cut -d ' ' -f -3
done
Print words (tab-delimited) from the 2nd word to the end of each line
#!/bin/bash
while read line; do
# Extract everything from the second field onward, assuming tab as the delimiter
echo "$line" | cut -d $'\t' -f 2-
done
Print first 20 lines of text file
#!/bin/bash
head -20 $1
Print first 20 character of text file
#!/bin/bash
head -c 20 $1
Print from 12 to 22 lines in text file
#!/bin/bash
head -n 22 $1 | tail -n 11
Print last 20 lines in text fille
#!/bin/bash
tail -n 20 $1
Print last 20 character of text file
#!/bin/bash
tail -c 20 $1
Replace () with [] in a text file
#!/bin/bash
cat $1 | tr '()' '[]'
Remove lower case characters from the line
#!/bin/bash
tr -d 'a-z'
Replace multiple consecutive spaces with a single space
#!/bin/bash
while read input; do
echo $input | tr -s " " # s is squeeze
done
Sort in lexicographical/dictionary order of text file
#!/bin/bash
cat $1 | sort -d
Sort in reverse lexicographical/dictionary order of text file
#!/bin/bash
cat $1 | sort -r
Sort numeric/float numbers in a file (ascending)
#!/bin/bash
cat $1 | sort -n
Sort numeric/float numbers in a file (descending)
#!/bin/bash
cat $1 | sort -nr
Sort a tab-separated file of tabular data in descending order by column 2 (-k2)
#!/bin/bash
sort -rnk2 -t $'\t'
Sort a pipe-separated file (e.g. monthly temperature averages) in descending order by column 2
sort -t'|' -rnk2
Remove Consecutive repetitions in a file
#!/bin/bash
uniq $1
Consecutive duplicates with count and element, removing the leading spaces
#!/bin/bash
uniq -c | sed 's/^[[:space:]]*//'
Consecutive duplicates with count and element, ignoring case
#!/bin/bash
uniq -ci | cut -c7-
Print only the lines that are not consecutively repeated (unique lines)
#!/bin/bash
uniq -u
Replace new lines with tab
tr "\n" "\t"
OR
paste -s
Join every 3 lines of input into one tab-separated line
paste - - -
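For example, feeding it the numbers 1 to 6:
seq 1 6 | paste - - -   # prints "1<TAB>2<TAB>3" then "4<TAB>5<TAB>6"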
AWK
- https://www.thegeekstuff.com/2010/02/awk-conditional-statements/#google_vignette
- https://www.thegeekstuff.com/2010/01/awk-introduction-tutorial-7-awk-print-examples/
Your task is to identify the performance grade for each student. If the average of the three scores is 80 or more, the grade is 'A'. If the average is 60 or above, but less than 80, the grade is 'B'. If the average is 50 or above, but less than 60, the grade is 'C'. Otherwise the grade is 'FAIL'.
#!/bin/bash
cat marks.txt | awk '{
if (($2+$3+$4)/3 >= 80)
print $0,":","A";
else if (($2+$3+$4)/3 >= 60 && ($2+$3+$4)/3 < 80)
print $0,":","B";
else if (($2+$3+$4)/3 >= 50 && ($2+$3+$4)/3 < 60)
print $0,":","C";
else
print $0,":","FAIL";
}'
Concatenate every 2 lines of input with a semi-colon.
awk 'ORS=NR%2?";":"\n"'
ORS is the output record separator; here we choose it based on a condition.
NR is the current record (line) number.
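A quick check of what the one-liner does, with four sample lines as input:
printf 'a\nb\nc\nd\n' | awk 'ORS=NR%2?";":"\n"'
# output:
# a;b
# c;d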
awk '{ print $1 $2 }' test.log
This concatenates columns 1 and 2 (in awk, placing values side by side concatenates them).
Mimicking grep with awk:
awk '/test/ { print $1 }' test.log
This prints column 1 of every line that matches "test".
Lines containing at least one lowercase letter:
awk '/[a-z]/ { print }' test.log
Lines that start with a number:
awk '/^[0-9]/ { print }' test.log
Lines that end with a number:
awk '/[0-9]$/ { print }' test.log
Print lines whose column 1 contains 123:
awk '{
if ($1 ~ /123/)
print
}' test.log
Print lines whose column 1 contains a number:
awk '{
if ($1 ~ /[0-9]/)
print
}' test.log
Check that all three score columns in each row are non-empty:
#!/bin/bash
awk '{
if( $2 =="" || $3 =="" || $4 =="" )
print "Not all scores are available for",$1
}'
Students pass/fail check
awk '{
avg = ($2 + $3 + $4)/3
if ( avg >=50 )
print $1,":","Pass"
else
print $1,":","Fail"
}'
GREP
https://www.thegeekstuff.com/2009/03/15-practical-unix-grep-command-examples/#google_vignette
https://tldp.org/LDP/Bash-Beginners-Guide/html/sect_04_02.html
REGEX
https://www.gnu.org/software/sed/manual/html_node/Regular-Expressions.html
Find the lines which contain these words as whole words, case-insensitive:
egrep -wi "the|that|then|those"
Credit card numbers are written as 4 groups of 4 digits each.
We need to print the numbers where the same digit appears twice in a row, either separated by a space or not. In basic regular expressions, \( \) captures a group (here a single digit [0-9]), \s* matches optional spaces, and \1 is a back-reference to that captured digit.
grep '\([0-9]\)\s*\1'
Print the lines that don't contain the word "that", case-insensitive:
grep -iv "that"
Print the lines that do contain the word "the" (whole word), case-insensitive:
grep -wi "the"
grep -n "maths" dump.log
gives the matching lines with their line numbers.
grep -n "maths$" dump.log
gives the lines that end with "maths", with line numbers.
grep -n "^maths" dump.log
gives the lines that start with "maths", with line numbers.
$ !! -c
!! expands to the previously run command, so this re-runs the last grep with -c appended.
grep -n "maths" dump.log -c
-c prints the count of matching lines instead of the lines themselves.
grep -n "ma.." dump.log
matches "ma" followed by any two characters.
grep -n "^[ab]" dump.log
matches lines that start with either a or b.
For lines starting with aa, ab, ba or bb - "^[ab][ab]"
at least one digit - "[0-9]"
a digit followed by a letter (any case) - "[0-9][a-zA-Z]"
SED
Case-insensitive replace of "thy" with "your", globally
sed 's/\bthy\b/your/Ig'
Highlight "thy" by wrapping it in {}
sed -e 's/thy/{&}/Ig'
Mask the first three groups of the credit card numbers (keep only the last group)
awk '{
print "**** **** ****",$4
}'
Using regex (this version assumes the 16 digits are written without spaces):
sed 's/[0-9]\{12\}\([0-9]\{4\}\)/**** **** **** \1/g'
Reverse the order of the 4 space-separated groups of 4 digits each:
sed -E 's/([0-9]{4}) ([0-9]{4}) ([0-9]{4}) ([0-9]{4})/\4 \3 \2 \1/g'
Explanation:
1. ([0-9]{4}): each of the four groups captures a block of 4 digits; the literal spaces between them match the spaces in the input.
2. \4 \3 \2 \1: specifies the order of the captured groups in the replacement:
- \4: the fourth captured group (the last 4 digits).
- \3: the third captured group.
- \2: the second captured group.
- \1: the first captured group (the first 4 digits).
This prints the groups in reverse order.
3. g: applies the substitution to all occurrences on the line.
Reading country names line by line into an array, then printing the element at index 3
i=0
countries=()
while read line; do
countries[i]=$line
i=$((i+1))
done
echo "${countries[3]}"
Print the count of elements in the input
i=0
countries=()
while read line; do
countries[i]=$line
i=$((i+1))
done
echo "${#countries[@]}"
Print the elements, replacing the leading capital letter with "."
i=0
countries=()
while read line; do
if [[ $line =~ ^[A-Z] ]]; then
countries[i]=".${line:1}"
i=$((i+1))
fi
done
echo "${countries[@]}"
Print the number that occurs only once in an array where every other element appears twice.
#!/bin/bash
# Read the number of integers
read n
# Read the array of integers
read -a arr
# Initialize a variable to hold the result
result=0
# XOR all the numbers
for num in "${arr[@]}"; do
result=$((result ^ num))
done
# Output the number that occurs only once
echo $result
delete first 100 lines in a file:
sed -i '1,100d' filename
Words that start with a capital letter followed by 2 lowercase letters:
egrep -o "\b[A-Z][a-z]{2}\b" /etc/nsswitch.conf > /home/bob/filtered1
Extract 5-digit numbers from a file:
egrep -o '[0-9]{5}' /home/bob/textfile > /home/bob/number
Count the lines that start with the number 2:
egrep -c '^2' /home/bob/textfile
Count the lines that begin with "Section" (case-insensitive) in a file:
egrep -ic '^Section' /home/bob/testfile
Lines containing the word "man" as an exact (whole-word) match:
egrep "\bman\b" /home/bob/testfile > /home/bob/man_filtered
Extract the last 500 lines of a file:
tail -500 /home/bob/textfile > /home/bob/last