DEV Community

Cover image for Day-02 / 100 - Archive Older Files
Sourav kumar
Sourav kumar

Posted on • Edited on

Day-02 / 100 - Archive Older Files

Project Requirement

In the given directory, if you find files more than a given size ex: 20MB or files older than given days ex: 10 days

Compress those files and move in a β€˜archive’ folder.

1. Why are we making this script?

  • We are creating this Bash script to automate the process of managing large and old files in a given directory.
  • Over time, directories accumulate large and outdated files, consuming unnecessary disk space. Manually identifying and archiving these files is time-consuming and inefficient.
  • This script helps automate the process by finding files based on size and age, compressing them, and moving them to an archive folder for better storage management.

2. Purpose of the script

The purpose of this script is to improve disk space management and system performance by:

  • Identifying large files exceeding a specific size (e.g., 20MB)
  • Finding old files that have not been modified for a specified number of days (e.g., 10 days)
  • Compressing files to reduce storage consumption
  • Moving archived files to a separate directory for better organization
  • Automating cleanup to avoid manual intervention

This script is useful in DevOps workflows for log file management, backup automation, and system maintenance. πŸš€

Steps of script:

  1. Defining Variables
  2. Checking if the Base Directory Exists
  3. Create β€˜archive’ folder if not already present
  4. Find all the files with size more than 20MB or being stored for more than 10 days
  5. Compress each file and Move the compressed files in β€˜archive’ folder

1. Defining Variables
At the beginning of the script, we define a few key variables:

BASE=/home/sourav/tutorials/find_command  # The base directory to search files in
DAYS=10  # (Unused in this script but could be used for age-based filtering)
DEPTH=1  # The depth level for the find command
RUN=0    # A control flag (currently set to always archive)
Enter fullscreen mode Exit fullscreen mode

BASE : Defines the directory where we want to search for large files.
DAYS: Unused in this script, but it might be intended for filtering files based on modification time.
DEPTH: Restricts how deep the find command should search in subdirectories.
RUN: This flag is currently set to 0, meaning the script will always run the archiving logic.

2. Checking if the Base Directory Exists
Before performing any operations, the script verifies if the target directory exists:

if [ ! -d $BASE ]
then
    echo "directory does not exist: $BASE"
    exit 1
fi
Enter fullscreen mode Exit fullscreen mode
  • if [ ! -d $BASE ]: Checks if $BASE is not a directory.
  • If the directory does not exist, the script prints an error message and exits.

3. Create β€˜archive’ folder if not already present
If an archive folder does not exist inside $BASE, the script creates one:

if [ ! -d $BASE/archive ]
then
    mkdir $BASE/archive
fi
Enter fullscreen mode Exit fullscreen mode
  • mkdir $BASE/archive: Creates the archive folder to store compressed files.
  • This ensures that files have a designated place for storage.

4. Find all the files with size more than 20MB or being store for more than 10 days

The core functionality of this script is to locate files larger than 20MB or Files older than 10 days and compress them:

for i in $(find "$BASE" -maxdepth "$DEPTH" -type f \( -size +20M -o -mtime +"$DAYS" \))
Enter fullscreen mode Exit fullscreen mode
  • find $BASE β†’ Starts searching from the base directory.
  • -maxdepth $DEPTH β†’ Limits search depth to avoid scanning deep subdirectories.
  • -type f β†’ Ensures only files (not directories) are selected.
  • (...) β†’ Groups conditions to apply the OR (-o) logic properly.
  • -size +20M β†’ Finds files larger than 20MB.
  • -o (OR operator) β†’ Ensures files matching either condition are selected.
  • -mtime +"$DAYS" β†’ Finds files older than $DAYS days.

5. Compressing and Moving the Files

Inside the loop, the script checks the RUN variable before proceeding:

if [ $RUN -eq 0 ]
Enter fullscreen mode Exit fullscreen mode

Since RUN is set to 0, it executes the archiving process:

echo "[ $(date "+%Y-%m-%d %H:%M:%S") ] archiving $i ==> $BASE/archive"
gzip $i || exit 1
mv $i.gz $BASE/archive || exit 1
Enter fullscreen mode Exit fullscreen mode
  • date "+%Y-%m-%d %H:%M:%S": Prints the timestamp for logging purposes.
  • gzip $i || exit 1: Compresses the file using gzip. If compression fails, the script exits.
  • mv $i.gz $BASE/archive || exit 1: Moves the compressed file (.gz format) to the archive directory.

whole script

BASE=/home/sourav/tutorials/find_command
DAYS=10
DEPTH=1
RUN=0

# Check if the directory is present or not
if [ ! -d "$BASE" ]; then
    echo "directory does not exist: $BASE"
    exit 1
fi

# Create 'archive' folder if not present
if [ ! -d "$BASE/archive" ]; then
    mkdir "$BASE/archive"
fi

# Find files that are either larger than 20MB OR older than 10 days
for i in $(find "$BASE" -maxdepth "$DEPTH" -type f \( -size +20M -o -mtime +"$DAYS" \)); 
do
    if [ "$RUN" -eq 0 ]; then
        echo "[ $(date "+%Y-%m-%d %H:%M:%S") ] archiving $i ==> $BASE/archive"
        gzip "$i" || exit 1
        mv "$i.gz" "$BASE/archive" || exit 1
    fi
done
Enter fullscreen mode Exit fullscreen mode

The logic handles these condtions

  1. A 5MB file that is 12 days old β†’ βœ… Archived (because it's old).
  2. A 25MB file that is 3 days old β†’ βœ… Archived (because it's large).
  3. A 5MB file that is 3 days old β†’ ❌ Not Archived (doesn't match either condition).

Save this as archive_script.sh, then give it execute permissions and run it:

chmod +x archive_script.sh
./archive_script.sh
Enter fullscreen mode Exit fullscreen mode

πŸ“Œ Before Running the Script

$ ls -lh /home/sourav3366/tutorials/find_command
-rw-r--r-- 1 sourav3366 users  25M Feb 12  test1.log
-rw-r--r-- 1 sourav3366 users  5M  Feb 1   test2.log
-rw-r--r-- 1 sourav3366 users  30M Feb 15  test3.log
-rw-r--r-- 1 sourav3366 users  10M Feb 10  test4.log

Enter fullscreen mode Exit fullscreen mode

Output of the Script Execution

[ 2025-02-17 14:30:45 ] Archiving /home/sourav3366/tutorials/find_command/test1.log ==> /home/sourav3366/tutorials/find_command/archive
[ 2025-02-17 14:30:46 ] Archiving /home/sourav3366/tutorials/find_command/test2.log ==> /home/sourav3366/tutorials/find_command/archive
[ 2025-02-17 14:30:47 ] Archiving /home/sourav3366/tutorials/find_command/test3.log ==> /home/sourav3366/tutorials/find_command/archive
Enter fullscreen mode Exit fullscreen mode

πŸ“‚ After Running the Script

$ ls -lh /home/sourav3366/tutorials/find_command
-rw-r--r-- 1 sourav3366 users  10M Feb 10  test4.log
drwxr-xr-x 2 sourav3366 users  4K  Feb 17  archive  # Archive folder created

Enter fullscreen mode Exit fullscreen mode
$ ls -lh /home/sourav3366/tutorials/find_command/archive
-rw-r--r-- 1 sourav3366 users  2M  Feb 17  test1.log.gz
-rw-r--r-- 1 sourav3366 users  500K Feb 17  test2.log.gz
-rw-r--r-- 1 sourav3366 users  2.5M Feb 17  test3.log.gz
Enter fullscreen mode Exit fullscreen mode

βœ… Final Outcome

Image description

Top comments (0)