Project Requirement
In a given directory, find files larger than a given size (e.g., 20MB) or older than a given number of days (e.g., 10 days).
Compress those files and move them into an "archive" folder.
1. Why are we making this script?
- We are creating this Bash script to automate the process of managing large and old files in a given directory.
- Over time, directories accumulate large and outdated files, consuming unnecessary disk space. Manually identifying and archiving these files is time-consuming and inefficient.
- This script helps automate the process by finding files based on size and age, compressing them, and moving them to an archive folder for better storage management.
2. Purpose of the script
The purpose of this script is to improve disk space management and system performance by:
- Identifying large files exceeding a specific size (e.g., 20MB)
- Finding old files that have not been modified for a specified number of days (e.g., 10 days)
- Compressing files to reduce storage consumption
- Moving archived files to a separate directory for better organization
- Automating cleanup to avoid manual intervention
This script is useful in DevOps workflows for log file management, backup automation, and system maintenance.
Steps of script:
- Defining Variables
- Checking if the Base Directory Exists
- Create the "archive" folder if not already present
- Find all files larger than 20MB or older than 10 days
- Compress each file and move the compressed file into the "archive" folder
1. Defining Variables
At the beginning of the script, we define a few key variables:
BASE=/home/sourav/tutorials/find_command # The base directory to search files in
DAYS=10 # Age threshold in days, used by find -mtime
DEPTH=1 # The depth level for the find command
RUN=0 # A control flag; the loop archives only when this is 0
BASE: Defines the directory where we want to search for large or old files.
DAYS: The age threshold in days. It is passed to find as -mtime +"$DAYS" to select files that have not been modified recently.
DEPTH: Restricts how deep the find command should search in subdirectories.
RUN: A control flag. The archiving logic runs only when it is 0, which is the value set here; the script defines no behavior for other values.
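As a minimal sketch, these values could be made overridable from the environment so the script does not need editing for each directory. The `${VAR:-default}` fallbacks and the extra `SIZE` variable are illustrative assumptions, not part of the original script.

```shell
#!/usr/bin/env bash
# Each variable keeps the tutorial's value unless the caller already exported it.
BASE="${BASE:-/home/sourav/tutorials/find_command}"  # directory to scan
DAYS="${DAYS:-10}"     # age threshold in days, passed to find -mtime
DEPTH="${DEPTH:-1}"    # how many directory levels find may descend
RUN="${RUN:-0}"        # 0 = run the archiving logic (same meaning as in the script)
SIZE="${SIZE:-20M}"    # hypothetical addition: size threshold for find -size

printf 'BASE=%s DAYS=%s DEPTH=%s RUN=%s SIZE=%s\n' \
    "$BASE" "$DAYS" "$DEPTH" "$RUN" "$SIZE"
```

With this pattern, a one-off run with a different threshold would look like `DAYS=30 ./archive_script.sh`.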
2. Checking if the Base Directory Exists
Before performing any operations, the script verifies if the target directory exists:
if [ ! -d "$BASE" ]
then
echo "directory does not exist: $BASE"
exit 1
fi
- if [ ! -d "$BASE" ]: Checks if $BASE is not a directory. The double quotes keep the test working when the path contains spaces.
- If the directory does not exist, the script prints an error message and exits with status 1.
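The same check can be sketched as a small reusable function. The name `require_dir` is hypothetical, and sending the message to stderr is a design choice of this sketch rather than something the original script does.

```shell
#!/usr/bin/env bash
# require_dir (hypothetical helper): succeed if the path is a directory,
# otherwise print an error to stderr and return 1.
require_dir() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "directory does not exist: $dir" >&2
        return 1
    fi
}

# Demo in a throwaway directory whose name contains a space.
demo_dir="$(mktemp -d)/my logs"
mkdir -p "$demo_dir"
require_dir "$demo_dir" && echo "ok: $demo_dir"
require_dir "$demo_dir/missing" || echo "check failed as expected"
```

Returning a status instead of calling `exit` lets the caller decide what to do, for example `require_dir "$BASE" || exit 1`.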
3. Create the "archive" folder if not already present
If an archive folder does not exist inside $BASE, the script creates one:
if [ ! -d "$BASE/archive" ]
then
mkdir "$BASE/archive"
fi
- mkdir "$BASE/archive": Creates the archive folder to store compressed files.
- This ensures that the compressed files have a designated place for storage.
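Because `mkdir -p` does not report an error when the directory already exists, the test and the creation can be collapsed into one call. A minimal sketch, with `ensure_archive_dir` as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# ensure_archive_dir (hypothetical helper): create <base>/archive if missing.
ensure_archive_dir() {
    local base="$1"
    # -p: no error if the directory already exists; quotes protect spaces.
    mkdir -p "$base/archive"
}

base_dir="$(mktemp -d)"          # throwaway directory standing in for $BASE
ensure_archive_dir "$base_dir"
ensure_archive_dir "$base_dir"   # second call is a no-op, not an error
[ -d "$base_dir/archive" ] && echo "archive folder ready"
```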
4. Find all files larger than 20MB or older than 10 days
The core of this script locates files that are larger than 20MB or older than 10 days so that they can be compressed:
for i in $(find "$BASE" -maxdepth "$DEPTH" -type f \( -size +20M -o -mtime +"$DAYS" \))
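- find "$BASE" → Starts searching from the base directory.
- -maxdepth "$DEPTH" → Limits how many directory levels find descends. With DEPTH=1, only files directly inside $BASE are considered, so files already in the archive folder are not picked up again.
- -type f → Ensures only regular files (not directories) are selected.
- \( ... \) → Groups the two tests so that the OR (-o) applies only between them. The parentheses are escaped so the shell passes them to find unchanged.
- -size +20M → Finds files larger than 20MB (find's M unit is 1,048,576 bytes).
- -o (OR operator) → Ensures files matching either condition are selected.
- -mtime +"$DAYS" → Finds files last modified more than $DAYS days ago. find counts whole 24-hour periods and ignores the fraction, so +10 matches files that are at least 11 days old.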
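One caveat of the `for i in $(find ...)` form is that the shell splits find's output on whitespace, so a name such as `old report.log` would be treated as two separate words. A common alternative, sketched here on the assumption that find supports `-print0` and the script runs under Bash, reads NUL-separated names instead. The helper name `list_candidates` and the demo files are hypothetical.

```shell
#!/usr/bin/env bash
# list_candidates (hypothetical helper): print matching files, NUL-separated.
list_candidates() {
    local base="$1" depth="$2" days="$3"
    find "$base" -maxdepth "$depth" -type f \
        \( -size +20M -o -mtime +"$days" \) -print0
}

# Demo in a throwaway directory: one old file whose name contains a space.
demo="$(mktemp -d)"
touch -d '12 days ago' "$demo/old report.log"   # GNU touch syntax (assumption)
touch "$demo/fresh.log"

count=0
while IFS= read -r -d '' file; do
    echo "would archive: $file"
    count=$((count + 1))
done < <(list_candidates "$demo" 1 10)
echo "matched $count file(s)"
```

Feeding the loop through process substitution (`< <(...)`) rather than a pipe keeps `count` in the current shell, so it is still set after the loop ends.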
5. Compressing and Moving the Files
Inside the loop, the script checks the RUN variable before proceeding:
if [ "$RUN" -eq 0 ]
Since RUN is set to 0, the condition is true and the archiving commands run:
echo "[ $(date "+%Y-%m-%d %H:%M:%S") ] archiving $i ==> $BASE/archive"
gzip "$i" || exit 1
mv "$i.gz" "$BASE/archive" || exit 1
- date "+%Y-%m-%d %H:%M:%S": Produces the timestamp for the log line.
- gzip "$i" || exit 1: Compresses the file with gzip, which replaces it with $i.gz. If compression fails, the script exits with status 1.
- mv "$i.gz" "$BASE/archive" || exit 1: Moves the compressed file (.gz format) into the archive directory. If the move fails, the script exits with status 1.
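The compress-and-move step can be sketched as a function that can be tried on a single file. The name `archive_one` is hypothetical, and treating a non-zero RUN as a dry run is an assumption of this sketch; the original script only defines the RUN=0 case.

```shell
#!/usr/bin/env bash
# archive_one (hypothetical helper): gzip one file, then move the .gz file
# into the archive directory. A non-zero third argument means "dry run",
# which is an assumption and not behavior of the original script.
archive_one() {
    local file="$1" archive_dir="$2" run="${3:-0}"
    echo "[ $(date "+%Y-%m-%d %H:%M:%S") ] archiving $file ==> $archive_dir"
    if [ "$run" -ne 0 ]; then
        return 0                      # dry run: log only, change nothing
    fi
    gzip -- "$file" || return 1       # creates "$file.gz", removes "$file"
    mv -- "$file.gz" "$archive_dir/" || return 1
}

# Demo in a throwaway directory.
work="$(mktemp -d)"
mkdir -p "$work/archive"
printf 'sample log line\n' > "$work/test.log"
archive_one "$work/test.log" "$work/archive" 0
ls "$work/archive"
```

The `--` before each path stops gzip and mv from interpreting a filename that begins with a dash as an option.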
Whole script
#!/bin/bash
BASE=/home/sourav/tutorials/find_command
DAYS=10
DEPTH=1
RUN=0
# Check if the directory is present or not
if [ ! -d "$BASE" ]; then
echo "directory does not exist: $BASE"
exit 1
fi
# Create 'archive' folder if not present
if [ ! -d "$BASE/archive" ]; then
mkdir "$BASE/archive"
fi
# Find files that are either larger than 20MB OR older than 10 days
for i in $(find "$BASE" -maxdepth "$DEPTH" -type f \( -size +20M -o -mtime +"$DAYS" \));
do
if [ "$RUN" -eq 0 ]; then
echo "[ $(date "+%Y-%m-%d %H:%M:%S") ] archiving $i ==> $BASE/archive"
gzip "$i" || exit 1
mv "$i.gz" "$BASE/archive" || exit 1
fi
done
The logic handles these conditions:
- A 5MB file that is 12 days old → Archived (because it's old).
- A 25MB file that is 3 days old → Archived (because it's large).
- A 5MB file that is 3 days old → Not archived (doesn't match either condition).
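These three cases can be reproduced in a throwaway directory to see which files the predicate selects. This is a sketch that assumes GNU coreutils (`truncate`, `touch -d`) and GNU find (`-printf`); the file names are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Recreate the three cases with sparse files of the stated sizes and ages.
demo="$(mktemp -d)"
truncate -s 5M  "$demo/small_old.log"; touch -d '12 days ago' "$demo/small_old.log"
truncate -s 25M "$demo/large_new.log"; touch -d '3 days ago'  "$demo/large_new.log"
truncate -s 5M  "$demo/small_new.log"; touch -d '3 days ago'  "$demo/small_new.log"

# Same predicate as the script; -printf '%f\n' prints only the file name.
matches="$(find "$demo" -maxdepth 1 -type f \
    \( -size +20M -o -mtime +10 \) -printf '%f\n' | sort)"
echo "$matches"
```

Under those assumptions the listing should contain `large_new.log` and `small_old.log`, while `small_new.log` is left out.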
Save this as archive_script.sh, then give it execute permissions and run it:
chmod +x archive_script.sh
./archive_script.sh
Before Running the Script
$ ls -lh /home/sourav3366/tutorials/find_command
-rw-r--r-- 1 sourav3366 users 25M Feb 12 test1.log
-rw-r--r-- 1 sourav3366 users 5M Feb 1 test2.log
-rw-r--r-- 1 sourav3366 users 30M Feb 15 test3.log
-rw-r--r-- 1 sourav3366 users 10M Feb 10 test4.log
Output of the Script Execution
[ 2025-02-17 14:30:45 ] archiving /home/sourav3366/tutorials/find_command/test1.log ==> /home/sourav3366/tutorials/find_command/archive
[ 2025-02-17 14:30:46 ] archiving /home/sourav3366/tutorials/find_command/test2.log ==> /home/sourav3366/tutorials/find_command/archive
[ 2025-02-17 14:30:47 ] archiving /home/sourav3366/tutorials/find_command/test3.log ==> /home/sourav3366/tutorials/find_command/archive
After Running the Script
$ ls -lh /home/sourav3366/tutorials/find_command
-rw-r--r-- 1 sourav3366 users 10M Feb 10 test4.log
drwxr-xr-x 2 sourav3366 users 4K Feb 17 archive # Archive folder created
$ ls -lh /home/sourav3366/tutorials/find_command/archive
-rw-r--r-- 1 sourav3366 users 2M Feb 17 test1.log.gz
-rw-r--r-- 1 sourav3366 users 500K Feb 17 test2.log.gz
-rw-r--r-- 1 sourav3366 users 2.5M Feb 17 test3.log.gz
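Final Outcome
test1.log and test3.log (larger than 20MB) and test2.log (older than 10 days) were compressed and moved into the archive folder, while test4.log matched neither condition and stayed in place.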