I recently worked on a data analysis project where security was a top priority. My goal was to ensure that sensitive data was shared efficiently while restricting direct access to the primary data server. This presented an interesting challenge: how to securely share data with multiple users without granting them access to the data server itself.
Table of Contents
Let me know if you’d like me to adjust the article sections or refine the table of contents further!
Rsync and Samba Server: A Secure Solution
To address this, I configured a Samba server combined with an rsync service to synchronize data between the main data server and the Samba server. The Samba server allows users to access the shared data while keeping the primary server secure.
To enhance security, I set up rsync over SSH, ensuring that all data transfers are encrypted. SSH authentication keys were configured for automated logins, eliminating the need for password entry during synchronization. Some might wonder: "If you're already using SSH and authentication keys, why not just use scp
?" The answer lies in the efficiency and flexibility of rsync.
Why Use Rsync?
Rsync is a powerful tool that identifies differences between directories on two servers and synchronizes them efficiently. This includes deleting files that no longer exist on the source directory, making it an excellent choice for maintaining an up-to-date replica. I was genuinely surprised at how simple it was to set up and use.
Syncing Data to the Samba Server
For this project, I needed to synchronize all files from the data server's directory /data/stock_predictor/
to a directory on the Samba server. Here’s the basic command I used:
rsync -avz --delete "/data/stock_predictor/" "rsyncuser@192.168.178.77:/srv/data/"
In my case, there was an additional requirement: I didn’t want the database files to be included in the sync. Excluding the database directory was as simple as adding an --exclude
flag:
rsync -avz --delete --exclude "/data/stock_predictor/database/" "/data/stock_predictor/" "rsyncuser@192.168.178.77:/srv/data/"
Syncing Data Back to the Server
The users could also upload XLSX files to a designated folder on the Samba server. These files needed to be synchronized back to the data server for processing. The command for this reverse synchronization was straightforward:
rsync -avz --delete "/srv/data/xlsx/upload/" "rsyncuser@192.168.178.76:/data/stock_predictor/xlsx/upload/"
Automating the Sync with Systemd Timers
To automate these sync processes, I created a script, /usr/local/bin/rsync_data.sh
, tailored for each server. I then used systemd timers and services to schedule the synchronization tasks. Since most of the data generation occurred during off-hours, I scheduled the sync to run every 15 minutes. This ensured that any updated data was available shortly after it was generated.
Why Not Use Rsync as a Daemon?
While rsync can operate as a daemon, it doesn't inherently support encryption for data transfers. For my project, encryption was non-negotiable. By combining rsync with SSH and systemd timers, I achieved secure, efficient, and automated data synchronization—all with just a few lines of configuration. code here
This setup highlights the flexibility and power of Linux. With tools like rsync, SSH, and systemd, you can create a robust and secure system tailored to your specific needs.
Top comments (0)