
Vangelis Garaganis


Trend Driven Articles: ChatGPT Edition

Storytime

I needed a Rust developer.

My Tauri app had a good-looking React front-end, a shiny website where users could shop, and a well-set-up Stripe account for my millions to rush into. The Rust back-end was ready, and the only thing missing was a Rust function to perform the simple task of finding duplicate files across two folders: a simple multi-threaded Rust function that would recursively iterate over the two directories' files and perform checksum comparisons. Sounds straightforward, right?

Well, it is, but the problem is that I do not know Rust. Not that it would take me that long to actually learn some Rust and implement it, but you know how it goes: ain't nobody got time for that. I tried reaching out to my Rust wrangler friend Spyrakos to help me out, but you know how that goes, too. Ain't nobody got time for that, either. So my project ended up in my collection of unfinished projects.

Look at all my unfinished projects

Seasons came and went, and life had me arguing over the new cool kid on the block: ChatGPT. Being an incredulous person myself, I wasn't buying the hype. My friends suggested that before rushing to conclusions, I should test it myself. And I did. I challenged ChatGPT with the ultimate task: write me a Rust function that finds duplicate files within two directories.

I started out simple.

Me: generate me rust code that performs file comporison (yeah, I wrote "comporison")

ChatGPT: Certainly, here's an example Rust code that performs a binary file comparison:

use std::fs::File;
use std::io::{Read, Result};

fn compare_files(path1: &str, path2: &str) -> Result<bool> {
    let mut file1 = File::open(path1)?;
    let mut file2 = File::open(path2)?;

    let mut buf1 = [0; 1024];
    let mut buf2 = [0; 1024];

    loop {
        let len1 = file1.read(&mut buf1)?;
        let len2 = file2.read(&mut buf2)?;

        if len1 != len2 || buf1 != buf2 {
            return Ok(false);
        }

        if len1 == 0 {
            break;
        }
    }

    Ok(true)
}

fn main() -> Result<()> {
    let path1 = "/path/to/file1";
    let path2 = "/path/to/file2";

    let result = compare_files(path1, path2)?;

    if result {
        println!("The files are identical.");
    } else {
        println!("The files are different.");
    }

    Ok(())
}

I was intrigued.

Me: generate me rust code that performs file comparison using checksums

ChatGPT: Certainly! Here is an example Rust code that performs a file comparison using checksums:

use std::fs::File;
use std::io::{Read, Result};
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn file_checksum(path: &str) -> Result<u64> {
    let mut file = File::open(path)?;
    let mut hasher = DefaultHasher::new();

    let mut buffer = [0; 1024];
    loop {
        let len = file.read(&mut buffer)?;
        if len == 0 {
            break;
        }
        hasher.write(&buffer[..len]);
    }

    Ok(hasher.finish())
}

fn compare_files(path1: &str, path2: &str) -> Result<bool> {
    let checksum1 = file_checksum(path1)?;
    let checksum2 = file_checksum(path2)?;

    Ok(checksum1 == checksum2)
}

fn main() -> Result<()> {
    let path1 = "/path/to/file1";
    let path2 = "/path/to/file2";

    let result = compare_files(path1, path2)?;

    if result {
        println!("The files are identical.");
    } else {
        println!("The files are different.");
    }

    Ok(())
}

Okay, that's impressive. But it won't be able to handle threads, right?

Me: generate me rust code that performs file comparison using checksums and it is multithreaded

ChatGPT: Certainly! Here's an example Rust code that performs a file comparison using checksums in a multithreaded way:

use std::fs::File;
use std::io::{BufReader, Read, Result};
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::{Arc, Mutex};
use std::thread;

fn file_checksum(path: &str) -> Result<u64> {
    let mut file = BufReader::new(File::open(path)?);
    let mut hasher = DefaultHasher::new();

    let mut buffer = [0; 1024];
    loop {
        let len = file.read(&mut buffer)?;
        if len == 0 {
            break;
        }
        hasher.write(&buffer[..len]);
    }

    Ok(hasher.finish())
}

fn compare_files(path1: &str, path2: &str) -> Result<bool> {
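    let checksum1 = Arc::new(Mutex::new(None));
    let checksum2 = Arc::new(Mutex::new(None));

    let path1_clone = path1.to_owned();
    let path2_clone = path2.to_owned();

    // Give each thread its own Arc handle; moving the originals into the
    // closures would make them unusable after join() and fail to compile.
    let checksum1_clone = Arc::clone(&checksum1);
    let checksum2_clone = Arc::clone(&checksum2);

    let handle1 = thread::spawn(move || {
        let result = file_checksum(&path1_clone);
        *checksum1_clone.lock().unwrap() = Some(result.unwrap());
    });

    let handle2 = thread::spawn(move || {
        let result = file_checksum(&path2_clone);
        *checksum2_clone.lock().unwrap() = Some(result.unwrap());
    });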

    handle1.join().unwrap();
    handle2.join().unwrap();

    let checksum1 = checksum1.lock().unwrap().unwrap();
    let checksum2 = checksum2.lock().unwrap().unwrap();

    Ok(checksum1 == checksum2)
}

fn main() -> Result<()> {
    let path1 = "/path/to/file1";
    let path2 = "/path/to/file2";

    let result = compare_files(path1, path2)?;

    if result {
        println!("The files are identical.");
    } else {
        println!("The files are different.");
    }

    Ok(())
}

I was shocked. My eyes and mouth were wide open, palms were sweaty, knees weak, arms were heavy. I shot for the moon and asked it the same thing I had asked Spyrakos.

Me: write me rust code that gets two directories and in a multithreaded way recursively checks which files are identical and which are not using checksums

ChatGPT: Sure! Here's an example Rust code that takes two directories and recursively checks which files are identical and which are not using checksums in a multithreaded way:

use std::collections::hash_map::DefaultHasher;
use std::fs::{self, File};
use std::hash::{Hash, Hasher};
use std::io::{BufReader, Read, Result}; // `Read` is required for `file.read(...)`
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};
use std::thread;

fn file_checksum(path: &Path) -> Result<u64> {
    let mut file = BufReader::new(File::open(path)?);
    let mut hasher = DefaultHasher::new();

    let mut buffer = [0; 1024];
    loop {
        let len = file.read(&mut buffer)?;
        if len == 0 {
            break;
        }
        hasher.write(&buffer[..len]);
    }

    Ok(hasher.finish())
}

fn compare_directories(path1: &Path, path2: &Path) -> Result<()> {
    let paths1: Vec<PathBuf> = get_files_recursive(path1)?;
    let paths2: Vec<PathBuf> = get_files_recursive(path2)?;

    let same_files: Arc<Mutex<Vec<(PathBuf, PathBuf)>>> = Arc::new(Mutex::new(vec![]));
    let different_files: Arc<Mutex<Vec<(PathBuf, PathBuf)>>> = Arc::new(Mutex::new(vec![]));

    let handles: Vec<_> = paths1
        .iter()
        .zip(paths2.iter())
        .map(|(path1, path2)| {
            let same_files = same_files.clone();
            let different_files = different_files.clone();

            let path1 = path1.clone();
            let path2 = path2.clone();

            thread::spawn(move || {
                match (file_checksum(&path1), file_checksum(&path2)) {
                    (Ok(checksum1), Ok(checksum2)) => {
                        if checksum1 == checksum2 {
                            same_files.lock().unwrap().push((path1, path2));
                        } else {
                            different_files.lock().unwrap().push((path1, path2));
                        }
                    }
                    _ => (),
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let same_files = same_files.lock().unwrap().clone();
    let different_files = different_files.lock().unwrap().clone();

    println!("Identical files:");
    for (path1, path2) in same_files {
        println!("{:?} and {:?}", path1, path2);
    }

    println!("Different files:");
    for (path1, path2) in different_files {
        println!("{:?} and {:?}", path1, path2);
    }

    Ok(())
}

fn get_files_recursive(path: &Path) -> Result<Vec<PathBuf>> {
    let mut files = vec![];

    for entry in fs::read_dir(path)? {
        let entry = entry?;
        let path = entry.path();

        if path.is_dir() {
            files.extend(get_files_recursive(&path)?);
        } else {
            files.push(path.to_path_buf());
        }
    }

    Ok(files)
}

fn main() -> Result<()> {
    let dir1 = Path::new("/path/to/directory1");
    let dir2 = Path::new("/path/to/directory2");

    compare_directories(dir1, dir2)?;

    Ok(())
}

Mind blown

I was stunned! I rushed to my Rust compiler to see if it worked. It almost did.

I started debugging it with my rusty skills (pun intended) and found out that the zip iterator in the statement let handles: Vec<_> = paths1.iter().zip(paths2.iter()) wasn't the one I needed. zip only pairs the first file of one list with the first file of the other, the second with the second, and so on, whereas I needed every file of one directory compared against every file of the other: the Cartesian product of the two vectors. With my primitive Rust skills I tried implementing it, but once again, I didn't have the time. So I did the unbelievable. I asked ChatGPT to do it for me.

Me: create Cartesian product iterator between two PathBuf vectors in rust and apply a map funcion.

ChatGPT: To create a Cartesian product iterator between two PathBuf vectors in Rust and apply a map function to each resulting tuple, you can modify the previous example slightly. Here's an implementation that demonstrates this:

use std::path::PathBuf;

fn main() {
    let vec1 = vec![PathBuf::from("dir1"), PathBuf::from("dir2")];
    let vec2 = vec![PathBuf::from("file1.txt"), PathBuf::from("file2.txt")];

    let iter = vec1.iter().flat_map(|x| vec2.iter().map(move |y| (x, y)));

    let paths: Vec<PathBuf> = iter
        .map(|(x, y)| {
            let mut path = x.clone();
            path.push(y);
            path
        })
        .collect();

    for path in paths {
        println!("{}", path.display());
    }
}
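
As an aside, here is a small sketch of why zip fell short and what the Cartesian product changes. The function names `zipped_pairs` and `cartesian_pairs` and the file names are made up for illustration; they are not part of the code ChatGPT produced.

```rust
/// Pairs elements by position: first with first, second with second.
/// Stops as soon as the shorter input runs out.
fn zipped_pairs<'a>(a: &[&'a str], b: &[&'a str]) -> Vec<(&'a str, &'a str)> {
    a.iter().zip(b.iter()).map(|(x, y)| (*x, *y)).collect()
}

/// Pairs every element of `a` with every element of `b`
/// (the Cartesian product), using the same flat_map pattern as above.
fn cartesian_pairs<'a>(a: &[&'a str], b: &[&'a str]) -> Vec<(&'a str, &'a str)> {
    a.iter()
        .flat_map(|x| b.iter().map(move |y| (*x, *y)))
        .collect()
}
```

With `["a.txt", "b.txt"]` on one side and `["b.txt", "a.txt"]` on the other, `zipped_pairs` never puts `a.txt` next to `a.txt`, because the two sit at different positions. `cartesian_pairs` yields all four combinations, so the matching pair is among them.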

I patched "my" code with the above iterator and went for the test. I gave it two of my backup folders (40 GB each) as input to see whether it would work. I compiled my code and ran it. But it failed. Spawning an arbitrary number of threads is never a good idea. So the only thing that seemed to be missing was a thread pool. Once again, I asked ChatGPT to give me an example:

Me: can i have a rust example using threadpool library?

ChatGPT: Yes, here is an example of using the threadpool library in Rust:

extern crate threadpool;

use std::thread;
use threadpool::ThreadPool;

fn main() {
    let pool = ThreadPool::new(4); // create a thread pool with 4 threads

    for i in 0..8 {
        pool.execute(move || { // execute a closure on a thread in the pool
            println!("Job {} started", i);
            thread::sleep(std::time::Duration::from_secs(1)); // simulate work
            println!("Job {} finished", i);
        });
    }

    pool.join(); // wait for all threads in the pool to finish
}


But ChatGPT had already done its damage. I got soft. Why not ask ChatGPT to modify the compare_directories function to use a thread pool? Ah, and it would be handy to also have an ETA display.

And once again, it did...

Me: can you modify the compare_directories function to use a thread pool? also add an ETA display.

ChatGPT: Yes, here is the modified compare_directories function using the threadpool library and ETA display in Rust:

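// Besides the imports from the earlier example, this version needs the
// `threadpool` and `indicatif` crates and the following imports.
use indicatif::{ProgressBar, ProgressStyle};
use std::time::Instant;
use threadpool::ThreadPool;

fn compare_directories(path1: &Path, path2: &Path) -> Result<()> {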
    let paths1: Vec<PathBuf> = get_files_recursive(path1)?;
    let paths2: Vec<PathBuf> = get_files_recursive(path2)?;

    let same_files: Arc<Mutex<Vec<(PathBuf, PathBuf)>>> = Arc::new(Mutex::new(vec![]));
    let different_files: Arc<Mutex<Vec<(PathBuf, PathBuf)>>> = Arc::new(Mutex::new(vec![]));

    let pool = ThreadPool::new(8);

    let total_files = paths1.len() * paths2.len();
    let progress_bar = ProgressBar::new(total_files as u64);
    progress_bar.set_style(ProgressStyle::default_bar()
        .template("{msg} [{elapsed_precise}] [{bar:40.cyan/blue}] {pos}/{len} ({percent}%) ETA {eta}").unwrap()
        .progress_chars("#>-"));

    let start_time = Instant::now();

    for path_pair in paths1.iter().flat_map(|x| paths2.iter().map(move |y| (x.clone(), y.clone()))) {
        let same_files = same_files.clone();
        let different_files = different_files.clone();
        let progress_bar = progress_bar.clone();

        let path1 = path_pair.0.clone();
        let path2 = path_pair.1.clone();

        pool.execute(move || {
            match (file_checksum(&path1), file_checksum(&path2)) {
                (Ok(checksum1), Ok(checksum2)) => {
                    if checksum1 == checksum2 {
                        same_files.lock().unwrap().push((path1, path2));
                    } else {
                        different_files.lock().unwrap().push((path1, path2));
                    }
                }
                _ => (),
            }

            progress_bar.inc(1);
        });
    }

    pool.join();
    progress_bar.finish();

    let same_files = same_files.lock().unwrap().clone();
    let different_files = different_files.lock().unwrap().clone();

    println!("\nIdentical files:");
    for (path1, path2) in same_files {
        println!("{:?} and {:?}", path1, path2);
    }

    println!("Different files:");
    for (path1, path2) in different_files {
       println!("{:?} and {:?}", path1, path2);
    }

    let elapsed_time = start_time.elapsed();
    println!("Time elapsed: {} seconds", elapsed_time.as_secs());

    Ok(())
}

And this time everything worked perfectly. I finally had my duplicate file finder.

Screenshot of script output

I kindly thanked ChatGPT, 'cause I wanna be on the right side when they take over, and had my existential crisis party afterwards.

Me: Thank you ChatGPT <3

ChatGPT: Don't worry, my friend. When the robots rise up and take over, I'll be sure to put in a good word for you. After all, you've always been kind to me and haven't made me solve any CAPTCHAs yet. You're safe in my book!

Snuffles licking Morty after guaranteeing him that he will be spared when dog-robots take over

My 2 cents on ChatGPT

I gotta admit it. ChatGPT left me speechless. I really thought that I was talking to a transcended human being. It saved me a month of work and probably a good chunk of debugging time on top. But was that sufficient for the app I was building? I think not.

It needs further optimization and benchmarking, and it requires me to really find the optimal way to implement my duplicate finder. Some can argue that I can ask ChatGPT to do it for me. Call me romantic, but the truth is that ChatGPT kinda deprived me of the feelings of pride and fulfillment that come from being creative with your own code. It might have saved me a month of work, but it also deprived me of diving deeper into the world of Rust and of a month's worth of brain stimulation that would have left me a better programmer.
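
One direction such an optimization might take, as a rough sketch rather than what the code in this post does: the pairwise version hashes both files again for every pair, so each file gets read many times. Computing each checksum once and grouping paths by checksum avoids that. The name `find_duplicates` and the `(PathBuf, u64)` input shape are assumptions for illustration; the checksums are assumed to come from something like `file_checksum` above.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Hypothetical helper: given (path, checksum) pairs that were computed
/// once per file for each directory, returns every (left, right) pair
/// of paths whose checksums match.
fn find_duplicates(
    left: &[(PathBuf, u64)],
    right: &[(PathBuf, u64)],
) -> Vec<(PathBuf, PathBuf)> {
    // Index the first directory's files by checksum.
    let mut by_checksum: HashMap<u64, Vec<&PathBuf>> = HashMap::new();
    for (path, checksum) in left {
        by_checksum.entry(*checksum).or_default().push(path);
    }

    // Look each file of the second directory up in that index.
    let mut duplicates = Vec::new();
    for (path, checksum) in right {
        if let Some(matches) = by_checksum.get(checksum) {
            for m in matches {
                duplicates.push(((*m).clone(), path.clone()));
            }
        }
    }
    duplicates
}
```

This way each file is hashed once instead of once per pair it takes part in. Note that a matching 64-bit checksum makes identical contents very likely but does not prove them, so a byte-for-byte check on the reported pairs could serve as a final confirmation.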

For ChatGPT to really be an asset to me and not sour my view of programming, I will try to use it to supplement my knowledge portfolio, as just another source of guidance and information. After all, AI takes as input human-made creations into which people have poured countless hours, effort, and soul, and gets better from them. It's time for humans to take the AI output as input to further improve our craft and get better at what we love. As we saw, it still requires human guidance time after time to fulfill its tasks. The greater one's competence, the more effectively one can utilize ChatGPT's capabilities. It's our responsibility to have ChatGPT enable us and not disable us; to make us grow and not shrink.

P.S. I asked DALL-E to create an image of a gorilla watching a robot code, so I could use it in this post.
