NOGIER Loïc

Rust - Download a file using a link and a CSS selector

During my internship, my supervisor wanted a program to download an .xls file from a French government website. He wanted to repeat this download every month using the Task Scheduler built into Windows Server.

I made this little application, which takes two parameters: the first is the name of the file, and the second is the folder where the file should be saved.

Here is a demo:
https://i.imgur.com/d9yWKVC.gif

To find the link to download, I start from this page: https://communaute.chorus-pro.gouv.fr/annuaire-cpro

In Rust, I use the select crate to target the HTML element I need. Here is an example:

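use select::document::Document;
use select::predicate::Name;

// Fetch the page and return the link stored in the first `onclick` attribute
// found on an <input>. `Result` is assumed to be an alias defined elsewhere.
pub fn get_download_link(url: &str) -> Result<String> {
    let res = reqwest::get(url).expect("request failed");

    let document = Document::from_read(res)?;
    let web_url = document
        .find(Name("input"))
        .filter_map(|n| n.attr("onclick"))
        .next()
        .expect("no input with an onclick attribute");

    // Split on `href=` and strip the single quotes around the link.
    let url_splited: Vec<String> = web_url.split("href=").map(|c| c.replace("'", "")).collect();
    Ok(url_splited[1].to_owned())
}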

As you can see, I target the first input that has an onclick attribute and read that attribute, which contains the link.
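
The parsing step can be tried on its own, without any network access. Here is a minimal sketch using only the standard library. The helper name extract_href is hypothetical, and the shape of the onclick value (something like window.location.href='...') is an assumption, not taken from the real page. Unlike the function above, it returns None instead of panicking when href= is missing.

```rust
/// Extract the link that follows `href=` in an `onclick` attribute value.
/// Hypothetical helper: the attribute format is an assumption.
/// Returns `None` when the marker is missing or the link is empty.
fn extract_href(onclick: &str) -> Option<String> {
    // Keep only the part after the first `href=` marker.
    let (_, rest) = onclick.split_once("href=")?;
    // Drop a trailing semicolon, then the surrounding quotes.
    let link = rest
        .trim()
        .trim_end_matches(';')
        .trim_matches(|c: char| c == '\'' || c == '"');
    if link.is_empty() {
        None
    } else {
        Some(link.to_owned())
    }
}
```

For instance, extract_href("window.location.href='https://example.com/file.xls'") gives Some("https://example.com/file.xls"), where the URL is made up for the example.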

After that, all I have to do is download the file with the reqwest crate.
Example:

use std::fs::File;
use std::io;

// Download `url` and write the response body to `dir_name/file_name`.
pub fn download_link(url: &str, file_name: &str, dir_name: &str) {
    let mut resp = reqwest::get(url).expect("request failed");
    let mut out = File::create(format!("{}/{}", dir_name, file_name)).expect("failed to create file");
    // Stream the response body into the file.
    io::copy(&mut resp, &mut out).expect("failed to copy content");
}
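
The copy step does not depend on reqwest: io::copy works with anything that implements Read. As a rough sketch, assuming you would rather return errors than call expect, the write can be isolated like this. The name save_to_file and its signature are hypothetical and are not part of the repository.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Stream everything from `source` into the file at `target`.
/// Returns the number of bytes written, or the first I/O error.
fn save_to_file<R: Read>(mut source: R, target: &Path) -> io::Result<u64> {
    let mut out = File::create(target)?;
    // io::copy moves the data in chunks, so the whole body is
    // never held in memory at once.
    io::copy(&mut source, &mut out)
}
```

A blocking HTTP response could be passed as source, and so could a byte slice, which makes the function easy to try without a network connection.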

One thing to note: when I read the arguments, I check the parameter that gives the folder where the file will be saved. If it ends with a / or a \, I remove that character.
Example:

    // Remove a trailing / or \ if present
    if dir_name.ends_with("/") || dir_name.ends_with("\\") {
        dir_name.pop();
    }

This lets me build the output path with a single separator between the folder and the file name:

let mut out = File::create(format!("{}/{}", dir_name, file_name)).expect("failed to create file");
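
One possible alternative, shown here only as a sketch and not what the repository does, is to let std::path::Path::join place the separator. It does not add a second separator when the folder already ends with one. The helper name output_path is hypothetical.

```rust
use std::path::{Path, PathBuf};

/// Build the output path from a folder and a file name.
/// `join` inserts the platform separator only when it is needed.
fn output_path(dir_name: &str, file_name: &str) -> PathBuf {
    Path::new(dir_name).join(file_name)
}
```

With this approach, output_path("out", "file.xls") and output_path("out/", "file.xls") compare equal. Note that \ is only treated as a separator on Windows.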

Here is the GitHub repository: https://github.com/loicngr/LinkExtracter

Feel free to make your improvements :)

Peace =)

Top comments (2)

Natserract

Here’s a more complete project using select and scraper: github.com/alfinsuryaS/reason-rust...

NOGIER Loïc

Oh it's nice! Thanks.
I really appreciate it.