DEV Community

Strapi, another use case: Build your own API from any website with Puppeteer

ELABBASSI Hicham on April 17, 2020

The objective of this tutorial is to build a simple job search API with Strapi and Puppeteer. Strapi is an open-source Headless CMS written in Node...
 
Sif Baksh

Great article! I'm testing it out, and I'm new to JS.

I noticed that the selector "li.result-card.job-result-card" no longer works.
Could you please update it or point me to what I should look for?

ELABBASSI Hicham • Edited

If the selector doesn't work, you can open the LinkedIn job search in your browser and copy the li selector from the DevTools.

Sif Baksh • Edited

Dude, this is great, man. I got it to work. Thanks for teaching me.

Here is the updated li selector:
li.result-card.job-result-card.result-card--with-hover-state.job-card__contents--active

ELABBASSI Hicham • Edited

Thank you, Sif!

Well, it seems that the selector in the tutorial, li.result-card.job-result-card, still works. Be careful: the selector in your reply matches only the active list item (.job-card__contents--active is the CSS class applied to the selected list item). We need all the list items, not just the selected one, so you need a more generic selector.
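To make the difference concrete, here is a minimal sketch of reading every list item with the generic selector. It assumes `page` is a Puppeteer Page that has already loaded the job search results; the names `JOB_ITEM_SELECTOR` and `listJobTexts` are hypothetical and not taken from the tutorial.

```javascript
// Generic selector from the tutorial: matches every job list item,
// not only the one carrying the active-state class.
const JOB_ITEM_SELECTOR = 'li.result-card.job-result-card';

// Hypothetical helper: returns the trimmed text of each matched list item.
// `page` is assumed to be a Puppeteer Page already on the results page.
const listJobTexts = (page) =>
  page.$$eval(JOB_ITEM_SELECTOR, (items) =>
    items.map((li) => li.textContent.trim())
  );
```

If the returned array only ever contains one entry, the selector is probably too specific (for example, it includes a state class such as the active one).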

Sif Baksh

Thanks for that, I will give that a try

Dushyant Pathak

I followed the steps to the letter, but I don't see jobs turning up in my admin page. I see GET requests in my terminal, so I suppose that means the data is being fetched, if I am not mistaken?

ELABBASSI Hicham

Hello Dushyant,

Can you see your content type in the Strapi admin page? Can you share your CRON task script please?

Dushyant Pathak • Edited

Thanks for the reply, sir.

Yes, I can see the content-type, Jobs, in the admin page.

Here is my CRON script (functions/cron.js):
gist.github.com/dkp1903/d598e143ea...

ELABBASSI Hicham • Edited

You're welcome.

Well, it should work. Can you confirm that the CRON option of your Strapi server is set to true in config/environments/development/server.json?

Also, keep in mind that the CRON task in this example runs every 24 hours. Did you wait 24 hours to test it? You could change the CRON expression so that the script runs every minute, just to check that it works:

...
"*/1 * * * *": (date) => {
...

Don't forget to stop the server after the test :D
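One way to avoid editing the expression by hand each time is to pick the schedule from an environment variable. This is only a sketch: `runJobScraper` is a hypothetical stand-in for the scraping logic from the tutorial, `CRON_DEBUG` is an assumed variable name, and the daily expression below is an assumption that you should replace with the one from your own cron.js.

```javascript
// Hypothetical placeholder for the scraping logic described in the tutorial.
const runJobScraper = async () => {
  console.log('Scraping jobs at', new Date().toISOString());
  return true;
};

const EVERY_MINUTE = '*/1 * * * *';     // for local testing only
const DAILY_AT_MIDNIGHT = '0 0 * * *';  // assumed daily schedule; use your own

// Assumption: CRON_DEBUG=1 is set only while testing locally.
const schedule =
  process.env.CRON_DEBUG === '1' ? EVERY_MINUTE : DAILY_AT_MIDNIGHT;

// The exported object maps a CRON expression to the function to run.
const cronTasks = {
  [schedule]: () => runJobScraper(),
};

module.exports = cronTasks;
```

With this in place, starting the server with CRON_DEBUG=1 would run the task every minute, and starting it without the variable would fall back to the daily schedule.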

Dushyant Pathak

It works, sir. Forgot about the 24 hour thing. Switched it to a minute and it works right as rain.

Thanks a million!

arhsim

I don't mean to be a killjoy, but the LinkedIn part seems to be a violation of the LinkedIn ToS:

linkedin.com/help/linkedin/answer/...

LinkedIn has banned users for seemingly harmless apps in the past. Can you update the article to use another site as an example?

Dan Dascalescu • Edited

Nice!

Tip: /Users/helabbassi/perso/ should be replaceable with ~.

ELABBASSI Hicham

Oh, thank you, Dan!

Lucas Verra

What is the best way to manage authentication with Puppeteer, so that we can access our own data on LinkedIn?

ELABBASSI Hicham

Hi Lucas,

I haven't had time to test this solution (and I don't think it is the best way to do it), but I think you need to sign in to LinkedIn in your browser (to start a session) and find the li_at cookie in the DevTools. You can then set this cookie before navigating to LinkedIn (just before the await page.goto(...) call):

await page.setCookie({
      'name': 'li_at',
      'value': YOUR_COOKIE_VALUE,
      'domain': '.www.linkedin.com'
})

I strongly recommend writing a simple function that checks whether you are logged in. Something like:

const checkIfLoggedIn = async (page) => {
     const isAuthenticated = await page.$('.sign-in-card') === null;
     return isAuthenticated;
}

I think this function needs to be called after setCookie (and once the page has loaded), because your LinkedIn session may have expired.
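Putting the pieces together, the flow could look roughly like the sketch below. It is untested against LinkedIn itself: `openWithSession` is a hypothetical helper name, `page` is assumed to be a Puppeteer Page, and `cookieValue` is the li_at value copied from your browser.

```javascript
// Hypothetical helper combining the steps above:
// 1. set the session cookie, 2. navigate, 3. verify the session is valid.
const openWithSession = async (page, url, cookieValue) => {
  await page.setCookie({
    name: 'li_at',
    value: cookieValue,
    domain: '.www.linkedin.com',
  });

  await page.goto(url);

  // Same check as checkIfLoggedIn: the sign-in card should be absent.
  const signInCard = await page.$('.sign-in-card');
  if (signInCard !== null) {
    throw new Error('Not logged in: the li_at cookie may have expired');
  }
  return page;
};
```

Throwing instead of returning a boolean makes the scraper stop early rather than silently scrape the logged-out page, but returning a flag would work just as well.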

Feel free to add some additional information about this solution or suggest a better way to do that.

victorwu89

Great tutorial! Keep it up!

ELABBASSI Hicham

Thank you, Victor.

MrNivorous

Thanks for this! I was already building out a job board using Strapi and was manually inputting some of the things. This was a huge help to get some other data.