The Ultimate Guide to Scraping Indeed Jobs

The job market is evolving faster than ever, and real-time insights are the key to staying ahead. Indeed.com, one of the largest job portals, holds a treasure trove of data on job openings, trends, and hiring practices. Sifting through this data manually, though, is slow and error-prone. That’s where web scraping comes in: a scraper API automates the extraction, making your job easier, faster, and far more efficient.

The Importance of Scraping Indeed Job Postings

If you’re in business, HR, or even a job seeker, you need a pulse on the latest job trends and demands. By scraping job data from Indeed, you get a detailed view of what companies are hiring for, the skills in demand, and the salary ranges that matter. Automated scraping gives you a broader, more accurate picture of the job market and saves you valuable time.
This process isn’t just for large enterprises—it’s a tool that can level up anyone's job search strategy or business insights.

Unlocking the Power of the API

When it comes to scraping Indeed, a scraper API is a game-changer. It’s designed to work with complex websites, helping you get past anti-bot measures and gather the data you need quickly. Whether you’re after job titles, company names, or detailed descriptions, it is well suited to the job.

How to Extract Job Data from Indeed

Ready to get your hands dirty? This step-by-step guide shows how to scrape job data (specifically job titles, company names, and locations) from Indeed using a scraper API.
Get Set Up
First, make sure you have Python 3.8 or later installed. You’ll also need to create a virtual environment for this project.
For Windows:
python -m venv indeed_env
For Mac/Linux:
python3 -m venv indeed_env
Activate the environment:
For Windows:
.\indeed_env\Scripts\Activate
For Mac/Linux:
source indeed_env/bin/activate
Next, install the necessary libraries:
pip install requests pandas
Now you’re all set to start scraping.

Crafting the Payload to Access Desired Data
The API lets you send detailed requests to fetch data from Indeed. Here’s a simple example; replace the placeholder endpoint and credentials with the ones from your API provider:

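import requests

# Request payload: the source type and the page to fetch.
payload = {
    "source": "universal",
    "url": "https://www.indeed.com",
}

# Placeholder endpoint and credentials; use the ones from your API provider.
response = requests.post(
    url="https://api.example.com/v1/queries",
    json=payload,
    auth=("username", "password"),
    timeout=60,
)
response.raise_for_status()  # Fail early on HTTP errors

print(response.json())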

This prints the API's JSON response, which includes the entire HTML of the Indeed homepage. But we want more than just raw HTML; we want structured data. That's where the parsing instructions come in.
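To work with the response rather than just print it, a small helper can pull the page content out of the JSON. The sketch below assumes the results[0]["content"] layout that the CSV export step in this guide relies on; `extract_content` and the sample dictionary are hypothetical names used for illustration, and your provider's response shape may differ.

```python
from typing import Any, Dict, Optional


def extract_content(response_json: Dict[str, Any]) -> Optional[Any]:
    """Return the content of the first result, or None if it is missing."""
    results = response_json.get("results") or []
    if not results:
        return None
    return results[0].get("content")


# Hypothetical sample shaped like the assumed API response.
sample = {"results": [{"content": "<html><title>Indeed</title></html>"}]}
print(extract_content(sample))
```

With a live request, the same helper would be called as extract_content(response.json()).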

Parsing Data for Specific Fields
When you need specific details, you can tell the API exactly what to extract. Here’s an example that grabs the title of the page:

{  
    "title": {  
        "_fns": [  
            {  
                "_fn": "xpath_one",  
                "_args": ["//title/text()"]  
            }  
        ]  
    }  
}  
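These instructions travel inside the request payload. As a rough sketch, assuming the same "parse" and "parsing_instructions" keys that the job listing payload in the next section uses, the title rule could be attached like this (`build_payload` is a hypothetical helper, not part of any API):

```python
import json
from typing import Any, Dict


def build_payload(url: str, parsing_instructions: Dict[str, Any]) -> Dict[str, Any]:
    """Combine a target URL with parsing instructions into one request payload."""
    return {
        "source": "universal",
        "url": url,
        "parse": True,
        "parsing_instructions": parsing_instructions,
    }


# The page-title rule from the example above.
title_rule = {
    "title": {
        "_fns": [{"_fn": "xpath_one", "_args": ["//title/text()"]}]
    }
}

payload = build_payload("https://www.indeed.com", title_rule)
print(json.dumps(payload, indent=2))
```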

Now, let’s dive into scraping actual job listings.

Scraping Indeed Job Listings
Indeed's job listings have their own distinct HTML structure. Using Chrome’s "Inspect" feature, you can pinpoint the selectors that correspond to the data you want. For instance, each job card can be targeted using the .job_seen_beacon CSS selector, and the fields inside it can be picked out with XPath expressions relative to that card.
Here's how you can start building the payload for scraping job titles, company names, and locations:

{  
    "source": "universal",  
    "url": "https://www.indeed.com/jobs?q=work+from+home&l=San+Francisco%2C+CA",  
    "render": "html",  
    "parse": true,  
    "parsing_instructions": {  
        "job_listings": {  
            "_fns": [  
                {  
                    "_fn": "css",  
                    "_args": [".job_seen_beacon"]  
                }  
            ],  
            "_items": {  
                "job_title": {  
                    "_fns": [  
                        {  
                            "_fn": "xpath_one",  
                            "_args": [".//h2[contains(@class,'jobTitle')]/a/span/text()"]  
                        }  
                    ]  
                },  
                "company_name": {  
                    "_fns": [  
                        {  
                            "_fn": "xpath_one",  
                            "_args": [".//span[@data-testid='company-name']/text()"]  
                        }  
                    ]  
                },  
                "location": {  
                    "_fns": [  
                        {  
                            "_fn": "xpath_one",  
                            "_args": [".//div[@data-testid='text-location']//text()"]  
                        }  
                    ]  
                }  
            }  
        }  
    }  
}  

This JSON structure will let you extract job titles, company names, and locations from the search results.
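The search URL in the payload is an ordinary query string, so it can be generated for other searches. A minimal sketch using the standard library, assuming the q (keywords) and l (location) parameters seen in the URL above; `build_search_url` is a hypothetical helper name:

```python
from urllib.parse import urlencode


def build_search_url(query: str, location: str) -> str:
    """Build an Indeed search URL from keywords and a location."""
    # urlencode turns spaces into "+" and escapes characters such as ",".
    params = urlencode({"q": query, "l": location})
    return f"https://www.indeed.com/jobs?{params}"


url = build_search_url("work from home", "San Francisco, CA")
print(url)
```

The resulting string can be dropped into the payload's "url" field.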

Saving and Exporting Data
Once you've scraped the data, you can save it as a JSON file or export it to CSV for further analysis. Assuming response holds the API's reply to the job listings payload above, here’s how to export the results to a CSV:

import pandas as pd

# The parsed listings sit under results[0] -> content -> job_listings.
listings = response.json()["results"][0]["content"]["job_listings"]
df = pd.DataFrame(listings)
df.to_csv("job_search_results.csv", index=False)

Now you have all the job data neatly saved in a CSV file, ready to be analyzed.
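The JSON option mentioned above needs only the standard library. A minimal sketch, using hypothetical sample listings in place of a real response and a hypothetical `save_listings` helper; it writes to a temporary folder, which you would swap for your own output directory:

```python
import csv
import json
import tempfile
from pathlib import Path
from typing import Dict, List, Optional

Listing = Dict[str, Optional[str]]


def save_listings(listings: List[Listing], json_path: Path, csv_path: Path) -> None:
    """Write the listings to a JSON file and a CSV file."""
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(listings, f, indent=2, ensure_ascii=False)

    fieldnames = ["job_title", "company_name", "location"]
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(listings)  # None values are written as empty cells


# Hypothetical sample data shaped like the parsed job listings.
sample = [
    {"job_title": "Support Agent", "company_name": "Acme", "location": "Remote"},
    {"job_title": "Data Analyst", "company_name": "Globex", "location": None},
]

out_dir = Path(tempfile.mkdtemp())  # replace with your own output folder
json_file = out_dir / "job_search_results.json"
csv_file = out_dir / "job_search_results.csv"
save_listings(sample, json_file, csv_file)
print(f"Saved {len(sample)} listings to {out_dir}")
```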

Dealing with Proxies
Web scraping, especially on large platforms like Indeed, often requires proxies to avoid being blocked. With residential proxies, you can rotate IPs, mimic various locations, and reduce the chance of your scraper being interrupted. Here's a basic example of how you can use proxies with requests (the proxy host and credentials are placeholders):

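import requests

# Page to fetch and a browser-like User-Agent (example values)
indeed_url = "https://www.indeed.com/jobs?q=work+from+home&l=San+Francisco%2C+CA"
headers = {"User-Agent": "Mozilla/5.0"}

# Set up the proxies (placeholder host and credentials)
proxies = {
    "http": "http://USERNAME:PASSWORD@proxy.example.com:7777",
    "https": "https://USERNAME:PASSWORD@proxy.example.com:7777",
}

# Make the request through the proxy
response = requests.get(indeed_url, headers=headers, proxies=proxies, timeout=30)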
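Rotation can be as simple as cycling through a pool of proxy endpoints, one per request. A minimal sketch, assuming a hypothetical list of proxy hosts and placeholder credentials (none of these names come from a real provider); it only builds the proxies dictionaries that requests expects and makes no network calls:

```python
from itertools import cycle
from typing import Dict, Iterator, List
from urllib.parse import quote


def proxy_rotation(hosts: List[str], username: str, password: str) -> Iterator[Dict[str, str]]:
    """Yield a requests-style proxies dict for each host, looping over the pool."""
    # Percent-encode credentials so characters like "@" or ":" stay valid in the URL.
    user, pwd = quote(username, safe=""), quote(password, safe="")
    for host in cycle(hosts):
        proxy_url = f"http://{user}:{pwd}@{host}"
        yield {"http": proxy_url, "https": proxy_url}


# Hypothetical proxy endpoints and credentials.
rotation = proxy_rotation(
    ["proxy1.example.com:7777", "proxy2.example.com:7777"],
    username="USERNAME",
    password="p@ss",
)
first, second, third = next(rotation), next(rotation), next(rotation)
print(first["http"])
```

Each dictionary can then be passed as the proxies argument of requests.get, calling next(rotation) before every request.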

Conclusion

Scraping Indeed job data in 2025 doesn’t have to be a nightmare. With a scraper API, you can gather job market insights at scale with far less worry about bot-blocking or downtime. Add proxies to the mix for fewer interruptions, and you’re all set for success.
