We are excited to team up with Bright Data to bring the community a new challenge.
Running through December 29, the Bright Data Web Scraping Challenge provides an opportunity to access public web data and build tools and applications powered by web scraping.
Bright Data offers dedicated endpoints for extracting fresh, structured web data from over 100 popular domains, as well as a Scraping Browser that dramatically reduces the overhead of maintaining scraping and browser infrastructure.
If you’ve ever been curious about optimizing your web scraping and data collection process, this challenge is for you! We hope you give it a try.
Our Prompts
Prompt 1: Scrape Data from Complex, Interactive Websites
Create a project where you need to scrape data from sites with dynamic content and user interactions (e.g., infinite scroll or login-protected pages). Use Bright Data’s Scraping Browser for seamless handling of JavaScript-heavy and interactive websites.
Here is the submission template for anyone who wants to jump right in, but please review all challenge rules on the official challenge page before submitting.
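For a sense of the control flow behind infinite-scroll scraping, here is a minimal, hypothetical sketch of the paging loop. The `fetch_page` callback is an assumption, not part of any Bright Data API: in a real project it would be backed by a browser session (for instance one connected to a remote Scraping Browser endpoint that scrolls and reads the newly loaded items), while here it can be any function that returns one batch of items per page.

```python
def collect_items(fetch_page, max_pages=10):
    """Keep fetching batches until a page comes back empty.

    fetch_page: hypothetical callback taking a page index and
    returning a list of scraped items (empty when exhausted).
    max_pages: safety cap so an endless feed can't loop forever.
    """
    items, page = [], 0
    while page < max_pages:
        batch = fetch_page(page)
        if not batch:  # no new items -> we've reached the end
            break
        items.extend(batch)
        page += 1
    return items
```

The same loop works whether the callback scrolls a headless browser, calls a paginated API, or replays recorded fixtures in tests.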
Prompt 2: Build a Web Scraper API to Solve Business Problems
Use a Web Scraper API to tackle common business challenges like aggregating product prices, monitoring competitors, or collecting reviews across platforms. Use Bright Data’s Web Scraper API for efficient and scalable data collection.
Here is the submission template for anyone who wants to jump right in, but please review all challenge rules on the official challenge page before submitting.
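As a rough illustration of the "aggregate product prices" idea, the sketch below normalizes price strings and picks the cheapest store for each product. The record shape `(product, store, price_text)` and the helper names are assumptions for this example, not Bright Data's API; rows returned by a real Web Scraper API call would need to be mapped into this shape first.

```python
import re

def parse_price(text):
    """Pull a float out of a price string like '$1,299.00'.

    Assumes Western formatting (comma thousands separator,
    dot decimal point); adapt for other locales.
    """
    m = re.search(r"[\d,]+(?:\.\d+)?", text)
    if not m:
        raise ValueError(f"no price found in {text!r}")
    return float(m.group().replace(",", ""))

def cheapest_by_product(records):
    """records: iterable of (product, store, price_text) tuples.

    Returns {product: (store, price)} with the lowest price seen.
    """
    best = {}
    for product, store, text in records:
        price = parse_price(text)
        if product not in best or price < best[product][1]:
            best[product] = (store, price)
    return best
```

From here, competitor monitoring is mostly a matter of re-running the collection on a schedule and diffing the results.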
Prompt 3: Most Creative Use of Web Data for AI Models
Design a pipeline that collects and structures web data to fine-tune an AI model, for example to create custom chatbots or sentiment analysis tools. Leverage Bright Data's Web Scraper API or Scraping Browser to collect real-time web data and create innovative, AI-driven solutions.
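One way to picture such a pipeline: clean the scraped text, then emit chat-style JSONL records, a common input format for fine-tuning. This is a hedged sketch; the input field names (`question`, `answer`) and the output schema are assumptions about your data, not a requirement of the challenge or of any particular model provider.

```python
import html
import json
import re

def clean_text(raw):
    """Normalize scraped text: unescape HTML entities,
    strip leftover tags, collapse whitespace."""
    text = html.unescape(raw)
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def to_finetune_jsonl(records):
    """records: iterable of {'question': ..., 'answer': ...} dicts
    (a hypothetical shape for scraped Q&A data).

    Returns one chat-style JSON object per line, ready to be
    written to a .jsonl training file.
    """
    lines = []
    for rec in records:
        example = {"messages": [
            {"role": "user", "content": clean_text(rec["question"])},
            {"role": "assistant", "content": clean_text(rec["answer"])},
        ]}
        lines.append(json.dumps(example, ensure_ascii=False))
    return "\n".join(lines)
```

Keeping the cleaning step separate from the serialization step makes it easy to swap in a different output schema if your target model expects one.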
Judging Criteria and Prizes
All three prompts will be judged on the following:
- Use of underlying technology
- Usability and User Experience
- Accessibility
- Creativity
The winner of each prompt will receive:
- $1,000 USD
- 6-month DEV++ Membership
- Exclusive DEV Badge
- A gift from the DEV Shop
All Participants with a valid submission will receive a completion badge on their DEV profile.
Need Help or Inspiration?
You can get to know the Bright Data platform through their docs and tutorials.
Important Dates
- December 11: Bright Data Web Scraping Challenge begins!
- December 29: Submissions due at 11:59 PM PST
- January 9: Winners Announced
We can’t wait to see what you build! Questions about the challenge? Ask them below.
Good luck and happy coding!
Top comments (16)
This one was fun, thanks @noahbrinker @thepracticaldev!
Can't wait to see everyone's submissions!
I wish this also had guidance on ethical approaches to scraping. Just because we can scrape and train AI doesn't entitle us to disregard the hard work that authors, journalists, and artists produce by scraping and deriving from their work without consent or compensation.
That said, Bright Data's terms of service and license should be respected.
This all just seems so wildly unethical.
@thepracticaldev @jess For the second prompt, the heading says "Build a Web Scraper API..." but the text below it starts with "Use a Web Scraper API..." This has me a bit confused. Are we supposed to build an API that returns data scraped by Bright Data's Web Scraper API?
I think they mean you should use the Bright Data API to solve any business problem. It will probably be judged on the novelty of the business problem and the complexity of the sources being scraped, I guess.
Thanks!
This sounds like a nice idea to work with!
Cool hackathon! Can't wait to submit my project idea.
One problem: I didn't get the $15 credit after signing up using the provided link, only $2 in trial credits. I signed up using Google login.
Hey @fahminlb33, if you haven't already please email noah@brightdata.com for support!
For the third prompt, do we need to submit a running, working AI model fine tuned with the data (which would require us to host the model at our own expense), or can we submit just the data normalization pipeline, up to the point where the normalized data is about to be fed into the model?
Hey @delaaja, you can just submit the pipeline and not the fully tuned model.
So I have a question: is it compulsory to include the GitHub repo of the project, or just a live demo? @noahbrinker @thepracticaldev
Let's do this, guys!
Gonna be fun let's do this