DEV Community

tq-bit


Bright Data Challenge - Industry AI Watchdog

This is a submission for the Bright Data Web Scraping Challenge: Most Creative Use of Web Data for AI Models

What I Built

A web app that aggregates industry-specific news into three KPIs.

The value of this app is that users need only a single glance at these KPIs to see whether something is going on in their industry.

I started developing this app to solve a business problem, but figured that using AI as an alternative to a stricter algorithm would be a great addition.

How it works

Users will specify

  • their sources (websites and selectors)
  • their scores (weighted keywords)

before the app calculates three indexes:

  1. Relevance Index: How relevant their sources are for their scoring (high index = better)
  2. Impact Index: The impact of what is happening in the industry right now (low index = better)
  3. Industry Index: The combined result of relevance and impact (a high index means something is going on in the industry that users should be aware of)

The AI also provides a summary of the analysis as part of the result.
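The post does not disclose the actual scoring formula, so the following is only a minimal sketch of how weighted keywords could be turned into the three indexes. The names `Score`, `Indexes` and `computeIndexes`, the 0..1 scale, and the formula itself (share of matching articles, share of matched weight, and their product) are all assumptions made for illustration, not the project's implementation.

```typescript
// Hypothetical shapes; the real project may model these differently.
interface Score {
  keyword: string;
  weight: number; // assumed: a higher weight means the keyword signals more impact
}

interface Indexes {
  relevance: number; // 0..1, share of articles that mention at least one keyword
  impact: number; // 0..1, matched keyword weight relative to the maximum possible
  industry: number; // 0..1, assumed here to be relevance * impact
}

function computeIndexes(articles: string[], scores: Score[]): Indexes {
  const totalWeight = scores.reduce((sum, s) => sum + s.weight, 0);
  if (articles.length === 0 || totalWeight === 0) {
    return { relevance: 0, impact: 0, industry: 0 };
  }

  let relevantArticles = 0;
  let matchedWeight = 0;

  for (const article of articles) {
    const text = article.toLowerCase();
    // Case-insensitive substring match of every keyword against the article text.
    const hits = scores.filter((s) => text.includes(s.keyword.toLowerCase()));
    if (hits.length > 0) relevantArticles++;
    matchedWeight += hits.reduce((sum, s) => sum + s.weight, 0);
  }

  const relevance = relevantArticles / articles.length;
  const impact = matchedWeight / (totalWeight * articles.length);
  return { relevance, impact, industry: relevance * impact };
}
```

With two articles, of which only one mentions a keyword carrying three quarters of the total weight, this sketch yields a relevance of 0.5 and an impact of 0.375. In the app itself, the AI model may well replace or complement such a strict formula.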

I'm leaving out some details about prompts and scoring here, but if you're curious, you can find them in the codebase.

Demo

Codebase

You can find the repository on GitHub. It's written in Deno + Fresh and is quick to set up; follow the instructions in readme.md. I've added some sources and scorings so you can try it out right away.

Industry Watchdog

This project is a prototype for the Bright Data challenge on dev.to. Industry Watchdog (IW) lets users take a quick glance at a single KPI to see if something is going on in their industry.

Getting Started

  1. Clone the repo
  2. Install Deno
  3. Rename .env.example to .env and set your BROWSER_WS variable
  4. Run deno task start
  5. Navigate to http://localhost:8000, add your sources and scores and run the indexing process
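A misconfigured `.env` is the most likely stumbling block in step 3. The sketch below assumes that `BROWSER_WS` holds a WebSocket endpoint URL (the variable name suggests this, but the README does not spell it out) and shows one way to fail early with a readable message. `parseBrowserWs` is a hypothetical helper, not part of the repository.

```typescript
// Hypothetical helper: checks that a value looks like a WebSocket endpoint URL.
function parseBrowserWs(value: string | undefined): URL {
  if (!value) {
    throw new Error("BROWSER_WS is not set; rename .env.example to .env first");
  }

  let url: URL;
  try {
    url = new URL(value);
  } catch {
    throw new Error("BROWSER_WS is not a valid URL");
  }

  // Assumption: the scraping browser is reached over ws:// or wss://.
  if (url.protocol !== "ws:" && url.protocol !== "wss:") {
    throw new Error(`BROWSER_WS must use ws:// or wss://, got ${url.protocol}`);
  }
  return url;
}
```

In a Deno app, this could be called once at startup as `parseBrowserWs(Deno.env.get("BROWSER_WS"))`.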

How to use

  1. Remove all sources and scores
  2. Follow the steps on the home page
  3. Run the indexing process



Screenshots

Overview & starting page


Source maintenance


Scoring maintenance


How I Used Bright Data

Bright Data provides security and scalability for browser scraping, which is crucial to the availability and integrity of the index data. Industry Watchdog uses Bright Data's scraping browser to scrape multiple sources at once and to get around possible CAPTCHA issues. Its broad proxy network helps ensure that critical articles are considered in the analysis.

This project could also qualify for Prompt 2: Build a Web Scraper API to Solve Business Problems. However, instead of using Bright Data's API, it uses the scraping browser.

This app is useful for analytical firms and BI departments that use internal as well as external data to monitor their business strategies and operations and would like to extend their KPI collection with the Industry Index.
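Scraping several sources at once while tolerating individual failures can be sketched with `Promise.allSettled`. This is an illustration under stated assumptions rather than the app's actual code: `Source`, `Scraper`, `ScrapeReport` and `scrapeAll` are hypothetical names, and the scraper function is passed in so the sketch does not depend on any particular browser library. In the real app, that function would presumably drive the Bright Data scraping browser.

```typescript
// Hypothetical source shape: a page URL plus a CSS selector for the article text.
interface Source {
  url: string;
  selector: string;
}

// The scraper is injected so the sketch stays independent of any browser library.
type Scraper = (source: Source) => Promise<string[]>;

interface ScrapeReport {
  articles: string[];
  failed: string[]; // URLs of sources that could not be scraped
}

async function scrapeAll(sources: Source[], scrape: Scraper): Promise<ScrapeReport> {
  // Start every scrape at once; allSettled waits for all of them and never rejects.
  const results = await Promise.allSettled(sources.map((s) => scrape(s)));

  const report: ScrapeReport = { articles: [], failed: [] };
  results.forEach((result, i) => {
    if (result.status === "fulfilled") {
      report.articles.push(...result.value);
    } else {
      // Keep going, but remember which source was lost.
      report.failed.push(sources[i].url);
    }
  });
  return report;
}
```

Recording failed sources instead of aborting keeps one blocked page from hiding the rest of the news, and the `failed` list makes it visible when the index was computed from incomplete data.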
