Markus

Parsing Google by Keywords: A Detailed Plan to Create Your Own Free Parser

Imagine you’re trying to gather a huge number of search phrases for Google. If there are only ten, it’s no big deal. But when you’re dealing with thousands or even tens of thousands of keywords, any SEO practitioner will feel the pinch: standard tools just can’t keep up, and the old-school combo of Key Collector, Google Ads, and a couple of proxies looks like an anachronism. The landscape has changed, and at this scale it’s hard to get by without the official API.

Don’t let that scare you. If you’re willing to dig into some technical depths, set up authentication, and dance with Google’s documentation, you can gain access to the Keyword Planner API. This isn’t just a handy tool—it’s the genuine key to an abundance of semantic data. In this article, you’ll get a clear guide on how to replace pricey third-party services (often starting at $300 a month!) with your own “parser” based on the Google Ads API. Let’s begin!

Why You Should Get Access to the Keyword Planner API

When you need to build a large-scale semantic core of search phrases for Google, a handful of queries won’t cut it. Classic Key Collector used to be a reliable friend, but times have changed: now, especially for certain ad accounts, the standard approach is lagging. As a result, people look at paid solutions like Keyword Tool, which can cost significant sums yet essentially pull the same data you could fetch yourself from Google Ads. After all, you want to work without middlemen and configure the process exactly how you like, right?

Sure, Key Collector can still be useful for handling already gathered keyword sets: it’s good at sorting, cleaning, and grouping. Services for CAPTCHA recognition or proxies might not even be necessary, and tools like KeySo can help with data organization. But overall, if you’re aiming for a self-driven setup and don’t want to burn a fortune on third parties, it’s time to turn to the Google Ads API. You’ll be free to tune everything to your needs, skip those intermediary fees, and finally build your own automated workflow.

Getting Started With Building Your Google Parser Through the API

Want to collect more than 40,000 keywords? Totally doable—but first, you’ll need to handle a few formalities. You must “play by the rules”: without an official Google Ads account, you can’t access the Keyword Planner. You need a valid, active account with real ad spend—without actual expenses, the Keyword Planner just won’t be available.

Once you have a working ad account, the next step is to create a Manager Account (MCC). Head over to the Manager Accounts page, register a new manager account, and then in the left-hand menu find the “API Center.” There, you’ll see your Developer token—save it somewhere safe.

Google Cloud Console: Getting Your Client ID and Client Secret for Future Parsing

To let your script connect to the Google Ads API, it needs to be authorized via OAuth. For this, you’ll register an application in the Google Cloud Console.

  1. Go to the Google Cloud Console under the same account that’s linked to your ad account.

  2. Create a new project with any name.

  3. Navigate to APIs & Services → Library, find Google Ads API, and enable it.

  4. Then under APIs & Services → Credentials, create new credentials: choose “OAuth client ID.”

  5. Application type: Web application. You can name it anything you like, for example, “my-ads-api.”

  6. In the Authorized redirect URIs section, add:

http://localhost:8081/
http://localhost:8081

Click “Create” — you’ll get two important items: the Client ID and the Client Secret.

A quick note about those two nearly identical URLs: based on user experience, some setups require the trailing slash, and others don’t. Cover both options just to be sure.

Getting a refresh_token for Your Google Parser to Work Automatically

For your code to have ongoing access to the API, you need a refresh_token, which lets the client library obtain fresh short-lived access tokens without your intervention.

Here’s how you do it:

Install the library:

pip install google-ads

Create a file get_refresh_token.py with the following code:

import logging
from google_auth_oauthlib.flow import InstalledAppFlow

logging.basicConfig(level=logging.DEBUG)

def generate_refresh_token(client_id, client_secret):
    # Scope required by the Google Ads API.
    scopes = ["https://www.googleapis.com/auth/adwords"]

    flow = InstalledAppFlow.from_client_config(
        {
            "installed": {
                "client_id": client_id,
                "client_secret": client_secret,
                "auth_uri": "https://accounts.google.com/o/oauth2/auth",
                "token_uri": "https://oauth2.googleapis.com/token",
                "redirect_uris": ["http://localhost:8081/"]
            }
        },
        scopes,
    )

    # run_local_server builds the authorization URL itself, prints it,
    # and listens on port 8081 for Google's redirect. Extra keyword
    # arguments are forwarded to authorization_url().
    credentials = flow.run_local_server(
        port=8081,
        access_type="offline",  # ask for a refresh token
        prompt="consent",       # force the consent screen so one is issued
    )
    print(f"Your Refresh Token: {credentials.refresh_token}")

if __name__ == "__main__":
    generate_refresh_token(
        "Client ID",
        "Client secret"
    )

Plug in your Client ID and Client Secret, then run:

python get_refresh_token.py

Open the link that appears in the console, and sign in with the Google account that has access to the relevant ad account. When the script finishes, it’ll display your refresh_token. Keep it safe!
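
For context, the refresh_token is not sent with API calls directly: the client library trades it for a short-lived access token at the token endpoint. Here is a minimal sketch of that standard OAuth 2.0 exchange, using only the standard library and the same token_uri as the script above. The function name build_token_refresh_request is made up for illustration, and the request is only built here, not sent:

```python
import urllib.parse
import urllib.request

# Same token_uri as in get_refresh_token.py.
TOKEN_URI = "https://oauth2.googleapis.com/token"


def build_token_refresh_request(client_id, client_secret, refresh_token):
    """Build (but do not send) the OAuth 2.0 refresh-token request."""
    form = {
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }
    body = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(
        TOKEN_URI,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder values; substitute your own credentials.
request = build_token_refresh_request(
    "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET", "YOUR_REFRESH_TOKEN"
)
print(request.get_method(), request.full_url)
```

You normally never need to send this yourself, since GoogleAdsClient handles the exchange once the refresh_token is in its config. The sketch is mainly useful for understanding why the token has to be stored as carefully as a password.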

Setting up google-ads.yaml — The Bedrock of Your Parser

Gather all your keys (developer_token, client_id, client_secret, refresh_token, login_customer_id) into a single config file:

developer_token: "YOUR_DEVELOPER_TOKEN"
client_id: "YOUR_CLIENT_ID"
client_secret: "YOUR_CLIENT_SECRET"
refresh_token: "YOUR_REFRESH_TOKEN"
login_customer_id: "YOUR_MANAGER_ACCOUNT_ID"

Note that login_customer_id should be the ID of your manager account (MCC), the one you authenticate through, while the ID of the active ad account (where the real ads run) goes into customer_id in the scripts below. Make sure both IDs are written without hyphens.
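
Since the IDs have to be written without hyphens, a quick check before the first run can catch a copy-paste slip. Below is a small sketch, assuming the config has already been loaded into a plain dict; the helper names normalize_customer_id and missing_keys are illustrative and not part of the google-ads library:

```python
import re

# The five keys listed for google-ads.yaml above.
REQUIRED_KEYS = (
    "developer_token",
    "client_id",
    "client_secret",
    "refresh_token",
    "login_customer_id",
)


def normalize_customer_id(raw):
    """Strip hyphens and spaces; expect the usual 10-digit account ID."""
    digits = re.sub(r"[\s-]", "", str(raw))
    if not re.fullmatch(r"\d{10}", digits):
        raise ValueError(f"not a 10-digit customer ID: {raw!r}")
    return digits


def missing_keys(config):
    """Return the required keys that are absent or empty in a config dict."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


print(normalize_customer_id("123-456-7890"))
print(missing_keys({"developer_token": "abc", "client_id": "id"}))
```

The first call prints the ID with the hyphens removed, and the second lists the keys still missing from the partial config.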

Getting Basic Access to the API: Without It, Your Parser Is Just a Pumpkin

Even after all that, you’re granted Test Access by default, which restricts your access to keyword search metrics. You need Basic Access. To request it:

  1. Go to the API Center in your MCC.
  2. Next to Test Access, select “Apply for Basic access.”
  3. Mention that you’ll be using the API for your own analysis or internal automation.
  4. Usually, you’ll hear back in a couple of days. Once approved, you can freely fetch keyword data.

Important: your MCC must be linked to a real ad account. If it isn’t yet, add the ad account in the “Accounts” section and confirm the link.

Finally, the Parsing Process: A Python Script for Collecting Google Semantic Data

Below is a ready-made snippet. It reads phrases from a CSV, sends them in batches of 10 to the Keyword Planner API as seed keywords, and writes the returned keyword ideas with their metrics to a new CSV. You can tweak the country, language, and other parameters as desired.

import csv
import time
from google.ads.googleads.client import GoogleAdsClient

def chunk_list(lst, n):
    for i in range(0, len(lst), n):
        yield lst[i:i+n]

def main():
    client = GoogleAdsClient.load_from_storage("google-ads.yaml")
    keyword_plan_idea_service = client.get_service("KeywordPlanIdeaService")

    customer_id = "YOUR_CUSTOMER_ID"  # no hyphens

    keywords = []
    with open("keywords.csv", "r", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        for row in reader:
            kw = row['keyword'].strip()
            if kw:
                keywords.append(kw)

    chunk_size = 10

    with open("keyword_data.csv", "w", newline="", encoding="utf-8") as outfile:
        writer = csv.writer(outfile)
        writer.writerow([
            "keyword",
            "avg_monthly_searches",
            "competition",
            "low_top_of_page_bid_micros",
            "high_top_of_page_bid_micros"
        ])

        for chunk in chunk_list(keywords, chunk_size):
            request = client.get_type("GenerateKeywordIdeasRequest")
            request.customer_id = customer_id

            request.geo_target_constants.append("geoTargetConstants/2250")  # France
            request.language = "languageConstants/1002"  # French

            request.keyword_seed.keywords.extend(chunk)
            response = keyword_plan_idea_service.generate_keyword_ideas(request=request)

            # Iterate the response itself so the client pages through every result.
            for idea in response:
                text = idea.text
                metrics = idea.keyword_idea_metrics
                avg_searches = metrics.avg_monthly_searches if metrics.avg_monthly_searches else 0
                competition = metrics.competition.name if metrics.competition else "UNSPECIFIED"
                low_bid = metrics.low_top_of_page_bid_micros or 0
                high_bid = metrics.high_top_of_page_bid_micros or 0

                writer.writerow([
                    text,
                    avg_searches,
                    competition,
                    low_bid,
                    high_bid
                ])

            time.sleep(1)

if __name__ == "__main__":
    main()

How It Works and Extra Details

  • Your keywords.csv file needs a column named keyword.

  • The script splits the list of keywords into chunks of 10 so you don’t exceed API limits.

  • You can customize the country, language, and other parameters in each request. For example, “2250” stands for France and “1002” is for French. To discover the codes for your target languages and regions, either check the docs or use a quick script like the one below.

To figure out language codes, you can run a GAQL query:

from google.ads.googleads.client import GoogleAdsClient

def main():
    client = GoogleAdsClient.load_from_storage("google-ads.yaml")
    ga_service = client.get_service("GoogleAdsService")

    customer_id = "YOUR_CUSTOMER_ID"
    query = """
    SELECT language_constant.id, language_constant.code, language_constant.name, language_constant.targetable
    FROM language_constant
    ORDER BY language_constant.id
    """

    response = ga_service.search_stream(customer_id=customer_id, query=query)

    for batch in response:
        for row in batch.results:
            lang = row.language_constant
            print(f"ID: {lang.id}, Code: {lang.code}, Name: {lang.name}, Targetable: {lang.targetable}")

if __name__ == "__main__":
    main()

Run that, and you’ll see a list of available languages with their IDs, which you can then use in your parser.
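
One more detail about the parser’s output: the two bid columns in keyword_data.csv are in micros, that is, millionths of the account’s currency unit. Here is a small post-processing sketch to make them readable. The function names and the sample row are made up for illustration, and in-memory text stands in for the real file:

```python
import csv
import io

MICROS_PER_UNIT = 1_000_000  # 1 currency unit = 1,000,000 micros


def micros_to_units(micros):
    """Convert a micros value (int or numeric string) to currency units."""
    return int(micros) / MICROS_PER_UNIT


def add_bid_columns(rows):
    """Yield each row with two extra keys holding the bids in currency units."""
    for row in rows:
        converted = dict(row)
        converted["low_top_of_page_bid"] = micros_to_units(
            row["low_top_of_page_bid_micros"]
        )
        converted["high_top_of_page_bid"] = micros_to_units(
            row["high_top_of_page_bid_micros"]
        )
        yield converted


# Hypothetical sample in the same column layout the parser writes.
sample = io.StringIO(
    "keyword,avg_monthly_searches,competition,"
    "low_top_of_page_bid_micros,high_top_of_page_bid_micros\n"
    "example keyword,1000,LOW,250000,1200000\n"
)
for row in add_bid_columns(csv.DictReader(sample)):
    print(row["keyword"], row["low_top_of_page_bid"], row["high_top_of_page_bid"])
```

To run it on real data, replace the sample with open("keyword_data.csv", newline="", encoding="utf-8").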

On Limits and Different Access Levels

With Basic Access, your daily quota is roughly 15,000 API calls. Since the script sends 10 phrases per call, that works out to about 150,000 seed keywords a day. On the Standard Access tier the limits rise dramatically, but to get that level, you’ll need to prove your seriousness to Google: follow best practices, avoid misuse, and generally invest a decent amount in ads.
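
To plan a run against that quota, the arithmetic can be sketched in a few lines. This assumes the figures used in this article (about 15,000 calls a day, 10 phrases per call, and the one-second pause from the script); the function name plan_run is made up, and the time estimate covers only the pauses, not the API’s own response time:

```python
import math


def plan_run(total_keywords, chunk_size=10, daily_calls=15_000, seconds_per_call=1.0):
    """Estimate API calls, days of quota, and pause time (minutes) for a run."""
    calls = math.ceil(total_keywords / chunk_size)
    days = math.ceil(calls / daily_calls)
    pause_minutes = calls * seconds_per_call / 60
    return calls, days, pause_minutes


calls, days, pause_minutes = plan_run(40_000)
print(f"{calls} calls, {days} day(s) of quota, about {pause_minutes:.0f} min of pauses")
```

For 40,000 keywords this comes to 4,000 calls, well inside a single day’s quota.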

In Closing

The biggest hurdle to creating your own free parser with the Google Ads API is getting Basic Access. Without it, even the most elegant code won’t do you much good. But if you have a real ad account and carefully follow these steps, things typically go smoothly.

I hope this guide helps you ditch expensive third-party services and gives you a tool for forming your own semantic databases. Keep an eye on usage quotas, file your access requests properly, and use what the official API offers. May your experiments be fruitful and your search metrics crystal clear!