DEV Community

AIRabbit
AI-Powered Market Research with Reddit and NotebookLM in 10 Minutes (Mind-Blowing)

In recent years, Reddit has become one of the largest social communities on the internet. It hosts forums and discussions from all over the globe on every imaginable topic—from politics to history and, most importantly for this post, honest and extensive discussions about every brand or product you could envision.

This makes Reddit an invaluable tool for data-driven market research, providing deep insights such as:

  • Consumer perception of any brand:
    • Their opinions
    • What drives them to use the product
    • What they like or dislike
    • What they wish for
    • How they compare it to other products
    • And many more insights

If you're wondering why Reddit is so successful, here are five key reasons:

  • Strong Community: The site's community-driven approach creates a sense of belonging and connection among users.
  • Effective Algorithms: Reddit's algorithms prioritize quality over quantity, encouraging creators to produce better content and engage with users more effectively.
  • Low Barrier to Entry: The site's simplicity and ease of use make it easy for new users to join and participate in discussions.
  • Strong Moderation: Moderators play a crucial role in ensuring that discussions remain on-topic and respectful, maintaining a positive and inclusive environment.
  • Transparency and Openness: Reddit's commitment to transparency and openness has helped build trust with users, establishing a sense of credibility and reliability.

This makes Reddit an invaluable source of information for market research on your brand or competitor analysis.

How to Conduct Market Research with Reddit (the Ordinary Way)

You could open Reddit, enter a search term, and go through the comments while making notes. This will certainly provide you with numerous insights and ideas on how your competitor or your own brand is performing, how customers perceive it, what they miss, what they like or dislike, etc.

That's great. However, it is an awful lot of effort, especially if you want an aggregated view: the big picture based on hundreds or even thousands of comments. Doing this manually quickly becomes tedious.

Hence, there are tools like Make.com and others that allow you to create pipelines that process large amounts of data and derive knowledge from this data. However, there are many caveats:

  1. Complex Configuration: You need to configure quite a lot. For example, to access the Reddit API, you need to be familiar with Reddit, its API, and, of course, the automation platform you use, such as Make.com. The same applies to any other platform you use as a source for your market research, e.g., X.com.
  2. Data Preprocessing: You need to preprocess the data to be suitable for AI processing.
  3. AI Configuration: You need to configure the AI to leverage this data effectively.

In essence, it requires a lot of knowledge about the platform, the API, and how to integrate all the moving parts to get a good picture of your brand or the brand you are trying to analyze.

Is There an Easier Solution?

Fortunately, yes. A solution that addresses all the above problems and makes AI-powered market research a breeze.

Here's how it works:

  1. Leverage a No-Code Scraping Platform: We use a scraping platform called Apify and a Reddit scraper (Apify calls its scrapers "actors") that does all the heavy lifting: setting keys, limits, proper data formats, everything. Essentially, all we need to do is provide the search keyword we want to analyze, and that's it. It's just magic.
  2. Use NotebookLM: We take the output of this scraper and feed it into NotebookLM, a powerful tool from Google for analyzing (and even generating podcasts from) massive amounts of data.

With these two tools, we combine the power of massive data with the power of AI to provide you with insights into any brand, product, company, or whatever you can imagine.

If you have never worked with these tools before, I recommend you check out my blog posts at airabbit.blog (just use the search, you will find plenty of articles).

What We Will Need

  1. An Account on Apify.com: If you don't already have an account, you can sign up here.
  2. A Google Account: To use NotebookLM.

Sounds good? Let's try it.


The Goal

Conduct market research on Substack by analyzing up to 1,000 Reddit posts in which users talk about it.


Step by Step

1. Open the Reddit Scraper

Go to the store and open this actor (the Reddit scraper):

Reddit Scraper Actor

If you prefer another tool, you can use it as well, but this one worked well for me, so I'll use it here. The pricing of $4 per 1,000 is quite reasonable.

And with the free plan, you can try it out at no cost.

Reddit Scraper

2. Configure the Scraper

In the configuration, you basically need to set only two things: the search keyword and the maximum number of posts and comments you want to scrape.

You can set these in the UI, or you can paste a JSON object containing the entire configuration, which saves you from filling in each field by hand.

Configuration

Configuration Snippets:

"searches": [
    "substack"
],

"maxItems": 1000,
"maxPostCount": 1000,

You can copy and paste the JSON below to configure all the fields needed for this tutorial at once:

{
    "debugMode": false,
    "includeNSFW": true,
    "maxComments": 10,
    "maxCommunitiesCount": 2,
    "maxItems": 1000,
    "maxPostCount": 1000,
    "maxUserCount": 2,
    "proxy": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ]
    },
    "scrollTimeout": 40,
    "searchComments": false,
    "searchCommunities": false,
    "searchPosts": true,
    "searchUsers": false,
    "searches": [
        "substack"
    ],
    "skipComments": false,
    "skipCommunity": true,
    "skipUserPosts": false,
    "sort": "new"
}

Then hit Save.
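
If you plan to repeat this research for other keywords, the JSON above can also be generated instead of edited by hand. The following is only a sketch: the field names and values are copied from the configuration in this tutorial, while the function name `build_scraper_input` and its parameters are my own, hypothetical additions.

```python
import json


def build_scraper_input(keyword: str, max_posts: int = 1000) -> dict:
    """Return an input object with the same fields as the tutorial's JSON.

    `build_scraper_input` is a hypothetical helper, not part of Apify.
    """
    if not keyword.strip():
        raise ValueError("keyword must not be empty")
    if max_posts < 1:
        raise ValueError("max_posts must be at least 1")
    return {
        "debugMode": False,
        "includeNSFW": True,
        "maxComments": 10,
        "maxCommunitiesCount": 2,
        # The tutorial sets both limits to the same value.
        "maxItems": max_posts,
        "maxPostCount": max_posts,
        "maxUserCount": 2,
        "proxy": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
        "scrollTimeout": 40,
        "searchComments": False,
        "searchCommunities": False,
        "searchPosts": True,
        "searchUsers": False,
        "searches": [keyword],
        "skipComments": False,
        "skipCommunity": True,
        "skipUserPosts": False,
        "sort": "new",
    }


if __name__ == "__main__":
    # Print JSON that can be pasted into the actor's JSON input.
    print(json.dumps(build_scraper_input("substack"), indent=4))
```

Running the script prints the same configuration as above; change the keyword or the limit to prepare a run for a different brand.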

Save Configuration

3. Start the Scraping Job

Start the actor, then go to the list of runs (the scraping jobs) to see the job you have just started.

List of Runs

Apify Actor Runs

After a few minutes, you should start getting some results.

Results

4. Export the Results

Now let's export those results.

Make sure to select only the field "body"; this includes all the data we want to analyze without all the metadata like usernames, dates, etc. We leave them out for this use case, but they could certainly be valuable for other purposes.
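
If you have already exported all fields, the same selection can be done afterwards. Here is a minimal sketch, assuming the export is a JSON array of objects that each may carry a "body" string. The helper name `extract_bodies` and the sample field names other than "body" are hypothetical. It also skips the "[deleted]" and "[removed]" placeholders that Reddit shows in place of withdrawn text, which is my own addition.

```python
import json

# Placeholders Reddit shows in place of deleted or removed text.
PLACEHOLDERS = {"[deleted]", "[removed]"}


def extract_bodies(items: list[dict]) -> list[str]:
    """Keep only non-empty "body" text and drop every other field."""
    bodies = []
    for item in items:
        body = item.get("body")
        if not isinstance(body, str):
            continue  # field is missing or null
        text = body.strip()
        if text and text not in PLACEHOLDERS:
            bodies.append(text)
    return bodies


if __name__ == "__main__":
    # Hypothetical sample that mimics an export with extra metadata.
    sample = json.loads(
        '[{"body": "Love Substack", "username": "a"},'
        ' {"body": "[removed]"},'
        ' {"title": "no body here"}]'
    )
    print(extract_bodies(sample))
```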

Export Results

After exporting, you should have a JSON file with a format like this:

JSON File

Rename the file so that it ends in .txt to be able to import it into NotebookLM.

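Tip for Coders: You can convert it to markdown with a command similar to this (note the quotes around the file name, which contains spaces and parentheses, and -r, which prints raw text instead of JSON strings):

jq -r '.[].body // empty | gsub("\n"; " \n")' "dataset_reddit-scraper-lite_2024-11-27_10-08-25-793 (1).json.txt"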

But simply renaming the file works well for analysis with NotebookLM.
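
If you would rather not use jq, the conversion can be sketched in Python as well. This is only an illustration under a few assumptions: the export is a JSON array of objects with a "body" field, the function name `export_to_text` is hypothetical, and separating posts with a blank line and ending each line with two spaces (a Markdown hard line break) is my own formatting choice.

```python
import json
from pathlib import Path


def export_to_text(src: Path, dst: Path) -> int:
    """Write each item's "body" from src to dst and return how many were written."""
    items = json.loads(src.read_text(encoding="utf-8"))
    blocks = []
    for item in items:
        body = item.get("body")
        if isinstance(body, str) and body.strip():
            # Two trailing spaces before a newline keep the line break in Markdown.
            blocks.append(body.strip().replace("\n", "  \n"))
    # Separate posts with a blank line so they stay distinct paragraphs.
    dst.write_text("\n\n".join(blocks) + "\n", encoding="utf-8")
    return len(blocks)
```

For example, `export_to_text(Path("export.json"), Path("reddit_substack.txt"))` (both file names are placeholders) would produce a text file ready for upload.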

Rename File


Part II

1. Upload to NotebookLM

Now head to NotebookLM and upload the file we just renamed.

Upload to Notebook LM

And watch the magic happen.

After a few minutes, you can see on the left side the sources you just uploaded and, in the chat window, a summary. From there, you can drill down—the sky is the limit for the types of questions you can ask. I find it just mind-blowing.

Notebook LM Interface

2. Get Insights

To get a first insight, we can request a high-level overview of what people are saying about Substack.

Request Overview

And here is the result:

Overview Result

Markdown Version:


The Substack posts you've provided cover a wide array of topics, reflecting the diverse nature of the platform itself. However, some overarching themes do emerge:

1. Substack as a Platform for Niche Content and Community Building

Key Points:

  • Many posts highlight Substack's ability to cater to specialized interests and foster engaged communities.
  • Authors discuss connecting with like-minded individuals, sharing in-depth analyses, and building a dedicated readership around specific topics.
  • The posts showcase Substack's potential for attracting an audience passionate about niche subjects, ranging from AI and biohacking to F1 racing and experimental music.

2. Substack as a Tool for Writers Seeking Greater Control and Ownership

Key Points:

  • Several posts contrast Substack with platforms like Medium, emphasizing Substack's advantages in terms of content ownership, monetization options, and direct connection with subscribers.
  • Authors express a desire for greater autonomy and the ability to build a sustainable business around their writing.
  • The posts reflect a trend of writers migrating to Substack in pursuit of a more independent and potentially lucrative writing career.

3. Substack as a Space for Experimentation and Innovation

Key Points:

  • Some posts showcase innovative uses of Substack, such as incorporating educational programs, integrating with other platforms like Patreon, and hosting serialized fiction.
  • Authors are exploring new formats and strategies, pushing the boundaries of what's possible on the platform.
  • The posts highlight Substack's potential for supporting creative projects and encouraging experimentation in content creation and community engagement.

4. Substack's Role in the Evolving Landscape of Online Content Consumption

Key Points:

  • Some posts discuss the decline of traditional blogs and the rise of newsletters and subscription-based models.
  • Authors explore the implications of these shifts for writers and readers alike.
  • The posts reflect a broader conversation about the changing dynamics of online publishing and content monetization, with Substack positioned as a key player in this evolving landscape.

5. Substack as a Reflection of Contemporary Concerns and Interests

Key Points:

  • The Substack posts touch upon a variety of current topics, including AI ethics, biohacking, social justice, and digital addiction.
  • Authors use Substack to share their perspectives, analyses, and personal experiences related to these issues.
  • The posts demonstrate Substack's role as a platform for engaging with contemporary concerns and fostering dialogue around important societal and technological developments.

These overarching themes offer valuable insights into the nature and potential of Substack. The platform emerges as a dynamic space for writers and readers to connect, engage with niche content, and navigate the evolving world of online publishing.


Quite impressive for what we get with just three clicks!

You can take this to the next level and, with a fourth click, generate a podcast. If you haven't tried it before, I highly recommend you do. You will be amazed.


Wrap-up

In this tutorial, I have demonstrated how you can take your market research to the next level by conducting a deep analysis of Reddit comments and posts and leveraging AI to gain profound insights into any brand.

With these two tools, you can analyze not just Reddit but also X.com, Facebook, and hundreds of other rich information sources, providing you with unparalleled insights.
