DEV Community

Web Developer Hyper

🧠🤖Gemini API for free (by Super Mario and ChatGPT)

Intro

I wanted to try an AI API, but I don't have money...😅
So, which AI API should I use?
I searched carefully and found that the Gemini API might be the answer.
This time, my friend Mario (ChatGPT) will introduce some of the content.

What is Gemini API? (by Mario)

It's-a me, Mario! Woohoo!
The Gemini API is-a super AI power-up from Google! 🍄✨
https://ai.google.dev/gemini-api/docs
It ain't just for text, oh no!
It can-a handle images, video, and even audio—Mamma mia, that's-a powerful!
And guess what?
The ultra-fast, high-performance Gemini 2.0 Flash just-a dropped, and everyone's talking about it! 🚀🔥

Let's try Gemini API (by Mario)

https://ai.google.dev/gemini-api/docs/quickstart?lang=node
I-a gave it a spin with Node.js, 'cause I've been-a working on a tech blog with React!
Here’s-a how to do it:

1️⃣ Make the API key for Gemini API
https://aistudio.google.com/app/apikey

2️⃣ Make a Next.js project

npx create-next-app@latest

https://nextjs.org/docs/app/getting-started/installation

3️⃣ Install Gemini API library

npm install @google/generative-ai

4️⃣ Use ChatGPT to write the code for both the frontend and the API call.
ChatGPT gave me the magic code, and—Wahoo!—it works! 🎩✨
generate/page.tsx

"use client";

import { useState } from "react";

export default function Gemini() {
  const [input, setInput] = useState("");
  const [response, setResponse] = useState("");

  const handleGenerate = async () => {
    const res = await fetch("/api/generate", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ prompt: input }),
    });
    const data = await res.json();
    setResponse(data.text || "Error generating response");
  };

  return (
    <div>
      <h1>Gemini AI</h1>
      <textarea
        value={input}
        onChange={(e) => setInput(e.target.value)}
        placeholder="Enter your prompt..."
      />
      <button onClick={handleGenerate}>Generate</button>
      <p>Response: {response}</p>
    </div>
  );
}
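
The page above shows the same fallback message for every failure, so the 400 "No prompt provided" error from the route never reaches the user. Here is a minimal sketch of a helper that could pick the message to display. The name `toDisplayText` and the `GenerateResult` type are my own for illustration and are not part of the code ChatGPT generated.

```typescript
// Shape of the JSON returned by the route in this post:
// { text } on success, { error } on failure.
type GenerateResult = { text?: unknown; error?: unknown };

// Hypothetical helper: choose a string that is safe to render in the page.
function toDisplayText(data: GenerateResult): string {
  if (typeof data.text === "string" && data.text.length > 0) {
    return data.text;
  }
  if (typeof data.error === "string" && data.error.length > 0) {
    // Surface the server-side error message instead of a generic one.
    return `Error: ${data.error}`;
  }
  return "Error generating response";
}
```

In `handleGenerate`, this could be used as `setResponse(toDisplayText(data))`.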

api/generate/route.ts

import { NextResponse } from "next/server";
import { GoogleGenerativeAI } from "@google/generative-ai";

export async function POST(req: Request) {
  try {
    const { prompt } = await req.json();

    if (!prompt)
      return NextResponse.json(
        { error: "No prompt provided" },
        { status: 400 }
      );

    const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY!);
    const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

    const result = await model.generateContent(prompt);
    const text = result.response.text();

    return NextResponse.json({ text });
  } catch (error) {
    console.error("Gemini API Error:", error);
    return NextResponse.json(
      { error: "Failed to generate response" },
      { status: 500 }
    );
  }
}

Point: Use generateContent to send requests to the Gemini API! 📡🤖
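
The route only checks that `prompt` is truthy, so a number or a whitespace-only string would still be sent to the API. Below is a small sketch of a stricter check that could run before `generateContent`. The helper name `checkPrompt` and the `MAX_PROMPT_CHARS` value are assumptions I made for illustration, not limits from the Gemini docs.

```typescript
// Hypothetical limit for this sketch; pick a value that suits your app.
const MAX_PROMPT_CHARS = 8000;

type PromptCheck =
  | { ok: true; prompt: string }
  | { ok: false; error: string };

// Validate the parsed JSON body before calling the Gemini API.
function checkPrompt(body: unknown): PromptCheck {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "Body must be a JSON object" };
  }
  const prompt = (body as { prompt?: unknown }).prompt;
  if (typeof prompt !== "string" || prompt.trim() === "") {
    return { ok: false, error: "No prompt provided" };
  }
  if (prompt.length > MAX_PROMPT_CHARS) {
    return { ok: false, error: "Prompt is too long" };
  }
  // Return the trimmed prompt so the caller sends clean text.
  return { ok: true, prompt: prompt.trim() };
}
```

In the route, a failed check would map to the existing 400 response, and `check.prompt` would be passed to `generateContent` otherwise.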

5️⃣ Set .env.local
Set up .env.local to keep-a your API key safe—no peeking, Bowser! 🔑🚫

GOOGLE_API_KEY=your-api-key-here
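
If the key is missing from `.env.local`, the `!` in `process.env.GOOGLE_API_KEY!` hides the problem until the API call fails. Here is a minimal sketch of a fail-fast lookup. `requireEnv` is a hypothetical helper name of mine, and the environment is passed in as a plain object so the function does not depend on Node.js types.

```typescript
// A plain map of environment variables (process.env has this shape).
type Env = Record<string, string | undefined>;

// Hypothetical helper: return the variable or throw a clear error.
function requireEnv(name: string, env: Env): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value;
}

// In route.ts this could be called as:
//   const apiKey = requireEnv("GOOGLE_API_KEY", process.env);
```

The thrown error would be caught by the existing `try`/`catch` in the route and logged with a readable message.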

6️⃣ How it works
Run npm run dev and launch-a locally! 🚀
Type-a your question, hit the Generate button,
and—BINGO!—Gemini API gives you an answer! 🎉
And the output?
Well, here’s-a what it looks like! 👇
[Screenshot of the output]
Mario, thank you for your support.
Leave the rest of the introduction of Gemini API to me.

RAG by Gemini API

https://ai.google.dev/gemini-api/docs/document-processing?lang=node
The Gemini API supports PDF input.
It can understand both the text and the images inside a document.
You can use this to build a simple RAG-like feature that answers questions about the document.
For example, the Gemini API can summarize a document that you pass to it.
Change the part of route.ts below

from

const result = await model.generateContent(prompt);

to

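    // Download the PDF and fail early if the request did not succeed.
    const pdfRes = await fetch(
      "https://discovery.ucl.ac.uk/id/eprint/10089234/1/343019_3_art_0_py4t4l_convrt.pdf"
    );
    if (!pdfRes.ok) throw new Error(`Failed to fetch PDF: ${pdfRes.status}`);
    const pdfResp = await pdfRes.arrayBuffer();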

    const result = await model.generateContent([
      {
        inlineData: {
          data: Buffer.from(pdfResp).toString("base64"),
          mimeType: "application/pdf",
        },
      },
      "Summarize this document",
    ]);
    console.log(result.response.text());
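
The snippet above sends the whole PDF inline as base64, which makes the payload about one third larger than the file. As far as I understand the docs, inline requests have a size limit, so here is a small sketch for checking the size before sending. The 20 MB value in `INLINE_LIMIT_BYTES` is my assumption, so please check the current docs, and the helper names are hypothetical.

```typescript
// Assumed inline request limit (about 20 MB); verify against the current docs.
const INLINE_LIMIT_BYTES = 20 * 1024 * 1024;

// Length of the base64 string for a given number of raw bytes.
// Base64 encodes every 3 bytes as 4 characters, with padding.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// True when the encoded file stays within the given limit.
function fitsInline(
  byteLength: number,
  limit: number = INLINE_LIMIT_BYTES
): boolean {
  return base64Length(byteLength) <= limit;
}
```

With the code above, the check could be `fitsInline(pdfResp.byteLength)`. Note that it ignores the size of the prompt text and the rest of the request body, so it is only a rough guard.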

Chat bot by Gemini API

The chat feature lets you collect multiple rounds of questions and responses.
This allows users to step incrementally toward answers or get help with multipart problems.
I asked ChatGPT to make a sample program for this.
https://ai.google.dev/gemini-api/docs/text-generation?lang=node#chat
generate/page.tsx

"use client";

import { useState } from "react";

export default function GeminiChat() {
  const [input, setInput] = useState("");
  const [messages, setMessages] = useState([
    { role: "user", parts: [{ text: "Hello" }] },
    {
      role: "model",
      parts: [{ text: "Great to meet you. What would you like to know?" }],
    },
  ]);

  const sendMessage = async () => {
    if (!input.trim()) return;

    const newMessages = [
      ...messages,
      { role: "user", parts: [{ text: input }] },
    ];
    setMessages(newMessages);

    const res = await fetch("/api/generate", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ messages: newMessages }),
    });

    const data = await res.json();
    if (data.text) {
      setMessages([
        ...newMessages,
        { role: "model", parts: [{ text: data.text }] },
      ]);
    }

    setInput("");
  };

  return (
    <div>
      <h1>Gemini AI Chat</h1>
      <div
        style={{
          minHeight: "200px",
          border: "1px solid #ddd",
          padding: "10px",
          marginBottom: "10px",
        }}
      >
        {messages.map((msg, index) => (
          <p
            key={index}
            style={{ textAlign: msg.role === "user" ? "right" : "left" }}
          >
            <strong>{msg.role === "user" ? "You" : "AI"}:</strong>{" "}
            {msg.parts[0].text}
          </p>
        ))}
      </div>
      <input
        type="text"
        value={input}
        onChange={(e) => setInput(e.target.value)}
        placeholder="Type your message..."
        style={{ width: "80%", marginRight: "10px" }}
      />
      <button onClick={sendMessage}>Send</button>
    </div>
  );
}

api/generate/route.ts

import { NextResponse } from "next/server";
import { GoogleGenerativeAI } from "@google/generative-ai";

export async function POST(req: Request) {
  try {
    const { messages } = await req.json();
    if (!messages)
      return NextResponse.json(
        { error: "No messages provided" },
        { status: 400 }
      );

    // Use the same variable name that is set in .env.local.
    const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY!);
    const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

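    // Pass only the earlier turns as history; the latest user message
    // is sent with sendMessage below, so it should not appear twice.
    const chat = model.startChat({
      history: messages.slice(0, -1),
    });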

    const result = await chat.sendMessage(
      messages[messages.length - 1].parts[0].text
    );
    const text = result.response.text();

    return NextResponse.json({ text });
  } catch (error) {
    console.error("Gemini API Error:", error);
    return NextResponse.json(
      { error: "Failed to generate response" },
      { status: 500 }
    );
  }
}
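
The route needs two things from the `messages` array: the earlier turns as chat history and the newest user turn as the message to send. Here is a minimal sketch of that split as a pure function. `splitHistory` and the `ChatMessage` type are names I made up for illustration, and the type only models the text parts used in this post.

```typescript
// Message shape used by the frontend in this post (text parts only).
type ChatMessage = { role: "user" | "model"; parts: { text: string }[] };

// Hypothetical helper: separate the earlier turns from the newest user turn.
function splitHistory(messages: ChatMessage[]): {
  history: ChatMessage[];
  latest: string;
} {
  const last = messages[messages.length - 1];
  if (last === undefined) {
    throw new Error("No messages provided");
  }
  if (last.role !== "user") {
    throw new Error("Last message must come from the user");
  }
  return {
    // slice() copies, so the caller's array is left unchanged.
    history: messages.slice(0, -1),
    latest: last.parts.map((part) => part.text).join(""),
  };
}
```

In the route, this could feed `model.startChat({ history })` and `chat.sendMessage(latest)`.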

The output of the chat looks like this.↓
[Screenshot of the chat output]

NEW! Gemini 2.0 Flash

Gemini 2.0 Flash is now available.
https://ai.google.dev/gemini-api/docs/models/gemini-v2
So far, I have only tested the Gemini API in Node.js.
At the time of writing, the 2.0 SDK is only available for Python and Go.
Java and JavaScript SDKs are coming soon.
So, I will only briefly introduce the features of 2.0.
The features of Gemini 2.0 Flash are as follows.

① Of course, 2.0 is faster and smarter than the previous model.

② Multimodal Live API.
It can process text, audio, and video input, and provide text and audio output.

③ Use Google Search as a tool.
You can improve the accuracy and recency of responses.

④ Generate Speech (early access/allowlist).
Text-to-speech that sounds like a human voice.

⑤ Generate Image (early access/allowlist).
Text to image, image and text to image, image editing, multi-turn image editing (chat).

Outro

It was easy to use the Gemini API.
The API docs were easy to understand.
Also, I appreciate ChatGPT's support in using the Gemini API.
It helped a lot in writing the code for the frontend and the API call.
Well, using AI (ChatGPT) to use an AI API (Gemini API) seemed kind of weird...🤖🤝🤖
Thank you for reading.
Happy coding!
