Today I learned that I can show Goose, an open source AI developer agent, a screenshot of my UI and ask it to debug or update my code. I enjoy UI development, but my weakness is that I'm not very detail-oriented. I lean more toward big-picture thinking, and my not-so-pixel-perfect UI implementations sometimes drive UI designers up the wall.
In this blog post, we'll explore Goose’s screen toolkit to capture and update a UI, focusing on simple changes to color and layout. While Goose has robust capabilities, this example keeps it accessible for developers at all skill levels.
Goose and Toolkits 🛠️
Goose is a semi-autonomous agent that enables developers to extend its capabilities through toolkits.
According to the official Goose documentation, toolkits are plugins that
"provide Goose with tools (functions) it can call and optionally will load additional context into the system prompt (such as 'The Github CLI is called via gh and you should use it to run git commands'). Toolkits can do basically anything, from calling external APIs, to taking screenshots, to summarizing your current project."
How I Describe Toolkits 📱
Toolkits are like applications on your phone. At baseline, your phone can make calls, but you can download apps that extend its capabilities from playing games, to taking pictures, to listening to music. Similarly, with Goose at baseline, you can manage plugins, start sessions, and manage versions. But you can add toolkits that enable you to:
- Take screenshots for debugging
- Interact with the GitHub CLI
- Manage Jira projects
- Summarize repositories
How to Use the Screen Toolkit 📸
Step 1: Install Goose ⚡
To install Goose, run the following commands in your terminal:
brew install pipx
pipx ensurepath
pipx install goose-ai
Step 2: Start a Session 🚀
Start a Goose session with the following command:
goose session start
Note: Before you start the session, set an API key for your preferred LLM provider (such as OpenAI or Anthropic). For example:
export OPENAI_API_KEY=your_api_key
# Or for other providers:
export ANTHROPIC_API_KEY=your_api_key
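If you'd like to catch a missing key before Goose does, a small pre-flight check can help. This is a minimal sketch and not part of Goose: `has_llm_api_key` is a hypothetical helper name, and it only looks at the two variables shown above.

```shell
#!/bin/sh
# Hypothetical helper (not a Goose command): succeeds if at least one of the
# provider keys from the example above is set to a non-empty value.
has_llm_api_key() {
  [ -n "${OPENAI_API_KEY:-}" ] || [ -n "${ANTHROPIC_API_KEY:-}" ]
}

if has_llm_api_key; then
  echo "API key found; ready to run: goose session start"
else
  echo "No API key set; export OPENAI_API_KEY or ANTHROPIC_API_KEY first" >&2
fi
```

If you use a different provider, swap in whichever variable that provider expects.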
Step 3: Enable the Screen Toolkit ⚙️
You'll find a settings file at ~/.config/goose/profiles.yaml that configures how you want to use Goose. To use the screen toolkit, you'll need to enable it there. Here's an example of what the file may look like:
default:
  provider: openai
  processor: gpt-4o
  accelerator: gpt-4o-mini
  moderator: truncate
  toolkits:
  - name: developer
    requires: {}
  - name: screen
    requires: {}
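To double-check that the toolkit is listed, you can search the file from your terminal. This is a rough sketch that assumes the path and layout shown above. `screen_toolkit_enabled` is a hypothetical helper, and it does a plain text match rather than parsing YAML, so it can't tell you which profile the entry belongs to.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given file contains a list entry
# of the form "- name: screen". Text match only; this is not a YAML parser.
screen_toolkit_enabled() {
  grep -Eq '^[[:space:]]*-[[:space:]]*name:[[:space:]]*screen[[:space:]]*$' "$1"
}

profiles="${HOME}/.config/goose/profiles.yaml"
if [ -f "$profiles" ] && screen_toolkit_enabled "$profiles"; then
  echo "screen toolkit is listed in $profiles"
else
  echo "screen toolkit not found in $profiles"
fi
```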
Step 4: Prompt Goose to Take a Screenshot 📷
In natural language, you can prompt Goose to take a screenshot of your display with a line like:
"Take a screenshot of display (1)."
Goose will return the command it ran to capture the screenshot on your computer. It might look something like this:
screencapture -x -D 1 /tmp/goose_screenshot_a12e2775bdad4c55810e8f9812921731.jpg -f jpg
You can view the saved image by opening a new terminal tab or window outside of the Goose session and running:
open /tmp/goose_screenshot_a12e2775bdad4c55810e8f9812921731.jpg
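The file name changes with every capture, so copying it by hand gets tedious. Here's a minimal sketch that finds the newest one, assuming Goose keeps writing to /tmp with the goose_screenshot_ prefix and .jpg extension seen in the example above. `latest_goose_screenshot` is a hypothetical helper, not a Goose command.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified Goose screenshot
# in a directory (defaults to /tmp). Prints nothing if none are found.
latest_goose_screenshot() {
  dir="${1:-/tmp}"
  # ls -t sorts newest first; head keeps only the first match.
  ls -t "$dir"/goose_screenshot_*.jpg 2>/dev/null | head -n 1
}

shot="$(latest_goose_screenshot)"
if [ -n "$shot" ]; then
  echo "Latest screenshot: $shot"
  # open "$shot"   # macOS only; uncomment to view the image
else
  echo "No Goose screenshots found in /tmp"
fi
```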
Enable Goose to Update a UI via Screenshots 🎨
Now that we know how to capture screenshots with Goose, let's use it to update a UI.
Step 1: Create a Simple UI 💻
We'll create an index.html file that renders three green boxes:
<!DOCTYPE html>
<html>
<head>
  <style>
    .container {
      display: flex;
      justify-content: center;
      gap: 20px;
      padding: 20px;
    }
    .box {
      width: 100px;
      height: 100px;
      background-color: #4CAF50;
      border-radius: 4px;
      box-shadow: 0 2px 4px rgba(0,0,0,0.1);
    }
  </style>
</head>
<body>
  <div class="container">
    <div class="box"></div>
    <div class="box"></div>
    <div class="box"></div>
  </div>
</body>
</html>
Save the file and open it in your browser. You should see three green boxes in a centered row.
Step 2: Prompt Goose to Screenshot This UI 📸
Let's get Goose to take a screenshot of our UI with the following prompt:
"Take a screenshot of my Google Chrome Browser. It should be on display(1)"
After Goose captured and saved the screenshot, I verified the image showed our three green boxes correctly.
Step 3: Prompt Goose to Update the UI ✨
Let's modify our UI through conversation with Goose. I asked it to make two changes:
- Make the middle square pink
- Rearrange the boxes into a column
The exact details of the conversation are in the image below:
Step 4: Verify the Result 🎉
Goose successfully updated the UI for me: it now renders three squares stacked vertically, with a pop of pink in the middle.
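For reference, here is one way those two changes could be written in CSS. This is a sketch of the kind of edit involved, not necessarily the exact code Goose produced. The `ui_overrides` function is a hypothetical wrapper that simply prints the rules so you can paste them at the end of the `<style>` block in index.html.

```shell
#!/bin/sh
# Hypothetical wrapper: print CSS rules for the two requested changes.
# The selectors match the class names used in index.html above.
ui_overrides() {
  cat <<'CSS'
/* Stack the boxes vertically; align-items keeps them centered horizontally */
.container {
  flex-direction: column;
  align-items: center;
}

/* The boxes are the container's only children, so the 2nd is the middle one */
.box:nth-child(2) {
  background-color: pink;
}
CSS
}

ui_overrides
```

Because these rules come after the originals, they override `flex-direction` and the middle box's color while leaving the rest of the styling alone.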
In this tutorial, we performed a basic UI update using Goose’s screen toolkit: changing a color and adjusting a layout. But the toolkit is capable of much more complex tasks, from solving tricky z-index stacking issues to handling cross-browser compatibility problems.
Beyond the screen toolkit, Goose offers a variety of other toolkits to enhance your development workflow. Check out our available toolkits to discover more possibilities.
As I was writing this blog post, I realized there are opportunities for us to add new capabilities like screen recording to enable Goose to handle keyframe animations and transformations.
Join Our Community! 🤝
The Goose Open Source Community is growing quickly, and I invite you to join the fun by:
Can't wait to see you there!