Go has a big standard library, and one package in there I really like is the image package. It's fun to work with since you can make cool photo filters with it, and I find its use of interfaces to represent images and color models (RGB, grayscale, CMYK) really elegant.
To show what the image package can do, I'll show you how to make an image filter using just Go plus a few Wikipedia searches' worth of theory on color representations. I originally made the filter just for fun, but then decided to use it to make the logo I use on my slides when I'm giving a talk.
If you know your way around graphics and color spaces in more detail than just Wikipedia searches, by the way, please share what you know in the comments or talk to me on Twitter at @AndyHaskell2013. I find this stuff really cool and would love to hear what you know!!
If you want to use the final version of the script, which has a command line interface and can recolor your images to any RGB colors, you can find it on my GitHub.
Loading an image
First thing we'll need is an image to apply the filter to. The picture below is the one I used for my logo. If you're following along, you should take a picture of a drawing of your own logo, or another photo you want to run this filter on. Save your image to raw.jpg:
What we'll be converting it to is this:
Now let's get started with the code! First, add this code to a file named main.go in the same directory as the image you're recoloring:
package main
import (
"image"
"log"
"os"
)
func main() {
f, err := os.Open("raw.jpg")
if err != nil {
log.Fatalf("error loading raw.jpg: %v", err)
}
defer f.Close()
img, _, err := image.Decode(f)
if err != nil {
log.Fatalf("error decoding file to an Image: %v", err)
}
_ = img // [TODO] Edit the image
log.Println("We got an image!")
}
What's happening here? In the first chunk of code, we're using os.Open to open our file, and then deferring closing the file when we're done with it. f is a *os.File, so the next step is to get that File's data into a type the image package can work with.
That's where image.Decode comes in. It takes in an io.Reader (any Go type that can read bytes; *os.File is one of those), and converts its content to an image.Image, which is the type the image package uses to represent and manipulate images. Once we've got that Image, we can start working with it, so run go run main.go and you'll get:
2020/02/02 15:28:25 error decoding file to an Image: image: unknown format
exit status 1
This file is definitely a valid image, but the image package can't recognize a JPEG. As the overview in the image package Godoc says, for the image package to be able to decode JPEGs, we need to register the JPEG format. To do that, add this line to the import block:
import (
"image"
+ _ "image/jpeg"
"log"
"os"
)
Interesting, an underscore import. What's go-ing on here?
What's happening is, by importing a package, you end up running all the init() functions inside that package. And it just so happens that at the bottom of Go's image/jpeg/reader.go file, we've got this init function:
func init() {
image.RegisterFormat("jpeg", "\xff\xd8", Decode, DecodeConfig)
}
With this init function, the image package now knows what to do if image.Decode is run on an image in the JPEG format! So if you run go run main.go now, you'll see:
2020/02/02 15:44:27 We got an image!
Excellent! We've got an image.Image. Now let's do something with it!
Before we go, though, let's also register PNGs and GIFs with the image package so we can run this script on them too. We're just two lines away from that!
import (
"image"
+ _ "image/gif"
_ "image/jpeg"
+ _ "image/png"
"log"
"os"
)
The code up to this point is in commit 1.
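If you're curious which registered format image.Decode actually matched, its second return value (the one we've been discarding with _) is the format's name. Here's a minimal sketch of that; the samplePNG and detectFormat helpers are names I made up for illustration, and the PNG is built in memory so you don't need a file on disk:

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/png" // a regular import still runs the package's init, registering PNG
	"log"
)

// samplePNG builds a tiny 2x2 PNG in memory so the sketch needs no file on disk.
func samplePNG() ([]byte, error) {
	var buf bytes.Buffer
	err := png.Encode(&buf, image.NewRGBA(image.Rect(0, 0, 2, 2)))
	return buf.Bytes(), err
}

// detectFormat decodes the data and returns the name of the registered
// format that image.Decode matched.
func detectFormat(data []byte) (string, error) {
	_, format, err := image.Decode(bytes.NewReader(data))
	return format, err
}

func main() {
	data, err := samplePNG()
	if err != nil {
		log.Fatalf("error building sample PNG: %v", err)
	}
	format, err := detectFormat(data)
	if err != nil {
		log.Fatalf("error decoding sample PNG: %v", err)
	}
	fmt.Println("decoded format:", format) // prints "decoded format: png"
}
```

Notice that image/png is imported without the underscore here because the sketch calls png.Encode directly; the init function still runs either way.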
The Image interface
For making the logo, what we're doing is re-coloring the pixels on the image; the pixels from the paper, which are lighter, will be recolored to charcoal black, and the pixels from the ink, which are darker, will be recolored to blue-green. Let's see what the Image type gives us for figuring out which pixels are light and which are dark:
type Image interface {
ColorModel() color.Model
Bounds() Rectangle
At(x, y int) color.Color
}
Interesting, Image is not a struct, it's an interface. And it has just three methods.
- ColorModel tells us what format the image uses to represent colors, like whether it's red/green/blue, the printer-style cyan/magenta/yellow/black, or something else.
- Bounds tells us the coordinates of the top-left and bottom-right pixels of the image (top left is not always (0, 0)).
- Finally, At tells us what color the pixel at the given x and y coordinates is.
NOTE: If you haven't worked with images before, by the way, note that unlike in the graphs you would draw in math class, increasing the y coordinate means going down in the image, not up. For example, the coordinates (100, 250) are 100 pixels to the right and 250 pixels below (0, 0).
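One way to end up with an image whose top-left corner isn't (0, 0) is cropping. Here's a small sketch, assuming a blank 10x10 image and the SubImage method on *image.RGBA; cropTopLeft is a hypothetical helper I wrote just for this example:

```go
package main

import (
	"fmt"
	"image"
)

// cropTopLeft crops the rectangle (x0, y0)-(x1, y1) out of a blank 10x10
// image and returns the top-left corner reported by the crop's Bounds.
func cropTopLeft(x0, y0, x1, y1 int) (int, int) {
	full := image.NewRGBA(image.Rect(0, 0, 10, 10))
	crop := full.SubImage(image.Rect(x0, y0, x1, y1))
	topLeft := crop.Bounds().Min
	return topLeft.X, topLeft.Y
}

func main() {
	x, y := cropTopLeft(3, 4, 8, 9)
	fmt.Printf("top-left of the crop is (%d, %d)\n", x, y) // prints "top-left of the crop is (3, 4)"
}
```

The crop keeps the coordinates it had in the original image, which is why code that loops over pixels should start from Bounds().Min rather than assuming zero.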
So if we're re-coloring pixels, that means we'd be using img.At() to get their original colors. Let's try that out by seeing what color the top-left pixel is in our image by adding this code to the end of the main function:
// we need this to know the coordinates of the top-left pixel, which isn't
// necessarily (0, 0)
b := img.Bounds()
color := img.At(b.Min.X, b.Min.Y)
log.Printf("The top-left pixel is %#v\n", color)
If we run this code with go run main.go, we'll get this output:
2020/02/02 16:39:50 The top-left pixel is color.YCbCr{Y:0xd7, Cb:0x78, Cr:0x86}
YCbCr is a color model that represents colors based on their brightness (Y) plus two color-difference values, blue-difference (Cb) and red-difference (Cr). So we have retrieved a pixel that's pretty bright, with Cb and Cr both close to their neutral midpoint of 0x80, so off-white. As you might guess, it's part of the paper I drew the logo on.
Now we have:
- A way to get what color a pixel is with Image.At
- and a way to get where the top-left and bottom-right of the image is with Image.Bounds
so knowing those two things, we also have a way of looping over all the pixels and being able to get the colors (and then do something with them, like recolor them).
Knowing that, the code we'll be writing for recoloring all the pixels of the image would look something like this, which is saved in commit 2:
func recolor(img image.Image) {
b := img.Bounds()
for y := b.Min.Y; y < b.Max.Y; y++ {
for x := b.Min.X; x < b.Max.X; x++ {
color := img.At(x, y)
// we have the color of the pixel at (x, y). Now we just need to
// figure out how bright that pixel is, and recolor it accordingly!
_ = color
}
}
}
In the recolor function, we get the image's bounds. Then, we loop over the image's pixels the way we would loop over a 2-dimensional array; in the outer loop, we work our way down the rows of pixels in the image, and in the inner loop, we loop from the leftmost pixel to the rightmost pixel in the current row of pixels.
Inside the inner loop, at each pixel, we get that pixel's color. So the next steps are:
- Figure out whether the pixel is light or dark
- If the pixel is light, then recolor it to charcoal black. If it's dark, then recolor it to blue-green.
Checking a pixel's brightness with the Color interface
Let's start by determining the brightness of each pixel in the image. When we run Image.At, the type we get back is a color.Color from Go's image/color package. And just like Image, Color is another interface type:
type Color interface {
RGBA() (r, g, b, a uint32)
}
Just a single method! This means if you have a type with an RGBA method telling us how red, green, blue, and opaque (a stands for alpha transparency) a given pixel is, then the image package can use your type as a Color!
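To make that concrete, here's a minimal sketch of a homemade color type. grayPercent is a hypothetical type I invented for this example, not something in the standard library; the only thing it needs to count as a color.Color is the RGBA method:

```go
package main

import (
	"fmt"
	"image/color"
)

// grayPercent is a hypothetical color type for this sketch: a shade of gray
// from 0 (black) to 100 (white).
type grayPercent uint8

// RGBA makes grayPercent satisfy color.Color. Each channel is returned in
// the 16-bit range [0, 65535], and the color is always fully opaque.
func (p grayPercent) RGBA() (r, g, b, a uint32) {
	if p > 100 {
		p = 100
	}
	v := uint32(p) * 0xffff / 100
	return v, v, v, 0xffff
}

func main() {
	// This assignment compiles only because grayPercent has an RGBA method.
	var c color.Color = grayPercent(100)
	r, g, b, a := c.RGBA()
	fmt.Println(r, g, b, a) // prints "65535 65535 65535 65535"
}
```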
If you look in the color package's Godoc, you'll see that the package has a lot of implementations of this interface. There's the Gray type for supporting colors represented as grayscale, the RGBA type for supporting colors represented in red/green/blue/alpha color spaces, and the one we saw when we printed out the color of our pixel, YCbCr (luma/blue-difference/red-difference, which you can read more about in its Wikipedia article). There are even separate implementations for 8-bit vs. 16-bit gray and RGBA colors (e.g. Gray/Gray16 and RGBA/RGBA64)!
Why so many different ways to represent a color? Because depending on the medium you're working with, different color models make more sense to represent colors in. I don't have an in-depth knowledge on these color models, but RGBA is used a lot in computer graphics, grayscale of course works for black and white images, and cyan-magenta-yellow-black (CMYK) is used for printers (which is why if you look closely at old comic books, you'll see little cyan, magenta, yellow, and black dots making the colors for everything).
Even with all the different representations though, with the Color interface, our code doesn't need to care which format a pixel's color is represented in! We can use a YCbCr color, or a grayscale color, or an RGBA color in any piece of code that takes in a color.Color. So let's try this Color interface out for figuring out whether a pixel is lighter or darker than a certain threshold.
To use a more sciencey term for that, let's see if the pixel's luminance is above a certain threshold!
Calculating luminance
For our purposes, we can go with the definition that luminance is how light or dark a pixel is. So completely white pixels have a luminance of 100%, and completely black pixels have a luminance of 0%.
In the picture of my logo, from a computer's perspective, the paper is not completely white and the ink is not completely black, so that's why our code needs to calculate luminance in order to determine if a pixel is light or dark.
If the luminance is above a certain threshold, let's say, 50% luminance, then we'll consider it light, and if it's below that threshold, we'll say it's dark. So how do we calculate luminance? This is the relative luminance formula I found on Wikipedia:
Y = 0.2126*R + 0.7152*G + 0.0722*B
So if we have a pixel in RGBA space, its luminance is affected the most by how green it is, then also affected some by how red it is, then just a bit by how blue it is. Here's what calculating luminance would look like in a Go function:
func luminancePercent(c color.Color) float64 {
r, g, b, _ := c.RGBA()
// We're dividing our pixel's red, green, and blue values by 2^16-1 because
// in colors returned from Color.RGBA(), the maximum value for a color
// channel is 2^16-1, or 65,535.
redPercent := float64(r)/65535 * 100
greenPercent := float64(g)/65535 * 100
bluePercent := float64(b)/65535 * 100
return redPercent*0.2126 + greenPercent*0.7152 + bluePercent*0.0722
}
First we run the color's RGBA method to get the pixel's red, green, and blue values. Then, we convert those values to percentages of the maximum possible value for red/green/blue, which is 65,535. Finally, to get our total luminance percent, we plug our red, green, and blue percentages into the luminance formula, and we have the pixel's luminance value!
So if we ran this on, for example, the color #00FF00 (as green as green gets), we'd have a green percent of 100 and red and blue percents of 0, giving us a luminance of a bright 71.52%.
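Here's a quick sketch that checks that arithmetic on a few colors. luminancePercent is the function from above; luminanceOfRGB is a hypothetical convenience wrapper I added so the calls stay short. The last line also runs the same function on the YCbCr paper pixel we printed earlier, without converting it first:

```go
package main

import (
	"fmt"
	"image/color"
)

// luminancePercent is the function from the article.
func luminancePercent(c color.Color) float64 {
	r, g, b, _ := c.RGBA()
	redPercent := float64(r) / 65535 * 100
	greenPercent := float64(g) / 65535 * 100
	bluePercent := float64(b) / 65535 * 100
	return redPercent*0.2126 + greenPercent*0.7152 + bluePercent*0.0722
}

// luminanceOfRGB is a hypothetical wrapper for opaque 8-bit RGB values.
func luminanceOfRGB(r, g, b uint8) float64 {
	return luminancePercent(color.RGBA{R: r, G: g, B: b, A: 255})
}

func main() {
	fmt.Printf("pure green: %.2f%%\n", luminanceOfRGB(0, 255, 0)) // prints "pure green: 71.52%"
	fmt.Printf("white: %.2f%%\n", luminanceOfRGB(255, 255, 255))  // prints "white: 100.00%"
	fmt.Printf("black: %.2f%%\n", luminanceOfRGB(0, 0, 0))        // prints "black: 0.00%"

	// The off-white paper pixel from earlier, still in its YCbCr form.
	paper := color.YCbCr{Y: 0xd7, Cb: 0x78, Cr: 0x86}
	fmt.Printf("paper pixel is light: %t\n", luminancePercent(paper) > 50)
}
```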
Let's head back to the recolor function's main loop and try out this luminancePercent function by counting how many pixels are light and how many pixels are dark.
func recolor(img image.Image) {
b := img.Bounds()
var lightPixels, darkPixels int
for y := b.Min.Y; y < b.Max.Y; y++ {
for x := b.Min.X; x < b.Max.X; x++ {
color := img.At(x, y)
// [TODO] Replace this with actually setting the color
// of the current pixel.
if luminancePercent(color) > 50 {
lightPixels++
} else {
darkPixels++
}
}
}
log.Printf("This image has %d pixels above 50%% luminance and %d below 50%%\n",
lightPixels, darkPixels)
}
Now in the main function, if you run recolor on the image we got back by running image.Decode on our image file, we will get this output from go run main.go:
2020/02/02 18:10:46 This image has 5948885 pixels above 50% luminance and 759875 below 50%
NOTE: The picture of my logo I posted up top is smaller than the copy I was using, so if you're following along using that picture, the number of light and dark pixels printed will be much smaller than the output above.
Our progress so far can be found in commit 3. Now that we can tell which pixels are light or dark, all we have to do is recolor them!
Recoloring the image
Interestingly, if we look again at the Image interface definition,
type Image interface {
ColorModel() color.Model
Bounds() Rectangle
At(x, y int) color.Color
}
you'll see that the interface doesn't actually give us a way to set the color of a pixel. However, that won't stop us because most of the concrete types implementing image.Image, such as the image.RGBA type, do have a Set method for setting a color.
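If you ever need to check at runtime whether an Image can be written to, one option is a type assertion against the draw.Image interface from the standard image/draw package, which is image.Image plus a Set method. This is just a sketch; isWritable and writableKinds are hypothetical helper names:

```go
package main

import (
	"fmt"
	"image"
	"image/draw"
)

// isWritable reports whether img also has a Set method, which is what the
// draw.Image interface adds on top of image.Image.
func isWritable(img image.Image) bool {
	_, ok := img.(draw.Image)
	return ok
}

// writableKinds checks one writable and one read-only concrete type.
func writableKinds() (rgba, ycbcr bool) {
	bounds := image.Rect(0, 0, 2, 2)
	rgba = isWritable(image.NewRGBA(bounds))
	ycbcr = isWritable(image.NewYCbCr(bounds, image.YCbCrSubsampleRatio420))
	return rgba, ycbcr
}

func main() {
	rgba, ycbcr := writableKinds()
	fmt.Println("*image.RGBA writable:", rgba)   // prints "*image.RGBA writable: true"
	fmt.Println("*image.YCbCr writable:", ycbcr) // prints "*image.YCbCr writable: false"
}
```

The YCbCr type is the one JPEGs decode to, which is one reason it's handy to copy pixels into a fresh writable image rather than editing the decoded one in place.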
If we look around the image package's Godoc at the different Image types, most of them correspond to the different color models we saw in the color package like RGBA, YCbCr, etc.
Why do we have one Image implementation for each color model? I am planning on doing another blog post about this to take a detailed look at that, because the image package really makes beautiful use of Go interfaces. But the short answer is space efficiency, since images take up a lot of memory. By having RGBA images, for example, get their own type, you can represent every pixel's red, green, blue, and alpha-transparency values as 8-bit integers in one big space-efficient slice, but the Go code working with those images doesn't have to worry about that detail!
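To peek at that "one big slice", here's a small sketch using the exported Pix field and PixOffset method of *image.RGBA. The pixelBytes helper is a hypothetical name, and the 4x3 size is just an assumption to keep the numbers small:

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// pixelBytes creates a w x h RGBA image, sets the pixel at (x, y) to
// blue-green, and returns the length of the backing slice plus the four
// bytes that store that one pixel.
func pixelBytes(w, h, x, y int) (int, []uint8) {
	img := image.NewRGBA(image.Rect(0, 0, w, h))
	img.Set(x, y, color.RGBA{R: 128, G: 201, B: 172, A: 255})
	i := img.PixOffset(x, y)
	return len(img.Pix), img.Pix[i : i+4]
}

func main() {
	total, px := pixelBytes(4, 3, 1, 2)
	// 4 x 3 pixels at 4 bytes each is 48 bytes.
	fmt.Println(total, px) // prints "48 [128 201 172 255]"
}
```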
Details aside, though, what we need to know for now is that concrete Image types like image.RGBA and image.CMYK do have a Set method for setting the color of a pixel. And we can get a brand new image.RGBA with a function like:
func NewRGBA(r Rectangle) *RGBA
We need to pass an image.Rectangle into the function to serve as the coordinates of our new image's bounds, and we have one already from our original image's Bounds method! So to get a writable Image, we would edit our recolor function like this:
- func recolor(img image.Image) {
+ func recolor(img image.Image) image.Image {
b := img.Bounds()
+ recolored := image.NewRGBA(b)
We now have an Image with the same bounds as the one we passed in. And it's got a Set method with this function signature:
func (p *RGBA) Set(x, y int, c color.Color)
We've got the functions we need, so now inside the loop, instead of counting light and dark pixels, we will use the pixels' luminance values to determine which color to pass into RGBA.Set.
In the import block, add a line to import the "image/color" package, then edit the recolor function's main loop like this:
+ // Add color.Color variables for the colors we're recoloring to
+ charcoal := color.RGBA{R: 34, G: 31, B: 32, A: 255}
+ blueGreen := color.RGBA{R: 128, G: 201, B: 172, A: 255}
- var lightPixels, darkPixels int
for y := b.Min.Y; y < b.Max.Y; y++ {
for x := b.Min.X; x < b.Max.X; x++ {
color := img.At(x, y)
- // [TODO] Replace this with actually setting the color
- // of the current pixel.
if luminancePercent(color) > 50 {
- lightPixels++
+ recolored.Set(x, y, charcoal)
} else {
- darkPixels++
+ recolored.Set(x, y, blueGreen)
}
}
}
And replace the log message printing the counts of light and dark pixels with the line return recolored to return our image!
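With those edits applied, the whole function can be tried out on a tiny made-up image before pointing it at a real photo. This is a sketch under a few assumptions: recolorSample is a hypothetical helper, the 2x1 test image is invented for the example, and I renamed the loop variable so it doesn't shadow the color package:

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// luminancePercent is the function from earlier in the article.
func luminancePercent(c color.Color) float64 {
	r, g, b, _ := c.RGBA()
	redPercent := float64(r) / 65535 * 100
	greenPercent := float64(g) / 65535 * 100
	bluePercent := float64(b) / 65535 * 100
	return redPercent*0.2126 + greenPercent*0.7152 + bluePercent*0.0722
}

// recolor paints light pixels charcoal and dark pixels blue-green.
func recolor(img image.Image) image.Image {
	b := img.Bounds()
	recolored := image.NewRGBA(b)
	charcoal := color.RGBA{R: 34, G: 31, B: 32, A: 255}
	blueGreen := color.RGBA{R: 128, G: 201, B: 172, A: 255}
	for y := b.Min.Y; y < b.Max.Y; y++ {
		for x := b.Min.X; x < b.Max.X; x++ {
			if luminancePercent(img.At(x, y)) > 50 {
				recolored.Set(x, y, charcoal)
			} else {
				recolored.Set(x, y, blueGreen)
			}
		}
	}
	return recolored
}

// recolorSample runs recolor on a hypothetical 2x1 test image: a white
// "paper" pixel at (0, 0) and a black "ink" pixel at (1, 0).
func recolorSample() (paper, ink color.RGBA) {
	src := image.NewRGBA(image.Rect(0, 0, 2, 1))
	src.Set(0, 0, color.White)
	src.Set(1, 0, color.Black)
	out := recolor(src).(*image.RGBA)
	return out.RGBAAt(0, 0), out.RGBAAt(1, 0)
}

func main() {
	paper, ink := recolorSample()
	fmt.Println(paper, ink) // prints "{34 31 32 255} {128 201 172 255}"
}
```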
Outputting our picture!
Just one thing left to do now: actually outputting the image to recolored.jpg!
Let's first add a dialogue to our program so if there's already a file by that name, we don't overwrite it unless the user wants to. In the import block, import the "fmt", "strings", and "bufio" packages, then add this code to the bottom of the main function:
if _, err := os.Stat("recolored.jpg"); err != nil && !os.IsNotExist(err) {
log.Fatalf(
"unexpected error checking if recolored.jpg already exists: %v", err)
} else if err == nil {
fmt.Println("recolored.jpg already exists. Replace it? (y/n)")
replace, err := bufio.NewReader(os.Stdin).ReadString('\n')
if err != nil {
log.Fatalf(
"unexpected error reading whether to replace the file: %v",
err,
)
} else if strings.TrimRight(strings.ToLower(replace), "\n") != "y" {
return
}
}
First we use os.Stat to check if recolored.jpg already exists. If we get a nil error from that function, that means the file does exist. So in the else if block, we print out a warning that recolored.jpg exists, asking the user if they want to replace it.
We read the user's input in with bufio.NewReader(os.Stdin).ReadString('\n'), and chop off the newline from the user's input with strings.TrimRight. If the user typed "y", then we carry on, but if they typed "n" or anything else, then the program simply exits.
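One thing to watch for: on some terminals (Windows in particular) the line you read back may end in "\r\n", and trimming only "\n" would leave the "\r" behind. A slightly more forgiving variation, which is my own suggestion rather than what the script uses, is strings.TrimSpace. Here it is as a small sketch with a hypothetical isYes helper:

```go
package main

import (
	"fmt"
	"strings"
)

// isYes reports whether a line of user input means "yes", ignoring case
// and any surrounding whitespace such as "\n" or "\r\n".
func isYes(answer string) bool {
	return strings.ToLower(strings.TrimSpace(answer)) == "y"
}

func main() {
	fmt.Println(isYes("y\n"), isYes("Y\r\n"), isYes("n\n")) // prints "true true false"
}
```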
Now remove the _ from the _ "image/jpeg" line of the import block, and below the dialogue we made, add this code:
out, err := os.Create("recolored.jpg")
if err != nil {
log.Fatalf("error creating output file recolored.jpg: %v", err)
}
defer out.Close()
recolored := recolor(img)
if err := jpeg.Encode(out, recolored, nil); err != nil {
log.Fatalf("error outputting recolored image: %v", err)
}
We create recolored.jpg with os.Create, then by passing our recolored Image to jpeg.Encode, we output that image to recolored.jpg!
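The nil we pass as the last argument to jpeg.Encode means "use the default options", which is a quality of 75 (jpeg.DefaultQuality). If you want a different trade-off between file size and quality, you can pass a *jpeg.Options instead. Here's a sketch that encodes to an in-memory buffer rather than a file; roundTrip is a hypothetical helper and the blank 4x4 image is just a stand-in for a real picture:

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/jpeg"
	"log"
)

// roundTrip encodes a blank 4x4 image as a JPEG at the given quality
// (1 to 100), decodes it again, and returns the decoded width and height.
func roundTrip(quality int) (int, int, error) {
	var buf bytes.Buffer
	src := image.NewRGBA(image.Rect(0, 0, 4, 4))
	if err := jpeg.Encode(&buf, src, &jpeg.Options{Quality: quality}); err != nil {
		return 0, 0, err
	}
	decoded, err := jpeg.Decode(&buf)
	if err != nil {
		return 0, 0, err
	}
	b := decoded.Bounds()
	return b.Dx(), b.Dy(), nil
}

func main() {
	w, h, err := roundTrip(90)
	if err != nil {
		log.Fatalf("error round-tripping JPEG: %v", err)
	}
	fmt.Printf("decoded a %dx%d image\n", w, h) // prints "decoded a 4x4 image"
}
```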
Finally, run go run main.go, and you should get your recolored image saved to recolored.jpg:
Awesome! We've got the logo made with the recolor script, and we only used the Go standard library! You can find the progress on this at Commit 4.
I hope you liked this tutorial and are curious about trying more stuff out with image processing in Go!
By the way, on the GitHub repository, I added a command line interface with Steve Francia's Cobra CLI package so you can add more custom options, like picking what luminance thresholds you want to use, picking the name of the file you're outputting, what colors you want to convert pixels to, etc. If you want to look at those changes commit-by-commit, all the commits for making the CLI are on this GitHub branch.