Generating previews of RAW images in Go, fast
I’m currently toying with making an image viewer in Go, because I’m unhappy with Eye-of-Gnome’s slowness and lack of support for Fujifilm’s RAW format (.RAF files).
Yes, I know eog’s being replaced by loupe, which is much faster, but it still doesn’t support RAFs, and I don’t like its keyboard shortcuts.
Anyway, I just wanted to try to make one for fun.
This article is not about the viewer itself but about one small part of it that I learned along the way: how to open and generate previews of RAF image files.
The image in this article is taken from the awesome raw.pixls.us project which contains RAW files for many camera models, all with CC0/Public domain licensing. Feel free to download a sample X-T5 RAF file here to replicate my findings, or better yet, try to do the same with the RAW format of your favorite camera brand!
(Not) decoding the RAW data
This may or may not seem obvious, but to generate a preview of a RAW image, you absolutely do not want to decode (demosaic) the raw data itself, for two main reasons.
The first reason is speed, since demosaicing is very CPU intensive.
I won’t even try to find a way to do that in Go as we’ll avoid it entirely anyway.
But in order to have some baseline, let’s try demosaicing with simple_dcraw, which is included in libraw:
$ time simple_dcraw DSCF0021.RAF
real 0m4.214s
user 0m32.729s
sys 0m1.001s
More than 4 seconds of wall time and 8x that in CPU time. Ouch.
For the second reason, we can just take a look at the image that that command generated:

That’s obviously not right… RAW files are, well, raw, and they need a whole bunch of post-processing to get a usable image, not just demosaicing but also white balance, contrast adjustments, cropping…
Even using a more complete pipeline with something like darktable-cli won’t give us an output that looks good out of the box, and it will be even slower than the plain libraw conversion.
Getting the embedded image
Luckily we don’t have to decode the raw data. Raw files contain an embedded JPEG image (or several, as we’ll see later) for the very purpose of previewing the file. This JPEG has the camera processing applied and looks correct. At least, that’s true of Fujifilm’s RAFs but I expect most, if not all, other camera manufacturers do the same.
To learn the structure of a RAF, we can take a look at the awesome Fileformats wiki.
According to the wiki, the JPEG image offset and length are two big-endian uint32 numbers starting at byte 16 + 4 + 8 + 32 + 4 + 20 = 84 = 0x54, the sum of the sizes of the header fields that precede them.
Let’s see what we have:
$ hexdump -C DSCF0021.RAF | head
00000000 46 55 4a 49 46 49 4c 4d 43 43 44 2d 52 41 57 20 |FUJIFILMCCD-RAW |
00000010 30 32 30 31 46 46 31 37 39 35 30 32 58 2d 54 35 |0201FF179502X-T5|
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000030 00 00 00 00 00 00 00 00 00 00 00 00 30 31 30 30 |............0100|
00000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000050 00 00 00 00 00 00 00 94 00 3d 49 d3 00 3d 4b 0c |.........=I..=K.|
00000060 00 00 56 f4 00 3d a2 00 02 54 39 a0 00 00 00 02 |..V..=...T9.....|
00000070 02 54 39 a0 00 00 00 00 00 00 00 00 00 00 00 00 |.T9.............|
00000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000090 00 00 00 00 ff d8 ff e1 ff a8 45 78 69 66 00 00 |..........Exif..|
Reading 2 uint32(BE) starting at 0x54, we have:
- offset = 0x00000094
- length = 0x003d49d3
And if we take a look at offset 0x94, what do we have? ff d8 ff ..., yep, that’s a JPEG!
Let’s extract it:
$ < DSCF0021.RAF tail -c +$((0x94 + 1)) | head -c $((0x3d49d3)) > embedded.jpg
$ file embedded.jpg
embedded.jpg: JPEG image [...] 4416x2944, components 3
We got ourselves a nice 4416x2944 (13 Mpixels), 4 MB JPEG. That’s actually a very high-res preview!
Now let’s try to use that in a Go program.
For compatibility with RAW files from a wide range of manufacturers, I think the best would be to use libraw bindings to parse the file and get the preview image.
But for the purposes of this blog post, let’s concentrate on Fuji RAFs and do it as simply as possible.
Generating a preview in Go
Let’s say we want to open the RAF file and find the embedded preview image. Since it’s larger than needed, we’ll also resize it to a more manageable 300x200 px preview.
Here’s some code that can do that. In order to keep it as short as possible, I’ll keep the error handling minimal, sorry if it’s not very idiomatic Go code.
package main

import (
	"encoding/binary"
	"fmt"
	"image"
	"image/jpeg"
	"io"
	"os"
	"time"

	"golang.org/x/image/draw"
)

const TARGET_WIDTH, TARGET_HEIGHT = 300, 200

func checkErr(err error) {
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
}

func findPreviewImage(filename string) io.Reader {
	fd, err := os.Open(filename) // left open on purpose: the returned reader reads from it
	checkErr(err)

	var hdr struct {
		_      [0x54]byte // header fields we don't need
		Offset uint32     // start of the embedded JPEG
		Length uint32     // size of the embedded JPEG in bytes
	}
	err = binary.Read(fd, binary.BigEndian, &hdr)
	checkErr(err)

	_, err = fd.Seek(int64(hdr.Offset), io.SeekStart)
	checkErr(err)

	return io.LimitReader(fd, int64(hdr.Length))
}

func decodeJPEG(r io.Reader) image.Image {
	img, err := jpeg.Decode(r)
	checkErr(err)
	return img
}

func resize(img image.Image, width, height int) image.Image {
	dst := image.NewRGBA(image.Rect(0, 0, width, height))
	draw.BiLinear.Scale(dst, dst.Rect, img, img.Bounds(), draw.Over, nil)
	return dst
}

func main() {
	t0 := time.Now()
	rdr := findPreviewImage(os.Args[1])
	img := decodeJPEG(rdr)
	t1 := time.Now()
	fmt.Println("decoded embedded image:", img.Bounds().Max, "took:", t1.Sub(t0))

	resized := resize(img, TARGET_WIDTH, TARGET_HEIGHT)
	t2 := time.Now()
	fmt.Println("resized image:", resized.Bounds().Max, "took:", t2.Sub(t1))
}
$ time ./genpreview DSCF0021.RAF
decoded embedded image: (4416,2944) took: 236.011982ms
resized image: (300,200) took: 169.488725ms
real 0m0.408s
user 0m0.398s
sys 0m0.007s
Nice, it works. But… 240 ms to read and decode the preview? And 170 ms to resize? Sure, the whole process takes less than half a second and that’s much better than demosaicing, but if you have to generate hundreds of previews, it quickly adds up.
Can we go faster?
Changing the library
The thing is, Go’s standard image library, being in pure Go, is not as efficient as some highly-optimized C libraries. Some would even say that it’s slow.
We can replace the standard image lib with libjpeg(-turbo) with the github.com/pixiv/go-libjpeg bindings. It’s an easy change:
@@ -4,11 +4,11 @@
 	"encoding/binary"
 	"fmt"
 	"image"
-	"image/jpeg"
 	"io"
 	"os"
 	"time"

+	"github.com/pixiv/go-libjpeg/jpeg"
 	"golang.org/x/image/draw"
 )

@@ -40,7 +40,7 @@
 }

 func decodeJPEG(r io.Reader) image.Image {
-	img, err := jpeg.Decode(r)
+	img, err := jpeg.Decode(r, &jpeg.DecoderOptions{})
 	checkErr(err)
 	return img
 }
$ time ./genpreview DSCF0021.RAF
decoded embedded image: (4416,2944) took: 63.597905ms
resized image: (300,200) took: 172.552176ms
real 0m0.239s
user 0m0.226s
sys 0m0.006s
Cool, we cut ~170 ms of decode time with a 2-line change. It comes at the cost of adding a dependency on an external library, sure, but I think it’s worth it.
Can we go faster?
Partial decode
The JPEG format encodes blocks of 8x8 pixels by transposing them into the frequency domain with a discrete cosine transform (DCT). This allows a neat trick if we want to decode an image in a size smaller than its original size: we can just skip decoding the higher frequencies. If we only need an image with 1/8th the resolution of the original, we can even skip computing the inverse DCT altogether and just use the DC (constant) component of each 8x8 block.
libjpeg implements these kinds of scaled decoding optimizations and we can activate them by setting a target size in the jpeg.DecoderOptions.
The library will automatically compute a scaling ratio that speeds up decoding while guaranteeing a resulting image at least the target size.
Note however that scaled decoding can introduce artefacts and it’s not as good as bilinear resize. The recommended technique is thus to scale-decode to a slightly larger size than required and then resize to the final size. Let’s set a 2x factor over our desired final target size.
 func decodeJPEG(r io.Reader) image.Image {
-	img, err := jpeg.Decode(r, &jpeg.DecoderOptions{})
+	img, err := jpeg.Decode(r, &jpeg.DecoderOptions{
+		ScaleTarget: image.Rect(0, 0, TARGET_WIDTH*2, TARGET_HEIGHT*2),
+	})
 	checkErr(err)
 	return img
 }
$ time ./genpreview DSCF0021.RAF
decoded embedded image: (1104,736) took: 44.648346ms
resized image: (300,200) took: 12.396289ms
real 0m0.060s
user 0m0.056s
sys 0m0.005s
Now we’re cooking! You can see that we decoded the image at 1/4 resolution. Not only did it improve the decoding time slightly, but even better, by feeding it a smaller image to begin with, the resizing step became more than 10x faster!
Bonus: even faster (but smaller)
A funny thing is that just as the RAF file contains a (slightly) smaller JPEG for preview, that embedded JPEG also embeds an even smaller JPEG! It is really small though… Like, 160x120 px. But if that’s all you need, then getting that doubly-embedded JPEG is going to be the fastest way to get any kind of preview of the RAW file.
The thumbnail is stored in the EXIF data of the larger JPEG preview image.
To retrieve it in Go, we can keep our findPreviewImage function from before, parse its EXIF data with the github.com/rwcarlsen/goexif library, and simply call the dedicated JpegThumbnail method.
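// Needs "bytes" and "github.com/rwcarlsen/goexif/exif" in addition to the earlier imports.
func getThumbnailFromJPEG(r io.Reader) image.Image {
	// Parse only the EXIF metadata of the embedded preview.
	ex, err := exif.Decode(r)
	checkErr(err)

	// The thumbnail is itself a complete JPEG stored inside the EXIF data.
	thumb, err := ex.JpegThumbnail()
	checkErr(err)

	img, err := jpeg.Decode(bytes.NewReader(thumb), &jpeg.DecoderOptions{})
	checkErr(err)
	return img
}

func main() {
	t0 := time.Now()
	rdr := findPreviewImage(os.Args[1])
	img := getThumbnailFromJPEG(rdr)
	t1 := time.Now()
	fmt.Println("decoded embedded thumbnail image:", img.Bounds().Max, "took:", t1.Sub(t0))
}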
$ time ./getthumbnail DSCF0021.RAF
decoded embedded thumbnail image: (160,120) took: 1.555741ms
real 0m0.006s
user 0m0.001s
sys 0m0.003s
And that’s after dropping the page cache (sudo sh -c 'echo 1 >/proc/sys/vm/drop_caches').
With the RAW file in cache, the process only takes around 400 µs.