Basti's Scratchpad on the Internet
09 Feb 2020

Editing Fuji Photos on Linux

Darktable is my favorite RAW editor. It's a program for developing digital negatives ("RAW files") to JPEGs. But, I've long struggled with matching the quality of the out-of-camera JPEGs of my Fuji camera. Let me explain.

Today's cameras capture an astounding amount of detail, far more than monitors can display or printers can print. And then they crush it down to a printable and viewable JPEG file. But that crushing operation is idiomatic for each camera, irreversible, and not always appropriate to the image. And at those inappropriate times, when you want a wider dynamic range or different colors, a RAW editor like Darktable can take a digital sensor dump ("RAW file"), and render it differently. The challenge is that the camera's own JPEGs are already very, very good, and it's a fine line to walk between fixing a particular flaw with the image, while retaining as much detail as possible.

Fuji cameras produce notoriously difficult RAW files, as their internal processing is (said to be) particularly elaborate, and they use an unusual image sensor. This post is about matching the quality of Fuji's JPEGs in Darktable, while maintaining the fidelity and malleability of the RAW files.

Here are a few renditions of a photograph I took, the first created with my revised process, the second is Fuji's JPEG, and the third is Darktable's default rendition. First the entire picture, the a zoomed view so you can see each pixel:



While contrast and saturation don't match 100% in my version and Fuji's, the colors are close enough for further processing. Darktable's default version does not match those colors at all. The zoomed-in version highlights even worse problems with sharpness and detail retention. After working with Fuji files in Darktable for about a year, and editing about 3500 Fuji files, I finally found a workflow that reliably produces results on par with Fuji's JPEGs:

  1. Use the Iridient X-Transformer to convert Fuji's RAF RAW-files to DNG RAW-files, and have the X-Transformer do the "More Detailed" demosaicing, and the full lens correction.
  2. Use Stuart Sowerby's LUTs instead of Darktable's Base Curve or Filmic RGB.

My first gripe with Darktable's rendering of Fuji files is that the demosaicing and lens corrections are not particularly great. Pictures simply come out softer than with other tools, and chromatic aberrations remain an issue. The Iridient X-Transformer completely fixes this issue for me. The X-Transformer can also do sharpening and denoising, but I find those are better relegated to Darktable where needed. "But… that's proprietary Windows software!!!". Yes, it is. And it works beautifully in Wine.

My second issue is colors. Darktable needs a lot of massaging to produce Fuji-like colors. The Base Curve module (shown above) really does not do Fuji files justice most of the time. And while Filmic RGB is much better, it requires a lot of tedious adjustments even in simple cases. So instead, I use LUTs extracted with a color chart and some fancy math to replicate Fuji's colors. At the moment (v3.0), Darktable understands only .cube and .png LUTs. But Stuart Sowerby's website has them in .3dl. So for now, you need to install a 3D LUT converter, and convert them to .cube manually. On Windows.

With these two steps you can develop your Fuji RAFs better than your camera, and with relatively little fuss. I'll leave you with a finished render of the image above:

mine.jpg
Tags: photography linux
21 Jan 2020

I Miss My Mac

There was a time, maybe from 2009 to 2011, when Apple computers were glorious. They were elegant devices, beautiful to me, and powerful. Never before, or since, have I loved a machine as I loved that MacBook Pro early 2011.

I miss that, being able to love the elegance and power a machine.

Nowadays I mostly use KDE Linux, which I can tolerate. Things occasionally break, stuff sometimes doesn't work as intended, but I can generally work around these quirks, so they don't bother me too much. Today, off the top of my head, Firefox crashed, Thunderbird couldn't display one folder, Zotero had to be re-installed, one monitor had to be power-cycled when waking the computer, and the laptop battery died way too quickly again.

I also use a Windows Surface tablet, which is a wonderful little device. It currently shows an error message every time it boots that I can't get rid of, has to be rebooted every few days to stay fast, and sometimes just won't react until restarted. And it always wants to install some update, and just refuses to do it overnight as intended. But generally it works pretty well.

And yesterday, in a bout of nostalgia, I opened my macOS laptop again and endeavored to write a bit on it. There were no updates, then there were updates but they couldn't be installed, then they could be installed, then the download stalled, and hat to be un-paused for some reason. Then my project files randomly couldn't be downloaded to the laptop, and the window management in macOS is a horrid mess. Also the update took a full hour, and didn't change anything noteworthy. I didn't end up actually getting to write on it.

In the evening, my camera corrupted its file system, as it sometimes does if I forget to format the SD card as FAT instead of its default exFAT. The recovery process was tedious but worked. I've done it a few times now and I know what to do.

My phone, meanwhile, can't use its fingerprint reader any longer ever since I "up"graded to Android 10. It's a Pixel 2, as google as Google can be. And apps are sometimes unresponsive and have to be restarted to become usable again. And it still can't undo text edits in the year 2020, and probably spies on my every move.

My amplifier nominally supports Bluetooth, but it only works some of the time. Currently it doesn't, so we use a portable Bluetooth speaker instead. Its Spotify integration has never worked even once. But at least it now reliably sends an image to the projector ever since we vowed to only ever touch the one reliable HDMI port on the front of it.

And as much as I love my ebook reader, it sometimes crashes when there's too much stuff on the page. One time I even had to completely reset it to get it to work again. And every time I install an update, the visual layout changes. Most of the time, mostly for the better. But it just erodes trust so much, to be at the mercy of the whims of an obviously demented software designer somewhere.

As I said, I miss my Mac, back when technology was magical, and just worked. And any error could be clearly explained as user error instead of terrible programming. When I was young and foolish, in other words.

I should go play with my child now, and run around in the yard, or maybe read a paper book. There's no joy in technology any more.

Tags: computers macos
29 Dez 2019

Efficiency is not Intuitive

A few days ago, a new version of Darktable was released. Darktable is an Open Source RAW image processor, and one of my favorite pieces of software of all time. It may be THE most powerful RAW processor on the market, far exceeding its proprietary brothers. But in the social media discussions following the release, it got dismissed for being unintuitive.

Which is silly, ironic, sad; why should Intuition be a measure of quality for professional software? Because in contrast to a transient app on your phone, RAW processing is a complex task that requires months of study to do well. Like most nontrivial things, it is fundamentally not an intuitive process. In fact, most of my favorite tools are like this: Efficient, but not Intuitive; like Darktable, Scientific Python, the Unix Command Line, Emacs.

These tools empower me to do my job quickly and directly. They provide sharp tools for intelligent users, which can be composed easily and infinitely. Their power derives from clean metaphors and clear semantics, instead of "intuitive", "smart", magics. They require learning to use well, but reward that effort with efficiency. They are professional, and treat their users as adults.

In my career as a software developer, I know these ideas as traits of good APIs: Orthogonal, minimal, composable. APIs are great not when they are as expansive and convenient as possible, but when they are most constrained and recombinable. In other words, "don't optimize for stupid". That's what I understand as good design in both programming, and in professional UIs.

Which is quite the opposite of Intuitive. Quite the opposite of "easy to grasp without reading the manual". That's a description of a video game, not a professional tool. As professionals, we should seek and build empowering, efficient tools for adults, not intuitive games maskerading as professional software.

Tags: computers ui
27 Oct 2019

Travel Cameras

We went on a once-in-a-lifetime vacation this year. Six weeks in the USA, roving New England in an RV, and going on a helicopter tour around New York City. I have spent months thinking about the perfect camera for going on this trip. I went on several vacations with various permutations of cameras and lenses as a test-run. I data-mined my existing photographs to find the perfect setup through science. And I scoured the internet for untold hours to get it right.

Here's what I took with me:

P1020974_2.jpg
Figure 1: A Fuji X-E3 with an XF 18-135 lens, and a Ricoh GR

That's two cameras! Most people would take their smartphone one camera with multiple lenses. I take multiple cameras with one lens each. Why? Because they're both awesome, at different things:

The Fujifilm X-E3 with an XF 18-135 f/3.5-5.6 lens

The X-E3 is a mirrorless camera, which means it's small. It is a very recent camera with a modern sensor. It's a rangefinder-style camera with the viewfinder in the top left. So it doesn't obscure my whole face when I lift it to my eye.

Compared to a DSLR, this is a small and light camera1. But paired with the lens, it is neither. That lens, though: a massive focal range from 18 mm to 135 mm is suitable for wide landscapes, as well as tight tele shots of wildlife. Cramming it all in one lens must come with a loss in sharpness and aperture, but the 18-135 is a best of its breed, and still plenty sharp enough for me. And it comes with an image stabilization so sophisticated that I can hand-hold half-second exposures. Which is magic.

P1020982_2.jpg
Figure 2: The X-E3 with the lens extended. Small, it is not.

This is a great camera for deliberately framing a shot; it feels great to hold, with a neat and efficient control layout for quickly playing with apertures and shutter speeds and focal lengths with a minimum amount of fuss. It is a dependable tool, an amazing piece of engineering, and an artistic means of expression. It is truly a great camera.

If it has an Achille's heel, it's that it takes a while to get ready. Not because it is slow to start up, but because there's a delay before the viewfinder activates when raised to the face. Which just means you will stare into the void for half a second, while seeing the critical moment pass in front of you. Half a second of missing the shot.

And the white balance can struggle a bit with forests and lakes. Forests are green, and to compensate, the X-E3 turns everything else magenta. Easy to fix in post, but annoying. These are minor complaints, and most of the time the X-E3 is as perfect a camera as I've ever seen one.

Here are a few pictures I took with the X-E3 on my last vacation2:



Image quality is simply excellent. Not once has the X-E3 let me down. Autofocus is accurate, and instant. The lens is amazingly sharp all through its focal range. Dynamic range is unreal, at times. As you can see in the pictures, I've shot everything from landscapes to people, to wildlife, long exposures, and even some astro. All with one lens, on one camera.

The Ricoh GR

So, if the X-E3 is my "professional" camera, the Ricoh GR is my fun camera. In 2019, the GR old technology. It was released in 2013, it has a big but aging 16 megapixel APS-C sensor, and a fixed 18 mm f/2.8 lens with no zoom. It shouldn't be able to compete with the Fuji. But there is one thing the GR does better than any other camera: Speed.

DSCF1333_3.jpg
Figure 3: The Ricoh GR in all its glory (before the vacation)

The GR is in my pants pocket. It wouldn't be my pants without a GR in it. I have it with me everywhere, all the time. And at the right moment, I will pull it out, and take a picture, before you even notice it's there. That's the GR's mission: The fastest possible time from pants to picture!

Seriously, the GR goes from off to taking the first shot before even its screen turns on3. I can whip out my GR and take a picture long before you have unlocked your phone, never mind started the camera app, selected a focus point, or aimed it at your prey. In the car, I had the GR in the center console, and could take a picture of the roadside before it whizzed past. And without taking my eyes off the road, naturally.

Since the GR has a fixed and wide focal length, I don't have to look at its screen to see what I'm taking a picture of. I can do it by feel. Because all the controls are on the right, I can adjust everything with my right hand only, without taking my hands off the wheel, or stroller, or wife. And then there's snap focus, where a full-press of the shutter will take a shot at a pre-defined distance, without waiting for autofocus. I wish every camera had this feature!

However, all of this took some getting used to at first. I missed focus a lot in the beginning. But once I figured it out, a new world opened up. A world of whimsy and spontaneity that is simply not possible with a bigger or slower camera. A good half of my favorite pictures were only possible thanks to the GR.

DSCF9162_2.jpg
Figure 4: The Ricoh GR after the vacation: scarred, but stronger for it.

Again, there are downsides. It's an old sensor. Dynamic range is limited. Autofocus could be faster. There's a noticeable shutter delay. There is no viewfinder or image stabilization. The white balance is not great. It can struggle with a full memory card4.

But none of that matters. I love its colors. I love its sharpness. It has taken quite the beating on this vacation, but I wouldn't have it any other way. The GR is magnificent. There is nothing else quite like it5. And so many memories would have been lost if not for the GR:



Many of these shots were taken in the spur of the moment. In situations where I didn't take the big camera with me. Where a big camera would have been inappropriate. But the GR is so tiny, so non-threatening, it can become part my daily life. And capture the fleeting moments that make life worth living. That is a photographic super power if there ever was one.

Footnotes:

1

The body weighs 337g, which is a feather. The lens weighs 490g.

2

No personal pictures, though. I don't share those publicly.

3

The magic is "snap focus". If you depress the shutter fully, without waiting for autofocus, the GR immediately takes the shot at a pre-set distance. Works in any mode, and even when just powering on: Press the power button, aim, press the shutter, and before even the screen turns on you will have taken a picture. Lightning fast.

4

As the card fills up, getting into playback mode takes longer and longer. Minutes, for a full 64 Gb card.

5

Forget about the Fuji XF10. I tried it. It's a newer sensor, but so clumsy, so slow, so awkward to use. It's no comparison.

Tags: photography
20 Sep 2019

Analyzing Speech Signals in Time and Frequency

Our first stop in speech analysis is usually the short-time Fourier transform, or STFT1:

stft.png
Figure 1: Top: an STFT of speech (it's me, saying: "Es war einmal ein Mann"). Bottom: the signal's waveform.

As we can see, this speech signal has a strong fundamental frequency track around 120 Hz, and harmonics at integer multiples of the fundamental frequency. We perceive the frequency of the fundamental as the speech's pitch. The magnitude of the harmonics varies over time, which we perceive as the sounds of different vowels.

All of this is clearly visible in this STFT. To calculate the STFT, we chop up the audio signal into short, overlapping blocks, apply a window function 2 to each of these blocks, and Fourier-transform each windowed block into the frequency domain:

block-processing.png
Figure 2: The STFT is a two-dimensional graph of short-time spectra. Window functions and corresponding spectra of some blocks are highlighted.

We zoomed in a bit to make this visible. At this zoom level, an old friend becomes visible in the waveform: The waveform of our harmonic spectra is clearly periodic. We also see that each of our blocks contain multiple periods of the speech signal. So what happens if we make our blocks shorter?

short-blocks.png
Figure 3: STFT with shorter blocks (graph shows a shorter time, too). Each window now covers roughly one period of the signal.

As we make our blocks shorter, such that they only contain a single period of the signal, a funny thing happens: the harmonic structure vanishes. Why? Because the harmonic structure is caused by the interference between neighboring clicks; As soon as each block is restricted to a single period, there is no interference any longer, and no harmonic structure.

But instead of a harmonic structure, we see the periodicity of the waveform emerge in the STFT! To look at this phenomenon in a bit more detail, we will have to leave the classical STFT domain, and look at a close sibling, the Constant Q Transform, or CQT. The CQT is simply an STFT with different block sizes for different frequencies:

cqt.png
Figure 4: The CQT uses shorter block sizes for higher frequencies and longer block sizes for lower frequencies.

The periodic structure of the signal is now perfectly apparent. Each period of the signal gives rise to a skewed click-like spectrum that repeats at the fundamental frequency. Interestingly, however, the fundamental itself shows no periodicity. To see why this is so, and to finally get a grasp of how the short-time Fourier transforms interprets signals, we have to turn to our last graph, a gamma-tone filterbank:

gammatone.png
Figure 5: A gamma-tone filterbank is a series of band-pass filters. Each horizontal line is now a frequency-constrained waveform.

The fundamental frequency is now clearly a visible as a single sinusoid, rotating its merry way at 120 Hz. Each harmonic is its own sinusoid at an integer multiple of the fundamental frequency, but changing in amplitude in lockstep with the findamental. Thus in this graph, we see both the harmonicity and the periodicity at the same time, and can finally appreciate the complex interplay of magnitudes and phases that give rise to the beautiful and simple sound of the human voice3.

Footnotes:

1

also known as "spectrogram"

2

a Hann window, in this case

3

in the magnitude domain only; Phases, and different window functions, go beyond the scope of this article.

Tags: signal-processing
Other posts