Over the months I've developed a system of processing and organizing photos that helps me stay organized and scales well. I've cataloged over 15,000 photos this way.

There are several things I usually do with my digital photos. Despite appearances, the full process is actually ad hoc and depends on what I'm starting with: a handful of files, a few thousand files, photos encoded in NEF format, photos encoded in JPEG format, etc. Every now and then I adjust the methods to account for previously unforeseen scenarios. When I do, this page gets updated.

Raw (NEF) files

I often record images in the .NEF format.

My Nikon D50 produces .NEF "raw" files, which represent the camera sensor's data as faithfully as Nikon chose to record it. These are images with 12 bits per channel of information and three channels (red, green, blue) per pixel. By contrast, JPEG files contain 8 bits per channel. This means that two photos of the same resolution, one encoded as NEF and one as JPEG, hold different amounts of information: the NEF file can distinguish 2^(12-8) = 16 times as many intensity levels at each pixel for each channel. JPEG files are also lossy, whereas NEF files are not. (Technically, this is not quite true if you consider that data from the 16-bit CCD sensor on the camera is reduced to 12-bit data for recording into a NEF file; most of this "lost" data is sensor noise anyway, however.)
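The arithmetic behind that factor of 16 can be checked in a couple of lines of shell. This is only an illustration; the helper name levels is made up for the example.

```shell
#!/bin/sh
# levels BITS: number of distinct intensity values a channel with BITS bits can hold.
# (Hypothetical helper, for illustration only.)
levels() {
  echo $(( 1 << $1 ))
}

echo "JPEG (8-bit):  $(levels 8) levels per channel"
echo "NEF (12-bit):  $(levels 12) levels per channel"
echo "ratio:         $(( $(levels 12) / $(levels 8) ))"
```

So each channel goes from 256 possible values to 4096.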

Why raw?

The advantage of using NEF files is that a broader range of adjustments to the contrast, brightness, or white balance can be performed than with a JPEG file. In the 8-bit land of JPEG files, such adjustments can quickly lead to large areas being single-colored because the upper or lower bound on the channel intensities is reached. With NEF files, the range of possible intensities is 16 times finer, so adjustments retain more of the original intensity differences. Furthermore, when JPEG files have their contrast, brightness, or color properties modified, visual artifacts (like single-colored rectangular regions) become visible. This is because the compression process takes into account the intensity of each pixel's channels at the time of compression, and reduces the data so that the image is imperceptibly distorted. When these compressed areas have their intensities modified, the distortions intensify, become perceptible, and are irrecoverable.
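To make the point about retained intensity differences concrete, here is a small sketch under simplifying assumptions: take only the darkest half of the range, brighten it by a factor of two, and count how many distinct 8-bit output levels survive. The function name distinct_levels is invented for this example, and real raw converters do considerably more than a plain multiply.

```shell
#!/bin/sh
# distinct_levels BITS: brighten the darkest half of a BITS-deep channel by 2x,
# reduce the result to 8 bits for display, and count the distinct output values.
# (Hypothetical helper; a deliberately simplified model of a brightness adjustment.)
distinct_levels() {
  bits=$1
  max=$(( (1 << bits) - 1 ))
  half=$(( (max + 1) / 2 ))
  v=0
  while [ "$v" -lt "$half" ]; do
    out=$(( v * 2 ))                 # the brightness adjustment
    echo $(( out >> (bits - 8) ))    # scale down to an 8-bit display value
    v=$(( v + 1 ))
  done | sort -un | wc -l | tr -d ' '
}

echo "8-bit source:  $(distinct_levels 8) distinct output levels"
echo "12-bit source: $(distinct_levels 12) distinct output levels"
```

The 8-bit source ends up with 128 distinct levels spread over the output range (every other value is skipped, which shows up as banding), while the 12-bit source still fills all 256.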

Why is all this important? Because NEF files require special treatment before they can be shown to others or even viewed on the screen for that matter. They allow more post-processing flexibility at the cost of being less accessible. So these NEF files must be transformed into JPEG files.

Each NEF file is about 6MB in size, and I might have 150 such files that fill my 1GB memory card. Processing that much data on my machine takes quite a long time, so this process is usually scripted to run overnight.

Standard naming convention

Before anything happens to the NEF files, they are renamed to a standard naming scheme.

Each file is originally named DSC_NNNN.NEF, where NNNN is a four-digit integer that enumerates each photo in sequence. I rename every photo to the format YYYYMMDD-HHMMSS-NNNN-o.NEF, where YYYYMMDD-HHMMSS is the date and time stamp of the photo and NNNN is carried over from the original image's name. (I have decided that the likelihood of a filename collision is slim enough that this naming scheme is sufficient.) The date and time stamp comes from the file's EXIF metadata. The -o indicates "original".
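The naming scheme can be sketched as a small shell function. This is not the actual renaming script, just a minimal illustration: the function names standard_name and rename_one are invented here, and reading the DateTimeOriginal tag with exiftool -s3 is an assumption about which EXIF field supplies the time stamp.

```shell
#!/bin/sh
# standard_name ORIGINAL_NAME "YYYY:MM:DD HH:MM:SS"
# Build the YYYYMMDD-HHMMSS-NNNN-o.EXT name from the camera's file name
# and an EXIF-style time stamp. (Hypothetical helper, for illustration.)
standard_name() {
  base=${1##*/}                 # drop any leading directory
  num=${base#DSC_}              # strip the DSC_ prefix ...
  num=${num%.*}                 # ... and the extension, leaving NNNN
  ext=${base##*.}
  stamp=$(printf '%s' "$2" | tr -d ':' | tr ' ' '-')
  printf '%s-%s-o.%s\n' "$stamp" "$num" "$ext"
}

# rename_one FILE: look up the time stamp with exiftool and rename the file in place.
# Assumes exiftool is installed and the file carries a DateTimeOriginal tag.
rename_one() {
  stamp=$(exiftool -s3 -DateTimeOriginal "$1") || return 1
  mv -n "$1" "$(dirname "$1")/$(standard_name "$1" "$stamp")"
}

standard_name DSC_0045.NEF "2006:08:29 17:44:00"
```

The last line prints 20060829-174400-0045-o.NEF.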

This renaming is useful because names like DSC_0045.NEF are only relevant within the context of a few hundred other files that were shot with the same memory card. Over the course of an outing, I may shoot several memory cards (storing the results on a portable hard drive) so name conflicts will occur. This is especially true when accumulating photos from many occasions into one place.

Making JPEG files from NEF files

Once a file is renamed, I convert it to JPEG, resize it to 30%, apply a grey border and a black matte with an attribution line like "Photo by Michal Guerquin, 2006" at the bottom, save it as YYYYMMDD-HHMMSS-NNNN-f.JPEG, and send it to Flickr. The -f indicates "Flickr". This is all scripted, so I need only provide the name of the originally named .NEF file and the script takes care of the rest.
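As a rough idea of what the resize-and-matte step might look like with ImageMagick, here is a sketch. It is not the actual script: the function names, the border widths, the white text color, and the .jpg extension are all assumptions chosen for the example.

```shell
#!/bin/sh
# flickr_name FILE: map ...-o.EXT to ...-f.jpg
# (Hypothetical helper; the .jpg extension is an assumption.)
flickr_name() {
  printf '%s\n' "${1%-o.*}-f.jpg"
}

# matte_for_flickr INPUT OUTPUT CAPTION
# Shrink to 30%, add a thin grey border, then a black matte, and write the
# caption in white along the bottom edge. Border sizes are illustrative guesses.
# Assumes ImageMagick's convert is on the PATH.
matte_for_flickr() {
  convert "$1" -resize 30% \
    -bordercolor grey -border 2 \
    -bordercolor black -border 30 \
    -gravity south -fill white -annotate +0+8 "$3" \
    "$2"
}

flickr_name 20060829-174400-0045-o.nef
```

A call such as matte_for_flickr in.jpg "$(flickr_name in-o.jpg)" "Photo by Michal Guerquin, 2006" would then produce the web version.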

The software

All this conversion happens in Linux. I use dcraw to convert NEF files to JPEG files. I use exiftool to read and write EXIF data from NEF files and attach them back to the matching JPEG files, and I use ImageMagick's convert to apply the black matte border and my name at the bottom.

Here's an example shell script that I use for this purpose.

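~/bin/ *.nef # rename file to YYYYMMDD-HHMMSS
~/bin/ *.nef # put it in YYYY-MM-DD directory
for f in 2006-*/*.nef; do
  echo "$f"
  # derive the Flickr JPEG name: .nef becomes .jpg, -o becomes -f
  j=$(echo "$f" | sed -e 's/nef$/jpg/' -e 's/-o\.jpg$/-f.jpg/')
  if [ -e "$j" ]; then
    echo "$j already exists, skipping $f"
  else
    # decode the raw file (camera white balance, half size) and write a JPEG
    ~/bin/dcraw -w -h -c "$f" | convert - "$j" &&
    # copy the EXIF tags from the NEF into the new JPEG
    exiftool -overwrite_original -tagsfromfile "$f" "$j" &&
    ~/bin/ "Photo by Michal Guerquin, 2006" "$j" "$j" &&
    ~/bin/ "$j"
  fi
done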

When I shoot JPEG images instead of NEF, similar processing applies, with a few differences: dcraw is not necessary, jpegtran is used to rotate JPEG files according to their EXIF "orientation" value (dcraw does this automatically when processing NEF files), and the original file names end in JPG instead of NEF.
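The rotation step can be sketched as follows. The mapping from the eight EXIF orientation values to jpegtran transforms is the standard one; the function names are invented for the example, and using exiftool to read and reset the tag is an assumption rather than a description of the actual script.

```shell
#!/bin/sh
# jpegtran_args ORIENTATION: print the jpegtran options that undo the given
# EXIF orientation value (1 means the image is already upright).
# (Hypothetical helper, for illustration.)
jpegtran_args() {
  case "$1" in
    2) echo "-flip horizontal" ;;
    3) echo "-rotate 180" ;;
    4) echo "-flip vertical" ;;
    5) echo "-transpose" ;;
    6) echo "-rotate 90" ;;
    7) echo "-transverse" ;;
    8) echo "-rotate 270" ;;
    *) echo "" ;;
  esac
}

# rotate_by_exif FILE: losslessly rotate FILE in place, then reset the tag so
# viewers do not rotate it a second time. Assumes exiftool and jpegtran exist.
rotate_by_exif() {
  o=$(exiftool -n -s3 -Orientation "$1")
  args=$(jpegtran_args "$o")
  [ -z "$args" ] && return 0
  # $args is left unquoted on purpose so it splits into separate options
  jpegtran -copy all $args -outfile "$1.tmp" "$1" &&
    mv "$1.tmp" "$1" &&
    exiftool -n -overwrite_original -Orientation=1 "$1"
}
```

Because jpegtran works on the compressed data directly, the rotation does not add another round of JPEG loss.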


I've found that large sets of photos taken at one event can be stored in a single directory. Within that directory, I use my filegroup script to put the photos into YYYY-MM-DD/ directories so they are conveniently separated into days. The JPEG and NEF files live alongside each other so I can easily return to the full-size original image should I want to make color adjustments (the camera-provided white balance isn't always the best choice), crops, or prints.
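A minimal sketch of that grouping step, assuming the files already follow the YYYYMMDD-HHMMSS-NNNN naming scheme so the day can be read straight from the file name. This is not the filegroup script itself, and the function names are made up for the example.

```shell
#!/bin/sh
# day_dir FILE: print the YYYY-MM-DD directory name for a file whose name
# starts with YYYYMMDD. (Hypothetical helper, for illustration.)
day_dir() {
  base=${1##*/}
  y=$(printf '%s' "$base" | cut -c1-4)
  m=$(printf '%s' "$base" | cut -c5-6)
  d=$(printf '%s' "$base" | cut -c7-8)
  printf '%s-%s-%s\n' "$y" "$m" "$d"
}

# group_by_day FILE...: move each file into a YYYY-MM-DD/ directory next to it,
# creating the directory if needed.
group_by_day() {
  for f in "$@"; do
    [ -f "$f" ] || continue
    dir=$(dirname "$f")/$(day_dir "$f")
    mkdir -p "$dir" && mv "$f" "$dir/"
  done
}
```

Running group_by_day *.nef *.jpg in an event directory would then sort the NEF and JPEG versions of each photo into the same day directory.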

Photo organization software

I have used Apple Aperture heavily, and enjoy it a lot. Its drawbacks, however, make it an undesirable choice in the long run. It is expensive and proprietary -- if the SQLite database gets corrupted, I have to rely on Apple to help me recover data. The software is slow, even on a 2.0GHz G5 iMac with 2GB of memory. It is also tied to OS X, whereas I use Linux and FreeBSD on my workstation.

I have tried Bibble on the Mac (the Linux version requires a CPU with features that mine does not have), and I really disliked the user interface. It felt incredibly clunky and unfinished. Maybe the Qt widgets look nicer under Linux, but on OS X it was just garish. It was also slower than Aperture at analyzing individual images (a multi-pass mosaic appeared).

I have tried Google's Picasa2 when it came out "for Linux", but it was too slow to handle the thousands of photos I have on my computer.

I have settled for using simple, freely available programs like gqview, xnview, and eog to inspect the resulting JPEG files, and the aforementioned command-line tools to process the NEF and JPG files.

Photo sharing

Photos on my hard drive are useless. To share them, I upload pictures to my Flickr account (using a Python script of my own creation that uses Flickr's public REST interface) and organize them into sets on Flickr. Then I send the URL of the set to friends and that's it. Sometimes I write about places I've been to and accompany the text with photos.


To summarize the above: rename photos to standard names, produce scaled and annotated web-shareable versions in JPEG format, store everything in a YYYY-MM-DD/ directory hierarchy, upload to Flickr, and distribute the URL.

This is, updated 2006-08-29 17:44 EDT

Contact: michalg at domain where domain is (more)