SEO-friendly image names with sorl-thumbnail and Django
When working with Django, I find myself using sorl-thumbnail more and more. It is a quick and easy way of generating image thumbnails for your application. As a matter of fact, I use it on almost every img tag in my templates to ensure that user-uploaded images fit the page layout. However, the URLs it generates look like http://media.example.com/cache/9f/0c/9f0c98435b16bcea2b9ead87.jpg. Besides being ugly, such URLs are bad for SEO.
From the Google image publishing guidelines I highlight the following:
- "The filename can give Google clues about the subject matter of the image. Try to make your filename a good description of the subject matter of the image. For example, my-new-black-kitten.jpg is a lot more informative than IMG00023.JPG. Descriptive filenames can also be useful to users: If we're unable to find suitable text in the page on which we found the image, we'll use the filename as the image's snippet in our search results."
- "Consider structuring your directories so that similar images are saved together. For example, you might have one directory for thumbnails and another for full-size images; or you could create separate directories for each category of images (for example, you could create separate directories for Hawaii, Ghana, and Ireland under your Travel directory)."
sorl-thumbnail is nicely implemented and comes with useful documentation. As stated in the docs, you can use the THUMBNAIL_PREFIX setting to store your thumbnails in a folder other than cache/. This changes the thumbnail URL from the cache/ form shown above to http://media.example.com/images/9f/0c/9f0c98435.jpg. This is better, but the SEO-unfriendly file names are still there. Finally, I noticed the THUMBNAIL_BACKEND setting, which allows us to override the default way of generating filenames. After reading the sorl-thumbnail source code I came up with the following simple class:
    import os
    import re

    from django.conf import settings
    from django.template.defaultfilters import slugify
    from sorl.thumbnail.base import ThumbnailBackend


    class SEOThumbnailBackend(ThumbnailBackend):
        """
        Custom backend for SEO-friendly thumbnail file names/urls.
        """

        def _get_thumbnail_filename(self, source, geometry_string, options):
            """
            Computes the destination filename.
            """
            # strip the storage root so that only the relative path remains;
            # re.escape keeps special characters in the root from being read as regex syntax
            root = re.escape(source.storage.path(''))
            relative = re.sub(r'^%s%s?' % (root, re.escape(os.sep)), '', source.name)
            split_path = relative.split(os.sep)
            split_path.insert(-1, geometry_string)

            # attempt to slugify the filename to make it SEO-friendly
            split_name = split_path[-1].split('.')
            try:
                split_path[-1] = '%s.%s' % (slugify('.'.join(split_name[:-1])), split_name[-1])
            except Exception:
                # on fail keep the original filename
                pass

            path = os.sep.join(split_path)

            # if the path already starts with THUMBNAIL_PREFIX do not concatenate the PREFIX
            # this way we avoid ending up with a url like /images/images/120x120/my.png
            if not path.startswith(settings.THUMBNAIL_PREFIX):
                return '%s%s' % (settings.THUMBNAIL_PREFIX, path)
            return path
The _get_thumbnail_filename() method calculates the relative file path and inserts the target dimensions into it. For example, say we want to create a 120x120 thumbnail for the file '/home/django/project/media/polls/Question for the Big prize.jpg'. Based on the storage root path we calculate the relative path (e.g. 'polls/Question for the Big prize.jpg'), insert the dimensions into the path so that each size gets its own distinct path (e.g. 'polls/120x120/Question for the Big prize.jpg'), slugify the file name to make it SEO-friendly (e.g. 'polls/120x120/question-for-the-big-prize.jpg') and finally prepend the THUMBNAIL_PREFIX. The returned path should be 'images/polls/120x120/question-for-the-big-prize.jpg'.
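To make the steps easier to follow, here is a minimal sketch that reproduces them on plain strings, without Django or sorl-thumbnail. The names simple_slugify and seo_thumbnail_path are made up for this illustration, and simple_slugify is only a rough ASCII stand-in for Django's slugify, so treat it as an approximation of the backend rather than the backend itself.

```python
import posixpath
import re


def simple_slugify(value):
    """Rough ASCII-only stand-in for Django's slugify (an assumption, not the real filter)."""
    value = re.sub(r'[^\w\s-]', '', value.lower())
    return re.sub(r'[-\s]+', '-', value).strip('-_')


def seo_thumbnail_path(source_path, storage_root, geometry, prefix):
    """Mirror the steps described above using plain strings (hypothetical helper)."""
    # 1. make the path relative to the storage root
    relative = posixpath.relpath(source_path, storage_root)
    parts = relative.split('/')
    # 2. insert the geometry just before the file name
    parts.insert(-1, geometry)
    # 3. slugify the file name, keeping the extension untouched
    stem, ext = posixpath.splitext(parts[-1])
    parts[-1] = simple_slugify(stem) + ext
    path = '/'.join(parts)
    # 4. prepend the prefix unless the path already starts with it
    return path if path.startswith(prefix) else prefix + path


result = seo_thumbnail_path(
    '/home/django/project/media/polls/Question for the Big prize.jpg',
    '/home/django/project/media',
    '120x120',
    'images/',
)
print(result)  # images/polls/120x120/question-for-the-big-prize.jpg
```

The sketch uses posixpath so that it behaves the same on every platform; the real backend works with os.sep and the storage object instead.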
Note: Using slugify means you must not upload images with almost identical filenames (e.g. 'image 1 1.jpg' and 'image 1-1.jpg'), because both names slugify to the same value and you'll end up using the same thumbnail for both files! If you really need to have such images (not recommended anyway), you can just remove the slugify part from the above code.
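If you want to check an existing set of uploads for this problem, something along the following lines might help. It is only a sketch: find_slug_collisions and simple_slugify are hypothetical helpers written for this post, and simple_slugify merely approximates Django's slugify for ASCII names.

```python
import re
from collections import defaultdict


def simple_slugify(value):
    """Rough ASCII-only stand-in for Django's slugify (an assumption, not the real filter)."""
    value = re.sub(r'[^\w\s-]', '', value.lower())
    return re.sub(r'[-\s]+', '-', value).strip('-_')


def find_slug_collisions(filenames):
    """Group file names that end up with the same slugified name."""
    groups = defaultdict(list)
    for name in filenames:
        # split off the extension the same way the backend does (last dot wins)
        stem, _, ext = name.rpartition('.')
        groups['%s.%s' % (simple_slugify(stem), ext)].append(name)
    # keep only the slugs shared by more than one original file name
    return {slug: names for slug, names in groups.items() if len(names) > 1}


collisions = find_slug_collisions(['image 1 1.jpg', 'image 1-1.jpg', 'other.jpg'])
print(collisions)  # {'image-1-1.jpg': ['image 1 1.jpg', 'image 1-1.jpg']}
```

Any group it reports would share a single thumbnail per size under the backend above, so those files would need renaming first.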
Finally, we must enable sorl-thumbnail to use our backend:
    THUMBNAIL_BACKEND = 'my_app.my_module.SEOThumbnailBackend'
    THUMBNAIL_PREFIX = 'images/'