Using RDFa to Provide License Information to Google Image Search

14 Aug 2009

About a month ago, Google made it possible for anyone searching for images to filter the results based on the license type (see Find Creative Commons images with Image Search). But it was only in the last couple of days that they spelt out exactly how you can let the Google crawlers know what licenses you are making your own images available under, in a video by Peter Linsley, from Google Image Search -- which is to use RDFa. Peter's created a great tutorial, but it's actually possible to mark-up license information even more simply than he shows in his video.

Introduction

The tutorial begins with the code for an image tag:

<img src="image.jpg" />

Peter then suggests adding a link to the relevant license, as follows:

<a rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/"
 >Creative Commons Attribution-Share Alike 3.0</a>

Peter rightly points out that with this mark-up, Google has no way to tell whether the license applies to the page or the image -- a problem that would be compounded if there were a number of images on the same page. However, he shows how this can be resolved by adding a container for the mark-up, and placing an @about value onto that container:

<div about="image.jpg">
  <img src="image.jpg">
  <a rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/"
   >Creative Commons Attribution-Share Alike 3.0</a>
</div>

There's nothing wrong with Peter's approach and it will work in all situations. But there will be many occasions where a slightly easier solution is possible.

Placing @about on the anchor

This first scenario assumes that we still want the clickable link, but not the containing div. In that case we need only place the @about value onto the link itself, to achieve the same effect as having the container:

<img src="image.jpg">
<a about="image.jpg" rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/"
 >Creative Commons Attribution-Share Alike 3.0</a>

The nice thing about this technique is that the clickable link that refers to the license doesn't need to be directly next to the image, or even close on the page (which it does with the container approach), which gives publishers a little more flexibility when it comes to the document's layout.

Images and @rel

If we don't want the clickable link that points to the license, then we can place the @rel="license" directly onto the image:

<img src="image.jpg" rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/">

This is a great solution for software tools such as editors or content management systems, because it means that licensing information can be added to the image, right at the point where the image is embedded into the mark-up. Not only that, the image reference and the license are combined in a self- contained unit that can move around the document with no side-effects.

Using the same license for many images

Note that if the same license is to be applied to many images, then the containing element technique becomes useful again; we can specify a reference to the license on the container, and then use @rev to make the connection with the images:

<div about="http://creativecommons.org/licenses/by-sa/3.0/" rev="license">
  <img src="image1.jpg">
  <img src="image2.jpg">
  <img src="image3.jpg">
  <img src="image4.jpg">
</div>

This gives Google exactly the same information as this:

<img src="image1.jpg" rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/">
<img src="image2.jpg" rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/">
<img src="image3.jpg" rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/">
<img src="image4.jpg" rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/">

License information in the head

One final approach worth mentioning for the sake of completeness, is to place all of the license information into the head of the document:

<head>
  <link about="image1.jpg" rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/">
  <link about="image2.jpg" rel="license" href="http://creativecommons.org/licenses/by/3.0/">
  ...
</head>

This may be useful in some CMS systems, where all of the licensing information for any element that appears on a particular page can simply be 'dumped' into the page in one go.

Conclusion

It's great to see Google improving the user's search experience by surfacing license information in this way, and it's great to see that RDFa is helping publishers to provide this information to Google's crawlers, however they choose to mark it up.