Biodiversity Data Journal :
Software description
|
EXIF Custom: Automatic image metadata extraction for Scratchpads and Drupal
Corresponding author:
Academic editor: Jörg Holetschek
Received: 01 Aug 2013 | Accepted: 11 Sep 2013 | Published: 16 Sep 2013
© 2013 Ed Baker
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Baker E (2013) EXIF Custom: Automatic image metadata extraction for Scratchpads and Drupal. Biodiversity Data Journal 1: e973. https://doi.org/10.3897/BDJ.1.e973
|
Many institutions and individuals use embedded metadata to aid in the management of their image collections. Many deskop image management solutions such as Adobe Bridge and online tools such as Flickr also make use of embedded metadata to describe, categorise and license images. Until now Scratchpads (a data management system and virtual research environment for biodiversity) have not made use of these metadata, and users have had to manually re-enter this information if they have wanted to display it on their Scratchpad site. The Drupal described here allows users to map metadata embedded in their images to the associated field in the Scratchpads image form using one or more customised mappings. The module works seamlessly with the bulk image uploader used on Scratchpads and it is therefore possible to upload hundreds of images easily with automatic metadata (EXIF, XMP and IPTC) extraction and mapping.
Scratchpads, image metadata, Drupal, EXIF, XMP, IPTC
The use of embedded image metadata is becoming widespread in the biodiversity informatics community (e.g.
The eMonocot project (http://about.e-monocot.org) makes use of the Scratchpads (
There are three widespread image metadata formats that can be handled by this module. A subset of the EXIF standard (
An existing Drupal module, Exif (https://drupal.org/project/exif), provides a mechanism for displaying embedded image metadata on Drupal nodes, but does not provide a mechanism for mapping the metadata into fields. The import of embedded metadata into Scratchpads/Drupal fields is a requirement of the eMonocot project and is useful for the wider Scratchpads community as it allows for these data to be easily used by other Drupal modules (e.g. Views - https://drupal.org/project/views) and in other Scratchpads-specific functions such as our on-going work on implementing the ability to export data via DarwinCore Archives (GBIF DarwinCore Archives). There is a comparison of these two modules (and potentially other similar Drupal modules) at https://drupal.org/node/1842686.
The source code of this module is hosted on https://drupal.org. All content on the Drupal.org itself is copyrighted by its original contributors, and is licensed under the Creative Commons Attribution-ShareAlike license 2.0 and is also available under the GPL version 2 or later.
EXIF, XMP and IPTC are the three image metadata standards in widespread use and the eMonocot project (http://e-monocot.org) (which uses Scratchpads) makes use of all three systems. Due to the flexibility of these systems, particularly of XMP, it is possible that the same field can be defined in more than one of these standards and there is no guarantee that all Scratchpads users will use the same image metadata field for the same data. Scratchpads as a system (and also Drupal on which Scratchpads are built) are highly customisable, and users may create their own custom fields to add metadata to image files. Due to this multiplicity of possible input formats it is not desirable for the module described here to define a mapping of embedded image metadata fields to Scratchpads/Drupal fields (adding fields to images in Drupal requires the File Entity module - https://drupal.org/project/file_entity). Users may also want to upload images from a number of different sources that make use of different subsets of the three image metadata standards supported. For these reasons this module allows users to define any number of named mappings between embedded image metadata and the Scratchpads/Drupal image fields. It is possible for those with the necessary privileges on the system to define the default mapping used by the site, and for individual users to override this with their own choice of User Default mapping (Fig.
The configuration pages for this module can be found under 'Custom Exif Mappings' in the standard Scratchpads/Drupal administration interface. The 'Settings' tab allows those with the required privileges to set the site's default mapping, and to turn on or off the automatic saving of embedded image metadata to the image fields of the site when an image is uploaded. The 'User Settings' tab provides an interface for individual users to override the site's default mapping with a mapping of their choice. New mappings can be created through the 'New Exif Mapping' tab. The first step in this process is to name the new mapping and upload a sample image that contains all of the metadata fields that you want to map to Drupal/Scratchpads image fields. The module extracts all of the embedded metadata fields that have values assigned, and provides a form that displays the name of the embedded metadata field, an example value from the sample image, and a drop-down list of Scratchpads/Drupal fields that can be mapped to (Fig.
Mapping embedded image metadata fields to Scratchpads/Drupal image fields. The image uploaded is a standard testing image from the EXIF Toolkit available from http://iptc.org. The Scratchpads project recommends the use of Creative Commons open licences.
Once one or more mappings have been created, and the Site Default and/or User Default mapping has been set, the Scratchpads/Drupal fields will automatically be populated from the embedded image metadata when a new file entity is created - either through individual entity creation or through batch entity creation (using a module such as Plupload - https://drupal.org/project/plupload) to upload multiple images at once.
This module can be enabled on Scratchpads sites via the Tools page in Admin > Structure. It can also be downloaded by maintainers of other Drupal sites from Drupal.org and enabled in the same way as any other module.
The module is potentially useful for anybody who wants to extract embedded metadata from uploaded images and use it in fields on a Drupal site. By making metadata available in fields, the metadata can be exposed to other third-party modules such as Views (https://drupal.org/project/views), allowing for many display options and filtering opportunities.
The eMonocot content team based at the Royal Botanic Gardens Kew make use of this module on the eMonocot Scratchpads to bulk upload images which have had their metadata curated using Adobe Bridge. The module extracts the metadata embedded within the image into Drupal fields which allows for both display of this data on the Scratchpad and also in the DarwinCore Archive file that is used to contribute Scratchpad data to the eMonocot portal and also the Encyclopedia of Life. This workflow prevents images being separated from their accompanying metadata (through metadata embedding) and also saves time and effort - previously in Drupal performing the task of importing metadata would have required either copying and pasting data or spreadsheet manipulation using a number of different import tools.
This module requires the Drupal File Entity (https://drupal.org/project/file_entity) module.
The module will work with any fields associated with image file entities in Drupal. The Scratchpads Audubon Core module (https://git.scratchpads.eu/v/scratchpads-2.0.git/tree/HEAD:/sites/all/modules/custom/scratchpads/scratchpads_audubon_core) creates a set of fields that are compliant with the current version of Audubon Core (a standard metadata schema for images of biological specimens and observations). With the EXIF Custom module it is possible to map embedded image metadata directly to these fields when the images are uploaded. The use of standard Drupal fields means that it is also possible to expose embedded metadata to external services, notably via DarwinCore Archives using the Scratchpads DarwinCore Archive Export module (https://git.scratchpads.eu/v/scratchpads-2.0.git/tree/HEAD:/sites/all/modules/custom/dwca_export).
Thanks to Simon Rycroft and Alice Heaton of the Scratchpads team at the Natural History Museum, London for their time and assistance and to the eMonocot Content Team at the Royal Botanic Gardens Kew for their work finding bugs and suggesting improvements to early versions of this module.