Image description 2004

Here is a summary of my trials on image description. Some things have been done, but several issues remain. Comments and feedback are welcome.

summary of issues:

A simple description of an image

A simple description of an image can be done with bibliographic terms such as Dublin Core. Some technical terms e.g. Exif will add more detailed information.

(Example)

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:image="http://jibbering.com/vocabs/image/#"
  xmlns:exif="http://www.kanzaki.com/ns/exif#"
>
 <rdf:Description rdf:about="http://www.kanzaki.com/works/2003/imagedesc/031114_2152.jpg">
  <dc:title>With DanBri in Tokyo</dc:title>
  <dc:description>Went to a small 'Yakitori' house with danbri
  and other W3C members after W3C Day 2003</dc:description>
  <dc:date>2003-11-14T21:52:45+09:00</dc:date>
  <dc:creator>Masahide Kanzaki</dc:creator>
  <image:height>640</image:height>
  <image:width>480</image:width>
  <exif:make>KDDI-CA</exif:make>
  <exif:model>A5401CA</exif:model>
  <exif:exposureTime>0.2598</exif:exposureTime>
  <exif:exposureMode>Auto exposure</exif:exposureMode>
  <exif:whiteBalance>Auto white balance</exif:whiteBalance>
 </rdf:Description>
</rdf:RDF>

Issues on Exif vocab

There are several issues on designing an experimental RDF Schema of Exif vocabulary.

  • Data format: native Exif uses its own format for some data. For example, the date/time format is something like '2003:11:14 21:52:45'. It also lacks time zone information. My approach is to convert it to W3CDTF/ISO8601 format.

  • Enumeration: some data values are stored as enumerated codes (1,2,3...) whose meanings are defined in the specification. For example, the value '0' for exposureMode means 'Auto exposure'. It seems better to use 'Auto exposure' instead of '0' as the value of the RDF property 'exif:exposureMode', but this would require specifying a controlled vocabulary.

  • Structure: Exif has a sort of hierarchical directory structure, and an RDF schema can reflect this structure. But it seems much easier to use these properties without structure, i.e. not like:

    (Example)

     <rdf:Description ...>
      ...
      <exif:exifdata>
       <exif:IFD>
        <exif:make>KDDI-CA</exif:make>
        ...
        <exif:exif_IFD_Pointer>
         <exif:IFD>
          <exif:exposureTime>0.2598</exif:exposureTime>
          ...
         </exif:IFD>
        </exif:exif_IFD_Pointer>
       </exif:IFD>
      </exif:exifdata>
      ...
     </rdf:Description>
    

    but simply:

    (Example)

     <rdf:Description ...>
      ...
      <exif:make>KDDI-CA</exif:make>
      <exif:exposureTime>0.2598</exif:exposureTime>
      ...
     </rdf:Description>
    

    I'd define the domain of most Exif properties as 'foaf:Image', or leave the domain unspecified.
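
The first two conversions above (date format and enumeration) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the UTC offset must be supplied by the caller because Exif does not record it, and the label table follows the ExposureMode values listed in the Exif 2.2 specification.

```python
from datetime import datetime, timedelta, timezone

# Labels for the ExposureMode tag as listed in the Exif 2.2 specification.
EXPOSURE_MODE_LABELS = {
    0: "Auto exposure",
    1: "Manual exposure",
    2: "Auto bracket",
}


def exif_datetime_to_w3cdtf(value, utc_offset_hours):
    """Convert an Exif 'YYYY:MM:DD HH:MM:SS' string to W3CDTF.

    Exif stores no time zone, so the offset is an input of this sketch
    (hypothetical parameter), not something read from the image.
    """
    naive = datetime.strptime(value, "%Y:%m:%d %H:%M:%S")
    tz = timezone(timedelta(hours=utc_offset_hours))
    return naive.replace(tzinfo=tz).isoformat()


def exposure_mode_label(code):
    """Map a numeric Exif code to its label; unknown codes pass through as text."""
    return EXPOSURE_MODE_LABELS.get(code, str(code))


print(exif_datetime_to_w3cdtf("2003:11:14 21:52:45", 9))
print(exposure_mode_label(0))
```

Passing unknown codes through unchanged is a design choice of the sketch; a stricter converter could reject them, which is where a controlled vocabulary would help.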

Images with geo info

An image with GPS data can be described with the geo: vocabulary. Note that the GPS lat/long gives the location of the camera, not that of the subject of the picture.

[close-up of two people]

For the close-up photo, the difference is not significant. It can be described like:

(Example)

 <rdf:Description rdf:about="http://www.kanzaki.com/works/2003/imagedesc/031114_2152.jpg">
  <dc:title>With DanBri in Tokyo</dc:title>
  ...
  <foaf:topic rdf:parseType="Resource">
   <geo:lat>35.647488888889</geo:lat>
   <geo:long>139.73964166667</geo:long>
  </foaf:topic>
 </rdf:Description>
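
Exif records GPS latitude and longitude as degrees, minutes and seconds with a separate N/S or E/W reference, whereas geo:lat and geo:long take decimal degrees. A minimal sketch of that conversion follows. The function name and the input values are hypothetical; the inputs were chosen only because they round to the coordinates used in the example above, not taken from the actual image.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus an N/S/E/W reference to decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # Southern latitudes and western longitudes are negative in decimal notation.
    return -value if ref in ("S", "W") else value


# Hypothetical input values (not read from the photo itself).
lat = dms_to_decimal(35, 38, 50.96, "N")
lon = dms_to_decimal(139, 44, 22.71, "E")
print(round(lat, 6), round(lon, 6))
```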

[viewing Mount Fuji from distant parking area]

However, it is not so simple for a long-distance shot. We need some new vocabulary to describe 'how the picture was taken'. I've been experimenting with a small vocabulary to describe those different attributes.

(Example)

 <rdf:Description rdf:about="http://www.kanzaki.com/.../031229_1639.jpg">
  <dc:title>Viewing Mt. Fuji from a highway parking area</dc:title>
  ...
  <dpd:generated>
   <dpd:GenerationEvent>
    <geo:lat>35.647488888889</geo:lat>
    <geo:long>139.73964166667</geo:long>
   </dpd:GenerationEvent>
  </dpd:generated>
 </rdf:Description>

The vocabulary defines some other properties such as dpd:edited, dpd:published etc. See the Digital Picture Description vocabulary for details.

Libby's proposal of 'foaf:creationEvent' is another possibility.

(Example)

 <rdf:Description rdf:about="http://www.kanzaki.com/.../031229_1639.jpg">
  <dc:title>Viewing Mt. Fuji from a highway parking area</dc:title>
  ...
  <foaf:creationEvent rdf:parseType="Resource">
   <geo:lat>35.647488888889</geo:lat>
   <geo:long>139.73964166667</geo:long>
   <ical:date>2003-12-29T16:39+09:00</ical:date>
  </foaf:creationEvent>
 </rdf:Description>

Image collection and RSS

A collection of image metadata will be published using RSS (see Image description with FOAF and RSS -- mainly in Japanese, with some English summary). If the metadata has geo info, it can make an interesting presentation on a map.

Annotation of regions of an image

As discussed last August, a region of an image can be annotated with Jim Ley's vocabulary.

[A stage rehearsal]

(Example)

<image:hasPart>
 <image:Rectangle>
  <image:points>1,177 161,331</image:points>
  <dc:title>Percussion part</dc:title>
  <dc:description>In this concert we used many percussions</dc:description>
 </image:Rectangle>
</image:hasPart>

One important point of this vocabulary is that the value of 'image:points' can be easily converted to SVG or XHTML area maps. If the shape is limited to a rectangle, some interesting presentations will be possible with XSLT/CSS (and maybe JavaScript). Note: the bounding rectangle can be easily calculated from Polygon coordinates, so this restriction may not be necessary.
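
As a rough illustration of that conversion, the following Python sketch turns an 'image:points' value into the coords of an XHTML area element and into an SVG rect, using the bounding rectangle so that polygon input also works. It assumes, from the examples here, that the value is a whitespace-separated list of integer x,y pairs and that a rectangle is given by two opposite corners; all function names are hypothetical.

```python
def parse_points(points):
    """Parse an image:points value such as '1,177 161,331' into (x, y) tuples."""
    return [tuple(int(n) for n in pair.split(",")) for pair in points.split()]


def bounding_rectangle(coords):
    """Return (left, top, right, bottom) enclosing all the given points."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def to_area_coords(points):
    """Build the coords attribute value for an XHTML <area shape="rect">."""
    return ",".join(str(n) for n in bounding_rectangle(parse_points(points)))


def to_svg_rect(points):
    """Build an SVG <rect> element covering the same region."""
    left, top, right, bottom = bounding_rectangle(parse_points(points))
    return '<rect x="%d" y="%d" width="%d" height="%d"/>' % (
        left, top, right - left, bottom - top)


print(to_area_coords("1,177 161,331"))
print(to_svg_rect("1,177 161,331"))
```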

So far, so good. But how about this one?

[Four major brass instruments in an orchestra: horn, trumpet, trombone and tuba]

This picture presents instruments, so one would naturally expect to describe the instruments themselves rather than 'the region'. We need to add one more term like 'image:depicts'.

(Example)

<image:hasPart>
 <image:Rectangle>
  <image:points>0,0 220,156</image:points>
  <dc:title>Horn photo</dc:title>
  <image:depicts>
   <wn:Horn-9>
    <dc:description>a brass musical instrument consisting of a 
    conical tube that is coiled into a spiral and played by means of
    valves</dc:description>
   </wn:Horn-9>
  </image:depicts>
 </image:Rectangle>
</image:hasPart>

Jim is considering adding this term to his vocabulary. There is also another possible term, foaf:regionDepicts, which was discussed in Aug-Oct 2003 but has not been defined yet.

A collection of image annotation examples: