Does Artificial Intelligence mean the end of manual image tagging?
Having worked closely with Third Light customers for over 3 years, our Head of Customer Success, Danny Smith, has seen countless examples of customers struggling to find time to tag files with relevant metadata. We all have responsibilities, and more often than not a task such as tagging falls to the bottom of a to-do list. This is why, in 2018, Third Light started an experiment with Google's Cloud Vision system.
Here, Danny Smith explains what we found, and why using Artificial Intelligence to tag images still needs the human touch.
The idea was to have AI tag images in a Chorus site with metadata, saving the time spent on manual entry of tags. Via the Third Light API, images from a Chorus site are run through Google Vision; the suggested metadata tags are returned and applied to your images automatically. These tags are then searchable on a Chorus site in a separate metadata panel, so you can easily exclude the Vision tags from search results.
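As a rough illustration of the tagging step (not Third Light's actual implementation), one sensible approach is to keep only the suggestions the vision model is confident about before applying them. The sample labels and the 0.75 threshold below are hypothetical:

```python
# Sketch of filtering AI-suggested tags before applying them to an image.
# The labels mimic what a vision API might return as (tag, confidence) pairs;
# the 0.75 threshold is an illustrative choice, not Third Light's setting.

def filter_tags(labels, min_score=0.75):
    """Keep only tag suggestions the model is reasonably confident about."""
    return [label for label, score in labels if score >= min_score]

# Hypothetical suggestions for one agricultural image:
suggested = [("sheep", 0.97), ("tractor", 0.91), ("crop field", 0.83), ("dog", 0.42)]

print(filter_tags(suggested))  # the low-confidence "dog" tag is dropped
```

Thresholding like this trades a few missed tags for far fewer incorrect ones, which matters when tags feed directly into search.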
A number of our customers have taken advantage of the integration. In some cases it has been highly useful and added significant value to the metadata; on other sites it has been less successful. If broad search terms applied to your images would be useful, it's worth checking what this integration can do.
An example of where it has been successful is with one of our agriculture customers. Rather than a human having to go through their library tagging images by hand, Google Vision came back with relevant terms such as "sheep", "tractor" and "crop field". This makes thousands of images that could not previously be found using these terms instantly searchable, which brings significantly more value to a media library, and potentially saved hours of manual tagging.
So, no more tagging? Not quite...
On the flip side, a car rental company found that having thousands of their images tagged with "car" wasn't overly useful! In some cases Vision did identify the model of car by spotting the badge, but not the specifications our customer needed to search by. The level of detail required just wasn't in the tags that were returned. It did offer some useful tags, such as "mountains" for an image with mountains in the background, but overall little value was added to the metadata and the exercise wasn't worthwhile.
So why isn't there a solution that gives the car rental company more specific tags? The simple answer is that Google has a vast pool of images that allows its artificial intelligence to identify what might be in an image, but even then it can only be as specific as "car" or perhaps "BMW". Google holds billions of images featuring almost anything you could think of. Under the hood, each pixel in a digital image is a binary number, a series of 1s and 0s, that determines the colour of the pixel. From its pool of images, Google's systems can learn to recognise the patterns of 1s and 0s that correspond to different objects, and they can be trained to suggest a tag for anything that appears often enough across the images they've trawled on the web.
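To make the "pixels are binary numbers" point concrete: a typical RGB pixel stores its colour as three 8-bit values, one each for red, green and blue. A minimal sketch (the colour value is just an example):

```python
# One pixel's colour as stored in an image: an (R, G, B) triple of 8-bit values.
pixel = (255, 128, 0)  # a shade of orange

# Under the hood, each channel is an 8-bit binary number.
binary = [format(channel, "08b") for channel in pixel]
print(binary)  # ['11111111', '10000000', '00000000']
```

It is patterns across millions of these numbers, not any understanding of "sheep" or "cars", that a vision model learns to associate with a tag.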
There's also the problem of incorrect tags. Some tags returned suggested something was in the image that definitely wasn't, so the tags need a human sanity check and can't be relied upon 100%. One example was an image of people at lunch, which was tagged "business", we think because some of the people pictured were wearing business-like attire. There was also an image of some eggs tagged as "potato"! So while the integration can add a lot of relevant tags, a check is still needed afterwards to remove anything incorrect. It's worth noting that the time needed to fix Google's mistakes could still be much less than the time needed to add all the correct keywords in the first place (and humans make mistakes too!).
A computer can only be as clever as the data and the program behind it. There are also a number of shortcuts we can talk you through to help with tagging, such as folder-level metadata, bulk tagging, smart collections and workflow. The Google Vision integration can add a lot of value, but it's not quite ready to take the job off our hands completely... yet!
The Google Vision integration is available as a module in Chorus for £500 per annum – please contact the Customer Success team for more information. Just firstname.lastname@example.org.