Choosing The Right Amount Of Metadata
Building a searchable library of digital media is an everyday problem for businesses of all sizes. We can see that each extra file we store has some value, but we need to be able to connect that value to a future user by making the entire store of media searchable. The problem goes as follows:
- Digital media is prolific. Digital technology is accelerating the quantity of data in circulation and digital marketing has an unending appetite for new and engaging content.
- We are naturally keen to ensure that this digital media is searchable to encourage reuse and streamline our work, but we need textual descriptions to unlock this potential. This text simply does not exist by default.
- To succeed, the tagging process must be efficient and must be proportionate to the value of the files. In other words, it must provide a return on investment and not be unduly time-consuming or centered on one person.
As the author of the metadata, you sit between the human being and the database, and your influence is vital. You can pre-empt your users if you put yourself in their shoes, and then shape your metadata so that the database can be effective. This is your first opportunity to save time: don’t enter too much metadata, just enough to catch the majority of the meaning.
If each extra keyword adds just half as much meaning as the one before it, the curve of the extra worth of each keyword falls off steeply as more keywords are added.
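To put rough numbers on the halving assumption, here is a minimal Python sketch. The function names, the starting value of 1.0 and the ratio of 0.5 are illustrative assumptions, not measurements; the point is only to show how quickly the cumulative share of meaning levels off.

```python
def keyword_values(n_keywords, first_value=1.0, ratio=0.5):
    """Extra worth of each keyword, assuming each adds `ratio` times the last."""
    return [first_value * ratio ** i for i in range(n_keywords)]


def cumulative_share(n_keywords, ratio=0.5):
    """Fraction of the total achievable meaning captured by the first n keywords.

    The geometric series sums to (1 - ratio**n) / (1 - ratio), and its limit
    is 1 / (1 - ratio), so the share captured is simply 1 - ratio**n.
    """
    return 1.0 - ratio ** n_keywords


# Print the share of meaning captured after 1, 2, 3, 5 and 10 keywords.
for n in (1, 2, 3, 5, 10):
    print(f"{n:2d} keywords: {cumulative_share(n):.1%} of the meaning")
```

Under this assumption the first three keywords already capture 87.5% of the meaning, and by ten keywords the share is above 99.9%, so anything further adds almost nothing.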
This shape of curve should make you think "diminishing returns". Entering too much metadata takes time. In practice, I’ve found that more than 10 keywords is usually unnecessary. You can further simplify your job by using other, more flexible input types like trees of checkboxes.
You can save a great deal of time by studying the behaviour of your users. For this reason, most powerful digital library solutions will allow you to browse a history of the actual search terms used.
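As a rough illustration of what you might do with such a history, the Python sketch below counts the most frequent search terms and lists those that match none of your existing keywords. It assumes the history can be exported as a plain list of query strings; the sample queries, the keyword set and the function names are all hypothetical, and a real library product will have its own export format.

```python
from collections import Counter


def top_search_terms(queries, n=10):
    """Return the n most common individual terms across all queries."""
    counts = Counter()
    for query in queries:
        # Lower-case and split on whitespace so "Logo" and "logo" count together.
        counts.update(query.lower().split())
    return counts.most_common(n)


def terms_without_keywords(queries, keywords):
    """Return searched terms that match none of the existing keywords."""
    known = {keyword.lower() for keyword in keywords}
    searched = {term for query in queries for term in query.lower().split()}
    return sorted(searched - known)


# Hypothetical exported search history and keyword vocabulary.
history = ["Logo blue", "logo", "team photo", "photo office", "banner"]
vocabulary = ["logo", "photo", "office", "team"]

print(top_search_terms(history, n=3))
print(terms_without_keywords(history, vocabulary))
```

Terms that are searched often but have no matching keyword are the ones worth adding first; terms nobody ever searches for are candidates to stop entering.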