How can Google show you pictures of Nepal when you search with the keyword "Nepal"?
It's the tag: the words attached to the image.
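To make the tag idea concrete, here is a minimal sketch of a keyword lookup over tagged images. It is only an illustration, not how Google actually does it; the file names, the tags, and the helper names `build_tag_index` and `search` are all made up for the example.

```python
from collections import defaultdict


def build_tag_index(image_tags):
    """Map each lowercased tag to the set of image names that carry it."""
    index = defaultdict(set)
    for image, tags in image_tags.items():
        for tag in tags:
            index[tag.lower()].add(image)
    return index


def search(index, keyword):
    """Return the images tagged with the keyword, in alphabetical order."""
    return sorted(index.get(keyword.lower(), set()))


# Hypothetical images and tags, invented for illustration.
image_tags = {
    "DSC13434.jpg": ["Nepal", "mountain"],
    "DSC20001.jpg": ["chair"],
    "IMG_0042.jpg": ["Nepal", "temple"],
}

index = build_tag_index(image_tags)
print(search(index, "nepal"))  # ['DSC13434.jpg', 'IMG_0042.jpg']
```

The search never looks at the picture itself, only at the words someone attached to it.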
Another way is for Google to learn from clicks: when many people who search for a keyword click on or prefer a particular image, named DSC13434 (say), it can be inferred that the picture must have something to do with that keyword.
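The click idea can be sketched in a few lines as well. This is a rough illustration under my own assumptions, not Google's real method: the click log is invented, and the function name `infer_keywords` and the `min_clicks` threshold are hypothetical.

```python
from collections import Counter, defaultdict


def infer_keywords(click_log, min_clicks=3):
    """Guess keywords for each image from (keyword, clicked_image) pairs.

    A keyword is attached to an image only if at least `min_clicks`
    searches for that keyword ended in a click on that image.
    """
    counts = defaultdict(Counter)
    for keyword, image in click_log:
        counts[image][keyword.lower()] += 1
    return {
        image: sorted(k for k, n in keyword_counts.items() if n >= min_clicks)
        for image, keyword_counts in counts.items()
    }


# Hypothetical click log: four "Nepal" searches led to DSC13434,
# plus one stray click each that should not count as evidence.
click_log = (
    [("Nepal", "DSC13434")] * 4
    + [("chair", "DSC13434")]
    + [("Nepal", "DSC99999")]
)

inferred = infer_keywords(click_log)
print(inferred["DSC13434"])  # ['nepal']
```

The threshold is there so that one accidental click does not tag an image; a single click on DSC99999 is not enough to call it a picture of Nepal.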
The problem is, how do these tags get attached in the first place?
Think of a day when images can be characterized by looking into the "data" of the image itself.
If it becomes possible to tell whether it is a picture of a chair or of a table, that will be a huge success.
It's also a huge amount of work.
(And that day is not so far off; I read an article in New Scientist and came up with this thought.)
For now, the trick is to attach the appropriate words to the image.
For example, search with the keyword "Nepali pop" on YouTube to see this for yourself.