Unsupervised, specificity-guided optimization of image captioning models to encourage meaningful diversity in generated captions.