An exploration into the possibility of generating multi-sentence image descriptions by leveraging the latent dependencies between visual concepts in an image with their textual counterparts