Generate Meaningful Captions for Images with Attention Models


Image captioning is the task of generating a textual description for a given image. Caption generation involves two tasks:

  1. Understanding the content of the image.
  2. Turning this understanding into a meaningful sentence describing the image.

Hence, it requires techniques from both computer vision and natural language processing. Image captioning has many use cases, including generating captions for Google image search and live video surveillance, as well as helping visually impaired people get information about their surroundings. Watch this wonderful video by Microsoft here.
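
To make the two tasks concrete, here is a minimal toy sketch of how they fit together: an encoder turns the image into one feature vector per region, and a decoder attends over those regions while emitting words. Everything in it is an assumption for illustration only: the tiny vocabulary, the function names (`encode_image`, `attend`, `generate_caption`), the patch-averaging "encoder", and the random untrained weights are all hypothetical. A real system would use a trained CNN and a trained recurrent decoder that also feeds back the previously generated word, so the words produced here are meaningless.

```python
import numpy as np

# Hypothetical toy vocabulary; a real model learns one from its training captions.
VOCAB = ["<end>", "a", "dog", "on", "grass"]

def encode_image(image, grid=2):
    """Task 1 (understanding): one feature vector per image region.

    Stand-in for a CNN: each region is summarized by its mean color.
    """
    h, w, _ = image.shape
    ph, pw = h // grid, w // grid
    feats = [image[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw].mean(axis=(0, 1))
             for i in range(grid) for j in range(grid)]
    return np.stack(feats)  # shape: (grid * grid, channels)

def softmax(x):
    e = np.exp(x - x.max())  # subtract the max for numerical stability
    return e / e.sum()

def attend(features, state, W_att):
    """Soft attention: weight each region by its relevance to the decoder state."""
    scores = features @ W_att @ state   # shape: (regions,)
    weights = softmax(scores)           # non-negative, sums to 1
    context = weights @ features        # weighted average of region features
    return context, weights

def generate_caption(features, params, max_len=5):
    """Task 2 (describing): greedily emit one word per step."""
    state = np.zeros(params["W_state"].shape[0])
    words = []
    for _ in range(max_len):
        context, _ = attend(features, state, params["W_att"])
        state = np.tanh(params["W_state"] @ state + params["W_ctx"] @ context)
        word = VOCAB[int(np.argmax(params["W_out"] @ state))]
        if word == "<end>":
            break
        words.append(word)
    return words

# Random, untrained weights: this only demonstrates the data flow.
rng = np.random.default_rng(0)
channels, hidden = 3, 8
params = {
    "W_att": rng.normal(size=(channels, hidden)),
    "W_state": rng.normal(size=(hidden, hidden)),
    "W_ctx": rng.normal(size=(hidden, channels)),
    "W_out": rng.normal(size=(len(VOCAB), hidden)),
}
image = rng.random((4, 4, 3))           # fake 4x4 RGB image
features = encode_image(image)          # 4 regions, 3 channels each
caption = generate_caption(features, params)
print(caption)
```

The point of the attention step is that the decoder does not have to compress the whole image into a single vector: at every word it recomputes which regions to look at, based on its current state.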

