Towards Representation Learning for an Image Retrieval Task