Guidance for Image-to-Text and Image-to-Speech on AWS

Project Overview

This Guidance shows how to convert images to text and speech using machine learning and generative AI services on AWS. For image-to-text, Amazon Kendra, a search engine, indexes an image repository so that its contents can be searched. Generative AI then captions the images: it recognizes objects and features and produces a human-readable description, typically a caption based on the extracted visual features. The Guidance also shows how to convert images to speech, and it can be extended to serve content through voice-enabled devices such as Amazon Alexa. The image-to-speech path uses the Describe for Me web app, which generates a caption for an image and reads it back in a clear, human-sounding voice, with support for a variety of languages and dialects.
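The caption-then-speak flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the Guidance's actual implementation: `describe_image`, `caption_fn`, `speak_fn`, and `polly_speaker` are hypothetical names, the captioning model is left as a pluggable function, and Amazon Polly is assumed as the speech service because the overview does not name one.

```python
from typing import Callable, Tuple


def describe_image(
    image_bytes: bytes,
    caption_fn: Callable[[bytes], str],
    speak_fn: Callable[[str], bytes],
) -> Tuple[str, bytes]:
    """Caption an image, then turn that caption into audio.

    caption_fn and speak_fn are hypothetical hooks: plug in whichever
    captioning model and text-to-speech service the deployment uses.
    """
    caption = caption_fn(image_bytes).strip()
    if not caption:
        raise ValueError("captioner returned an empty caption")
    audio = speak_fn(caption)
    return caption, audio


def polly_speaker(voice_id: str = "Joanna") -> Callable[[str], bytes]:
    """Build a speak_fn backed by Amazon Polly (an assumption, see above).

    boto3 is imported lazily so the pipeline can be exercised without
    AWS credentials or the SDK installed.
    """
    import boto3

    polly = boto3.client("polly")

    def speak(text: str) -> bytes:
        response = polly.synthesize_speech(
            Text=text, OutputFormat="mp3", VoiceId=voice_id
        )
        # AudioStream is a streaming body; read() returns the MP3 bytes.
        return response["AudioStream"].read()

    return speak
```

With AWS credentials configured, `describe_image(data, my_captioner, polly_speaker())` would return the caption and its MP3 audio, where `my_captioner` stands in for the generative AI captioning step. Keeping the two steps as injected functions lets the flow be tested with stubs and lets the voice or language be swapped without touching the pipeline.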

To learn more about this project, connect with us.
