Scope:
Design and build machine learning (“ML”) models to navigate and search data, text, audio, video or images (“Content”), including indexing transcripts from audio and video files and indexing text extracted from scanned documents. More specifically, assist Customer with activities that may include the following:
1. Designing a data flow architecture, including data pipelines;
2. Identifying the objects in digitized Content provided by the Customer;
3. Exploring, integrating and cleaning collected data;
4. Deploying a Microsoft SQL Server database on Amazon EC2 to store metadata extracted from digitized Content;
5. Deploying Amazon Comprehend and Amazon SageMaker;
6. Setting up a workflow for indexing digitized Content;
7. Performing feature engineering for building ML models;
8. Building a cloud-based architecture for data ingestion and processing; and
9. Creating an application programming interface (API) endpoint for returning results to the Customer In-Scope Application.
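
For illustration only, the indexing and retrieval activities listed above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the agreed design: the `ContentItem` record, its field names, the sample identifiers and text, and the in-memory index are all hypothetical, and a real deployment would read extracted text from the transcription and document-extraction steps and persist metadata in the database described in item 4.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentItem:
    """Hypothetical metadata record for one piece of digitized Content."""
    content_id: str   # assumed identifier format, e.g. "doc-001"
    source_type: str  # assumed labels, e.g. "scanned_document"
    text: str         # transcript or text extracted from a scanned document


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(items):
    """Build an inverted index mapping each token to the ids containing it."""
    index = defaultdict(set)
    for item in items:
        for token in tokenize(item.text):
            index[token].add(item.content_id)
    return index


def search(index, query):
    """Return the sorted ids of items that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(hits)


# Hypothetical sample data standing in for extracted text.
items = [
    ContentItem("doc-001", "scanned_document", "Invoice for cloud storage services"),
    ContentItem("vid-002", "video_transcript", "Welcome to the storage migration briefing"),
]
index = build_index(items)
results = search(index, "storage invoice")
```

In this sketch a query matches only items that contain all of its tokens. An API endpoint such as the one in item 9 could wrap a function like `search` and return the matching identifiers to the Customer In-Scope Application.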