Information Retrieval from RC Images Using AWS Textract API

Monday, May 18, 2020 - 1 min

Preprocessed the images using the OpenCV library.
Applied the AWS Textract API to detect text in the images.
Used regular expressions (Regexr) to extract information from the raw text.

Abstract

Extracting useful information from raw text is a difficult task. In this project, we retrieve information from images with the help of AWS Textract, which returns the detected text as blocks of tokenized raw text. That output is then reduced to the useful fields with rule-based or object-based matching. Our dataset is 46 sample RC images, stored in an AWS S3 bucket.
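To make the Textract and rule-based matching steps concrete, here is a minimal sketch rather than the exact project code. It assumes the images are read straight from S3 with Textract's `detect_document_text` call, and that only `LINE` blocks are needed. The field names and regex patterns (`registration_no`, `registration_date`) are hypothetical examples, as are the helper names `fetch_textract_response`, `lines_from_response`, and `extract_fields`; real RC layouts will need their own rules.

```python
import re


def fetch_textract_response(bucket, key):
    """Run Textract text detection on an image stored in S3.

    Needs boto3 and valid AWS credentials; imported lazily so the
    parsing helpers below can be used without AWS access.
    """
    import boto3

    client = boto3.client("textract")
    return client.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )


def lines_from_response(response, min_confidence=0.0):
    """Keep the text of LINE blocks, skipping PAGE and WORD blocks."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
        and block.get("Confidence", 0.0) >= min_confidence
    ]


# Hypothetical rules: these patterns are illustrative assumptions,
# not the rules used in the project.
FIELD_PATTERNS = {
    "registration_no": re.compile(r"\b[A-Z]{2}\s?\d{1,2}\s?[A-Z]{1,3}\s?\d{4}\b"),
    "registration_date": re.compile(r"\b\d{1,2}[/-]\d{1,2}[/-]\d{4}\b"),
}


def extract_fields(lines):
    """Apply each rule to the raw text and keep the first match per field."""
    text = "\n".join(lines)
    found = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[name] = match.group(0)
    return found
```

With credentials configured, `extract_fields(lines_from_response(fetch_textract_response(bucket, key)))` would run the whole chain for one image. Keeping the parsing separate from the API call means the rules can be tuned against saved responses without calling Textract again.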

Summary

Starting from the raw image, we first preprocess it with OpenCV, a library widely used for image processing: we normalize the image (normalization here also includes sharpening) and crop the areas that carry no useful content. We then upload the preprocessed images to an AWS S3 bucket, which serves as the input to AWS Textract.
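The preprocessing and upload steps could look roughly like the sketch below. It is an illustration under assumptions, not the project's actual pipeline: cropping is done by taking the bounding box of dark pixels after an Otsu threshold (which assumes dark content on a lighter border), sharpening uses a common 3x3 kernel, and the function names, the `preprocessed/` key prefix, and the bucket name passed in are all hypothetical.

```python
from pathlib import Path


def preprocess(image_path, out_path):
    """Crop, normalize, and sharpen one image (needs opencv-python and numpy)."""
    import cv2
    import numpy as np

    image = cv2.imread(str(image_path))
    if image is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Assumption: useful content is darker than the border, so the
    # bounding box of the dark pixels approximates the region to keep.
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    points = cv2.findNonZero(mask)
    if points is not None:
        x, y, w, h = cv2.boundingRect(points)
        gray = gray[y:y + h, x:x + w]

    # Stretch intensities to the full 0..255 range, then sharpen.
    normalized = cv2.normalize(gray, None, 0, 255, cv2.NORM_MINMAX)
    kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=np.float32)
    sharpened = cv2.filter2D(normalized, -1, kernel)
    cv2.imwrite(str(out_path), sharpened)


def s3_key_for(local_path, prefix="preprocessed/"):
    """Build the S3 object key for a local file (hypothetical naming scheme)."""
    return prefix + Path(local_path).name


def upload_folder(folder, bucket):
    """Upload every .jpg in a folder to S3 (needs boto3 and AWS credentials)."""
    import boto3

    s3 = boto3.client("s3")
    for path in sorted(Path(folder).glob("*.jpg")):
        s3.upload_file(str(path), bucket, s3_key_for(path))
```

The object keys produced by `s3_key_for` are what Textract would later receive as the `Name` of each S3 object.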
