Amazon Textract takes character recognition to new heights




Original version published June 2019, Updated October 2019.

After releasing its Textract tool earlier this year, Amazon has announced improvements that increase the variety and complexity of documents the tool can process. Textract can now extract text from documents that are non-standard in some way, including unusual sizes, background patterns, imperfections like bent corners, and more. The accuracy of the extractions has also improved.



Amazon recently made Textract generally available, a tool bound to save anyone working with documents considerable time and effort. The tool uses machine learning to extract text from a wide range of document types and then contextualize the extracted text. According to the release article posted by Amazon, Textract supports text content from regular blocks of words, from forms, and from tables, whether the content comes from scans, PDFs, or photos. Users can also take the text one step further after extraction by making it searchable or exporting it to another program.
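For readers who want to see what this looks like in practice, here is a minimal sketch in Python using the AWS SDK (boto3). It assumes AWS credentials are already configured and that the input is a local image file such as a PNG or JPEG. The helper names `extract_lines` and `detect_text`, the `min_confidence` parameter, and the default region are illustrative choices for this sketch, not something taken from Amazon's release article.

```python
def extract_lines(response, min_confidence=0.0):
    """Return the text of each LINE block in a Textract response dict.

    Textract responses contain a list of "Blocks"; blocks of type LINE
    carry one line of recognized text and a confidence score (0-100).
    """
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
        and block.get("Confidence", 0.0) >= min_confidence
    ]


def detect_text(image_path, region_name="us-east-1"):
    """Send a local image file to Textract and return the raw response.

    Assumption: AWS credentials are configured and Textract is available
    in the chosen region.
    """
    # Imported here so extract_lines can be used without the AWS SDK installed.
    import boto3

    client = boto3.client("textract", region_name=region_name)
    with open(image_path, "rb") as f:
        return client.detect_document_text(Document={"Bytes": f.read()})
```

Calling `extract_lines(detect_text("scan.png"), min_confidence=90.0)` would return only the lines Textract is fairly sure about, where `scan.png` is a placeholder file name. For forms and tables, the same client also offers `analyze_document` with `FeatureTypes=["FORMS", "TABLES"]`, which returns additional block types that need more parsing than shown here.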


Possible use cases include:


Change management: Automation of data collection as businesses prepare to undergo change

Advanced analytics: Accelerated data entry for use in analytics dashboards

Digital marketing: Population of digital marketing collateral and boilerplate


The machine learning algorithms behind Textract are just another example of how humans are simplifying their workloads with the help of artificial intelligence!




