Amazon Transcribe can now automatically redact personally identifiable data

Amazon is adding a new privacy-focused feature to its business transcription service, one that automatically redacts personally identifiable information (PII) such as names, social security numbers, and credit card details.

Amazon Transcribe is part of Amazon’s AWS cloud unit and became generally available in 2018. An automatic speech recognition (ASR) service, Transcribe enables enterprise customers to convert speech into text, which can help make audio content searchable from a database, for example. Contact centers can also use the tool to mine call data for insights and sentiment analysis. However, privacy issues have cast a spotlight on how technology companies store and manage consumers’ data.

Speech-to-text services make it possible to search recordings for keywords and sentiment at a later date, but phone calls often contain significant private data that may be transcribed by Amazon and stored in a searchable database, even if that information is not necessary for analysis. Meanwhile, regulations are springing up around the world to protect consumer data, including the recently implemented California Consumer Privacy Act (CCPA) and Europe’s General Data Protection Regulation (GDPR).

Against this backdrop, Amazon Transcribe will now enable companies to automatically redact personal data, including credit/debit card numbers, expiration dates, CVV codes, PINs, social security numbers, bank account numbers, customer names, email addresses, phone numbers, and postal addresses. It’s worth noting that Google Cloud Platform offers a data loss prevention API that could be used in conjunction with its speech-to-text service to identify and redact sensitive data. But building automated redaction directly into Amazon Transcribe should make the process a lot easier to implement.
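For readers curious about what switching the feature on might involve, the sketch below builds the request parameters for a batch transcription job with redaction enabled. It assumes the `ContentRedaction` setting of the AWS SDK's `StartTranscriptionJob` operation. The job name, the S3 URI, and the `build_redaction_request` helper are hypothetical, and none of this comes from the announcement itself.

```python
def build_redaction_request(job_name, media_uri, keep_unredacted=False):
    """Return keyword arguments for a Transcribe batch job with PII redaction.

    Assumption: the dict mirrors the StartTranscriptionJob request shape and
    could be passed on as
    boto3.client("transcribe").start_transcription_job(**request).
    """
    return {
        "TranscriptionJobName": job_name,
        # The article notes that redaction is limited to U.S. English for now.
        "LanguageCode": "en-US",
        "Media": {"MediaFileUri": media_uri},
        "ContentRedaction": {
            "RedactionType": "PII",
            # Optionally keep an unredacted transcript alongside the redacted one.
            "RedactionOutput": (
                "redacted_and_unredacted" if keep_unredacted else "redacted"
            ),
        },
    }


# Hypothetical job name and bucket, for illustration only.
request = build_redaction_request(
    "support-call-001", "s3://example-bucket/call-001.wav"
)
print(request["ContentRedaction"])
```

Building the parameters separately from the SDK call keeps the example runnable without AWS credentials.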

Companies using Amazon Transcribe can use automatic redaction as they see fit and can choose which PII elements they wish to obfuscate. The transcribed text will then display a [PII] tag in place of the sensitive information, and the corresponding timestamps mean anyone with sufficient system access will still be able to locate the necessary PII in the original audio file. This may also prove useful if a company wants to carry out extra audio processing to fully redact the information in the original recording.
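To illustrate how those timestamps could be used, here is a minimal sketch that scans a redacted transcript and returns the time ranges that would need to be silenced in the original recording. The JSON layout (a `results.items` list with `start_time`, `end_time`, and `alternatives`) is an assumption modeled on Transcribe's output format, the sample data is invented, and `pii_spans` is a hypothetical helper.

```python
def pii_spans(transcript, tag="[PII]"):
    """Return (start, end) times in seconds for each run of redacted items."""
    spans = []
    previous_was_pii = False
    for item in transcript["results"]["items"]:
        # Punctuation items carry no timestamps, so skip them.
        if "start_time" not in item:
            continue
        is_pii = item["alternatives"][0]["content"] == tag
        if is_pii:
            start, end = float(item["start_time"]), float(item["end_time"])
            if previous_was_pii:
                # Extend the current span when redacted items are consecutive.
                spans[-1] = (spans[-1][0], end)
            else:
                spans.append((start, end))
        previous_was_pii = is_pii
    return spans


def _word(text, start, end):
    """Build one timed transcript item (assumed shape, for the sample only)."""
    return {
        "start_time": str(start),
        "end_time": str(end),
        "alternatives": [{"content": text}],
    }


# Invented sample: "my card is [PII] [PII]. thanks [PII]"
sample = {
    "results": {
        "items": [
            _word("my", 0.0, 0.3),
            _word("card", 0.3, 0.7),
            _word("is", 0.7, 0.9),
            _word("[PII]", 0.9, 2.4),
            _word("[PII]", 2.4, 3.1),
            {"alternatives": [{"content": "."}]},  # punctuation, no timestamps
            _word("thanks", 3.5, 4.0),
            _word("[PII]", 4.2, 4.8),
        ]
    }
}

spans = pii_spans(sample)
print(spans)
```

The resulting ranges could then be handed to an audio tool to mute or bleep the matching portions of the recording.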

Amazon Transcribe is available in 31 languages, six of which are supported by real-time transcription, though for now the automated redaction feature is limited to U.S. English. The feature is billed monthly at a rate of $0.00004 per second of content.
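As a rough guide to what that rate means in practice, the sketch below multiplies the quoted per-second price by the length of the audio. At $0.00004 per second, one hour of content (3,600 seconds) comes to $0.144. The `redaction_cost` helper is hypothetical and covers only the redaction rate quoted above, not any other charges.

```python
from decimal import Decimal

# Per-second redaction rate quoted in the article, in U.S. dollars.
REDACTION_RATE_PER_SECOND = Decimal("0.00004")


def redaction_cost(seconds):
    """Return the redaction charge in dollars for the given audio length."""
    # Convert via str so float inputs do not bring binary rounding noise.
    return REDACTION_RATE_PER_SECOND * Decimal(str(seconds))


one_hour = redaction_cost(3600)
thousand_hours = redaction_cost(3600 * 1000)
print(one_hour, thousand_hours)
```

Using `Decimal` rather than floats keeps such small per-second amounts exact when they are summed over many calls.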

This announcement was originally published in VentureBeat.
