Detection and Classification of Personally Identifiable Information in Images Using Artificial Intelligence
Owais Shaikh
Abstract
Personally Identifiable Information (PII) is sensitive content that must be kept secure and private. Data items such as a person's name, address, Social Security number, phone number, or email address are deemed PII when they can be used to identify a specific individual. As organizations grow, so does their volume of data, which makes identifying and protecting such sensitive resources at scale quite complex. In this project, we demonstrate where and how PII can be discovered, and we describe a working prototype of a tool that detects PII in images using artificial intelligence (AI) techniques such as Optical Character Recognition (OCR) and image classification with Convolutional Neural Networks (CNNs).
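As a rough illustration of the text-based half of such a pipeline, the sketch below scans text that is assumed to have already been extracted from an image by an OCR engine and flags a few PII categories. The pattern set, the `PII_PATTERNS` name, and the `find_pii` helper are hypothetical and chosen for illustration; they are not the prototype's actual rules, and the CNN-based image classification step is not shown.

```python
import re

# Hypothetical patterns for illustration only; a real system would need
# broader, locale-aware rules and validation to limit false positives.
PII_PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def find_pii(ocr_text):
    """Map each PII category to the list of matches found in the OCR text."""
    hits = {}
    for label, pattern in PII_PATTERNS.items():
        matches = pattern.findall(ocr_text)
        if matches:
            hits[label] = matches
    return hits


# Stand-in for text returned by an OCR step (assumed, not real data).
sample = "Contact Jane at jane.doe@example.com or 555-123-4567. SSN: 123-45-6789"
print(find_pii(sample))
```

In a fuller pipeline, the `sample` string would be replaced by the output of the OCR stage, and an image flagged here could be combined with the classifier's verdict on the image type.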