Detect PII
Detect PII identifies and, if desired, redacts or extracts mentions of entities related to PII (persons, organizations, dates, email addresses, SSNs, bank account numbers, credit card numbers, etc.). These models detect, classify, and provide options to de-identify personally identifiable information (PII) in unstructured text.

A simple identification example: if the phrase "My email is bill@qualetics.com" is analyzed, the model can return the specific entity and the label for that entity: "text": "bill@qualetics.com" and "type": "emailaddress".

A more complex example of Detect PII takes the following text:

    Hello Support Team,
    I am reaching out to seek help with my credit card number 1234 5678 9873 2345 expiring on 11/23. There was a suspicious transaction on 12 Aug 2022 which I reported by calling from my mobile number +1 (423) 111 9999. Also I emailed from my email ID sarah.jones1234@hotmail.com. Would you please let me know the refund status?
    Regards,
    Sarah

and processes it to redact the PII, resulting in:

    Hello Support Team,
    I am reaching out to seek help with my credit card number expiring on. There was a suspicious transaction on which I reported by calling from my mobile number. Also I emailed from my email ID. Would you please let me know the refund status?
    Regards,

Use cases

Detecting private information in user feedback
Many organizations collect user feedback through various channels such as product reviews, return requests, support tickets, and feedback forums. Automatic detection of PII entities lets you proactively warn users about sharing private data, and lets applications anonymize or mask feedback before storing it.

Scanning object storage for the presence of sensitive data
Cloud storage solutions are widely used by employees to store business documents in locations either locally controlled or shared by multiple teams. Ensuring that these shared locations do not store private information such as employee names, demographics, and payroll information requires automatic scanning of all documents for the presence of PII. This model can support that process at scale.

Model input parameters

| Parameter name | Parameter type | Required |
| --- | --- | --- |
| input | text | Yes |
| masking character | character | No (if not provided, ' ' will be used) |

REST API example

    var settings = {
      "url": "https://mlapi.qualetics.com/api/datamachine/init?id=<datamachine id>",
      "method": "POST",
      "headers": {
        // Add authorization headers here
        "Content-Type": "application/json"
      },
      "data": JSON.stringify({
        "input": "my name is james bond"
      }),
    };

    $.ajax(settings).done(function (response) {
      console.log(response);
    });

Model output result

| Parameter name | Parameter type |
| --- | --- |
| masked text | text |
| pii entity count | number |
| detected pii entities | JSON string |

REST API output example

    {
      "input": "my name is james bond",
      "original input": "my name is james bond",
      "masked text": "my name is ",
      "pii entity count": 1,
      "detected pii entities": "[{\"type\": \"person\", \"text\": \"james bond\", \"confidence score\": 0.9989403486251831}]",
      "final result": "my name is ",
      "sessionid": "a9a416cf-7f19-4e8a-aa18-87854bf0e0fb",
      "status": "completed"
    }

Standard output parameters

Every model execution output includes the following standard output parameters:

- input: The input string that the model analyzes.
- original input: The input provided to the first step of the workflow, retained across multiple steps in a Data Machine workflow.
- final result: The result of the model executed in the final step of the Data Machine workflow.
- sessionid: A unique session ID generated for every execution of a Data Machine, which can be used to retain results across multiple sessions.
- status: The result of the Data Machine execution. If all steps in a sequence execute successfully, the value is "completed". If the execution is interrupted at any point, the value is "terminated", along with the reason for termination.
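The output example returns the detected entities as a JSON string rather than as an array, so a client usually needs a second parse step before it can work with them. The following is a minimal sketch of one way to handle a response in plain JavaScript. The names `FIELDS`, `summarizeDetectPiiResponse`, and `minConfidence` are hypothetical and introduced only for illustration. The field names are copied as they appear in the output example and are an assumption that may need adjusting against the live API.

```javascript
// Field names copied from the output example; treat them as assumptions
// and adjust them if the live API spells them differently.
const FIELDS = {
  status: "status",
  maskedText: "masked text",
  entityCount: "pii entity count",
  entities: "detected pii entities",
  confidence: "confidence score",
};

// Hypothetical helper: converts a raw Detect PII response into a simpler object.
// minConfidence is an illustrative client-side filter, not an API parameter.
function summarizeDetectPiiResponse(response, minConfidence = 0) {
  // Anything other than "completed" means the workflow did not finish.
  if (response[FIELDS.status] !== "completed") {
    throw new Error("Data Machine did not complete: " + response[FIELDS.status]);
  }

  // The entity list arrives as a JSON string, so it needs its own parse step.
  const raw = response[FIELDS.entities];
  const entities = typeof raw === "string" ? JSON.parse(raw) : raw || [];

  return {
    maskedText: response[FIELDS.maskedText],
    entityCount: response[FIELDS.entityCount],
    entities: entities.filter((e) => e[FIELDS.confidence] >= minConfidence),
  };
}

// Sample response with the values from the output example.
const sample = {
  "input": "my name is james bond",
  "original input": "my name is james bond",
  "masked text": "my name is ",
  "pii entity count": 1,
  "detected pii entities":
    '[{"type": "person", "text": "james bond", "confidence score": 0.9989403486251831}]',
  "final result": "my name is ",
  "sessionid": "a9a416cf-7f19-4e8a-aa18-87854bf0e0fb",
  "status": "completed",
};

const summary = summarizeDetectPiiResponse(sample, 0.9);
console.log(summary.entities.map((e) => `${e.type}: ${e.text}`));
```

With the jQuery request shown earlier, a helper like this could be called inside the `.done` callback in place of the bare `console.log(response)`. Throwing on any status other than "completed" is a design choice for the sketch; an application might instead log the termination reason and continue.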