Cloud Data Loss Prevention (DLP) API Samples

The Data Loss Prevention API provides programmatic access to a powerful detection engine for personally identifiable information and other privacy-sensitive data in unstructured data streams.

Setup

  • Create or select a Google Cloud project with billing enabled.
  • Enable the DLP API for that project.
  • (Local testing) Create a service account and set the GOOGLE_APPLICATION_CREDENTIALS environment variable pointing to the downloaded credentials file.
  • (Local testing) Set the DLP_DEID_WRAPPED_KEY environment variable to an AES-256 key encrypted ('wrapped') with a Cloud Key Management Service (KMS) key.
  • (Local testing) Set the DLP_DEID_KEY_NAME environment variable to the resource name (path) of the Cloud KMS key used to wrap DLP_DEID_WRAPPED_KEY.
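
Before running the samples locally, it can help to confirm that the three environment variables listed above are actually set. The following is a minimal sketch of such a check. The class `DlpEnvCheck` and its `missing` method are hypothetical helpers written for this illustration and are not part of the samples; only the variable names come from the list above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical pre-flight helper; not part of the DLP samples. */
final class DlpEnvCheck {
    // Variable names taken from the Setup list for local testing.
    static final List<String> REQUIRED = Arrays.asList(
        "GOOGLE_APPLICATION_CREDENTIALS",
        "DLP_DEID_WRAPPED_KEY",
        "DLP_DEID_KEY_NAME");

    /** Returns the required variables that are unset or blank in the given environment. */
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> missing = missing(System.getenv());
        if (missing.isEmpty()) {
            System.out.println("All local-testing variables are set.");
        } else {
            System.out.println("Missing: " + String.join(", ", missing));
        }
    }
}
```

The check only verifies that the variables are present. It does not validate the credentials file or the wrapped key.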

Build

This project uses the Maven Assembly Plugin to build an uber jar that bundles the samples with their dependencies. Run:

   mvn clean package -DskipTests

Retrieve InfoTypes

An InfoType identifier represents an element of sensitive data.

InfoTypes are updated periodically. Use the API to retrieve the most current InfoTypes.

  java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Metadata

Run the quickstart

The Quickstart demonstrates using the DLP API to identify an InfoType in a given string.

   java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.QuickStart

Inspect data for sensitive elements

Use the DLP API to inspect strings, local files, and data stored in Google Cloud Storage, Cloud Datastore, and BigQuery.

Note: image scanning is not currently supported on Google Cloud Storage. For more information, refer to the API documentation. Optional flags are explained in this resource.

Commands:
  -s <string>                   Inspect a string using the Data Loss Prevention API.
  -f <filepath>                 Inspect a local text, PNG, or JPEG file using the Data Loss Prevention API.
  -gcs -bucketName <bucketName> -fileName <fileName>
                                Inspect a text file stored on Google Cloud Storage using the Data Loss Prevention API.
  -ds -projectId [projectId] -namespace [namespace] -kind <kind>
                                Inspect a Datastore instance using the Data Loss Prevention API.

Options:
  --help               Show help 
  -minLikelihood       [string] [choices: "LIKELIHOOD_UNSPECIFIED", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]
                       [default: "LIKELIHOOD_UNSPECIFIED"]
                       specifies the minimum reporting likelihood threshold.
  -f, --maxFindings    [number] [default: 0]
                       maximum number of results to retrieve
  -q, --includeQuote   [boolean] [default: true] include matching string in results
  -t, --infoTypes      set of infoTypes to search for [e.g. PHONE_NUMBER US_PASSPORT]
  -customDictionaries  set of comma-separated dictionary words to search for as customInfoTypes
  -customRegexes       set of regex patterns to search for as customInfoTypes
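
To illustrate what a minimum reporting likelihood threshold means, here is a small local sketch that keeps only the findings at or above a chosen level. The enum is a stand-in that mirrors the choices listed for -minLikelihood; it is not the client library's type, and the class and method names are hypothetical. The sketch treats LIKELIHOOD_UNSPECIFIED as "no threshold", which is an assumption: the service applies the real filtering and may substitute its own default.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical local illustration; the real filtering happens in the DLP service. */
final class MinLikelihoodSketch {
    /** Mirrors the -minLikelihood choices, declared from least to most likely. */
    enum Likelihood {
        LIKELIHOOD_UNSPECIFIED, VERY_UNLIKELY, UNLIKELY, POSSIBLE, LIKELY, VERY_LIKELY
    }

    /** True when a finding's likelihood is at or above the threshold. */
    static boolean meetsThreshold(Likelihood finding, Likelihood min) {
        // Enum.compareTo orders constants by declaration position.
        return finding.compareTo(min) >= 0;
    }

    /** Keeps the findings at or above the threshold, preserving their order. */
    static List<Likelihood> filter(List<Likelihood> findings, Likelihood min) {
        List<Likelihood> kept = new ArrayList<>();
        for (Likelihood finding : findings) {
            if (meetsThreshold(finding, min)) {
                kept.add(finding);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Likelihood> findings = Arrays.asList(
            Likelihood.UNLIKELY, Likelihood.LIKELY, Likelihood.VERY_LIKELY);
        // prints [LIKELY, VERY_LIKELY]
        System.out.println(filter(findings, Likelihood.LIKELY));
    }
}
```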

Examples

  • Inspect a string:
    java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -s "My phone number is (123) 456-7890 and my email address is me@somedomain.com" --infoTypes PHONE_NUMBER EMAIL_ADDRESS
    java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -s "My phone number is (123) 456-7890 and my email address is me@somedomain.com" -customDictionaries me@somedomain.com -customRegexes "\(\d{3}\) \d{3}-\d{4}"
    
  • Inspect a local file (text / image):
      java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -f src/test/resources/test.txt --infoTypes PHONE_NUMBER EMAIL_ADDRESS
      java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -f src/test/resources/test.png --infoTypes PHONE_NUMBER EMAIL_ADDRESS
    
  • Inspect a file on Google Cloud Storage:
      java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -gcs -bucketName my-bucket -fileName my-file.txt --infoTypes PHONE_NUMBER EMAIL_ADDRESS
    
  • Inspect a Google Cloud Datastore kind:
      java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -ds -kind my-kind --infoTypes PHONE_NUMBER EMAIL_ADDRESS
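
When writing a pattern for -customRegexes, it can be useful to preview it locally before sending data to the API. The sketch below runs the phone-number pattern from the string example above through `java.util.regex`. The class `CustomRegexPreview` is a hypothetical helper, not part of the samples, and the service's regex dialect may differ from Java's, so treat this as a sanity check only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical local preview of a custom regex; does not call the DLP API. */
final class CustomRegexPreview {
    /** Returns every non-overlapping match of regex in text, in order of appearance. */
    static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "My phone number is (123) 456-7890 and my email address is me@somedomain.com";
        // Same pattern as the -customRegexes example, with backslashes escaped for a Java string literal.
        String phonePattern = "\\(\\d{3}\\) \\d{3}-\\d{4}";
        // prints [(123) 456-7890]
        System.out.println(findAll(phonePattern, text));
    }
}
```

On the command line the pattern is passed in double quotes, as in the example above, so the shell does not interpret the parentheses.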
    

Automatic redaction of sensitive data from images

Automatic redaction produces an output image with the matched sensitive data removed.

Commands:
  -f <string>                   Source image file
  -o <string>                   Destination image file
Options:
  --help               Show help
  -minLikelihood       [string] [choices: "LIKELIHOOD_UNSPECIFIED", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]
                       [default: "LIKELIHOOD_UNSPECIFIED"]
                       specifies the minimum reporting likelihood threshold.
  -infoTypes           set of infoTypes to search for [e.g. PHONE_NUMBER US_PASSPORT]

Example

  • Redact phone numbers and email addresses from test.png:
      java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Redact -f src/test/resources/test.png -o test-redacted.png -infoTypes PHONE_NUMBER EMAIL_ADDRESS
    

Integration tests

Setup

Run

Run all tests:

   mvn clean verify