Basic Usage (Python)

Say you have a number of files containing customer or patient data and you are not sure which of them are safe to share over a less secure channel. With Nightfall’s API you can check whether a file contains sensitive PII, PHI, or PCI data.

To make a request to the Nightfall API you will need:

  • A Nightfall API key
  • A list of data types you wish to scan for
  • Data to scan. Note that the API interprets data as plaintext, so you may pass it in any structured or unstructured format.

You can read more about obtaining a Nightfall API key or about our available data detectors in the linked reference guides.

Before you continue, be sure to install the requests library using pip.

pip install requests

We will use a few built-in Python libraries along with the requests library to run this sample API script.

import json
import os
import pprint
import requests

First we define the endpoint we want to reach with our API call.

endpoint = 'https://api.nightfall.ai/v2/scan'

Next we define the headers of our API request. In this example, we have our API key set via an environment variable called "NIGHTFALL_API_KEY". Your API key should never be hard-coded directly into your script.

h = {
    'Content-Type': 'application/json',
    'x-api-key': os.environ.get('NIGHTFALL_API_KEY')
}
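
Because os.environ.get returns None when the variable is unset, a missing key would only surface later as a rejected request. The following is a minimal sketch of a fail-fast check. The helper name build_headers is hypothetical and exists only for this guide; it is not part of any Nightfall library.

```python
import os


def build_headers(env=os.environ):
    """Return the request headers, raising early if the API key is not set.

    build_headers is a hypothetical helper for this guide, not a Nightfall API.
    """
    api_key = env.get('NIGHTFALL_API_KEY')
    if not api_key:
        raise RuntimeError('Set the NIGHTFALL_API_KEY environment variable first.')
    return {
        'Content-Type': 'application/json',
        'x-api-key': api_key,
    }
```

Calling h = build_headers() would then produce the same headers as the dictionary defined earlier, or stop the script with a clear message if the key is missing.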

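Next we define the condition set with which we wish to scan our data. There are two ways to do this, and you only need one of them: reference a condition set that was pre-made in the Nightfall web app by its UUID, or define the condition set inline with your request as a collection of detectors.

# Option 1: reference a condition set created in the Nightfall web app.
config = {
    'conditionSetUUID': os.environ.get('CONDITION_SET_UUID')
}

# Option 2: define the condition set inline.
# If both assignments run, this one overwrites the first.
config = {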
    'conditionSet': {
        'conditions': [
            {
                'minNumFindings': 1,
                'minConfidence': 'POSSIBLE',
                'detector': {
                    'displayName': 'credit_card',
                    'detectorType': 'NIGHTFALL_DETECTOR',
                    'nightfallDetector': 'CREDIT_CARD_NUMBER'
                }
            },
            {
                'minNumFindings': 1,
                'minConfidence': 'LIKELY',
                'detector': {
                    'displayName': 'ICD9',
                    'detectorType': 'NIGHTFALL_DETECTOR',
                    'nightfallDetector': 'ICD9_CODE'
                }                
            },
            {
                'minNumFindings': 1,
                'minConfidence': 'POSSIBLE',
                'detector': {
                    'displayName': 'SSN',
                    'detectorType': 'NIGHTFALL_DETECTOR',
                    'nightfallDetector': 'US_SOCIAL_SECURITY_NUMBER'
                }                  
            }          
        ]
    }
}
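
The three inline conditions differ only in display name, detector, and confidence level, so the repetition can be factored out. The sketch below assumes nothing beyond the field names shown in the inline example; make_condition is a hypothetical helper, not a Nightfall function.

```python
def make_condition(display_name, nightfall_detector,
                   min_confidence='POSSIBLE', min_findings=1):
    """Build one condition dict in the same shape as the inline example.

    make_condition is a hypothetical helper for this guide.
    """
    return {
        'minNumFindings': min_findings,
        'minConfidence': min_confidence,
        'detector': {
            'displayName': display_name,
            'detectorType': 'NIGHTFALL_DETECTOR',
            'nightfallDetector': nightfall_detector,
        },
    }


# Equivalent to the inline condition set defined earlier in this guide.
config = {
    'conditionSet': {
        'conditions': [
            make_condition('credit_card', 'CREDIT_CARD_NUMBER'),
            make_condition('ICD9', 'ICD9_CODE', min_confidence='LIKELY'),
            make_condition('SSN', 'US_SOCIAL_SECURITY_NUMBER'),
        ]
    }
}
```

Adding another detector then takes one line instead of a nested block.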

Next, we build the request body, which contains the condition set from above, as well as the raw data that you wish to scan. In this example, we will read it from a file called sample_data.csv.

Here we assume that the file is under the 500 KB payload limit of the Scan API. If your file is larger than the limit, consider breaking it down into smaller pieces across multiple API requests.

with open('sample_data.csv', 'r') as f:
    raw_data = f.read()

d = {
    'payload': [raw_data],
    'config': config
}

# Optional: confirm that the file is under the 500 KB payload limit.
# The os module was already imported at the top of the script.
if os.stat('sample_data.csv').st_size < 500000:
    print('This file will fit in a single API call.')
else:
    print('This file will need to be broken into pieces across multiple calls.')
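
For files over the limit, one way to break the data into smaller pieces is to split on line boundaries so that no CSV row is cut in half. The sketch below is one possible approach, not an official Nightfall utility; split_into_chunks is a hypothetical name. The 500000-byte default mirrors the size check above, and in practice you may want a lower value to leave room for the config portion of the request body.

```python
def split_into_chunks(text, max_bytes=500000):
    """Split text into pieces of at most max_bytes (UTF-8 encoded),
    breaking only at line boundaries.

    Assumes no single line is larger than max_bytes; raises ValueError otherwise.
    """
    chunks, current, current_size = [], [], 0
    for line in text.splitlines(keepends=True):
        line_size = len(line.encode('utf-8'))
        if line_size > max_bytes:
            raise ValueError('A single line exceeds the chunk size limit.')
        # Start a new chunk when adding this line would exceed the limit.
        if current and current_size + line_size > max_bytes:
            chunks.append(''.join(current))
            current, current_size = [], 0
        current.append(line)
        current_size += line_size
    if current:
        chunks.append(''.join(current))
    return chunks
```

Each chunk would then be sent as the payload of its own request. Note that the byte offsets in each response are relative to the chunk that was scanned, not to the original file.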

Now we are ready to call the Nightfall API to check if there is any sensitive data in our file. If there are no sensitive findings in our file, the response will contain only an empty or null entry for it (for example '[null]').

response = requests.post(endpoint, headers=h, data=json.dumps(d))

if response.status_code == 200:
    # The response holds one entry per payload item; an entry is empty
    # or null when that item has no findings.
    findings = response.json()
    if any(findings):
        print('This file contains sensitive data.')
        pprint.pprint(findings)
    else:
        print('No sensitive data detected. Hooray!')
else:
    print(f'Something went wrong -- Response {response.status_code}.')

Example response when no sensitive data is found:

[[]]

Example output when sensitive data is found:

[
  [
  {'fragment': '172-32-1176',
   'detector': 'US_SOCIAL_SECURITY_NUMBER',
   'confidence': {'bucket': 'LIKELY'},
   'location': {'byteRange': {'start': 122, 'end': 133},
    'unicodeRange': {'start': 122, 'end': 133}}},
  {'fragment': '514-14-8905',
   'detector': 'US_SOCIAL_SECURITY_NUMBER',
   'confidence': {'bucket': 'LIKELY'},
   'location': {'byteRange': {'start': 269, 'end': 280},
    'unicodeRange': {'start': 269, 'end': 280}}},
  {'fragment': '213-46-8915',
   'detector': 'US_SOCIAL_SECURITY_NUMBER',
   'confidence': {'bucket': 'LIKELY'},
   'location': {'byteRange': {'start': 418, 'end': 429},
    'unicodeRange': {'start': 418, 'end': 429}}}
  ]
 ]
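
Once the findings are parsed, a short summary is often more useful than the full dump, and it avoids printing the sensitive values again. The sketch below relies only on the 'detector' and 'fragment' fields visible in the sample output above; summarize_findings and mask_fragment are hypothetical helper names, not part of the Nightfall API.

```python
from collections import Counter


def summarize_findings(findings):
    """Count findings per detector across all payload items.

    Expects the parsed response: a list with one entry per payload item,
    where each entry is a list of finding dicts, or is empty or None.
    """
    counts = Counter()
    for item in findings:
        for finding in item or []:
            counts[finding['detector']] += 1
    return dict(counts)


def mask_fragment(fragment, visible=4):
    """Replace all but the last `visible` characters with asterisks."""
    hidden = max(len(fragment) - visible, 0)
    return '*' * hidden + fragment[hidden:]


# Trimmed copy of the sample findings shown above.
sample = [[
    {'fragment': '172-32-1176', 'detector': 'US_SOCIAL_SECURITY_NUMBER'},
    {'fragment': '514-14-8905', 'detector': 'US_SOCIAL_SECURITY_NUMBER'},
    {'fragment': '213-46-8915', 'detector': 'US_SOCIAL_SECURITY_NUMBER'},
]]

print(summarize_findings(sample))   # {'US_SOCIAL_SECURITY_NUMBER': 3}
print(mask_fragment('172-32-1176'))  # *******1176
```

Masking before logging is a design choice for this sketch: it keeps enough of each value to locate it in the file without copying the full value into logs.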

And that's it!

Be sure to check out some of our other guides for more advanced use cases, like redacting a file, or integrating the API into your CI pipeline.