Want to keep your data as secure as possible? Check out our webinar “Preventing Data Leaks in Microsoft Teams (and Other Collaboration Systems).” View here!
There are now more data protection options than ever before, but there are also more ways for leaks to occur. Compliance Guardian’s data validation, classification, and protection features can help your organization identify and prevent the intentional or accidental exposure of sensitive information to unwanted parties.
Compliance Guardian uses a content and context analysis engine to search files, email messages, instant messages, and other platforms for sensitive information such as credit card numbers, personally identifiable information (PII), and intellectual property (IP). You can then take actions such as:
- Logging events for auditing purposes
- Sending alerts to the end user creating the violation
- Creating an incident within Compliance Guardian’s Incident Management System for further investigation, accountability, and reporting
- Tagging or classifying the files according to your business classification scheme
You can also apply preventative actions such as:
- Blocking file sharing from taking place by either changing or removing permissions
- Moving files to a better-secured location
- Redacting sensitive information and pseudo-anonymizing content
- Encrypting or quarantining the files if necessary
- Automating and applying your existing Azure Information Protection (AIP) policies if you already have the licenses for that functionality within your Office 365 subscription
To protect your data and prevent data loss in the first place, it’s important to first understand the data in question and how users interact with information within and outside of the organization. Be it a spreadsheet with employee names, payroll, and payment data, an email, or even a screenshot (image) from a protected file, you’d be surprised how employees use your company data to complete their tasks.
How Sensitive Information is Detected by Compliance Guardian
Compliance Guardian has several ways of identifying sensitive information, including:
- Regular expression (RegEx) pattern matching using a dictionary with multiple keywords
- Using a combination of keywords and regular expressions
- Determining the proximity of certain results and counting how many matches have been identified during the discovery process.
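As a rough illustration of how keywords and regular expressions can be combined and counted, here is a minimal Python sketch. It is not Compliance Guardian’s implementation: the keyword set, the pattern, and the summarize_matches function are hypothetical names chosen for this example.

```python
import re

# Hypothetical keyword dictionary, for illustration only.
CARD_KEYWORDS = {"visa", "mastercard", "credit card"}

# Assumed pattern: 16 digits, optionally grouped in fours by a space or hyphen.
CARD_PATTERN = re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b")


def summarize_matches(text: str) -> dict:
    """Count keyword hits and pattern hits in a piece of text."""
    lowered = text.lower()
    keyword_hits = sum(lowered.count(keyword) for keyword in CARD_KEYWORDS)
    pattern_hits = len(CARD_PATTERN.findall(text))
    return {
        "keyword_hits": keyword_hits,
        "pattern_hits": pattern_hits,
        # Flag the text only when both signals are present.
        "flagged": keyword_hits > 0 and pattern_hits > 0,
    }


print(summarize_matches("Please use my VISA card 1234 5678 9012 3456."))
```

Requiring both a keyword and a pattern match cuts down on noise from bare 16-digit numbers, and the counts can feed into a later risk calculation.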
A more advanced approach is also available in Compliance Guardian using machine learning and fingerprinting.
Example: Identifying Credit Card Numbers
A VISA credit card number has 16 digits, but these 16 digits can be presented in different ways such as:
- 1234567890123456, or
- 1234 5678 9012 3456.
Not all 16-digit numbers are credit card numbers, however; they could be fax numbers, serial numbers, or ID numbers. Compliance Guardian’s validation capabilities (a checksum, or Luhn, check) ensure that the numbers identified match a known pattern from the different credit card types. A content analysis solution should be flexible enough to understand the difference between these two texts in a file or an email:
- We need to hire a car for our trip. Please use my VISA card 1234 5678 9012 3456. It expires on 01/01/2020.
- My car’s VIN number is 1234-5678-9012-3456, and I bought it on 12/12/19. We should arrange our insurance once my visa is approved.
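For readers unfamiliar with the Luhn checksum mentioned above, the following Python sketch shows the standard algorithm. The function name passes_luhn is made up for this example, and the sketch says nothing about how Compliance Guardian applies the check internally.

```python
def passes_luhn(candidate: str) -> bool:
    """Return True if the digits in `candidate` satisfy the Luhn checksum."""
    # Ignore separators such as spaces and hyphens.
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    if not digits:
        return False

    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


print(passes_luhn("4111 1111 1111 1111"))  # True (a widely used test number)
print(passes_luhn("1234 5678 9012 3456"))  # False
```

The placeholder number used in the two sample sentences fails the checksum, so a Luhn-aware check would reject it no matter what text surrounds it. Real card numbers pass, which is why keyword context still matters for telling a payment card from other identifiers.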
Where to Start with Compliance Guardian
It can be difficult to work out exactly where you should start implementing your company data protection policies. In order to protect, you first need to understand what data you have and where it is.
Compliance Guardian provides a range of Test Suites (Templates) you can use to identify your content. These test suites can be country-, region-, or regulation-specific, and you can use, reuse, or edit the existing suites or create new ones.
Let’s take a look at the Payment Card Industry (PCI) Data Security Test Suite, since most organizations work with payment details such as credit cards.
The first step is to create a Scan Plan. You can choose between a scheduled or real-time scan and then choose the test suites you want to use for your data discovery. I’ll only use the PCI template, but you can add and combine multiple templates for a single Scan Plan.
Once we’ve selected our test suites, the next step is to choose the location we’d like our scan to apply to. I’ll choose SharePoint On-Premises from the many available options.
When selecting your source, you can scan everything, scan an individual sub-location, or do a more granular scan within your system.
After selecting the target location(s), the next options help you choose whether to scan all file types or only the most common document types, and whether to scan all versions of a document. The default options are a great choice, and you can refine them as necessary. Keep in mind that scanning all versions may take longer than scanning only the latest version of each document.
Once we are past the scope settings, we can configure action rules to be automatically applied based on a condition. The conditions can be:
- Based on values you specify (name, size, content type, etc.),
- Based on a risk level (which is automatically calculated based on type and amount of information identified during the scan).
When configuring the action rules, we can specify multiple conditions and add multiple rules or logic for what should happen when a scan result matches the criteria in our scan plan. If the risk is minimal (one credit card number is found), you can alert someone as an action. If the risk is extreme (ten credit card numbers are found), you can add another rule to quarantine the file and create an incident, assigning it to someone for further investigation.
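The rule logic can be pictured with a small Python sketch. Everything here is an assumption made for illustration: the ScanResult and ActionRule classes, the action names, and the thresholds (the post only gives one number as minimal and ten as extreme, so the treatment of two to nine matches is a guess). Compliance Guardian configures these rules through its interface, not through code like this.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ScanResult:
    file_name: str
    card_numbers_found: int


@dataclass
class ActionRule:
    name: str
    condition: Callable[[ScanResult], bool]
    actions: list[str]


# Hypothetical rules: thresholds and action names are illustrative assumptions.
RULES = [
    ActionRule("minimal risk", lambda r: 1 <= r.card_numbers_found < 10, ["alert"]),
    ActionRule(
        "extreme risk",
        lambda r: r.card_numbers_found >= 10,
        ["quarantine", "create_incident"],
    ),
]


def actions_for(result: ScanResult, rules: list[ActionRule] = RULES) -> list[str]:
    """Collect the actions of every rule whose condition matches the result."""
    actions: list[str] = []
    for rule in rules:
        if rule.condition(result):
            actions.extend(rule.actions)
    return actions


print(actions_for(ScanResult("invoice.txt", 1)))
print(actions_for(ScanResult("export.csv", 12)))
```

Keeping each rule as a condition paired with a list of actions mirrors the idea of stacking several rules on one scan plan.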
The final step offers some options to tweak your scan plan if required, such as setting a schedule, sending you an alert once the scan is finished, or even sending the scan results to a Security Information and Event Management (SIEM) system. All that’s left now is to click Finish, provide a name for the plan, and click Save.
How to Test Your Compliance Guardian Scan Policy
Your new policy will be in effect as soon as you click the “Run Now” button. You can edit your policy as many times as necessary, and you have the option to run a “Full Scan” or an “Incremental Scan.”
Once you click the “Test Run” button, the scan policy will start scanning all items or files within the scope that you’ve defined and will report on the items that have been identified as a match.
In this case, it’ll search for keywords representing a credit card type, such as VISA or MASTERCARD, and if one is found, it also checks whether there are credit card digits within 50 characters. The 50-character distance can be adjusted as required, but the default value is usually a good start.
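The proximity idea can be sketched in a few lines of Python. This is only an approximation under stated assumptions: the patterns, the case-insensitive keyword match, the way distance is measured (the gap between the keyword and the number), and the digits_near_keyword function are all hypothetical rather than taken from the PCI test suite itself.

```python
import re

# Assumed keyword list and digit pattern, for illustration only.
CARD_TYPE_KEYWORDS = re.compile(r"\b(?:visa|mastercard)\b", re.IGNORECASE)
CARD_DIGITS = re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b")


def digits_near_keyword(text: str, max_distance: int = 50) -> list[str]:
    """Return 16-digit candidates that sit within max_distance characters of a keyword."""
    keyword_spans = [m.span() for m in CARD_TYPE_KEYWORDS.finditer(text)]
    hits = []
    for number in CARD_DIGITS.finditer(text):
        for kw_start, kw_end in keyword_spans:
            # Gap between the keyword and the number, whichever comes first.
            if number.start() >= kw_end:
                gap = number.start() - kw_end
            else:
                gap = kw_start - number.end()
            if gap <= max_distance:
                hits.append(number.group())
                break
    return hits


payment = "Please use my VISA card 1234 5678 9012 3456. It expires on 01/01/2020."
vehicle = (
    "My car's VIN number is 1234-5678-9012-3456, and I bought it on 12/12/19. "
    "We should arrange our insurance once my visa is approved."
)
print(digits_near_keyword(payment))  # ['1234 5678 9012 3456']
print(digits_near_keyword(vehicle))  # []
```

In the second sentence the word “visa” appears 71 characters after the number, so the default window of 50 excludes it. Widening the window makes the check more permissive.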
There are a few ways to see the results from testing your scan plan:
- Via the Job Monitor – It allows you to see which files have been scanned or skipped, and whether any exceptions have occurred. The Job Monitor is a good place to start when testing your policy: scan known files that match your criteria and check whether they appear in the scan job report. The job report shows information about files that have been scanned successfully, skipped files that don’t match your filter criteria, and files that could not be scanned, such as ones that are encrypted or password-protected.
- Via the Incident Manager > Scan Records – This is the place where you can see a detailed view of each file that the scan plan has identified as a match. Note that to have a detailed view of files that match your scan policy criteria, you need to hit the “Run Now” button.
Compliance Guardian has a built-in Incident Management System that provides an analysis of the files that have been scanned with summaries of what has been scanned and identified, details of the elements and how many instances have been detected, a highlighted report that shows the match, who has access to the file, and the entire audit trail for the given scan record.
To demonstrate the PCI detection, in this example a file named “currency and cc.txt” has been identified with a risk level of 0.7. The risk level is adjustable and ranges from 0.1 (minimal) to 10 (catastrophic), with moderate, severe, and critical values in between.
Clicking on the scan record’s name provides us with a detailed summary, where we can see our highlighted finding of actual credit card details, including name and numbers. We can also see that a number that looks like a credit card number but is not highlighted has been excluded from the results. The PCI check validates the various formats in which a 16-digit number can be found, and it also performs a checksum (Luhn algorithm) check.
As you go through your scan records and analyze the findings, you’ll start to get a sense of the results and accuracy of the built-in test suites and checks.
Read the second part of this blog post where we continue with tips on how to tune a scan policy, investigate and improve false positives, scheduling, and turning off a scan plan. For more information on Compliance Guardian, visit our product page to request a free demo!