Data Discovery and Data Mapping: Building Your Inventory to Ensure GDPR Compliance

If you’ve made it this far in our blog series on the European Union General Data Protection Regulation (GDPR), we hope that you’re starting to understand the gravity of what’s expected. If you’re part of an organization that handles or processes any EU citizen’s information, your responsibilities now carry the weight of fines of up to four percent of your annual global turnover or €20 million, whichever is higher. Even if you think that your organization is only responsible for protecting home-grown data such as intellectual property, financial data, and HR data, remember that your websites may already be tracking the behavior of EU citizens!

The GDPR requires us to be able to identify and protect data wherever we hold it. In addition, we’re responsible for the accuracy of the data we hold, and legacy data is a prime candidate for housing out-of-date information.

Many organizations are currently looking uneasily at terabytes or petabytes of information, knowing that potentially regulated data may be hiding in that dark data (commonly in file shares). The data controllers or processors who originally collected this information may have already left the organization, leaving us with an unclassified mess.

Unfortunately, the old concept of security by obscurity no longer applies. Being governed by the GDPR means that you need to make reasonable efforts to safeguard all EU citizen data. In addition, if individuals ask us to remove their data, we have to be able to confirm that the request has been completed.

There are three steps in planning an inventory of your sensitive data. In each, we start by reaching out to our data owners, extending trust to the business, and then follow up by verifying their responses. Here are the steps:

Step 1: Data Discovery

What are you processing on a daily basis? A data inventory needs to represent information your company retrieves and processes regularly. This includes data that is touched actively or stored by your teams.

Trust: Privacy Assessments and Surveys

Many users come into contact with potential EU citizen data. A few examples are:

  • Partner managers that handle the identity of external vendors and contractors
  • Human resources managing employee data
  • Day-to-day managers who track information on their staff
  • Marketing and sales who handle web form data on customers or leads
  • Data analysts who process potentially unmasked survey data or website information
  • Accounting processing sales or credit card information

The only way to truly understand what type of data your staff touches on a daily basis is to run an internal survey. AvePoint helps get you started with a simple tool to gather and analyze feedback from employees, but you could also run a survey that allows users to mark how much unmasked data they come into contact with on a daily basis.

Verify: Data Classification Programs

At the end of the day, most users will simply not realize how much sensitive data they touch on a day-to-day basis. The definition provided by the EU is intentionally broad, so what counts as sensitive may differ from what you or your users expect. In addition, you might be storing more data that your users don’t touch, especially in cases where there has been turnover after data was collected and stored.

The best way to inventory where the information is being collected is through automated classification. By leveraging common core standards for GDPR-regulated data, AvePoint Compliance Guardian provides automated classification against file shares, SharePoint, Office 365, and other points where you might accept and store data.
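
To make the idea of automated classification concrete, here is a minimal sketch in Python. It is not how Compliance Guardian works internally; the pattern set, the `classify_text` and `scan_folder` names, and the choice of file types are all assumptions made for illustration, and a real GDPR rule set would need far broader and better-validated patterns.

```python
import re
from pathlib import Path

# Illustrative patterns only (assumptions, not a complete GDPR rule set).
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "phone": re.compile(r"\+\d{1,3}[ -]?\d{2,4}(?:[ -]?\d{2,4}){2,3}"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}


def classify_text(text):
    """Return {label: match_count} for every pattern found in the text."""
    counts = {}
    for label, pattern in PATTERNS.items():
        hits = len(pattern.findall(text))
        if hits:
            counts[label] = hits
    return counts


def scan_folder(root, suffixes=(".txt", ".csv")):
    """Classify every matching file under root; keys are paths relative to root."""
    findings = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            counts = classify_text(text)
            if counts:
                findings[str(path.relative_to(root))] = counts
    return findings
```

Calling `scan_folder` on a file share root returns only the files that triggered at least one pattern, which gives you a first-pass inventory to compare against what your data owners reported in the survey.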

Figure 2 - Data Classification example in a SharePoint site.

Step 2: Review Data Collection Procedures

Are you over-collecting data? Ask whether your inventory includes data that was not permitted to be retained. It’s common to go beyond what the law states you’re allowed to collect, even unintentionally!

Trust: Review of Privacy Policies

Privacy by Design is an essential step for developing any user-facing web applications or data surveys. For this, we strongly recommend leveraging the IAPP’s Privacy Impact Assessment (PIA) application, a free option that lets you start educating your application developers and data processors about your privacy policies.

You can insert this process at the beginning of any new data management project, or create annual reviews revisiting acceptable use of data.

Verify: Create Heat Maps of Sensitive Data

To know whether PIAs are being answered accurately, it’s crucial to be able to run a scan for EU citizen data. A simple way to highlight areas of your inventory that include sensitive information beyond permitted sets is to generate a heat map using AvePoint Compliance Guardian.
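
As a rough illustration of what such a heat map computes, the sketch below counts out-of-policy hits per top-level area. It assumes you already have classification results keyed by relative path and a hypothetical `permitted` policy stating which data types each area may hold. The function names and the text rendering are ours, not part of any AvePoint product.

```python
from collections import Counter
from pathlib import PurePosixPath


def heat_map(findings, permitted):
    """Count sensitive hits per top-level area that fall outside its permitted set.

    findings:  {"area/file.csv": {"email": 3, ...}}   (hypothetical scan output)
    permitted: {"area": {"email", ...}}               (hypothetical policy)
    """
    heat = Counter()
    for path, counts in findings.items():
        area = PurePosixPath(path).parts[0]
        allowed = permitted.get(area, set())
        for label, hits in counts.items():
            if label not in allowed:
                heat[area] += hits
    return heat


def render(heat, width=40):
    """Render the heat map as text bars, hottest area first."""
    lines = []
    for area, score in heat.most_common():
        lines.append(f"{area:<12} {'#' * min(score, width)} {score}")
    return "\n".join(lines)
```

Areas that hold only the data types their PIA declared never appear in the output, so anything that does show up is a candidate for a follow-up conversation with the data owner.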

Figure 3 - Heat maps of sensitive data indicating over-sharing of information

Step 3: Data Mapping

The risk to GDPR compliance rises with the number of people who have access to sensitive information and the number of places it moves. Tracking both is commonly referred to as data mapping.

Trust: Set Secure Share Policies

A good governance policy is the first step in educating your staff about the protection of EU citizen data. Staff are typically the first to grant peers access to information, share data externally with vendors, or even simply leave with data stored on unprotected devices.

With good training, you can establish a culture that involves constantly asking about the best way to handle data. AvePoint has suggestions for governance of collaboration systems, but it’s also essential to give users direct access to your data privacy team so they can ask questions. It is better to provide acceptable and documented ways to share than to block sharing altogether. Blocking will simply encourage users to work around your controls!

Verify: Alert on Data at Common Hand-off Points

Collaboration systems are a critical hand-off point to check against data sharing policies. One major retailer recently worked with us and found that one percent of their content had been shared with external parties despite containing sensitive information. That represented nearly 10,000 files! AvePoint’s Compliance Guardian has an Incident Management Dashboard that both alerts when sensitive data is shared and gives you the audit information necessary to understand the extent to which data was shared.
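
The check behind this kind of alert can be sketched in a few lines. The example below is an assumption-laden illustration, not the Incident Management Dashboard’s logic: the `ShareEvent` record, the `external_share_alerts` function, and the idea of identifying external recipients by email domain are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShareEvent:
    """One sharing action from a collaboration system's audit log (hypothetical shape)."""
    path: str
    shared_with: str      # recipient email address
    labels: frozenset     # sensitive data types found in the file, may be empty


def external_share_alerts(events, internal_domains):
    """Return events where a file carrying sensitive labels went to an outside domain."""
    internal = {d.lower() for d in internal_domains}
    alerts = []
    for event in events:
        domain = event.shared_with.rsplit("@", 1)[-1].lower()
        if event.labels and domain not in internal:
            alerts.append(event)
    return alerts
```

Feeding the function a day’s worth of share events and your own domain list yields the subset worth reviewing: files that are both sensitive and shared outside the organization, along with who received them.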

Figure 4 - Data alert in the Incident Manager representing the extent of shared data

Additional Resources

AvePoint designed its Compliance Guardian product around data-centric audit and protection reports, including the ability to produce a data inventory through classification and data maps through incident dashboards.

In order to get you started, here are a few resources to help: