Where is sensitive data stored?
Ana Rodriguez
We are all aware, up to a point, that our databases hold sensitive data, but do we know where it is and exactly what data we have? To get a better picture of the situation, consider the following analogy:
Imagine that you move to a new house. You organize your home in the best possible way, placing everything where you would expect to find it. You know where you leave your belongings and, in the meantime, you discover what you have and build a mental schema of it.
Now, five or ten years later, you need an inventory of what you own and where it is. You also need to know how valuable those objects are, because you don’t want guests to come across them when they stay for a few days, and you want to protect them against burglars. Since you moved in, you haven’t been the only one cleaning, adding and moving things; your family has helped too. The organization of your house no longer bears any resemblance to the one you set up on the first day.
So you begin an inventory process, reviewing corner by corner what you have and evaluating how important it is to protect it.
Something similar happens with databases at companies. Many people manipulate data, change models or run tests, and it is possible that the team that devised the model no longer works there, or that there is no clear record of how it was done. This is where Icaria comes into play, capturing the available information and its role in a Sensitive Data Map.
In this post we describe the scanning and inventorying process that answers the opening question of this article: where is the sensitive data? At this stage of the project, with much of the paperwork already formalized, it is time to configure the platform and explore the data model so that it can be dissociated later. The usual procedure is:
- Install and configure icaria Mirage in the new environment. It is possibly the most methodical and tedious process of the entire dissociation phase.
- For the installation, we follow a script so that we do not forget any configuration step, and then try to run the platform for the first time. Once we have verified that everything works, it is time to start adapting to the new environment. At this point we also begin to involve the client, showing them the installation and how to access the platform. We added this step after listening to their suggestions, and they were right: we have found that involving them from such an early stage helps the later development of the project.
- Each installation requires some specific configuration: finding the appropriate connectors for the databases, testing connections, etc. This takes a few days, during which we run tests to make sure that no surprises arise later.
- Finally, we import the data model (tables, relations, fields and types) into icaria Mirage so that we can start working on what is really interesting.
- The data model is now known. We check the imported tables and make sure they are the ones that the other development team considers necessary.
- Now we can start the data search. For this, we launch a first analysis with some general parameters preset in the platform, which serves as the basis for the next iterations. In the next post we will give more details on how the data inspection process works.
Image. Sensitive Data Map results
Image. Data analysis configuration
- The result of the previous process is the Sensitive Data Map, an inventory of data generated by icaria Mirage. It gives a general impression of where the data is located and what type of information it contains, and proposes the dissociators that will anonymize it. In other words, it shows the tables and fields that could contain sensitive information, indicates whether this information is an ID, telephone number, bank account, etc., and suggests the dissociators included in icaria Mirage to hide the information from prying eyes.
- From the map we get a first impression, which is the basis for the next iterations of the inspection. We take the results obtained and look for the terms that usually define fields with sensitive data. In subsequent cycles we adjust the sensitivity of the analysis to reduce false positives. Our priority, however, is to reduce the chance of overlooking a field that contains sensitive information, so we accept that a certain number of false positives will remain.
- Once we consider that the map reflects reality, it is sent to the client’s team. The team, which receives the results very positively, confirms them and evaluates whether any other type of information that was not included in the Sensitive Information Inventory should be dissociated. If they detect any unprotected fields, this is also the best moment to notify us.
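To make the inspection idea more concrete, here is a minimal Python sketch of how a scan of this kind could work. It is not icaria Mirage's implementation: the term lists, the regular expressions, the `scan` function and the `threshold` parameter are all hypothetical and only illustrate flagging a field by its name or by the share of sampled values that look sensitive.

```python
import re

# Hypothetical name terms and value patterns; NOT icaria Mirage's actual rules.
NAME_TERMS = {
    "phone": ("phone", "mobile", "tel"),
    "bank_account": ("iban", "account"),
    "id": ("dni", "passport", "national_id"),
}
VALUE_PATTERNS = {
    "phone": re.compile(r"^\+?\d[\d\s-]{7,14}$"),
    "bank_account": re.compile(r"^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$"),
}


def scan(model, threshold=0.5):
    """Return (table, column, category, reason) for suspected sensitive columns.

    model: {table: {column: [sample values]}}
    threshold: minimum fraction of samples that must match a value pattern.
    """
    findings = []
    for table, columns in model.items():
        for column, samples in columns.items():
            lowered = column.lower()
            hit = None
            # First pass: does the column name contain a typical term?
            for category, terms in NAME_TERMS.items():
                if any(term in lowered for term in terms):
                    hit = (category, "name")
                    break
            # Second pass: do enough sampled values match a known pattern?
            if hit is None and samples:
                for category, pattern in VALUE_PATTERNS.items():
                    matches = sum(
                        1 for value in samples if pattern.match(str(value).strip())
                    )
                    if matches / len(samples) >= threshold:
                        hit = (category, "values")
                        break
            if hit:
                findings.append((table, column, hit[0], hit[1]))
    return findings


# Illustrative data model with sampled values (made up for this example).
model = {
    "customers": {
        "name": ["Ana", "Luis"],
        "contact": ["+34 600 123 456", "n/a"],
        "iban_code": ["ES9121000418450200051332"],
    },
    "orders": {"total": [10.5, 20.0]},
}
sensitive_map = scan(model, threshold=0.5)
```

In this sketch, a lower `threshold` flags more fields, so fewer are overlooked at the cost of more false positives, while a higher one does the opposite. Here the `contact` column is flagged at 0.5 but would be missed at 0.6.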
These are the steps we take to inspect the types of data. During the procedure, however, we always detect data stored in a very particular way that does not follow the standard. This requires customizing the dissociation mechanisms so that the anonymized data looks as real as it does in production environments. Our recommendation is to run the analysis periodically to check whether new sensitive information has been added and whether it needs to be dissociated.
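As an illustration of the periodic check, the sketch below compares the inventory from an earlier run with the one from a later run and lists the fields that are new. The tuple layout `(table, column, category)` and the function name `new_findings` are assumptions made for this example, not an icaria Mirage export format.

```python
def new_findings(previous, current):
    """Return (table, column) pairs present in `current` but not in `previous`.

    Both arguments are lists of tuples whose first two items are the table
    and column names (hypothetical layout: (table, column, category)).
    """
    seen = {(table, column) for table, column, *_ in previous}
    return [
        (table, column)
        for table, column, *_ in current
        if (table, column) not in seen
    ]


# Made-up inventories from two consecutive runs of the analysis.
last_run = [("customers", "phone", "phone")]
this_run = [
    ("customers", "phone", "phone"),
    ("customers", "notes_iban", "bank_account"),
]
to_review = new_findings(last_run, this_run)
```

Each field in `to_review` is a candidate for review and, if confirmed, for dissociation.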
Thus, thanks to the efforts of the icaria team and the feedback from the client, we will be able to continue working in the coming weeks to achieve a reliable, fast and precise data masking process.
In the following installments we will cover in greater depth the mechanisms of data analysis and the creation of the Sensitive Data Map, as this is a complex but fundamental process for the project, as well as an interesting one.