This project defines and implements a global model for one of the main Spanish financial institutions. The model guarantees secure access to data in accordance with the regulations that apply in each area of the group's different entities, and it does so uniformly across the whole business group. This involves identifying sensitive data, determining its physical location, and restricting access to the people who need it and who have been granted the corresponding privileges.
To define the appropriate access criteria, the following types of data are distinguished:
To meet the established criteria, different security levels have been defined for:
The following tools are available to facilitate implementation by each project area: automatic labeling of existing fields and registration of new ones, based on recursive neural networks; an HDFS table classifier; a compactor for the physical files underlying the tables; tokenization based on the information schemas provided; and a catalog of entities, a common repository that links each table name to its path within HDFS.
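As an illustration only, the catalog of entities and the schema-driven tokenization could be sketched as follows. This is a minimal sketch under stated assumptions, not the project's actual implementation: the names `CatalogEntry`, `EntityCatalog`, `tokenize` and `tokenize_record` are hypothetical, the example table and path are invented, and keyed hashing (HMAC-SHA256) is assumed here as one possible tokenization technique, since the original does not specify one.

```python
import hashlib
import hmac
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Hypothetical catalog record: a table name, its HDFS path, and its sensitive fields."""
    table_name: str
    hdfs_path: str
    sensitive_fields: set = field(default_factory=set)


class EntityCatalog:
    """Hypothetical in-memory stand-in for the common repository of entities."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Later registrations for the same table name overwrite earlier ones.
        self._entries[entry.table_name] = entry

    def path_of(self, table_name):
        # Resolve a table name to its path within HDFS.
        return self._entries[table_name].hdfs_path

    def sensitive_fields_of(self, table_name):
        return self._entries[table_name].sensitive_fields


def tokenize(value, key):
    """Replace a value with a deterministic token (assumed technique: HMAC-SHA256)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def tokenize_record(record, catalog, table_name, key):
    """Tokenize only the fields the catalog marks as sensitive; leave the rest untouched."""
    sensitive = catalog.sensitive_fields_of(table_name)
    return {
        name: tokenize(str(value), key) if name in sensitive else value
        for name, value in record.items()
    }


# Usage with an invented table and path.
catalog = EntityCatalog()
catalog.register(
    CatalogEntry("customers", "/data/example/customers", {"national_id"})
)
protected = tokenize_record(
    {"national_id": "12345678Z", "city": "Madrid"},
    catalog,
    "customers",
    key=b"example-key-not-for-production",
)
print(catalog.path_of("customers"), protected)
```

Keyed hashing is deterministic, so the same input always yields the same token and tokenized columns can still be joined across tables. It is also one-way, so a deployment that needs to recover original values would require a vault-based or reversible scheme instead.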