A Brief Look Into Data Access Management and Data Security Policy Language

This article, authored by Phillip from the DPO Club Content Team and reviewed by Lathamani, Privacy Engineer at Google, explores how legal frameworks and technical architectures converge to create a robust privacy-compliant ecosystem. It delves into the role of Data Access Management Systems, the emergence of policy languages, and presents illustrative fictional scenarios to demonstrate how these concepts play out in practice. Together, these elements showcase how law and technology can harmonize to build safer, compliant, and future-ready business environments.

Introduction

Data privacy compliance sits at the intersection of technology and legal language. Fortunately, privacy engineering is keeping pace with the ever-increasing requirements of laws and regulations. Modern tooling enables the integration of law and technology by expressing legal rules in forms that systems themselves can interpret and enforce.

In this article, we’ll see how legal and technical architectures coincide to create a compliant environment that’s safe for businesses to flourish in. We’ll discuss Data Access Management Systems, a few policy languages, fictional cases, and how they all tie in together.

A Data Check System called DAM

Data Access Management (DAM) refers to the practice of managing access to data across its lifecycle, from acquisition to deletion of information that comes into the company’s possession. The traditional system, which granted or denied access in its entirety, has grown far more sophisticated: it now governs not just who accesses the data, but precisely what data can be accessed, for what specific purpose, and under what conditions. It serves as the technical enforcement arm of the access mandates in an organisation's privacy policy.

The system is created to help people and organisations optimise the use of data whilst complying with policies and regulations. Thus, the DAM system becomes pivotal to data privacy compliance by implementing security measures to protect the data from unauthorised access, security breaches, and data loss. This is achieved through data encryption and access control. This technology has advanced to incorporate critical privacy-specific elements such as:

1. Data Sensitivity and Classification through access rules tied directly to the sensitivity level of the data (e.g., "Highly Confidential PII," "Public Data").

2. Processing Purpose by granting only for the specific and legitimate purpose for which the data was collected.

3. Contextual Attributes, which determine real-time conditions like network location (e.g., internal VPN vs. public Wi-Fi), time of day, device posture (e.g., managed corporate device), and even the nature of the application requesting access.

4. Consent Status: Explicitly checking whether the data subject's consent, where consent is the legal basis, covers the proposed access.

To incorporate these elements, DAM systems use different types of access control models. The most common models include Role-Based Access Control (RBAC), Attribute-Based Access Control (ABAC), and Purpose-Based Access Control (PBAC).

Role-Based Access Control (RBAC), for example, would permit all HR managers to view employee salary data. But with the GDPR and similar legislation, access requirements have become more granular, building on the RBAC model. Privacy engineering now provides both Attribute-Based Access Control (ABAC) and Purpose-Based Access Control (PBAC).

Attribute-Based Access Control (ABAC): ABAC requires several factors, or attributes, to be satisfied or “true” before the data can be processed.

To be precise, if an HR manager wants to view certain employee data, the user (e.g., department, clearance level, location), the resource (e.g., data sensitivity, owner), the action (e.g., read, write, delete), and the environment (e.g., time, IP address) among other factors, would have to match with the prescribed attribute in order to view the data.
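The attribute-matching idea above can be sketched in a few lines of Python. This is a minimal illustration, not a real product API; the policy keys and the HR scenario are the ones described above, with illustrative attribute names.

```python
# A minimal ABAC sketch: a request is allowed only when every attribute
# required by the policy matches the corresponding attribute of the request.

def abac_allows(policy: dict, request: dict) -> bool:
    """Return True only if every policy attribute matches the request."""
    return all(request.get(key) == value for key, value in policy.items())

# Hypothetical policy: HR managers may read salary data from the internal network.
salary_read_policy = {
    "department": "HR",
    "clearance": "manager",
    "resource": "employee_salary",
    "action": "read",
    "network": "internal_vpn",
}

request = {
    "department": "HR",
    "clearance": "manager",
    "resource": "employee_salary",
    "action": "read",
    "network": "public_wifi",  # environmental attribute fails the check
}

print(abac_allows(salary_read_policy, request))  # False: wrong network
```

The default-deny shape matters: any attribute that fails to match denies the request, mirroring how a single unmet condition (e.g., accessing from public Wi-Fi) blocks an otherwise authorised user.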

Purpose-Based Access Control (PBAC):

According to the research paper “Purpose-Based Access Control of Complex Data for Privacy Protection”, purposes are organised in a hierarchical structure based on the principles of generalisation and specialisation, which suits common business environments.

As per their findings, each node on the tree represents a purpose for accessing the data (e.g., billing, clinical care, marketing). In the hierarchy, broader purposes (like healthcare) are parents to more specific ones (like treatment, diagnosis, and billing). This enables flexible policies: granting access for a broader purpose, such as healthcare, automatically permits access to its narrower sub-purposes like diagnosis and treatment.

This enables data controllers to deny or grant access based on “why” the data is needed rather than who is requesting it. Such a system of purpose-based access helps by:

  1. Supporting user consent management by aligning usage with ethical and legal expectations.

  2. Bridging gaps between organisational policy and system enforcement in complex environments like healthcare or finance.
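The purpose hierarchy described above can be sketched as a simple tree walk. The purposes and the tree shape are illustrative examples drawn from the healthcare scenario in the text, not the paper's actual model.

```python
# Sketch of a purpose tree: granting a broad purpose implies access for all
# of its narrower sub-purposes (generalisation/specialisation).
purpose_tree = {
    "healthcare": ["treatment", "diagnosis", "billing"],
    "treatment": ["surgery"],  # hypothetical deeper level
}

def is_permitted(granted: str, requested: str) -> bool:
    """True if the requested purpose equals, or descends from, the granted one."""
    if granted == requested:
        return True
    return any(is_permitted(child, requested)
               for child in purpose_tree.get(granted, []))

print(is_permitted("healthcare", "diagnosis"))  # True: diagnosis is a sub-purpose
print(is_permitted("billing", "treatment"))     # False: siblings don't imply each other
```

Note that the check asks “why is the data needed”, not “who is asking”: the same requester is allowed or denied depending purely on the declared purpose.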

Consider this fictional scenario: Researchers need to access pseudonymized patient medical records for statistical analysis, but only for approved studies, and never personally identifiable information (PII) without explicit, granular consent from the data controller for that specific study. Furthermore, access might need to be restricted to specific geographies (e.g., under the GDPR, access might be restricted to specific regions, depending on where the authorised teams are located).

The following ABAC model would be implemented:

User Attributes: (researcher_role: true), (project_id: "XYZ_Cancer_Study"), (affiliation: "University_of_Berlin").

Data Attributes: (data_classification: "Pseudonymized_Clinical_Trial_Data"), (data_source_country: "Germany").

Environmental Attributes: (network_location: "EU_VPN_Subnet"), (access_time: "working_hours").

The DAM system will verify that the purpose associated with the research project (XYZ_Cancer_Study) is explicitly registered and approved for accessing (Pseudonymized_Clinical_Trial_Data).

Thus, only authorised researchers, accessing from approved locations and during specified times, for an approved study purpose, can view the highly specific, pseudonymized datasets. Any attempt to access raw PII or data from an unauthorised study is automatically denied and logged.
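A sketch of how the DAM system might combine the user, data, and environmental attributes listed above with the purpose check. The registry structure and attribute names follow the scenario but are illustrative, not a real product API.

```python
# Hypothetical registry: project_id -> data classifications it is approved to access.
APPROVED_ACCESS = {
    "XYZ_Cancer_Study": {"Pseudonymized_Clinical_Trial_Data"},
}

def access_allowed(user: dict, data: dict, env: dict) -> bool:
    """Deny unless every attribute and purpose check passes (default-deny)."""
    allowed_data = APPROVED_ACCESS.get(user.get("project_id"), set())
    return (
        user.get("researcher_role") is True
        and data.get("data_classification") in allowed_data
        and env.get("network_location") == "EU_VPN_Subnet"
        and env.get("access_time") == "working_hours"
    )

user = {"researcher_role": True, "project_id": "XYZ_Cancer_Study",
        "affiliation": "University_of_Berlin"}
env = {"network_location": "EU_VPN_Subnet", "access_time": "working_hours"}

pseudonymized = {"data_classification": "Pseudonymized_Clinical_Trial_Data",
                 "data_source_country": "Germany"}
raw_pii = {"data_classification": "Raw_PII", "data_source_country": "Germany"}

print(access_allowed(user, pseudonymized, env))  # True
print(access_allowed(user, raw_pii, env))        # False: raw PII is never approved
```

In a real deployment each denial would also be written to an audit log, which is what makes the “automatically denied and logged” guarantee verifiable.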

A Language for Security Policies:

Security policy languages in data governance are structured, machine-readable languages used to express and define data handling rules. They convert human-written policies into executable rules, enabling regulations such as data retention limits, locality restrictions, processing obligations, and purpose constraints to be enforced automatically and consistently.

For example:

A certain company’s natural language policy might prescribe, "Data should be deleted after 7 years". This would then be drafted in policy language to read - 

("DELETE (data_type: 'customer_transaction_record') AFTER (retention_period: '7 years') UNLESS (legal_hold: true)").

In this manner, natural-language policies that may seem ambiguous are converted into actionable, verifiable commands. Precise language that faithfully reflects the underlying law therefore becomes vital. This form of policy writing now requires testing and rewriting, just as application code does, giving rise to the system of Policy-as-Code (PaC).
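The retention rule above can be expressed as testable code, which is the essence of Policy-as-Code. This is a minimal sketch; the retention table and function names are illustrative.

```python
from datetime import date, timedelta

# "DELETE customer_transaction_record AFTER 7 years UNLESS legal_hold" as code.
RETENTION = {"customer_transaction_record": timedelta(days=7 * 365)}

def should_delete(data_type: str, created: date,
                  legal_hold: bool, today: date) -> bool:
    """True when the record has exceeded its retention period and no hold applies."""
    if legal_hold:
        return False  # the UNLESS clause always wins
    period = RETENTION.get(data_type)
    return period is not None and today - created > period

print(should_delete("customer_transaction_record",
                    date(2015, 1, 1), False, date(2025, 1, 1)))  # True: past 7 years
print(should_delete("customer_transaction_record",
                    date(2015, 1, 1), True, date(2025, 1, 1)))   # False: legal hold
```

Because the rule is now a function, it can be unit-tested, version-controlled, and audited exactly like the rest of the codebase.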

The Legal Bridge from Natural Language to Code

The process of converting nuanced legal concepts into clear and executable code is a significant challenge. Unlike legal provisions, code cannot be open to interpretation. The most common hurdles in going from legislation to code are:

  1. Terminological Accuracy: Terms such as “legitimate interest”, “data minimisation” and “reasonable security measures” are open to interpretation and lack technical prescription. For example, “legitimate interest” under GDPR Article 6(1)(f) is typically subject to tests that weigh the interests of the organisation against the data subject's rights. Because of the grey areas within the term, it is difficult to condense such a clause into a simple “allow” or “deny”.

  2. Clause Construction: Clauses are often constructed with exceptions and conditions that complicate an otherwise straightforward interpretation, thus requiring careful mapping. A failure in the process of converting the clause into code can lead to regulatory gaps that infringe critical rights.

Here’s an Example of Codifying GDPR's "Right to Object". Under Article 21 of the GDPR, a data subject has the right to object to processing based on legitimate interest. A policy language must not only define the legitimate interest but also include a dynamic, overriding rule that checks for a data subject's objection status.

Policy Language Logic (conceptual):

ALLOW_PROCESSING IF (

    purpose == "legitimate_interest_customer_segmentation"

    AND data_subject.has_not_objected == true

)

DENY_PROCESSING IF (

    purpose == "legitimate_interest_customer_segmentation"

    AND data_subject.has_not_objected == false

)

This simple example illustrates how a technical system can respect a fundamental legal right, but also highlights the complexity of managing and updating the has_not_objected attribute in real-time.
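The conceptual rules above can be made executable. This sketch collapses the ALLOW/DENY pair into one function; in practice the `has_not_objected` flag would be maintained by a consent or preference-management system, which is where the real-time complexity lies.

```python
# Executable sketch of the Article 21 "right to object" logic above.

def processing_allowed(purpose: str, has_not_objected: bool) -> bool:
    """Allow legitimate-interest processing only while no objection is recorded."""
    if purpose == "legitimate_interest_customer_segmentation":
        return has_not_objected  # an objection overrides the legal basis
    return False  # any other purpose needs its own explicit rule (default-deny)

print(processing_allowed("legitimate_interest_customer_segmentation", True))   # True
print(processing_allowed("legitimate_interest_customer_segmentation", False))  # False
```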

Policy-as-Code (PaC) is directly influenced by policy languages. This also means that privacy policies can be version-controlled, tested, deployed, and audited with the same rigour as any application code.

Here are some popular policy languages and their relevance:

  • Rego (used in Open Policy Agent - OPA): An increasingly popular declarative policy language. OPA acts as a lightweight policy engine that helps offload policy decisions from services. It is particularly useful for microservices architectures and clusters that operate in Kubernetes environments.

  • XACML (eXtensible Access Control Markup Language): A robust and mature XML-based standard for expressing complex access control policies. Although generally considered verbose, its detailed structure eases the process of defining attributes, rules, and combining algorithms.

  • Gaia-X’s Policy Rules Language (PRL): Emerging in the European data spaces, PRL aims to enable transparent and trustworthy data sharing through federated policy enforcement. This is particularly useful for data ecosystems that scale multiple organisations. 

These policy rules can be deployed to policy enforcement points (e.g., API gateways, data lake governance layers, application runtime environments). In cases of location-based access, any attempt to move data outside its compliant jurisdiction or access data beyond its retention period will be automatically blocked and logged. This helps provide verifiable proof of regulatory compliance.

This is also true for breach reporting under Articles 33 and 34 of the GDPR, where policies are coded not only to reflect data breach thresholds but also to be triggered without undue delay. Legal teams must provide a comprehensive definition of ‘data breach' under every applicable jurisdiction, which engineering teams implement with conditional logic to handle any conflicting requirements, ensuring the most stringent rule is applied. In the event of a regulatory audit, the responsibility falls on the company to demonstrate that its automated system operated within its programmed legal obligations.
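The “most stringent rule wins” logic can be sketched directly. The 72-hour figure is the GDPR Article 33 notification limit; the second regime and its deadline are hypothetical, standing in for whatever other jurisdictions apply.

```python
# When several jurisdictions apply, enforce the strictest (shortest) breach
# notification deadline among them.

DEADLINES_HOURS = {
    "GDPR": 72,            # Article 33: notify the authority within 72 hours
    "JurisdictionB": 96,   # hypothetical regime with a looser deadline
}

def notification_deadline(applicable: list) -> int:
    """Return the strictest (smallest) deadline among the applicable regimes."""
    return min(DEADLINES_HOURS[j] for j in applicable)

print(notification_deadline(["GDPR", "JurisdictionB"]))  # 72
```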

The responsibility of the legal department is to conduct the Legitimate Interest Assessment (LIA), a three-step process, before any processing based on legitimate interest can occur:

  1. Purpose Test: Is there a clear benefit to the company?

  2. Necessity Test: Can the same outcome be achieved with less data or through a less intrusive method?

  3. Balancing Test: The company's legitimate interest is assessed to ensure it does not infringe or override the rights and freedoms of the data subject. They consider factors like the nature of the data (is it sensitive?), the reasonable expectations of the data subject, and the potential impact on them.

The LIA is not a code, but a detailed legal document justifying the legitimacy of processing. It should explicitly state the legitimate interest, the types of data involved, the duration of processing, and the conclusion of the balancing test.

The team translates the conclusions of the LIA into a technical policy. This involves moving from the legal document to a series of specific, enforceable rules. Therefore, the code doesn't perform the balancing test; it enforces the result of the test.
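The point that “the code enforces the result of the test, not the test itself” can be sketched as a lookup against a register of completed LIAs. The register structure and field names are illustrative assumptions, not a standard format.

```python
# Hypothetical register of completed Legitimate Interest Assessments.
# The balancing test happened on paper; the code only enforces its outcome.
LIA_REGISTER = {
    "customer_segmentation": {
        "approved": True,
        "permitted_data": {"purchase_history", "region"},  # no sensitive data
        "max_retention_days": 365,
    },
}

def lia_permits(purpose: str, fields: set, retention_days: int) -> bool:
    """Allow only what the recorded LIA conclusion explicitly covers."""
    entry = LIA_REGISTER.get(purpose)
    return (
        entry is not None
        and entry["approved"]
        and fields <= entry["permitted_data"]          # data minimisation
        and retention_days <= entry["max_retention_days"]
    )

print(lia_permits("customer_segmentation", {"purchase_history"}, 180))  # True
print(lia_permits("customer_segmentation", {"health_status"}, 180))     # False
```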

The synergy between Technology and Language

While DAM relates to “who can access data and under what conditions”, a policy language explains “how that data must be processed and secured once accessed, and for how long.”

Without policy languages, the DAM system would have the ability to secure data but no way of outlining its usage boundaries. Conversely, a policy language without a DAM system has no enforcement point: compliance is left to the discretion of the individual actors accessing the data.

Case Study: E-commerce Loyalty Program (Data Sharing)

Consider an e-commerce company that wants to share aggregated, pseudonymized customer purchase data with a third-party marketing analytics firm for a Loyalty Program but must adhere to explicit customer consent, data minimisation, and prevention of re-identification.

In such cases, an integrated DAM & Policy Language Solution helps to define precise rules for data minimisation, re-identification prevention by ensuring K-anonymity standards, and consent checks. The DAM system can manage the data based on the coded policy. Thus, when the marketing team attempts to access a particular dataset, the DAM system would sift through the dataset to provide the requested information.

This system provides precisely the data it is permitted to access in a minimised, pseudonymized form, with verified consent, all enforced by automated systems driven by the policy language.
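The k-anonymity standard mentioned above can be checked mechanically before any dataset leaves the company. This is a minimal sketch of the check itself; the records and quasi-identifiers are illustrative, and real pipelines would also generalise or suppress values to reach the threshold.

```python
from collections import Counter

def is_k_anonymous(records: list, quasi_ids: list, k: int) -> bool:
    """Every combination of quasi-identifiers must be shared by >= k records."""
    groups = Counter(tuple(r[q] for q in quasi_ids) for r in records)
    return all(count >= k for count in groups.values())

records = [
    {"age_band": "30-39", "region": "Berlin", "spend": 120},
    {"age_band": "30-39", "region": "Berlin", "spend": 80},
    {"age_band": "40-49", "region": "Munich", "spend": 200},
]

# The Munich group contains a single record, so it is re-identifiable at k=2.
print(is_k_anonymous(records, ["age_band", "region"], 2))  # False
```

A DAM system wired to this check would refuse to release the dataset to the analytics firm until the offending group is generalised or dropped.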

The key benefits of this synergy towards Privacy are:

  • Automated Compliance Validation- where a policy language defines all applicable regulatory consent rules, while the DAM system, operating on these policies, can be automated to check if access to specific PII is compliant with the recorded consent.

  • Scalable Privacy-by-Design- a policy language-driven DAM means privacy requirements are embedded directly into system design and automation pipelines from inception, preventing privacy issues before they arise.

  • Reduced Human Error- as an automated policy enforcement drastically reduces the potential for human misconfiguration or oversight.

  • Dynamic Adaptability- allowing privacy regulations to evolve or business needs to change, by updating policies in the policy language, which can swiftly propagate those changes across the DAM infrastructure.

  • Enhanced Auditability and Explainability- when regulators audit the extent of data protection enforcement, organisations can point to formalised policies (the defined rules) and DAM logs (the proof of enforcement).

Conclusion

As we have seen, effective privacy engineering bridges the divide between legal requirements and system behaviour. As privacy regulations continue to expand, organisations that invest in robust DAM frameworks and well-defined policy languages will be better equipped to manage data responsibly and adapt to future legal and technical developments. It is in the interest of companies to adopt security policy languages that enable easy yet risk-free access to data without compromising legal compliance. The ultimate aim should be to move beyond mere compliance and create a system where data protection is a foundational, verifiable part of every digital process.