digitalAudience DCR principles

Our data clean room solution is built to meet and exceed the principles outlined in the IAB guidelines, ensuring secure, privacy-first collaboration between data partners. By combining advanced privacy-enhancing technologies with strict access controls and query limitations, we enable participants to gain actionable insights without exposing raw or identifiable data. This document highlights how our implementation not only adheres to industry standards but also introduces additional safeguards and flexibility for modern data collaboration.

IAB Recommendation

What DCR principles our data collaboration solution already offers

Data Isolation

Raw or plaintext data cannot be observed or learned by any participant or DCR Provider unless the participants agree.

Data contributors cannot observe or learn any raw or plaintext data: if the client has not already hashed the data, it is hashed in the browser before it reaches data storage.

Moreover, when collaborating with other data partners, data contributors have only pre-defined capabilities. It is not possible to query the other participant’s data. A participant can only view the overlap analysis results.

Privacy Enhancing Technologies (PETs)

To minimize data movement, risk of exposure of personal data, and misuse of data for re-identification of individuals, a DCR must deploy a combination of one or more PETs.

We use a combination of encryption and private set intersection to allow data providers to view a rough estimate of a resulting intersection.

  1. Encryption

  2. Secure multi-party compute

  3. Private set intersection

  4. Federated learning

  5. Synthetic data

  6. Pseudonymization

  7. Noise injection

  8. Differential privacy

  9. K-anonymity
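As an illustration of how encryption and private set intersection can work together, the following is a minimal sketch of a Diffie-Hellman-style intersection based on commutative exponentiation. It is not a description of our production protocol: the modulus `P`, the helper names, and running both parties in one process are assumptions made for readability, and the parameters are far too small for real use.

```python
import hashlib
import secrets

# Illustrative modulus only (a Mersenne prime). A real deployment would use a
# vetted group with much stronger parameters.
P = 2**127 - 1


def hash_to_group(identifier: str) -> int:
    """Map an identifier to a non-zero element modulo P."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return 1 + int.from_bytes(digest, "big") % (P - 1)


def new_secret() -> int:
    """Draw a private exponent in the range [2, P - 2]."""
    return secrets.randbelow(P - 3) + 2


def blind(values, secret):
    """Raise every value to the secret exponent modulo P."""
    return [pow(v, secret, P) for v in values]


def intersection_size(ids_a, ids_b):
    """Count shared identifiers without comparing them in the clear.

    Both parties are simulated here. In a real exchange each party would keep
    its own secret and only ever send blinded values to the other side.
    """
    key_a, key_b = new_secret(), new_secret()
    a_once = blind([hash_to_group(i) for i in ids_a], key_a)
    b_once = blind([hash_to_group(i) for i in ids_b], key_b)
    # Exponentiation commutes: (h^a)^b == (h^b)^a mod P, so identifiers held
    # by both parties end up with the same doubly blinded value.
    a_twice = set(blind(a_once, key_b))
    b_twice = set(blind(b_once, key_a))
    return len(a_twice & b_twice)
```

Because each side only sees values blinded with a key it does not hold, neither party can read the other's identifiers, yet the doubly blinded values can still be compared for equality.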

Privacy control mechanisms:

  1. Limiting the number of queries allowed

  2. Limiting the time for which access to compute operations is allowed, and expiring data access after a certain time window

  3. Limiting the type or complexity of queries that can be executed

  4. Restricting reuse of one data set with other participants

  5. Requiring rebuild of input data sets for each operation

  6. Applying statistical noise to query results

  7. Limiting the outputs or granularity to only the insights required for the task

In technical terms, we limit the number of queries that can be executed and limit the granularity to the insights required for the task. In practice, users can perform only a few operations, and the following controls apply:

  1. Users can view the results of the private set intersection.

  2. Users can view the match key for the private set intersection.

  3. Users can apply queries from a preset list to the data sets. Users cannot modify these queries.

  4. Each collaboration executes a full recalculation of the overlap function, injecting noise to increase the ambiguity of the result set.

  5. Data involvement is limited selectively to the necessary parts.
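The combination of preset queries and noise injection can be sketched as follows. This is a hedged illustration, not our implementation: the function names, the Laplace noise distribution, and the default `scale` are all assumptions chosen for the example.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def overlap_estimate(ids_a, ids_b, scale=5.0, rng=None):
    """Return a noisy, non-negative estimate of the overlap size."""
    if rng is None:
        rng = random.Random()
    # The overlap is recomputed from the inputs on every call.
    true_count = len(set(ids_a) & set(ids_b))
    return max(0, round(true_count + laplace_noise(scale, rng)))


# Hypothetical registry: users may only run queries listed here and cannot
# submit or modify query logic themselves.
PRESET_QUERIES = {"overlap_estimate": overlap_estimate}


def run_preset(name, ids_a, ids_b, **kwargs):
    """Run a preset query by name, rejecting anything not in the registry."""
    if name not in PRESET_QUERIES:
        raise PermissionError(f"query {name!r} is not a preset query")
    return PRESET_QUERIES[name](ids_a, ids_b, **kwargs)
```

A caller would invoke `run_preset("overlap_estimate", audience_a, audience_b)` and receive only an approximate count, never the matching records themselves.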

Access Controls

DCRs provide permissions and scoped access controls to define, monitor, and control who can perform what specific action, for what purpose, at what granularity, for how long.

We provide access only to the permitted audiences for collaboration.

Moreover, we let our clients decide who can collaborate with their audiences and for how long.
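To make the idea of scoped, time-limited access concrete, here is a minimal sketch of a collaboration grant check. The `CollaborationGrant` class and its fields are hypothetical names introduced for illustration and do not reflect our actual data model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CollaborationGrant:
    """Hypothetical record of who may collaborate on which audience, until when."""
    audience_id: str
    partner_id: str
    expires_at: datetime

    def allows(self, partner_id, audience_id, now=None):
        """Return True only for the named partner and audience before expiry."""
        if now is None:
            now = datetime.now(timezone.utc)
        return (
            partner_id == self.partner_id
            and audience_id == self.audience_id
            and now < self.expires_at
        )
```

A request is served only if a matching grant exists and has not expired, so access ends automatically when the agreed time window closes.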

Data Connection

A DCR must provide a secure way for Data Contributors to connect their data to the DCR and define the format and structure (e.g. data types and data fields) to ensure Data Contributors can properly send data to the DCR.

We support clients’ own SFTP servers and S3 buckets, and we also provide clients with access to a designated S3 bucket within our system.

In addition, we provide clear guidelines on how they should format and structure their data.

Data Transformation

Data collaboration may require assembling the data in a form and shape that is ready for joining with other data sets and querying.

We make the participant’s data interoperable by converting it into the identifiers that are required by the recipient system.

Data Processing

DCRs may provide two types of processing modes - centralized and federated.

We process all data in centralized mode, after encryption, for matching. The real values are extracted within the DCRs after the matching step and are not available in the central processing environment.

Data Preparation and Protection

A DCR provides the capability to protect and secure personal data by converting it to irreversible anonymized values. This can be done inside the DCR environment if you fully trust the provider of the environment, or prior to submitting data to the DCR based upon agreed technologies and mechanisms. Some common mechanisms are:

  1. Salted hash

  2. Encryption

  3. Commutative Encryption

We encourage data participants to hash their data prior to onboarding using the SHA-256 algorithm.
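Hashing an identifier with SHA-256 before onboarding can look like the sketch below. The function name and the normalization step (trimming whitespace and lowercasing) are assumptions for illustration; the exact normalization rules should follow the formatting guidelines agreed for the collaboration.

```python
import hashlib


def hash_identifier(raw_email: str) -> str:
    """Return the SHA-256 hex digest of a normalized email address."""
    # Assumed normalization: both parties must apply the same rules,
    # otherwise identical emails produce different hashes and fail to match.
    normalized = raw_email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

Consistent normalization matters because SHA-256 is deterministic: two inputs that differ only in case or surrounding whitespace would otherwise hash to unrelated values.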

DCR Environment and Interface

We offer a user interface where users with limited or no technical knowledge can safely onboard, collaborate, and activate their data sets.

The activities that can be performed in the UI are also available through the API, enabling server-to-server operations.

Data Computation

To enable collaboration between parties, a DCR may offer different join types and matching types.

There are two Join types:

  1. Party-to-party join

  2. Multi-party join

Common Matching types are:

  1. Intersection

  2. Union

  3. Exclusion

Other data processing capabilities:

  1. Once the join and matching are completed, a DCR may offer advanced compute and querying capabilities, for example generating insights and outcomes based on predictive ML models.

  2. The DCR Provider should be transparent about features used for data modelling, e.g. LAL segments, so that the DCR consumer is aware of the attributes used in models and can remain compliant under applicable laws.

We offer two types of joins:

  1. Party-to-party join

  2. Multi-party join

We support the intersection, union, and exclusion matching types.
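The three matching types correspond to basic set operations over matched identifiers, as the sketch below shows. The function names are hypothetical, and treating exclusion as "members of the first audience that are not in the second" is an assumption about the intended semantics.

```python
def match(kind, audience_a, audience_b):
    """Apply one matching type to two audiences of (hashed) identifiers."""
    a, b = set(audience_a), set(audience_b)
    if kind == "intersection":
        return a & b  # identifiers present in both audiences
    if kind == "union":
        return a | b  # identifiers present in either audience
    if kind == "exclusion":
        return a - b  # identifiers in A that are absent from B (assumed)
    raise ValueError(f"unknown matching type: {kind!r}")


def multi_party_intersection(first, *others):
    """Intersect any number of audiences, as in a multi-party join."""
    return set(first).intersection(*(set(o) for o in others))
```

In a real clean room these operations would run over encrypted or blinded values rather than plain Python sets, and participants would see only the permitted outputs.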

Data Output

The data outputs may be aggregate or at the individual user level.

  1. Aggregate outputs

    1. Insights

      1. Customer overlap analysis

      2. Consumer segmentation

      3. LAL modelling

      4. Audience expansion

    2. Measurement

      1. Frequency/lift analysis

      2. Reach and frequency

      3. Audience verification

      4. Attribution

  2. User level output (media activation/serving)

    1. Direct activation

      1. Emerging media, CTV, streaming, audio, gaming, and retail media

      2. Walled gardens

    2. Indirect or open activation

      1. Private marketplaces and direct premium digital publishers

      2. Longtail media inventory over open bidding programmatic channels through integrated partners

We provide:

  1. Aggregate outputs

    1. Customer overlap analysis

    2. LAL modelling

    3. Audience expansion through digitalAudience data

  2. Activation into walled gardens, private marketplaces, and direct premium digital publishers
