How Identity Resolution Works

This article provides a technical explanation of how identity resolution operates within the Untitled platform, including the phases of resolution, validation logic, and matching safeguards.

Introduction

The system is designed to maintain high confidence at internet scale while balancing match rate and precision.

Identity resolution within Untitled is built on:

  • Deterministic signal triangulation

  • Distributed validation across large datasets

  • Cross-device HEM anchoring

  • Ongoing feedback loops

  • Confidence reinforcement over time

Accuracy is not binary. It is signal-weighted and environment-dependent. In production environments with sufficient traffic and activation, matching accuracy typically falls within the 90–95% range.


Overview of the Resolution Framework

Identity resolution follows a three-phase framework:

  1. Verification — Establishing deterministic linkage between digital identifiers and a primary identity marker

  2. Efficacy — Expanding that linkage through the Identity Graph and enrichment layers

  3. Accuracy — Reinforcing and improving record confidence through activation feedback loops

Each phase builds on the prior one.


Phase 1: Verification

Objective

Establish a deterministic linkage between a browser event and a primary identity marker.

Primary Identity Marker

The system uses Hashed Email Addresses (HEMs) as the foundational identity anchor within the Identity Graph.

HEMs enable:

  • Cross-device linkage

  • Offline-to-online reconciliation

  • Deterministic record reinforcement
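HEM construction is typically a normalization step followed by a cryptographic hash. A minimal sketch, assuming lowercase/trim normalization and SHA-256 (a common HEM convention; the platform's exact normalization rules are not specified here):

```python
import hashlib

def hem(email: str) -> str:
    """Normalize an email address and hash it into a HEM.

    Assumes whitespace-trim + lowercase normalization and SHA-256,
    a common industry convention; actual platform rules may differ.
    """
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# The same person yields the same anchor regardless of device or formatting:
same = hem("Jane.Doe@example.com") == hem(" jane.doe@example.com ")  # True
```

Because the hash is deterministic, the same email observed on a phone, a laptop, or in an offline CRM file resolves to the same anchor, which is what enables cross-device linkage and offline-to-online reconciliation.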

Signals Captured When the Tag Fires

When the Untitled Identity Tag executes, the system observes:

  • IP address

  • Device characteristics

  • Browser characteristics

  • Cookie identifiers

  • Timestamped engagement activity

Full specifications for these signals can be found in this article.

Triangulation Process

The system attempts to triangulate:

  • Observed IP address

  • Device-to-browser relationship

  • Cookie identifier

  • Known HEM associations

Validation includes:

  • Comparing observed IP against known IP footprint history

  • Validating device-to-HEM relationships

  • Timestamp alignment across signals

If sufficient deterministic alignment exists, a valid linkage is formed.
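The validation steps above can be sketched as a set of deterministic checks that must all pass before a linkage is formed. The field names and 24-hour skew window below are illustrative assumptions, not the platform's actual schema:

```python
from datetime import datetime, timedelta

def triangulate(event, known_ips, device_hems, max_skew=timedelta(hours=24)):
    """Return the linked HEM if all deterministic signals align, else None.

    `event` is a dict with 'ip', 'device_id', 'hem_candidate', 'timestamps'.
    Field names are illustrative, not the platform's actual schema.
    """
    checks = [
        # Observed IP appears in the candidate HEM's known IP footprint
        event["ip"] in known_ips.get(event["hem_candidate"], set()),
        # Device has a known relationship to the candidate HEM
        event["hem_candidate"] in device_hems.get(event["device_id"], set()),
        # Signal timestamps align within the allowed window
        max(event["timestamps"]) - min(event["timestamps"]) <= max_skew,
    ]
    return event["hem_candidate"] if all(checks) else None
```

Requiring every check to pass (rather than a weighted score) is what makes this phase deterministic: a linkage is either fully supported by observed signals or not formed at all.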


Why Signal Density Matters

The more frequently a visitor is observed, the higher the system's confidence in the linkage becomes.

Accuracy improves:

  • Proportionally with site scale

  • Logarithmically with session frequency

In practical terms: More tag fires → More triangulation → Higher confidence.

Low-traffic environments provide fewer signals, increasing the likelihood of ambiguity.
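The logarithmic growth described above can be illustrated with a saturating confidence function. The formula and the tuning constant `k` are assumptions chosen only to show diminishing returns per additional session, not documented platform parameters:

```python
import math

def confidence(tag_fires: int, k: float = 0.25) -> float:
    """Illustrative confidence score in [0, 1) that grows logarithmically
    with the number of observations. `k` is an assumed tuning constant."""
    return 1.0 - 1.0 / (1.0 + k * math.log1p(tag_fires))

# Each additional order of magnitude of tag fires raises confidence
# by a smaller increment -- the shape of the curve, not exact values,
# is the point of this sketch.
scores = {n: round(confidence(n), 3) for n in (1, 10, 100)}
```

This is also why low-traffic environments struggle: with only a handful of tag fires, the curve never climbs far from its starting point.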


Phase 2: Efficacy

Objective

Expand and enrich the verified identity through the Identity Graph.

Once a valid HEM linkage is established, the system maps online identifiers to:

  • B2C demographic data

  • B2B firmographic data

  • Cross-device associations

  • Behavioral attributes

Distributed Validation Model

To maintain record confidence at scale, the system relies on:

  • First-party and third-party cookie sync partnerships

  • Email engagement IP footprint validation

  • Mobile movement data (MAID/HEM associations)

  • Log-level timestamp validation

Rather than correcting individual records in isolation, the system relies on distributed signal consensus. This avoids overfitting to single data points and instead maintains majority confidence alignment across signals.
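Majority consensus across independent sources can be sketched as follows. The source names and return shape are illustrative assumptions:

```python
from collections import Counter

def consensus_attribute(observations):
    """Pick the attribute value supported by a majority of independent
    signal sources, rather than trusting any single record.

    `observations` maps source name -> observed value (illustrative schema).
    Returns (value, support_ratio), or (None, 0.0) if no majority exists.
    """
    counts = Counter(observations.values())
    value, votes = counts.most_common(1)[0]
    ratio = votes / len(observations)
    return (value, ratio) if ratio > 0.5 else (None, 0.0)
```

A single contradictory source (say, a stale cookie sync) cannot flip an attribute on its own; it only lowers the support ratio, which is the sense in which the model avoids overfitting to individual data points.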


The Self-Healing Mechanism

Records are continuously evaluated for:

  • Signal reinforcement

  • Drift detection

  • Stale or invalid attributes

If signals weaken or contradict prior associations, confidence scoring adjusts accordingly. This distributed reinforcement model enables ongoing recalibration without sacrificing match scale.
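This recalibration can be sketched as an asymmetric confidence update, with contradiction penalized more heavily than reinforcement rewards. The rates are assumed tuning constants, not documented platform values:

```python
def adjust_confidence(score, signal_agrees, reinforce=0.05, decay=0.15):
    """Nudge a record's confidence up on signal reinforcement and down on
    contradiction, clamped to [0, 1]. Rates are illustrative assumptions;
    decaying faster than reinforcing biases toward drift detection."""
    if signal_agrees:
        return min(1.0, score + reinforce)
    return max(0.0, score - decay)
```

Applied over a stream of observations, the same mechanism both hardens well-supported records and quietly retires stale ones, which is what "self-healing" amounts to in practice.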


Phase 3: Accuracy

Objective

Increase confidence through real-world activation and client-side validation.

The highest level of resolution occurs when:

  • Records are activated in marketing channels

  • A conversion or transaction occurs

  • First-party data confirms identity

When this feedback is ingested:

  • Confidence improves locally

  • Enrichment becomes more precise

  • Similar audience modeling improves

This creates a reinforcing loop: Activation → Validation → Confidence Improvement → Better Future Resolution
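Feedback ingestion can be sketched as confidence boosts weighted by the strength of the activation event. The event names and weights below are illustrative assumptions, not documented platform values:

```python
def ingest_feedback(record, event):
    """Apply one activation feedback event to a resolved record
    (illustrative schema: `record` is a dict with a 'confidence' key).

    Stronger real-world evidence earns a larger boost; a first-party
    identity confirmation outweighs a mere ad activation.
    """
    boost = {
        "activated": 0.02,               # record used in a marketing channel
        "converted": 0.10,               # a transaction occurred
        "first_party_confirmed": 0.20,   # client data confirms identity
    }
    updated = dict(record)  # leave the original record untouched
    updated["confidence"] = min(1.0, updated["confidence"] + boost[event])
    return updated
```

Each ingested event feeds the next resolution cycle, which is the loop the section describes: stronger confirmation, larger boost, better future matching.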


Deterministic vs Probabilistic Matching

Current State

The system is primarily deterministic and rule-based.

Strengths:

  • High confidence in observed signals

  • Reduced reliance on speculative modeling

  • Strong performance in high-signal environments

Limitation:

  • In low-volume settings, deterministic logic may produce spurious linkages from partial signals.


Upcoming Enhancements

We are rolling out limited probabilistic modeling designed to:

  • Evaluate statistical likelihood that multiple records belong to the same individual

  • Reduce deterministic false positives

  • Improve ambiguity handling

  • Return only records meeting defined confidence thresholds

Trade-off:

  • Slightly reduced attribute fill in smaller datasets

  • Meaningfully higher precision overall

Every resolution decision is a balance between match rate and confidence.
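The confidence-threshold behavior described above can be sketched as a simple filter. The 0.8 default is an illustrative assumption, not a documented platform setting:

```python
def filter_matches(candidates, threshold=0.8):
    """Keep only candidate matches whose modeled same-person probability
    meets the confidence threshold.

    `candidates` is a list of (record_id, probability) pairs; the schema
    and the default threshold are illustrative assumptions.
    """
    return [rid for rid, p in candidates if p >= threshold]

# Raising the threshold trades match rate for precision:
pairs = [("r1", 0.95), ("r2", 0.62), ("r3", 0.81)]
kept_default = filter_matches(pairs)       # ['r1', 'r3']
kept_strict = filter_matches(pairs, 0.9)   # ['r1']
```

This makes the stated trade-off concrete: a stricter threshold drops borderline records (lower fill in small datasets) in exchange for higher precision in what remains.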


What Can Undermine Accuracy?

Identity resolution can be impacted by:

  • Low traffic volume (<1K monthly uniques)

  • Inconsistent visitor sessions

  • Heavy post-resolution filtering

  • Small export subsets

  • QA or Dev environments

These conditions reduce signal density and increase ambiguity.
