Merényi Mónika

Understanding Entra Connect Sync Architecture: A Deep Dive - Part 1

Introduction

In today's cloud-driven world, identity synchronization is the backbone of seamless user access across hybrid environments. Microsoft Entra Connect Sync plays a crucial role in ensuring that on-premises directories and Microsoft Entra ID remain in sync.

But how does it really work under the hood? What components make up this synchronization engine? And how do sync rules define the flow of identities?

In this deep-dive series, we'll unravel Entra Connect Sync Architecture, breaking down:
✅ Core components of Entra Connect Sync
✅ Data flow and processing in synchronization
✅ Connectors, metaverse, and rules engine
✅ How synchronization rules shape identity lifecycle management

This first post focuses on understanding the architecture before we explore the intricacies of sync rules in future articles.

In my first post I quickly ran through the basic architecture of Entra Connect, and we used it in practice, but I haven't really explained how it works in detail.

I believe it will be valuable for many to explore the design and grasp the fundamental concepts.

What is Entra Connect Sync?

Entra Connect Sync bridges your on-premises Active Directory with Microsoft Entra ID by synchronizing user identities to the cloud. The data flow is fully customizable using sync rules, allowing you to control how information is synced. This ensures seamless authentication across your hybrid environment.

This topic explains how key features of the Microsoft Entra Connect Sync service work.

Sync Engine Overview

The sync engine gathers and manages identity data from multiple sources, such as Active Directory or SQL Server, creating a unified view of identities.
Any system that structures data in a database-like format and supports standard data-access methods can be a data source. These synchronized data sources are known as connected directories (CD) or connected data sources.

Connectors

The sync engine interacts with connected data sources through modules called Connectors. Each data source has a specific Connector that translates operations into a format the source understands.

Connectors use API calls to read and write identity data between the sync engine and the data source.
Custom Connectors can also be created using the extensible connectivity framework for additional integration options.
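To make the idea of a connector as a translation layer more concrete, here is a minimal Python sketch. It is only a conceptual model: the `Connector` interface, the `InMemoryConnector` class and their method names are hypothetical and are not part of Entra Connect Sync or its extensible connectivity framework.

```python
from abc import ABC, abstractmethod


class Connector(ABC):
    """Hypothetical interface: one connector per connected data source."""

    @abstractmethod
    def import_objects(self) -> dict:
        """Read identity objects from the source, keyed by an anchor value."""

    @abstractmethod
    def export_object(self, anchor: str, attributes: dict) -> None:
        """Write one object back in whatever form the source understands."""


class InMemoryConnector(Connector):
    """Stand-in for a real directory; keeps its objects in a plain dict."""

    def __init__(self, objects: dict):
        self._objects = {k: dict(v) for k, v in objects.items()}

    def import_objects(self) -> dict:
        # Return copies so the caller cannot mutate the "remote" system.
        return {k: dict(v) for k, v in self._objects.items()}

    def export_object(self, anchor: str, attributes: dict) -> None:
        # Create the object if it is missing, then merge the attributes.
        self._objects.setdefault(anchor, {}).update(attributes)


hr = InMemoryConnector({"u1": {"displayName": "Ada"}})
hr.export_object("u1", {"mail": "ada@example.com"})
print(hr.import_objects())
# {'u1': {'displayName': 'Ada', 'mail': 'ada@example.com'}}
```

The sync engine in this model only ever calls `import_objects` and `export_object`, so it never needs to know how a particular source stores its data.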

Connectors are installed on the machine that is running Entra Connect Sync.
They enable communication without requiring specialized agents by using remote system protocols. This approach reduces both deployment time and risk, particularly when integrating with critical applications and systems.

The connector handles all import and export operations, freeing developers from having to understand the native connection methods of each system.

Imports and exports are scheduled, meaning changes in the system do not automatically sync to the connected data source. Additionally, developers can create custom connectors to integrate with virtually any data source.

The default synchronization schedule for Microsoft Entra Connect Sync is set to run every 30 minutes. This means that the sync engine checks for updates and performs imports or exports at 30-minute intervals.
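As a quick illustration of what a 30-minute cycle means in practice, the small sketch below lists the next expected cycle start times after a given run. The function name and the example date are made up for illustration; this is plain date arithmetic, not a call into the Entra Connect scheduler.

```python
from datetime import datetime, timedelta

# Default interval described above (an assumption for any given deployment).
SYNC_INTERVAL = timedelta(minutes=30)


def next_sync_runs(last_run: datetime, count: int = 3) -> list:
    """Return the next `count` expected cycle start times after `last_run`."""
    return [last_run + SYNC_INTERVAL * i for i in range(1, count + 1)]


runs = next_sync_runs(datetime(2025, 1, 1, 9, 0))
print([r.strftime("%H:%M") for r in runs])
# ['09:30', '10:00', '10:30']
```

A change made right after a cycle starts can therefore wait close to a full interval before it is picked up.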

We will talk about imports and exports a little bit later.

Connector Space

Each connected data source is represented as a filtered subset of objects and attributes in its own Connector Space. The identity data is staged there temporarily, and this design allows the sync engine to operate locally, reducing the need to interact with the remote system during the sync process. It ensures that the data is properly mapped and synchronized before it is moved to its final destination.

The sync engine uses the connector space to determine what has changed in the connected data source and to stage incoming changes.
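The change detection described above can be sketched as a comparison between what is already staged and a fresh snapshot of the source. This is a simplified toy model under my own assumptions (objects as dictionaries keyed by an anchor, and a hypothetical `detect_pending_imports` function); the real engine tracks far more state.

```python
def detect_pending_imports(staged: dict, snapshot: dict) -> dict:
    """Compare staged connector space objects with a fresh source snapshot.

    Both arguments map an anchor to a dict of attributes. The result maps
    each changed anchor to "add", "update" or "delete".
    """
    pending = {}
    for anchor, attrs in snapshot.items():
        if anchor not in staged:
            pending[anchor] = "add"
        elif staged[anchor] != attrs:
            pending[anchor] = "update"
    for anchor in staged:
        if anchor not in snapshot:
            pending[anchor] = "delete"
    return pending


staged = {"u1": {"title": "Engineer"}, "u2": {"title": "Analyst"}}
snapshot = {"u1": {"title": "Senior Engineer"}, "u3": {"title": "Intern"}}
pending = detect_pending_imports(staged, snapshot)
print(pending)
# {'u1': 'update', 'u3': 'add', 'u2': 'delete'}
```

Unchanged objects never show up in the result, which is why staging keeps later processing limited to modified data.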

Metaverse

The metaverse is a central repository that consolidates identity data, acting as a unified view of identities from one or more connected data sources.
It stores and organizes identity objects and their attributes, ensuring they are mapped correctly and synchronized across systems.

The metaverse manages attribute flow, which is the process of transferring and transforming data based on predefined attribute mappings.

Attribute flow between the connector space and the metaverse is governed by synchronization rules, which define one-way data movement per rule. However, multiple rules can run in the same sync cycle, allowing bidirectional updates in different stages of synchronization.

Even with a single data source, the metaverse plays a key role in ensuring data is structured properly and updated efficiently without re-evaluating connections every time.
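Attribute flow can be pictured as a list of one-way mappings, each with an optional transformation. The sketch below is a hedged illustration only: `AttributeFlow`, `apply_flows` and the attribute names are hypothetical and do not reflect how Entra Connect stores its rules.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AttributeFlow:
    """One one-way mapping from a connector space attribute to a metaverse attribute."""
    source_attr: str
    target_attr: str
    transform: Callable = lambda value: value  # identity unless overridden


def apply_flows(cs_object: dict, mv_object: dict, flows: list) -> dict:
    """Return a copy of `mv_object` with the mapped attributes flowed in."""
    updated = dict(mv_object)
    for flow in flows:
        if flow.source_attr in cs_object:
            updated[flow.target_attr] = flow.transform(cs_object[flow.source_attr])
    return updated


flows = [
    AttributeFlow("givenName", "firstName"),
    AttributeFlow("mail", "mail", str.lower),
]
cs_object = {"givenName": "Ada", "mail": "Ada@Example.COM", "pager": "123"}
mv_object = apply_flows(cs_object, {}, flows)
print(mv_object)
# {'firstName': 'Ada', 'mail': 'ada@example.com'}
```

Note that `pager` never reaches the metaverse because no flow maps it, which mirrors the idea that only explicitly mapped attributes are transferred.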

Here you can see how multiple data sources would use the metaverse:

[Figure: multiple data sources connected to the metaverse]

Sync Engine Identity Management Process

The sync engine ensures identity updates between connected data sources through three main processes: Import, Synchronization, and Export.

Import

The sync engine retrieves the identity data from the connected data source and stages it in the connector space. It detects changes and marks them as pending import for processing. Staging objects in the connector space ensures only modified data is synchronized, improving efficiency.

Synchronization

This process updates the metaverse with new or changed data from the connector space (inbound synchronization) and propagates updates from the metaverse back to the connector space (outbound synchronization).
New objects may be created (projected), linked to existing records (joined), or updated as needed.

Export

Changes staged in the connector space as pending export are sent to the connected data source. Since the sync engine doesn’t maintain a persistent connection, it verifies changes by re-importing data to confirm successful updates.

Note: The sync engine does not maintain a live, continuous connection to external systems. Instead, it imports data on a scheduled basis to verify changes and confirm that exports were applied correctly.
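The three processes, including the confirming re-import, can be strung together in a small toy model. Everything here is an assumption made for illustration: the `run_cycle` function, the use of plain dictionaries for the two systems, and the one-to-one flow from source to target.

```python
import copy


def run_cycle(source: dict, target: dict) -> dict:
    """Run one simplified import, synchronization, export cycle.

    `source` and `target` stand in for two connected data sources and map
    an anchor to a dict of attributes. Returns, per exported anchor,
    whether the confirming import saw the expected values.
    """
    # Import: stage a local copy of each system in its own connector space.
    source_cs = copy.deepcopy(source)
    target_cs = copy.deepcopy(target)

    # Inbound synchronization: flow staged source objects into the metaverse.
    metaverse = copy.deepcopy(source_cs)

    # Outbound synchronization: stage the differences as pending exports.
    pending_export = {
        anchor: attrs
        for anchor, attrs in metaverse.items()
        if target_cs.get(anchor) != attrs
    }

    # Export: push only the pending changes to the target system.
    for anchor, attrs in pending_export.items():
        target[anchor] = dict(attrs)

    # Confirming import: re-read the target and compare with what was sent.
    reimported = copy.deepcopy(target)
    return {a: reimported.get(a) == attrs for a, attrs in pending_export.items()}


on_prem = {"u1": {"mail": "ada@example.com"}, "u2": {"mail": "bob@example.com"}}
cloud = {"u1": {"mail": "ada@example.com"}}
confirmed = run_cycle(on_prem, cloud)
print(confirmed)      # {'u2': True}
print(sorted(cloud))  # ['u1', 'u2']
```

Only `u2` is exported because `u1` already matches in the target, and the final comparison plays the role of the import that confirms the export.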

Synchronization rules

Inbound and outbound synchronization are both controlled by synchronization rules.

Inbound Synchronization Rules (ISR)

They define how the data flows from connector space to the metaverse.
They determine whether objects should be joined, projected (creating new objects), or updated, and specify which attributes should be mapped to the metaverse.

Outbound Synchronization Rules (OSR)

They define how data flows from the metaverse to the connector space. They determine whether changes in the metaverse should be applied to connected data sources, including updating attributes, provisioning (creating new objects), or deprovisioning (disconnecting or deleting) objects.

In summary: new objects may be projected into the metaverse (projection), linked to existing records (join), or updated as needed. If a metaverse object needs to be created in a connected data source, it is provisioned into the corresponding connector space.
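The difference between join and projection can be shown with a short sketch. It assumes a single hypothetical join criterion (matching on `mail`) and invented metaverse IDs; real inbound rules support richer join conditions and scoping.

```python
def inbound_sync(cs_object: dict, metaverse: dict, join_attr: str = "mail"):
    """Join a connector space object to a metaverse object or project a new one.

    `metaverse` maps a metaverse ID to a dict of attributes and is updated
    in place. Returns a (metaverse_id, action) tuple.
    """
    key = cs_object.get(join_attr)
    if key is not None:
        for mv_id, mv_object in metaverse.items():
            if mv_object.get(join_attr) == key:
                # Join: link to the existing record and flow attributes in.
                mv_object.update(cs_object)
                return mv_id, "join"
    # Projection: no match found, so create a new metaverse object.
    mv_id = f"mv-{len(metaverse) + 1}"
    metaverse[mv_id] = dict(cs_object)
    return mv_id, "project"


metaverse = {}
first = inbound_sync({"mail": "ada@example.com", "displayName": "Ada"}, metaverse)
second = inbound_sync({"mail": "ada@example.com", "department": "R&D"}, metaverse)
print(first, second)
# ('mv-1', 'project') ('mv-1', 'join')
print(metaverse)
# {'mv-1': {'mail': 'ada@example.com', 'displayName': 'Ada', 'department': 'R&D'}}
```

The second object comes from a different source but shares the same `mail` value, so it is joined to the existing metaverse object instead of creating a duplicate identity.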

In Part 2, we’ll see more about the objects created in each space!
