Identification Management SAD wiki

Title page

MDM Hub Software Architecture Document – Identification Management

Introduction
This document provides a comprehensive overview of the software architecture components that support Identification Management in the Client Hub. One of the main purposes of the Client Hub is to raise alerts for suspected duplicate profiles that may exist within the system for a given party, and to provide tools for managing those duplicates (e.g. merging profiles). The Identification Management process is necessary to support these requirements.

Purpose
This document provides an architectural overview of the ETL processing system and the Hub Services, using a number of use cases and architectural views to depict different aspects of Identification Management within the Client Hub. It is intended to capture and convey the significant architectural decisions that have been made about the system.

Scope
This document focuses on the ETL software components and Hub Services that manage profiles of client data records as part of Identification Management.

Definitions, Acronyms and Abbreviations
Please refer to the Glossary section in the Appendix, which contains a list of the main definitions, acronyms and abbreviations used in this document.

Overview
The document begins with a high-level architectural representation of the overall processes and then describes each architecture component in more detail.

Architectural Representation
The high-level Identification Management architecture has two main components: 1) the Identification Management ETL processing architecture and 2) the Hub Services architecture.

''The ETL architecture component includes the source systems that provide data for the Client Hub, and the Client Hub itself, which is the target of the ETL processing. It also includes the External Data Provider, which is used for data enrichment and for obtaining links between client records based on matching results provided by the vendor system. The software components of the ETL architecture are the programs used to extract, transform, and load data into the databases within the Client Hub environment.''

The Hub Services architecture component comprises the services required for Identification Management tasks.

ETL System Components

 * Source Systems:
 * Client Hub's Databases:
 * External Data Providers (if any, for data enrichment)

ETL Software Components

 * Extract Program: A program residing on the source system that extracts data and provides the extract to the external vendor for the Identification Link
 * ETL: ETL processes involved in extracting, transforming, and loading data into a database within the Client Hub
 * Supporting ETL: Software programs that are auxiliary to the main ETL processes. They are responsible for sending client data to the external vendor through FTP, receiving and processing enriched data from the external vendor, logging the execution status of ETL jobs, processing exception-handling rules, and capturing exception data.

Hub Services Components (Examples)

 * GUID Management Services
 * Key Generation Services:
 * Cross Reference Services
 * Business Enterprise Services:

Architectural Goals and Constraints
List the key high-level architectural goals and constraints related to the ETL processes and Hub Services.

Use-Case View
Examples of the use cases within the Client Hub:


 * Obtain External Link
 * Compute GUID
 * Match profile records
 * Assign GUID
 * Merge/Split Client Profile records
 * Update Cross Reference
 * Log exceptions

Initial Load
When source data from the Enterprise Business Systems is loaded into the staging area, Client Hub profiles are created. Each profile created in the staging area is assigned a unique profile ID. The goal of the GUID computation process is to assign a GUID to each profile. When the same GUID is assigned to multiple profiles, it signifies that those profiles belong to the same party according to the defined matching rules.

If a profile meets the minimum requirements for an External Identification Link assignment, its name and address information is sent to the vendor along with the profile key, and the vendor returns the External Link. If the minimum requirements for GUID calculation are met, the GUID is computed. For some profiles the data may not be sufficient to obtain the External Identification Link and/or the GUID; in these scenarios the corresponding values remain blank.
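The eligibility checks described above can be sketched as follows. This is only an illustrative sketch: the document does not define the actual minimum requirements, so the rules in `meets_link_minimum` and `meets_guid_minimum`, the `Profile` fields, and the shape of the vendor record are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Profile:
    """Hypothetical staging-area profile; field names are assumptions."""
    profile_id: int
    name: Optional[str] = None
    address: Optional[str] = None
    ssn: Optional[str] = None
    dob: Optional[str] = None
    external_link: Optional[str] = None  # stays blank if never obtained
    guid: Optional[str] = None           # stays blank if never computed


def meets_link_minimum(p: Profile) -> bool:
    # Assumed rule: the vendor needs both a name and an address.
    return bool(p.name and p.address)


def meets_guid_minimum(p: Profile) -> bool:
    # Assumed rule: a name plus at least one other identifying attribute.
    return bool(p.name) and any([p.ssn, p.dob, p.address])


def build_vendor_record(p: Profile) -> dict:
    # Name and address are sent together with the profile key.
    return {"profile_key": p.profile_id, "name": p.name, "address": p.address}


def select_for_vendor(profiles: List[Profile]) -> List[dict]:
    # Only profiles that meet the link minimum are sent to the vendor.
    return [build_vendor_record(p) for p in profiles if meets_link_minimum(p)]


profiles = [
    Profile(1, name="Ann Lee", address="1 Main St"),
    Profile(2, name="Bob Ray"),   # no address: not sent to the vendor
    Profile(3),                   # insufficient data: link and GUID stay blank
]
vendor_batch = select_for_vendor(profiles)
guid_candidates = [p.profile_id for p in profiles if meets_guid_minimum(p)]
```

Profiles that fail both checks keep `external_link` and `guid` set to `None`, which corresponds to the blank values mentioned above.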

The following table illustrates the order in which the use cases are invoked to implement the scenario, along with the functionality provided by each use case.

Change Data Capture
When a critical piece of data changes on a profile, its GUID may need to be recomputed. This GUID change may in turn trigger GUID re-assignment on other profiles. The following example illustrates the scenario.

Initially all three records were assigned different GUIDs, which indicates that the profiles represent three different parties. The end user updated the second record by providing the SSN and DOB.



After the change, the first record is linked to the second through the SSN and address, and the second and third records are linked by name, address, and date of birth. As a result, all three records share the same GUID. This chaining effect links record 1 and record 3 even though no data changed on either profile and they represented different parties before the second record was updated.
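The chaining effect can be reproduced with a small grouping sketch. It is a hedged illustration, not the Client Hub's actual algorithm: it assumes two matching rules taken from the example (SSN plus address, and name plus address plus date of birth), groups matching profiles with a union-find structure, and uses a made-up GUID format (`G-` followed by the lowest profile ID in the group). The sample data values are invented.

```python
from itertools import combinations

# Assumed matching rules: each tuple lists fields that must all be
# present and equal for two profiles to be considered the same party.
RULES = [("ssn", "address"), ("name", "address", "dob")]


def same(a, b, fields):
    return all(a.get(f) is not None and a.get(f) == b.get(f) for f in fields)


def matches(a, b):
    return any(same(a, b, rule) for rule in RULES)


def assign_guids(profiles):
    """Group transitively matching profiles and give each group one GUID."""
    parent = {p["id"]: p["id"] for p in profiles}

    def find(x):
        # Follow parent pointers to the group root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(profiles, 2):
        if matches(a, b):
            ra, rb = find(a["id"]), find(b["id"])
            if ra != rb:
                parent[max(ra, rb)] = min(ra, rb)  # lowest id becomes root

    # Hypothetical GUID format, chosen only to keep the output readable.
    return {p["id"]: f"G-{find(p['id'])}" for p in profiles}


records = [
    {"id": 1, "name": "J Smith", "address": "1 Main St", "ssn": "000-00-0001"},
    {"id": 2, "name": "John Smith", "address": "1 Main St"},
    {"id": 3, "name": "John Smith", "address": "1 Main St", "dob": "1970-01-01"},
]
before = assign_guids(records)

# The end user adds the SSN and DOB to the second record.
records[1].update(ssn="000-00-0001", dob="1970-01-01")
after = assign_guids(records)
```

In this sketch `before` holds three distinct GUIDs, while `after` holds a single GUID for all three profiles, even though records 1 and 3 never match each other directly. One design consequence is that a change to a single profile requires re-evaluating every profile in the groups it touches, not just the changed record.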

Sequence Diagram
A sequence diagram illustrating the order of use case interactions and the system boundary within which each use case resides.

Functionality
''This section describes the functionality of the ETL processing software component, with some examples.''

Sample functionality is described below.

Logical View
This section should provide a logical view of the services, the system, and the database components.

Logical View – ETL Services



Logical View - Hub Services



Process View
The process view of the ETL processing architecture reveals ETL processes that support the Identification Management requirements.

Example: Client Hub-to-External Vendor ETL
''The ETL software component is primarily responsible for sending data files and control files to the vendor via the Enterprise Data Exchange Server. It creates a package of client data elements by identifying the profile records that need to be sent out.''

The major processing steps for the component are given below:


 * Flow 1A: Extract Profile Data from Client Hub
 * Flow 1B: Validate Profile Data
 * Flow 1C: Transform Data into Vendor Format
 * Flow 1D: Generate data files
 * Flow 1E: FTP Data Files via Enterprise Data Exchange Server
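The five flows above can be sketched as a small pipeline. Everything specific here is an assumption made for illustration: the `pending_vendor_send` flag, the validation rule, the pipe-delimited vendor layout, the file names, and the control-file content are hypothetical, and the transfer step is passed in as a `send` callable so that no real FTP connection to the Enterprise Data Exchange Server is implied.

```python
import csv
import io


def extract_profiles(hub_rows):
    """Flow 1A: pick the profile records that need to be sent out."""
    return [r for r in hub_rows if r.get("pending_vendor_send")]


def validate(profiles):
    """Flow 1B: split records into valid and rejected (assumed rule)."""
    valid, rejected = [], []
    for p in profiles:
        (valid if p.get("name") and p.get("address") else rejected).append(p)
    return valid, rejected


def to_vendor_format(p):
    """Flow 1C: map a profile to the (hypothetical) vendor column layout."""
    return [str(p["profile_key"]), p["name"].strip().upper(),
            p["address"].strip().upper()]


def generate_files(rows):
    """Flow 1D: build a data file and a control file as in-memory text."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="|", lineterminator="\n").writerows(rows)
    return {
        "profiles.dat": buf.getvalue(),
        "profiles.ctl": f"RECORD_COUNT={len(rows)}\n",
    }


def run_flow(hub_rows, send):
    """Run flows 1A to 1E; `send(name, content)` stands in for the transfer."""
    valid, rejected = validate(extract_profiles(hub_rows))
    files = generate_files([to_vendor_format(p) for p in valid])
    for name, content in files.items():
        send(name, content)  # Flow 1E
    return rejected  # candidates for exception logging


hub_rows = [
    {"profile_key": 1, "name": " Ann Lee ", "address": "1 Main St",
     "pending_vendor_send": True},
    {"profile_key": 2, "name": "Bob Ray", "address": None,
     "pending_vendor_send": True},
    {"profile_key": 3, "name": "Cy Poe", "address": "9 Elm St",
     "pending_vendor_send": False},
]
sent = {}
rejected = run_flow(hub_rows, sent.__setitem__)
```

Keeping the transfer behind a callable lets the same flow be exercised in tests with a dictionary, as here, and wired to the real file-transfer mechanism in deployment.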

Deployment View
The deployment view of the ETL and Hub Services architecture for Identification Management shows how these architecture components reside in the source systems, the ETL environment, and the Client Hub environment.



Overview (Example)
Major Processing Steps and Design Specifications:

FTP Client Hub data to Vendor System

 * Program Logic Overview
 * Program Specifications

The following exception-handling rules are defined for this program:
 * Exception Handling

Data View
This section details each data architecture component that is relevant to the ETL processing.

Examples: source file names, details of the relevant data models, etc.

Deployment Information of Software Components
The section is for informational purposes only.