Enterprise Information Architecture: Strategic thinking around Data and Processes - DATAVERSITY
We will write about everything data-related, from translating "data" speak into "business" speak, to governance models, to the real differences among the myriad software tools available.
But there's one catch: that's right, no 30,000-foot theoretical strategies that leave you wondering how to execute and actually improve performance.
Visit regularly to learn from peers and partners how they are managing and improving data, and we hope you'll also share your views and experiences. Kelle can be reached at kelle firstsanfranciscopartners. Be sure to visit today!

In a buzzword-heavy industry, there is a lot of confusion around the difference between Data Management and Information Management. After all, isn't data information?
Well, yes and no. Wikipedia's definition of data makes the point: information is provided by data, but only because data is always specified in some abstract setting, which includes:

- the class to which the attribute belongs,
- the object which is a member of that class, and
- some ideas about object operations or behavior, and relationships to other objects and classes.
Data alone and in the abstract, therefore, does not provide information.

The data management capability must provide:

- Efficient Create, Read, Update, and Delete (CRUD) functions for transactional systems processing structured Operational Data
- Appropriate enforcement of access rights to data, so that only authenticated and authorized users can work with the data
- Low administration costs through efficient administration interfaces and autonomics
- Business resiliency of the operational applications through proper continuous availability functions, including high availability and disaster recovery

In essence, the data management capability provides all functions needed by transactional systems, such as order entry or billing applications, to manage structured Operational Data across its lifecycle.
The content management capability must provide:

- Compliance with legal requirements (for example, e-mail archiving)
- Efficient management of Unstructured Data (for example, insurance contracts)
- Delivery of content for web applications (for example, images for e-commerce solutions)
- Appropriate enforcement of access rights to Unstructured Data, so that only authenticated and authorized users can work with the data
- Business resiliency of the operational applications through proper continuous availability functions, including high availability and disaster recovery
- Comprehensive content-centric workflow capabilities, enabling, for example, workflow-driven management of insurance contracts

This capability enables end-to-end management of the Unstructured Data needed in many industries, such as the insurance industry.
It also enables compliance solutions for e-mail archiving. Identity Analytics can be applied to mitigate fraud or to improve homeland security by discovering non-obvious and hidden relationships. Data Warehouses (DWs) are the foundation of reporting for business analysts and report on historical data—the past.
For example, analyzing blog posts regarding a product to find out what features customers like or which parts they reported broke most often alongside selling statistics provides new insight.
This insight is unavailable with reporting on Structured Data only. Discovery mining in a DW allows a business to discover patterns. An example would be association rule mining to find out which products are typically bought together. Building a DW with enterprise scope where Operational Data from heterogeneous sources must be extracted, cleansed, and harmonized before it is loaded into the DW system has a strong dependency on the EII capability.
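The association-rule idea mentioned above can be sketched with a minimal co-occurrence counter. This is an illustrative toy, not a production mining algorithm; the function and product names are hypothetical:

```python
from itertools import combinations
from collections import Counter

def frequent_pairs(transactions, min_support):
    """Count how often each pair of products is bought together and
    keep only the pairs whose support meets the threshold."""
    pair_counts = Counter()
    for basket in transactions:
        # sorted(set(...)) makes each pair canonical and ignores duplicates
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    n = len(transactions)
    return {pair: count / n for pair, count in pair_counts.items()
            if count / n >= min_support}

baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "cereal"],
    ["bread", "butter", "cereal"],
]
print(frequent_pairs(baskets, min_support=0.5))
# ('bread', 'butter') appears in 3 of 4 baskets -> support 0.75
```

A real deployment would use a proper algorithm such as Apriori or FP-Growth over DW fact tables, but the support calculation is the same idea.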
Here, two areas emerge as requirements for satisfying business needs by looking into the present or the future. Predictive Analytics are capabilities allowing the prediction of certain values and events in the future. For example, based on the electricity consumption patterns of the past, an energy provider would like to predict spikes in consumption in the future to optimize energy creation and to reduce loss in the infrastructure.
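As a rough illustration of the consumption-forecasting idea, the sketch below fits a least-squares trend line through historical readings and extrapolates one period ahead. Real predictive analytics would use far richer models; the function name and data are invented for the example:

```python
def predict_next(consumption):
    """Fit a least-squares line through historical consumption
    readings and extrapolate one period ahead (a naive forecast)."""
    n = len(consumption)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(consumption) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, consumption))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept + slope * n  # value at the next time step

# Hypothetical hourly megawatt readings trending upward:
history = [100, 104, 108, 112, 116]
print(predict_next(history))  # 120.0
```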
Real Time Analytics are capabilities to address the need to analyze large-scale volumes of data in real time. Real Time Analytics capabilities consist of the ability to trickle-feed data in real time into the DW (see Chapters 8 and 13 for details on trickle feeds), the ability to execute complex reporting queries in real time within the DW, and the ability to deliver analytical insight in real time to front-line applications.
Stream analytics are another Real Time Analytics capability, which is introduced in section 4. This capability includes the ability to:

- Monitor and measure against the defined KPIs on an ongoing basis
- Visualize the measurements in a smart way, enabling rapid decision making
- Complement the visualization with trust indices about the quality of the underlying data, putting the results in context regarding their trustworthiness
- Intelligently act if the measurement of the KPIs indicates a need to act
- Trigger events and notifications to business users if there are abnormalities in the data

This capability often depends on strong analytical application capabilities. One example is data harmonization from various Operational Data sources into an enterprise-wide DW.
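The monitor-and-trigger loop can be sketched in a few lines. The KPI names, threshold shape, and event format below are all invented for illustration:

```python
def check_kpis(measurements, thresholds):
    """Compare each KPI measurement against its (low, high) threshold
    band and return a notification event for every abnormality."""
    events = []
    for kpi, value in measurements.items():
        low, high = thresholds[kpi]
        if not low <= value <= high:
            events.append({"kpi": kpi, "value": value, "alert": "out of range"})
    return events

alerts = check_kpis(
    {"order_cycle_days": 9, "return_rate": 0.02},          # current readings
    {"order_cycle_days": (0, 5), "return_rate": (0, 0.05)},  # acceptable bands
)
print(alerts)  # one alert, for order_cycle_days
```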
For cost and flexibility reasons, new applications should not be implemented in such a way that they are tied to specific versions of the various, heterogeneous data sources whose complexity must be hidden from them. Thus, federated access must be available. Re-use of certain data cleansing functions, such as standardization services in an SOA, must be supported to achieve data consistency and improve data quality on data entry.
This requires the ability to deploy data quality functions as services. Extract-Transform-Load (ETL) typically identifies EII capabilities to extract data from source systems, transform it from the source to the target data model, and finally load it into the target system. ETL is thus most often a batch-mode operation. Typical characteristics are that the data volumes involved are generally large, the process and load cycles are long, and complex aggregations and transformations are required.
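The three ETL steps can be made concrete with a minimal batch sketch using only the standard library. The CSV layout, table name, and column mapping are assumptions for the example:

```python
import csv
import io
import sqlite3

def etl(source_csv, target_db):
    """Minimal batch ETL: extract rows from a CSV export, transform
    them to the target model (rename fields, cast types), and load
    them into a DW table."""
    # Extract: read the source export
    rows = list(csv.DictReader(io.StringIO(source_csv)))
    # Transform: map source fields (cust, amt) to the target model
    records = [(r["cust"], float(r["amt"])) for r in rows]
    # Load: write into the target warehouse table
    con = sqlite3.connect(target_db)
    con.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    con.executemany("INSERT INTO sales VALUES (?, ?)", records)
    con.commit()
    return con

con = etl("cust,amt\nacme,10.5\nglobex,7.25\n", ":memory:")
total = con.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(total)  # 17.75
```

Real ETL platforms add the surrounding machinery the text describes: metadata, re-usable transformations, scheduling, and error handling.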
During the last two to three years, ETL in many enterprises changed from custom-built environments with little or no documentation to a more integrated approach using suitable ETL platforms.
Improved productivity was the result of object and transformation re-use, strict methodology, and better Metadata support—all functions provided by the new ETL platforms. A discipline known as Enterprise Application Integration (EAI) is typically considered for solving application integration problems in homogeneous as well as heterogeneous application environments.
Historically, applications were integrated with point-to-point interfaces between the applications—this approach of tightly coupled applications failed to scale with a growing number of applications in an enterprise because the maintenance costs were simply too high. Every time an application was changed, the point-to-point connections themselves, as well as the applications on the other end of the connection, had to be changed too. Avoiding these costs and increasing flexibility for evolving applications were drivers of the widespread deployment of an SOA.
As a result, the applications became more loosely coupled, creating more agility and flexibility for business process orchestration. Now, in many cases, applications are integrated with an Enterprise Service Bus (ESB) based on message-oriented middleware. With that, an interface change in one application can be hidden from other applications, because the ESB can hide this change through mediation, using a new interface map.
Compared to point-to-point connections, this is a significant advantage. IT Architects now have an abundance of materials available for this domain. High latency is the major disadvantage of traditional ETL for moving data from one application to the next.
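The mediation idea is simple enough to sketch: the bus rewrites a producer's field names to what each consumer expects, so a renamed field does not ripple through every connected application. The field and application names are hypothetical:

```python
def mediate(message, interface_map):
    """ESB-style mediation: translate a producer's message fields to a
    consumer's expected field names via an interface map, so interface
    changes stay hidden behind the bus."""
    return {interface_map.get(key, key): value for key, value in message.items()}

# Suppose the (hypothetical) order application renamed 'customer_id'
# to 'custId'; the billing application still sees the old name.
order_v2 = {"custId": 42, "total": 99.0}
print(mediate(order_v2, {"custId": "customer_id"}))
# {'customer_id': 42, 'total': 99.0}
```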
Streams techniques and certain EAI techniques based on ESB infrastructure can be used to solve the problem of high latency for data movement. For example, if a customer places an order through a website of the e-commerce platform and expects product delivery in 24 hours or less, a weekly batch integration to make fulfillment and billing applications aware of the new order is inappropriate.
Today, EAI solves this by providing asynchronous and synchronous near-real-time and real-time capabilities useful for data synchronization across systems.
EAI can effectively move data among systems in real time, but it does not define an aggregated view of the data objects or business entities, nor does it deal with complex aggregation problems. It handles transformations of data generally only at the message level. To date, the term EII has typically been used to summarize data placement capabilities based on data replication techniques, together with capabilities to provide access to data across various data sources.
Providing a unified view of data from disparate systems comes with a unique set of requirements and constraints. First, the data should be accessible in a real-time fashion, which means that we should be accessing current data on the source systems as opposed to accessing stale data from a previously captured snapshot.
Second, the semantics, or meaning, of data needs to be resolved across systems. Different systems might represent the data with different labels and formats that are relevant to their respective uses, but that requires some sort of correlation by the end user to be useful. Duplicate entries should be removed, validity checked, labels matched, and values reformatted. The challenges with this information integration technique involve governing the use of a collection of systems in real time and creating a semantic layer that maps all data entities into a coherent view of the enterprise data.
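The label-and-format resolution step can be sketched as a tiny mapping function over two hypothetical systems (a CRM and an ERP, with invented field names), which projects both onto one shared schema and normalizes formats:

```python
def unify(crm_record, erp_record):
    """Resolve semantics across two systems: map differently labelled
    fields onto one shared schema and reconcile their formats."""
    return {
        # CRM stores lower-case names; normalize capitalization
        "customer_name": crm_record["name"].title(),
        # ERP labels the phone number 'tel' and uses dashes; strip them
        "phone": erp_record["tel"].replace("-", ""),
    }

crm = {"name": "jane doe"}
erp = {"tel": "555-0100"}
print(unify(crm, erp))  # {'customer_name': 'Jane Doe', 'phone': '5550100'}
```

A real semantic layer generalizes this into declarative mappings maintained as Metadata rather than hand-written code.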
These techniques for information integration are applied across all five data domains: We briefly introduce the new set of EII capabilities which are described in more detail in Chapter 8: Discover capabilities—They detect logical and physical data models as well as other Technical and Business Metadata.
They enable understanding of the data structures and business meaning. Profile capabilities—They consist of techniques such as column analysis, cross-table analysis, and semantic profiling.
They are applied to derive the rules necessary for data cleansing and consolidation of data because they unearth data quality issues in the data, such as duplicate values in a column supposedly containing only unique values, missing values for fields, or non-standardized address information.
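A minimal column-analysis pass, as described above, can be written directly: it flags duplicate values in a column that is supposed to be unique and counts missing values. The column contents are invented for the example:

```python
from collections import Counter

def profile_column(values):
    """Simple column analysis: report duplicates in a supposedly
    unique column and count missing (None) values."""
    counts = Counter(v for v in values if v is not None)
    return {
        "duplicates": sorted(v for v, c in counts.items() if c > 1),
        "missing": sum(1 for v in values if v is None),
    }

# A customer-ID column that should contain only unique, non-null values:
customer_ids = ["C1", "C2", None, "C2", "C3", None]
print(profile_column(customer_ids))
# {'duplicates': ['C2'], 'missing': 2}
```

Cross-table analysis and semantic profiling extend the same idea across columns and tables.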
Cleanse capabilities—They improve data quality. Name and address standardization, data validation (for example, address validation against postal address dictionaries), matching to identify duplicate records (enabling reconciliation through survivorship rules), and other data cleansing logic are often used. Transform capabilities—They are applied to harmonize data. A typical example is the data movement from several operational data sources into an enterprise DW.
In this scenario, transformation requires two steps: first, the structural transformation of a source data model to a target data model; second, a semantic transformation mapping code values in the source system to appropriate code values in the target system. Replicate capabilities—They deliver data to consumers. Typical data replication technologies are database-focused, using either trigger-based or transaction-log-based Change Data Capture (CDC) mechanisms to identify the deltas requiring replication.
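Log-based CDC can be illustrated with a toy transaction log: the replicator remembers the last log sequence number (LSN) it shipped and emits only entries past that position. The log format is invented for the sketch:

```python
def capture_changes(txn_log, last_lsn):
    """Log-based CDC sketch: scan the transaction log past the last
    replicated position (LSN) and return only the deltas, plus the
    new high-water mark."""
    deltas = [entry for entry in txn_log if entry["lsn"] > last_lsn]
    new_lsn = max((e["lsn"] for e in deltas), default=last_lsn)
    return deltas, new_lsn

log = [
    {"lsn": 1, "op": "INSERT", "row": {"id": 1, "name": "Acme"}},
    {"lsn": 2, "op": "UPDATE", "row": {"id": 1, "name": "Acme Corp"}},
    {"lsn": 3, "op": "INSERT", "row": {"id": 2, "name": "Globex"}},
]
deltas, lsn = capture_changes(log, last_lsn=1)
print(len(deltas), lsn)  # 2 3
```

Trigger-based CDC achieves the same delta identification by writing changes to a shadow table at commit time instead of reading the log.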
Federate capabilities—They provide transparent and thus virtualized access to heterogeneous data sources. From an information-centric view, federation is the topmost layer of virtualization techniques.
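Federation can be sketched by answering one logical query against two independent live databases, with no pre-copied snapshot. Two in-memory SQLite databases stand in for heterogeneous sources; the schemas and names are assumptions:

```python
import sqlite3

def federated_customers(crm, billing):
    """Virtualized access: answer one query by pulling current rows
    from two independent databases and merging them on the fly."""
    rows = crm.execute("SELECT id, name FROM customers").fetchall()
    balances = dict(billing.execute("SELECT cust_id, balance FROM accounts"))
    # Merge the live views; customers without an account get balance 0.0
    return [(cid, name, balances.get(cid, 0.0)) for cid, name in rows]

crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
crm.execute("INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex')")

billing = sqlite3.connect(":memory:")
billing.execute("CREATE TABLE accounts (cust_id INTEGER, balance REAL)")
billing.execute("INSERT INTO accounts VALUES (1, 250.0)")

result = federated_customers(crm, billing)
print(result)  # [(1, 'Acme', 250.0), (2, 'Globex', 0.0)]
```

A federation engine does this transparently behind a single SQL interface, with query optimization across the sources; the sketch only shows the virtualization idea.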