DM – Part 1: Data Modeling Overview

Advanced Data Modeling and Architecture 

Table of Contents

1.1: Prerequisites for this Session

 

Prerequisites

  • Should be aware of DBMS concepts
  • Should be aware of data warehouse concepts
  • Should be aware of database concepts (any database)
  • Should have 2+ years of experience with any ETL tool

1.2: Expectations from Session

1.3: Scope of the Session

  1. Data Model Overview
  2. Data Models
  3. How to Model Data
  4. Data Modeling
    – Conceptual Data Model
    – Relational Data Modeling
    – Dimensional Data Modeling
    – Logical Data Modeling
    – Physical Data Modeling
  5. Data Warehouse Database Administration and Performance Improvements
  6. Data Modeling for Data Warehouse Environment
  7. Introduction to Erwin Tool

Data Model Overview

1.4: Introduction to Data Models

Fundamentals of Database Systems

 

Data

Known facts that can be recorded and that have implicit meaning

Database

A collection of related data with the following implicit properties:

– A database represents some aspect of the real world, sometimes called the Universe of Discourse (UoD)

– A database is a logically coherent collection of data with some inherent meaning

– A database is designed, built, and populated with data for a specific purpose

Database Management System (DBMS)

– A collection of programs that enables users to create and maintain a database

– A general-purpose software system that facilitates the process of defining, constructing, and manipulating databases for various applications

Database System

– The database and the DBMS software together form a database system

What is Data Modeling?

Data modeling is a technique for exploring the data structures needed to support an organization’s information needs.

A data model is a conceptual representation of the data structures required in the database system.

A data model focuses on which data is required and how the data should be organized.

At the conceptual level, the data model is independent of any hardware or software constraints.
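As an illustration only, a conceptual model can be sketched as plain entities, attributes, and relationships with no data types, keys, or storage details. The `Customer` and `Order` entities and the `places` relationship below are hypothetical examples, not part of the course material.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Attribute:
    """A named piece of information about an entity (no data type yet)."""
    name: str


@dataclass
class Entity:
    """A real-world thing the business needs to keep data about."""
    name: str
    attributes: list[Attribute] = field(default_factory=list)


@dataclass(frozen=True)
class Relationship:
    """A named association between two entities, with its cardinality."""
    name: str
    source: str
    target: str
    cardinality: str  # for example "1:N"


# Hypothetical entities, chosen only for illustration.
customer = Entity("Customer", [Attribute("name"), Attribute("email")])
order = Entity("Order", [Attribute("order_date"), Attribute("total_amount")])
places = Relationship("places", source="Customer", target="Order", cardinality="1:N")


def describe(rel: Relationship) -> str:
    """Render a relationship as a short business-readable sentence."""
    return f"{rel.source} {rel.name} {rel.target} ({rel.cardinality})"


print(describe(places))
```

Nothing in this sketch mentions a DBMS, column types, or indexes; those decisions belong to the logical and physical models.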

Why Use Data Modeling?

Leverage

– A data model serves as a blueprint for the database system.

Conciseness

– A data model functions as an effective communication tool for discussions with users.

Data Quality

– A data model acts as a bridge from real-world information to the database that stores the relevant data content.

What Makes a Good Data Model?

Completeness

– Ensure that every piece of information required by the system is recorded and maintained.

Non-Redundant

– One fact should be recorded only once. Repetition may result in inconsistency and increased storage requirements.

Adherence to Business Rules

– Data must be recorded in accordance with all business rules and must not violate any of them.

Data Reusability

– Design data structures so that they can be reused.

Stability and Flexibility

– A model needs to be flexible enough to adapt to changes without forcing the programmer to rewrite the code.

Elegance

– A data model should neatly present the required data in the least possible number of groups or tables.

Communication

– A model should present the data in a manner understandable to all stakeholders.

Integration

– A good model is compatible with existing and future systems.
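Two of these qualities, non-redundancy and adherence to business rules, can be sketched with Python's built-in `sqlite3` module. The `customer` and `customer_order` tables, their columns, and the rule "an order amount must be positive" are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical schema: the customer's city is stored once, in the customer table,
# instead of being repeated on every order row.
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    city        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    amount      REAL NOT NULL CHECK (amount > 0)  -- assumed business rule
);
""")

conn.execute("INSERT INTO customer VALUES (1, 'Asha', 'Pune')")
conn.executemany(
    "INSERT INTO customer_order VALUES (?, ?, ?)",
    [(10, 1, 250.0), (11, 1, 75.5)],
)

# Non-redundancy: a single UPDATE corrects the city for every order.
conn.execute("UPDATE customer SET city = 'Mumbai' WHERE customer_id = 1")
cities = [
    row[0]
    for row in conn.execute(
        "SELECT c.city FROM customer_order o "
        "JOIN customer c USING (customer_id) ORDER BY o.order_id"
    )
]
print(cities)

# Business rules: the database rejects an order that violates the CHECK constraint.
try:
    conn.execute("INSERT INTO customer_order VALUES (12, 1, -5.0)")
    rule_enforced = False
except sqlite3.IntegrityError:
    rule_enforced = True
print(rule_enforced)
```

If the city were copied onto each order row instead, the same correction would need one update per order, and a missed row would leave the data inconsistent.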

1.5: Sample ER Data Model

1.6: Challenges, Benefits & Opportunities

Challenges

Organizations today have vast quantities of data. Although this data contains information that is useful to the business, it can be extremely difficult to gather and report on this information. There are several key challenges that need to be addressed:

  • Discovering, collecting, and transforming data into a single source of record.
  • Ensuring that data is relevant and accurate for business reporting.
  • Storing historical data in a format that enables fast searches across large amounts of data.

Opportunities

If the data that a business holds can be unlocked to provide meaningful information to business users, there are several opportunities:

  • Business users can access relevant information that enables them to make informed decisions.
  • Fast queries make data more accessible, and historical data enables trends to be identified.
  • Data can be assessed at every level, from an individual purchase to the total sales of a multinational corporation.

Benefits

Designing data warehouses correctly by using a data model will help meet many of today’s data challenges. Key benefits include:

  • Designing structures specifically to enable fast querying for business-centric reporting.
  • Ensuring that business requirements are met, and reports are accurate and meaningful.
  • Documenting source and target systems correctly to aid development, ensure effective version control, and enhance understanding of the systems.

1.7: Why Use a Data Modeling Tool?

Why use a data modeling tool?

  • Improves productivity among developers when database designs are divided, shared, and reused
  • Establishes corporate modeling standards
  • Creates good documentation (metadata) in a variety of useful formats
  • Ensures consistency, reuse, and integration of enterprise data
  • Enables creating the data model in one notation and converting it to another notation without losing the meaning of the model
  • Saves time by accelerating the creation of high-quality, high-performance physical databases from the logical model
  • Conserves resources and improves accuracy by synchronizing the model and the database

1.8: Requirements for a Good Data Modeling Tool

Requirements for a good data modeling tool

  • Diagram Notation
    • Both ER & dimensional modeling notation must be available
  • Reverse Engineering
    • Creation of a model based on the source data in the operational environment as well as from other external sources of data
  • Forward Engineering
    • Creation of the data definition language (DDL) for the target tables in the data warehouse databases
  • Source to Target Mapping
    • Linking of source data in the operational systems and external sources to the data in the databases in the target data warehouse
  • Data Dictionary or Repository
    • Contains metadata that describes the data model
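Forward and reverse engineering can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how ERwin or any other tool works internally: the `logical_model` dictionary, the `dim_product` and `fact_sales` tables, and the `forward_engineer` and `reverse_engineer` helpers are all hypothetical, and SQLite stands in for the target data warehouse database.

```python
import sqlite3

# Hypothetical logical model: table name -> list of (column, type, is_primary_key).
logical_model = {
    "dim_product": [
        ("product_key", "INTEGER", True),
        ("product_name", "TEXT", False),
    ],
    "fact_sales": [
        ("sales_key", "INTEGER", True),
        ("product_key", "INTEGER", False),
        ("sales_amount", "REAL", False),
    ],
}


def forward_engineer(model):
    """Generate CREATE TABLE statements (DDL) from the model."""
    statements = []
    for table, columns in model.items():
        cols = ",\n".join(
            f"    {name} {ctype}{' PRIMARY KEY' if pk else ''}"
            for name, ctype, pk in columns
        )
        statements.append(f"CREATE TABLE {table} (\n{cols}\n);")
    return "\n".join(statements)


def reverse_engineer(conn):
    """Rebuild the model by reading the catalog of an existing SQLite database."""
    model = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        info = conn.execute(f"PRAGMA table_info({table})").fetchall()
        model[table] = [(row[1], row[2], row[5] > 0) for row in info]
    return model


ddl = forward_engineer(logical_model)
print(ddl)

conn = sqlite3.connect(":memory:")
conn.executescript(ddl)             # forward engineering: model -> database
recovered = reverse_engineer(conn)  # reverse engineering: database -> model
print(recovered == logical_model)
```

The round trip shows why a tool that supports both directions can keep the model and the database synchronized: the model recovered from the database can be compared against the model that generated it.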

1.9: Overview of Data Model Tools

Data modeling tools available in the market

  • AllFusion ERwin Data Modeler (www.ca.com): an industry-leading data modeling solution that can help you create and maintain databases, data warehouses, and enterprise data models
  • PowerDesigner (www.sybase.com): designs and generates the database schema through true bi-level (conceptual and physical) relational database modeling; supports data warehouse-specific modeling techniques
  • Oracle Designer (www.oracle.com)
  • ER/Studio (www.embarcadero.com)
  • IBM VisualAge DataAtlas (www.software.ibm.com)
  • Popkin System Architect (www.popkin.com)
  • CAST DB-Builder (www.castsoftware.com)

 

For data warehouse data modeling tools, please refer to:

http://en.wikipedia.org/wiki/Comparison_of_data_modeling_tools

http://www.databaseanswers.org/modelling_tools.htm

 

Summary

  • In this module, you learned about the following:
    • Data Model Overview
    • Prerequisites for this Course
    • Expectations from this Training
    • Scope of the Training
    • Introduction to Data Models
      • Fundamentals of Database Systems
      • Definition of a Model
      • What is Data Modeling
      • Why Use Data Modeling
      • What Makes a Good Data Model
    • Sample Data Models
    • Data Model Tools