Short note on DATA MARTS.

 DATA MARTS



•  A data mart is a subset of a data warehouse oriented to a specific business line. Data marts contain repositories of summarized data collected for analysis on a specific section or unit within an organization E.g., Marketing, Sales, HR, or finance. It is often controlled by a single department in an organization. Data Mart usually draws data from only a few sources compared to a Data warehouse. Data marts are small in size and are more flexible compared to a Data warehouse.

• Data marts can be considered as the database or collection of databases that are designed to help managers in making strategic decisions about business and the organization. Data marts are usually smaller than data warehouses as they focus on some subject or a department of an organization (a data warehouse combines databases across an entire enterprise).

• A Data Mart is focused on a single functional area of an organization and contains a subset of data stored in a Data Warehouse. A Data Mart is a condensed version of a Data Warehouse and is designed for use by a specific department, unit, or set of users in an organization. E.g., Marketing, Sales, HR, or finance. It is often controlled by a single department in an organization.

• Data Mart usually draws data from only a few sources compared to a Data warehouse. Data marts are small in size and are more flexible compared to a Datawarehouse.

Data Mart: According to Inmon data mart is a subset of the repository that receives its data and information from the common warehouse but it is dependent on it. In another way, a data mart is small storage of data built to save the data and information for similar departments. The reason for the data mart is to provide easier access to data and fast response for queries. Moreover, data mart increases the departments’ performance 


Reasons for creating a data mart

  • Creates collective data by a group of users
  • Easy access to frequently needed data
  • Ease of creation
  • Improves end-user response time
  • Lower cost than implementing complete data warehouses
  • Potential clients are more clearly defined than in a comprehensive data warehouse
  • It contains only essential business data and is less cluttered.

Types of Data Marts

There are three types of data marts: dependent, independent, and hybrid. They are categorized based on their relation to the data warehouse and the data sources that are used to create the system.


1. Dependent Data Marts

  • A dependent data mart is created from an existing enterprise data warehouse. It is the top-down approach that begins with storing all business data in one central location, then extracts a clearly defined portion of the data when needed for analysis.
  • To form a data warehouse, a specific set of data is aggregated (formed into a cluster) from the warehouse, restructured, then loaded to the data mart where it can be queried. It can be a logical view or physical subset of the data warehouse:

Logical view - A virtual table/view that is logically—but not physically—separated from the data warehouse

Physical subset - Data extract that is a physically separate database from the data warehouse

Granular data—the lowest level of data in the target set—in the data warehouse serves as the single point of reference for all dependent data marts that are created.


2. Independent Data Marts

  • An independent data mart is a stand-alone system—created without the use of a data warehouse—that focuses on one subject area or business function. Data is extracted from internal or external data sources (or both), processed, then loaded to the data mart repository where it is stored until needed for business analytics.
  • Independent data marts are not difficult to design and develop. They are beneficial to achieve short-term goals but may become cumbersome to manage—each with its own ETL tool and logic—as business needs expand and become more complex.


3. Hybrid Data Marts

  • A hybrid data mart combines data from an existing data warehouse and other operational source systems. It unites the speed and end-user focus of a top-down approach with the benefits of the enterprise-level integration of the bottom-up method.

Advantages of Data Mart:

1. Implementation of data mart needs less time as compared to the implementation of the data warehouse as data mart is designed for a particular department of an organization.

2. Organisations are provided with choices to choose the model of data mart depending upon cost and their business.

3. Data can be easily accessed from the data mart.

4. It contains frequently accessed queries, so enable to analyse the business trends.

Disadvantages of Data Mart:

1. Since it stores the data related only to a specific function, so does not store a huge volume of data related to each and every department of an organization like a data warehouse.

2. Creating too many data marts becomes cumbersome sometimes.

Comments

Popular posts from this blog

Suppose that a data warehouse consists of the three dimensions time, doctor, and patient, and the two measures count and charge, where a charge is the fee that a doctor charges a patient for a visit. a) Draw a schema diagram for the above data warehouse using one of the schemas. [star, snowflake, fact constellation] b) Starting with the base cuboid [day, doctor, patient], what specific OLAP operations should be performed in order to list the total fee collected by each doctor in 2004? c) To obtain the same list, write an SQL query assuming the data are stored in a relational database with the schema fee (day, month, year, doctor, hospital, patient, count, charge)

Suppose that a data warehouse for Big-University consists of the following four dimensions: student, course, semester, and instructor, and two measures count and avg_grade. When at the lowest conceptual level (e.g., for a given student, course, semester, and instructor combination), the avg_grade measure stores the actual course grade of the student. At higher conceptual levels, avg_grade stores the average grade for the given combination. a) Draw a snowflake schema diagram for the data warehouse. b) Starting with the base cuboid [student, course, semester, instructor], what specific OLAP operations (e.g., roll-up from semester to year) should one perform in order to list the average grade of CS courses for each BigUniversity student. c) If each dimension has five levels (including all), such as “student < major < status < university < all”, how many cuboids will this cube contain (including the base and apex cuboids)?

Suppose that a data warehouse consists of the four dimensions; date, spectator, location, and game, and the two measures, count and charge, where charge is the fee that a spectator pays when watching a game on a given date. Spectators may be students, adults, or seniors, with each category having its own charge rate. a) Draw a star schema diagram for the data b) Starting with the base cuboid [date; spectator; location; game], what specific OLAP operations should perform in order to list the total charge paid by student spectators at GM Place in 2004?