Describe the schemas for Multidimensional Data Model (Star Schema , Snowflakes Schema ,Fact Constellations Schema)

 Schemas for Multidimensional Data Model are:-

• Star Schema

• Snowflakes Schema

• Fact Constellations Schema

1. Start Schema

  • It is the data warehouse schema that contains two types of tables: Fact Table and Dimension Tables. Fact Table lies at the center point and dimension tables are connected with fact table such that star share is formed.

→ Fact Tables: A fact table typically has two types of columns: foreign keys to dimension tables and measures that contain numeric facts. A fact table can contain fact data on detail or aggregated level.

→ Dimension Tables: Dimension tables usually have a relatively small number of records compared to fact tables, but each record may have a very large number of attributes to describe the fact data. Each dimension in the star schema has only one dimension table and each table holds a set of attributes. This constraint may cause data redundancy. The following diagram shows the sales data of a company with respect to the four dimensions, namely time, item, branch, and location.



  • There is a fact table at the center. It contains the keys to each of the four dimensions. The fact table also contains the attributes, namely dollars sold and units sold.

Advantages and Disadvantages of Star Schema 

  • Since star schema contains de-normalized dimension tables, it leads to simpler queries due to a lesser number of join operations and it also leads to better system performance. 
  • On the other hand, it is difficult to maintain the integrity of data in star schema due to de-normalized tables. It is the widely used data warehouse schema and is also recommended by oracle

2. Snowflakes Schema.

  • The snowflake schema is a variant of the star schema model, where some dimension table is normalized, thereby further splitting the data into additional tables. The resulting schema graph forms a shape similar to a snowflake. 
  • For example, the item dimension table in a star schema is normalized and split into two dimension tables, namely item and supplier table.


Advantages and Disadvantages of Snowflakes Schema
  • Due to normalization table is easy to maintain and saves storage space. However, this saving of space is negligible in comparison to the typical magnitude of the fact table.
  •  Furthermore, the snowflake structure can reduce the effectiveness of browsing, since more joins will be needed to execute a query. Consequently, the system performance may be adversely impacted. 
  • Hence, although the snowflake schema reduces redundancy, it is not as popular as the star schema in data warehouse design.

3. Fact Constellations Schema
  • This kind of schema can be viewed as a collection of stars, and hence is called a galaxy schema or a fact constellation.
  •  A fact constellation schema allows dimension tables to be shared between fact tables. 
  • For example, the following schema specifies two fact tables, sales, and shipping. 
- The sales table definition is identical to that of the star schema. 
- The shipping table has five dimensions or keys: item key, time key, shipper key, from location, and to location, and two measures: dollars cost and units shipped.



Comments

Popular posts from this blog

Suppose that a data warehouse for Big-University consists of the following four dimensions: student, course, semester, and instructor, and two measures count and avg_grade. When at the lowest conceptual level (e.g., for a given student, course, semester, and instructor combination), the avg_grade measure stores the actual course grade of the student. At higher conceptual levels, avg_grade stores the average grade for the given combination. a) Draw a snowflake schema diagram for the data warehouse. b) Starting with the base cuboid [student, course, semester, instructor], what specific OLAP operations (e.g., roll-up from semester to year) should one perform in order to list the average grade of CS courses for each BigUniversity student. c) If each dimension has five levels (including all), such as “student < major < status < university < all”, how many cuboids will this cube contain (including the base and apex cuboids)?

Suppose that a data warehouse consists of the four dimensions; date, spectator, location, and game, and the two measures, count and charge, where charge is the fee that a spectator pays when watching a game on a given date. Spectators may be students, adults, or seniors, with each category having its own charge rate. a) Draw a star schema diagram for the data b) Starting with the base cuboid [date; spectator; location; game], what specific OLAP operations should perform in order to list the total charge paid by student spectators at GM Place in 2004?

Suppose that a data warehouse consists of the three dimensions time, doctor, and patient, and the two measures count and charge, where a charge is the fee that a doctor charges a patient for a visit. a) Draw a schema diagram for the above data warehouse using one of the schemas. [star, snowflake, fact constellation] b) Starting with the base cuboid [day, doctor, patient], what specific OLAP operations should be performed in order to list the total fee collected by each doctor in 2004? c) To obtain the same list, write an SQL query assuming the data are stored in a relational database with the schema fee (day, month, year, doctor, hospital, patient, count, charge)