Describe the main reasons for the potential advantages of distributed databases.

 Distributed database (DDB)

 A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network. A distributed database management system (DDBMS) is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the users. The DDB and the DDBMS together are called a distributed database system (DDBS).

The main reasons for the potential advantages of distributed databases

Distributed database management has been proposed for various reasons, ranging from organizational decentralization and economical processing to greater autonomy. Some of these advantages are as follows:


1. Management of data with different levels of transparency –

Ideally, a distributed database should be distribution transparent, meaning that it hides the details of where each file is physically stored within the system. The following types of transparency are possible in a distributed database system:

Network transparency:

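This refers to the freedom of the user from the operational details of the network. It is divided into two types: location transparency and naming transparency. With location transparency, the command used to perform a task is independent of where the data is stored and of the site where the command was issued. With naming transparency, once a name is associated with an object, the object can be accessed unambiguously without further specification of where it resides.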

Replication transparency:

Copies of data may be stored at multiple sites for better availability, performance, and reliability. Replication transparency makes the user unaware of the existence of these copies.

Fragmentation transparency:

This makes the user unaware of the existence of fragments, whether the relation is split by rows (horizontal fragmentation) or by columns (vertical fragmentation).
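As a rough illustration of fragmentation and location transparency, the sketch below keeps two horizontal fragments of an EMPLOYEE relation at two sites and exposes a single query function. The site names, the sample rows, and the function `select_employees` are all hypothetical and only stand in for what a DDBMS would do internally.

```python
# Hypothetical horizontal fragments of an EMPLOYEE relation, one per site.
# Site names and rows are invented for illustration only.
FRAGMENTS = {
    "site_a": [{"id": 1, "name": "Asha", "dept": "Sales"}],
    "site_b": [
        {"id": 2, "name": "Ben", "dept": "Research"},
        {"id": 3, "name": "Chen", "dept": "Sales"},
    ],
}


def select_employees(predicate):
    """Return matching rows as if EMPLOYEE were a single table.

    The caller never names a site or a fragment; the relation is
    reconstructed as the union of all horizontal fragments.
    """
    rows = []
    for site in sorted(FRAGMENTS):  # fixed order so the output is repeatable
        rows.extend(row for row in FRAGMENTS[site] if predicate(row))
    return rows


sales = select_employees(lambda row: row["dept"] == "Sales")
print([row["name"] for row in sales])
```

The query is written against the whole relation, so moving a row from one site to the other would not change the calling code.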

2. Increased Reliability and availability –

Availability is usually defined as the probability that a system is running at a certain point in time, whereas reliability is the probability that the system runs continuously, without failure, during a time interval. When the data and DBMS software are distributed over several sites, one site may fail while the other sites continue to operate. Only the data and software that exist at the failed site become inaccessible, which improves both reliability and availability.
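The effect of keeping copies at several sites can be estimated with a small calculation. The sketch below assumes that sites fail independently of each other, which real systems only approximate, and the function name `availability_with_replicas` and the 0.95 figure are illustrative choices rather than values from any particular system.

```python
def availability_with_replicas(site_availability, replicas):
    """Probability that at least one of `replicas` copies is reachable.

    Assumes every site has the same availability and that sites fail
    independently; both are simplifying assumptions.
    """
    if not 0.0 <= site_availability <= 1.0:
        raise ValueError("site_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The data is unreachable only if every site holding a copy is down.
    return 1.0 - (1.0 - site_availability) ** replicas


for n in (1, 2, 3):
    print(n, round(availability_with_replicas(0.95, n), 6))
```

Under these assumptions, a data item stored at two sites that are each up 95% of the time is reachable about 99.75% of the time.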


3. Easier Expansion –

In a distributed environment, expanding the system by adding more data, increasing database sizes, or adding more processors is much easier.


4. Improved Performance –

We can achieve interquery parallelism by executing multiple queries at different sites, and intraquery parallelism by breaking up a single query into a number of subqueries that execute in parallel. Both lead to improved performance.
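Intraquery parallelism can be sketched as follows: one aggregate query is split into a subquery per site, the subqueries run concurrently, and the partial results are merged. The SALES fragments, the site names, and the helper functions here are hypothetical, and Python threads merely stand in for requests sent to remote sites.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fragments of a SALES relation held at three sites.
SITE_FRAGMENTS = {
    "site_a": [120, 80, 50],
    "site_b": [200, 40],
    "site_c": [10, 300, 25, 75],
}


def local_subquery(site):
    """Stand-in for a subquery shipped to one site: partial SUM and COUNT."""
    amounts = SITE_FRAGMENTS[site]
    return sum(amounts), len(amounts)


def total_and_average():
    """Run one subquery per site concurrently, then merge the partials."""
    with ThreadPoolExecutor(max_workers=len(SITE_FRAGMENTS)) as pool:
        partials = list(pool.map(local_subquery, SITE_FRAGMENTS))
    total = sum(part_sum for part_sum, _ in partials)
    count = sum(part_count for _, part_count in partials)
    return total, total / count


print(total_and_average())
```

Each site returns a partial sum and count rather than a partial average, because averages of fragments cannot be combined correctly without the counts.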


