What types of data structures are widely used in clustering algorithms? Explain DATA MATRIX AND DISSIMILARITY MATRIX

- December 29, 2021

DATA MATRIX AND DISSIMILARITY MATRIX

First, let us look at the types of data structures that are widely used in cluster analysis. Main memory-based clustering algorithms typically operate on one of the following two data structures.

Data Matrix: This represents n objects, such as persons, with m attributes, such as age, height, weight, gender, race, and so on. The structure is in the form of a relational table, or an n-by-m matrix (n objects × m attributes). The data matrix is often called a two-mode matrix, since its rows and columns represent different entities: the rows represent objects and the columns represent attributes.
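As a small illustration of the n-by-m layout, here is a minimal sketch in Python. The attribute names and values are made-up sample data, and a plain list of lists is only one possible way to hold such a table.

```python
# Hypothetical sample data: 4 persons (objects) described by 3 numeric attributes.
attributes = ["age", "height_cm", "weight_kg"]
data_matrix = [
    [25, 170.0, 65.0],
    [32, 182.0, 80.0],
    [47, 165.0, 72.0],
    [51, 175.0, 90.0],
]

n = len(data_matrix)     # number of objects (rows)
m = len(data_matrix[0])  # number of attributes (columns)
print(f"{n} objects x {m} attributes")

# Row i holds every attribute of one object.
second_person = data_matrix[1]

# Column f holds one attribute across all objects.
heights = [row[attributes.index("height_cm")] for row in data_matrix]
print(second_person)
print(heights)
```

Reading along a row gives one object, and reading down a column gives one attribute, which is why the two dimensions are said to have different modes.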

Dissimilarity Matrix: This stores a collection of proximities that are available for all pairs of n objects. It is often represented by an n-by-n matrix, where d(i, j) is the measured difference or dissimilarity between objects i and j. In general, d(i, j) is a non-negative number that is close to 0 when objects i and j are highly similar or near to each other, and becomes larger the more they differ. The distance measure is symmetric, that is, d(i, j) = d(j, i), and the distance of an object from itself is zero, that is, d(i, i) = 0. This gives the matrix shown in the figure.
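The properties above can be sketched in a few lines of Python. This is only an illustration: the text does not fix a particular distance measure, so Euclidean distance is assumed here, and the function names and sample points are hypothetical.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length numeric vectors (assumed measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def dissimilarity_matrix(data, dist=euclidean):
    """Build the n-by-n matrix of pairwise dissimilarities d(i, j)."""
    n = len(data)
    d = [[0.0] * n for _ in range(n)]  # diagonal stays 0: d(i, i) = 0
    for i in range(n):
        for j in range(i):  # compute the lower triangle only
            d[i][j] = d[j][i] = dist(data[i], data[j])  # symmetry: d(i, j) = d(j, i)
    return d


# Hypothetical sample: 3 objects with 2 numeric attributes each.
data = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
d = dissimilarity_matrix(data)
for row in d:
    print(row)
```

Because the matrix is symmetric with a zero diagonal, only the entries below the diagonal need to be computed; the sketch mirrors them into the upper half for convenient lookup.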
