Explain discretization and concept hierarchy generation for categorical data.

- December 25, 2021

Discretization and Concept Hierarchy Generation for Categorical Data

Categorical data are discrete data. Categorical attributes have a finite (but possibly large) number of distinct values, with no ordering among the values. Examples include geographic location, item type, and job category. There are several methods for generating concept hierarchies for categorical data:

a) Specification of a partial ordering of attributes explicitly at the schema level by experts

Concept hierarchies for categorical attributes or dimensions typically involve a group of attributes. A user or an expert can easily define a concept hierarchy by specifying a partial or total ordering of the attributes at the schema level. For example, a hierarchy for location can be defined at the schema level as street < city < province_or_state < country.
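A schema-level ordering like this can be held as a plain ordered list. The following is a minimal sketch, not a standard API: the names `LOCATION_HIERARCHY`, `is_lower_level`, and `roll_up` are hypothetical, and the list assumes that province and state share one level.

```python
# Hypothetical schema-level hierarchy, listed from the lowest (most specific)
# level to the highest (most general) level.
LOCATION_HIERARCHY = ["street", "city", "province_or_state", "country"]


def is_lower_level(a, b, hierarchy=LOCATION_HIERARCHY):
    """Return True if attribute `a` sits below attribute `b` in the hierarchy."""
    return hierarchy.index(a) < hierarchy.index(b)


def roll_up(attribute, hierarchy=LOCATION_HIERARCHY):
    """Return the next more general attribute, or None at the top level."""
    i = hierarchy.index(attribute)
    return hierarchy[i + 1] if i + 1 < len(hierarchy) else None
```

For example, `roll_up("city")` gives `"province_or_state"`, and `roll_up("country")` gives `None` because country is the top level.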

b) Specification of a portion of a hierarchy by explicit data grouping

This is essentially a manual definition of a portion of a concept hierarchy. In a large database, it is unrealistic to define an entire concept hierarchy by explicit value enumeration. However, it is realistic to specify explicit groupings for a small portion of the intermediate-level data.
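Explicit grouping can be sketched as a small mapping from each group name to its members. The group names and member values below are illustrative assumptions, and `GROUPS`, `PARENT`, and `generalize` are hypothetical names.

```python
# Hypothetical explicit groupings for a small part of a location hierarchy.
# Each key is an intermediate-level concept; each list holds its members.
GROUPS = {
    "prairies_Canada": ["Alberta", "Saskatchewan", "Manitoba"],
    "Western_Canada": ["British Columbia", "prairies_Canada"],
}

# Invert the groupings into a child -> parent lookup.
PARENT = {
    child: parent
    for parent, children in GROUPS.items()
    for child in children
}


def generalize(value):
    """Return the path from `value` up through every group that contains it."""
    path = [value]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path
```

Values that no group covers, such as `"Ontario"` here, come back unchanged as a one-element path, which matches the idea that only a portion of the hierarchy is defined by hand.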

c) Specification of a set of attributes but not their partial ordering

A user may specify a set of attributes forming a concept hierarchy, but omit their partial ordering. The system can then try to generate the attribute ordering automatically so as to construct a meaningful concept hierarchy.
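The text above does not say how the system derives the ordering. One commonly described heuristic, assumed here, is that an attribute with more distinct values belongs at a lower level of the hierarchy, since a higher-level concept usually covers many lower-level ones. A minimal sketch, with a hypothetical function name and made-up sample rows:

```python
def order_by_distinct_values(rows, attributes):
    """Order attributes from lowest to highest level by distinct-value count.

    Assumption: the attribute with the most distinct values is the most
    specific one, so it is placed at the lowest level of the hierarchy.
    """
    counts = {a: len({row[a] for row in rows}) for a in attributes}
    ordered = sorted(attributes, key=lambda a: counts[a], reverse=True)
    return ordered, counts


# Made-up sample data, for illustration only.
rows = [
    {"street": "1 Main St", "city": "Vancouver", "province_or_state": "British Columbia", "country": "Canada"},
    {"street": "2 Oak Ave", "city": "Vancouver", "province_or_state": "British Columbia", "country": "Canada"},
    {"street": "3 Elm Rd", "city": "Victoria", "province_or_state": "British Columbia", "country": "Canada"},
    {"street": "4 Pine St", "city": "Calgary", "province_or_state": "Alberta", "country": "Canada"},
    {"street": "5 Lake Dr", "city": "Seattle", "province_or_state": "Washington", "country": "USA"},
    {"street": "6 Hill Rd", "city": "Toronto", "province_or_state": "Ontario", "country": "Canada"},
]

ordered, counts = order_by_distinct_values(
    rows, ["country", "street", "province_or_state", "city"]
)
```

The heuristic is not foolproof. A time dimension, for instance, may have more distinct years than days of the week, yet year is not the lowest level, so the generated ordering should be reviewed and adjusted by a user or expert where needed.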

d) Specification of only a partial set of attributes

Sometimes a user can be careless when defining a hierarchy, or may have only a vague idea about what should be included in it. Consequently, the user may include only a small subset of the relevant attributes in the hierarchy specification. For example, instead of including all the hierarchically relevant attributes for location, the user may specify only street and city. To handle such partially specified hierarchies, it is important to embed data semantics in the database schema so that attributes with tight semantic connections can be pinned together. Specifying one attribute of a pinned group can then bring in the rest of the group to form a complete hierarchy.
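The idea of pinning attributes together can be sketched as follows. This is an assumption about one possible representation, not a feature of any particular database system: `PINNED_GROUPS` and `expand_partial_spec` are hypothetical names, and the pinned group is modeled as a plain set of attribute names.

```python
# Hypothetical schema metadata: each set lists attributes that are
# semantically tightly linked and therefore pinned together.
PINNED_GROUPS = [
    {"street", "city", "province_or_state", "country"},
]


def expand_partial_spec(specified, pinned_groups=PINNED_GROUPS):
    """Expand a partial attribute list with every pinned group it touches."""
    expanded = set(specified)
    for group in pinned_groups:
        # If the user named any attribute of the group, pull in the whole group.
        if expanded & group:
            expanded |= group
    return expanded
```

With this sketch, a user who specifies only `{"street", "city"}` gets back all four location attributes, while an attribute outside any pinned group, such as a hypothetical `"item_type"`, is returned unchanged.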
