Short note on data discretization.
Data discretization transforms numeric data by mapping values to interval or concept labels. Such methods can be used to automatically generate concept hierarchies for the data, which allows for mining at multiple levels of granularity. Discretization techniques include binning, histogram analysis, cluster analysis, decision tree analysis, and correlation analysis. For nominal data, concept hierarchies may be generated based on schema definitions as well as the number of distinct values per attribute.
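As an illustration of binning, the simplest of the techniques listed above, here is a minimal sketch in plain Python. It assumes two common variants, equal-width and equal-frequency binning. The function names and the sample values are hypothetical, chosen only for this example, and the integer labels stand in for interval labels.

```python
def equal_width_bins(values, k):
    """Map each value to one of k equal-width intervals.

    Returns (labels, edges), where labels[i] is the 0-based bin index
    of values[i] and edges holds the k + 1 interval boundaries.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    edges = [lo + i * width for i in range(k + 1)]
    labels = []
    for v in values:
        if width == 0:
            # All values are identical, so everything lands in bin 0.
            labels.append(0)
        else:
            # Clamp so the maximum value falls in the last bin.
            labels.append(min(int((v - lo) / width), k - 1))
    return labels, edges


def equal_frequency_bins(values, k):
    """Map each value to one of k bins holding roughly equal counts."""
    n = len(values)
    # Indices of the values in ascending order of value.
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = min(rank * k // n, k - 1)
    return labels


# Hypothetical sample of a numeric attribute (illustrative values only).
sample = [4, 8, 15, 21, 21, 24, 25, 28, 34]

width_labels, width_edges = equal_width_bins(sample, 3)
freq_labels = equal_frequency_bins(sample, 3)

print(width_edges)   # [4.0, 14.0, 24.0, 34.0]
print(width_labels)  # [0, 0, 1, 1, 1, 2, 2, 2, 2]
print(freq_labels)   # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

Replacing each bin index with a label such as "low", "medium", or "high" gives one level of a concept hierarchy; applying the same step again to the resulting intervals would give coarser or finer levels.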
Although numerous methods of data preprocessing have been developed, it remains an active area of research because of the large volume of inconsistent or dirty data and the complexity of the problem.