Explain the multimedia data mining architecture.
Multimedia Data Mining Architecture
The typical data mining process consists of several stages, and the overall process is inherently interactive and iterative. The main stages of the data mining process are (1) domain understanding; (2) data selection; (3) cleaning and preprocessing; (4) discovering patterns; (5) interpretation; and (6) reporting and using discovered knowledge.
(1) domain understanding
The domain understanding stage requires learning how the results of data mining will be used, so that all relevant prior knowledge can be gathered before mining. Blind application of data-mining techniques without the requisite domain knowledge often leads to the discovery of irrelevant or meaningless patterns. For example, when mining video of a particular sport such as cricket, a good understanding of the game is needed to detect interesting strokes played by batsmen.
(2) data selection
The data selection stage requires the user to target a database or select a subset of fields or data samples to be used for data mining. A proper domain understanding at this stage helps in the identification of useful data. This is the most time-consuming stage of the entire data-mining process for business applications, because the data are never clean or in a form suitable for data mining. For multimedia data mining, this stage is generally not an issue because the data are not in relational form and there are no subsets of fields to choose from.
(3) cleaning and preprocessing
The next stage in a typical data-mining process is the preprocessing step, which involves integrating data from different sources and making choices about how to represent or code certain data fields that serve as inputs to the pattern discovery stage. Such representation choices are needed because certain fields may contain data at a level of detail not considered suitable for the pattern discovery stage. The preprocessing stage is of considerable importance in multimedia data mining, given the unstructured nature of multimedia data.
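To make the idea of coding a field at a coarser level of detail concrete, here is a minimal sketch. The field (clip duration in seconds), the bin boundaries, and the labels are all hypothetical and chosen only for illustration; they are not taken from any particular system.

```python
def code_field(value, boundaries, labels):
    """Map a fine-grained numeric value to a coarser category label.

    boundaries[i] is the exclusive upper bound for labels[i];
    values at or above the last boundary get the final label.
    """
    for upper, label in zip(boundaries, labels):
        if value < upper:
            return label
    return labels[-1]


# Hypothetical field: clip duration in seconds, coded into three levels.
BOUNDARIES = [10, 60]
LABELS = ["short", "medium", "long"]

durations = [3.2, 45.0, 61.5, 10.0]
coded = [code_field(d, BOUNDARIES, LABELS) for d in durations]
print(coded)
```

The coded labels, rather than the raw durations, would then be passed to the pattern discovery stage.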
(4) discovering patterns
The pattern-discovery stage is the heart of the entire data mining process. It is the stage where the hidden patterns and trends in the data are actually uncovered. There are several approaches to the pattern discovery stage. These include association, classification, clustering, regression, time-series analysis, and visualization. Each of these approaches can be implemented through one of several competing methodologies, such as statistical data analysis, machine learning, neural networks, and pattern recognition. It is because of the use of methodologies from several disciplines that data mining is often viewed as a multidisciplinary field.
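As one illustration of the clustering approach named above, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. The two-dimensional feature vectors are made-up values standing in for whatever features an earlier stage might produce, and seeding the centroids with the first k points is an assumption made only to keep the run deterministic.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Cluster feature vectors into k groups with plain Lloyd's algorithm."""
    # Assumption: seed centroids with the first k points (deterministic).
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical features, e.g. (mean brightness, edge density) per image.
features = [
    (0.10, 0.20), (0.15, 0.25), (0.12, 0.22),
    (0.90, 0.80), (0.85, 0.95), (0.88, 0.82),
]
labels, centroids = kmeans(features, k=2)
print(labels)
```

The other approaches listed (association, classification, regression, and so on) would take the same kind of feature vectors as input but look for different kinds of patterns.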
(5) interpretation
The interpretation stage of the data mining process is used to evaluate the quality of the discovered patterns and their value, in order to determine whether previous stages should be revisited. Proper domain understanding is crucial at this stage to put a value on discovered patterns.
(6) reporting and using discovered knowledge
The final stage of the data mining process consists of reporting the discovered knowledge and putting it to use to generate new actions, products, services, or marketing strategies, as the case may be. An example of reporting for multimedia data mining is the Advanced Scout system from IBM, in which the mined results are used by coaches to design new moves.
The architecture, shown in figure 9.1, captures the above stages of data mining in the context of multimedia data. The broken arrows on the left of figure 9.1 indicate that the process is iterative. The arrows emanating from the domain knowledge block on the right indicate that domain knowledge guides certain stages of the mining process.
The spatiotemporal segmentation step in the architecture of figure 9.1 is necessitated by the unstructured nature of multimedia data. This step breaks multimedia data into parts that can be characterized in terms of certain attributes or features. Thus, in conjunction with the feature extraction step, this step serves a function similar to that of the preprocessing stage in a typical data mining process. In image data mining, the spatiotemporal step simply involves image segmentation. Both region-based and edge-based image segmentation methods have been used at this stage in different applications.
Although many researchers tend to treat image segmentation for data mining as identical to the image segmentation needed for computer vision systems, there is an important difference between the requirements of the two. Image segmentation for a computer vision system must operate without any manual intervention, and it must be quantitatively accurate so that the vision system can interact with its environment. Image segmentation for most data mining applications, on the other hand, has no requirement of interacting with an environment. Thus, it can incorporate manual intervention and can be approximate, as long as it yields features that reasonably capture the image content.
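A rough sketch of how an approximate, region-based segmentation can feed feature extraction is given below. It uses a simple intensity threshold as a stand-in for a real segmentation method, and the tiny grayscale image, the threshold value, and the function names are all hypothetical.

```python
def segment_by_threshold(image, threshold):
    """Label each pixel 1 (bright region) or 0 (dark region)."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]


def region_features(image, mask):
    """Return {label: (area, mean_intensity)} for each region in the mask."""
    sums, counts = {}, {}
    for img_row, mask_row in zip(image, mask):
        for px, label in zip(img_row, mask_row):
            sums[label] = sums.get(label, 0) + px
            counts[label] = counts.get(label, 0) + 1
    return {
        label: (counts[label], sums[label] / counts[label]) for label in counts
    }


# Hypothetical 3x4 grayscale image: dark left half, bright right half.
image = [
    [10, 12, 200, 210],
    [11, 13, 205, 220],
    [9, 14, 198, 215],
]
# The threshold could be picked by hand, which is acceptable in a mining
# setting where manual intervention is allowed.
mask = segment_by_threshold(image, threshold=128)
features = region_features(image, mask)
print(features)
```

The per-region area and mean intensity are the kind of coarse features that can be good enough for mining, even though such a segmentation would be too crude for a vision system that has to act on its environment.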