Explain confusion matrix.

All four of these performance evaluation metrics are derived from a special tabular classification report called the confusion matrix.

Confusion Matrix: A confusion matrix is a table used to describe the performance of a classifier on a test dataset for which the actual class labels are known. Given m classes, a confusion matrix is a table of size m by m. An entry CM(i, j) indicates the number of tuples of class i that were labeled by the classifier as class j. The confusion matrix shows how many data tuples in the test dataset are correctly classified and how many are misclassified. It is named so because it shows how the classifier gets confused when it predicts the class label of a data tuple. Because it shows the errors in model performance in matrix form, it is also known as an error matrix. The structure of the confusion matrix for m = 2 classes is shown in figure 6.11 below.
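As a minimal sketch of the definition above, the following plain Python function builds an m × m confusion matrix from lists of actual and predicted labels (the class names and sample data here are illustrative, not taken from the text):

```python
def confusion_matrix(actual, predicted, classes):
    """Build an m x m confusion matrix.

    Rows correspond to the actual class, columns to the predicted class,
    so entry cm[i][j] counts tuples of class i labeled as class j.
    """
    index = {c: k for k, c in enumerate(classes)}
    m = len(classes)
    cm = [[0] * m for _ in range(m)]
    for a, p in zip(actual, predicted):
        cm[index[a]][index[p]] += 1
    return cm

actual    = ["pos", "pos", "neg", "neg", "pos", "neg"]
predicted = ["pos", "neg", "neg", "pos", "pos", "neg"]
print(confusion_matrix(actual, predicted, ["pos", "neg"]))
# [[2, 1], [1, 2]]
```

The diagonal entries count correctly classified tuples, while every off-diagonal entry counts a misclassification.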


There are various terminologies used in the confusion matrix. To understand them, let the total number of classes in our dataset be two, namely class1 (e.g., the positive class) and class2 (e.g., the negative class).

True Positive (TP): A data tuple whose actual label is class1 and that is also predicted as class1 by the model is called a true positive prediction. The TP section of the confusion matrix contains the total number of true positive predictions made by the model.

False Negative (FN): A data tuple whose actual label is class1 but that is incorrectly predicted as class2 by the model is called a false negative prediction. The FN section of the confusion matrix contains the total number of false negative predictions made by the model.

True Negative (TN): A data tuple whose actual label is class2 and that is also predicted as class2 by the model is called a true negative prediction. The TN section of the confusion matrix contains the total number of true negative predictions made by the model.

False Positive (FP): A data tuple whose actual label is class2 but that is incorrectly predicted as class1 by the model is called a false positive prediction. The FP section of the confusion matrix contains the total number of false positive predictions made by the model.
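The four terminologies above can be sketched directly in plain Python for a two-class problem. Here "pos" stands for class1 and "neg" for class2, and the sample labels are made up for illustration; the metrics computed at the end (accuracy, precision, recall) are the usual measures derived from these four counts:

```python
actual    = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]
predicted = ["pos", "neg", "neg", "pos", "pos", "neg", "pos", "neg"]

# Count each cell of the 2 x 2 confusion matrix.
tp = sum(1 for a, p in zip(actual, predicted) if a == "pos" and p == "pos")
fn = sum(1 for a, p in zip(actual, predicted) if a == "pos" and p == "neg")
tn = sum(1 for a, p in zip(actual, predicted) if a == "neg" and p == "neg")
fp = sum(1 for a, p in zip(actual, predicted) if a == "neg" and p == "pos")

# Performance metrics derived from the four counts.
accuracy  = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall    = tp / (tp + fn)

print(tp, fn, tn, fp)   # 3 1 3 1
print(accuracy, precision, recall)   # 0.75 0.75 0.75
```

Note that TP + FN is the total number of actual class1 tuples, and TN + FP is the total number of actual class2 tuples, so the four counts always sum to the size of the test set.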
