Mathematics Diamond Classification System Worksheet – Description
Pete, owner of Pistol Pete’s Diamond Emporium, is investing in a diamond classification system because of his deteriorating eyesight. Pete buys and sells diamonds of varying quality: Low ($1,000-$3,000), Medium ($4,000-$7,000), and High ($8,000-$10,000). Pete needs the classifier to label his diamonds correctly, both to keep his business profitable and to keep the trust of his customers.
Using the possible cost matrix values given below, fill out the cost matrix that most accurately reflects Pete’s needs for his diamond classifier model. After completing the cost matrix, justify your proposed cost matrix.
Possible cost matrix values: -1, -1, 0, 20, 20, 20, 20, 100, 100
You have been given a data set containing three discrete attributes and five continuous attributes. After carefully analyzing the problem and the available attributes, you decide that one of the continuous attributes, estimate, should be used as the class attribute for your classification problem. Describe how you could use the estimate attribute when performing classification.
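One common way to turn a continuous attribute into a class attribute is to discretize it into a small number of labeled ranges. The sketch below is only an illustration: the cut points and the labels are hypothetical choices, not values taken from the data set, and the helper name `discretize` is invented for this example.

```python
from bisect import bisect_right

# Hypothetical cut points and labels, chosen only for illustration.
CUT_POINTS = [3500, 7500]
LABELS = ["Low", "Medium", "High"]


def discretize(value, cut_points=CUT_POINTS, labels=LABELS):
    """Map a continuous value to the label of the range it falls into.

    A value equal to a cut point is placed in the higher range.
    """
    return labels[bisect_right(cut_points, value)]


# Hypothetical values of the estimate attribute.
estimates = [1200, 5000, 9100, 3500]
class_labels = [discretize(v) for v in estimates]
print(class_labels)
```

Once every record carries a nominal label, any standard classifier can be trained on the remaining attributes. How the cut points are chosen (equal width, equal frequency, or domain knowledge) is a design decision that should be justified in the answer.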
When performing an unsupervised k-means clustering, it is sufficient to generate a single clustering (not a single cluster). Do you agree with this statement? Why or why not?
You have been given a customer data set containing several attributes that describe each customer’s demographics, such as age, gender, income, and so on. It’s your job to find groupings of customers so that you can properly advertise your products to them. Describe how you can use clustering to determine and analyze the groupings of customers. It’s important that you specifically note the methodology for choosing the number of groupings, based upon the data set.
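The two clustering questions above can be explored with a small experiment: run k-means several times from different random starting centroids for each candidate k, keep the lowest sum of squared errors (SSE) per k, and look for the point where adding clusters stops paying off (the "elbow"). The sketch below is a minimal pure-Python version of that idea. The customer records (age, income in thousands) are made up, and the function names, restart count, and seed are assumptions for illustration. In practice the attributes would usually be normalized first.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_sse(points, k, rng, max_iter=100):
    """Run Lloyd's algorithm once from a random start and return the final SSE."""
    centroids = rng.sample(points, k)  # k distinct data points as initial centroids
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as its cluster mean; keep the old one if the cluster is empty.
        updated = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


def best_sse(points, k, restarts=25, seed=0):
    """Lowest SSE over several random restarts, since one run may hit a poor local optimum."""
    rng = random.Random(seed)
    return min(kmeans_sse(points, k, rng) for _ in range(restarts))


# Made-up customers: (age, income in thousands).
customers = [
    (22, 30), (25, 32), (24, 28),
    (40, 80), (42, 85), (38, 78),
    (63, 45), (65, 50), (60, 48),
]

sse_by_k = {k: best_sse(customers, k) for k in range(1, 6)}
for k, sse in sse_by_k.items():
    print(k, round(sse, 2))
```

Plotting or tabulating `sse_by_k` against k gives the elbow curve. The resulting clusters can then be analyzed by comparing their centroids attribute by attribute.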
Consider the following 2-itemsets and their associated support counts.
{Muffins, Donuts} = 712
{Muffins, Cake} = 771
{Donuts, Bagels} = 406
{Donuts, Cake} = 808
{Bagels, Cake} = 935
{Muffins, Bagels} = 681
If there are 1000 transactions in the transaction database and the minimum support threshold is 0.7, highlight in yellow those of the following 3-itemsets that will be generated as candidates. To receive full credit, you must show your work.
{Muffins, Bagels, Donuts}
{Muffins, Bagels, Cake}
{Muffins, Donuts, Cake}
Working
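One way to check the working is to apply the Apriori principle directly: a 3-itemset can only be generated as a candidate if every one of its 2-item subsets is frequent. The sketch below assumes that standard Apriori candidate generation with pruning is the intended method. The function names are invented for this example, while the support counts, transaction count, and threshold are copied from the question.

```python
from itertools import combinations

# Support counts copied from the worksheet.
support_counts = {
    frozenset({"Muffins", "Donuts"}): 712,
    frozenset({"Muffins", "Cake"}): 771,
    frozenset({"Donuts", "Bagels"}): 406,
    frozenset({"Donuts", "Cake"}): 808,
    frozenset({"Bagels", "Cake"}): 935,
    frozenset({"Muffins", "Bagels"}): 681,
}
N_TRANSACTIONS = 1000
MIN_SUPPORT = 0.7


def frequent_pairs(counts, n, min_support):
    """Return the 2-itemsets whose relative support meets the threshold."""
    return {items for items, count in counts.items() if count / n >= min_support}


def survives_pruning(candidate, frequent):
    """A 3-itemset survives only if all of its 2-item subsets are frequent."""
    return all(frozenset(pair) in frequent for pair in combinations(candidate, 2))


frequent = frequent_pairs(support_counts, N_TRANSACTIONS, MIN_SUPPORT)
candidates = [
    frozenset({"Muffins", "Bagels", "Donuts"}),
    frozenset({"Muffins", "Bagels", "Cake"}),
    frozenset({"Muffins", "Donuts", "Cake"}),
]
for candidate in candidates:
    print(sorted(candidate), survives_pruning(candidate, frequent))
```

The printed flag shows, for each listed 3-itemset, whether all of its pairs meet the support count of 700 implied by the 0.7 threshold.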
Run the Nearest Neighbor classifier with a k-value of 7 and a Support Vector Machine with default values, using 10-fold cross-validation, on the diabetes data set (diabetes.arff in Assignment 3 on myCourses) in Weka. Fill in the confusion matrices for the models in the tables below and use the cost matrix to compute the cost for each model. Based upon the cost, which model should be selected and why?
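The cost of a model is usually computed by multiplying each cell of its confusion matrix by the matching cell of the cost matrix and summing the products. The sketch below shows that arithmetic only. Every number in it is a hypothetical placeholder, not Weka output and not the cost matrix from the assignment, and the row and column orientation (rows are actual classes, columns are predicted classes) is an assumption.

```python
def total_cost(confusion, cost):
    """Sum of count * cost over every (actual, predicted) cell."""
    return sum(
        count * unit_cost
        for confusion_row, cost_row in zip(confusion, cost)
        for count, unit_cost in zip(confusion_row, cost_row)
    )


# Placeholder cost matrix: rows are actual classes, columns are predicted classes.
COST_MATRIX = [
    [0, 1],
    [5, 0],
]

# Placeholder confusion matrices; replace with the values reported by Weka.
knn_confusion = [
    [50, 10],
    [5, 35],
]
svm_confusion = [
    [45, 15],
    [2, 38],
]

knn_cost = total_cost(knn_confusion, COST_MATRIX)  # 10*1 + 5*5
svm_cost = total_cost(svm_confusion, COST_MATRIX)  # 15*1 + 2*5
print("kNN cost:", knn_cost)
print("SVM cost:", svm_cost)
```

Under a cost matrix where lower values are better, the model with the lower total would be preferred, even if its raw accuracy is not the highest.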