
DSCI 5240 – Data Mining Assignment 4


Objectives

• Continue to gain experience with SAS Enterprise Miner
• Learn to perform basic classification in SAS Enterprise Miner

Instructions

1. Submit report (docx) and SAS EM diagram (xml) through the UNT online learning management system
2. Clearly identify your name(s) on the cover page
3. A professional quality report is expected – messy or hard-to-read reports will be penalized
4. Document your actions within SAS EM using screenshots
5. Explain your answers as clearly as possible – vague answers will be penalized

Datasets

• organics.sas7bdat

Assignment Details:

A supermarket is offering a new line of organic products. The supermarket’s management wants to determine which customers are likely to purchase these products. The supermarket has a customer loyalty program. As an initial buyer incentive plan, the supermarket provided coupons for the organic products to all of the loyalty program participants and collected data that includes whether these customers purchased any of the organic products.

The ORGANICS data set contains 13 variables and over 22,000 observations. The variables in the data set are shown below with the appropriate roles and levels:

Variable          Role      Level     Description
ID                ID        Nominal   Customer loyalty identification number
DemAffl           Input     Interval  Affluence grade on a scale from 1 to 30
DemAge            Input     Interval  Age, in years
DemCluster        Rejected  Nominal   Type of residential neighborhood
DemClusterGroup   Input     Nominal   Neighborhood group
DemGender         Input     Nominal   M = male, F = female, U = unknown
DemRegion         Input     Nominal   Geographic region
DemTVReg          Input     Nominal   Television region
PromClass         Input     Nominal   Loyalty status: tin, silver, gold, or platinum
PromSpend         Input     Interval  Total amount spent
PromTime          Input     Interval  Time as loyalty card member
TargetBuy         Target    Binary    Organics purchased? 1 = Yes, 0 = No
TargetAmt         Rejected  Interval  Number of organic products purchased

Although two target variables are listed, these exercises concentrate on the binary target variable TargetBuy.

Table 1. Variable Settings and Description
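
The role and level assignments in Table 1 can also be written down as a small lookup structure, which makes it easy to check which variables a model would actually see. The following is a minimal Python sketch and is not part of SAS Enterprise Miner; the names VARIABLES and names_with_role are hypothetical and chosen only for illustration.

```python
# Roles and levels transcribed from Table 1.
# VARIABLES and names_with_role are illustrative names, not SAS EM objects.
VARIABLES = {
    "ID": ("ID", "Nominal"),
    "DemAffl": ("Input", "Interval"),
    "DemAge": ("Input", "Interval"),
    "DemCluster": ("Rejected", "Nominal"),
    "DemClusterGroup": ("Input", "Nominal"),
    "DemGender": ("Input", "Nominal"),
    "DemRegion": ("Input", "Nominal"),
    "DemTVReg": ("Input", "Nominal"),
    "PromClass": ("Input", "Nominal"),
    "PromSpend": ("Input", "Interval"),
    "PromTime": ("Input", "Interval"),
    "TargetBuy": ("Target", "Binary"),
    "TargetAmt": ("Rejected", "Interval"),
}


def names_with_role(role):
    """Return the variable names assigned the given role, in table order."""
    return [name for name, (r, _level) in VARIABLES.items() if r == role]


inputs = names_with_role("Input")
targets = names_with_role("Target")
rejected = names_with_role("Rejected")

print("Inputs:  ", inputs)
print("Target:  ", targets)
print("Rejected:", rejected)
```

Listing the variables by role is a quick way to confirm that TargetBuy is the only target and that DemCluster and TargetAmt are excluded before any model is trained.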

1. Open SAS Enterprise Miner Workstation.
2. Create a new project. Name it Assignment4 and save it to your H drive or USB drive as appropriate.
3. Create a new diagram. Name it Organics.
4. Import the organics.sas7bdat file into SAS Enterprise Miner.
   a. Set up the roles and levels for the variables as shown above.
   b. Examine the distribution of the target variable. What is the proportion of individuals who purchased organic products?
   c. The variable DemClusterGroup contains collapsed levels of the variable DemCluster. Presume that, based on previous experience, you believe that DemClusterGroup is sufficient for this type of modeling effort. Set the model role for DemCluster to Rejected.
   d. As noted above, only TargetBuy will be used for this analysis and should have a role of Target. Can TargetAmt be used as an input for a model used to predict TargetBuy? Why or why not?
   e. Finish the ORGANICS data source definition.
5. Add the ORGANICS data source to the Organics diagram workspace.
6. Add a Data Partition node to the diagram and connect it to the Data Source node. Assign 50% of the data for training and 50% for validation.
7. Add a Decision Tree node to the workspace and connect it to the Data Partition node.
8. Create a decision tree model autonomously. Use average square error as the model assessment statistic.
   a. How many leaves are in the optimal tree?
   b. Which variable was used for the first split?
   c. What were the competing splits for this first split?
9. Add a second Decision Tree node to the diagram and connect it to the Data Partition node.
   a. In the Properties panel of the new Decision Tree node, change the maximum number of branches to allow for three-way splits.
   b. Create a decision tree model using average square error as the model assessment statistic.
   c. How many leaves are in the optimal tree?
   d. Based on average square error, which of the decision tree models appears to be better?
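
Steps 4b, 6, and 8 rest on three simple computations: the proportion of a binary target, a 50/50 split of the data, and the average squared error of predicted probabilities. The Python sketch below illustrates each under stated assumptions. It uses simple random sampling, whereas the SAS Enterprise Miner Data Partition node may stratify on the target depending on its settings, and it uses the common definition of average squared error as the mean of (target - predicted probability)^2. All function names and the toy values are hypothetical and are not taken from the ORGANICS data.

```python
import random


def target_proportion(targets):
    """Fraction of observations whose binary target equals 1."""
    return sum(targets) / len(targets)


def partition_50_50(rows, seed=12345):
    """Shuffle a copy of the rows and split it into two halves.

    This is simple random sampling with a fixed seed for repeatability;
    it does not stratify on the target.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def average_squared_error(targets, predicted_probs):
    """Mean of (target - predicted probability)^2 over all observations."""
    n = len(targets)
    return sum((y - p) ** 2 for y, p in zip(targets, predicted_probs)) / n


# Toy values invented for illustration only.
toy_targets = [1, 0, 0, 1, 0, 0, 0, 1]
toy_probs_a = [0.8, 0.2, 0.3, 0.6, 0.1, 0.4, 0.2, 0.7]  # hypothetical model A
toy_probs_b = [0.5] * len(toy_targets)                   # hypothetical model B

train, validate = partition_50_50(range(len(toy_targets)))
print("Proportion of buyers:", target_proportion(toy_targets))
print("Training rows:", sorted(train), "Validation rows:", sorted(validate))
print("ASE model A:", average_squared_error(toy_targets, toy_probs_a))
print("ASE model B:", average_squared_error(toy_targets, toy_probs_b))
```

When two models are compared on the same validation data, the one with the lower average squared error is the one whose predicted probabilities sit closer to the observed outcomes.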
