North Georgia Technical College Data Mining with R Computer Science Task



2.11 The dataset ToyotaCorolla.csv contains data on used cars on sale during the late summer of 2004 in the Netherlands. It has 1436 records containing details on 38 attributes, including Price, Age, Kilometers, HP, and other specifications.

a. Explore the data using the data visualization capabilities of R. Which of the pairs among the variables seem to be correlated?

b. We plan to analyze the data using various data mining techniques described in future chapters. Prepare the data for use as follows:

i. The dataset has two categorical attributes, Fuel Type and Metallic. Describe how you would convert these to binary variables. Confirm this using R’s functions to transform categorical data into dummies.

ii. Prepare the dataset (as factored into dummies) for data mining techniques of supervised learning by creating partitions in R. Select all the variables and use default values for the random seed and partitioning percentages for training (50%), validation (30%), and test (20%) sets. Describe the roles that these partitions will play in modeling.

Attached ToyotaCorolla.csv for your reference.
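One possible way to approach the steps in 2.11 is sketched below in base R. Since the attached file is not reproduced here, the sketch builds a small synthetic data frame as a stand-in; the column names (`Price`, `Age`, `Kilometers`, `HP`, `Fuel_Type`, `Metallic`), the fuel categories, and the seed value are assumptions for illustration and may not match the real CSV.

```r
# Synthetic stand-in for ToyotaCorolla.csv (all names and values are assumptions).
# With the real file one would start from: cars <- read.csv("ToyotaCorolla.csv")
set.seed(1)
n_rows <- 200
fuel_levels <- c("CNG", "Diesel", "Petrol")
cars <- data.frame(
  Price      = round(rnorm(n_rows, mean = 10000, sd = 3000)),
  Age        = sample(1:80, n_rows, replace = TRUE),
  Kilometers = round(runif(n_rows, 1000, 200000)),
  HP         = sample(c(69, 90, 110), n_rows, replace = TRUE),
  Fuel_Type  = factor(sample(fuel_levels, n_rows, replace = TRUE),
                      levels = fuel_levels),
  Metallic   = sample(0:1, n_rows, replace = TRUE)  # assumed already coded 0/1
)

# (a) Pairwise correlations among the numeric columns
num_cols <- sapply(cars, is.numeric)
cor_mat <- cor(cars[, num_cols])
print(round(cor_mat, 2))
# pairs(cars[, num_cols])  # scatterplot matrix for a visual check

# (b.i) One 0/1 dummy column per Fuel_Type level ("- 1" drops the intercept)
dummies <- model.matrix(~ Fuel_Type - 1, data = cars)
cars_d <- cbind(cars[, setdiff(names(cars), "Fuel_Type")], dummies)

# (b.ii) Random 50% / 30% / 20% partition of the rows
set.seed(1)
n <- nrow(cars_d)
idx <- sample(n)                 # random permutation of row indices
n_train <- round(0.5 * n)
n_valid <- round(0.3 * n)
train <- cars_d[idx[1:n_train], ]
valid <- cars_d[idx[(n_train + 1):(n_train + n_valid)], ]
test  <- cars_d[idx[(n_train + n_valid + 1):n], ]

cat(nrow(train), nrow(valid), nrow(test), "\n")
```

On the real data, the same calls would apply after adjusting the column names to those in the CSV header. Many modeling functions need one dummy fewer than the number of categories, so one of the `Fuel_Type` columns may have to be dropped before fitting.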

The assignment asks for answers to the following questions at the end of chapter 2 in the text: Problems 2, 3, 4, 7. But at the end of chapter 2 the problems are numbered 2.1, 2.2, etc., like below. I am confused whether I need to answer 2.2, 2.3, and so on, or something else.

2.1 Assuming that data mining techniques are to be used in the following cases, identify whether the task required is supervised or unsupervised learning.

a. Deciding whether to issue a loan to an applicant based on demographic and financial data (with reference to a database of similar data on prior customers).

b. In an online bookstore, making recommendations to customers concerning additional items to buy based on the buying patterns in prior transactions.

c. Identifying a network data packet as dangerous (virus, hacker attack) based on comparison to other packets whose threat status is known.

d. Identifying segments of similar customers.

e. Predicting whether a company will go bankrupt based on comparing its financial data to those of similar bankrupt and nonbankrupt firms.

f. Estimating the repair time required for an aircraft based on a trouble ticket.

g. Automated sorting of mail by zip code scanning.

h. Printing of custom discount coupons at the conclusion of a grocery store checkout based on what you just bought and what others have bought previously.
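As a generic illustration of the distinction that 2.1 asks about, the sketch below contrasts the two kinds of task on R's built-in `iris` data. The choice of dataset, variables, and seed is an assumption made only for illustration; it is not taken from the text and does not map to any particular lettered case.

```r
data(iris)

# Supervised: a known outcome (Petal.Length) is available for each record,
# and the model learns to predict it from other variables.
fit <- lm(Petal.Length ~ Sepal.Length + Sepal.Width, data = iris)
pred <- predict(fit, newdata = iris[1:5, ])
print(pred)

# Unsupervised: no outcome column is used; records are grouped purely
# by similarity of their measurements.
set.seed(42)
clusters <- kmeans(iris[, 1:4], centers = 3)
print(table(clusters$cluster))
```

The question to ask for each case is whether there is a known outcome value in past records that the method is trained to reproduce.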

2.2 Describe the difference in roles assumed by the validation partition and the test partition.
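A minimal sketch of how the two partitions are commonly used in practice is shown below, assuming the built-in `mtcars` data, two arbitrary candidate regression models, and an arbitrary seed; none of these come from the text. The validation partition is used to choose between candidates, and the test partition is touched only once, to estimate how the chosen model performs on new data.

```r
set.seed(7)
idx <- sample(nrow(mtcars))        # 32 rows, shuffled
train <- mtcars[idx[1:16], ]       # 50%: fit the models
valid <- mtcars[idx[17:26], ]      # about 30%: compare the candidates
test  <- mtcars[idx[27:32], ]      # about 20%: final, one-time assessment

# Root mean squared error of a model's mpg predictions on a given partition
rmse <- function(model, data) {
  sqrt(mean((data$mpg - predict(model, newdata = data))^2))
}

# Candidate models, both fitted on the training partition only
candidates <- list(
  small = lm(mpg ~ wt, data = train),
  large = lm(mpg ~ wt + hp + disp, data = train)
)

# Validation error drives the choice of model
valid_rmse <- sapply(candidates, rmse, data = valid)
best <- names(which.min(valid_rmse))

# Test error is computed once, for the chosen model only
test_rmse <- rmse(candidates[[best]], test)
print(valid_rmse)
cat("chosen:", best, " test RMSE:", round(test_rmse, 2), "\n")
```

Because the validation error influenced which model was picked, it tends to be an optimistic estimate for that model, which is why a separate test partition is held back.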
