Write a report on Building and Evaluating Predictive Models using R (should be familiar with machine learning)

Question Description

Files to be submitted: a Word document of the report and the R script files.

You may refer to the "Assignment1 – tasks.pdf" file in the attachments for the list of tasks and instructions; the same content is reproduced below.

Objective:

a) Demonstrate knowledge of data exploration and of selecting variables to use in the predictive models

b) Demonstrate knowledge of building different types of predictive models using R

c) Demonstrate knowledge of comparing and evaluating different predictive models

d) Relate theoretical knowledge of predictive models and best practices to application scenarios

Part A – Data Exploration and Cleaning

Use the data for breakfast cereals (Cereals.csv) to answer the following:

1. Which variables are continuous/numerical? Which are ordinal? Which are nominal?

2. Calculate the following summary statistics: mean, median, max and standard deviation for each of the continuous variables, and count for each categorical variable. Is there any evidence of extreme values? Briefly discuss.

3. Plot a histogram for each of the continuous variables and create summary statistics. Based on the histograms and summary statistics, answer the following and provide brief explanations:

a. Which variables have the largest variability?

b. Which variables seem skewed?

c. Are there any values that seem extreme?

4. Which, if any, of the variables have missing values?

a. What are the methods of handling missing values?

b. Demonstrate the output (summary statistics and transformation plot) for each method in (4-a).

c. Apply the 3 methods of missing value handling discussed in the lectures. Which method of handling missing values is most suitable for this data set? Discuss briefly referring to the data set.
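The Part A steps can be sketched in base R as follows. This is a minimal illustration, not a model answer: the data frame is a small hypothetical stand-in for Cereals.csv (its column names and values are invented), and the three strategies shown (listwise deletion, mean imputation, median imputation) are common choices assumed here because the brief does not name the methods covered in the lectures.

```r
# Hypothetical stand-in for Cereals.csv; names and values are illustrative only.
cereals <- data.frame(
  name     = c("A", "B", "C", "D", "E", "F"),
  mfr      = factor(c("K", "G", "K", "P", "G", "K")),
  calories = c(70, 120, 110, NA, 100, 150),
  sugars   = c(6, 8, NA, 3, 10, 12),
  rating   = c(68.4, 34.0, 29.5, 59.4, 40.1, 33.2)
)

# Identify the numeric (continuous) columns.
num_cols <- names(cereals)[sapply(cereals, is.numeric)]

# Mean, median, max and standard deviation per numeric column, ignoring NAs.
summary_stats <- t(sapply(cereals[num_cols], function(x) {
  c(mean   = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    max    = max(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE))
}))
print(round(summary_stats, 2))

# Counts for a categorical column.
print(table(cereals$mfr))

# Number of missing values in each column.
missing_counts <- colSums(is.na(cereals))
print(missing_counts)

# Assumed strategy 1: listwise deletion (drop every row containing an NA).
deleted <- na.omit(cereals)

# Assumed strategies 2 and 3: replace NAs with the column mean or median.
impute <- function(df, cols, f) {
  for (col in cols) {
    miss <- is.na(df[[col]])
    df[[col]][miss] <- f(df[[col]][!miss])
  }
  df
}
mean_imputed   <- impute(cereals, num_cols, mean)
median_imputed <- impute(cereals, num_cols, median)
```

With the real file, `read.csv("Cereals.csv")` would replace the hand-built data frame, and `hist()` applied to each numeric column would give the histograms asked for in question 3. Re-running the summary statistics on each of the three resulting data frames shows how much each strategy shifts the mean and spread.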

Part B – Building Predictive Models Using a Real-World Business Case

Business Case

Alpha Traders Pty Ltd., an Australian car sales company, has purchased a stock of used Toyota Corolla cars for sale. Alpha's management is finalizing the selling prices of the purchased cars. Management is keen to trial predictive modelling for this task and has obtained a historical sales dataset of Toyota Corolla cars from a publicly available data repository.

The dataset contains 37 attributes of over 1400 sold Toyota Corolla cars. The attributes include the selling price of cars, age, kilometres driven, fuel type, horsepower, automatic or manual, number of doors, weight (in pounds), etc.

The management of Alpha Traders Pty Ltd. has outsourced the task to you to develop a reliable predictive model to predict the selling price of the cars, using the aforementioned historic dataset.

1. Data Exploration and Cleaning (15%)

a. Examine the prices of the Toyota Corolla vehicles. Explain the distribution of the prices.

b. Find out whether there are any missing values. Explain your findings.

c. Are there any categorical variables that need to be transformed into numerical values? Suggest the best possible transformation. Use this method to transform the variable(s).

d. Evaluate the correlations between the variables. Which variables should be used for dimension reduction? Explain. Carry out dimensionality reduction.

e. Explore the distribution of selected variables (from step 1-d) against the target variable. Explain.
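One possible way to approach steps 1-b to 1-d in base R is sketched below. The data are synthetic and the column names (`Price`, `Age`, `KM`, `HP`, `Fuel_Type`) are assumptions chosen to resemble the attributes listed in the business case; the real dataset may name them differently. Dummy coding through `model.matrix()` is shown as one common transformation, not necessarily the one the assignment expects, and the 0.8 correlation cut-off is an arbitrary illustrative threshold.

```r
set.seed(42)
n <- 200

# Synthetic stand-in for the Toyota Corolla data (hypothetical columns).
cars <- data.frame(
  Age       = round(runif(n, 1, 80)),
  KM        = round(runif(n, 1000, 200000)),
  HP        = sample(c(69, 86, 97, 110), n, replace = TRUE),
  Fuel_Type = factor(sample(c("Petrol", "Diesel", "CNG"), n, replace = TRUE))
)
cars$Price <- 20000 - 150 * cars$Age - 0.02 * cars$KM +
  30 * cars$HP + rnorm(n, sd = 800)

# Step 1-b: missing values per column.
print(colSums(is.na(cars)))

# Step 1-c: dummy-code the categorical variable.
# The first factor level becomes the reference and the intercept is dropped.
dummies  <- model.matrix(~ Fuel_Type, data = cars)[, -1, drop = FALSE]
cars_num <- cbind(cars[, c("Price", "Age", "KM", "HP")], dummies)

# Step 1-d: correlation matrix of all numeric variables.
cor_mat <- cor(cars_num)

# Rank predictors by absolute correlation with the target.
predictors <- setdiff(colnames(cor_mat), "Price")
price_cor  <- sort(abs(cor_mat["Price", predictors]), decreasing = TRUE)
print(round(price_cor, 2))

# Flag predictor pairs that are highly correlated with each other;
# one variable of each such pair is a candidate for removal.
pred_cor   <- cor_mat[predictors, predictors]
high_pairs <- which(abs(pred_cor) > 0.8 & upper.tri(pred_cor), arr.ind = TRUE)
print(high_pairs)
```

For step 1-e, a scatter plot such as `plot(cars$Age, cars$Price)` for each retained predictor, and `hist(cars$Price)` for step 1-a, would show the relationships and the price distribution visually.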

2. Regression Modelling (20%)

a. Build a regression model with the selected variables. You need to try out at least 3 regression models to identify the optimal model.

b. Evaluate the accuracy of the regression model.
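A hedged sketch of the regression step follows. It interprets "at least 3 regression models" as three linear models with different predictor sets, which is only one reading of the task. The data are synthetic, the column names are hypothetical, and the 70/30 split and the use of RMSE as the accuracy measure are assumptions.

```r
set.seed(1)
n <- 300

# Synthetic stand-in for the car sales data (hypothetical columns).
cars <- data.frame(
  Age = round(runif(n, 1, 80)),
  KM  = round(runif(n, 1000, 200000)),
  HP  = sample(c(69, 86, 97, 110), n, replace = TRUE)
)
cars$Price <- 20000 - 150 * cars$Age - 0.02 * cars$KM +
  30 * cars$HP + rnorm(n, sd = 800)

# Hold out 30% of the rows for validation.
train_idx <- sample(seq_len(n), size = floor(0.7 * n))
train <- cars[train_idx, ]
valid <- cars[-train_idx, ]

# Three candidate models of increasing size.
formulas <- list(
  age_only = Price ~ Age,
  age_km   = Price ~ Age + KM,
  all      = Price ~ Age + KM + HP
)

# Root mean squared error on held-out data.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Fit each model on the training set and score it on the validation set.
results <- sapply(formulas, function(f) {
  fit <- lm(f, data = train)
  rmse(valid$Price, predict(fit, newdata = valid))
})
print(round(results, 1))
```

The model with the lowest validation RMSE is the natural candidate for the "optimal" model, although `summary()` of each fit (adjusted R-squared, coefficient significance) is worth reporting alongside it.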

3. Decision Tree Modelling (20%)

a. Build a decision tree with the selected variables. You need to try out at least 3 decision trees with different complexity parameters to obtain the optimal tree.

b. Explain the output of the selected decision tree, evaluate its accuracy, and give the reason for selecting it.
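The decision tree step could be sketched as below, assuming the `rpart` package (a recommended package shipped with standard R installations) is the intended tool; the brief does not name one. The data are synthetic, the column names are hypothetical, and the three complexity parameter values are arbitrary examples.

```r
library(rpart)

set.seed(7)
n <- 300

# Synthetic stand-in for the car sales data (hypothetical columns).
cars <- data.frame(
  Age = round(runif(n, 1, 80)),
  KM  = round(runif(n, 1000, 200000)),
  HP  = sample(c(69, 86, 97, 110), n, replace = TRUE)
)
cars$Price <- 20000 - 150 * cars$Age - 0.02 * cars$KM +
  30 * cars$HP + rnorm(n, sd = 800)

# Hold out 30% of the rows for validation.
train_idx <- sample(seq_len(n), size = floor(0.7 * n))
train <- cars[train_idx, ]
valid <- cars[-train_idx, ]

rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Three example complexity parameters: a smaller cp allows a larger tree.
cps <- c(0.1, 0.01, 0.001)

# Fit one regression tree per cp and record its size and validation RMSE.
tree_results <- t(sapply(cps, function(cp) {
  fit <- rpart(Price ~ Age + KM + HP, data = train, method = "anova",
               control = rpart.control(cp = cp))
  c(cp     = cp,
    leaves = sum(fit$frame$var == "<leaf>"),
    rmse   = rmse(valid$Price, predict(fit, newdata = valid)))
}))
print(tree_results)
```

Comparing the `leaves` and `rmse` columns shows the trade-off between tree size and validation error. For a chosen fit, `print(fit)` lists the split rules and `printcp(fit)` reports the cross-validated error at each cp value.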

4. Model Comparison (15%)

a. Compare the accuracy of the selected (optimal) regression model and the selected (optimal) decision tree, then discuss and justify which is the most suitable predictive model for the business case.
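A minimal sketch of the comparison follows: both models are scored on the same held-out rows with the same metrics. It assumes `rpart` for the tree, RMSE and MAE as the accuracy measures, synthetic data, and hypothetical column names. Because the synthetic prices are generated from a linear formula, the result here says nothing about which model wins on the real dataset.

```r
library(rpart)

set.seed(11)
n <- 300

# Synthetic stand-in for the car sales data (hypothetical columns).
cars <- data.frame(
  Age = round(runif(n, 1, 80)),
  KM  = round(runif(n, 1000, 200000)),
  HP  = sample(c(69, 86, 97, 110), n, replace = TRUE)
)
cars$Price <- 20000 - 150 * cars$Age - 0.02 * cars$KM +
  30 * cars$HP + rnorm(n, sd = 800)

# Use one shared train/validation split so the comparison is like for like.
train_idx <- sample(seq_len(n), size = floor(0.7 * n))
train <- cars[train_idx, ]
valid <- cars[-train_idx, ]

# Stand-ins for the "optimal" models chosen in the earlier steps.
lm_fit   <- lm(Price ~ Age + KM + HP, data = train)
tree_fit <- rpart(Price ~ Age + KM + HP, data = train, method = "anova",
                  control = rpart.control(cp = 0.01))

# Root mean squared error and mean absolute error.
metrics <- function(actual, predicted) {
  c(RMSE = sqrt(mean((actual - predicted)^2)),
    MAE  = mean(abs(actual - predicted)))
}

comparison <- rbind(
  regression = metrics(valid$Price, predict(lm_fit, newdata = valid)),
  tree       = metrics(valid$Price, predict(tree_fit, newdata = valid))
)
print(round(comparison, 1))

# Model with the lowest validation RMSE.
best <- rownames(comparison)[which.min(comparison[, "RMSE"])]
print(best)
```

Accuracy is only part of the justification: for the business case, how easily Alpha Traders management can interpret the model (coefficients versus split rules) is also worth discussing.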
