Date: 2019-03-16 11:12

Projects #1 and #2

Analyzing Big Data I

Project 1. Classification is the process of predicting the class of given data points. Classes are sometimes called targets, labels, or categories. Classification predictive modeling is the task of approximating a mapping function (f) from input variables (X) to discrete output variables (y).

For example, spam detection by email service providers is a classification problem. It is a binary classification because there are only two classes: spam and not spam. A classifier uses training data to learn how the input variables relate to the class. In this case, known spam and non-spam emails serve as the training data. Once the classifier is trained accurately, it can be used to classify a previously unseen email.
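The train-then-classify workflow described above can be sketched in a few lines. This is only an illustration under stated assumptions: the toy emails, the labels `spam` / `not_spam`, and the helper names `train` and `classify` are all hypothetical, and the classifier chosen here (a small multinomial Naive Bayes with add-one smoothing) is one of many possible choices. It is written in Python for brevity; the project itself calls for the R code provided in the course.

```python
import math
from collections import Counter

# Hypothetical toy training data: (email text, known label).
TRAIN = [
    ("win cash prize now", "spam"),
    ("cheap loans win big", "spam"),
    ("claim your free prize", "spam"),
    ("meeting agenda for monday", "not_spam"),
    ("lunch on monday with the team", "not_spam"),
    ("project report attached", "not_spam"),
]

def train(examples):
    """Count words per class and emails per class from labeled examples."""
    word_counts = {}          # label -> Counter of word frequencies
    class_counts = Counter()  # label -> number of training emails
    vocab = set()
    for text, label in examples:
        words = text.lower().split()
        word_counts.setdefault(label, Counter()).update(words)
        class_counts[label] += 1
        vocab.update(words)
    return word_counts, class_counts, vocab

def classify(model, text):
    """Return the label with the highest log-posterior score for the text."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior of the class.
        score = math.log(class_counts[label] / total)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAIN)
print(classify(model, "free cash prize"))
print(classify(model, "monday project meeting"))
```

Here X is the bag of words in an email and y is one of two discrete labels, so the learned function f maps text to a class. A real evaluation would also hold out test data to measure accuracy.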

Classification belongs to the category of supervised learning, where the targets are provided along with the input data. Classification has applications in many domains, such as credit approval, medical diagnosis, and target marketing.

Working in a group of no more than two, perform a complete data science evaluation of a dataset suited to classification. You will need to:

  • Find an appropriate dataset.
  • Use the code provided to you in R to perform the analysis.
  • Write up a detailed synopsis of your analysis. Please keep it brief: no more than five pages.

Project 2. Prediction is the process of estimating the values of a target quantity from input data. Again, the task is to approximate a mapping function (f) from input variables (X) to output variables (y); here, however, y is continuous rather than discrete. For example, regression can be used to model unknown y values given X. Prediction has applications in many domains, such as stock analysis, loan size estimation, and econometric theory.
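As a rough illustration of regression modeling unknown y values given X, the sketch below fits a straight line by ordinary least squares and uses it to predict a numeric value. The data points and the helper names `fit_line` and `predict_y` are hypothetical, and a single-variable linear model is assumed purely for simplicity. It is written in Python for brevity; the project itself calls for the R code provided in the course.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_y(slope, intercept, x):
    """Predict a numeric y for a new input x using the fitted line."""
    return intercept + slope * x

# Hypothetical observations: input variable X and numeric target y.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"prediction at x=6: {predict_y(slope, intercept, 6.0):.2f}")
```

Unlike the classification case, the output here is a number on a continuous scale, so the quality of the fit is usually judged by an error measure such as mean squared error rather than by accuracy.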

Working in a group of no more than two, perform a complete data science evaluation of a dataset suited to prediction. You will need to:

  • Find an appropriate dataset.
  • Use the code provided to you in R to perform the analysis.
  • Write up a detailed synopsis of your analysis. Please keep it brief: no more than five pages.

