

This project is to be done individually. All coding in this project must be done in Python. You may only use techniques/algorithms covered in this class. However, you may use parameters, attributes, etc. that are not covered in this class as long as they belong to techniques/algorithms covered in the class. For example, Support Vector Classification (SVC) is an algorithm covered in the class, but kernel=sigmoid is a parameter that is not covered in the class; you are allowed to use kernel=sigmoid with SVC. By contrast, the genetic algorithm is not covered in the class, so you are not allowed to use it in this project.
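To make the SVC example concrete, here is a minimal sketch of passing kernel="sigmoid" to scikit-learn's SVC. The synthetic data from make_classification, the StandardScaler step, and the 75/25 split are assumptions made for illustration only; they are not requirements of the assignment.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption); a real submission would use
# predictors taken from the Kickstarter dataset instead.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# SVC is the covered algorithm; kernel="sigmoid" is one of its parameters.
# Scaling is added here because SVMs are sensitive to feature scale.
model = make_pipeline(StandardScaler(), SVC(kernel="sigmoid"))
model.fit(X_train, y_train)

# Mean accuracy on the held-out 25% of the data.
accuracy = model.score(X_test, y_test)
print(f"Hold-out accuracy: {accuracy:.2f}")
```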


Your task in this assignment is to:


1. Provide summary statistics along with visualizations of these summary statistics for the variables that are interesting/relevant to your analyses.


2. Develop a regression model (i.e., a supervised-learning model where the target variable is a continuous variable) to predict the value of the variable “pledged.” After you obtain the final model, explain the model and justify the predictors you include/exclude.


3. Develop a classification model (i.e., a supervised-learning model where the target variable is a categorical variable) to predict whether the variable “state” will take the value “successful” or “failure.” After you obtain the final model, explain the model and justify the predictors you include/exclude.


4. Develop a cluster model (i.e., an unsupervised-learning model which can group observations together) to group projects together. After you obtain the final clusters, explain the characteristics that you observe in each cluster. Note that you will be graded based on the performance of the model and the insights you obtain from the clustering task. For example, if your model generates two clusters with a high cluster separation and low cluster cohesion but one cluster essentially represents successful projects while another cluster essentially represents failed projects, then the insights gained are severely limited.
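The caution in task 4 suggests one simple sanity check: after clustering, look at how the "state" values are distributed inside each cluster. The sketch below is one possible way to do that in plain Python. The helper name state_share_by_cluster and the toy labels are hypothetical, and the check is an illustration rather than a grading criterion from the assignment.

```python
from collections import Counter, defaultdict


def state_share_by_cluster(cluster_labels, states):
    """Return, for each cluster, the share of each project state in it."""
    if len(cluster_labels) != len(states):
        raise ValueError("cluster_labels and states must have the same length")
    counts = defaultdict(Counter)
    for cluster, state in zip(cluster_labels, states):
        counts[cluster][state] += 1
    return {
        cluster: {s: n / sum(counter.values()) for s, n in counter.items()}
        for cluster, counter in counts.items()
    }


# Toy labels (hypothetical): cluster 0 contains only successful projects,
# cluster 1 is a mix of failed and successful ones.
clusters = [0, 0, 0, 1, 1, 1, 1]
states = ["successful", "successful", "successful",
          "failed", "failed", "successful", "failed"]

shares = state_share_by_cluster(clusters, states)
for cluster, share in sorted(shares.items()):
    print(cluster, share)
```

If every cluster is dominated by a single state, the clusters mostly restate the classification target, which is the situation the task description warns about.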


For all tasks, you will be graded based on both the performance of the model and the explanations/justifications you provide. You also need to clearly articulate how realistic or useful your model would be in a business context.
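The assignment does not say which performance measures will be used. As an assumption for illustration, the sketch below computes mean squared error for the regression task and accuracy for the classification task on a few made-up hold-out values; the numbers and variable names are hypothetical.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between actual and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the actual labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up hold-out values (hypothetical), not taken from the dataset.
pledged_true = [100.0, 250.0, 0.0, 40.0]
pledged_pred = [110.0, 240.0, 10.0, 40.0]
state_true = ["successful", "failed", "failed", "successful"]
state_pred = ["successful", "failed", "successful", "successful"]

mse = mean_squared_error(pledged_true, pledged_pred)
acc = accuracy(state_true, state_pred)
print(f"MSE: {mse}, accuracy: {acc}")
```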


There are two deliverables for this assignment:


1) Summary Report

The report must be submitted in pdf format.

The report must not exceed 5 double-spaced pages, including everything. Page margins must measure 1” around. Please use a 12-point Times New Roman font.

Name the file as follows: “Lastname_Firstname_IndividualProject”

The report must contain:

o The summary statistics

o The explanations/justifications of each model along with the results. You may submit only one model per task.

The report is due by Tuesday, December 3 at 11:59pm.


2) Python Code


Along with the report, please also submit Python code that you use to develop your report. The code should be complete with informative comments and able to run fully without any errors or modifications (besides the file path).


Data Description


The dataset in this project is scraped from Kickstarter, which is a popular crowdfunding platform. There are 45 variables in total. The table below contains a short description of each variable.


Column Name                   Description
project_id                    Unique identifier for projects
name                          Project name
goal                          Goal amount requested by the project
pledged                       Amount pledged at time of data scrape
state                         Status of the project (successful, failed, etc.)
disable_communication         If communication with project owners was disabled
country                       Origin country of project
currency                      Currency of origin country
deadline                      End date of project funding period
state_changed_at              Date and time the project state was modified to current state
created_at                    Date and time project was created
launched_at                   Date and time project was launched
staff_pick                    If the project was a staff pick
backers_count                 Number of backers
static_usd_rate               The conversion rate of project country currency to USD
usd_pledged                   Amount pledged in USD
category                      Category of project
spotlight                     If the project was featured on Kickstarter spotlight page
name_len                      Length of project name in word count
name_len_clean                Length of project name in word count sans non-key words (such as “for,” “and,” etc.)
blurb_len_clean               Length of project blurb in word count sans non-key words
deadline_weekday              Weekday of deadline date
state_changed_at_weekday      Weekday of state change
created_at_Weekday            Weekday of creation date
launched_at_weekday           Weekday of launch date
deadline_month                Month of the project deadline
deadline_day                  Day of the project deadline
deadline_yr                   Year of the project deadline
deadline_hr                   Hour of project deadline
state_changed_at_month        Month of latest state change
state_changed_at_day          Day of latest state change
state_changed_at_yr           Year of latest state change
state_changed_at_hr           Hour of latest state change
created_at_month              Month of creation date
created_at_day                Day of creation date
created_at_yr                 Year of creation date
created_at_hr                 Hour of creation date
launched_at_month             Month of launch date
launched_at_day               Day of launch date
launched_at_yr                Year of launch date
launched_at_hr                Hour of launch date
create_to_launch_days         Number of days between project creation and the public launch date
launch_to_deadline_days       Number of days between the launch date and the deadline
launch_to_state_change_days   Number of days between launch date and the latest status change
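Many of the columns above are derived from the created_at, launched_at, and deadline timestamps. The sketch below shows one way such fields could be recomputed with the standard library. The ISO-style timestamp strings, the English weekday names, the use of whole days for the day counts, and the helper name derive_date_fields are all assumptions; the dataset's actual encoding may differ.

```python
from datetime import datetime

# Fixed English names so the result does not depend on the system locale.
WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]


def derive_date_fields(created_at, launched_at, deadline):
    """Recompute a few derived date columns from three timestamp strings."""
    created = datetime.fromisoformat(created_at)
    launched = datetime.fromisoformat(launched_at)
    end = datetime.fromisoformat(deadline)
    return {
        "deadline_weekday": WEEKDAY_NAMES[end.weekday()],  # Monday is index 0
        "deadline_month": end.month,
        "deadline_day": end.day,
        "deadline_yr": end.year,
        "deadline_hr": end.hour,
        # timedelta.days keeps whole days and drops the remainder.
        "create_to_launch_days": (launched - created).days,
        "launch_to_deadline_days": (end - launched).days,
    }


# Made-up timestamps (hypothetical), not taken from the dataset.
fields = derive_date_fields(
    "2016-02-20T09:30:00", "2016-03-01T10:00:00", "2016-03-31T10:00:00"
)
for key, value in fields.items():
    print(key, value)
```

Because these columns are functions of the same three timestamps, they carry overlapping information, which is worth keeping in mind when justifying which predictors to include or exclude.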

