This project is to be done individually. All the coding involved in this project must be in Python. You may only use techniques/algorithms covered in this class. However, you are allowed to use parameters, attributes, etc. that are not covered in this class as long as they belong to the techniques/algorithms covered in the class. For example, Support Vector Classification is an algorithm covered in the class, but kernel=sigmoid is a parameter that is not covered in the class. You are allowed to use kernel=sigmoid with SVC. Meanwhile, the genetic algorithm is not covered in the class, so you are not allowed to use a genetic algorithm in this project.
Your task in this assignment is to:
1. Provide summary statistics along with visualizations of these summary statistics for the variables that are interesting/relevant to your analyses.
2. Develop a regression model (i.e., a supervised-learning model where the target variable is a continuous variable) to predict the value of the variable “pledged.” After you obtain the final model, explain the model and justify the predictors you include/exclude.
3. Develop a classification model (i.e., a supervised-learning model where the target variable is a categorical variable) to predict whether the variable “state” will take the value “successful” or “failure.” After you obtain the final model, explain the model and justify the predictors you include/exclude.
4. Develop a cluster model (i.e., an unsupervised-learning model which can group observations together) to group projects together. After you obtain the final clusters, explain the characteristics that you observe in each cluster. Note that you will be graded based on the performance of the model and the insights you obtain from the clustering task. For example, if your model generates two clusters with a high cluster separation and low cluster cohesion but one cluster essentially represents successful projects while another cluster essentially represents failed projects, then the insights gained are severely limited.
For all tasks, you will be graded based on both the performance of the model and the explanations/justifications you provide. You also need to clearly articulate how realistic or useful your model would be in a business context.
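The caveat in task 4 (clusters that merely mirror successful versus failed projects) can be checked with a simple cross-tabulation. The following is a minimal sketch, not a required method: the function names and the 0.9 threshold are hypothetical, and the toy labels are made up. In practice the labels would come from whichever clustering technique covered in class you choose.

```python
from collections import Counter, defaultdict


def state_share_by_cluster(cluster_labels, states):
    """Return {cluster: {state: share}} for two parallel lists."""
    if len(cluster_labels) != len(states):
        raise ValueError("cluster_labels and states must have the same length")
    counts = defaultdict(Counter)
    for label, state in zip(cluster_labels, states):
        counts[label][state] += 1
    shares = {}
    for label, counter in counts.items():
        total = sum(counter.values())
        shares[label] = {state: n / total for state, n in counter.items()}
    return shares


def mirrors_state(shares, threshold=0.9):
    """True if every cluster is dominated (share >= threshold) by one state."""
    return all(max(dist.values()) >= threshold for dist in shares.values())


# Toy example with made-up values (not taken from the dataset).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
states = ["successful", "successful", "successful", "failed",
          "failed", "failed", "failed", "failed"]
shares = state_share_by_cluster(labels, states)
print(shares)
print(mirrors_state(shares, threshold=0.9))
```

If the check returns True, the clusters add little beyond what the variable “state” already tells you, and it may be worth reconsidering which variables go into the cluster model.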
There are two deliverables for this assignment:
1) Summary Report
The report must be submitted in pdf format.
The report must not exceed 5 double-spaced pages, including everything. Page margins must measure 1” around. Please use a 12-point Times New Roman font.
Name the file as follows: “Lastname_Firstname_IndividualProject”
The report must contain:
o The summary statistics
o The explanations/justifications of each model along with the results. You may submit only one model per task.
The report is due by Tuesday, December 3 at 11:59pm.
2) Python Code
Along with the report, please also submit the Python code that you used to develop your report. The code should be complete, with informative comments, and able to run fully without any errors or modifications (besides the file path).
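One way to meet the "no modifications besides the file path" requirement is to keep the path in a single constant at the top of the script. The sketch below is only an illustration under assumptions: the file name `kickstarter.csv` and the CSV format are hypothetical (the brief does not state them), and the standard-library `csv` module stands in for whatever data-loading tool you actually use.

```python
import csv
from pathlib import Path

# The only line a grader should need to edit (hypothetical file name).
DATA_PATH = Path("kickstarter.csv")


def load_rows(path):
    """Read the CSV at `path` into a list of dicts, one per project."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def main(path=DATA_PATH):
    """Load the data and report how many projects were read."""
    rows = load_rows(path)
    print(f"Loaded {len(rows)} projects from {path}")
    return rows


if __name__ == "__main__":
    if DATA_PATH.exists():
        main()
    else:
        print(f"Data file not found: {DATA_PATH}")
```

Every later step (summary statistics, models, clusters) can then take the loaded rows as input, so nothing else in the script depends on where the file lives.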
Data Description
The dataset in this project is scraped from Kickstarter, which is a popular crowdfunding platform. There are 45 variables in total. The table below contains a short description of each variable.
Column Name                    Description
project_id                     Unique identifier for projects
name                           Project name
goal                           Goal amount requested by the project
pledged                        Amount pledged at time of data scrape
state                          Status of the project (successful, failed, etc.)
disable_communication          If communication with project owners was disabled
country                        Origin country of project
currency                       Currency of origin country
deadline                       End date of project funding period
state_changed_at               Date and time the project state was modified to current state
created_at                     Date and time project was created
launched_at                    Date and time project was launched
staff_pick                     If the project was a staff pick
backers_count                  Number of backers
static_usd_rate                The conversion rate of project country currency to USD
usd_pledged                    Amount pledged in USD
category                       Category of project
spotlight                      If the project was featured on the Kickstarter spotlight page
name_len                       Length of project name in word count
name_len_clean                 Length of project name in word count sans non-key words (such as “for,” “and,” etc.)
blurb_len_clean                Length of project blurb in word count sans non-key words
deadline_weekday               Weekday of deadline date
state_changed_at_weekday       Weekday of state change
created_at_Weekday             Weekday of creation date
launched_at_weekday            Weekday of launch date
deadline_month                 Month of the project deadline
deadline_day                   Day of the project deadline
deadline_yr                    Year of the project deadline
deadline_hr                    Hour of the project deadline
state_changed_at_month         Month of latest state change
state_changed_at_day           Day of latest state change
state_changed_at_yr            Year of latest state change
state_changed_at_hr            Hour of latest state change
created_at_month               Month of creation date
created_at_day                 Day of creation date
created_at_yr                  Year of creation date
created_at_hr                  Hour of creation date
launched_at_month              Month of launch date
launched_at_day                Day of launch date
launched_at_yr                 Year of launch date
launched_at_hr                 Hour of launch date
create_to_launch_days          Number of days between project creation and the public launch date
launch_to_deadling_days        Number of days between the launch date and the deadline
launch_to_state_change_days    Number of days between the launch date and the latest status change
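Many of the columns above are derived from the four timestamp columns. As a rough sketch of that relationship, the code below splits one timestamp into weekday/month/day/yr/hr parts and counts whole days between two timestamps. The helper names are hypothetical, and the conventions used here (weekday as an English name, partial days dropped) are assumptions; check them against the actual values in the dataset before relying on them.

```python
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def date_parts(timestamp, prefix):
    """Split one datetime into the weekday/month/day/yr/hr parts for `prefix`."""
    return {
        f"{prefix}_weekday": WEEKDAYS[timestamp.weekday()],
        f"{prefix}_month": timestamp.month,
        f"{prefix}_day": timestamp.day,
        f"{prefix}_yr": timestamp.year,
        f"{prefix}_hr": timestamp.hour,
    }


def whole_days_between(start, end):
    """Whole days from `start` to `end` (assumed convention: partial days dropped)."""
    return (end - start).days


# Made-up example timestamps, not taken from the dataset.
launched = datetime(2019, 11, 1, 9, 30)
deadline = datetime(2019, 12, 1, 9, 30)

deadline_parts = date_parts(deadline, "deadline")
campaign_days = whole_days_between(launched, deadline)
print(deadline_parts)
print(campaign_days)
```

Because these columns carry the same information as the raw timestamps, keeping both the originals and the derived parts in one model is usually redundant, which is worth addressing when you justify the predictors you include or exclude.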