代写Programing Assignment 1代写留学生Python语言
- 首页 >> C/C++编程Programing Assignment 1
(Programing)
Please paste code, produced tables and plots on your solution.
1. NumPy is a package which provides convenient matrix/vector computations : (10%)
a. Please generate a 8 × 8 matrix A and find the minimum, mean, maximum values of each row and column using NumPy. (3%)
b. Please generate another 8 × 8 matrix B and find the transpose and inverse of B. (3%)
c. Please compute the element-wise multiplication and matrix multiplication of A and B. (4%)
2. Pandas DataFrame is two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). A Data frame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows and columns. Pandas DataFrame consists of three principal components, the data, rows, and columns. (10%)
a. Given a table of NBA players’ stats as follows, please generate a Pandas DataFrame based on the table. (3%)
Player |
GP |
MIN |
PTS |
FGM |
FGA |
FG% |
3PM |
3PA |
3P% |
James Harden |
11 |
38.5 |
31.6 |
9.9 |
24 |
41.3 |
4.4 |
12.5 |
35 |
Kawhi Leonard |
24 |
39.1 |
30.5 |
10.1 |
20.7 |
49 |
2.3 |
6 |
37.9 |
Paul George |
5 |
40.8 |
28.6 |
8.8 |
20.2 |
43.6 |
3 |
9.4 |
31.9 |
Stephen Curry |
22 |
38.5 |
28.2 |
8.6 |
19.6 |
44.1 |
4.2 |
11.1 |
37.7 |
Damian Lillard |
16 |
40.6 |
26.9 |
8.6 |
20.6 |
41.8 |
|
9.9 |
37.3 |
Giannis Antetokounmpo |
15 |
34.3 |
25.5 |
8.6 |
17.4 |
49.4 |
1.2 |
3.7 |
32.7 |
Nikola Jokic |
14 |
39.7 |
25.1 |
9.4 |
18.6 |
50.6 |
1.6 |
4 |
39.3 |
CJ McCollum |
16 |
39.7 |
24.7 |
|
21.9 |
44 |
2.9 |
7.3 |
39.3 |
Russell Westbrook |
5 |
39.4 |
22.8 |
8 |
22.2 |
36 |
2.2 |
6.8 |
32.4 |
DeMar DeRozan |
7 |
35.9 |
22 |
8.3 |
17 |
48.7 |
0 |
0.1 |
0 |
James Harden |
11 |
38.5 |
31.6 |
9.9 |
24 |
41.3 |
4.4 |
12.5 |
35 |
b. Please check how many data are missing and fill the missing data with the average of other players. (4%)
c. Now, we get the stats of another player as follows, please add his information into our DataFrame. (3%)
Player |
GP |
MIN |
PTS |
FGM |
FGA |
FG% |
3PM |
3PA |
3P% |
Lou Williams |
6 |
29.3 |
21.7 |
7.5 |
17.3 |
43.3 |
1 |
3 |
33.3 |
3. Parkinson Dataset with replicated acoustic features Data Set
(http://archive.ics.uci.edu/ml/datasets/Parkinson+Dataset+with+replicated+acoustic+features+ ) contains acoustic features extracted from 3 voice recording replications of the sustained /a/phonation for each one of the 80 subjects (Some of them with Parkinson's Disease, i.e., status=1). Please find the data as Parkinson.csv file. (Hint: columns ‘ID’ and ‘ Recording’ can not be considered as the features.) (40%)
a. As we discussed in class, given a dataset to analyze, before designing supervised learning model or unsupervised model, we need to understand the structure and statistics of the data, i.e., distribution of class labels, distribution of each feature, etc. Please implement such data analysis using Python. (10%)
b. Considering each record as an individual sample, please train a decision tree classifier (max_depth = 3) to predict the status of each sample. Please plot your decision tree. (15%)
c. As discussed in class, Grid Search can help us to tune the model parameters to find the optimal solution. Please tune your decision tree classifier to improve the predictive performance. (15%)
4. Indian Liver Patient Dataset
(https://archive.ics.uci.edu/ml/datasets/ILPD+%28Indian+Liver+Patient+Dataset%29, please
find the data as the ILPD.csv file.) provides the age, gender, total Bilirubin, direct Bilirubin, total proteins, albumin, A/G ratio, SGPT, SGOT and Alkphos of patients. Please train a KNN classifier and a Logistic Regression classifier to predict class label of the patient. (for KNN classifier please refer to: https://scikit-
learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html ) (40%)
Note: some data are missing.