Assignment 2 - Video Presentation (35%)
Principal Component Analysis
TASK
For your video presentation, you must demonstrate your PCA analysis on the continuous features of the WACY-COM dataset and interpret the results. Submit the recording via the Panopto link on Canvas. Please ensure you follow the instructions carefully.
This assessment is due before midnight on Friday 4 April 2025 (Week 6).
Perform PCA and Visualise Data
(i) First, copy the code below into an R script. Enter your student ID into the command set.seed(.) and run the whole code. The code will create a sub-sample of 400 observations that is unique to you.
#You may need to change/include the path of your working directory
#Import the dataset into R Studio.
dat <- read.csv("WACY-COM.csv", na.strings=NA, stringsAsFactors=TRUE)
set.seed(Enter your student ID here)
#Randomly select 400 rows
selected.rows <- sample(1:nrow(dat),size=400,replace=FALSE)
#Your sub-sample of 400 observations
mydata <- dat[selected.rows,]
dim(mydata) #check the dimension of your sub-sample
(ii) Extract only the continuous features and the APT feature from the WACY-COM dataset and store them as a data frame/tibble. Refer to Assignment 1 for the feature description if needed.
(iii) Clean the extracted data based on the feedback received from Assignment 1.
(iv) Remove the incomplete cases so that the data are usable for PCA in R.
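Steps (ii) to (iv) can be sketched on a toy data frame as follows. This is only an illustration under stated assumptions: the column names (Port, HitsPerSec, ResponseTime) are hypothetical and are not the WACY-COM schema, and the cleaning rule shown (treating negative values as missing) is a made-up example, not the feedback from Assignment 1.

```r
# Toy stand-in for the sub-sample; all column names except APT are hypothetical.
set.seed(1)
toy <- data.frame(
  Port         = factor(sample(c("80", "443"), 20, replace = TRUE)), # categorical
  HitsPerSec   = c(rnorm(18, mean = 50, sd = 10), NA, -1),           # -1 = invalid code
  ResponseTime = c(rexp(19, rate = 0.1), NA),
  APT          = factor(sample(c("Yes", "No"), 20, replace = TRUE))
)

# (ii) Keep the numeric (continuous) columns plus APT.
is.cont   <- sapply(toy, is.numeric)
extracted <- toy[, is.cont | names(toy) == "APT"]

# (iii) Example cleaning rule (an assumption): negative rates are invalid,
# so recode them as missing.
extracted$HitsPerSec[which(extracted$HitsPerSec < 0)] <- NA

# (iv) Keep only rows with no missing values.
clean <- extracted[complete.cases(extracted), ]
dim(clean)
```

Recoding invalid values to NA before calling complete.cases(.) means a single filtering step removes both the originally missing and the invalid observations.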
(v) Perform PCA using prcomp(.) in R, but only on the numeric features (i.e. ignore APT in this step).
- Explain why you believe the data should or should not be scaled, i.e. standardised, when performing PCA.
- Display and describe the individual and cumulative proportions of variance (3 decimal places) explained by each of the principal components.
- Outline how many principal components are adequate to explain at least 50% of the variability in your data.
- Display and interpret the coefficients (or loadings) to 3 decimal places for PC1, PC2 and PC3. Describe which features (based on the loadings) are the key drivers for each of these three principal components.
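The points in part (v) can be illustrated with a small simulated example. The data and feature names (A to D) below are invented for this sketch; it only shows how the scaling choice changes the result when features are measured on very different scales, and how the variance table and loadings can be read from a prcomp(.) object.

```r
# Simulated numeric features on very different scales (hypothetical names).
set.seed(1)
n <- 200
z <- rnorm(n)
num <- data.frame(
  A = 1000 * z + rnorm(n, sd = 200),  # large units
  B = z + rnorm(n, sd = 0.5),         # correlated with A, small units
  C = rnorm(n),
  D = rnorm(n, sd = 0.01)             # tiny units
)

# Without scaling, the feature with the largest variance dominates PC1.
pca.raw <- prcomp(num, center = TRUE, scale. = FALSE)
round(summary(pca.raw)$importance, 3)

# With scaling, each feature is standardised to unit variance first.
pca.scaled <- prcomp(num, center = TRUE, scale. = TRUE)
imp <- summary(pca.scaled)$importance
round(imp[c("Proportion of Variance", "Cumulative Proportion"), ], 3)

# Smallest number of PCs whose cumulative proportion reaches at least 50%.
n.pc <- which(imp["Cumulative Proportion", ] >= 0.5)[1]
n.pc

# Loadings (coefficients) for PC1 to PC3, to 3 decimal places.
round(pca.scaled$rotation[, 1:3], 3)
```

Each column of the rotation matrix has unit length, so a loading's absolute size can be compared across features within the same principal component.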
(vi) Create and display the biplot for PC1 vs. PC2 to visualise the PCA results in the first two dimensions. Colour-code the points based on the APT feature. Explain the biplot by commenting on the PCA plot and the loadings plot individually, and then both plots combined (see Slides 28-29 of Module 3 notes). Finally, comment on and justify which (if any) features can help distinguish APT activity.
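One possible way to draw a colour-coded biplot with base R graphics is sketched below on simulated data. The features F1 to F4, the group shift, and the colours are all assumptions made for illustration; the module notes may use a different plotting function or package. The plot is built by hand here because the scores (points) and loadings (arrows) can then be styled separately.

```r
# Simulated data: F1 and F2 differ between APT groups, F3 and F4 do not.
set.seed(2)
n     <- 150
apt   <- factor(sample(c("No", "Yes"), n, replace = TRUE))
shift <- ifelse(apt == "Yes", 2, 0)
num <- data.frame(
  F1 = rnorm(n) + shift,
  F2 = rnorm(n) + 0.8 * shift,
  F3 = rnorm(n),
  F4 = rnorm(n)
)
mypca <- prcomp(num, center = TRUE, scale. = TRUE)

scores   <- mypca$x[, 1:2]          # PCA plot: one point per observation
loadings <- mypca$rotation[, 1:2]   # loadings plot: one arrow per feature
cols     <- c(No = "steelblue", Yes = "firebrick")

# Scores, colour-coded by APT.
plot(scores, col = cols[as.character(apt)], pch = 19,
     xlab = "PC1", ylab = "PC2", main = "PCA biplot (toy data)")

# Loadings as arrows, rescaled so they are visible on the score axes.
arrow.scale <- 0.8 * max(abs(scores)) / max(abs(loadings))
arrows(0, 0, arrow.scale * loadings[, 1], arrow.scale * loadings[, 2],
       length = 0.08)
text(1.1 * arrow.scale * loadings, labels = rownames(loadings))
legend("topright", legend = names(cols), col = cols, pch = 19, title = "APT")
```

Arrows pointing in the direction along which the two colours separate indicate features that may help distinguish the groups; arrows roughly perpendicular to that direction contribute little to the separation.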
(vii) Based on the results from parts (v) and (vi), describe
- whether PC1 or PC2 (choose one) best assists in classifying APT. Hint: Project all points in the PCA plot onto the PC1 axis (i.e. consider the PC1 scores only) and assess whether there is a clear separation between known and unknown APT actors. Then, project onto the PC2 axis (i.e. consider the PC2 scores only) and evaluate whether the separation is better than in PC1. You can access the PCA scores for PC1 and PC2 via mypca$x, assuming mypca contains your PCA results from prcomp(.).
- the key features in this dimension that can drive this process (Hint: based on your decision above, examine the loadings from part (v) of your chosen PC and choose those whose absolute loading (i.e. disregard the sign) is greater than 0.3).
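The hint in part (vii) amounts to comparing the two groups along one axis at a time. A minimal sketch on simulated data follows; the separation measure used (absolute difference in group means divided by the average within-group standard deviation) is an assumption chosen for illustration and is not prescribed by the task, where a visual assessment of the projected points may be all that is expected.

```r
# Simulated data with hypothetical features; F1 and F2 differ between groups.
set.seed(2)
n     <- 150
apt   <- factor(sample(c("No", "Yes"), n, replace = TRUE))
shift <- ifelse(apt == "Yes", 2, 0)
num <- data.frame(
  F1 = rnorm(n) + shift,
  F2 = rnorm(n) + 0.8 * shift,
  F3 = rnorm(n),
  F4 = rnorm(n)
)
mypca  <- prcomp(num, center = TRUE, scale. = TRUE)
scores <- as.data.frame(mypca$x[, 1:2])

# Side-by-side boxplots of the projected scores for each APT group.
op <- par(mfrow = c(1, 2))
boxplot(scores$PC1 ~ apt, xlab = "APT", ylab = "PC1 score")
boxplot(scores$PC2 ~ apt, xlab = "APT", ylab = "PC2 score")
par(op)

# Illustrative separation measure: standardised difference in group means.
separation <- function(s, g) {
  m <- tapply(s, g, mean)
  v <- tapply(s, g, var)
  abs(m[[1]] - m[[2]]) / sqrt(mean(v))
}
sep  <- c(PC1 = separation(scores$PC1, apt),
          PC2 = separation(scores$PC2, apt))
best <- names(which.max(sep))
round(sep, 3)

# Key features: absolute loading greater than 0.3 on the chosen PC.
load.best <- mypca$rotation[, best]
key <- load.best[abs(load.best) > 0.3]
round(key, 3)
```

Filtering on the unrounded loadings and rounding only for display avoids misclassifying a loading that sits just above the 0.3 threshold.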
Video Presentation Checklist
1. In your video presentation, you must
a. Run your code corresponding to parts (i) to (vii) above
b. Display the relevant output
c. Interpret the output
2. Your video presentation must include a camera shot of yourself in the video capture, unless there is an exceptional reason supported by a Learning Assessment Plan (LAP). 20% is automatically deducted from your final mark if this is not included in your video presentation. If you choose to record with another application, you must make sure that this feature is included.
3. Your video presentation must be between 4 and 5 minutes long.
Marking Rubrics
| Criteria | Fail 0-49% | Pass 50-59% | Credit 60-69% | Distinction 70-79% | High Distinction 80-100% |
| --- | --- | --- | --- | --- | --- |
| Working Code (7%) | Code does not run or contains major flaws, preventing meaningful PCA analysis. Little to no documentation. | Code has significant errors or omissions that affect PCA output. Poor documentation and some redundancy. | Code has a few errors and/or does not fully achieve intended PCA and relevant analyses. Documentation is present but could be improved. | Code runs with minor issues but still performs PCA and relevant tasks correctly. Minimal redundancy and good documentation. | Code runs flawlessly, correctly performs PCA and relevant tasks, and produces meaningful outputs. No errors, redundant code, or inefficiencies. |
| Interpretation of results (18%) | Fails to interpret the PCA results meaningfully or provides incorrect conclusions. | Interpretation is vague, lacks depth, and/or has major inaccuracies or errors. | Provides a basic interpretation with some inaccuracies or missing key insights. | Provides a strong and mostly accurate interpretation of PCA results with minor omissions or inaccuracies. | Provides an in-depth, clear, and accurate interpretation of PCA results, including the significance of principal components and key loadings. Justifies conclusions with evidence. |
| Presentation skills (7%) | The presentation is unclear. The presenter made an attempt at expression, but the pace and tone need improvement to better engage the audience. | The presentation lacks structure. Presenter made a good attempt, but the expression, pace, and tone could be improved. | The presentation is understandable and delivered at a good pace. However, there is minimal confidence in the presentation style. | Clear and structured presentation with minor pacing or engagement issues. Presenter was fluent and displayed good confidence. | The presenter was dynamic, natural, and persuasive, with an appropriate tone. Delivery was clear, confident, and well-structured, with effective pacing and engagement that maintained a high level of confidence throughout. |
| Timing (3%) | Presentation is less than 2 minutes or more than 9 minutes. | Presentation is between 2 and 3 minutes, or between 8 and 9 minutes. | Presentation is between 3 and 4 minutes, or between 7 and 8 minutes. | Presentation is between 6 and 7 minutes. | Presentation is between 4 and 5 minutes. |