BIOL/STATS 2244
Stage 2: Data
Objectives
The Assignment component of the course assesses your achievement of several course-level Learning Outcomes. Stage 2 (this part) is focused on learning outcomes as applied to aspects of the Data stage of PPDAC:
i. Create and interpret appropriate summaries of data;
a. Select appropriate summaries based on research question and variables;
ii. Use statistical software to summarize, analyse, interpret, and communicate data in a reproducible manner;
a. Create graphical and numerical summaries of data in R;
b. Create reproducible analyses using R markdown and LaTeX;
iii. Communicate statistical concepts, analyses, and arguments in an accurate and scholarly manner;
a. Use conventional formats for reporting results of statistical analyses.
You will need to draw on course content, primarily from Topic 3: Study design and considerations, Topic 4: Summarizing & Exploring Data, as well as Labs 1 through 4.
Connection between Stage 1 and Stage 2
Stages 1 and 2 (and 3) relate to each other through the overall Research Objective, “Understanding the effectiveness of computer-assisted learning”. You can refer to the “Research Background” in the Stage 1 file as a refresher on the main concepts/variables if needed. Stage 2 focuses on the Data stage of the PPDAC Framework. We will ignore the sampling strategies, etc. that you proposed in Stage 1. Instead, you will work with real data from a published research study. This Stage 2 file is accompanied by the data (a CSV file named dataset.csv), the research article (named article.pdf) that describes the way the data were collected in its “Materials and Methods” section, as well as some video resources to help you understand the data collection and datafiles.
You should, therefore, review the rest of this instructions file so you understand what you are being asked to do, and then spend some time reviewing the “Materials and Methods” section of the article and associated video resources while exploring the datafile in R.
We will pretend that we are the researchers who generated the data file, using the Materials and Methods outlined in the article. So, make sure you understand what the columns (vectors) in the datafile represent.
Note: there will be more variables in the dataset than you might actually need/use for this Stage.
Being successful on this Stage
Remember, this Stage is evaluating you on three things:
• Your ability to choose appropriate summaries to answer a research question
• Your ability to use R and R markdown files
• Your ability to present data in a conventional, transparent, and scholarly manner
The knowledge to complete these tasks is developed in lecture Topics and Labs; everything you would need to do in R and R markdown is achievable based on the 2244 Lab content. The concepts (i.e. how to select a graph) are based on lecture Topics (and are reinforced in the structure of Lab 4: Lesson 1). So, go back to the 2244 course materials.
Here is my suggested scaffold for working on this assignment:
1. Analyse the Research Question (given below) as has been reinforced since the start of the course and in our “Summarizing and Exploring Data” lectures. What are the explanatory and/or response variables? The population of interest? What is the research goal?
2. Connect the results of your “analysis” to the dataset. Read the Materials and Methods, and review the video resources while looking at the datafiles. Identify which columns in the dataset address/relate to the variables in the Research Question. What type of variables are you working with (quantitative vs. categorical)? You’ll need to apply your understanding of types of variables from lecture.
BE CAREFUL! I chose the article and dataset because it’s interesting, freely available, and had writing that would be understandable. I did NOT choose it based on the data analysis or summaries the authors used. The authors may have made poor/incorrect choices in their analyses and/or graph types! So, use the article to understand how variables were operationalized/measured, but apply YOUR understanding of types of variables and appropriate graphs based on what you’ve learned in 2244.
3. CRITICAL STEP: Think about what you will do with the relevant columns in the data set to answer the Research Question and the graph type(s) that are appropriate for the data. There may be more than one graph and more than one approach to working with the data to answer the research question. Keep a couple points in mind:
a. The most important thing (from an assessment perspective) is that you choose a graph type that is appropriate for the variables you are working with (i.e. number of variables, type of variables, what you are trying to do—think back to lecture!).
b. You do not need to “control” for confounding or take into account covariates; you can keep your graph quite simple in that regard—i.e. just focus on it answering the Research Question!
c. You have freedom to transform/manipulate the variables using R as you see fit. If you want to summarize a variable across individuals, transform a more complex variable into a simpler one, etc., you have that freedom. You can also keep things simple. Just ALWAYS make sure your graph fundamentally answers the Research Question. And, tell us (briefly) what you did and why (in Question 1c).
d. If you find there is more than one graph type that you could use/make, choose the one that YOU feel most confident making and showing your achievement of the learning outcomes!
4. Draw a quick sketch of what you want the graph to look like (what variables go where?). Think about your axes titles, etc. Having a clear image of what you are trying to make can help you create the graph in R.
5. Use examples in Lab 4 (and possibly from lecture Topic 4) to create your graph; those examples have code to show you how to do it! Also, it might be helpful to review the “Tips for working in R to create a graph” that came with this Stage 2 BEFORE you start trying to make your graph.
6. Refer back to Lab 4 for characteristics of good figures for axes titles, figure captions, and colour/symbolism. Apply them properly. That’s scholarly communication—one of the learning outcomes being assessed.
Dataset check
You can use the following information to check that the data have imported accurately into R; I applied the str() function to the dataset.csv to get the image below.
Note that I have NOT corrected any mistakes R made at identifying variable/vector type (e.g. R may label a variable as ‘numeric’ (quantitative) when it’s actually nominal or ordinal). YOU need to think critically about whether R correctly identified the type(s) of variables.
It is up to you whether you ‘correct’ the misidentified variables (e.g. using functions like as.numeric(), etc.). It DOES make sense to correct any variables you are using for your graph. But there is no need to correct all variables that R incorrectly identifies if you are not using them.
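As a rough illustration of the check described above, the sketch below inspects a small data frame with str() and converts a numerically coded column to a factor. The data frame and its column names (student_id, school, score) are invented for this example and are NOT the columns of dataset.csv; a temporary CSV file stands in for the real file so the code runs on its own.

```r
# Hypothetical stand-in for dataset.csv; all column names are invented.
demo <- data.frame(
  student_id = 1:6,
  school     = c(1, 1, 2, 2, 3, 3),          # numeric codes that really label schools
  score      = c(12.5, 14, 9.5, 11, 15, 13)  # a genuinely quantitative measurement
)

# Round-trip through a temporary CSV to mimic read.csv("dataset.csv").
tmp <- tempfile(fileext = ".csv")
write.csv(demo, tmp, row.names = FALSE)
dat <- read.csv(tmp)

# str() lists each column with the type R guessed on import.
# Here 'school' is reported as a number type even though it is a label.
str(dat)

# Convert the coded column to a factor so R treats it as categorical.
dat$school <- factor(dat$school)

# Confirm the change.
str(dat)
```

Only the columns you plan to graph need this kind of correction; whether a given column in the real dataset is quantitative or categorical is for you to decide from the article and lecture material.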
Stage 2 Questions
Our experiment, which focused on the impact of computer-assisted learning (CAL) versus inquiry-based learning (IBL), was not randomized because of logistical constraints. Consequently, the students were assigned to the learning method group based on the school they attend. What if there are initial, baseline differences in characteristics of the students between the schools? Such differences in academic preparation, or in psychological characteristics of self-concept or working memory, could be problematic later when we consider the impact that the different learning methods might have. Consequently, for Stage 2 you will choose ONE (1) of academic preparation, self-concept, or working memory to GRAPHICALLY investigate for the Research Question:
Do schools and learning methods have baseline differences in their student characteristics?
OVERALL OBJECTIVE:
You will create a graph to visually investigate the Research Question. The entirety of this assignment will be completed in an R markdown file, knitted to PDF format (or knitted to Word format and subsequently saved as a PDF). All components of this assignment should be answered in your single R markdown file (i.e. all written responses and R code; the output from the R code will also be visible in the final knitted file). Be sure to read the section on Format of Stage 2 carefully.
Question 1.
For the Research Question (provided above) and your choice of characteristic, answer the following questions:
a. What are the explanatory and response variable(s) based on your choice and the Research Question?
b. Which variable(s) in the dataset (i.e. give the column name(s)) will you use to answer the Research Question? Make sure it’s clear which of the column name(s) link(s) to the explanatory vs. response variables from part a.
c. Will you do any transformation or clean-up of the variables (e.g. create new versions of the variables or adjust the values of them) to use in your graph for Question 2? If yes:
• Briefly describe what you are doing and why. All such transformation / data processing should be completed with R (with code showing in your assignment submission file; see the section on Format of Stage 2).
• Complete any transformation / clean-up in an R chunk as part of this question.
Effectively, Part c is an opportunity for you to explain your thought process, to help us understand what you are doing with the data and how/why it relates to answering the Research Question.
d. What type of variables (quantitative or categorical) were the original columns from the dataset that you are using, based on your application of 2244 lecture material? If you performed any clean-up / transformation of the data in Part c, what type of variables (quantitative or categorical) are the final version(s) you are using for your graph in Question 2?
Note: Question 1 is effectively ‘planning out’ the way you will use the dataset to make the graph for Question 2. When we review your graph in Question 2, it should very clearly reflect your answers to Question 1.
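For a sense of what a transformation in Part c might look like inside an R chunk, here is a minimal sketch. It assumes a made-up data frame in which each student answered three questionnaire items (item_1 to item_3) and belongs to a group; none of these names come from dataset.csv, and averaging items is only one of many possible transformations, not a recommended approach for the real data.

```r
# Hypothetical data: three questionnaire items per student (names invented).
dat <- data.frame(
  group  = c("A", "A", "B", "B"),
  item_1 = c(3, 4, 2, 5),
  item_2 = c(4, 4, 3, 5),
  item_3 = c(2, 5, 3, 4)
)

# Combine the items into one score per student (row-wise mean of the items).
dat$item_mean <- rowMeans(dat[, c("item_1", "item_2", "item_3")])

# Store the grouping column as a factor so R treats it as categorical.
dat$group <- factor(dat$group)

# Check the types of the final versions of the variables.
class(dat$item_mean)
class(dat$group)
```

Checking the class of each final variable, as in the last two lines, is a quick way to confirm that what R holds matches the variable types you report in Part d.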
Question 2.
Create a SINGLE graph using R (i.e. no multi-pane figures or faceting—i.e. all data needed is in a single graph) that will answer the Research Question with your choice of characteristic; your graph type should be one of the options taught in 2244. Be sure that your graph has proper and descriptive axes titles and an appropriate figure caption so that it is interpretable by others. Don’t assume that others (i.e. the graders) have looked at the dataset or article.
Note: refer to Lab 4’s Lesson 4.3 on “Characteristics of good figures” for information about proper axes titles, figure captions, use of colour/symbolism, and for clarification on what is a multi-pane figure, if needed.
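The sketch below shows one way a single graph with descriptive axis titles could be produced. It uses base R graphics and simulated data, both of which are assumptions made for this example: Lab 4 may use different functions, the variable names (group, score) are invented, and the boxplot is shown only to illustrate labelling, not as the appropriate graph type for your variables.

```r
# Simulated, hypothetical data: one categorical and one quantitative variable.
set.seed(1)
dat <- data.frame(
  group = factor(rep(c("Group A", "Group B"), each = 20)),
  score = c(rnorm(20, mean = 50, sd = 8), rnorm(20, mean = 53, sd = 8))
)

# One graph, one panel: the quantitative variable split by the categorical one.
# xlab and ylab give descriptive axis titles, with units where relevant.
bp <- boxplot(
  score ~ group,
  data = dat,
  xlab = "Learning group (hypothetical)",
  ylab = "Baseline score (points, hypothetical)"
)

# boxplot() invisibly returns the five-number summaries it drew, one column per group.
bp$stats
```

In an R markdown file, one way to attach a figure caption is the knitr chunk option fig.cap in the chunk header; check Lab 4 for the convention expected in this course.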
Grading for Stage 2
There may be several different approaches to complete the Stage 2 tasks; the outcome will vary based on things like (i) which characteristic you choose to work with, (ii) if you conduct any transformation / data-cleaning (Question 1c), and (iii) what type of graph you use (which depends almost entirely on the choices you make in Question 1). However, not ALL approaches will be appropriate. Think back to what we talked about for selecting a summary from lecture: what are the factors that go into your decision? Show that you understand those factors and the graph types we covered in the course.
General Overview of Grading
Stage 2 will be graded based on three (3) characteristics, each marked against a 4-level rubric (on page 10). As an example, your levels for this Stage might be:
• Proficiency for “Selected summary”, indicating that your submission demonstrated proficiency for the learning outcome, “Select appropriate summaries based on research question and variables”, and,
• Mastery for “Summaries in R”, indicating your submission demonstrated mastery of the learning outcomes, “Use statistical software to summarize, analyse, interpret, and communicate data in a reproducible manner”, and,
• Approaching Proficiency for “Figure formatting”, indicating your submission demonstrated approaching proficiency of the learning outcome, “Use conventional formats for reporting results of statistical analyses.”
There is NO number or percentage grade associated with the Stages. The ‘levels’ achieved for this Stage get combined with the levels on Stages 1 and 3; that’s why they are described as stages of one large project/assignment. The 30% of your course grade associated with the Assignment component is a grade assigned for the entire ‘project’ (i.e. not 3 assignments at 10% each; it doesn’t work that way).
Regrade Requests on graded work
We are going to try to have the grading of the Assignment completed in about two weeks (that’s our goal but this is a major marking undertaking). We try to be consistent and accurate. But, with a large course and different approaches to the questions, we can sometimes miss things or make mistakes. If you think we have missed something in grading your submission, you can request a regrade.
You will have one week after the graded work is returned to you to make any requests for regrading of your submission. ALL requests for regrades go through Gradescope. When you view your submission in Gradescope, select ‘Entire submission’ from the right side of the screen. Carefully review the feedback comments that are provided; they may provide clarification on the grading of your submission. If you think we have missed something, use the button for “Request Regrade” in the bottom right corner of the Gradescope screen.
When you make such a request, be polite and specific about what you think we missed. You don’t need to write a huge essay for a regrade request, but make it clear what your concern is. We work hard and try our best, so being polite is important. Note that we do have to grade what is written, not what was “meant”/intended. Additional information or elaboration in the regrade request itself won’t be marked; only what was originally submitted will be. So, lengthy explanations need not be provided; just a message that states your specific concern about what you think we missed should be sufficient. Keep in mind that requests simply asking for a higher ‘grade’ without a specific concern aren’t really valid; please make sure it’s clear what you think was overlooked. Note that we will have to review your entire submission during a regrade (because the grading rubric is holistic rather than question-specific). While it is generally a rare outcome, it is possible for a regrade to result in a lower rubric level for some aspect of the assignment.
