You will work in a group (of approximately 3 students) to produce a single report.

Rules for submitting group reports

• Your group will submit, via the Submit your take home assessment Moodle page, one PDF file containing your report and one R program file.

• Each group member must click their respective Submit buttons in order for the group’s submission to be successful and final. By ticking the submission declaration box in Moodle you are agreeing to the following declaration:

Declaration: I am aware of the UCL Statistical Science Department's regulations on plagiarism for assessed coursework. I have read the guidelines in the student handbook and understand what constitutes plagiarism. I hereby affirm that the work my group is submitting for this in-course assessment is entirely our own.

• The Turn-It-In® plagiarism detection system may be used to scan your submission for evidence of plagiarism and collusion.

• You will work together within your group and the usual plagiarism and collusion regulations do not apply to this form of interaction. However, they do apply to collusion with other groups or plagiarism of work from other groups or from other sources.

• Any plagiarism will normally result in zero marks for all students involved, and may also mean that your overall examination mark is recorded as non-complete. Guidelines as to what constitutes plagiarism may be found in Departmental Student Handbooks. The relevant excerpt from the Statistical Science handbook is also posted on Moodle.

• Late submission will incur a penalty unless there are extenuating circumstances (e.g. medical) supported by appropriate documentation. Penalties are set out in the latest editions of the Statistical Science Department student handbooks, available from the departmental web pages.

• Failure to submit this in-course assessment may mean that your overall examination mark is recorded as “non-complete”, i.e. you will not obtain a pass for the course.

• All members of a group will be awarded the same mark for the assignment.

• I may ask you as a group to come and discuss your output with me.

• You will receive, via Moodle, feedback on your work and a provisional grade; grades are provisional until confirmed by the Statistics Examiners' Meeting in June 2018.

Simon Harden, s.harden@ucl.ac.uk, 4/3/2018

Department of Statistical Science

Take home assessment: Group Task

As a group, you will describe and analyse monthly wind speed readings from two weather stations for the period 1950-2003. The data are described in detail at the end of this document. You do not need to investigate their source further; just report on the data as they are presented to you. Also, do not introduce other data, maps, etc. into your work. Your group will prepare and submit a single, short, structured report that addresses each of the tasks set out below. All the summary statistics in the report should come from an R program, about which details are also set out below, or be readable from plots in the report. Your report and program will be marked by me (Simon Harden) and you may be required to discuss them with me. You will receive group-specific feedback on your submission.

To complete this assignment successfully you need to start work very soon after the assignment is set and to plan your time carefully. It is quite possible to complete the assignment by the end of the second term.

The data for your group are available in Moodle as a CSV file. The data for each group are different and will give different results when analysed.

Tasks

1. Describe the data using techniques such as summary statistics and plots. Your description should include both univariate and multivariate analysis.

2. Carry out two-sample t-tests comparing average readings for:

a. 1950-1976 and 1977-2003, for the first of your locations

b. Summer (June, July and August) and winter (December, January and February), for the second of your locations

c. Your two locations over the whole period

For each of the three hypothesis tests you should report the results in a sentence that is comprehensible to a non-statistician. Note also the specific tasks described in the R program section.
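As an illustration of the mechanics of task 2, the following is a minimal sketch of a two-sample t-test in R. It runs on simulated values, not on any group's data: the data frame `station` and its contents are assumptions made for the example, with column names borrowed from the data description at the end of this document. The sketch uses R's default Welch test, which does not assume equal variances; whether that is the appropriate variant for your data is a judgement for the group to make.

```r
# Synthetic stand-in data: NOT the assignment data. Column names follow the
# data description (Year, Month, Ave); the values are simulated.
set.seed(1)
station <- data.frame(
  Year  = rep(1950:2003, each = 12),
  Month = rep(1:12, times = 54),
  Ave   = rnorm(54 * 12, mean = 6, sd = 1.5)
)

# (a) Compare two periods: split the readings by year.
early <- station$Ave[station$Year <= 1976]
late  <- station$Ave[station$Year >= 1977]

# t.test() performs Welch's two-sample test by default (var.equal = FALSE)
# and removes NA values from each sample before testing.
period_test <- t.test(early, late)

# (b) Compare seasons: select the relevant months with %in%.
summer <- station$Ave[station$Month %in% c(6, 7, 8)]
winter <- station$Ave[station$Month %in% c(12, 1, 2)]
season_test <- t.test(summer, winter)

# Extract the pieces needed to write a plain-English sentence.
round(period_test$estimate, 2)   # the two sample means
round(period_test$conf.int, 2)   # 95% confidence interval for the difference
period_test$p.value              # p-value of the test
```

The components `estimate`, `conf.int` and `p.value` of the returned object are usually enough to phrase a result for a non-statistician, for example by stating the two averages and whether the difference between them is larger than chance alone would plausibly produce.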

Report

The report should be consistent with the following:

• You must use the Microsoft Word template provided on the Moodle page for your report and are not allowed to change its font, font sizes or margins. [If you wish to be allowed to use alternative word processing software then you must agree the details with me before submission.] If the template has been changed, up to 4% of marks can be lost and I will reformat the document to the template standard, to which the following point will apply.

• The report must not be longer than 2 pages (2 sides) of A4 paper, including plots, with text in Arial 11pt font. I will only mark the first 2 pages of any report.

• The report must be capable of being read on its own, i.e. it should not refer to the R program but just contain data and plots from the program’s output.

• Please save your report as a PDF file from Microsoft Word (FILE, Export, Create PDF/XPS).

• It must be written in clear, comprehensible English.

• Any plots should be readable and well labelled.

• Your report should be anonymous, i.e. there should be no mention of group members’ names anywhere in your submission.

• Your report is limited to two sides of A4 paper. This doesn’t mean that you should aim to fill all the space available to you. Writing more text doesn’t necessarily get you more marks.

R program

Your R program should:

• Assume that the working directory has already been set to the location of the data file and to where any plot files will be stored, i.e. there should be no setwd() command or reference to directories.

• Import the data from the CSV file.

• Generate all the summary statistics and plots that you refer to in your report.

• Create an output file using the sink() function, containing only the statistics you use in your report. Your program may investigate other things but the output file should contain all the information you use in your report and should be created when I run your program using the source() function in R. The output should be well laid out and contain appropriate descriptions. The output file itself should not be included in your submission.

• Create a file for each plot (or set of plots) that you use in your report, and no others.

• Be well commented with both a description of the program at the start and suitable notes throughout.

• Be clearly laid out so that it is easy for me to read.

• Split the imported data into two data objects, one for each of the two weather stations.

• Replace all the “-1” readings with “NA”.

• Be anonymous, i.e. there should be no mention of group members’ names.

Your program may use non-standard packages and you should assume that they are installed on my computer.
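The requirements above can be sketched in a few lines of R. This is an illustrative skeleton under stated assumptions, not a model answer: the file names `example_wind.csv`, `example_output.txt` and `example_boxplot.pdf` are hypothetical placeholders, and the small data set is simulated and written to disk only so that the sketch runs on its own. The column names follow the data description at the end of this document.

```r
# Sketch only. File names are hypothetical placeholders, and the small CSV
# written here is synthetic so that the example is self-contained.
demo <- data.frame(
  Year        = rep(2000:2001, each = 12, times = 2),
  Month       = rep(1:12, times = 4),
  Site_No     = rep(c(3, 8), each = 24),
  No_Readings = 30,
  Ave         = c(5.1, -1, 6.3, 5.8, rep(6, 20), rep(7.5, 23), -1)
)
write.csv(demo, "example_wind.csv", row.names = FALSE)

# Import the data using a relative file name: no setwd(), no directories.
wind <- read.csv("example_wind.csv")

# Replace the missing-value code -1 with NA.
wind$Ave[wind$Ave == -1] <- NA

# Split the data into one data frame per weather station.
sites    <- sort(unique(wind$Site_No))
station1 <- wind[wind$Site_No == sites[1], ]
station2 <- wind[wind$Site_No == sites[2], ]

# Send labelled statistics to an output file with sink().
sink("example_output.txt")
cat("Mean monthly wind speed (m/s), missing months excluded\n")
cat("Station", sites[1], ":", round(mean(station1$Ave, na.rm = TRUE), 2), "\n")
cat("Station", sites[2], ":", round(mean(station2$Ave, na.rm = TRUE), 2), "\n")
sink()   # restore output to the console

# Write one plot to its own file; other devices such as png() follow the
# same open / plot / dev.off() pattern.
pdf("example_boxplot.pdf")
boxplot(list(Station1 = station1$Ave, Station2 = station2$Ave),
        ylab = "Monthly average of daily maximum wind speed (m/s)")
dev.off()
```

Calling sink() with no arguments after the output has been written matters: without it, later console output would keep going to the file. The same applies to dev.off() for plot files.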

Assessment criteria

85 marks will be allocated as follows:

• Tasks: 40 marks for completing the tasks as listed above (20 marks for question 1 and 20 for question 2). So, for instance, a thorough and perceptive description of the data will earn more marks than just a list of a few summary statistics.

• Report: 15 marks for complying with the conditions listed above. If, for instance, the meaning of part of the report is not clear, then marks will be lost.

• R program: 30 marks for meeting the R program conditions listed above. Thus, the program should, for instance, output all data and plots that are used in the report when it is run using the source() function in R.

How to approach the assignment

The way that I would approach the assignment would be to:

1. Meet as a group and plan who will do what by when

2. Start an R program which imports the data and then works through all the commands that are required to generate the data and plots, together with any extra code needed to complete the programming as set out above. This is probably best done as a group sitting around a computer together so that everyone can contribute their expertise.

3. Draft a report in the format described above which carries out the required tasks. This may suggest updates to your R program which might be useful.

4. Complete the R program, making sure, in particular, that layout and comments have been considered.

5. Finalise the report, satisfying yourselves that it is clear and readable.

6. Leave a gap of a day or two and then return to the two files that are to be submitted, ensuring that all the requirements listed above have been met.

If at all possible, I would suggest that you complete as much of the project as you can before the end of the second term.

The data

Each group’s data consists of a .csv file with 5 columns:

• Year. 1950 to 2003

• Month. 1 to 12. The first month is March 1950 and the last, January 2003

• Site_No. There are 13 sites numbered as in the following table. The first site began recording data in 1950, but the final location did not start doing so until 1961

No   Location
1    IJmuiden
2    Schiphol
3    De Bilt
4    Soesterberg
5    Leeuwarden
6    Deelen
7    Eelde
8    Vlissingen
9    Hoek van Holland
10   Zestienhoven
11   Gilze Rijen
12   Eindhoven
13   Beek

• No_Readings. The number of days in the month for which readings exist. For the remainder of the days in a month, the readings are invalid for whatever reason.

• Ave. Wind speed (in m/s). These data have been corrected for exposure changes etc. and reduced to estimates of potential wind at a height of 10 m. The original data are hourly readings. A daily maximum speed was recorded by taking the largest value at 06.00, 12.00, 18.00 or 24.00. The value shown here is the monthly average of those daily maxima. If no valid monthly average exists, “-1” is shown.
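To make the definition of Ave concrete, the sketch below computes a monthly average of daily maxima from simulated readings. It rests on one reading of the description above, namely that each day's maximum is the largest of the four values taken at 06.00, 12.00, 18.00 and 24.00. The matrix `readings` and its column names are assumptions made for the example, and the values are random, not real measurements.

```r
# Synthetic readings for one 30-day month: four values per day, taken at
# 06.00, 12.00, 18.00 and 24.00. Simulated values, not real data.
set.seed(42)
readings <- matrix(runif(30 * 4, min = 2, max = 12), nrow = 30, ncol = 4,
                   dimnames = list(NULL, c("h06", "h12", "h18", "h24")))

# Daily maximum: the largest of the four readings on each day (row).
daily_max <- apply(readings, 1, max)

# Monthly value (the "Ave" column): the mean of those daily maxima.
monthly_ave <- mean(daily_max)

# Each daily maximum is at least as large as that day's mean, so "Ave"
# can never fall below the plain average of all the readings.
monthly_ave >= mean(readings)
```

One consequence worth keeping in mind when describing the data is that Ave summarises daily peaks rather than typical wind speeds, so it will sit above the overall mean wind speed for the month.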
