Coursework - RNAseq Data Analysis
Instructions
1. If you have any questions regarding the techniques, please refer to the practical lab guide. Everything required in this assignment has been fully covered there.
2. Do NOT copy entire results into your report. A screenshot of the key section is usually sufficient.
3. Clearly label each question (and sub-question) in your report.
4. Submit your assignment report in Word format directly to Learning Mall, together with properly annotated code, before the deadline; late submissions will be penalized according to university policy.
5. Marking: 30% for successful completion of the task; 30% for explanation and presentation of the results; 20% for coding and proper code annotation; and 20% for insightful or novel observations and comments.
For all of the following tasks:
· Briefly explain the results you obtained. You can use figures to illustrate the results, but figures must be of a readable size and of publication-quality resolution. It is essential that you explain what you find, i.e. you need to explain the results in your own words. Figures and the outputs of the analysis software alone will only earn partial marks: the most important aspect is to demonstrate your understanding of the results.
· Provide your code with proper annotations at the end of the report.
· Please use font size 10 and single spacing.
· Submit your report in Word format (not PDF).
Tasks (25 points for each task, for a total of 100 points)
The project is based on the RNA-seq data we used in the lab session.
1. Download the RNA-seq dataset used in the practical session. Explain clearly what the dataset is: for example, the species, the technology, the research subject, and why the research question is important. (Describe it as if it were your own data.) Perform a basic data quality assessment using FastQC or other software, and trim the data if necessary. Report the data quality. Align the reads to the appropriate reference genome and report the alignment rate. (Space limit for this task: up to 1 page)
2. Choose one BAM file and perform reference-based transcriptome reconstruction from the RNA-seq data (with StringTie). Report the reconstructed transcriptome, and compare it with the known transcripts in IGV. Find and present at least one example for each of the following three categories: (1) the reference-based transcriptome assembly works as expected; (2) the reconstructed transcriptome differs from the known transcriptome shown in the genome browser; (3) novel transcripts are identified by StringTie. Adjust the IGV visualization options so that the figures are clear and, ideally, visually appealing. (Space limit for this task: up to 1 page)
3. Perform differential expression analysis using featureCounts (or DESeq2) and report the results. How many genes are differentially expressed (up-regulated and down-regulated) at a significance level of 0.05? Visualize the tag density files (TDF) in IGV to check whether the results of the differential analysis are consistent with the read coverage they show. Perform a gene ontology analysis of the differentially expressed genes (using DAVID or other tools), report their functions, and explain how to interpret the results. (Space limit for this task: up to 1 page)
4. Short essay (up to 1000 words): What is RNA-seq and how can it be used to understand gene regulation? Briefly explain the basics of RNA-seq technology, including important steps in sample preparation or data analysis, information extraction, typical problems, future challenges, promising directions and any other interesting or urgent issues that capture your attention. (space limit of this task: up to 1 page)
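As an illustration of the counting step in Task 3, the following is a minimal Python sketch of how up- and down-regulated genes could be tallied from a results table at the 0.05 threshold. It rests on assumptions not stated in the brief: the table is a CSV with `gene`, `log2FoldChange`, and `padj` columns (the names DESeq2 conventionally uses), the threshold is applied to the adjusted p-value, and the sign of the log2 fold change defines the direction. The gene names and values are invented, and `count_de_genes` is a hypothetical helper, not part of any tool named above.

```python
import csv
import io

# Hypothetical excerpt of a differential expression results table.
# Values are invented for illustration; they are not from the lab dataset.
EXAMPLE_RESULTS = """gene,log2FoldChange,padj
geneA,2.10,0.001
geneB,-1.45,0.020
geneC,0.30,0.600
geneD,-3.20,0.0004
geneE,1.05,NA
"""


def count_de_genes(table_text, alpha=0.05):
    """Return (up, down) counts of genes with adjusted p-value below alpha."""
    up = 0
    down = 0
    for row in csv.DictReader(io.StringIO(table_text)):
        padj = row["padj"]
        # Genes without an adjusted p-value (reported as NA) are skipped.
        if padj in ("NA", ""):
            continue
        if float(padj) >= alpha:
            continue
        lfc = float(row["log2FoldChange"])
        if lfc > 0:
            up += 1      # higher expression in the numerator condition
        elif lfc < 0:
            down += 1    # lower expression in the numerator condition
    return up, down


if __name__ == "__main__":
    n_up, n_down = count_de_genes(EXAMPLE_RESULTS)
    print(f"up-regulated: {n_up}, down-regulated: {n_down}")
```

Which condition counts as "up" depends on how the contrast was defined in the analysis, so the direction should be checked against the TDF tracks in IGV before it is reported.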
