DTS002TC Essential of Big Data

Coursework R (Individual Assessment)

Due: 5:00 pm China time (UTC+8 Beijing) on Fri., 1st Aug. 2025

Weight: 100%

Maximum score: 100 marks (100 % individual marks)

Assessed learning outcomes:

A. Develop a global perspective on the sources and uses of big data.

B. Engage critically with the technical challenges of data acquisition and management.

C. Develop an understanding of the industrial and commercial applications of big data.

D. Demonstrate an awareness of the quantitative problems posed by the analysis of big data.

E. Demonstrate the ability to write code to obtain numerical solutions to mathematical problems.

F. Demonstrate the ability to display computational results in tabulated or graphical forms.

Late policy: 5% of the total marks available for the assessment shall be deducted from the assessment mark for each working day after the submission date, up to a maximum of five working days.

Risks:

• Please read the coursework instructions and requirements carefully. Not following these instructions and requirements may result in loss of marks.

• Plagiarism results in a mark of ZERO.

• The formal procedure for submitting coursework at XJTLU is strictly followed. A submission link on Learning Mall will be provided in due course. The submission timestamp on Learning Mall will be used to check for late submission.

Overall

This Coursework Resit (CWR) is designed to provide a comprehensive assessment of students' understanding of big data applications in addressing global environmental challenges and their ability to apply predictive modeling techniques using Python.

Task 1: Global Greenhouse Gas Emissions Discussion (60 marks)

Objective

Using big data, analyze and discuss the current state and future trends of global greenhouse gas emissions. This task will help you understand the role of big data in addressing global environmental challenges and support evidence-based discussions on climate change.

Research Paper

The reference paper for this coursework is:

Title: "Climate Change and big data analytics: Challenges and opportunities"

Authors: Thanos Papadopoulos, M.E. Balta

Source: International Journal of Information Management

Link: https://doi.org/10.1016/j.ijinfomgt.2021.102448

Report Structure (2000 words)

1.1 Introduction (10 marks)

a. Background: Introduce the current state of global greenhouse gas emissions and their impact on climate change. (5 marks)

b. Significance: Discuss the importance of big data in monitoring and managing greenhouse gas emissions. (5 marks)

1.2 Data Analysis (20 marks)

a. Data Sources: Identify and describe the sources of big data used in greenhouse gas emissions analysis (e.g., satellite data, national inventories, industry reports). (6 marks)

b. Trends Analysis: Use big data to analyze trends in greenhouse gas emissions over the past few decades. Visualize these trends using appropriate charts and graphs. (6 marks)

c. Impact Analysis: Discuss the impact of these emissions on global temperatures and climate patterns, supported by data. (8 marks)

1.3 Big Data Applications (20 marks)

a. Monitoring and Reporting: Discuss how big data is used in monitoring and reporting greenhouse gas emissions at national and global levels. (6 marks)

b. Predictive Modeling: Explain how big data can be used for predictive modeling of future emissions and their potential impacts. (6 marks)

c. Policy and Decision-Making: Discuss the role of big data in informing policy decisions and strategies to mitigate climate change. (8 marks)

1.4 Challenges and Solutions (10 marks)

a. Challenges: Identify and discuss the technical and logistical challenges in using big data for greenhouse gas emissions analysis. (5 marks)

b. Solutions: Propose innovative solutions to overcome these challenges, supported by real-world examples. (5 marks)

Task 2: Python-based Temperature Prediction (40 marks)

Objective

Using Python, predict future temperatures for selected countries based on historical data from long_format_annual_surface_temp.csv. This task will help you apply predictive modeling techniques to real-world data and validate your predictions against recent trends.

Programming Steps

2.1 Data Preparation (10 marks)

a. Data Loading: Load the long_format_annual_surface_temp.csv dataset into a DataFrame. (3 marks)

b. Data Cleaning: Handle any missing values or inconsistencies in the data. (4 marks)

c. Data Splitting: Split the data into training and testing sets (e.g., 80% training, 20% testing). (3 marks)
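
The three preparation steps above could be sketched roughly as follows. This is a minimal illustration, not a required solution: the column names `Country`, `Year`, and `Temperature` are assumptions about the CSV header, the inline sample is made-up data standing in for long_format_annual_surface_temp.csv, and the split is done by year (oldest 80% for training) rather than at random, which is one reasonable choice for time-ordered data.

```python
import io

import pandas as pd

# Hypothetical column names; adjust them to the header of the real CSV.
COUNTRY, YEAR, TEMP = "Country", "Year", "Temperature"


def load_and_clean(source):
    """Load the long-format table and drop rows that cannot be used."""
    df = pd.read_csv(source)
    # Coerce non-numeric entries to NaN so they can be dropped together.
    df[YEAR] = pd.to_numeric(df[YEAR], errors="coerce")
    df[TEMP] = pd.to_numeric(df[TEMP], errors="coerce")
    df = df.dropna(subset=[COUNTRY, YEAR, TEMP])
    # Keep one record per country and year.
    df = df.drop_duplicates(subset=[COUNTRY, YEAR])
    df[YEAR] = df[YEAR].astype(int)
    return df.sort_values([COUNTRY, YEAR]).reset_index(drop=True)


def chronological_split(df, train_fraction=0.8):
    """Split by year so the test set holds the most recent years."""
    years = sorted(df[YEAR].unique())
    n_train_years = max(1, int(len(years) * train_fraction))
    cutoff = years[n_train_years - 1]
    train = df[df[YEAR] <= cutoff]
    test = df[df[YEAR] > cutoff]
    return train, test


# Made-up sample standing in for long_format_annual_surface_temp.csv.
sample_csv = io.StringIO(
    "Country,Year,Temperature\n"
    "A,2000,0.5\n"
    "A,2001,0.6\n"
    "A,2002,\n"        # missing value, dropped during cleaning
    "A,2003,0.8\n"
    "A,2004,0.9\n"
    "A,2005,1.0\n"
    "A,2005,1.0\n"     # duplicate row, dropped during cleaning
)
clean = load_and_clean(sample_csv)
train, test = chronological_split(clean)
```

With the real dataset, `load_and_clean("long_format_annual_surface_temp.csv")` would replace the inline sample. Dropping incomplete rows is only one cleaning strategy; interpolation within each country may suit the data better.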

2.2 Model Building and Prediction (15 marks)

a. Model Selection: Initialize a suitable predictive model (e.g., linear regression, polynomial regression). (5 marks)

b. Model Training: Train the model using the training data. (5 marks)

c. Prediction: Predict future temperatures for the years 2023 to 2025 for at least five selected countries (e.g., China, USA, India, Brazil, Germany). (5 marks)
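
One possible shape for the model-building steps above is sketched below. It fits a separate straight-line trend per country with NumPy's least-squares `polyfit` (degree 1) as a stand-in for linear regression; a scikit-learn `LinearRegression` or a higher-degree polynomial would be equally valid choices. The column names and the two-country training table are made-up assumptions for illustration only.

```python
import numpy as np
import pandas as pd


def fit_linear_trend(years, temps):
    """Least-squares fit of temp = slope * year + intercept."""
    slope, intercept = np.polyfit(years, temps, deg=1)
    return slope, intercept


def predict(model, years):
    """Evaluate a fitted (slope, intercept) pair at the given years."""
    slope, intercept = model
    return slope * np.asarray(years, dtype=float) + intercept


# Made-up training data; column names are assumptions about the real file.
train = pd.DataFrame({
    "Country": ["A"] * 5 + ["B"] * 5,
    "Year": [2000, 2001, 2002, 2003, 2004] * 2,
    "Temperature": [0.50, 0.52, 0.54, 0.56, 0.58,
                    1.0, 1.1, 1.2, 1.3, 1.4],
})

future_years = [2023, 2024, 2025]
rows = []
for country, group in train.groupby("Country"):
    model = fit_linear_trend(group["Year"], group["Temperature"])
    for year, temp in zip(future_years, predict(model, future_years)):
        rows.append({"Country": country, "Year": year, "Predicted": temp})

# One row per country and future year.
forecast = pd.DataFrame(rows)
```

Fitting one model per country keeps each trend independent; a single pooled model with country as a feature is another option worth comparing.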

2.3 Model Evaluation (10 marks)

a. Performance Metrics: Evaluate the model's performance using appropriate metrics (e.g., Mean Squared Error, R-squared). (5 marks)

b. Visualization: Visualize the actual vs. predicted temperatures for the testing period. (5 marks)
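
The evaluation steps above might look like the following sketch. Mean Squared Error and R-squared are computed directly from their definitions with NumPy, and Matplotlib draws the actual and predicted series on one axis. The test-period numbers are made up purely to show the calls, and the axis label leaves the unit open because it depends on the dataset.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove to show plots on screen
import matplotlib.pyplot as plt
import numpy as np


def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((actual - predicted) ** 2))


def r_squared(actual, predicted):
    """1 - SS_res / SS_tot, the coefficient of determination."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def plot_actual_vs_predicted(years, actual, predicted, title):
    """Draw both series against year and return the figure."""
    fig, ax = plt.subplots()
    ax.plot(years, actual, marker="o", label="Actual")
    ax.plot(years, predicted, marker="x", linestyle="--", label="Predicted")
    ax.set_xlabel("Year")
    ax.set_ylabel("Temperature (unit as in the dataset)")
    ax.set_title(title)
    ax.legend()
    return fig


# Made-up test-period values for a single hypothetical country.
test_years = [2019, 2020, 2021, 2022]
actual = [1.0, 1.2, 1.1, 1.3]
predicted = [1.1, 1.1, 1.2, 1.2]

mse = mean_squared_error(actual, predicted)
r2 = r_squared(actual, predicted)
fig = plot_actual_vs_predicted(test_years, actual, predicted, "Country A")
```

Calling `fig.savefig("country_a.png")` would write the chart to disk for inclusion in the report; the file name here is only an example.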

2.4 Validation Against Real Data (5 marks)

a. Real Data Comparison: Compare your predicted temperatures with actual temperature data for 2023-2024 (if available). (5 marks)

b. Error Analysis: Calculate and discuss the percentage error for each year. (5 marks)
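
A percentage-error calculation for the comparison above could be sketched as below, taking percentage error as (predicted - actual) / |actual| × 100. The predicted and actual values are made up for illustration, and the column names are assumptions.

```python
import pandas as pd


def percentage_error(predicted, actual):
    """Signed percentage error relative to the magnitude of the actual value."""
    return (predicted - actual) / actual.abs() * 100.0


# Made-up values for one hypothetical country.
comparison = pd.DataFrame({
    "Year": [2023, 2024],
    "Predicted": [1.8, 2.2],
    "Actual": [2.0, 2.0],
})
comparison["PctError"] = percentage_error(
    comparison["Predicted"], comparison["Actual"]
)
```

If the dataset records temperature change relative to a baseline, actual values close to zero make percentage error unstable, so reporting the absolute error alongside it may be worth considering in the discussion.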

Submission Format Instructions

The assignment must be typed, spell-checked, referenced, and submitted via Learning Mall Online to the correct dropbox.

Only electronic submissions are accepted - no hard copies:

• A Student_ID.pdf file containing a cover letter with your ID and all the task report content.

• A Student_ID.zip file containing all the Python scripts and resource data.

All students must download their file and check that it is viewable after submission. Document uploads may become corrupted during the uploading process (e.g., due to slow internet connections). Therefore, students themselves are responsible for submitting a functional and correct file that needs to be tested after submitting it.

General Marking Criteria

Documentation (X marks)

Outstanding (100%): Comprehensive identification of big data sources. Detailed analysis of trends with clear visualizations. Impact analysis is well-supported with data.

Appropriate (80%): Good identification of data sources. Analysis is accurate but may lack some detail. Visualizations are clear.

Needs Improvement (60%): Basic identification of data sources. Analysis is adequate but lacks detail. Visualizations are present but could be improved.

Hard to Understand (40%): Limited identification of data sources. Analysis is basic and lacks clarity. Visualizations are minimal or unclear.

No Submission or Missing Section (0): Minimal or no relevant content.

Code Implementation (X marks)

Outstanding (100%): Code is well-documented, efficient, and error-free. Data loading and cleaning are thorough.

Appropriate (80%): Code is mostly well-documented. Data loading and cleaning are accurate.

Needs Improvement (60%): Code is adequately documented. Data loading and cleaning are done but may have minor errors.

Hard to Understand (40%): Code documentation is minimal. Data loading and cleaning have significant errors.

No Submission or Missing Section (0): Minimal or no relevant content.