COMM5007 Coding for Business (T1 2023)
Code-based Solution / Capstone Project
Individual Assessment
Table of Contents
1. Overview
2. Key Dates
3. Milestone 1: Data Preparation and Visualisation (20 marks)
3.1 Data Preparation (7 marks)
3.2 Data Visualisation (10 marks)
3.3 Mini Report (3 marks)
3.4 Milestone 1 Submission
4. Milestone 2: Data Modelling, Analytics, and Reporting (20 marks)
4.1 Data Modelling and Machine Learning (5 marks)
4.2 Report for Milestone 2 (13 marks)
4.3 Video Pitch for Milestone 2 (2 marks)
4.4 Milestone 2 Submission
5. General Rules
1. Overview
Marketing in the Fast-Moving Consumer Goods (FMCG) sector is crucial for the
success of companies operating in this field. FMCG products are those that are sold
quickly and at relatively low cost, such as food, beverages, toiletries, and cleaning
products. Companies in this sector must be efficient in their marketing strategies in
order to remain competitive in an ever-changing market. Supermarkets such as Coles
and Woolworths need to understand the trends in the FMCG sector and consistently
provide products that are attractive to customers. Doing so lets them satisfy their
customers’ changing needs and preferences, keep up with increasing competition in
the market, and capitalise on new opportunities.
Your company, WSNU, has had a data partnership for the past few years with a large
brand that provides transactional and customer data (see purchaseBehaviour.csv and
transactions.csv). You need to present a strategic
recommendation to your client that is supported by data so that the management
team can make a decision. The client is particularly interested in customer segments
and their chip purchasing behaviour.
2. Key Dates
What?                          When?
Assignment Due – Milestone 1   Week 7, Friday 31st March 2023, 3:00 pm (Sydney Time)
                               (Submit both the written report and Python code via Moodle)
Assignment Due – Milestone 2   Week 11, Friday 28th April 2023, 3:00 pm (Sydney Time)
                               (Submit the written report, Python code and video pitch via Moodle)
3. Milestone 1: Data Preparation and Visualisation (20 marks)
3.1 Data Preparation (7 marks)
This task requires you to analyse your client's transaction data, perform data
cleansing, and create visualisations. The steps to be followed are:
1. Inspect the transaction data for null or missing values and handle them using
one of the methods covered in class (1 mark).
2. Identify and correct the erroneous column in the transaction data by
converting it to the appropriate data type (1 mark).
3. Verify the correctness of all the products by identifying and categorising items
across all tables in the transaction data (1 mark).
4. Check for outliers in the transaction data using the describe() function and
handle them appropriately while explaining their existence to your team (1
mark).
5. Prepare the data for plotting: This may involve converting columns into the
desired format, creating new columns, and transforming the data into a
suitable format for plotting. Your manager wants:
a. A new column called "Package_SIZE", for example 380g (1 mark).
b. A "Brand_Name" column that displays the first word of the "PROD_NAME"
column. For example, "Smiths Chip Thinly S/Cream&Onion 175g" will be
displayed as "Smiths" (1 mark).
c. Similar brand names, such as "RED" and "RRD", which are both Red Rock
Deli chips, combined into one brand name (1 mark).
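The three columns requested in step 5 could be derived along the following lines. This is only a sketch under stated assumptions: the sample rows are made up, the package size is assumed to appear as digits followed by "g" in "PROD_NAME", and keeping "RRD" as the combined brand name is an arbitrary choice rather than a requirement.

```python
import pandas as pd

# Hypothetical sample rows; the real values come from transactions.csv.
transactions = pd.DataFrame({
    "PROD_NAME": [
        "Smiths Chip Thinly S/Cream&Onion 175g",
        "RRD Sweet Chilli 150g",
        "RED Rock Deli Sea Salt 165g",
        "Kettle Original 380g",
    ]
})

# Package_SIZE: digits followed by "g" or "G", stored with the unit (e.g. "380g").
# Assumption: every product name contains its size in this form.
size_digits = transactions["PROD_NAME"].str.extract(r"(\d+)\s*[gG]", expand=False)
transactions["Package_SIZE"] = size_digits + "g"

# Brand_Name: the first whitespace-separated word of PROD_NAME.
transactions["Brand_Name"] = transactions["PROD_NAME"].str.split().str[0]

# Combine spellings of the same brand. Only the RED/RRD pair is named in the
# brief; which spelling to keep is an assumption made for this sketch.
brand_aliases = {"RED": "RRD"}
transactions["Brand_Name"] = transactions["Brand_Name"].replace(brand_aliases)

print(transactions[["PROD_NAME", "Package_SIZE", "Brand_Name"]])
```

Other brand spellings that you find in the data can be added to the hypothetical brand_aliases dictionary in the same way.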
Remark: Each sub-task in Section 3.1 is worth 1 mark. A sub-task that works as
expected earns 1 mark; otherwise it earns 0.
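Steps 1, 2 and 4 above can be sketched as follows. The column names DATE, PROD_QTY and TOT_SALES, the sample values, the idea that the erroneous column is a date stored as text, and the 1.5 x IQR rule for flagging outliers are all assumptions made for illustration; use whichever handling method from class suits the real data.

```python
import pandas as pd

# Hypothetical sample; column names and values are assumed, not taken from the real file.
transactions = pd.DataFrame({
    "DATE": ["2018-12-01", "2018-12-01", "2018-12-02", None,
             "2018-12-03", "2018-12-03", "2018-12-04", "2018-12-05"],
    "PROD_QTY": [2, 1, 2, 2, 200, 2, 1, 1],
    "TOT_SALES": [6.0, 3.0, 6.0, 6.0, 650.0, 6.0, 3.0, None],
})

# Step 1: count missing values per column, then drop incomplete rows
# (dropping is one option; imputation is another).
missing_counts = transactions.isna().sum()
cleaned = transactions.dropna().copy()

# Step 2: convert the text dates to a proper datetime column.
cleaned["DATE"] = pd.to_datetime(cleaned["DATE"])

# Step 4: inspect the summary statistics, then flag quantities far above the
# upper quartile using the 1.5 * IQR rule (an assumed threshold).
summary = cleaned["PROD_QTY"].describe()
iqr = summary["75%"] - summary["25%"]
upper_limit = summary["75%"] + 1.5 * iqr
outliers = cleaned[cleaned["PROD_QTY"] > upper_limit]

print(missing_counts)
print(summary)
print(outliers)
```

Whether a flagged row is removed or kept depends on the explanation you give your team, for example a bulk purchase by a commercial customer.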
3.2 Data Visualisation (10 marks)
After preparing the data, you can use any plotting functions of your choice, such as
scatter(), hist(), etc., to visualise the data. You only need to create FIVE different plots.
You will be awarded marks based on the quality of your analysis and plots (you may
use the same plot type if desired). Consider plots that address the following:
1. The relationship between the total sales column and the date column to
understand the sales trend over time.
2. Daily transactions during December, as it is the most important month of the
year.
3. Transactions by brand name, as obtained from Step 3.1.5.
4. Transactions by package size, as obtained from Step 3.1.5.
5. A pie chart of the premium customer column in the behaviour data frame.
6. The distribution of budget, mainstream, and premium customers in relation to
their life stage.
7. Any other interesting relationships that you may discover, for example:
a. Which are the top brands?
b. Which are the most popular products?
c. Etc.
Remark: Each visualisation in Section 3.2 is worth 2 marks. The breakdown of
the 2 marks is as follows:
a. Clarity of the plot: 1 mark.
b. Title: 0.5 marks.
c. Other elements such as axis labels, legends, etc., which depend on the
type of plot: 0.5 marks.
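A plot that covers the marked elements (title, axis labels and legend) might look like the sketch below, which addresses the sales trend over time. The column names DATE and TOT_SALES and the sample values are assumptions; matplotlib is used here, but any plotting library is acceptable.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; this line is not needed inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample; column names are assumed.
transactions = pd.DataFrame({
    "DATE": pd.to_datetime(["2018-12-01", "2018-12-01", "2018-12-02",
                            "2018-12-03", "2018-12-03"]),
    "TOT_SALES": [6.0, 3.0, 9.0, 4.5, 7.5],
})

# Aggregate to one value per day before plotting the trend.
daily_sales = transactions.groupby("DATE")["TOT_SALES"].sum()

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(daily_sales.index, daily_sales.values, marker="o", label="Total sales")
ax.set_title("Daily total sales (hypothetical sample)")
ax.set_xlabel("Date")
ax.set_ylabel("Total sales")
ax.legend()
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
```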
3.3 Mini Report (3 marks)
You also need to write a mini report to explain and highlight the things you have
done in Milestone 1. The word limit for the mini report is 700 words, excluding the
UNSW coversheet, table of contents, references (using the UNSW Harvard Referencing
standard), and Python code.
The following sections are suggested in the report:
1. Introduction (Data quality, Data cleansing, etc.) (approx. 200 words).
2. Data plots, key highlights and your observations (approx. 450 words).
3. Issue(s)/limitation(s) in the dataset (approx. 100 words).
3.4 Milestone 1 Submission
Please submit the following two files through Turnitin on Moodle.
1. Jupyter Notebook: The Jupyter Notebook, named zID_Milestone1.ipynb
(e.g., z1234567_Milestone1.ipynb), contains all your Python code for Sections
3.1 and 3.2. Please make sure that all Python code runs without
errors/bugs.
2. Mini Report: The mini report should be named zID_Milestone1.pdf (e.g.,
z1234567_Milestone1.pdf). Submit your report with a signed coversheet
(typed signatures are allowed because of COVID) of all group members.
Failure to include the UNSW coversheet with signatures will lead to a 5%
penalty on the awarded marks, and no marks will be released until the
coversheet is received.
4. Milestone 2: Data Modelling, Analytics, and Reporting (20 marks)
4.1 Data Modelling and Machine Learning (5 marks)
Now that the data is ready for analysis, you want to explore the relations between
some of the driving factors using machine learning models. Once again, this is an
open-ended part, and you can explore different themes. But as a general reference,
you need to provide the following information.
1. Merge the transaction dataframe and the purchase behaviour dataframe on
the loyalty card number.
2. Split the data into a training set and a testing set to build your model.
3. Use linear regression or logistic regression to assess and predict values (we
have provided some advanced examples in the Week 8 Ed lesson, and you are
free to use them).
4. Use a confusion matrix, statsmodels.api and/or a ROC curve to assess your
model.
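The workflow above (merge, split, fit, assess) can be sketched end to end with scikit-learn. Everything in this example is assumed for illustration: the data is randomly generated, the column names LYLTY_CARD_NBR, PROD_QTY, TOT_SALES and PREMIUM_CUSTOMER are guesses at the file layout, and predicting whether a customer is Premium from their spend is just one possible target.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, confusion_matrix, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for purchaseBehaviour.csv and transactions.csv.
rng = np.random.default_rng(0)
n = 200
card_numbers = np.arange(1000, 1000 + n)
segments = rng.choice(["Budget", "Mainstream", "Premium"], size=n)
behaviour = pd.DataFrame({"LYLTY_CARD_NBR": card_numbers,
                          "PREMIUM_CUSTOMER": segments})
transactions = pd.DataFrame({
    "LYLTY_CARD_NBR": card_numbers,
    "PROD_QTY": rng.integers(1, 4, size=n),
    # Premium customers are made to spend more so the model has a signal to find.
    "TOT_SALES": np.where(segments == "Premium", 9.0, 6.0) + rng.normal(0, 1.5, size=n),
})

# 1. Merge the two dataframes on the loyalty card number.
merged = transactions.merge(behaviour, on="LYLTY_CARD_NBR", how="inner")

# 2. Split into training and testing sets.
X = merged[["TOT_SALES", "PROD_QTY"]]
y = (merged["PREMIUM_CUSTOMER"] == "Premium").astype(int)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# 3. Fit a logistic regression on the training set only.
model = LogisticRegression()
model.fit(X_train, y_train)

# 4. Assess on the held-out test set with a confusion matrix and a ROC curve.
cm = confusion_matrix(y_test, model.predict(X_test))
fpr, tpr, _ = roc_curve(y_test, model.predict_proba(X_test)[:, 1])
roc_auc = auc(fpr, tpr)

print(cm)
print(f"Area under the ROC curve: {roc_auc:.2f}")
```

Rows of the confusion matrix are the true classes and columns are the predicted classes, so the off-diagonal cells count the misclassified customers.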
We might want to target the customer segments that contribute the most to sales, to
retain them or further increase sales, for example Mainstream young
singles/couples. You can develop simple research questions such as:
a. Is life stage a contributing factor to the sales? What kind of relationship do
they have?
b. Is price a contributing factor to the sales of a brand? What kind of relationship
do they have?
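For the price question, one simple approach is a linear regression of units sold on unit price. The sketch below assumes the transactions for one brand have already been aggregated into one row per price point; the figures and the column names UNIT_PRICE and UNITS_SOLD are invented for illustration.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical aggregated data for a single brand.
price_points = pd.DataFrame({
    "UNIT_PRICE": [2.0, 2.5, 3.0, 3.5, 4.0],
    "UNITS_SOLD": [102, 88, 81, 69, 60],
})

model = LinearRegression()
model.fit(price_points[["UNIT_PRICE"]], price_points["UNITS_SOLD"])

slope = model.coef_[0]  # change in units sold per one-unit price increase
intercept = model.intercept_
r_squared = model.score(price_points[["UNIT_PRICE"]], price_points["UNITS_SOLD"])

print(f"slope={slope:.1f}, intercept={intercept:.1f}, R^2={r_squared:.3f}")
```

A negative slope with a high R^2 would suggest that higher prices go with lower sales for that brand in the sample, though it does not by itself establish causation. Fitting the same model with statsmodels.api would additionally report p-values for the coefficient.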
Some more challenging questions are:
a. The customer's total spend over the period and total spend for each
transaction, to understand what proportion of their grocery spend is on chips.
b. The proportion of customers in each customer segment overall, to compare
against the mix of customers who purchase chips.
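The segment comparison and the per-customer spend can be computed with a few pandas operations. This is a sketch on made-up data, assuming the column names LYLTY_CARD_NBR, PREMIUM_CUSTOMER and TOT_SALES and assuming that every row in the hypothetical chip_transactions frame is a chip purchase.

```python
import pandas as pd

# Hypothetical data: eight customers, five of whom bought chips.
behaviour = pd.DataFrame({
    "LYLTY_CARD_NBR": [1, 2, 3, 4, 5, 6, 7, 8],
    "PREMIUM_CUSTOMER": ["Budget", "Budget", "Mainstream", "Mainstream",
                         "Mainstream", "Mainstream", "Premium", "Premium"],
})
chip_transactions = pd.DataFrame({
    "LYLTY_CARD_NBR": [1, 3, 3, 4, 5, 7],
    "TOT_SALES": [6.0, 3.0, 4.5, 6.0, 3.0, 9.0],
})

# Total chip spend per customer over the period.
spend_per_customer = chip_transactions.groupby("LYLTY_CARD_NBR")["TOT_SALES"].sum()

# Share of each segment among all customers.
overall_mix = behaviour["PREMIUM_CUSTOMER"].value_counts(normalize=True)

# Share of each segment among customers with at least one chip purchase.
is_chip_buyer = behaviour["LYLTY_CARD_NBR"].isin(chip_transactions["LYLTY_CARD_NBR"])
chip_mix = behaviour.loc[is_chip_buyer, "PREMIUM_CUSTOMER"].value_counts(normalize=True)

comparison = pd.DataFrame({"overall": overall_mix, "chip_buyers": chip_mix}).fillna(0.0)
comparison["difference"] = comparison["chip_buyers"] - comparison["overall"]

print(spend_per_customer)
print(comparison)
```

A positive difference marks a segment that is over-represented among chip buyers relative to the overall customer base.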
4.2 Report for Milestone 2 (13 marks)
You also need to write a report to explain and highlight the things you have done in
Milestone 2. The word limit for the report is 1500 words, excluding the UNSW
coversheet, table of contents, references (using the UNSW Harvard Referencing
standard), and Python code.
The following sections are suggested in the report:
1. Introduction (Background of the project, Motivation of the project, etc.) (approx.
200 words).
2. Problem definition and analysis (Research question, Driving factors, Data
modelling, Assumptions about the data modelling, Data analysis, etc.) (approx. 800
words).
3. Conclusions and Discussion (Recommendations, Management insights,
Limitations of your data modelling, Limitations of the data set, etc.) (approx. 350
words).
4.3 Video Pitch for Milestone 2 (2 marks)
You also need to make a video pitch (max. two minutes, in .MP4 format) to briefly
introduce your research question, data modelling and your solution(s). Your
presentation may contain at most 3 slides, excluding the cover page/opening slide
that shows your name and the title of your project, and the reference list slide.
4.4 Milestone 2 Submission
Please submit the following three files through Turnitin on Moodle.
1. Jupyter Notebook: The Jupyter Notebook, named zID_Milestone2.ipynb
(e.g., z1234567_Milestone2.ipynb), contains all your Python code for Section
4.1. Please make sure that all Python code runs without errors/bugs.
2. Report: The report should be named zID_Milestone2.pdf (e.g.,
z1234567_Milestone2.pdf). Submit your report with a signed coversheet
(typed signatures are allowed because of COVID) of all group members.
Failure to include the UNSW coversheet with signatures will lead to a 5%
penalty on the awarded marks, and no marks will be released until the
coversheet is received.
3. Video Pitch: The video should be named zID_Milestone2.mp4 (e.g.,
z1234567_Milestone2.mp4). Remark: If you submit the video in another format,
such as .mov, you will lose all 2 marks.
5. General Rules
Proper Academic Conduct
All assignments need to follow UNSW’s guidelines regarding proper academic
conduct. The submission of materials that are non-original or have been submitted
elsewhere will be considered plagiarism. Plagiarism is unacceptable. All instances
of plagiarism or other academic misconduct will be pursued. Plagiarism may lead
to you failing this course and may have negative consequences for your
studies at UNSW. The general UNSW guideline on academic conduct is available
online.
Assignment Submission
Assignments are to be submitted via Moodle on, or preferably before, the due date.
Late submissions are undesirable: they disrupt the course timeline, are a sign of
poor time management, and will lead to reduced marks. The late submission
of assignments carries a penalty of 5% of the awarded marks for that assignment per
day of lateness (including weekends and holidays). For example, a mark of 70
would be reduced by 3.5 marks per day of lateness.
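The penalty arithmetic can be checked with a short calculation. This sketch reads the rule as 5% of the originally awarded mark per day (matching the 3.5-marks-per-day example); the function name and the floor at zero are assumptions, since the rule does not state what happens once the penalty exceeds the mark.

```python
def mark_after_late_penalty(awarded_mark: float, days_late: int,
                            daily_rate: float = 0.05) -> float:
    """Return the mark after deducting daily_rate of the awarded mark per late day."""
    if days_late < 0:
        raise ValueError("days_late must not be negative")
    penalty = awarded_mark * daily_rate * days_late
    # Assumption: the final mark cannot drop below zero.
    return max(awarded_mark - penalty, 0.0)


for days in range(4):
    print(days, round(mark_after_late_penalty(70, days), 2))
```

Under this reading, a mark of 70 submitted two days late would be recorded as 63.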
An extension of time to complete an assignment may be granted by submitting a
Special Consideration in the case of illness or misadventure. Even if an extension is
granted, parts of the marks that are dependent on a timely submission and timely
progression of the course cannot be achieved at all. The general UNSW guidelines
for special considerations are available online.
