AcF633 - Python Programming for Data Analysis
Manh Pham
Group Project
21st February 2024 noon/12pm to 6th March 2024 noon/12pm (UK time)
This assignment contains one question worth 100 marks and constitutes 35% of the
total marks for this course.
You are required to submit to Moodle a SINGLE .zip folder containing a single
Jupyter Notebook .ipynb file (preferred) and/or Python script .py files and supporting .csv files (e.g. input data files, if any), together with a signed group coversheet. The name of this folder must be your group number (e.g. Group1.zip,
where Group 1 is your group).
In your main script, either Jupyter Notebook .ipynb file or Python .py file, you do
not have to retype the question for each task. However, you must clearly label
which task (e.g. 1.1, 1.2, etc) your subsequent code is related to, either by using a
markdown cell (for .ipynb files) or by using comments (e.g. #1.1 or '''1.1'''
for .py files). Provide only ONE answer to each task. If you have more than one
method to answer a task, choose one that you think is best and most efficient. If
multiple answers are provided for a task, only the first answer will be marked.
Your submission .zip folder MUST be submitted electronically via Moodle by the
6th March 2024 noon/12pm (UK time). Email submissions will NOT be considered. If you have any issues with uploading and submitting your group work to
Moodle, please email Carole Holroyd at c.holroyd@lancaster.ac.uk BEFORE the
deadline for assistance with your submission.
Only ONE of the group members is required to submit the work for your group.
The following penalties will be applied to all coursework that is submitted after the
specified submission date:
Up to 3 days late - deduction of 10 marks
Beyond 3 days late - no marks awarded
Good Luck!
Question 1:
The Dow Jones Industrial Average (DJIA) index is a price-weighted index of 30
blue-chip stocks listed on US stock exchanges. The csv data file ‘DowJonesFeb2022.csv’ lists the constituents of the DJIA Index as of 9 February 2022 with the
following information:
Company: Name of the company
Ticker: Company’s stock symbol or ticker
Exchange: Exchange where the company’s stock is listed
Sector: Sector to which the company belongs
Date added: Date when the company was added to the index
Weighting: Weighting (in percentages) of the company in the index.
Import the data file to an object called “Index” in Python and perform the following
tasks.
Task 1: Descriptive Analysis of DJIA index (Σ = 20 marks)
1.1: How many unique sectors are there in the DJIA index? Print the following
statement: ‘There are ... unique sectors in the DJIA index, namely ...’, where
the first ‘...’ is the number of unique sectors, and the second ‘...’ contains the
names of the sectors alphabetically ordered and separated by commas. (3 marks)
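A minimal sketch of one way to approach Task 1.1 follows. The small data frame is a made-up stand-in for ‘DowJonesFeb2022.csv’; only the column name `Sector` and the wording of the statement come from the brief, and the tickers and rows are hypothetical.

```python
import pandas as pd

# Toy stand-in for DowJonesFeb2022.csv (rows are illustrative only).
Index = pd.DataFrame({
    "Ticker": ["AAA", "BBB", "CCC", "DDD"],
    "Sector": ["Health Care", "Financials", "Health Care", "Energy"],
})

# Unique sector names, sorted alphabetically.
sectors = sorted(Index["Sector"].unique())

statement = (
    f"There are {len(sectors)} unique sectors in the DJIA index, "
    f"namely {', '.join(sectors)}."
)
print(statement)
```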
1.2: Write code to create a dictionary with keys being the unique sectors in the
DJIA index sorted in alphabetical order, and values being tuples of two
elements: the first being the number of tickers in each sector, and the second
being the list of alphabetically ordered tickers in each sector.
Hint: An example of a key-value pair of the required dictionary is ‘Materials’:
(1,[‘DOW’]). (3 marks)
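The dictionary described in Task 1.2 could be built with a group-by, as sketched below. The data frame is again a hypothetical stand-in (only the ‘Materials’/‘DOW’ pair is taken from the hint), and the sketch relies on `groupby` returning groups in sorted key order, which is its default behaviour.

```python
import pandas as pd

# Toy stand-in for the Index object (tickers other than DOW are made up).
Index = pd.DataFrame({
    "Ticker": ["BBB", "AAA", "CCC", "DOW"],
    "Sector": ["Financials", "Financials", "Energy", "Materials"],
})

# groupby yields sectors in alphabetical order; dicts preserve insertion order.
sector_dict = {
    sector: (len(group), sorted(group["Ticker"]))
    for sector, group in Index.groupby("Sector")
}
print(sector_dict)
```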
1.3: Write code to find the company having the largest index weight and one
with the smallest weight. Print the following statements:
Company ... (ticker ..., sector ..., exchange ...) has the largest index weight of
...%.
Company ... (ticker ..., sector ..., exchange ...) has the smallest index weight
of ...%.
The range of the weights is ...%. (4 marks)
1.4: Write code to find the company having the longest history in the index and
the one with the shortest history. Print the following statements:
Company ... (ticker ..., sector ..., exchange ...) has the longest history in the
DJIA index, added to the index on ....
Company ... (ticker ..., sector ..., exchange ...) has the shortest history in the
DJIA index, added to the index on .... (4 marks)
1.5: Write code to produce the following pie chart that shows the DJIA index
weighting by sectors.
Print the following statement:
Sector ... has the largest index weight of ...%, and Sector ... has the smallest
index weight of ...%. (6 marks)
Task 2: Portfolio Allocation (Σ = 35 marks)
2.1: Using the order of your group letter in the alphabet (e.g. 1 for A, 2 for B,
etc.) as a random seed, draw a random sample of 5 stocks (i.e. tickers) from the
DJIA index excluding stock DOW.[1] Sort the stocks in alphabetical order, and
then import daily Adjusted Close (Adj Close) prices for the 5 stocks between
01/01/2009 and 31/12/2023 from Yahoo Finance. Compute the simple daily
returns for the stocks and drop days with NaN returns. (3 marks)
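The return calculation in Task 2.1 can be illustrated without downloading anything. The sketch below uses a hypothetical price table for two made-up tickers in place of the Yahoo Finance data, computes simple returns as P_t / P_{t-1} - 1, and drops any day on which at least one stock has a NaN return.

```python
import pandas as pd

# Hypothetical adjusted close prices; one price is missing on purpose.
prices = pd.DataFrame(
    {
        "AAA": [100.0, 102.0, 101.0, float("nan"), 103.0],
        "BBB": [50.0, 50.5, 51.0, 51.5, 52.0],
    },
    index=pd.date_range("2009-01-02", periods=5, freq="B"),
)

# Simple daily return: today's price over yesterday's price, minus one.
returns = prices / prices.shift(1) - 1

# Drop every day on which any stock has a NaN return
# (the first day, the day with a missing price, and the day after it).
returns = returns.dropna()
print(returns)
```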
2.2: Create a data frame to summarize key statistics (including sample size,
mean, standard deviation, minimum, quartiles, maximum, skewness, kurtosis,
Jarque-Bera statistic, Jarque-Bera pvalue and whether the data is normal) for
the daily returns of the five stocks over the above sample period. Jarque-Bera
statistic is the statistic for the Jarque-Bera normality test that has the formula
$$JB = \frac{T}{6}\left(\hat{S}^2 + \frac{(\hat{K} - 3)^2}{4}\right),$$
where $T$ is the sample size, and $\hat{S}$ and $\hat{K}$ are the sample
skewness and kurtosis of data, respectively. Under the null hypothesis that
data is normally distributed, the JB statistic follows a $\chi^2$ distribution with 2
degrees of freedom. Jarque-Bera pvalue is the pvalue of the JB statistic under
this $\chi^2$ distribution. ‘Normality’ is a Yes/No indicator variable indicating if
data is normally distributed based on Jarque-Bera test.
Your data frame should look similar to the one below, but for the five stocks
in your sample.
[1] DOW only started trading on 20/03/2019.
(4 marks)
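The Jarque-Bera formula in Task 2.2 can be checked with a few lines of NumPy. The sketch below is illustrative only: the function name `jarque_bera` and the simulated samples are assumptions, skewness and kurtosis are computed from biased (divide-by-T) central moments, and the p-value uses the fact that a chi-square variable with 2 degrees of freedom has survival function exp(-x/2). Note that pandas' `.skew()` and `.kurt()` apply bias corrections and report excess kurtosis, so they would give slightly different numbers.

```python
import numpy as np

def jarque_bera(x):
    """Return (JB statistic, p-value) for a 1-D sample."""
    x = np.asarray(x, dtype=float)
    T = x.size
    d = x - x.mean()
    m2 = np.mean(d ** 2)
    S = np.mean(d ** 3) / m2 ** 1.5   # sample skewness (biased moments)
    K = np.mean(d ** 4) / m2 ** 2     # sample kurtosis (not excess kurtosis)
    jb = T / 6.0 * (S ** 2 + (K - 3.0) ** 2 / 4.0)
    # Chi-square with 2 degrees of freedom: P(X > x) = exp(-x / 2).
    pvalue = float(np.exp(-jb / 2.0))
    return float(jb), pvalue

rng = np.random.default_rng(0)
normal_sample = rng.normal(size=5000)            # simulated, roughly normal
heavy_tailed = rng.standard_t(df=3, size=5000)   # simulated, fat tails

for name, sample in [("normal", normal_sample), ("t(3)", heavy_tailed)]:
    jb, p = jarque_bera(sample)
    # "Yes" means normality is not rejected at the 5% level.
    print(name, round(jb, 2), p, "Yes" if p > 0.05 else "No")
```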
2.3: Write code to plot a 2-by-5 subplot figure that includes:
Row 1: Time series plots for the five stocks’ returns
Row 2: The histograms, together with kernel density estimates, for the five
stocks’ returns (3 marks)
2.4: Using and/or modifying function get_efficient_frontier() from the file
Eff_Frontier_functions.py on Moodle, construct and plot the Efficient Frontier for the five stocks based on optimization using data over the above period. In your code, define an equally spaced range of expected portfolio return
targets with 2000 data points. Mark and label the locations of the five stocks
in the Efficient Frontier plot. Also mark and label the locations of the Global
Minimum Variance portfolio and the portfolio with the largest Sharpe ratio,
assuming the annualized risk-free rate is 0.01 (or 1%).[2]
(6 marks)
2.5: What are the return, volatility, Sharpe ratio and stock weights of the portfolio with the largest Sharpe ratio? Write code to answer the question and
store the result in a Pandas Series object called LSR_port capturing the above
statistics in percentages. Use the words ‘return’, ‘volatility’, ‘Sharpe ratio’,
and stock tickers (in alphabetical order) to set the index of LSR_port. (4 marks)
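For intuition, the largest-Sharpe-ratio portfolio has a closed form when short sales are allowed: its weights are proportional to the inverse covariance matrix times the excess mean returns. The sketch below uses that closed form instead of the numerical optimisation in the Moodle file, whose constraints are not shown here, so results may differ if that function forbids short positions. The tickers, mean returns and covariance matrix are made-up annualised inputs; only the 1% risk-free rate and the index labels come from the brief. Return, volatility and weights are expressed in percent, while the Sharpe ratio is left as a plain ratio, which is one possible reading of the brief.

```python
import numpy as np
import pandas as pd

tickers = ["AAA", "BBB", "CCC"]              # hypothetical tickers
mu = np.array([0.08, 0.10, 0.12])            # assumed annualised mean returns
cov = np.array([[0.040, 0.006, 0.004],
                [0.006, 0.090, 0.010],
                [0.004, 0.010, 0.160]])      # assumed annualised covariance
rf = 0.01                                    # annualised risk-free rate

# Tangency weights are proportional to inv(cov) @ (mu - rf); rescale to sum to 1.
raw = np.linalg.solve(cov, mu - rf)
w = raw / raw.sum()

port_ret = float(w @ mu)
port_vol = float(np.sqrt(w @ cov @ w))
sharpe = (port_ret - rf) / port_vol

LSR_port = pd.Series(
    [port_ret * 100, port_vol * 100, sharpe, *(w * 100)],
    index=["return", "volatility", "Sharpe ratio", *tickers],
)
print(LSR_port.round(4))
```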
2.6: Alice is interested in the five stocks in your sample. She is a mean-variance
optimizer and requires the expected return of her portfolio to be the average
of the expected returns of the five individual stocks.[3] Suppose that Alice does
not have access to a risk-free asset (i.e. she cannot lend or borrow money
at the risk-free rate) and she would like to invest all of her wealth in the five
stocks in your sample. How much, in percentages of her wealth, should Alice
invest in each of the stocks in your sample? Write code to answer the question
and store the result in a Pandas Series object called Alice_port
capturing the return, volatility, Sharpe ratio and the stock weights of Alice’s
portfolio. Set the index of Alice_port correspondingly as in Task 2.5. (4 marks)
[2] This equals the average of the risk-free rates over the sample period.
[3] Use the average return of a stock over the considered sample as a proxy for its expected return.
2.7: Paul, another mean-variance optimizer, is also interested in the five stocks
in your sample. He has an expected utility function of the form
$U(R_p) = E(R_p) - 2\sigma_p^2$, where $R_p$ and $\sigma_p^2$ are respectively the return and variance of the
portfolio p. Also assume that Paul does not have access to a risk-free asset
(i.e. he cannot lend or borrow money at the risk-free rate) and he would like
to invest all of his wealth in the five stocks in your sample. How much, in
percentages of his wealth, should Paul invest in each of the stocks in your
sample to maximize his expected utility? Write code to answer the question
and store the result in a Pandas Series object called Paul_port
capturing the return, volatility, Sharpe ratio and the stock weights of Paul’s
portfolio. Set the index of Paul_port correspondingly as in Task 2.5. (4 marks)
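Paul's problem in Task 2.7, maximising E(R_p) - 2σ_p² subject to the weights summing to one, has a closed-form solution via a Lagrange multiplier if short sales are allowed. The sketch below implements that solution; the short-sale assumption, the function name `max_utility_weights` and the numerical inputs are all illustrative and not taken from the brief. The brief also does not say whether the utility function applies to daily or annualised figures, so the inputs must simply be in whichever units are intended.

```python
import numpy as np

def max_utility_weights(mu, cov, risk_aversion=2.0):
    """Maximise mu'w - risk_aversion * w'cov w subject to sum(w) = 1.

    Short sales are allowed (an assumption, not stated in the brief).
    """
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    ones = np.ones_like(mu)
    inv_mu = np.linalg.solve(cov, mu)      # inv(cov) @ mu
    inv_one = np.linalg.solve(cov, ones)   # inv(cov) @ 1
    # Lagrange multiplier chosen so that the weights sum to one.
    lam = (ones @ inv_mu - 2.0 * risk_aversion) / (ones @ inv_one)
    return (inv_mu - lam * inv_one) / (2.0 * risk_aversion)

# Made-up inputs for three hypothetical stocks.
mu = np.array([0.08, 0.10, 0.12])
cov = np.array([[0.040, 0.006, 0.004],
                [0.006, 0.090, 0.010],
                [0.004, 0.010, 0.160]])

w_paul = max_utility_weights(mu, cov, risk_aversion=2.0)
utility = w_paul @ mu - 2.0 * w_paul @ cov @ w_paul
print(w_paul.round(4), round(float(utility), 6))
```

Keeping the coefficient as the `risk_aversion` argument means a different utility function of the same form needs only a changed argument.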
2.8: Now suppose that both Alice and Paul have access to a risk-free asset and
they can borrow and lend money at the risk-free rate. In this case, both will
choose the efficient risky portfolio with the largest Sharpe ratio in Task 2.5 as
their optimal risky portfolio and will divide their wealth between this optimal
portfolio and the risk-free asset to achieve their objectives. They could also
borrow money (i.e. have a negative weight on the risk-free asset, which is
assumed to be capped at -100%; that is, the maximum amount that they can
borrow is equal to their wealth) to invest more in the risky assets. What
will be their portfolio compositions in this case? Write code to answer the
question and store the results in Pandas Series objects called Alice_port_rf
and Paul_port_rf capturing the return, volatility, Sharpe ratio, the stock
weights and risk-free asset weight of Alice’s and Paul’s portfolios, respectively.
Set the index of Alice_port_rf and Paul_port_rf correspondingly as in Task
2.5. (7 marks)
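The split between the optimal risky portfolio and the risk-free asset in Task 2.8 reduces to choosing a single fraction y of wealth to place in the risky portfolio. A hedged sketch follows: for a required return, y = (target - rf) / (μ_T - rf); for a utility of the form E(R) - γσ², setting the derivative with respect to y to zero gives y = (μ_T - rf) / (2γσ_T²). The borrowing cap of -100% on the risk-free asset translates into y ≤ 2. The function name `complete_portfolio` and all numbers except the 1% risk-free rate and γ = 2 are assumptions for illustration.

```python
def complete_portfolio(y, mu_t, sigma_t, rf, max_risky=2.0):
    """Put a fraction y of wealth in the risky portfolio and 1 - y risk-free."""
    y = min(y, max_risky)  # risk-free weight 1 - y cannot fall below -100%
    ret = rf + y * (mu_t - rf)
    vol = abs(y) * sigma_t
    sharpe = (ret - rf) / vol if vol > 0 else float("nan")
    return {
        "return": ret,
        "volatility": vol,
        "Sharpe ratio": sharpe,
        "risky weight": y,
        "risk-free weight": 1.0 - y,
    }

mu_t, sigma_t, rf = 0.12, 0.20, 0.01  # assumed statistics of the risky portfolio
target = 0.09                          # assumed required return for Alice
gamma = 2.0                            # coefficient in U = E(R) - gamma * variance

alice_y = (target - rf) / (mu_t - rf)
paul_y = (mu_t - rf) / (2.0 * gamma * sigma_t ** 2)

Alice = complete_portfolio(alice_y, mu_t, sigma_t, rf)
Paul = complete_portfolio(paul_y, mu_t, sigma_t, rf)
print(Alice)
print(Paul)
```

Individual stock weights would then be the risky weight multiplied by each stock's weight in the risky portfolio. If the cap binds for Alice, her required return is not attainable under the borrowing limit.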
Task 3: Factor models (Σ = 25 marks)
3.1: Let P denote the portfolio formed by combining the five stocks in your
sample using equal weights. Compute the daily returns of the portfolio P
over the considered time period from 01/01/2009 to 31/12/2023. (3 marks)
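With a table of daily stock returns in hand, the equally weighted portfolio return in Task 3.1 is a weighted sum across columns. The sketch below uses a hypothetical three-stock return table and assumes the portfolio is rebalanced to equal weights every day, which the brief does not state explicitly.

```python
import pandas as pd

# Hypothetical daily simple returns for three made-up tickers.
returns = pd.DataFrame(
    {
        "AAA": [0.01, -0.02, 0.03],
        "BBB": [0.00, 0.01, -0.01],
        "CCC": [0.02, 0.01, 0.01],
    },
    index=pd.date_range("2009-01-05", periods=3, freq="B"),
)

# Equal weights, indexed by ticker so that dot() aligns them with the columns.
n = returns.shape[1]
weights = pd.Series(1.0 / n, index=returns.columns)

# Daily portfolio return (assumes daily rebalancing to equal weights).
P = returns.dot(weights).rename("P")
print(P)
```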
3.2: Using data from the Fama-French dataset, estimate a Fama-French five-factor model for portfolio P over the above period. Test if portfolio P possesses
any abnormal returns that cannot be explained by the five-factor model. (4 marks)
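The mechanics behind Task 3.2 are an OLS regression of the portfolio's excess return on the five factors, followed by a t-test on the intercept (alpha). In practice a regression library would normally be used; the NumPy sketch below only illustrates the calculation. The function name `ols_with_tstats`, the simulated factor data and the made-up loadings are assumptions, and the standard errors are the classical ones that assume homoskedastic, uncorrelated residuals.

```python
import numpy as np

def ols_with_tstats(y, X):
    """OLS of y on X with an added constant.

    Returns (coefficients, t-statistics), intercept first, using
    classical standard errors.
    """
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, beta / se

rng = np.random.default_rng(42)
factors = rng.normal(scale=0.01, size=(2000, 5))    # simulated stand-ins for the five factors
true_betas = np.array([1.0, 0.2, -0.1, 0.3, 0.05])  # made-up loadings
noise = rng.normal(scale=0.005, size=2000)
excess_ret = factors @ true_betas + noise           # simulated with zero true alpha

coef, tstats = ols_with_tstats(excess_ret, factors)
print("alpha:", coef[0], "t-stat:", tstats[0])
print("betas:", coef[1:].round(3))
```

An alpha t-statistic that is large in absolute value relative to the chosen critical value would indicate abnormal returns not explained by the factors.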
3.3: Conduct the White test for the absence of heteroskedasticity in the residuals
of the above factor model and draw your conclusion using a 5% significance
level. (3 marks)
3.4: Conduct the Breusch-Godfrey test for the absence of serial correlation up
to order 10 in the residuals of the above factor model and draw your conclusion
using a 5% significance level. (3 marks)
3.5: Based on results in the above two tasks, update the Fama-French five-factor
regression model and re-assess your conclusion on the pricing of portfolio P
according to the five-factor model in Task 3.2. (3 marks)
3.6: Compute the 3-year rolling window β estimates of the Fama-French five
factors for portfolio P over the sample period. That is, for each day, we
compute β loadings for the five factors using the past 3-year data (including
data on that day). Plot a figure similar to the following for your stock sample,
showing the rolling window β estimates of the five factors, together with 95%
confidence bands. Provide brief comments. (9 marks)

[Figure: Rolling CMA for portfolio P]
Task 4 (Σ = 20 marks): These marks will go to programs that are well structured, intuitive to use
(i.e. provide sufficient comments for me to follow and are straightforward for
me to run your code), generalisable (i.e. they can be applied to different sets of
stocks, different required rates of return for Alice or different utility functions
for Paul with minimal adjustments/changes to the code) and elegant (i.e. code
is neat and shows some degree of efficiency).