INMR77 Business Intelligence and Data Mining

Module code: INMR77

Module name: Business Intelligence and Data Mining

Work to be handed in by: 13 May 2024 @2pm

Assignment Specification:

The module is assessed 100% through this coursework assignment.

This coursework aims to assess your knowledge of business intelligence and data mining, and your ability to perform data mining tasks by applying suitable concepts, methods and techniques learned during the lectures and practical lab sessions for business intelligence.

The coursework is carried out individually.  You are required to produce a report of 20 pages of A4 (+/- 10%), including tables and diagrams but excluding references and appendices, based on the case as described in this coursework document.

An appendix can be used to include support materials to back up main body points where necessary.

You are also required to submit any supplementary materials from your work in SAS Enterprise Miner on Blackboard by the specified deadline.

1. The Case

Opportunities and Challenges of the Sharing Economy: Airbnb and Inside Airbnb

Airbnb - Holiday Lets, Homes, Experiences & Places

Airbnb (Airbnb.co.uk) is an online marketplace for arranging or offering short-term rental/lodging, i.e. temporary accommodation, primarily homestays, or tourism experiences. It was founded in August 2008 by Brian Chesky and friends, and it had 6,300 employees as of 2021.

Airbnb went public with a valuation of over $100 billion on December 10, 2020, making it one of the largest IPOs (Initial Public Offerings) of 2020. It is reported that Airbnb's market capitalisation was more than that of the three largest hotel chains (Marriott, Hilton, and Intercontinental) combined. Though some call this an overvaluation, the company lacks the traditional mortgages, employee fees, and maintenance fees which burden hotels. Airbnb hosts pay their own mortgages and clean their own apartments, leaving the company much freer of debt, thus making it far more valuable.

Airbnb service overview

Airbnb provides a platform for hosts to accommodate guests with short-term lodging and tourism-related activities. Guests can search for accommodation using filters such as location, price, and specific types of home. Before booking, users must provide personal and payment information. Some hosts also require a scan of government-issued identification before accepting a reservation. Hosts provide prices and other details for their rental or listing, e.g. number of guests included in the price, type of property, type of room, number of bathrooms, number of bedrooms, number of beds and type of bed, minimum number of nights for a reservation, and amenities.

In addition, Airbnb provides a guest review system where hosts and guests can leave reviews about their experience, and rate each other after a stay. However, the truthfulness and impartiality of reviews may be adversely affected by concerns about future stays, because prospective hosts may refuse to host a user who generally leaves negative reviews. Besides, the company's policy requires users to forego anonymity, which may also detract from users' willingness to leave negative reviews.

Criticism of Airbnb

Airbnb has attracted criticism for increasing housing/residential rental prices in the cities where it operates, for creating nuisance and security issues for those living near leased properties, and for negatively affecting the quality of life in residential areas and contributing to the housing crisis in cities in the UK, USA and Europe. The company has attracted regulatory attention from cities such as San Francisco and New York City, and from the European Union, over the past number of years. It has also faced challenges from the hotel industry and from other, similar companies.

Airbnb made a quarter (25%) of its global workforce redundant in 2020 due to the global pandemic. But the news was welcomed by some campaigners who were fighting against soaring rents in cities with large numbers of Airbnb hosts. The number of longer-term rental properties (i.e. residential as opposed to short-term/holiday lets) in central Dublin was up 71% on the comparable period last year, as landlords abandoned short-term lets through Airbnb.

Inside Airbnb – Adding Data to the Debate (insideairbnb.com)

Inside Airbnb (insideairbnb.com) is an independent, non-commercial set of tools and data that allows individuals to explore how Airbnb is really used in cities around the world. It was set up by Murray Cox and John Morris in 2016.

Airbnb claims to be part of the “sharing economy” and to be disrupting the hotel industry by offering short-term rental/lodging. However, data shows that most Airbnb listings in most cities are entire homes, many of which are rented all year round (i.e. illegal short-term rentals), disrupting housing and communities.

Most recently, New York City’s plans to crack down on illegal short-term rentals, which could remove as many as 10,000 Airbnb listings, are sparking fierce debates about housing, hotels, the tourist market and residents’ rights.

By analysing publicly available information about a city’s Airbnb listings, Inside Airbnb provides filters and key metrics so users can see how Airbnb is being used to compete with the residential housing market. With Inside Airbnb, users can ask fundamental questions about Airbnb in any neighbourhood, or across the city as a whole, such as:

•    how many listings are in my neighbourhood and where are they?

•    how many houses and apartments are being rented out frequently to tourists and not to long-term residents?

•    how much are hosts making from renting to tourists (compare that to long-term residential rentals)?

•    which hosts are likely running a business with multiple listings and where are they?

These questions (and the answers) get to the core of the debate for many cities around the world, with Airbnb claiming that their hosts only occasionally rent the homes in which they live. In addition, city or state legislation and ordinances that address residential housing, short-term or vacation rentals, and zoning usually make reference to allowed use, including:

•    how many nights a dwelling is rented per year

•    minimum nights stay

•    whether the host is present

•    how many rooms are being rented in a building

•    the number of occupants allowed in a rental

•    whether the listing is licensed

The Inside Airbnb tool or data can be used to answer some of these questions. Some understanding of how the Airbnb platform is being used will help clear up the laws as they change.
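As an illustration only (the coursework itself is carried out in SAS Enterprise Miner), the first questions in the list above can be sketched in a few lines of Python. The rows below are made up, the keys are assumed to mirror column names in the listings data, and the 90-day availability threshold is a hypothetical cut-off chosen for the example, not a rule taken from this brief. Note also that `availability_365` describes future availability, not nights actually rented.

```python
from collections import Counter

# Made-up sample rows; keys are assumed to mirror columns in the listings data.
listings = [
    {"id": 1, "host_id": 10, "neighbourhood_cleansed": "Camden",
     "room_type": "Entire home/apt", "availability_365": 300},
    {"id": 2, "host_id": 10, "neighbourhood_cleansed": "Camden",
     "room_type": "Entire home/apt", "availability_365": 350},
    {"id": 3, "host_id": 11, "neighbourhood_cleansed": "Hackney",
     "room_type": "Private room", "availability_365": 40},
    {"id": 4, "host_id": 12, "neighbourhood_cleansed": "Camden",
     "room_type": "Entire home/apt", "availability_365": 20},
]

# How many listings are in each neighbourhood?
listings_per_neighbourhood = Counter(
    row["neighbourhood_cleansed"] for row in listings
)

# Which hosts have more than one listing?
listings_per_host = Counter(row["host_id"] for row in listings)
multi_listing_hosts = sorted(
    host for host, count in listings_per_host.items() if count > 1
)

# Which entire homes are available for much of the year?
# The threshold is a hypothetical value used only for this illustration.
HIGH_AVAILABILITY_DAYS = 90
frequently_available_entire_homes = [
    row["id"]
    for row in listings
    if row["room_type"] == "Entire home/apt"
    and row["availability_365"] > HIGH_AVAILABILITY_DAYS
]

print(listings_per_neighbourhood)
print(multi_listing_hosts)
print(frequently_available_entire_homes)
```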

Additional Information:

For further information on Airbnb, please visit: https://www.airbnb.co.uk/

For further information on Inside Airbnb, please visit: http://insideairbnb.com/index.html

2.   Coursework requirements

The sharing economy has brought opportunities and challenges to homeowners, society, residents, communities and governments. One of the biggest issues with Airbnb is whether hosts are sharing the primary residence in which they live "occasionally" (i.e. genuine short-term rentals) or are renting out residential properties permanently as hotels (i.e. not genuine short-term rentals).

Airbnb could easily answer this question, but instead it is up to us to shape our communities, to meet the urgent need to house tourists, to tackle the housing shortage/crisis, and to address the nuisance, security and safety issues faced by those living near properties let through Airbnb.

In this assignment, you are required to carry out data mining tasks using data on Airbnb listings in London, UK from Inside Airbnb, and to report the findings of your data mining and analysis.

2.1 The DATA

The data is available to download from Inside Airbnb, as shown in Figure 1 below. A copy of the data, for the purpose of this assignment, is provided and available to download on Blackboard.

http://insideairbnb.com/get-the-data

Figure 1: London data compiled by Inside Airbnb as on 10 December 2023 (http://insideairbnb.com/get-the-data)

As shown in Figure 1:

1) Listings.csv.gz contains detailed listing data for London. The data was compiled on 10 December 2023. Each row of the data represents a single listing and contains information about the host of the property, the property’s characteristics, and the overall rating of the property and its features by guests. There are 91,778 listings and 60+ variables in the data set. Listings can be deleted from the Airbnb platform, so the data presented is a snapshot of the listings available at a particular point in time, as on and up to 10 December 2023.

2) Reviews.csv.gz contains the detailed review data for each listing. The data was used for a number of derived variables in the detailed listing data, e.g. number_of_reviews, number_of_reviews_ltm, first_review, last_review, and reviews_per_month.

3) Calendar.csv.gz contains detailed calendar data, i.e. the availability calendar for 365 days in the future for each listing.

4) A data dictionary can be viewed and downloaded from

https://docs.google.com/spreadsheets/d/1iWCNJcSutYqpULSQHlNyGInUvHg2BoUGoNRIGa6Szc4/edit#gid=1322284596
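For readers who want to inspect a `.csv.gz` file outside SAS Enterprise Miner, the following is a minimal Python sketch using only the standard library. It writes a tiny made-up sample file first so that it runs without the real download; the file name `listings_sample.csv.gz`, the helper `read_listings` and the four columns shown are illustrative assumptions, not the full layout of the real data.

```python
import csv
import gzip
import os
import tempfile


def read_listings(path):
    """Read a gzip-compressed CSV into a list of dicts, one per listing."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))


# Write a tiny made-up file so the sketch runs without the real download.
sample_path = os.path.join(tempfile.mkdtemp(), "listings_sample.csv.gz")
with gzip.open(sample_path, mode="wt", encoding="utf-8", newline="") as handle:
    writer = csv.writer(handle)
    writer.writerow(["id", "host_id", "room_type", "price"])
    writer.writerow(["1", "10", "Entire home/apt", "$120.00"])
    writer.writerow(["2", "11", "Private room", "$45.00"])

rows = read_listings(sample_path)
column_names = list(rows[0].keys())

print(len(rows), "rows with columns", column_names)
```

Every value arrives as text, so numeric fields need converting before analysis.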

2.2. Your tasks

You are required to use the detailed listing data (listings.xlsx) to find meaningful patterns and rules that indicate whether hosts in London are renting out residential properties as hotels/businesses or are genuinely sharing the primary residence in which they live “occasionally” (legitimate/genuine short-term rentals).

You are expected to conduct cluster analysis (unsupervised learning) to differentiate hosts/listings that are likely to be genuine short-term lets (genuine short-term rentals) from hosts/listings that are likely to be operating as a business (not genuine short-term rentals), based on information about the host of the property, the property’s characteristics, the overall rating of the property and its features by guests, etc. You are also expected to build a classification model that can differentiate hosts and/or listings that are genuine (occasional) short-term rentals from those that are not.

You are also expected to conduct a literature search in the area, and to conduct data exploration and data preparation for the model building using the methods and techniques learned during the module. You are expected to provide justifications for the approaches taken (e.g. use existing literature to back up points made), and your findings have to be backed up by your analyses/results (e.g. use of screenshots/figures).

2.3 What to deliver

You are required to produce a report of 20 pages of A4 (+/- 10%), including tables and diagrams but excluding references and appendices. An appendix can be used to include support materials to back up main body points where necessary. You are also required to submit the supplementary materials from your work in SAS Enterprise Miner on Blackboard by the specified deadline.

The total of 100 marks will be allocated to the following aspects of the report, which should also be used as a guideline to structure the report.

1.    Introduction (15%)

The introduction section should include the background, problems, and issues of Airbnb in the context of the sharing economy, and a clear statement of your data mining goal. You are expected to conduct a literature review in the area, and justify your statements using relevant literature and sources.

2.    Data Understanding (15%)

In this section, you are expected to conduct exploratory data analysis, e.g. summary statistics and data visualisation, using suitable and relevant techniques and methods, and to report your key findings, including the variables and measurements identified for your subsequent data preparation tasks.
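The kind of summary statistics meant here can be sketched as follows, in Python purely for illustration (the coursework itself uses SAS Enterprise Miner). The values are made up, and the variable names `availability_365` and `room_type` are assumed to match columns in the listings data.

```python
import statistics
from collections import Counter

# Made-up values standing in for two columns of the listings data.
availability_365 = [0, 12, 45, 90, 180, 300, 365, 365]
room_type = [
    "Entire home/apt", "Entire home/apt", "Private room", "Entire home/apt",
    "Private room", "Entire home/apt", "Shared room", "Entire home/apt",
]

# Summary statistics for an interval variable.
summary = {
    "count": len(availability_365),
    "mean": statistics.mean(availability_365),
    "median": statistics.median(availability_365),
    "stdev": statistics.stdev(availability_365),
    "min": min(availability_365),
    "max": max(availability_365),
}

# Frequency share of each level of a nominal variable.
room_type_share = {
    level: count / len(room_type)
    for level, count in Counter(room_type).items()
}

print(summary)
print(room_type_share)
```

A large gap between mean and median, as in this made-up sample, is the sort of finding that would feed into the choice of transformations in the data preparation step.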

3.    Data Preparation (15%)

In this section, you are expected to take the variables identified in the previous step and prepare them for your model building. This should include:

a)    data cleaning e.g. missing data handling

b)   data transformation e.g. creating new derived variables

c)    data reduction (e.g. correlation analysis)

You are expected to justify the approaches taken, backed up by relevant literature/sources. Make sure to include figures and tables (screenshots) to support your analyses and findings.
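Steps a) to c) can be sketched in Python as below, for illustration only; the coursework itself is done in SAS Enterprise Miner. The rows are made up, median imputation is just one possible way of handling missing values, and `price_per_bedroom` is a hypothetical derived variable invented for this example. The price format with a currency symbol and thousands separator is an assumption about how the raw data stores prices.

```python
import statistics

# Made-up raw rows; every value is text, as it would be after reading a CSV.
raw = [
    {"price": "$120.00", "bedrooms": "2", "availability_365": "300"},
    {"price": "$1,250.00", "bedrooms": "", "availability_365": "365"},
    {"price": "$45.00", "bedrooms": "1", "availability_365": "30"},
    {"price": "$80.00", "bedrooms": "1", "availability_365": "60"},
]


def parse_price(text):
    """Turn a string such as '$1,250.00' into a float."""
    return float(text.replace("$", "").replace(",", ""))


# a) Data cleaning: fill missing bedroom counts with the median of known ones.
known_bedrooms = [int(r["bedrooms"]) for r in raw if r["bedrooms"]]
median_bedrooms = statistics.median(known_bedrooms)

prepared = []
for r in raw:
    bedrooms = int(r["bedrooms"]) if r["bedrooms"] else median_bedrooms
    price = parse_price(r["price"])
    prepared.append({
        "price": price,
        "bedrooms": bedrooms,
        "availability_365": int(r["availability_365"]),
        # b) Data transformation: a hypothetical derived variable.
        "price_per_bedroom": price / max(bedrooms, 1),
    })


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# c) Data reduction: highly correlated pairs are candidates for dropping one.
r_price_availability = pearson(
    [p["price"] for p in prepared],
    [p["availability_365"] for p in prepared],
)

print(prepared[1])
print(round(r_price_availability, 3))
```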

4.    Cluster Analysis and Results Interpretation (15%)

In this section, you are expected to conduct cluster analysis, i.e. identifying clusters/segments of listings based on a set or combination of variables, e.g. the host’s characteristics, the listing’s/property’s characteristics and availability, and reviews from guests. This should include:

a)    a list of variables and clustering techniques used with reasoning for your cluster analysis

b)   result interpretations and comments on the characteristics of the clusters/segments obtained.

You are expected to justify the approaches taken, backed up by relevant literature/sources. Make sure to include figures and tables (screenshots) to support your findings and analysis.

Supplementary materials can be provided in the appendix.
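To make the idea of cluster analysis concrete, here is a minimal k-means sketch in plain Python. It is an illustration under stated assumptions, not the method prescribed by this brief: k-means is only one of several possible clustering techniques, the six data points are made up, the two variables are assumed column names, and the initial centroids are fixed by hand so that the run is repeatable. The coursework itself is carried out in SAS Enterprise Miner.

```python
import statistics

# Made-up listings: (availability_365, calculated_host_listings_count).
points = [(20, 1), (35, 1), (50, 2), (300, 8), (340, 12), (365, 10)]


def standardise(rows):
    """Z-score each column so that no single variable dominates the distance."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    stdevs = [statistics.pstdev(c) for c in columns]
    return [
        tuple((v - m) / s for v, m, s in zip(row, means, stdevs))
        for row in rows
    ]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(rows, initial_centroids, max_iterations=100):
    """Alternate assignment and centroid update until labels stop changing."""
    centroids = list(initial_centroids)
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(row, centroids[k]))
            for row in rows
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [row for row, label in zip(rows, labels) if label == k]
            if members:
                centroids[k] = tuple(
                    statistics.mean(col) for col in zip(*members)
                )
    return labels, centroids


scaled = standardise(points)
labels, centroids = k_means(scaled, initial_centroids=[scaled[0], scaled[-1]])

print(labels)
```

Interpreting the result means describing each cluster in terms of the original variables, for example by comparing the mean availability and the mean number of listings per host across clusters.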

5.    Classification Model Building and Model Evaluation (15%)

In this section, you are expected to build a classification model based on the results obtained from your cluster analysis above. Since this information would be most likely to be used to differentiate those listings/hosts that are likely to be “not genuine short-term rentals”, it would be more meaningful to select the segment/cluster(s) that would likely be defined as “not genuine short-term rental” as the target variable in your classification model.

You are expected to:

a)    Conduct further data preparation where necessary and justify the variables used for your model building – target/dependent variable and independent variables.

b)    Build and evaluate the model: state the classification methods used and provide your reasoning.

Make sure to include figures and tables (screenshots) to support your model building, analyses and findings. Supplementary materials can be provided in the appendix.
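The link between a cluster-derived target and a classification model can be sketched as follows, again in Python purely for illustration. The model is a one-split decision tree (a decision stump) chosen by Gini impurity, which is a deliberately simple stand-in and not a method prescribed by this brief. All rows are made up; the single input is assumed to be `availability_365`, and a target of 1 is assumed to mark the cluster treated as “not genuine short-term rental”.

```python
# Made-up rows: (availability_365, target). Target 1 marks the cluster that is
# assumed to represent "not genuine short-term rental".
train = [(20, 0), (35, 0), (50, 0), (120, 0),
         (200, 1), (300, 1), (340, 1), (365, 1)]
test = [(10, 0), (90, 0), (250, 1), (150, 1), (330, 1)]


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def fit_stump(rows):
    """Find the split that minimises weighted Gini; label leaves by majority."""
    values = sorted({x for x, _ in rows})
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def split(t):
        left = [y for x, y in rows if x <= t]
        right = [y for x, y in rows if x > t]
        return left, right

    def weighted_gini(t):
        left, right = split(t)
        return (len(left) * gini(left) + len(right) * gini(right)) / len(rows)

    threshold = min(candidates, key=weighted_gini)
    left, right = split(threshold)
    majority = lambda ys: int(2 * sum(ys) >= len(ys))
    return threshold, majority(left), majority(right)


def evaluate(rows, predict):
    """Accuracy, precision and recall for the positive class (target 1)."""
    tp = sum(1 for x, y in rows if predict(x) == 1 and y == 1)
    fp = sum(1 for x, y in rows if predict(x) == 1 and y == 0)
    fn = sum(1 for x, y in rows if predict(x) == 0 and y == 1)
    tn = sum(1 for x, y in rows if predict(x) == 0 and y == 0)
    return {
        "accuracy": (tp + tn) / len(rows),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


threshold, left_label, right_label = fit_stump(train)
predict = lambda x: left_label if x <= threshold else right_label
metrics = evaluate(test, predict)

print(threshold, metrics)
```

Evaluating on rows held out from training, as done here with `test`, is what allows precision and recall for the “not genuine” class to be reported without simply restating the fit to the training data.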

6.    Conclusion, critical reflection, and suggestions for improvement (15%)

In this section, you should summarise the outcomes of your findings in relation to the data mining goal. Reflect on your learning and discuss the limitations of your data mining process. This might include an assessment of the suitability of the data and variables, the methods and techniques used, and the assumptions made. Provide suggestions for model improvements.

In addition, there are 10 marks allocated to the structure (clarity of organisation and structure - addresses all components of the assignment brief with appropriate weighting across each component, logical structure to the overall argument that is easy to follow), and presentation (e.g. effective use of tables and diagrams, proper use of citation and referencing in an Author-Year e.g., Harvard, APA format, length/page limit).




