INMR77 Business Intelligence and Data Mining
1. The Case
Opportunities and Challenges of Sharing Economy: Airbnb and InsideAirbnb
Airbnb - Holiday Lets, Homes, Experiences & Places
Airbnb (Airbnb.co.uk) is an online marketplace for arranging or offering short-term rentals or lodging, i.e. temporary accommodation, primarily homestays, and tourism experiences. It was founded in August 2008 by Brian Chesky and friends, and it had 6,300 employees as of 2021.
Airbnb went public with a valuation of over $100 billion on December 10, 2020, making it one of the largest IPOs (Initial Public Offerings) of 2020. It was reported that Airbnb's market capitalisation exceeded that of the three largest hotel chains (Marriott, Hilton, and InterContinental) combined. Although some call this an overvaluation, the company does not carry the traditional mortgages, employee costs, and maintenance costs that burden hotels. Airbnb hosts pay their own mortgages and clean their own properties, leaving the company much freer of debt, which is argued to make it far more valuable.
Airbnb service overview
Airbnb provides a platform for hosts to accommodate guests with short-term lodging and tourism-related activities. Guests can search for accommodation using filters such as location, price, and specific types of home. Before booking, users must provide personal and payment information. Some hosts also require a scan of government-issued identification before accepting a reservation. Hosts provide prices and other details for their rental or listing, e.g. the number of guests included in the price, type of property, type of room, number of bathrooms, number of bedrooms, number of beds and type of bed, minimum number of nights for a reservation, and amenities.
In addition, Airbnb provides a guest review system where hosts and guests can leave reviews about their experience, and rate each other after a stay. However, the truthfulness and impartiality of reviews may be adversely affected by concerns of future stays because prospective hosts may refuse to host a user who generally leaves negative reviews. Besides, the company's policy requires users to forego anonymity, which may also detract from users' willingness to leave negative reviews.
Criticism of Airbnb
Airbnb has attracted criticism for increasing residential rental prices in the cities where it operates, for creating nuisance and security issues for those living near let properties, for negatively affecting the quality of life in residential areas, and for contributing to housing crises in cities across the UK, USA and Europe. The company has attracted regulatory attention from cities such as San Francisco and New York City, and from the European Union, over the past few years. It has also faced challenges from the hotel industry and from other, similar companies.
Airbnb made a quarter (25%) of its global workforce redundant in 2020 because of the global pandemic. The news was nonetheless welcomed by some campaigners fighting against soaring rents in cities with large numbers of Airbnb hosts. The number of longer-term rental properties (i.e. residential as opposed to short-term/holiday lets) in central Dublin was up 71% on the comparable period last year, as landlords abandoned short-term lets through Airbnb.
Inside Airbnb – Adding Data to the Debate (insideairbnb.com)
Inside Airbnb (insideairbnb.com) is an independent, non-commercial set of tools and data that allows individuals to explore how Airbnb is really used in cities around the world. It was set up by Murray Cox and John Morris in 2016.
Airbnb claims to be part of the “sharing economy” and to be disrupting the hotel industry by offering short-term rentals and lodging. However, the data shows that in most cities the majority of Airbnb listings are entire homes, many of which are rented all year round (i.e. illegal listings), disrupting housing and communities.
Most recently, New York City’s plans to crack down on illegal listings, which could remove as many as 10,000 Airbnb listings, are sparking fierce debate about housing, hotels, the tourist market and residents’ rights.
By analysing publicly available information about a city’s Airbnb listings, Inside Airbnb provides filters and key metrics so users can see how Airbnb is being used to compete with the residential housing market. With Inside Airbnb, users can ask fundamental questions about Airbnb in any neighbourhood, or across the city as a whole, such as:
• how many listings are in my neighbourhood and where are they?
• how many houses and apartments are being rented out frequently to tourists and not to long-term residents?
• how much are hosts making from renting to tourists (compare that to long-term residential rentals)?
• which hosts are likely running a business with multiple listings and where are they?
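As a rough illustration of how two of these questions could be put to the listings data, the sketch below counts listings per host and computes the share of entire-home listings. The rows are invented toy data, and the field names `host_id` and `room_type` are assumptions about the column names rather than something stated in this brief.

```python
from collections import Counter

# Toy rows standing in for the listings data. The values are invented and
# the field names `host_id` and `room_type` are assumed column names.
listings = [
    {"id": 1, "host_id": "A", "room_type": "Entire home/apt"},
    {"id": 2, "host_id": "A", "room_type": "Entire home/apt"},
    {"id": 3, "host_id": "B", "room_type": "Private room"},
    {"id": 4, "host_id": "C", "room_type": "Entire home/apt"},
    {"id": 5, "host_id": "A", "room_type": "Private room"},
]


def multi_listing_hosts(rows, min_listings=2):
    """Return {host_id: listing count} for hosts with at least `min_listings`."""
    counts = Counter(row["host_id"] for row in rows)
    return {host: n for host, n in counts.items() if n >= min_listings}


def entire_home_share(rows):
    """Fraction of listings whose room type is an entire home or apartment."""
    entire = sum(1 for row in rows if row["room_type"] == "Entire home/apt")
    return entire / len(rows)


print(multi_listing_hosts(listings))
print(entire_home_share(listings))
```

The `min_listings` cut-off is a hypothetical choice; what counts as "running a business" is a judgement that the analysis would need to justify.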
These questions (and the answers) get to the core of the debate for many cities around the world, with Airbnb claiming that its hosts only occasionally rent out the homes in which they live. In addition, city or state legislation and ordinances that address residential housing, short-term or vacation rentals, and zoning usually make reference to allowed use, including:
• how many nights a dwelling is rented per year
• minimum nights stay
• whether the host is present
• how many rooms are being rented in a building
• the number of occupants allowed in a rental
• whether the listing is licensed
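Several of these allowed-use criteria can be approximated from listing fields. The sketch below is one possible way to turn them into simple indicator flags. The field names (`availability_365`, `minimum_nights`, `room_type`, `license`) are assumed, both thresholds are hypothetical parameters chosen only for illustration, and each flag is a proxy: availability, for example, is not the same thing as nights actually rented.

```python
def allowed_use_flags(listing, max_nights_per_year=90, short_stay_below=30):
    """Return rough allowed-use indicators for a single listing.

    Both thresholds are hypothetical, illustrative values. Every flag is a
    proxy derived from listing fields, not a legal determination.
    """
    licence = (listing.get("license") or "").strip()
    return {
        # Available for more nights than the assumed annual cap.
        "available_beyond_cap": listing["availability_365"] > max_nights_per_year,
        # Minimum stay short enough to suggest tourist lets.
        "short_minimum_stay": listing["minimum_nights"] < short_stay_below,
        # Entire-home listings suggest the host is not present.
        "host_probably_absent": listing["room_type"] == "Entire home/apt",
        # No licence text recorded against the listing.
        "licence_missing": licence == "",
    }


# Invented example listings.
year_round_flat = {"availability_365": 300, "minimum_nights": 2,
                   "room_type": "Entire home/apt", "license": ""}
spare_room = {"availability_365": 40, "minimum_nights": 1,
              "room_type": "Private room", "license": "ABC123"}

print(allowed_use_flags(year_round_flat))
print(allowed_use_flags(spare_room))
```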
Inside Airbnb is a mission-driven project that provides data and advocacy about Airbnb's impact on residential communities. Its vision is for communities to be empowered with data and information to understand, decide and control the role of renting residential homes to tourists. The Inside Airbnb tools and data can be used to answer some of these questions, and some understanding of how the Airbnb platform is being used will help clarify the laws as they change.
2. Coursework requirements
The sharing economy has brought opportunities and challenges to homeowners, society, residents, communities and governments. One of the biggest issues with Airbnb is whether hosts are sharing the primary residence in which they live "occasionally" (i.e. genuine short-term rentals) or are renting out residential properties permanently as hotels.
Airbnb could easily answer this question, but instead it is up to us to shape our communities, to meet the urgent need to house tourists, to tackle the housing shortage/crisis, and to address the nuisance, security and safety issues faced by those living near properties let through Airbnb.
In this assignment, you are required to carry out data mining tasks using Airbnb listings data for London, UK, from InsideAirbnb, and to report the findings of your data mining/analysis.
2.1 The DATA
The data is available to download from InsideAirbnb (http://insideairbnb.com/get-the-data) as shown in Figure 1 below. A copy of the data, for the purpose of this assignment, is also available to download from Blackboard.
http://insideairbnb.com/get-the-data (Search for London)
Figure 1: London data compiled by InsideAirbnb as on 6 September 2024
As shown in Figure 1:
1) Listings.csv.gz contains detailed listing data for London. The data was compiled on 6 September 2024. Each row represents a single listing and contains information about the host of the property, the property's characteristics, and guests' overall rating of the property and its features. There are 96,182 listings and 60+ variables in the data set. Listings can be deleted from the Airbnb platform, so the data is a snapshot of the listings available at a particular point in time, as on and up to 6 September 2024.
2) Reviews.csv.gz contains the detailed reviews data for each listing. The data was used for a number of derived variables in the detailed listing data, e.g. number_of_reviews, number_of_reviews_ltm, first_review, last_review, and reviews_per_month.
3) Calendar.csv.gz contains detailed calendar data, i.e. the availability calendar for each listing for 365 days into the future. In addition:
4) A data dictionary – can be viewed and downloaded from
https://docs.google.com/spreadsheets/d/1iWCNJcSutYqpULSQHlNyGInUvHg2BoUGoNRIGa6Szc4/edit#gid=1322284596
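For orientation, a gzip-compressed CSV such as the listings file can be read with Python's standard library alone. The sketch below is only an illustration: it writes a tiny invented stand-in file to a temporary directory and reads it back, and the column names and values in it are made up.

```python
import csv
import gzip
import os
import tempfile


def read_listings(path):
    """Read a gzip-compressed CSV into a list of dicts, one per row."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))


# Build a tiny stand-in file so the example runs without the real download.
# The columns and values are invented for illustration.
sample = (
    "id,host_id,room_type,price\n"
    "1,10,Entire home/apt,$120.00\n"
    "2,11,Private room,$45.00\n"
)
sample_path = os.path.join(tempfile.mkdtemp(), "listings_sample.csv.gz")
with gzip.open(sample_path, mode="wt", encoding="utf-8", newline="") as handle:
    handle.write(sample)

rows = read_listings(sample_path)
print(len(rows), rows[0]["room_type"])
```

Every value comes back as a string, so numeric fields such as prices still need converting before analysis.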
2.2. Your tasks
You are required to use the detailed listing data (listings.xlsx) to find meaningful patterns and rules that indicate whether hosts in London are renting out residential properties as hotels/businesses or genuinely sharing the primary residence in which they live “occasionally”.
You are expected to conduct a literature search for the problem domain, and to carry out data exploration, data preparation, and model building using relevant methods and techniques introduced during the module.
2.3 What to deliver
You are required to produce a report of 20 pages of A4 OR 5,000 words (+/- 10%), including tables and diagrams but excluding references and appendices. An appendix can be used to include supporting materials that back up points in the main body where necessary. You are also required to submit the supplementary materials of your work, using SAS Enterprise Miner and/or other files/evidence, on Blackboard by the specified deadline.
The total of 100 marks will be allocated to the following aspects of the report, which should also be used as a guideline to structure the report.
1. Introduction – Background and Data Mining goal (15%)
Critically discuss, with examples, the problems and issues of Airbnb in the context of the sharing economy, and give a clear statement of the data mining goal. Use literature to support your statements.
2. Data Understanding (15%)
In this section, you are expected to conduct exploratory data analysis, e.g. summary statistics and data visualisation, using suitable techniques and methods, and to report your key findings, including the variables and measurements identified for your subsequent data preparation tasks.
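As a minimal sketch of the kind of summary statistics meant here, the example below groups invented rows by room type and reports count, missing values, mean, median, minimum and maximum price. The field names `room_type` and `price`, and the `$1,250.00` price format, are assumptions; the same summaries could equally be produced in SAS Enterprise Miner.

```python
from collections import defaultdict
from statistics import mean, median

# Invented rows. `room_type` and `price` are assumed field names.
rows = [
    {"room_type": "Entire home/apt", "price": "$150.00"},
    {"room_type": "Entire home/apt", "price": "$250.00"},
    {"room_type": "Private room", "price": "$50.00"},
    {"room_type": "Private room", "price": ""},
    {"room_type": "Private room", "price": "$70.00"},
]


def parse_price(text):
    """Convert a string such as '$1,250.00' to 1250.0; blanks become None."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    return float(cleaned) if cleaned else None


def summarise_price_by(rows, group_key):
    """Summary statistics of price for each value of `group_key`."""
    observed = defaultdict(list)
    missing = defaultdict(int)
    for row in rows:
        value = parse_price(row["price"])
        if value is None:
            missing[row[group_key]] += 1
        else:
            observed[row[group_key]].append(value)
    return {
        group: {
            "n": len(values),
            "missing": missing[group],
            "mean": mean(values),
            "median": median(values),
            "min": min(values),
            "max": max(values),
        }
        for group, values in observed.items()
    }


summary = summarise_price_by(rows, "room_type")
for group, stats in summary.items():
    print(group, stats)
```

Reporting the number of missing values next to each statistic makes it easier to carry the findings forward into data preparation.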
3. Data Preparation (15%)
In this section, you are expected to take the variables identified in the previous step and prepare them for your model building. This should include:
a) data cleaning e.g. missing data handling
b) data transformation e.g. creating new derived variables
c) data reduction (e.g. correlation analysis)
You are expected to justify the approaches taken, backed up by relevant literature/sources. Make sure to include figures and tables (screenshots) to support your analyses and findings.
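The three preparation steps above can be sketched on toy data as follows: median imputation for missing values, two derived variables, and a Pearson correlation check to spot redundant variables. The variable names, the 0.9 threshold and the choice of median imputation are all illustrative assumptions, not requirements of this brief.

```python
from itertools import combinations
from math import sqrt
from statistics import median


def impute_median(values):
    """Replace None with the median of the observed values."""
    fill = median(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def highly_correlated(columns, threshold=0.9):
    """Pairs of column names whose absolute correlation meets the threshold."""
    return [
        (a, b)
        for a, b in combinations(columns, 2)
        if abs(pearson(columns[a], columns[b])) >= threshold
    ]


# Invented values for six listings.
reviews_per_month = [0.5, None, 2.0, 4.0, None, 1.5]
availability_365 = [30, 365, 200, 340, 10, 120]
host_listings = [1, 5, 2, 8, 1, 1]

# a) data cleaning: fill missing review rates with the median.
reviews_filled = impute_median(reviews_per_month)
# b) data transformation: two derived variables.
is_multi_host = [1 if n > 1 else 0 for n in host_listings]
availability_share = [days / 365 for days in availability_365]
# c) data reduction: flag pairs that carry almost the same information.
redundant = highly_correlated({
    "availability_365": availability_365,
    "availability_share": availability_share,
    "reviews_per_month": reviews_filled,
})
print(reviews_filled, is_multi_host, redundant)
```

Here `availability_share` is a rescaling of `availability_365`, so the pair is flagged and only one of them would normally be kept for modelling.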
4. Cluster Analysis and Results Interpretation (15%)
In this section, you are expected to conduct cluster analysis, i.e. identifying clusters/segments of listings based on a combination of variables, e.g. hosts' characteristics, listing/property characteristics and availability, and reviews from guests. This should include:
a) list of variables and clustering techniques used with reasoning for your cluster analysis
b) result interpretations and comments on the characteristics of the clusters/segments obtained.
Make sure to include figures and tables (screenshots) to support your findings and analysis. Supplementary materials can be provided in the appendix.
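To make the idea of segmenting listings concrete, the sketch below runs a small k-means clustering on two standardised variables. It is a toy illustration only: the data are invented, k-means is just one of several possible clustering techniques, the two variables (availability and host listing count) are assumed choices, and the deterministic farthest-point initialisation is a simplification so the example is reproducible.

```python
from math import dist
from statistics import mean, pstdev


def standardise(rows):
    """Z-score each column so that no variable dominates the distances."""
    columns = list(zip(*rows))
    centres = [mean(col) for col in columns]
    spreads = [pstdev(col) or 1.0 for col in columns]
    return [
        tuple((v - c) / s for v, c, s in zip(row, centres, spreads))
        for row in rows
    ]


def k_means(points, k, max_iter=100):
    """Plain k-means with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(mean(col) for col in zip(*members))
    return labels, centroids


# Invented listings: (availability_365, host listing count).
features = [
    (350, 8), (330, 12), (365, 5), (340, 9),
    (20, 1), (45, 1), (10, 1), (60, 1),
]
labels, _ = k_means(standardise(features), k=2)

# Profile each cluster on the original (unstandardised) scale.
for cluster in sorted(set(labels)):
    members = [f for f, lab in zip(features, labels) if lab == cluster]
    profile = [round(mean(col), 1) for col in zip(*members)]
    print(cluster, len(members), profile)
```

Profiling clusters on the original scale, as in the final loop, is what supports the interpretation step: a cluster with high availability and many listings per host reads very differently from one with low availability and a single listing.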
5. Classification Model Building and Model Evaluation (15%)
In this section, you are expected to build a classification model based on the results obtained from your cluster analysis. Since this information would most likely be used to differentiate those listings/hosts that are likely to be renting out residential properties permanently as hotels, it would be more meaningful to select the segment/cluster(s) that would likely be defined as such (or vice versa) as the target variable in your classification model.
You should:
a) Conduct further data preparation and justify the variables used for your model building.
b) Build and evaluate the model: state the classification methods used and provide your reasoning.
Make sure to include figures and tables (screenshots) to support your model building, analyses and findings. Supplementary materials can be provided in the appendix.
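The build-then-evaluate workflow can be illustrated with a deliberately simple classifier. The sketch below fits a one-split decision tree (a decision stump) on invented training rows whose target (1 = commercial-looking cluster, 0 = otherwise) is assumed to come from a prior cluster analysis, then scores it on held-out rows with a confusion matrix. The data, the variables and the choice of a stump are illustrative assumptions; a real submission would use fuller methods from the module.

```python
def fit_stump(X, y):
    """Find the single feature/threshold split with the fewest training errors."""
    best = None
    for feature in range(len(X[0])):
        values = sorted(set(row[feature] for row in X))
        # Candidate thresholds are midpoints between neighbouring values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            for above in (0, 1):  # class predicted when value > threshold
                preds = [above if row[feature] > threshold else 1 - above
                         for row in X]
                errors = sum(p != label for p, label in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, feature, threshold, above)
    _, feature, threshold, above = best
    return {"feature": feature, "threshold": threshold, "above": above}


def predict(model, X):
    """Apply the fitted split to new rows."""
    f, t, above = model["feature"], model["threshold"], model["above"]
    return [above if row[f] > t else 1 - above for row in X]


def evaluate(y_true, y_pred):
    """Confusion-matrix counts plus accuracy, precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Invented rows: (availability_365, host listing count).
train_X = [(350, 8), (330, 12), (365, 5), (20, 1),
           (45, 1), (10, 1), (300, 3), (60, 2)]
train_y = [1, 1, 1, 0, 0, 0, 1, 0]
test_X = [(340, 9), (15, 1), (200, 1), (90, 4)]
test_y = [1, 0, 0, 1]

model = fit_stump(train_X, train_y)
metrics = evaluate(test_y, predict(model, test_X))
print(model)
print(metrics)
```

The stump fits the training rows perfectly but gets only half of the held-out rows right, which is the reason for evaluating on data that was not used for fitting.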
6. Conclusion, critical reflection, and suggestion for improvements (15%)
In this section, you should summarise your findings in relation to the data mining goal. Reflect on your learning and discuss the limitations of your data mining process; this might include an assessment of the suitability of the data and variables, the methods and techniques used, and the assumptions made. Provide suggestions for model improvements.
In addition, there are 10 marks allocated to the structure (clarity of organisation and structure - addresses all components of the assignment brief with appropriate weighting across each component, logical structure to the overall argument that is easy to follow), and presentation (e.g. effective use of tables and diagrams, proper use of citation and referencing in an Author-Year e.g., Harvard, APA format, length/page limit).