AirBNB Data Analysis
Project 1 of Udacity Data Scientist Nanodegree
An Estimated Revenue metric [1] is introduced and used to answer the following questions:
Linear Regression is used to model how the rental price of a property depends on several objective characteristics:
the number of bedrooms, bathrooms, and beds, and how many people can be accommodated.
One-hot encoded information about the property location is included as an explanatory variable.
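The modelling idea above can be sketched in a few lines. This is a minimal illustration on made-up data, not the notebook's actual code: the column names (`bedrooms`, `bathrooms`, `beds`, `accommodates`, `neighbourhood`, `price`), the toy values, and the use of scikit-learn's `LinearRegression` are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Toy stand-in for listings.csv; column names and values are assumptions.
listings = pd.DataFrame({
    "bedrooms":      [1, 2, 3, 1, 2, 4, 2, 3],
    "bathrooms":     [1.0, 1.0, 2.0, 1.0, 2.0, 3.0, 1.5, 2.0],
    "beds":          [1, 2, 3, 2, 3, 5, 2, 4],
    "accommodates":  [2, 4, 6, 3, 5, 8, 4, 7],
    "neighbourhood": ["Ballard", "Ballard", "Capitol Hill", "Capitol Hill",
                      "Fremont", "Fremont", "Capitol Hill", "Ballard"],
})

# Synthetic prices built from a known linear rule, so the fit can be checked.
offsets = {"Ballard": 0.0, "Capitol Hill": 30.0, "Fremont": 15.0}
listings["price"] = (40 * listings["bedrooms"] + 25 * listings["bathrooms"]
                     + 10 * listings["beds"] + 5 * listings["accommodates"]
                     + listings["neighbourhood"].map(offsets))

numeric_cols = ["bedrooms", "bathrooms", "beds", "accommodates"]

# One-hot encode the location; drop_first avoids a column that is
# redundant with the model's intercept.
location = pd.get_dummies(listings["neighbourhood"], prefix="nbhd",
                          drop_first=True, dtype=float)

X = pd.concat([listings[numeric_cols], location], axis=1)
y = listings["price"]

model = LinearRegression().fit(X, y)
predicted = model.predict(X)
```

Because the toy prices follow an exact linear rule, the fitted model reproduces them; on real listings the fit would of course be noisier and should be evaluated on held-out data.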
$ git clone git@github.com:barkas62/data_scientist_project1.git
$ cd data_scientist_project1
Main notebook. Data are loaded and preprocessed.
New data (Estimated Property Revenue) are derived from the listings and reviews data.
These data are used to answer the stated questions and to build a predictive model.
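The README does not spell out how Estimated Property Revenue is computed, so the following is only a hedged sketch of one plausible proxy: each review is treated as one stay of `minimum_nights` nights at the listed `price`. The formula, the column names (`id`, `price`, `minimum_nights`, `listing_id`), and the toy values are assumptions for illustration; the notebook may define the metric differently.

```python
import pandas as pd

# Toy stand-ins for listings.csv and reviews.csv (values are made up).
listings = pd.DataFrame({
    "id": [1, 2, 3],
    "price": ["$100.00", "$1,200.00", "$50.00"],  # assumed string format
    "minimum_nights": [2, 1, 3],
})
reviews = pd.DataFrame({"listing_id": [1, 1, 2, 1]})

# Number of reviews per listing; listings without reviews get 0.
review_counts = reviews.groupby("listing_id").size()
listings["n_reviews"] = listings["id"].map(review_counts).fillna(0)

# Strip the currency symbol and thousands separator before converting.
listings["price_usd"] = (listings["price"]
                         .str.replace(r"[$,]", "", regex=True)
                         .astype(float))

# Assumed proxy: one review ~ one stay of minimum_nights at the listed price.
listings["estimated_revenue"] = (listings["price_usd"]
                                 * listings["minimum_nights"]
                                 * listings["n_reviews"])
```

A proxy like this most likely understates real revenue, since not every guest leaves a review and stays can be longer than the minimum.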
Can be downloaded from here.
The unzipped data files (listings.csv and reviews.csv) must be placed in the /data subfolder.
$ jupyter notebook
The Jupyter environment will open in your browser. Click on airbnb_data_analysis.ipynb to open the notebook.
The Estimated Revenue metric proved useful for drawing important insights from the Seattle AirBnB data:
We also created and trained a simple Linear Regression model that can help prospective buyers estimate the rental price of their property from several property features (number of bedrooms, bathrooms, and beds; how many people can be accommodated) and from the neighbourhood where the property is located.
https://medium.com/@barkas62/using-airbnb-data-to-help-prospective-hosts-b80b0cd74375
[1] https://towardsdatascience.com/airbnb-in-seattle-data-analysis-8222207579d7