Data analytics of open-source Airbnb data to understand the pricing of properties in NYC. \
http://insideairbnb.com/get-the-data.html
This study builds on the existing literature by going beyond the limited set of traditional features in the Airbnb dataset and enriching the data with other sources. In this novel approach, geospatial data is combined with data from Inside Airbnb, so that features such as accessibility from the nearest subway station, eateries close to the accommodation, and the count of attractions within 200 m can also be correlated with the price of the listing. In addition, we account for seasonality by discretizing weather data when engineering the model inputs, which helps explain the shift in price across months.
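As an illustration of the geospatial features described above, the following is a minimal sketch of how a "count within 200 m" feature and a "distance to nearest" feature could be derived from latitude/longitude pairs. It assumes great-circle (haversine) distance is an acceptable approximation; the function names and the sample coordinates are hypothetical and are not taken from the project code.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def count_within(listing, places, radius_m=200.0):
    """Number of places whose distance to the listing is at most radius_m."""
    lat0, lon0 = listing
    return sum(1 for lat, lon in places
               if haversine_m(lat0, lon0, lat, lon) <= radius_m)


def nearest_distance_m(listing, places):
    """Distance in metres from the listing to the closest place."""
    lat0, lon0 = listing
    return min(haversine_m(lat0, lon0, lat, lon) for lat, lon in places)


# Hypothetical sample data: one listing and two points of interest in Manhattan.
listing = (40.7580, -73.9855)
attractions = [(40.7589, -73.9851), (40.7484, -73.9857)]

n_attractions_200m = count_within(listing, attractions)
nearest_attraction_m = nearest_distance_m(listing, attractions)
print(n_attractions_200m, round(nearest_attraction_m))
```

The same two helpers could be reused for subway stations or eateries by passing a different list of coordinates. For a full city-wide dataset, a spatial index would scale better than this pairwise loop.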
This project studies New York City Airbnb data, exploring various decision-making and machine learning techniques to best predict the price of a property given some input features. The dataset consists of data scraped from Airbnb combined with geospatial data from OpenStreetMap. Multiple linear regression is used as a baseline model, followed by experiments with more sophisticated models, namely Random Forests, XGBoost, and a neural network.
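To make the baseline concrete, here is a minimal sketch of multiple linear regression fitted by ordinary least squares through the normal equations. It uses only the standard library so that it runs anywhere; in practice a library implementation would be the usual choice. The feature names (bedrooms, attractions within 200 m) and the toy prices are hypothetical and serve only to show the mechanics, not the project's actual data or results.

```python
import math


def fit_linear_regression(X, y):
    """Solve (X^T X) b = X^T y for b, with an intercept as the first coefficient."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; normal equations are singular")
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [v / pv for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


def predict(coef, X):
    """Predicted price for each feature row."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]


def rmse(y_true, y_pred):
    """Root mean squared error, a common metric for comparing price models."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


# Hypothetical toy data: [bedrooms, attractions within 200 m] -> nightly price.
# Prices are generated exactly as 50 + 30 * bedrooms + 2 * attractions.
X = [[1, 0], [2, 3], [3, 1], [1, 5], [2, 2]]
y = [80.0, 116.0, 142.0, 90.0, 114.0]

coef = fit_linear_regression(X, y)
baseline_rmse = rmse(y, predict(coef, X))
print([round(c, 3) for c in coef], round(baseline_rmse, 6))
```

The RMSE of this baseline on held-out data gives a reference point against which the more sophisticated models can be compared.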