Namespace:    words-housing

PI: Volkan Vural
Institution: University of California, San Diego
Project description:

This project develops machine learning-based techniques to predict property values in San Diego County, using data obtained from multiple sources. The data fall into three categories:
1. Intrinsic property features such as number of bedrooms, bathrooms, square footage, etc.
2. External features such as school districts and school ratings.
3. Historical transactions such as sales, valuations, foreclosures.
The property characteristics, address details, sales, valuation, and foreclosure datasets span a 30-year period from 1987 to 2017 and were collected from the City of San Diego, while other datasets, such as school districts and school ratings, are available online. More than 250 features were engineered from these datasets, but only the 37 that offered significant predictive power were selected as inputs to the prediction system. To further improve the system's performance, more external variables are being analyzed and incorporated. These include, but are not limited to, macroeconomic indicators such as mortgage rates, LIBOR, prime rates, and inflation, as well as census data and the USD to MXN (Mexican Peso) exchange rate. Additionally, the effect of cross-border economic activity and cultural identity on the San Diego housing market will be investigated. This could be the first study in the literature to use machine learning techniques to examine cross-border economic effects on housing prices. This project also serves as a capstone for the graduate students at MAS.
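
The description does not say how the 37 features were chosen from the engineered set. As a minimal sketch only, assuming a simple filter that ranks each feature by the absolute value of its Pearson correlation with the sale price, the selection step could look like the following. The function names, the synthetic data, and the ranking criterion are all hypothetical and are not taken from the project.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_top_features(columns, target, k):
    """Return the k feature names with the largest |correlation| to the target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Synthetic, made-up data (not project data): the price depends on square
# footage and bedroom count, while "noise" is unrelated to the price.
rng = random.Random(0)
n = 500
sqft = [rng.uniform(500, 4000) for _ in range(n)]
bedrooms = [rng.randint(1, 6) for _ in range(n)]
noise = [rng.gauss(0, 1) for _ in range(n)]
price = [300 * s + 100000 * b + rng.gauss(0, 50000) for s, b in zip(sqft, bedrooms)]

columns = {"sqft": sqft, "bedrooms": bedrooms, "noise": noise}
selected = select_top_features(columns, price, k=2)
print(selected)
```

A correlation filter only captures linear, one-feature-at-a-time relationships. Model-based importance scores or cross-validated selection are common alternatives when features interact.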

Software: Python, Conda, TensorFlow
