# Context
Our client is a large **Real Estate Investment Trust (REIT)** based in **New York State, USA**. They **invest** in houses, apartments, and condos and, as part of their business, they attempt to **predict a fair transaction price** for a property **before it is sold**. They do this to calibrate internal pricing models and to keep a pulse on the housing market.
They currently employ **independent appraisers** to evaluate each property and estimate its price from its features. The drawback of this approach is that the quality of the appraisers can **vary wildly**, and so can their ability to **accurately predict** transaction prices.
Inexperienced appraisers have, **on average, an error of 70K USD** between the predicted and the actual transaction price.
To reduce the error between predicted and actual prices, the client would like to replace the appraisers with a machine learning model, and they have hired us to develop it.
# Objectives
1) Explore the techniques involved in an entire **Machine Learning workflow** from start to finish.
2) Build a real estate pricing model that predicts transaction prices with a **Mean Absolute Error (MAE) below 70K USD**.
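
For reference, MAE is the average absolute difference between the predicted and the actual price. The snippet below is a minimal sketch in plain Python; the function name and the prices are made up for illustration and are not taken from the project data.

```python
def mean_absolute_error(actual, predicted):
    """Return the average of |actual - predicted| over all properties."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one observation")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical transaction prices in USD (illustrative only).
actual_prices = [300_000, 450_000, 520_000]
predicted_prices = [320_000, 400_000, 530_000]

mae = mean_absolute_error(actual_prices, predicted_prices)

# The win condition from the objectives: MAE below 70K USD.
meets_win_condition = mae < 70_000
```

Because MAE is expressed in the same unit as the target (USD), it can be compared directly with the appraisers' average error.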
# Document Structure
We have followed **5 key steps** in our Machine Learning framework:
**Section 1** - Exploratory Analysis: This first step is meant to be quick, efficient and decisive with the end goal of allowing us to get to 'know the data'. We will uncover hints on how to approach data cleaning and which features to select for feature engineering.
**Section 2** - Data Cleaning: Before building any models, we will clean our dataset. In the real world, most datasets are 'messy', i.e., they may contain typos, duplicate records, or measurement errors, and they need to be thoroughly cleaned before they can be used for analysis.
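
The kinds of cleaning steps mentioned above could look roughly like the following pandas sketch. The column names (`tx_price`, `sqft`, `lot_size`, `exterior`), the values, and the outlier threshold are all assumptions chosen for illustration; the actual dataset and the cleaning rules used in the notebook may differ.

```python
import pandas as pd

# Hypothetical raw data; column names and values are illustrative only.
raw = pd.DataFrame({
    "tx_price": [295_000, 295_000, 410_000, 380_000, 520_000],
    "sqft": [1_200, 1_200, 1_850, 1_600, 2_400],
    "lot_size": [5_000, 5_000, 7_500, 9_000_000, 10_000],
    "exterior": ["Brick", "Brick", "brick ", "Wood", None],
})

cleaned = (
    raw.drop_duplicates()  # remove exact duplicate rows
    .assign(
        # Normalize inconsistent labels ("brick " -> "Brick") and
        # flag missing categories explicitly instead of dropping them.
        exterior=lambda d: d["exterior"].str.strip().str.title().fillna("Missing")
    )
    # Drop an implausible lot size (assumed threshold, likely a measurement error).
    .loc[lambda d: d["lot_size"] <= 500_000]
    .reset_index(drop=True)
)
```

Each step is a separate, readable link in the chain, so individual rules can be added or removed as the exploratory analysis suggests.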
**Section 3** - Feature Engineering: We will create new input features from the existing features in our dataset. The result of this process is an 'Analytical Base Table (ABT)', which represents our dataset after it has been cleaned and augmented through feature engineering.
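
As a hedged illustration of what building an ABT can involve, the sketch below derives two new features and one-hot encodes a categorical column with pandas. The columns (`tx_year`, `year_built`, `beds`, `baths`, `property_type`) and the engineered features (`property_age`, `two_and_two`) are hypothetical examples, not necessarily the ones used in the notebook.

```python
import pandas as pd

# Hypothetical cleaned data; column names and values are illustrative only.
df = pd.DataFrame({
    "tx_year": [2015, 2012, 2016],
    "year_built": [1990, 2005, 2016],
    "beds": [2, 3, 2],
    "baths": [2, 1, 2],
    "property_type": ["Condo", "Single-Family", "Condo"],
})

# Interaction feature: age of the property at the time of the transaction.
df["property_age"] = df["tx_year"] - df["year_built"]

# Indicator feature: 1 for properties with exactly 2 beds and 2 baths.
df["two_and_two"] = ((df["beds"] == 2) & (df["baths"] == 2)).astype(int)

# One-hot encode the categorical column so regression models can use it.
abt = pd.get_dummies(df, columns=["property_type"], dtype=int)

# Drop the raw columns that the new feature replaces.
abt = abt.drop(columns=["tx_year", "year_built"])
```

The resulting table contains only numeric columns, which is the form most regression algorithms expect.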
**Section 4** - Algorithm Selection: We will consider three common families of machine learning regression algorithms, namely:
- Linear Regression
- Regularized Regression
- Tree Ensemble Methods
**Section 5** - Model Training: We will examine the performance of each of these algorithms on our dataset whilst attempting to avoid overfitting through dataset splitting, preprocessing pipelines, and cross-validation.
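
The training approach described above (hold-out split, preprocessing pipelines, cross-validated tuning) can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the notebook's code: the data is synthetic (`make_regression`), and the specific estimators, hyperparameter grids, and 5-fold setting are choices made for the example.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the ABT (features X, target y); not real housing data.
X, y = make_regression(n_samples=300, n_features=8, noise=20.0, random_state=0)

# Hold out a test set that is never touched during tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# One pipeline per algorithm family; scaling is fit on training folds only.
pipelines = {
    "linear": make_pipeline(StandardScaler(), LinearRegression()),
    "ridge": make_pipeline(StandardScaler(), Ridge()),
    "rf": make_pipeline(StandardScaler(), RandomForestRegressor(random_state=0)),
}

# Assumed hyperparameter grids, keyed by "<step name>__<parameter>".
param_grids = {
    "linear": {"linearregression__fit_intercept": [True, False]},
    "ridge": {"ridge__alpha": [0.1, 1.0, 10.0]},
    "rf": {"randomforestregressor__n_estimators": [50, 100]},
}

results = {}
for name, pipeline in pipelines.items():
    # 5-fold cross-validation on the training set, optimizing for MAE.
    search = GridSearchCV(
        pipeline, param_grids[name], cv=5, scoring="neg_mean_absolute_error"
    )
    search.fit(X_train, y_train)
    predictions = search.predict(X_test)
    results[name] = {
        "mae": mean_absolute_error(y_test, predictions),
        "r2": r2_score(y_test, predictions),
    }
```

Wrapping the scaler and the model in one pipeline keeps preprocessing inside each cross-validation fold, which avoids leaking information from validation data into training. Which model scores best on this synthetic data says nothing about the real dataset.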
# Executive summary
After comparing linear regression, regularized regression, and tree ensemble algorithms, we found that the win condition of MAE < 70K USD can be achieved by a Random Forest model.
The MAE achieved was 68,116 USD and the corresponding R^2 value was 0.57.
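
To help interpret the R^2 figure: R^2 compares the model's squared error with that of a baseline that always predicts the mean price, so a value of 0.57 means the model accounts for roughly 57% of the variance in transaction prices. A minimal sketch of the computation in plain Python, with made-up numbers that are not from the project:

```python
def r_squared(actual, predicted):
    """R^2 = 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical prices in USD (illustrative only).
actual = [200_000, 300_000, 400_000, 500_000]
predicted = [250_000, 300_000, 350_000, 500_000]

r2 = r_squared(actual, predicted)

# Always predicting the mean price gives R^2 = 0 by construction.
baseline_r2 = r_squared(actual, [350_000] * 4)
```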