C50193Tsinghua University, China
2016 MCM/ICM Summary Sheet
Team Control Number: 50193
Problem Chosen: C
An Educational Donation Mechanism Based On Data Insight
Summary
In recent years, big data has become increasingly popular, and its guidance is
required in many fields, including the charitable field. In this paper, we construct a new ROI
evaluation system for charitable organizations, using data mining methods to process the data,
and determine an optimal investment strategy for the Goodgrant Foundation.
First, we process the data. We screen the data according to the integrity and redundancy
of the information, deleting records whose information falls below a threshold and
merging related attributes using linear fitting and PCA. For the retained attributes and
schools, we impute the missing data based on K-means clustering.
Then we normalize all the data to make them comparable in the following analysis.
Second, we construct an ROI evaluation criterion: the ratio of output to input multiplied
by an adjustment coefficient named “urgency”. The ratio reflects the benefits
relative to the cost, while urgency reflects the demand for money, an important
factor that a charitable organization should consider. We use PCA to select attributes,
letting salary, education quality and some others represent output, tuition represent
input, and federal loans, debt and some others represent urgency. Then AHP is used to
measure the relative importance of the factors and allocate weights.
Third, we put forward two kinds of model: a basic model for one year and a time series model
for five years. Treating the ROI as the benefit from investment, we introduce the fluctuation
of output as “risk”, borrowing the concept of Modern Portfolio Theory from the financial
sector to solve the problem. In the basic model, we develop a Mixed Integer Linear
Programming (MILP) algorithm and find 14 schools for investment. Further, we
consider the time factor and extend the model into a time series model, using MILP and
Grey Prediction to determine the long-term investment strategy. 16 schools are chosen,
with different time durations and different amounts of money.
Finally, we perform a sensitivity analysis of our model, varying the number of schools, the
restriction on money, whether the money is allocated equally, and so on, to analyze
the resulting changes in output and to find better parameters for ideal results.
To sum up, our model is feasible and reasonable, with technical and data support.
Because part of the model is subjective, it can be used flexibly after data training.
Team # 50193 Page 2 of 22
Contents
1 Introduction
1.1 Background
1.2 Overview of Our Work
1.3 Assumptions
2 Data Processing
2.1 Data Screening
2.2 Data imputation
2.3 Data Normalization
3 ROI Evaluation System
3.1 Concept of ROI
3.2 Using Grey Theory to predict ROI
4 Model Construction
4.1 Definition of Risk
4.2 Basic Model
4.3 Results of basic model
4.4 Time series model
4.5 Results of time series model
5 Sensitive analysis and validation
5.1 Risk-Return
5.2 School number
5.3 Policy of distribution
6 Future Work
7 Conclusion
7.1 Strengths
7.2 weaknesses
1 Introduction
1.1 Background
A decade ago, you could not have imagined that “Facebook” would receive
millions of page views in one minute; you could not have imagined that opening “Google
Maps” would put all the travel information in the palm of your hand; you could not have
imagined that data mining would give you insight into the development trend of an
enterprise to guide your investment. Nowadays big data has penetrated our work and
lives and has brought huge changes. In turn, it has become particularly important
for us to extract useful information from the mass of data to guide our work and life.
As a special column in the “New York Times” in February 2012 says, the Big Data Era has come.
In commercial, economic and other areas, decisions will increasingly be
based on data and analysis rather than on experience and intuition. The field of
charity is no exception. In the past, it was more difficult to give money away intelligently
than to earn it in the first place.[1] We did not know how to give
rationally, so calculating ROI was out of the question. But now new and
faster information can make charitable giving more effective and efficient. Moreover,
it makes it possible to link charitable giving with investment problems in the
financial sector.
This article addresses a charitable donation problem concerning universities in the U.S. We aim to
design a system for measuring return on investment based on large quantities of data,
using data mining methods. To solve the problem, we use Portfolio Theory,
Linear Programming, Grey Theory and some other methods to determine the optimal
strategy over time.
1.2 Overview of Our Work
First, we identify a few key points in this problem:
• The volume of data is large and of different types, so how should the data be
normalized?
• The files contain many missing values, so some records carry little
information and are not easy to process in batches.
• How should the large number of university attributes be classified?
• Different attributes focus on different aspects, so how should their importance be judged?
• How should schools be chosen from the candidate list, and how should the investment
amount be allocated among them?
• The charitable investment process lasts for 5 years, so time is an important factor
that will influence our ROI criteria.
On the basis of the above discussion, to determine the optimal investment strategy, we
boil the tasks down to the following four steps:
• First, we screen the data according to the integrity and usefulness of the
information. For the retained attributes and schools, we use the K-means clustering method
to fill in the missing data. Then we normalize all the data.
• Second, we use the PCA method to choose and classify attributes. Then we
use the AHP method and knowledge of finance to construct the ROI concept, and we use
it to process the data and rank the candidate schools.
• Third, we introduce the concept of Modern Portfolio Theory and construct two
models. In the basic model, we develop a Mixed Integer Linear Programming
algorithm to determine an optimal investment strategy. Further, we consider the time
factor and extend the model into a timing model, using a Dynamic Programming
algorithm and Grey Prediction to determine the long-term investment strategy.
• Finally, we further analyze and discuss the model.
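The ranking in the second step can be illustrated with a minimal sketch. It assumes the ROI form stated in the summary (output divided by input, multiplied by urgency) and that all three quantities are already normalized, positive scores. The school names and numbers are hypothetical and are not taken from the data set.

```python
from dataclasses import dataclass


@dataclass
class School:
    name: str
    output: float   # normalized benefit score (e.g. salary, education quality)
    cost: float     # normalized input score (e.g. tuition)
    urgency: float  # adjustment coefficient reflecting the demand for money


def roi(school: School) -> float:
    """ROI = (output / input) * urgency, following the summary's description."""
    if school.cost <= 0:
        raise ValueError("input (cost) score must be positive")
    return school.output / school.cost * school.urgency


def rank_by_roi(schools):
    """Return the schools sorted from highest to lowest ROI."""
    return sorted(schools, key=roi, reverse=True)


# Hypothetical candidates with made-up scores.
schools = [
    School("A", output=0.8, cost=0.5, urgency=1.0),
    School("B", output=0.6, cost=0.3, urgency=1.2),
    School("C", output=0.9, cost=0.9, urgency=0.7),
]
ranking = [s.name for s in rank_by_roi(schools)]
print(ranking)
```

In this toy example school B ranks first even though its output score is the lowest, because its input is low and its urgency is high.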
1.3 Assumptions
• We ignore inflation, deflation and other aspects of the time value of money. The value
of money remains unchanged.
• As a charitable organization, we aim to improve educational performance and
expect greater social benefits rather than profit.
• Our charitable donation is a comprehensive scholarship, not a reward for
outstanding contributions in specific fields.
• If our goals and strategies differ from those of other charitable organizations, we
believe this can greatly reduce the possibility of duplicated investment.
• The object of evaluation and scholarship granting is the school, not the individual, though
some individual information is included in our criteria. It is the school's business
to allocate the money.
• We focus on the fairness of our strategy, so as to invest in more schools regardless of
their reputation.
• Because of marginal utility, we try not to invest a large amount of money in any one
school.
• In the timing model, we assume that the influence of factors
excluded from the model can be ignored. That is to say, the future is predictable.
2 Data Processing
2.1 Data Screening
The amount of raw data is large, so we first screen the data according to the
integrity and usefulness of the information.
First, we screen the 7805 schools:
• We only consider the 2978 candidate schools in the file Problem C - IPEDS UID
for Potential Candidate Schools, and match the schools with their 95 attributes in
the file Problem C - Most Recent Cohorts Data (Scorecard Elements).xlsx.
• We delete the schools that are not currently operating institutions, those that are on
Heightened Cash Monitoring 2 by the Department of Education, meaning that they
face financial difficulties and lack students, and those that have no or very limited
information about the percentage of degrees awarded. It is meaningless to invest in
these schools.
• We delete the schools for which more than 50% of the attributes are “NULL”. We treat
50% as the threshold for missing data: if the percentage of missing data exceeds it,
the imputation will result in great error.[2]
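The 50% missing-data rule can be sketched as a simple filter. This is only an illustration under stated assumptions: each school is a dictionary of attribute values, missing entries are marked by None or the string "NULL", and a school with exactly 50% missing attributes is kept. The IDs and attribute names below are hypothetical.

```python
def missing_fraction(record):
    """Fraction of attributes whose value is None or the string 'NULL'."""
    if not record:
        return 1.0  # a record with no attributes is treated as fully missing
    missing = sum(1 for value in record.values() if value is None or value == "NULL")
    return missing / len(record)


def screen_schools(schools, threshold=0.5):
    """Keep only the schools whose missing fraction does not exceed the threshold."""
    return {
        uid: record
        for uid, record in schools.items()
        if missing_fraction(record) <= threshold
    }


# Hypothetical records; the keys are illustrative, not real column names.
schools = {
    1001: {"tuition": 12000, "retention": 0.81, "median_debt": "NULL", "earnings": 41000},
    1002: {"tuition": "NULL", "retention": "NULL", "median_debt": "NULL", "earnings": 38000},
    1003: {"tuition": 9000, "retention": "NULL", "median_debt": "NULL", "earnings": 35000},
}
kept = screen_schools(schools)
print(sorted(kept))
```

School 1002 is dropped because three of its four attributes are missing, while school 1003 sits exactly at the threshold and is retained.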
Second, we screen the 95 attributes:
• We combine some binary variables. For example, we delete the net price
for different income classes and combine the total net prices for public and
private schools into one attribute, named “Net Price”. We also combine the retention
rates for full-time and part-time students into one attribute, named “Retention Rate”,
using the total retention rate for weighting.
• Though there are large quantities of missing data in “SAT scores” and “ACT scores”,
we retain their midpoints as categorical variables. High entrance
requirements in terms of SAT or ACT scores would indicate higher education quality, so we
treat such schools differently from schools with low entrance requirements in the following
ROI evaluation system.
• Besides, we retain all the flag attributes, like “flag for Historically Black College
and University” and “flag for women-only college”, and all the subject attributes, like
“percentage of degrees awarded in Architecture And Related Services”, using them
for the clustering and data imputation in the next step.
• We delete location, age of students and some other information that contributes less to
our ROI evaluation system.
After data screening, there are about 2700 schools, each with about 60 attributes, in the
candidate list.
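The merging of full-time and part-time retention rates into a single “Retention Rate” can be sketched as a weighted average. The text does not spell out the exact weights, so this sketch assumes the weight is the share of part-time students, and it falls back to whichever rate is available when the other is missing. The function name and the numbers are hypothetical.

```python
from typing import Optional


def combined_retention(full_time: Optional[float],
                       part_time: Optional[float],
                       part_time_share: float) -> Optional[float]:
    """Weighted average of the two retention rates.

    part_time_share is the assumed weight of the part-time rate, in [0, 1].
    If one rate is missing (None), the other is returned unchanged.
    """
    if not 0.0 <= part_time_share <= 1.0:
        raise ValueError("part_time_share must lie in [0, 1]")
    if full_time is None:
        return part_time
    if part_time is None:
        return full_time
    return (1.0 - part_time_share) * full_time + part_time_share * part_time


# Hypothetical school: 80% full-time retention, 50% part-time retention,
# and a quarter of the students enrolled part time.
rate = combined_retention(0.80, 0.50, part_time_share=0.25)
print(round(rate, 3))
```

Any other weighting scheme fits the same function by changing how part_time_share is computed.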