# region_estimators package
[![Build Status](https://travis-ci.org/UoMResearchIT/region_estimators.svg?branch=master)](https://travis-ci.org/UoMResearchIT/region_estimators)
region_estimators is a Python library for calculating regional estimates of scalar quantities, based on known scalar values at specific locations.
For example, estimating the NO2 (pollution) level of a postcode/zip region based on nearby sensor data.
This first version of the package ships with two estimation methods:
1. Diffusion: look for actual data points in gradually wider rings, starting with sensors within the region and working outwards ring by ring until sensors are found. If more than one sensor is found in the final ring, take the mean of their values.
2. Simple Distance measure: a very basic implementation. If sensors exist within the region, take the mean of their values; otherwise find the sensor nearest to the region and use its value. (A conceptual sketch of both strategies follows this list.)
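The sketch below is illustrative only and is not the package's implementation: it assumes the region is a shapely (Multi)Polygon, each sensor is a `(Point, value)` pair, and the `ring_width` and cumulative `buffer()` calls are hypothetical choices standing in for the library's ring-expansion logic.

```python
from shapely.geometry import Point, Polygon


def diffusion_estimate(region, sensors, ring_width=0.01, max_rings=100):
    """Widen the search area in rings until sensors are found, then take the mean."""
    search_area = region
    for _ in range(max_rings):
        values = [value for point, value in sensors if search_area.contains(point)]
        if values:
            return sum(values) / len(values)
        search_area = search_area.buffer(ring_width)  # expand the search area by one ring
    return None  # no sensors found within max_rings rings


def simple_distance_estimate(region, sensors):
    """Mean of sensors inside the region, otherwise the nearest sensor's value."""
    inside = [value for point, value in sensors if region.contains(point)]
    if inside:
        return sum(inside) / len(inside)
    _, nearest_value = min(sensors, key=lambda s: region.distance(s[0]))
    return nearest_value


# Hypothetical example: a unit-square region with two sensors outside it
region = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
sensors = [(Point(1.5, 0.5), 10.0), (Point(3.0, 0.5), 20.0)]
print(diffusion_estimate(region, sensors))        # the closer sensor is found first -> 10.0
print(simple_distance_estimate(region, sensors))  # nearest sensor -> 10.0
```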
## Installation
Use the package manager [pip](https://pip.pypa.io/en/stable/) to install region_estimators.
```bash
pip install shapely
pip install pandas
pip install geopandas
pip install region_estimators
```
## Usage
```python
>>> from shapely import wkt
>>> import pandas as pd
>>> from region_estimators import RegionEstimatorFactory
# Prepare input files (For sample input files, see the 'sample_input_files' folder)
>>> df_regions = pd.read_csv('/path/to/file/df_regions.csv', index_col='region_id')
>>> df_sensors = pd.read_csv('/path/to/file/df_sensors.csv', index_col='sensor_id')
>>> df_actuals = pd.read_csv('/path/to/file/df_actuals.csv')
# Convert the regions' geometry column from WKT strings into shapely geometry objects
>>> df_regions['geometry'] = df_regions.apply(lambda row: wkt.loads(row.geometry), axis=1)
# Create an estimator; the first parameter is the estimation method ('diffusion' or 'distance-simple').
>>> estimator = RegionEstimatorFactory.region_estimator('diffusion', df_sensors, df_regions, df_actuals)
# Make estimations
>>> estimator.get_estimations('urtica', 'AB', '2017-07-01')
>>> estimator.get_estimations('urtica', None, '2018-08-15') # Get estimates for all regions
>>> estimator.get_estimations('urtica', 'AB', None) # Get estimates for all timestamps
>>> estimator.get_estimations('urtica', None, None) # Get estimates for all regions and timestamps
# Save the DataFrame result to (for example) a CSV file:
>>> df_region_estimates = estimator.get_estimations('urtica', None, '2018-08-15')
>>> df_region_estimates.to_csv('/path/to/file/df_urtica_2018-08-15_estimates.csv')
##### Details of region_estimators classes / methods used above: #####
'''
Call RegionEstimatorFactory.region_estimator

Required inputs:

    method_name (string): the estimation method. For example, in the first version
        the options are 'diffusion' or 'distance-simple'

    3 pandas.DataFrame objects (for sample input files, see the 'sample_input_files' folder):

    sensors: list of sensors as pandas.DataFrame (one row per sensor)
        Required columns:
            'sensor_id' (INDEX): identifier for the sensor (must be unique to each sensor)
            'latitude' (numeric): latitude of the sensor location
            'longitude' (numeric): longitude of the sensor location
        Optional columns:
            'name' (string): human-readable name of the sensor

    regions: list of regions as pandas.DataFrame (one row per region)
        Required columns:
            'region_id' (INDEX): identifier for the region (must be unique to each region)
            'geometry' (shapely geometry, loaded from WKT): multi-polygon representing the region's location and shape

    actuals: list of actual sensor readings as pandas.DataFrame (one row per sensor per timestamp)
        Required columns:
            'timestamp' (string): timestamp of the actual reading
            'sensor_id': ID of the sensor that took the reading (must match a sensor_id in the
                sensors DataFrame, in both value and type)
            [one or more value columns] (float): value of the actual measurement reading;
                each column name should be the name of the measurement, e.g. 'NO2'

    verbose (int): verbosity of debug output; zero or less means no debug output

Returns:
    An initialised instance of a subclass of RegionEstimator
'''
estimator = RegionEstimatorFactory.region_estimator(method_name, df_sensors, df_regions, df_actuals)
# Call RegionEstimator.get_estimations
# Inputs:
#     measurement:    name of the measurement to estimate (string), e.g. 'urtica'
#     region_id:      region identifier (string, or None to get all regions)
#     timestamp:      timestamp identifier (string, or None to get all timestamps)
#     print_progress: print progress (boolean, default: False)
#
# WARNING! - estimator.get_estimations('urtica', None, None) will calculate an estimate for every region at every timestamp.
result = estimator.get_estimations('urtica', 'AB', '2018-08-15')
# result is a pandas DataFrame with columns:
# 'measurement'
# 'region_id'
# 'timestamp'
# 'value' (the estimated value)
# 'extra_data' (extra info about estimation calculation)
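# As an illustration only (plain pandas, not a region_estimators API), the estimates
# can be reshaped, e.g. into one row per region and one column per timestamp:
df_wide = result.pivot_table(index='region_id', columns='timestamp', values='value')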
```
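The input DataFrames do not have to be read from CSV files. The sketch below builds minimal in-memory frames that follow the column layout documented above and runs the same estimator; the coordinates, region geometry, and readings are invented purely for illustration.

```python
import pandas as pd
from shapely import wkt
from region_estimators import RegionEstimatorFactory

# Two sensors with made-up coordinates (index must be 'sensor_id')
df_sensors = pd.DataFrame({
    'sensor_id': ['S1', 'S2'],
    'latitude': [53.48, 53.51],
    'longitude': [-2.24, -2.30],
    'name': ['Sensor one', 'Sensor two'],
}).set_index('sensor_id')

# One region described by a made-up multi-polygon in WKT (index must be 'region_id')
df_regions = pd.DataFrame({
    'region_id': ['AB'],
    'geometry': ['MULTIPOLYGON (((-2.35 53.45, -2.35 53.55, -2.20 53.55, -2.20 53.45, -2.35 53.45)))'],
}).set_index('region_id')
df_regions['geometry'] = df_regions['geometry'].apply(wkt.loads)

# One reading per sensor per timestamp; 'urtica' is the measurement column
df_actuals = pd.DataFrame({
    'timestamp': ['2018-08-15', '2018-08-15'],
    'sensor_id': ['S1', 'S2'],
    'urtica': [1.0, 3.0],
})

estimator = RegionEstimatorFactory.region_estimator('diffusion', df_sensors, df_regions, df_actuals)
print(estimator.get_estimations('urtica', 'AB', '2018-08-15'))
```

Only the data source differs from the CSV-based example above; the column names and index columns are the ones required by `region_estimator`.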
## Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
## License
[MIT](https://opensource.org/licenses/MIT)