# Augment data library
### About
A generic package that helps data scientists balance a dataset by generating additional data points for an under-represented class.
### Installation
Install the package with pip:
`pip install augmentdata`
### Usage
Convert your dataset to a NumPy array.
All values in the data must be numeric.
The last column must hold the class label.
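The preparation steps above can be sketched as follows. This is only an illustration: the DataFrame, its column names (`label`, `height`, `colour`) and the integer encoding are hypothetical and not part of the library.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset: the label is in the first column and one feature is text.
df = pd.DataFrame({
    "label": ["neg", "pos", "neg", "pos"],
    "height": [1.0, 2.0, 1.5, 3.0],
    "colour": ["red", "blue", "red", "red"],
})

# Encode the non-numeric feature as integer category codes.
df["colour"] = df["colour"].astype("category").cat.codes
# Map the class labels to integers.
df["label"] = df["label"].map({"neg": 0, "pos": 1})

# Move the label column to the end and convert to a numeric NumPy array.
feature_cols = [c for c in df.columns if c != "label"]
data = df[feature_cols + ["label"]].to_numpy(dtype=float)

print(data.shape)  # (4, 3)
```

The resulting `data` array is all numeric with the class label in the last column, which is the layout described above.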
The `augment` function takes five inputs:
```
augment(data=df.values,k=k,class_ind=1,N=45000,randmx=randmx)
```
- `data`: the array-like input data; the last column is the class label.
- `k`: the number of nearest neighbors; it must be greater than or equal to 1.
- `class_ind`: the class label value to augment. For example, if the class labels are 0 and 1 and class 0 needs to be upsampled, set `class_ind=0`.
- `N`: the number of data points to add.
- `randmx`: a value between 0 and 1, inclusive. The smaller `randmx` is, the closer each generated point lies to its original data point. The random factor is drawn from uniform[0, `randmx`], with `randmx` <= 1.
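To give an intuition for `k` and `randmx`, here is a minimal sketch of neighbor-based interpolation in the style of SMOTE. It is an assumption about the general technique, not the library's actual implementation, which this README does not describe. The function name `interpolate_towards_neighbors` and its `seed` argument are hypothetical.

```python
import numpy as np

def interpolate_towards_neighbors(points, k, n_new, randmx, seed=0):
    """Hypothetical helper: build n_new synthetic rows by moving a randomly
    chosen point a random fraction, drawn from uniform[0, randmx], of the way
    towards one of its k nearest neighbors."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)

    # Pairwise Euclidean distances; a point is never its own neighbor.
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    neighbors = np.argsort(dists, axis=1)[:, :k]  # shape (n_points, k)

    # Pick a base point and one of its k neighbors for every new row.
    base_idx = rng.integers(0, len(points), size=n_new)
    nb_idx = neighbors[base_idx, rng.integers(0, k, size=n_new)]

    # A smaller randmx keeps the new point closer to its base point.
    step = rng.uniform(0.0, randmx, size=(n_new, 1))
    return points[base_idx] + step * (points[nb_idx] - points[base_idx])

# Features of a small minority class (no label column).
minority = np.array([[1, 2, 1], [3, 2, 1], [3, 1, 1], [2, 1, 1]])
synthetic = interpolate_towards_neighbors(minority, k=2, n_new=5, randmx=1)
print(synthetic.shape)  # (5, 3)
```

In this sketch every synthetic row lies on the line segment between a point and one of its neighbors, so a feature that is constant across the class stays constant in the new rows.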
The outputs are:
- `Data_a`: the complete data, including the augmented data points
- `Ext_d`: only the augmented data points
- `Ext_not`: the data points that were created but discarded
Example implementation
```
import numpy as np
from augmentdata import data_augment
l=[
[1,3,4,1],
[2,3,4,1],
[1,2,1,0],
[3,2,1,0],
[3,1,1,0],
[2,1,1,0],
[3,2,2,1],
[3,4,2,1],
[4,3,1,1]
]
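l = np.array(l)
k = 2        # number of nearest neighbors
randmx = 1   # upper bound of the uniform random factor
daug = data_augment.DataAugment()
# Add N=5 synthetic data points for class 0
Data_a, Ext_d, Ext_not = daug.augment(data=l, k=k, class_ind=0, N=5, randmx=randmx)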
print(Data_a)
```
Output (the generated rows depend on random draws, so the exact values may differ between runs)
```
array([[1. , 2. , 1. , 0. ],
[3. , 2. , 1. , 0. ],
[3. , 1. , 1. , 0. ],
[2. , 1. , 1. , 0. ],
[1.29027148, 1.98510073, 1. , 0. ],
[1.65549291, 1.4418645 , 1. , 0. ],
[2.02559196, 1.01965248, 1. , 0. ],
[2.79469135, 1.69371064, 1. , 0. ],
[2.907707 , 1.38716444, 1. , 0. ],
[1. , 3. , 4. , 1. ],
[2. , 3. , 4. , 1. ],
[3. , 2. , 2. , 1. ],
[3. , 4. , 2. , 1. ],
[4. , 3. , 1. , 1. ]])
```
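One way to check the class balance before and after augmentation is to count the labels in the last column. The helper `class_counts` below is a hypothetical convenience function, not part of the library. It is applied here to the original example data.

```python
import numpy as np

def class_counts(arr):
    """Hypothetical helper: count rows per class label (last column)."""
    labels, counts = np.unique(np.asarray(arr)[:, -1], return_counts=True)
    return {int(label): int(count) for label, count in zip(labels, counts)}

# The original example data, before augmentation.
original = np.array([
    [1, 3, 4, 1],
    [2, 3, 4, 1],
    [1, 2, 1, 0],
    [3, 2, 1, 0],
    [3, 1, 1, 0],
    [2, 1, 1, 0],
    [3, 2, 2, 1],
    [3, 4, 2, 1],
    [4, 3, 1, 1],
])

print(class_counts(original))  # {0: 4, 1: 5}
```

Applied to the augmented array shown in the output above, the same count would give 9 rows for class 0 (4 original plus 5 generated) and 5 rows for class 1.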
### Authors
- Dr Samir Brahim Belhaouari: samir.brahim@gmail.com, sbelhaouari@hbku.edu.qa
- Ashhadul Islam: ashhadulislam@gmail.com, aislam@mail.hbku.edu.qa