Python library | c4v-py-0.1.0.dev2021118.tar.gz resource - CSDN Library

73 views · uploaded 2022-04-07 00:29:25 · 64 KB · GZ

59 files in total

py: 52

toml: 2

json: 2

c4v-py-0.1.0.dev2021118.tar.gz (59 files)

c4v-py-0.1.0.dev2021118

PKG-INFO 6KB

angostura_connection.json 2KB

pyproject.toml 2KB

explanation.html 2KB

src

c4v

scraper

scraped_data_classes

scraped_data.py 3KB

base_scraped_data.py 694B

primicia_scraped_data.py 1005B

elpitazo_scraped_data.py 1004B

scrapy_settings.py 179B

scraper.py 3KB

spiders

el_pitazo.py 2KB

primicia.py 2KB

__init__.py 0B

utils.py 4KB

crawler

__init__.py 214B

crawlers

el_pitazo_crawler.py 496B

__init__.py 0B

primicia_crawler.py 1KB

base_crawler.py 7KB

__init__.py 218B

settings.py 1KB

scrapers

el_pitazo_scraper.py 392B

base_scraper.py 3KB

primicia_scraper.py 393B

base_scrapy_scraper.py 2KB

__init__.py 192B

spider_manager.py 3KB

persistency_manager

example_dict_manager.py 3KB

base_persistency_manager.py 3KB

sqlite_storage_manager.py 12KB

__init__.py 172B

c4v_cli.py 23KB

data

angostura_loader.py 3KB

data_loader.py 12KB

baseline_models.py 11KB

tweet_loader.py 3KB

data_sampler.py 14KB

settings.toml 0B

classifier

classifier_experiment.py 7KB

classifier.py 18KB

language_model

language_model.py 12KB

language_model_experiment.py 3KB

data_gathering

client_dataset_reformat.py 2KB

create_confirmation_dataset.py 4KB

create_training_csv.py 956B

primicia_irrelevant_news_scraping.py 2KB

primicia_irrelevant_news_wrangle.py 2KB

__init__.py 254B

base_model.py 5KB

experiment.py 8KB

__init__.py 22B

config.py 2KB

microscope

manager.py 20KB

utils.py 2KB

metadata.py 3KB

__init__.py 42B

myjson_metadata.json 101B

setup.py 6KB

README.md 4KB

# c4v-py

<p align="center">
  <img width="125" src="assets/logo.png">
</p>

> Solving Venezuela's pressing matters one commit at a time

`c4v-py` is a library used to address Venezuela's pressing issues using computer and data science.

- [Installation](#installation)
- [Usage](#usage)
- [Contributing](#contributing)
- [Roadmap](#roadmap)

## Installation

Use pip to install the package:

```bash
pip install c4v-py
```

## Usage

_TODO_

[Can you help us? Open a new issue in minutes!](https://github.com/code-for-venezuela/c4v-py/issues/new/choose)

## Contributing

The following tools are used in this project:

- [Poetry](https://python-poetry.org/) is used as the package manager.
- [Nox](https://nox.thea.codes/) is used as the automation tool, mainly for testing.
- [Black](https://black.readthedocs.io/) is the mandatory formatter.
- [PyEnv](https://github.com/pyenv/pyenv/wiki) is recommended for handling multiple Python versions on your machine.

The library is intended to be compatible with Python ~3.6.9, ~3.7.4 and ~3.8.2, but the primary supported version is ~3.8.2.

The general structure of the project follows the recommendations in [Cookiecutter Data Science](https://drivendata.github.io/cookiecutter-data-science/). The main difference lies in the source code itself, which is not constrained to data science code.

### Setup

1. Install pyenv and choose a Python version, e.g. 3.8.2. Once pyenv is installed, run `pyenv install 3.8.2`.
2. Install Poetry on your system.
3. Clone this repo in a desired location: `git clone https://github.com/code-for-venezuela/c4v-py.git`
4. Navigate to the folder: `cd c4v-py`
5. Make sure Poetry picks up the right Python version by running `pyenv local 3.8.2` (substitute the version you chose).
6. Since the `pyproject.toml` file already exists, fetch all dependencies by running `poetry install`. This step might take a few minutes to complete.
7. Install Nox.
8. From the `c4v-py` directory, run `nox -s tests` in your terminal to make sure all the tests pass.
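
Before running the setup steps, it can help to confirm that the active interpreter falls inside one of the version ranges listed under Contributing. The following is a minimal sketch, assuming the `~` notation carries Poetry's tilde semantics (for example, `~3.8.2` read as `>=3.8.2,<3.9.0`). The helpers `satisfies_tilde` and `is_supported` are hypothetical names for illustration and are not part of `c4v-py`.

```python
import sys

# Base versions taken from the README; reading "~" as a Poetry-style
# tilde requirement is an assumption, not something the README states.
SUPPORTED = [(3, 6, 9), (3, 7, 4), (3, 8, 2)]


def satisfies_tilde(version, base):
    """Return True if version >= base and both share the same major.minor."""
    version = tuple(version[:3])
    return version[:2] == base[:2] and version >= base


def is_supported(version):
    """Return True if version matches any of the listed tilde ranges."""
    return any(satisfies_tilde(version, base) for base in SUPPORTED)


if __name__ == "__main__":
    current = tuple(sys.version_info[:3])
    status = "supported" if is_supported(current) else "outside the listed ranges"
    print("Python %d.%d.%d is %s" % (current + (status,)))
```

Under this reading, 3.8.10 would be accepted while 3.8.1 and 3.9.0 would not.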

If you were able to follow every step without errors, you are ready to start contributing. Otherwise, [open a new issue](https://github.com/code-for-venezuela/c4v-py/issues/new/choose)!

## Roadmap

- [ ] Add CONTRIBUTING guidelines
- [ ] Add issue templates
- [ ] Document where to find things (datasets, more info, etc.)
  - This might be done (in conjunction) with GitHub Projects. Managing tasks there might be a good idea.
- [ ] Add LICENSE
- [ ] Change the authors field in pyproject.toml
- [ ] Change the repository field in pyproject.toml
- [ ] Move the content below to a place near the data in the data folder or use the reference folder. Check [Cookiecutter Data Science](https://drivendata.github.io/cookiecutter-data-science/) for details.
- [ ] Understand what is in the following folders and decide what to do with them.
  - [ ] brat-v1.3_Crunchy_Frog
  - [ ] creating_models
  - [x] data/data_to_annotate
  - [ ] data_analysis
- [ ] Set symbolic links between `brat-v1.3_Crunchy_Frog/data` and `data/data_to_annotate`. `data_sampler` extracts to `data/data_to_annotate`. Files placed here are read by Brat.
- [ ] Download Brat - `wget https://brat.nlplab.org/index.html`
- [ ] Untar Brat - `tar -xzvf brat-v1.3_Crunchy_Frog.tar.gz`
- [ ] Install Brat - `cd brat-v1.3_Crunchy_Frog && ./install.sh`
- [ ] Replace the default annotation conf with the current configuration - `wget https://raw.githubusercontent.com/dieko95/c4v-py/master/brat-v1.3_Crunchy_Frog/annotation.conf -O annotation.conf`
- [ ] Replace the default config.py with the current configuration - `wget https://raw.githubusercontent.com/dieko95/c4v-py/master/brat-v1.3_Crunchy_Frog/config.py -O config.py`
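
The symbolic-link item in the roadmap could be approached along the following lines. This is only a sketch: the roadmap does not say which side is the link and which is the target, so the code assumes the link lives inside Brat's data directory and points at the folder that `data_sampler` writes to. The function name `link_annotation_data` is hypothetical and not part of `c4v-py`.

```python
from pathlib import Path


def link_annotation_data(source: Path, link: Path) -> Path:
    """Create (or refresh) a directory symlink at `link` pointing to `source`.

    Assumption: `source` is the folder data_sampler extracts to and `link`
    is a path inside Brat's data directory. Neither is confirmed by the README.
    """
    source = source.resolve()
    if not source.is_dir():
        raise FileNotFoundError("source directory not found: %s" % source)

    # Make sure the directory that will hold the link exists.
    link.parent.mkdir(parents=True, exist_ok=True)

    if link.is_symlink():
        if link.resolve() == source:
            return link  # already points at the right place
        link.unlink()  # stale link, replace it
    elif link.exists():
        # Refuse to overwrite a real file or directory.
        raise FileExistsError("refusing to replace existing path: %s" % link)

    link.symlink_to(source, target_is_directory=True)
    return link
```

A call such as `link_annotation_data(Path("data/data_to_annotate"), Path("brat-v1.3_Crunchy_Frog/data/to_annotate"))` would then expose the sampled files to Brat; the link name `to_annotate` is illustrative only. The function is idempotent, so it can be rerun safely after each sampling pass.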
