PyPI Official Download | foreshadow-0.3.dev2.tar.gz Resource - CSDN Library


Python library

53 views · uploaded 2022-01-11 20:31:32 · 990KB GZ

192 files in total

py: 150

json: 21

csv: 9


PyPI Official Download | foreshadow-0.3.dev2.tar.gz (192 sub-files)

breast_cancer.csv 121KB

processed_data.csv 84KB

boston_housing.csv 35KB

boston_housing_processed.csv 21KB

heart-h.csv 19KB

heart-h_impute_multi.csv 8KB

heart-h_impute_median.csv 4KB

heart-h_impute_mean.csv 3KB

heart-h_impute_mode.csv 2KB

foreshadow_tpot.json 185KB

foreshadow_boston_housing_linear_regression.json 92KB

X_train_summary.json 22KB

test_serialize.json 4KB

complete_pipeline_test.json 822B

invalid_transformer_class.json 713B

invalid_transformer_params.json 700B

malformed_transformer.json 622B

override_column_intent_pipeline.json 489B

invalid_optimizer_config.json 452B

optimizer_test.json 450B

test.json 318B

override_multi_pipeline.json 210B

override_intent_pipeline_single.json 205B

empty_pipeline_test.json 198B

override_intent_pipeline_multi.json 196B

configs_override4.json 39B

configs_override3.json 39B

configs_empty.json 2B

configs_override2.json 2B

configs_override1.json 2B

LICENSE 11KB

README.md 2KB

PKG-INFO 6KB

resolver_components.pkl 3.82MB

test_params.pkl 14KB

search_space_optimize.pkl 7KB

search_space_no_combo.pkl 2KB

search_space_no_cfg.pkl 2KB

configs_default.pkl 724B

configs_empty.pkl 2B

test_foreshadow.py 34KB

parallelprocessor.py 24KB

test_transformers.py 22KB

preparerstep.py 21KB

foreshadow.py 19KB

all.py 16KB

wrapper.py 15KB

raw_data_set_featurizer_via_lambda.py 15KB

serializers.py 15KB

test_smart.py 14KB

test_internal.py 12KB

heuristics.py 11KB

console.py 11KB

test_cache_manager.py 10KB

auto.py 9KB

test_serializer.py 9KB

smart.py 9KB

cachemanager.py 9KB

ngram_featurizer.py 9KB

logging.py 9KB

base.py 8KB

preparer.py 8KB

base_data_set_parser.py 8KB

base_text_featurizer.py 7KB

test_base.py 7KB

test_console.py 7KB

test_logging.py 7KB

metrics.py 7KB

test_auto.py 7KB

test_registry.py 7KB

config.py 7KB

pipeline.py 7KB

param_mapping.py 7KB

param_distribution.py 7KB

intent_resolver.py 7KB

setup.py 6KB

test_utils.py 6KB

raw_data_set_parser.py 6KB

common.py 6KB

validation.py 6KB

__init__.py 5KB

test_general.py 5KB

test_data_cleaner.py 5KB

tuner.py 5KB

test_newpreprocessor.py 5KB

default_estimator_factory.py 4KB

test_preparer.py 4KB

feature_reducer.py 4KB

random_search.py 4KB

test_random_search.py 3KB

testing.py 3KB

test_config.py 3KB

financial.py 3KB

meta_data_set_featurizer_via_lambda.py 3KB

intentresolver.py 3KB

cleaner.py 3KB

test_metrics.py 3KB

fancyimpute.py 3KB

test_meta.py 3KB

test_integration.py 3KB


# automl_research

Code repository for AutoML research to support the Foreshadow project.

---

## Feature Type Inference (Intent Resolution)

When analyzing raw data set feature columns in `Foreshadow`, the type (intent) of each feature column has to be known a priori in order to select the appropriate feature transformation downstream. The goal of this research project is to build an intent resolver that can separate numerical and categorical raw feature columns. More classes can be added in the future.

### Installation

This library was developed on Python 3.6.8 and uses the same package dependencies as `Foreshadow` as of Oct. 17, 2019. To install the additional package dependencies for research-based functionalities, run the following:

```
pip install -r research_requirements.txt
```

### Usage

The functionality of this library is exposed through the `IntentResolver` class API, as shown below. The class outputs a prediction of "Numerical", "Categorical" or "Neither" for each raw feature column. Predictions with confidences lower than the `threshold` parameter of the `.predict` method (default = 0.7) are set to "Neither".

```
import pandas as pd
from lib import IntentResolver

# Initialise object
raw = pd.read_csv('path_to_dataset.csv', encoding='latin', low_memory=False)
resolver = IntentResolver(raw)

# Predict intent
# Outputs a pd.Series of predicted intents
resolver.predict()

# OR: Predict intent with confidences at a lower threshold (i.e. less rigorous prediction)
# Outputs a pd.DataFrame of predicted intent and confidences
resolver.predict(threshold=0.6, return_conf=True)
```

### Data Sources

- [Original Meta Data Set (OMDS)](https://github.com/pvn25/ML-Data-Prep-Zoo/tree/master/ML%20Schema%20Inference/Data)
- [360 Raw Data Sets (RDSs)](https://drive.google.com/file/d/1HGmDRBSZg-Olym2envycHPkb3uwVWHJX/view) (Sourced from the [GitHub README.md](https://github.com/pvn25/ML-Data-Prep-Zoo/tree/master/ML%20Schema%20Inference))

### References

1. V. Shah, P. Kumar, K. Yang, and A. Kumar, "Towards semi-automatic ML feature type inference."
2. N. Hynes, D. Sculley, and M. Terry, "The data linter: Lightweight, automated sanity checking for ML data sets," in NIPS MLSys Workshop, 2017.
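
### Sketch: confidence thresholding

The thresholding rule described under Usage (predictions whose confidence is lower than `threshold` become "Neither") can be illustrated with a small standalone sketch. This is not the `IntentResolver` implementation: the function name `apply_threshold`, the dictionary-based input format, and the column names and confidence values are all assumptions made up for illustration, and the real class works on pandas objects.

```python
from typing import Dict, Tuple

NEITHER = "Neither"


def apply_threshold(
    scored: Dict[str, Tuple[str, float]], threshold: float = 0.7
) -> Dict[str, str]:
    """Keep a column's predicted label only if its confidence reaches the threshold.

    Hypothetical helper: `scored` maps a column name to a (label, confidence)
    pair. Confidences lower than `threshold` are mapped to "Neither".
    """
    return {
        column: (label if confidence >= threshold else NEITHER)
        for column, (label, confidence) in scored.items()
    }


# Made-up per-column predictions, for illustration only.
scored = {
    "age": ("Numerical", 0.95),
    "zip_code": ("Categorical", 0.65),
    "notes": ("Categorical", 0.40),
}

default = apply_threshold(scored)        # default threshold of 0.7
relaxed = apply_threshold(scored, 0.6)   # lower threshold, less rigorous

print(default)
print(relaxed)
```

With these made-up values, lowering the threshold from 0.7 to 0.6 changes only `zip_code`, whose confidence of 0.65 falls between the two cut-offs. That is the trade-off the `threshold` parameter exposes: a lower value labels more columns, at the cost of accepting less certain predictions.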
