# TensorFlow Datasets
TensorFlow Datasets provides many public datasets as `tf.data.Dataset`s.
[![Travis](https://img.shields.io/travis/tensorflow/datasets.svg)](https://travis-ci.org/tensorflow/datasets)
* [List of datasets](https://github.com/tensorflow/datasets/tree/master/docs/datasets.md)
* [Try it in Colab](https://colab.research.google.com/github/tensorflow/datasets/blob/master/docs/overview.ipynb)
* [API docs](https://www.tensorflow.org/datasets/api_docs/python/tfds)
* [Add a dataset](https://github.com/tensorflow/datasets/tree/master/docs/add_dataset.md)
**Table of Contents**
* [Installation](#installation)
* [Usage](#usage)
* [`DatasetBuilder`](#datasetbuilder)
* [NumPy usage](#numpy-usage-with-tfdsas-numpy)
* [Want a certain dataset?](#want-a-certain-dataset)
* [Disclaimers](#disclaimers)
### Installation
```sh
pip install tensorflow-datasets
# Requires TF 1.12+ to be installed.
# Some datasets require additional libraries; see setup.py extras_require
pip install tensorflow
# or:
pip install tensorflow-gpu
```
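To confirm that an environment meets the TF 1.12+ requirement noted above, a small check like the following may help. It is only a sketch: `meets_minimum` and `installed_tf_version` are hypothetical helper names, not part of `tensorflow-datasets`, and the lookup assumes TensorFlow was installed under the `tensorflow` or `tensorflow-gpu` distribution name.

```python
import re
from importlib import metadata

MIN_TF_VERSION = (1, 12)  # minimum stated in the install notes above


def meets_minimum(version, minimum=MIN_TF_VERSION):
    """Return True if a 'major.minor...' version string is >= minimum."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return (int(match.group(1)), int(match.group(2))) >= tuple(minimum)


def installed_tf_version():
    """Return the installed TensorFlow version string, or None if absent."""
    # Assumption: TensorFlow was installed under one of these two names.
    for dist in ("tensorflow", "tensorflow-gpu"):
        try:
            return metadata.version(dist)
        except metadata.PackageNotFoundError:
            continue
    return None


if __name__ == "__main__":
    version = installed_tf_version()
    if version is None:
        print("TensorFlow is not installed; run `pip install tensorflow`.")
    elif not meets_minimum(version):
        print("TensorFlow %s is older than 1.12." % version)
    else:
        print("TensorFlow %s satisfies the 1.12+ requirement." % version)
```

Only the major and minor components are compared, so pre-release suffixes such as `rc1` are ignored.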
### Usage
```python
import tensorflow_datasets as tfds
import tensorflow as tf
# tfds works in both Eager and Graph modes
tf.enable_eager_execution()
# See available datasets
print(tfds.list_builders())
# Construct a tf.data.Dataset
ds_train, ds_test = tfds.load(name="mnist", split=["train", "test"])
# Build your input pipeline
ds_train = ds_train.shuffle(1000).batch(128).prefetch(10)
for features in ds_train.take(1):
  image, label = features["image"], features["label"]
```
Try it interactively in a
[Colab notebook](https://colab.research.google.com/github/tensorflow/datasets/blob/master/docs/overview.ipynb).
### `DatasetBuilder`
All datasets are implemented as subclasses of
[`DatasetBuilder`](https://www.tensorflow.org/datasets/api_docs/python/tfds/core/DatasetBuilder.md)
and
[`tfds.load`](https://www.tensorflow.org/datasets/api_docs/python/tfds/load.md)
is a thin convenience wrapper around it.
[`DatasetInfo`](https://www.tensorflow.org/datasets/api_docs/python/tfds/core/DatasetInfo.md)
documents the dataset.
```python
import tensorflow_datasets as tfds
# The following is the equivalent of the `load` call above.
# You can fetch the DatasetBuilder class by string
mnist_builder = tfds.builder("mnist")
# Download the dataset
mnist_builder.download_and_prepare()
# Construct a tf.data.Dataset
dataset = mnist_builder.as_dataset(split=tfds.Split.TRAIN)
# Get the `DatasetInfo` object, which contains useful information about the
# dataset and its features
info = mnist_builder.info
print(info)
# Example output:
# tfds.core.DatasetInfo(
#     name='mnist',
#     version=1.0.0,
#     description='The MNIST database of handwritten digits.',
#     urls=[u'http://yann.lecun.com/exdb/mnist/'],
#     features=FeaturesDict({
#         'image': Image(shape=(28, 28, 1), dtype=tf.uint8),
#         'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=10)
#     }),
#     total_num_examples=70000,
#     splits={
#         u'test': <tfds.core.SplitInfo num_examples=10000>,
#         u'train': <tfds.core.SplitInfo num_examples=60000>
#     },
#     supervised_keys=(u'image', u'label'),
#     citation='"""
#         @article{lecun2010mnist,
#           title={MNIST handwritten digit database},
#           author={LeCun, Yann and Cortes, Corinna and Burges, CJ},
#           journal={ATT Labs [Online]. Available: http://yann. lecun. com/exdb/mnist},
#           volume={2},
#           year={2010}
#         }
#     """',
# )
```
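The split sizes reported by `DatasetInfo` can feed directly into training configuration. As a rough illustration, the sketch below derives the number of batches per epoch from `info.splits["train"].num_examples`. The helper names `steps_per_epoch` and `train_steps` are hypothetical and not part of the library.

```python
import math


def steps_per_epoch(num_examples, batch_size, drop_remainder=False):
    """Number of batches needed to pass over num_examples once."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if drop_remainder:
        # A final partial batch is discarded.
        return num_examples // batch_size
    # A final partial batch counts as one extra step.
    return math.ceil(num_examples / batch_size)


def train_steps(info, batch_size):
    """Read the train split size from a DatasetInfo-like object."""
    return steps_per_epoch(info.splits["train"].num_examples, batch_size)
```

With the `mnist_builder` from the example above, `train_steps(mnist_builder.info, 128)` would return 469 for the 60,000-example train split shown in the output.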
### NumPy Usage with `tfds.as_numpy`
As a convenience for users who want plain NumPy arrays in their programs, you
can use
[`tfds.as_numpy`](https://www.tensorflow.org/datasets/api_docs/python/tfds/as_numpy.md)
to return a generator that yields NumPy array records from a
`tf.data.Dataset`. This lets you build high-performance input pipelines with
`tf.data` while using whatever framework you like for your model components.
```python
import tensorflow_datasets as tfds

train_ds = tfds.load("mnist", split=tfds.Split.TRAIN)
train_ds = train_ds.shuffle(1024).batch(128).repeat(5).prefetch(10)
for example in tfds.as_numpy(train_ds):
  numpy_images, numpy_labels = example["image"], example["label"]
```
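Once the batches are NumPy arrays, downstream code needs no TensorFlow calls of its own. The sketch below shows one possible consumer, assuming each batch is a dict with `"image"` (uint8, shape `[batch, 28, 28, 1]`) and `"label"` keys as in the MNIST example. `preprocess_batch` and `label_histogram` are hypothetical helpers, not library functions.

```python
import numpy as np


def preprocess_batch(example):
    """Scale uint8 images to [0, 1] and flatten each image to a vector."""
    images = example["image"].astype(np.float32) / 255.0
    flat = images.reshape(images.shape[0], -1)
    labels = example["label"].astype(np.int64)
    return flat, labels


def label_histogram(batches, num_classes=10):
    """Count how often each label occurs across an iterable of batches."""
    counts = np.zeros(num_classes, dtype=np.int64)
    for example in batches:
        _, labels = preprocess_batch(example)
        # Assumes every label lies in [0, num_classes).
        counts += np.bincount(labels, minlength=num_classes)
    return counts
```

For instance, `label_histogram(tfds.as_numpy(train_ds))` would tally labels over the whole pipeline. Because the pipeline above calls `repeat(5)`, each example would be counted five times.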
You can also use `tfds.as_numpy` in conjunction with `batch_size=-1` to
get the full dataset in NumPy arrays from the returned `tf.Tensor` object:
```python
train_data = tfds.load("mnist", split=tfds.Split.TRAIN, batch_size=-1)
numpy_data = tfds.as_numpy(train_data)
numpy_images, numpy_labels = numpy_data["image"], numpy_data["label"]
```
Note that the library still requires `tensorflow` as an internal dependency.
### Want a certain dataset?
Adding a dataset is straightforward if you follow
[our guide](https://github.com/tensorflow/datasets/tree/master/docs/add_dataset.md).
Request a dataset by opening a
[Dataset request GitHub issue](https://github.com/tensorflow/datasets/issues/new?assignees=&labels=dataset+request&template=dataset-request.md&title=%5Bdata+request%5D+%3Cdataset+name%3E).
You can also vote on the current
[set of requests](https://github.com/tensorflow/datasets/labels/dataset%20request)
by adding a thumbs-up reaction to the issue.
#### *Disclaimers*
*This is a utility library that downloads and prepares public datasets. We do*
*not host or distribute these datasets, vouch for their quality or fairness, or*
*claim that you have license to use the dataset. It is your responsibility to*
*determine whether you have permission to use the dataset under the dataset's*
*license.*
*If you're a dataset owner and wish to update any part of it (description,*
*citation, etc.), or do not want your dataset to be included in this*
*library, please get in touch through a GitHub issue. Thanks for your*
*contribution to the ML community!*
*If you're interested in learning more about responsible AI practices, including*
*fairness, please see Google AI's [Responsible AI Practices](https://ai.google/education/responsible-ai-practices).*
*`tensorflow/datasets` is Apache 2.0 licensed. See the `LICENSE` file.*