# Amazon Berkeley Objects (c) by Amazon.com
## License
This work is licensed under the Creative Commons Attribution-NonCommercial 4.0
International Public License. To obtain a copy of the full license, see
`LICENSE-CC-BY-NC-4.0.txt`, visit
[CreativeCommons.org](https://creativecommons.org/licenses/by-nc/4.0/)
or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
Under the following terms:
* Attribution — You must give appropriate credit, provide a link to the
license, and indicate if changes were made. You may do so in any reasonable
manner, but not in any way that suggests the licensor endorses you or your
use.
* NonCommercial — You may not use the material for commercial purposes.
* No additional restrictions — You may not apply legal terms or technological
measures that legally restrict others from doing anything the license
permits.
## Attribution
Credit for the data, including all images and 3d models, must be given to:
> Amazon.com
Credit for building the dataset, archives and benchmark sets must be given to:
> Matthieu Guillaumin (Amazon.com), Thomas Dideriksen (Amazon.com),
> Kenan Deng (Amazon.com), Himanshu Arora (Amazon.com),
> Jasmine Collins (UC Berkeley) and Jitendra Malik (UC Berkeley)
## Description
Amazon Berkeley Objects is a collection of 147,702 product listings with
multilingual metadata and 398,212 unique catalog images. 8,222 listings come
with turntable photography (also referred to as *spin* or *360º-View* images), as
sequences of 24 or 72 images, for a total of 586,584 images in 8,209 unique
sequences. For 7,953 products, the collection also provides high-quality 3d
models, as glTF 2.0 files.
The collection consists of the following files:
* `README.md` - The present file.
* `LICENSE-CC-BY-NC-4.0.txt` - The License file. You must read, agree to, and
comply with the License before using the Amazon Berkeley Objects data.
* `listings/metadata/listings_<i>.json.gz` - Product description and metadata.
Each of the 16 files is encoded with UTF-8 and gzip-compressed. Each line of
the decompressed files corresponds to one product as a JSON object (see
http://ndjson.org/ or https://jsonlines.org/ ). Each product JSON object
(a.k.a. dictionary) has any number of the following keys:
- `brand`
- Content: Brand name
- Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
- `bullet_point`
- Content: Important features of the products
- Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
- `color`
- Content: Color of the product as text
- Format: `[{"language_tag": <str>, "standardized_values": [<str>],
"value": <str>}, ...]`
- `color_code`
- Content: Color of the product as HTML color code
- Format: `[<str>, ...]`
- `country`
- Content: Country of the marketplace, as an
[ISO 3166-1 alpha 2](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2)
code
- Format: `<str>`
- `domain_name`
- Content: Domain name of the marketplace where the product is found.
A product listing in this collection is uniquely identified by
(`item_id`, `domain_name`)
- Format: `<str>`
- `fabric_type`
- Content: Description of product fabric
- Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
- `finish_type`
- Content: Description of product finish
- Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
- `item_dimensions`
- Content: Dimensions of the product (height, width, length)
- Format: `{"height": {"normalized_value": {"unit": <str>, "value":
<float>}, "unit": <str>, "value": <float>}, "length":
{"normalized_value": {"unit": <str>, "value": <float>}, "unit": <str>,
"value": <float>}, "width": {"normalized_value": {"unit": <str>,
"value": <float>}, "unit": <str>, "value": <float>}}`
- `item_id`
- Content: The product reference id. A product listing in this
collection is uniquely identified by (`item_id`, `domain_name`).
A corresponding product page may exist at
`https://www.<domain_name>/dp/<item_id>`
- Format: `<str>`
- `item_keywords`
- Content: Keywords for the product
- Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
- `item_name`
- Content: The product name
- Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
- `item_shape`
- Content: Description of the product shape
- Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
- `item_weight`
- Content: The product weight
- Format: `[{"normalized_value": {"unit": <str>, "value": <float>},
"unit": <str>, "value": <float>}, ...]`
- `main_image_id`
- Content: The main product image, provided as an `image_id`. See the
description of `images/metadata/images.csv.gz` below
- Format: `<str>`
- `marketplace`
- Content: Retail website name (Amazon, AmazonFresh, AmazonGo, ...)
- Format: `<str>`
- `material`
- Content: Description of the product material
- Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
- `model_name`
- Content: Model name
- Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
- `model_number`
- Content: Model number
- Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
- `model_year`
- Content: Model year
- Format: `[{ "language_tag": <str>, "value": <int> }, ...]`
- `node`
- Content: Location of the product in the category tree. A node page
may exist at `https://www.<domain_name>/b/?node=<node_id>` for
browsing
- Format: `[{ "node_id": <int>, "path": <str>}, ...]`
- `other_image_id`
- Content: Other available images for the product, provided as
`image_id`. See the description of `images/metadata/images.csv.gz`
below
- Format: `[<str>, ...]`
- `pattern`
- Content: Product pattern
- Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
- `product_description`
- Content: Product description as HTML
- Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
- `product_type`
- Content: Product type (category)
- Format: `<str>`
- `spin_id`
- Content: Reference to the 360º View image sequence. See the
description of `spins/metadata/spins.csv.gz` below
- Format: `<str>`
- `style`
- Content: Style of the product
- Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
- `3dmodel_id`
- Content: Reference to the 3d model of the product. See the description
of `3dmodels/metadata/3dmodels.csv.gz`
- Format: `<str>`
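
The listing format described above (gzip-compressed, UTF-8, one JSON object per line, any key possibly absent) can be read with the Python standard library alone. The following is a minimal sketch, not an official loader: the helper names `iter_listings`, `localized_value` and `product_url` are hypothetical, and matching on a `language_tag` prefix such as `en` is an assumption about how the tags are written.

```python
import gzip
import json
from typing import Iterator, Optional


def iter_listings(path: str) -> Iterator[dict]:
    """Yield one product dict per non-empty line of a gzip-compressed JSON Lines file."""
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def localized_value(entries: Optional[list], language_prefix: str = "en") -> Optional[str]:
    """Pick a value from a `[{"language_tag": ..., "value": ...}, ...]` list.

    Returns the first value whose language_tag starts with `language_prefix`
    (an assumed matching rule), falling back to the first entry, or None
    when the key is missing or the list is empty.
    """
    if not entries:
        return None
    for entry in entries:
        if entry.get("language_tag", "").startswith(language_prefix):
            return entry.get("value")
    return entries[0].get("value")


def product_url(listing: dict) -> Optional[str]:
    """Fill in the product page template given in the `item_id` description."""
    item_id = listing.get("item_id")
    domain_name = listing.get("domain_name")
    if item_id is None or domain_name is None:
        return None
    return f"https://www.{domain_name}/dp/{item_id}"
```

A caller would pass the path of one of the 16 metadata files to `iter_listings` and read optional keys with `dict.get`, for example `localized_value(listing.get("item_name"))`, since any key may be missing from a given product. As the `item_id` description notes, the resulting product page may or may not exist.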
* `images/metadata/images.csv.gz` - Image metadata. This file is a
gzip-compressed comma-separated values (CSV) file with the following
columns: `image_id`, `height`, `width`, and `path`.
- `image_id` (string): this id uniquely refers to a product image. This id
can be used to retrieve the image data from Amazon's Content Delivery
Network (CDN) using the template:
`https://m.media-amazon.com/images/I/<image_id>.<extension>` [^1],
where `<extension>` is composed of the characters following the dot in the
`path` field. Any value occurring in the `main_image_id` and `other_image_id`
attributes of product metadata is an `image_id` present in this file.
- `height` (int) and `width` (int): respectively, the height and width of
the original image.
- `path`: the location of the image file relative to the `images/original/`
or `images/small/` directories. A path is composed of lowercase hex
characters (`0-9a-f`) that also u
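
The `image_id` entry above describes how a CDN URL is assembled from an `image_id` and the extension taken from `path`. A minimal Python sketch of that rule follows. It assumes the CSV file starts with a header row naming the four columns. The helper names `iter_image_rows` and `cdn_url` are hypothetical, and the part of the template that precedes `<image_id>` is passed in as `url_prefix` rather than hard-coded.

```python
import csv
import gzip
from typing import Iterator


def iter_image_rows(path: str) -> Iterator[dict]:
    """Yield one dict per CSV row, keyed by column name (header row assumed)."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle)


def cdn_url(image_id: str, image_path: str, url_prefix: str) -> str:
    """Append `<image_id>.<extension>` to `url_prefix`.

    The extension is the text after the last dot in `image_path`,
    following the rule stated for the `image_id` column.
    """
    _, dot, extension = image_path.rpartition(".")
    if not dot or not extension:
        raise ValueError(f"no file extension in path: {image_path!r}")
    return f"{url_prefix}{image_id}.{extension}"
```

All values produced by `csv.DictReader` are strings, so `height` and `width` need an explicit `int(...)` conversion before numeric use.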