Exploring Vector Similarity in Redis (.zip resource) - CSDN Library

21 files in total

ipynb: 5

txt: 3

csv: 2


16 views, uploaded 2024-12-04 12:21:34, 120KB ZIP


探索 Redis 中的向量相似度.zip (Exploring Vector Similarity in Redis; 21 files)

VisualSearch100k.ipynb 21KB

VisualSearch1k.ipynb 20KB

100k-item-keyword-vectors.npy 134B

标签.txt 2B

.gitattributes 84B

data

product_image_data.csv 134B

.ipynb_checkpoints

Untitled-checkpoint.ipynb 72B

README.md 15KB

LICENSE-CC-BY-NC-4.0.txt 14KB

product_data.csv 134B

LICENSE 1KB

SemanticSearch1k.ipynb 17KB

docker-compose.yml 752B

docs

jupyter-log.png 44KB

big_sur_docker 51KB

jupyter

Dockerfile 329B

资源内容.txt 1008B

.gitignore 102B

SemanticSearch100k.ipynb 16KB

README.md 4KB

100k-image-vectors.npy 134B

# Amazon Berkeley Objects (c) by Amazon.com

## License

This work is licensed under the Creative Commons Attribution-NonCommercial 4.0 International Public License. To obtain a copy of the full license, see `LICENSE-CC-BY-NC-4.0.txt`, visit [CreativeCommons.org](https://creativecommons.org/licenses/by-nc/4.0/) or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

Under the following terms:

* Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
* NonCommercial — You may not use the material for commercial purposes.
* No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

## Attribution

Credit for the data, including all images and 3d models, must be given to:

> Amazon.com

Credit for building the dataset, archives and benchmark sets must be given to:

> Matthieu Guillaumin (Amazon.com), Thomas Dideriksen (Amazon.com),
> Kenan Deng (Amazon.com), Himanshu Arora (Amazon.com),
> Jasmine Collins (UC Berkeley) and Jitendra Malik (UC Berkeley)

## Description

Amazon Berkeley Objects is a collection of 147,702 product listings with multilingual metadata and 398,212 unique catalog images. 8,222 listings come with turntable photography (also referred to as *spin* or *360º-View* images), as sequences of 24 or 72 images, for a total of 586,584 images in 8,209 unique sequences. For 7,953 products, the collection also provides high-quality 3d models, as glTF 2.0 files.

The collection is made of the following files:

* `README.md` - The present file.
* `LICENSE-CC-BY-NC-4.0.txt` - The License file. You must read, agree to and comply with the License before using the Amazon Berkeley Objects data.
* `listings/metadata/listings_<i>.json.gz` - Product description and metadata. Each of the 16 files is encoded with UTF-8 and gzip-compressed. Each line of the decompressed files corresponds to one product as a JSON object (see http://ndjson.org/ or https://jsonlines.org/ ). Each product JSON object (a.k.a. dictionary) has any number of the following keys:
  - `brand`
    - Content: Brand name
    - Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
  - `bullet_point`
    - Content: Important features of the products
    - Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
  - `color`
    - Content: Color of the product as text
    - Format: `[{"language_tag": <str>, "standardized_values": [<str>], "value": <str>}, ...]`
  - `color_code`
    - Content: Color of the product as HTML color code
    - Format: `[<str>, ...]`
  - `country`
    - Content: Country of the marketplace, as an [ISO 3166-1 alpha 2](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2) code
    - Format: `<str>`
  - `domain_name`
    - Content: Domain name of the marketplace where the product is found. A product listing in this collection is uniquely identified by (`item_id`, `domain_name`)
    - Format: `<str>`
  - `fabric_type`
    - Content: Description of product fabric
    - Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
  - `finish_type`
    - Content: Description of product finish
    - Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
  - `item_dimensions`
    - Content: Dimensions of the product (height, width, length)
    - Format: `{"height": {"normalized_value": {"unit": <str>, "value": <float>}, "unit": <str>, "value": <float>}, "length": {"normalized_value": {"unit": <str>, "value": <float>}, "unit": <str>, "value": <float>}, "width": {"normalized_value": {"unit": <str>, "value": <float>}, "unit": <str>, "value": <float>}}`
  - `item_id`
    - Content: The product reference id. A product listing in this collection is uniquely identified by (`item_id`, `domain_name`). A corresponding product page may exist at `https://www.<domain_name>/dp/<item_id>`
    - Format: `<str>`
  - `item_keywords`
    - Content: Keywords for the product
    - Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
  - `item_name`
    - Content: The product name
    - Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
  - `item_shape`
    - Content: Description of the product shape
    - Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
  - `item_weight`
    - Content: The product weight
    - Format: `[{"normalized_value": {"unit": <str>, "value": <float>}, "unit": <str>, "value": <float>}, ...]`
  - `main_image_id`
    - Content: The main product image, provided as an `image_id`. See the description of `images/metadata/images.csv.gz` below
    - Format: `<str>`
  - `marketplace`
    - Content: Retail website name (Amazon, AmazonFresh, AmazonGo, ...)
    - Format: `<str>`
  - `material`
    - Content: Description of the product material
    - Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
  - `model_name`
    - Content: Model name
    - Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
  - `model_number`
    - Content: Model number
    - Format: `[{ "language_tag": <str>, "value": <str> }, ...]`
  - `model_year`
    - Content: Model year
    - Format: `[{ "language_tag": <str>, "value": <int> }, ...]`
  - `node`
    - Content: Location of the product in the category tree. A node page may exist at `https://www.<domain_name>/b/?node=<node_id>` for browsing
    - Format: `[{ "node_id": <int>, "path": <str>}, ...]`
  - `other_image_id`
    - Content: Other available images for the product, provided as `image_id`. See the description of `images/metadata/images.csv.gz` below
    - Format: `[<str>, ...]`
  - `pattern`
    - Content: Product pattern
    - Format: `[{ "language_tag": <str>, "value": <int> }, ...]`
  - `product_description`
    - Content: Product description as HTML
    - Format: `[{ "language_tag": <str>, "value": <int> }, ...]`
  - `product_type`
    - Content: Product type (category)
    - Format: `<str>`
  - `spin_id`
    - Content: Reference to the 360º View image sequence. See the description of `spins/metadata/spins.csv.gz` below
    - Format: `<str>`
  - `style`
    - Content: Style of the product
    - Format: `[{ "language_tag": <str>, "value": <int> }, ...]`
  - `3dmodel_id`
    - Content: Reference to the 3d model of the product. See the description of `3dmodels/metadata/3models.csv.gz`
    - Format: `<str>`
* `images/metadata/images.csv.gz` - Image metadata. This file is a gzip-compressed comma-separated value (CSV) file with the following columns: `image_id`, `height`, `width`, and `path`.
  - `image_id` (string): this id uniquely refers to a product image. This id can be used to retrieve the image data from Amazon's Content Delivery Network (CDN) using the template: `https://m.media-amazon.com/image/I/<image_id>.<extension>` [^1], where `<extension>` is composed of the characters following the dot in the `path` field. Any value occurring in the `main_image_id` and `other_image_id` attributes of product metadata is an `image_id` present in this file.
  - `height` (int) and `width` (int): respectively, the height and width of the original image.
  - `path`: the location of the image file relative to the `images/original/` or `images/small/` directories. A path is composed of lowercase hex characters (`0-9a-f`) that also u
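
The listing format and the image URL template described above can be exercised with a few lines of Python. The following is only a minimal sketch using the standard library: the helper names `iter_listings`, `pick_value` and `image_url` are hypothetical, the preference for English language tags is an assumption made for illustration, and the sample record is invented rather than taken from the dataset.

```python
import gzip
import io
import json


def iter_listings(source):
    """Yield one product dict per non-empty line of a gzip-compressed JSON Lines file.

    `source` may be a path or a binary file object.
    """
    with gzip.open(source, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            if line.strip():
                yield json.loads(line)


def pick_value(entries, language_prefix="en"):
    """Pick a "value" from a list of {"language_tag", "value"} entries.

    Prefers the first entry whose tag starts with `language_prefix` (an
    assumption of this sketch), falls back to the first entry, and returns
    None when the key was absent or the list is empty.
    """
    entries = entries or []
    for entry in entries:
        if entry.get("language_tag", "").startswith(language_prefix):
            return entry.get("value")
    return entries[0].get("value") if entries else None


def image_url(image_id, path):
    """Fill in the CDN template quoted in the README, taking the extension from `path`."""
    _, dot, extension = path.rpartition(".")
    if not dot:
        raise ValueError(f"no extension in path: {path!r}")
    return f"https://m.media-amazon.com/image/I/{image_id}.{extension}"


if __name__ == "__main__":
    # Hypothetical one-record file built in memory; not real dataset content.
    record = {
        "item_id": "X0000EXAMPLE",
        "domain_name": "amazon.com",
        "item_name": [
            {"language_tag": "de_DE", "value": "Tasse"},
            {"language_tag": "en_US", "value": "Mug"},
        ],
        "main_image_id": "abc123",
    }
    fake_file = io.BytesIO(gzip.compress((json.dumps(record) + "\n").encode("utf-8")))
    for product in iter_listings(fake_file):
        # Every key is optional, so use .get() rather than indexing.
        print(product.get("item_id"), pick_value(product.get("item_name")))
    print(image_url("abc123", "ab/abc123.jpg"))
```

Because each product object "has any number of" the listed keys, the sketch reads fields with `.get()` and treats a missing multilingual field as `None` instead of raising.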
