<p align="center">
<a href="https://jina.ai/"><img src="https://github.com/jina-ai/jina/blob/master/.github/logo-only.gif?raw=true" alt="Jina logo: Jina is a cloud-native neural search framework" width="200px"></a>
</p>
<p align="center">
<b>Cloud-Native <ins>Neural Search</ins><sup><a href=".github/2.0/neural-search.md">[?]</a></sup> Framework for <i>Any</i> Kind of Data</b>
</p>
<p align="center">
<a href="https://pypi.org/project/jina/"><img src="https://github.com/jina-ai/jina/blob/master/.github/badges/python-badge.svg?raw=true" alt="Python 3.7 3.8 3.9" title="Jina supports Python 3.7 and above"></a>
<a href="https://pypi.org/project/jina/"><img src="https://img.shields.io/pypi/v/jina?color=%23099cec&label=PyPI&logo=pypi&logoColor=white" alt="PyPI"></a>
<a href="https://hub.docker.com/r/jinaai/jina/tags"><img src="https://img.shields.io/docker/v/jinaai/jina?color=%23099cec&label=Docker&logo=docker&logoColor=white&sort=semver" alt="Docker Image Version (latest semver)"></a>
<a href="https://pepy.tech/project/jina"><img src="https://pepy.tech/badge/jina/month"></a>
<a href="https://codecov.io/gh/jina-ai/jina"><img src="https://codecov.io/gh/jina-ai/jina/branch/master/graph/badge.svg" alt="codecov"></a>
<a href="https://slack.jina.ai"><img src="https://img.shields.io/badge/Slack-1.2K%2B-blueviolet?logo=slack&logoColor=white"></a>
</p>
<!-- start elevator-pitch -->
Jina<sup><a href=".github/pronounce-jina.mp3">`🔊`</a></sup> allows you to build search-as-a-service powered by deep learning in just minutes.
<!-- end elevator-pitch -->
🌌 **All data types** - Large-scale indexing and querying of any kind of unstructured data: video, image, long/short text, music, source code, PDF, etc.

🌩️ **Fast & cloud-native** - Distributed architecture from day one, scalable and cloud-native by design: enjoy containerization, streaming, parallelization, sharding, async scheduling, and HTTP/gRPC/WebSocket protocols.

⏱️ **Save time** - *The* design pattern of neural search systems, taking you from zero to a production-ready system in minutes.

🍱 **Own your stack** - Keep end-to-end ownership of your solution and avoid the integration pitfalls of fragmented, multi-vendor, generic legacy tools.
## Run Quick Demo
- [👗 Fashion image search](./.github/pages/hello-world.md#-fashion-image-search): `jina hello fashion`
- [🤖 QA chatbot](./.github/pages/hello-world.md#-covid-19-chatbot): `pip install "jina[demo]" && jina hello chatbot`
- [📰 Multimodal search](./.github/pages/hello-world.md#-multimodal-document-search): `pip install "jina[demo]" && jina hello multimodal`
- 🍴 Fork the source of a demo to your folder: `jina hello fork fashion ../my-proj/`
## Install
- via PyPI: `pip install -U jina`
- via Docker: `docker run jinaai/jina:latest`
<details>
<summary>More installation options</summary>
| On x86/64, arm64/v6/v7 | Linux/macOS with Python 3.7/3.8/3.9 | Docker Users |
| --- | --- | --- |
| Minimum <br>(no HTTP, WebSocket, Docker support) | `JINA_PIP_INSTALL_CORE=1 pip install jina` | `docker run jinaai/jina:latest` |
| Minimum but more performant <br>(use `uvloop` & `lz4`) | `JINA_PIP_INSTALL_PERF=1 pip install jina` | `docker run jinaai/jina:latest-perf` |
| With <a href="https://api.jina.ai/daemon/">Daemon</a> | `pip install "jina[daemon]"` | [Run JinaD](.github/2.0/cookbooks/Daemon.md#run) |
| Full development dependencies | `pip install "jina[devel]"` | `docker run jinaai/jina:latest-devel` |
| Pre-release <br>(all tags above can be added) | `pip install --pre jina` | `docker run jinaai/jina:master` |
Version identifiers [are explained here](https://github.com/jina-ai/jina/blob/master/RELEASE.md). Jina can run
on [Windows Subsystem for Linux](https://docs.microsoft.com/en-us/windows/wsl/install-win10). We welcome the community
to help us with [native Windows support](https://github.com/jina-ai/jina/issues/1252).
</details>
## Get Started
Document, Executor, and Flow are the three fundamental concepts in Jina.
- [📗 **Document**](.github/2.0/cookbooks/Document.md) is the basic data type in Jina;
- [⚙️ **Executor**](.github/2.0/cookbooks/Executor.md) is how Jina processes Documents;
- [🔀 **Flow**](.github/2.0/cookbooks/Flow.md) is how Jina streamlines and distributes Executors.

1️⃣ Copy-paste the minimal example below and run it:

<sup>💡 Preliminaries: <a href="https://en.wikipedia.org/wiki/Word_embedding">character embedding</a>, <a href="https://computersciencewiki.org/index.php/Max-pooling_/_Pooling">pooling</a>, <a href="https://en.wikipedia.org/wiki/Euclidean_distance">Euclidean distance</a></sup>
<img src="https://github.com/jina-ai/jina/blob/master/.github/2.0/simple-arch.svg" alt="The architecture of a simple neural search system powered by Jina">
<!-- README-SERVER-START -->
```python
import numpy as np
from jina import Document, DocumentArray, Executor, Flow, requests


class CharEmbed(Executor):  # a simple character embedding with mean-pooling
    offset = 32  # ASCII code of the space character, the first printable char
    dim = 127 - offset + 1  # last pos reserved for `UNK`
    char_embd = np.eye(dim) * 1  # one-hot embedding for all chars

    @requests
    def foo(self, docs: DocumentArray, **kwargs):
        for d in docs:
            r_emb = [ord(c) - self.offset if self.offset <= ord(c) <= 127 else (self.dim - 1) for c in d.text]
            d.embedding = self.char_embd[r_emb, :].mean(axis=0)  # average pooling


class Indexer(Executor):
    _docs = DocumentArray()  # for storing all documents in memory

    @requests(on='/index')
    def foo(self, docs: DocumentArray, **kwargs):
        self._docs.extend(docs)  # extend stored `docs`

    @requests(on='/search')
    def bar(self, docs: DocumentArray, **kwargs):
        docs.match(self._docs, metric='euclidean', limit=20)


f = Flow(port_expose=12345, protocol='http', cors=True).add(uses=CharEmbed, parallel=2).add(uses=Indexer)  # build a Flow, with 2 parallel CharEmbed, though unnecessary
with f:
    f.post('/index', (Document(text=t.strip()) for t in open(__file__) if t.strip()))  # index all lines of _this_ file
    f.block()  # block for listening request
```
<!-- README-SERVER-END -->
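
To make the embedding step more concrete, here is a dependency-free sketch of what `CharEmbed` computes for a single string. Averaging one-hot character vectors is the same as building a normalized character histogram, which is what the sketch does. The names `char_embed`, `OFFSET` and `DIM` are illustrative and not part of Jina, and returning an all-zero vector for empty text is an assumption of this sketch rather than Jina's behavior.

```python
from collections import Counter

OFFSET = 32             # first code point with its own slot (ASCII space)
DIM = 127 - OFFSET + 1  # 96 slots; out-of-range characters fall into the last one


def char_embed(text):
    """Mean-pool one-hot character vectors, i.e. a normalized character histogram."""
    if not text:
        return [0.0] * DIM  # assumption of this sketch: empty text maps to zeros
    counts = Counter(
        ord(c) - OFFSET if OFFSET <= ord(c) <= 127 else DIM - 1 for c in text
    )
    return [counts.get(i, 0) / len(text) for i in range(DIM)]


vec = char_embed("aab")
# two thirds of the characters are `a`, one third is `b`
print(len(vec), vec[ord("a") - OFFSET], vec[ord("b") - OFFSET])
```

Because the embedding only counts characters, it ignores their order: `"abc"` and `"cba"` get the same vector. That is good enough for a toy demo, but a real system would plug in a learned encoder instead.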
2️⃣ Open `http://localhost:12345/docs` (an extended Swagger UI) in your browser, click the <kbd>/search</kbd> tab, and input:
```json
{"data": [{"text": "@requests(on=something)"}]}
```
That is, **we want to find the lines from the above code snippet that are most similar to `@requests(on=something)`.** Now click the <kbd>Execute</kbd> button!
<p align="center">
<img src="https://github.com/jina-ai/jina/blob/master/.github/swagger-ui-prettyprint1.gif?raw=true" alt="Jina Swagger UI extension on visualizing neural search results" width="85%">
</p>
3️⃣ Not a GUI person? Let's do it in Python then! Keep the above server running and start a simple client:
<!-- README-CLIENT-START -->
```python
from jina import Client, Document
from jina.types.request import Response


def print_matches(resp: Response):  # the callback function invoked when task is done
    for idx, d in enumerate(resp.docs[0].matches[:3]):  # print top-3 matches
        print(f'[{idx}]{d.scores["euclidean"].value:2f}: "{d.text}"')


c = Client(protocol='http', port=12345)  # connect to localhost:12345
c.post('/search', Document(text='request(on=something)'), on_done=print_matches)
```
<!-- README-CLIENT-END -->
This prints the following results:
```text
Client@1608[S]:connected to the gateway at localhost:12345!
[0]0.168526: "@requests(on='/index')"
[1]0.181676: "@requests(on='/search')"
[2]0.218218: "from jina import Document, DocumentArray, Executor, Flow, requests"
```
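
If you are curious where these scores come from, the following sketch re-creates the ranking in plain Python, without Jina: it embeds each line as a character histogram and sorts by Euclidean distance to the query. The helpers `char_embed` and `top_matches` are hypothetical names for illustration only, and the three candidate lines are a hand-picked subset of the server script, not the full index. Under these assumptions, the distance between the query and `@requests(on='/index')` works out to roughly 0.1685, in line with the first score printed above.

```python
import math
from collections import Counter

OFFSET, DIM = 32, 96  # same constants as in the CharEmbed example


def char_embed(text):
    """Normalized character histogram (mean of one-hot character vectors)."""
    counts = Counter(
        ord(c) - OFFSET if OFFSET <= ord(c) <= 127 else DIM - 1 for c in text
    )
    return [counts.get(i, 0) / len(text) for i in range(DIM)]


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def top_matches(query, lines, limit=3):
    """Return (distance, line) pairs, closest first (brute-force search)."""
    q = char_embed(query)
    scored = [(euclidean(q, char_embed(line)), line) for line in lines]
    return sorted(scored)[:limit]


lines = ["@requests(on='/index')", "import numpy as np", "f.block()"]
for rank, (dist, line) in enumerate(top_matches("request(on=something)", lines)):
    print(f"[{rank}]{dist:.6f}: {line!r}")
```

This brute-force scan compares the query against every stored line, which mirrors the in-memory `Indexer` above in spirit; it is a sketch of the idea, not of how `DocumentArray.match` is implemented internally.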
<sup>🐞 Doesn't work? Our bad! <a href="https://github.com/jina-ai/jina/issues/new?assignees=&labels=kind%2Fbug&template=---found-a-bug-and-i-solved-it.md&title="