If you are reading this, you have decided to set out on the exciting journey of exploring neural search with Jina AI.
Will the journey be easy? No guarantees, but we will make it as smooth as possible.
Will it be rewarding? For sure! By the end of this section, you will be equipped with the knowledge to process different data types in Jina AI and use them to build a search solution.
We also have some amazing swag for everyone who completes the Bootcamp. Want to know how to get it? Read more about it here!
Keep in mind that the steps we recommend are an ideal progression designed for ease, not a strict order that you must always follow. Feel free to skip any parts you are already familiar with!
Jina AI equips developers with the tools to build sophisticated end-to-end search applications that can be easily scaled and deployed in the cloud.
Check out this video to understand how the different products in the Jina ecosystem connect to provide a holistic search experience:
DocArray is a data structure for unstructured data. It can accommodate all kinds of data, including text, images, audio, and video. You can get started with DocArray right away without any prerequisites, because it is designed to feel intuitive in Python.
To get started with DocArray, you can install it with the following command:
pip install docarray
To install DocArray with all the external dependencies, use the following command:
pip install "docarray[full]"
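After running either command, it can help to confirm that the package is visible to your Python environment, since the API shown in this section can differ between DocArray releases. The snippet below is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and not part of DocArray.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Look up the distribution name used in the pip commands above.
version = installed_version('docarray')
if version is None:
    print('docarray is not installed; run: pip install docarray')
else:
    print(f'docarray {version} is installed')
```

The check reads package metadata rather than importing the library, so it works even if an optional dependency is missing.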
DocArray also provides a safe and secure environment for collaborating with others. It lets you share a DocumentArray using the push/pull feature over the internet. You can push a pre-processed DocumentArray with a unique ID, and your colleague sitting in a different part of the world can just pull and use it. To understand more about this feature, check out the blog and documentation.
In the section above, we learned that Documents are the primary data type and can hold any kind of data, such as text, images, audio, video, tables, and 3D meshes. We have designed a few tutorials to help you learn how to manipulate different data types in DocArray:
A multimodal document consists of a mixture of data modalities such as text, image, audio, and video. For example, the article card from The Washington Post in the figure below can be viewed as a multimodal document, as it consists of a sentence, an image, and some tags (i.e. author, column section).
DocArray provides dataclass as a high-level API for representing a multimodal document using a nested Document structure. It follows the design of the standard Python dataclass, allowing users to represent a complicated multimodal document intuitively via the Document/DocumentArray API.
DocArray provides a decorator @dataclass and a set of multimodal types in docarray.typing, which allow the article card above to be represented by the following code snippet:
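from docarray import dataclass, Document
from docarray.typing import Image, Text, JSON

@dataclass
class WPArticle:
    banner: Image  # the article image
    headline: Text  # the headline sentence
    meta: JSON  # tags such as author and column

a = WPArticle(
    banner='dog-cat-flight.png',  # placeholder image path
    headline='Everything to know about flying with pets, from picking your seats to keeping your animal calm',
    meta={'author': 'Nathan Diller', 'column': 'By the Way - A Post Travel Destination'},
)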
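Because DocArray's @dataclass follows the design of the standard Python dataclass, the shape of the same article card can be sketched with the standard library alone. This is only an analogy, not DocArray's API: the class `ArticleCard`, its plain `str` and `dict` field types, and the file name `banner.png` are hypothetical stand-ins. In DocArray, each field would be represented through the nested Document structure instead.

```python
from dataclasses import dataclass, field, asdict
from typing import Dict


@dataclass
class ArticleCard:
    """Hypothetical stand-in for a multimodal article card (not a DocArray type)."""
    banner: str  # path or URL of the image
    headline: str  # the sentence on the card
    meta: Dict[str, str] = field(default_factory=dict)  # tags such as author and column


card = ArticleCard(
    banner='banner.png',  # placeholder file name, assumed for illustration
    headline='Everything to know about flying with pets, from picking your seats to keeping your animal calm',
    meta={'author': 'Nathan Diller', 'column': 'By the Way - A Post Travel Destination'},
)

# Each field corresponds to one modality of the card.
for name, value in asdict(card).items():
    print(f'{name}: {value!r}')
```

Declaring one typed field per modality is what keeps the definition readable: the class body doubles as a description of what the document contains.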
Before moving on to build real-world applications with Jina, we recommend you join our community channels. That way, even if you get stuck, nothing will stop you!
Neural search is a vast and exciting topic with huge potential. Can you think of any project idea or use case that might use neural search?
If YES, open a GitHub issue here and stand a chance to win our exclusive swag!
You have now learned the basic concepts of Jina and been introduced to the exciting world of neural search.
Take this quiz to continue your journey towards building future-proof search solutions and earn an exclusive beginner-level certificate.
We’d appreciate any feedback on your experience with the developer portal. Please check it out and let us know what you think.