Neural search leverages the power of deep neural networks to build search systems. Think of any great search engine you've ever used, like Google: even Google relies on AI and neural networks to make its search effective. As developers, we sometimes want to create intelligent search systems of our own, and that's where neural search comes into the picture. Jina's neural search framework helps you build exactly these kinds of search systems.
By playing around with Jina's examples and getting hands-on: all code is available in open-source repos for you to fork and tweak as needed.
To use Jina you'll need only intermediate-level Python and a working PC or Mac. Previous knowledge of ML and AI is a plus, but not a requirement. We want Jina to be useful even to newcomers in this field, which is why the implementation is kept as simple as possible for developers who want to build neural search systems.
Jina can be installed in three ways: with pip, with conda, or by running the official Docker image.
For more details about installing Jina, visit our Docs.
We recommend installing WSL2 and running Jina from there. Alternatively, you could try a virtual machine such as VirtualBox running a recent Linux distribution.
You may have trouble installing or using Jina on your Mac if you have a newer model with an M1 chip. Please follow the steps written in this blog.
You need to know three basic concepts to build a simple working application: Document, Executor, and Flow.
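The three concepts are Document (wraps one piece of data), Executor (a processing step applied to Documents), and Flow (chains Executors into a pipeline). Their roles can be sketched in plain Python; note this is an illustrative toy model, not Jina's actual API:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Document:
    """Wraps one piece of data to be indexed or searched (here: just text)."""
    text: str = ''

class Executor:
    """A processing step applied to a batch of Documents."""
    def process(self, docs: List[Document]) -> List[Document]:
        return docs

class UppercaseExecutor(Executor):
    """Toy Executor that uppercases every Document's text."""
    def process(self, docs: List[Document]) -> List[Document]:
        for d in docs:
            d.text = d.text.upper()
        return docs

class Flow:
    """Chains Executors into a pipeline and pushes Documents through it."""
    def __init__(self):
        self.executors: List[Executor] = []

    def add(self, executor: Executor) -> 'Flow':
        self.executors.append(executor)
        return self

    def post(self, docs: List[Document]) -> List[Document]:
        for e in self.executors:
            docs = e.process(docs)
        return docs

f = Flow().add(UppercaseExecutor())
out = f.post([Document(text='hello jina')])
```

In real Jina the same shape applies: you define Executors, compose them into a Flow, and send Documents through it.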
You can fully integrate a traditional search system with Jina. As in the DocQA example, the first step is pulling candidates by finding similar passages with a vector index. The second step is to use a more computationally intensive deep learning model to extract the needed answers from those passages.
For the recall in the first step, you can use a vector index, TF-IDF, or BM25, so it is entirely possible to use a traditional inverted index for recall within Jina.
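The two-step scheme can be sketched with plain NumPy. The recall and re-ranking functions below are illustrative stand-ins: a real system would use a vector index (or BM25) for step one and an expensive deep model such as a cross-encoder for step two:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus: 1000 passages embedded into 64-d unit vectors (stand-in for a real index)
passages = [f"passage {i}" for i in range(1000)]
index = rng.normal(size=(1000, 64)).astype(np.float32)
index /= np.linalg.norm(index, axis=1, keepdims=True)

def recall(query_vec, top_k=20):
    """Step 1: cheap candidate retrieval by cosine similarity over the whole index."""
    q = query_vec / np.linalg.norm(query_vec)
    scores = index @ q
    return np.argsort(-scores)[:top_k]

def rerank(query_vec, candidate_ids, top_k=3):
    """Step 2: a (pretend) expensive model re-scores only the recalled candidates."""
    q = query_vec / np.linalg.norm(query_vec)
    fine_scores = index[candidate_ids] @ q   # placeholder for the deep model's score
    order = np.argsort(-fine_scores)[:top_k]
    return [passages[candidate_ids[i]] for i in order]

query = rng.normal(size=64).astype(np.float32)
answers = rerank(query, recall(query))
```

The point of the split is cost: the cheap first stage narrows millions of items down to a handful, so the expensive second stage only ever sees a few candidates.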
The resources required to build a neural search system with Jina depend on the business requirements: data volume, stability requirements, required response time, and so on.
A CPU can cope with smaller data volumes (up to about one million items) for a single data type. A GPU is necessary for applications with larger data volumes, such as retrieving from hundreds of millions of videos with millisecond-level response times.
Of course, some customers use Jina to build search systems for internal company resources, including PDF search: using a text query to find semantically relevant content directly, or to match images inside the PDFs.
You can reduce the size of your indexed data by projecting the embeddings to a smaller dimensionality. For example, a pre-trained ResNet produces features represented as 2048d (if you're using a fully connected layer as an embedding layer); you can further encode these into a smaller dimensionality, such as 512. You can achieve this with Finetuner by attaching a multi-layer perceptron on top of your embedding model. For instance, in this tutorial we attached a SimpleMLP on top of the embedding model, and the final embedding was encoded into 1024d, half the size of the pre-trained embedding. You can do the same with 128, 256, 512, or any other compact representation. This should significantly reduce your embedding size.
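The idea can be sketched with a single linear projection in NumPy. In Finetuner the projection (an MLP) would be trained rather than random; the dimensions here match the ResNet example, projecting 2048d down to 512d:

```python
import numpy as np

rng = np.random.default_rng(42)

# Pretend ResNet embeddings: 10 items, 2048 dimensions each
embeddings = rng.normal(size=(10, 2048)).astype(np.float32)

# One linear layer projecting 2048-d -> 512-d; weights here are random,
# whereas Finetuner would learn them on your data
W = rng.normal(size=(2048, 512)).astype(np.float32) / np.sqrt(2048)

compact = embeddings @ W   # shape (10, 512): each stored vector is 4x smaller
```

Storage shrinks in direct proportion to the output dimensionality, which is why 512d vectors cost a quarter of the space of 2048d ones.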
We provide three ways of using pre-built Executors from Jina Hub. The first is to use the Executor class directly, the second is to use it in a Flow via Docker, and the third is to use it in a Flow from its source code. Visit here to see the code snippets and learn the syntax.
Executors can be shared by pushing them to Jina Hub. You can share your Executors either publicly or privately; if neither the --public nor the --private argument is provided, the Executor is public by default. Private Executors are protected by a secret, and only people who have the secret can access them.
Once published to Jina Hub, Executors cannot be deleted or removed, since Jina Hub is a shared space. To remove an Executor from public view, we suggest re-pushing it with the --private argument.
This can be caused by a connection issue (e.g., a VPN); reloading should fix it.
Running Jina in a Jupyter notebook is the same as running it in plain Python, so you can use Executors in a notebook the same way you would on your local system.
Check out this blog. After you have performed the necessary steps, we will reach out for delivery addresses. Remember to make your certificates public (via social media) and tag us!
We're working extremely hard to get swag packages delivered to your address in the middle of a pandemic. There may be some delays due to COVID restrictions, and we thank you in advance for your patience as we navigate challenges that come our way due to constantly changing regulations.
Please send an email to [email protected] or reach out on Twitter.
Thanks for showing interest in writing a blog for Jina AI. Nothing excites us more than community contributions. Send a message in Slack with your blog idea and where you think you might publish it; our team will reach out and reply in the thread. Also, if you are writing a blog, please follow our writing style guide to make sure there are no errors or style violations.
Check out our careers page!
Also, we're not JinaAI, Gina, Jina-AI, or any other variation.
We invite you to answer some unanswered questions on our FAQ sheet:
We see you've chosen to take the road less travelled. We're here to help you on your way!
This FAQ Document has been made possible due to the efforts of our community members:
Missing link? Notice a typo? Create an issue! Check out our contributing guidelines to learn how.
We'd appreciate any feedback about your experience with the developer portal, so please check it out and let us know what you think.