Image similarity search using SentiSight.ai

2022-04-13

Computer vision encompasses a broad range of artificial intelligence subfields that aim to train computers to see and interpret their surroundings the way humans do. It is one of the first steps in creating an intelligent machine, enabling it to see, observe, and understand the world around us. To succeed, the machine needs large amounts of data to process over and over until it learns to recognize images and the objects within them.

Various subdomains of computer vision technologies include but are not limited to scene reconstruction, object detection, pose estimation, and similarity search.

What is an image similarity search?

As the name suggests, similarity search in computer vision is the process of searching through a dataset to find items similar to the input data. Both supervised and unsupervised learning methods can be used for this process.

There are multiple ways of finding items similar to the provided query image, including but not limited to comparing visual features between images, image classification, and knowledge transfer methods.

Comparing the RGB values of images: This method calculates the RGB values of the input and dataset images and then compares these values to find similarities between them.

Comparing patches of areas between images: This method follows the template matching approach, meaning that the query image serves as a template that is compared to other images. The more matching patches are found between two images, the more similar they are.

Image classification: A straightforward method of finding similar items by using deep learning to classify the query data. The resulting images belong to the same class and therefore share similar qualities. However, since the image is assigned only a class, more specific information, such as the item’s color, shape, or texture, is lost.

Knowledge transfer: A machine learning method that passes the images through a pre-trained convolutional neural network, extracts their features, and judges how similar the images are by comparing these features.
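To make the first of these methods concrete, here is a minimal sketch of comparing images by their RGB values. It is an illustration only, not how SentiSight.ai computes similarity: the function names, the 4-bins-per-channel histogram, and the choice of cosine similarity as the score are all assumptions made for the example, and the "images" are toy lists of (r, g, b) pixels.

```python
import math


def rgb_histogram(pixels, bins=4):
    """Build a normalized color histogram from (r, g, b) tuples with values 0-255."""
    hist = [0.0] * (bins ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each channel value (0-255) to a bin index in the range 0..bins-1.
        r_bin, g_bin, b_bin = r * bins // 256, g * bins // 256, b * bins // 256
        hist[r_bin * bins * bins + g_bin * bins + b_bin] += 1.0
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return [value / count for value in hist]


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy images: mostly red, mostly dark red, and pure blue.
red_image = [(250, 10, 10)] * 16
dark_red_image = [(200, 30, 20)] * 12 + [(20, 20, 20)] * 4
blue_image = [(10, 10, 250)] * 16

query = rgb_histogram(red_image)
score_red = cosine_similarity(query, rgb_histogram(dark_red_image))
score_blue = cosine_similarity(query, rgb_histogram(blue_image))
print(score_red > score_blue)
```

The same scoring function would work on feature vectors extracted by a pre-trained network, which is the idea behind the knowledge transfer method; only the way the vector is produced changes.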

Applications

Within large datasets that usually have no natural order of items, image similarity tackles the problem of finding the items that are closest or most similar to the input data. This approach can be applied in areas such as:

Hierarchical data clustering and analysis: a method seeking to build and analyze a hierarchy of clusters in order to group similar items into clusters.

Image similarity identification: finding the similarity score between several visual files to help detect plagiarism and copyright infringement.

Near-duplicate detection: a search for nearly identical objects in a large dataset, such as performing a deduplication operation on a biometric database to remove the duplicate entries.

Recommendation systems: integration of a visual search engine into e-commerce shops that recommends items matching customers’ tastes to encourage product discovery.
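As an illustration of the near-duplicate use case, the sketch below groups images whose pairwise similarity score meets a threshold. It assumes the scores have already been computed by some similarity method; the function name, the file names, and the 0.9 threshold are hypothetical, and the grouping uses a simple union-find structure rather than any specific product feature.

```python
def group_near_duplicates(scored_pairs, threshold):
    """Group items linked by pairs whose score is at least `threshold`."""
    parent = {}

    def find(item):
        # Return the representative of the group that `item` belongs to.
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]  # path halving
            item = parent[item]
        return item

    for first, second, score in scored_pairs:
        root_a, root_b = find(first), find(second)
        if score >= threshold and root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), []).append(item)
    # Keep only groups that actually contain duplicates.
    return [sorted(members) for members in groups.values() if len(members) > 1]


# Hypothetical (image, image, similarity score) triples.
pairs = [
    ("a.jpg", "b.jpg", 0.98),
    ("b.jpg", "c.jpg", 0.95),
    ("c.jpg", "d.jpg", 0.40),
    ("e.jpg", "f.jpg", 0.92),
]
duplicate_groups = group_near_duplicates(pairs, threshold=0.9)
print(duplicate_groups)
```

In a deduplication workflow, one image from each group would be kept and the rest removed or flagged for review.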

This type of image recognition model can be applied to various visual data formats and is a useful addition to the computer vision field.

It is present in various industries, notably in image recognition for retail, where similarity search is used to recommend items similar to those that users are interested in.

Enterprise relation

When you think about image similarity search, the first thing that comes to mind is Google Search by Image, also known as Google reverse image search. Introduced in 2011, it is a feature that allows users to search with a visual query rather than a written one. The results include either images similar to the given query or a mix of similar images and exact copies.

Moreover, various e-commerce platforms, such as eBay and ASOS, have recently started to implement visual search functionality within their marketplaces. This allows their users to search for items they wish to obtain by providing an image of the product. The search then returns the exact item as well as similar alternatives that are sometimes more affordable.

Evidently, this is an efficient and convenient feature, saving their customers a lot of time and resources. The aim is to narrow down the assortment size while also encouraging users to discover new things they might like.

Image similarity using SentiSight.ai

Using an image similarity tool via our platform is quite easy. Since it does not require any preparation, such as labeling a dataset or training a model, it is suitable for beginners and experts alike. As with most of our models, image similarity can be executed online via the SentiSight.ai web platform, via our REST API server, or offline by downloading the model and setting up the REST API server on your local device.

To run an image similarity search, you first need to upload the dataset into a newly created project. You are not required to label the images for the search.

Image similarity via the web interface

To execute an image similarity search via our online web platform, navigate to the Image Similarity section in the top menu and choose the Search by similarity option. The pop-up window lets you select the search type and change the number of results and the threshold, which determines the minimum similarity score for the matching images. You can also upload the image you wish to find similar items to, check the search history, and download the model itself.

Image similarity can also be executed with any existing image from the dataset by right-clicking on it and navigating to AI tools > Similarity search.

Image similarity via the REST API server

To run an image similarity search via the REST API server, you need your API token and Project ID.

Just like on the web interface, you are able to choose the number of results and the similarity score threshold. Furthermore, if you wish to search a smaller subset of the dataset, you can filter the images prior to the image similarity search by specifying the parameters when formatting the URL for the request.
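A request of this kind could be prepared roughly as follows. This is a sketch under assumptions, not a copy of the official API reference: the endpoint URL, the `X-Auth-token` header name, and the `project`, `limit`, and `threshold` query parameter names are assumed here and should be checked against the SentiSight.ai REST API documentation. The helper name `build_similarity_request` and the placeholder token and project ID are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint; verify it in the SentiSight.ai REST API documentation.
BASE_URL = "https://platform.sentisight.ai/api/similarity"


def build_similarity_request(token, project_id, image_bytes, limit=10, threshold=0.0):
    """Prepare (but do not send) a POST request carrying the query image."""
    # Assumed parameter names for the project, number of results, and threshold.
    query = urlencode({"project": project_id, "limit": limit, "threshold": threshold})
    return Request(
        url=f"{BASE_URL}?{query}",
        data=image_bytes,  # raw bytes of the query image
        method="POST",
        headers={
            "X-Auth-token": token,  # assumed header name for the API token
            "Content-Type": "application/octet-stream",
        },
    )


request = build_similarity_request(
    "YOUR_API_TOKEN", 12345, b"<image bytes>", limit=5, threshold=0.5
)
print(request.get_method(), request.get_full_url())
```

Passing the prepared request to `urllib.request.urlopen` would send it; in real use, `image_bytes` would come from reading the query image file in binary mode.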

More information on how to get the most out of our models can be found in the previous guide about ways to deploy SentiSight.ai models.

Functionalities

Image similarity is a great tool to find similar items to the input data within your dataset and can be successfully used when searching for duplicates. SentiSight.ai offers its users two varieties:

1vN similarity search – used to find similar images to a single image provided during the request. By simply uploading the query image or specifying the image name for the request, you are able to fetch all images similar to it based on their similarity score, sorted by top results.

NvN similarity search – used to find the most similar pairs of images in the dataset. No uploads are required: when you select the NvN type, the whole dataset is divided into similar pairs, each displayed with its similarity score. The pairs are also sorted by top results.
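The difference between the two search types can be sketched in a few lines. The example below is a conceptual illustration, not the platform's implementation: it assumes each image has already been reduced to a feature vector, uses cosine similarity as the score, and the function names, file names, and vectors are hypothetical. The `limit` and `threshold` arguments mirror the number of results and the minimum similarity score described earlier.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def one_vs_n(query, dataset, limit=10, threshold=0.0):
    """1vN: rank every dataset image by its similarity to a single query vector."""
    scored = [(name, cosine(query, vector)) for name, vector in dataset.items()]
    kept = [item for item in scored if item[1] >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)[:limit]


def n_vs_n(dataset, limit=10, threshold=0.0):
    """NvN: rank every pair of dataset images by their mutual similarity."""
    scored = [
        ((a, b), cosine(dataset[a], dataset[b]))
        for a, b in combinations(sorted(dataset), 2)
    ]
    kept = [item for item in scored if item[1] >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)[:limit]


# Hypothetical feature vectors for three images.
dataset = {
    "shoe_1.jpg": [1.0, 0.0, 0.1],
    "shoe_2.jpg": [0.9, 0.1, 0.1],
    "bag_1.jpg": [0.0, 1.0, 0.2],
}

top_matches = one_vs_n([1.0, 0.0, 0.0], dataset, limit=2, threshold=0.5)
top_pairs = n_vs_n(dataset, limit=1)
print([name for name, _ in top_matches], top_pairs[0][0])
```

Note that NvN compares every pair, so the number of comparisons grows quadratically with the dataset size, while 1vN grows only linearly.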

After the search, you can download the top-scoring images or the results in JSON or CSV format.

Conclusion

Similarity search is a part of artificial intelligence-based processes, aiming to find the most similar items to the given data. By implementing image similarity search in our everyday life, we are able to address a variety of problems, such as identification of image similarity to detect plagiarism, implementation of recommendation systems, and the detection of near-duplicate items.

Image similarity search allows search engines and e-commerce applications to engage with their customers and increase their profits by giving customers the option to search by image. You can start building your dataset and solving real-world problems by using the SentiSight.ai image recognition platform today.