What is vector search? Better search with AI

Originally published on 2021-09-28 10:18:45 by

Suppose you want to implement a music service that behaves like Spotify and finds songs that resemble your favorite song. How would you go about it? One way is to classify each song by a number of characteristics, store those “vectors” in an indexed database, and search the database for the song description vectors that are “near” your favorite. In other words, you could perform a vector similarity search.

What is vector similarity search?

Vector similarity search usually has four components: vector embeddings that capture the key characteristics of the original objects, such as songs, images, or text; a distance metric that represents the “closeness” between vectors; a search algorithm; and a database that holds the vectors and supports vector search using indexes.
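To make those components concrete, here is a minimal sketch of the simplest possible arrangement: a dictionary standing in for the database, cosine distance as the metric, and a brute-force scan as the search algorithm. The song names, the three feature dimensions, and the function names are all hypothetical, chosen only for illustration. A real vector database would replace the full scan with an index.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity; smaller means closer."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def nearest(query, vectors, k=2):
    """Brute-force search: rank every stored vector by distance to the query."""
    ranked = sorted(vectors, key=lambda name: cosine_distance(query, vectors[name]))
    return ranked[:k]

# Hypothetical song embeddings: (tempo, energy, acousticness), scaled to 0..1.
songs = {
    "song_a": [0.9, 0.8, 0.1],
    "song_b": [0.85, 0.75, 0.2],
    "song_c": [0.2, 0.1, 0.9],
}
favorite = [0.88, 0.82, 0.12]

print(nearest(favorite, songs, k=2))
```

The scan compares the query against every stored vector, which is fine for a handful of songs but grows linearly with the catalog. That cost is why the index is listed as a component in its own right.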

What are vector embeddings?

Vector embeddings are essentially feature vectors, as that term is understood in the context of machine learning and deep learning. They can be defined by performing feature engineering manually or by using the output of a model.

For example, text strings can be converted to word embeddings (feature vectors) using neural networks, dimensionality reduction of the word co-occurrence matrix, probabilistic models, explainable knowledge-based methods, or explicit representation in terms of the contexts in which words appear. Common models for training and using word embeddings include word2vec (Google), GloVe (Stanford), ELMo (Allen Institute / University of Washington), BERT (Google), and fastText (Facebook).
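The "explicit representation" approach is the easiest of these to show in a few lines. The sketch below represents each word by counts of the words that appear next to it, which is the raw co-occurrence matrix before any dimensionality reduction. The three-sentence corpus, the window size, and the function name are assumptions made for illustration. Models such as word2vec or GloVe learn dense vectors rather than using raw counts like these.

```python
from collections import Counter

def cooccurrence_vectors(sentences, window=1):
    """Map each word to counts of the words seen within `window` positions of it."""
    vocab = sorted({w for s in sentences for w in s.split()})
    counts = {w: Counter() for w in vocab}
    for s in sentences:
        words = s.split()
        for i, w in enumerate(words):
            lo, hi = max(0, i - window), min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[w][words[j]] += 1
    # One dimension per vocabulary word, in a fixed (sorted) order.
    return vocab, {w: [counts[w][c] for c in vocab] for w in vocab}

# Hypothetical toy corpus.
corpus = ["loud rock song", "quiet folk song", "loud rock anthem"]
vocab, vectors = cooccurrence_vectors(corpus)

print(vocab)
print(vectors["rock"])
```

Each vector has one dimension per vocabulary word, so it grows with the vocabulary. That growth is the motivation for the dimensionality reduction step mentioned above.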

Images are often embedded by capturing the output of a convolutional neural network (CNN) model or a transformer model. These models automatically reduce the dimensionality of the feature vectors by rolling together (“convolving”) patches of pixels into features and downsampling with pooling layers.
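As a rough sketch of those two operations, the code below slides a small kernel over a toy image and then max-pools the result. The 5x5 image and the hand-written edge kernel are hypothetical. In a trained CNN the kernel weights are learned, and there are many kernels and layers rather than one.

```python
def convolve2d(image, kernel):
    """Valid (no padding) 2-D cross-correlation, as CNN layers compute it."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def max_pool(feature_map, size=2):
    """Downsample by keeping the maximum of each non-overlapping size x size block."""
    rows, cols = len(feature_map) // size, len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]

# Toy 5x5 grayscale "image" with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical 2x2 kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1], [-1, 1]]

features = convolve2d(image, kernel)  # 4x4 feature map
pooled = max_pool(features)           # 2x2 map after pooling

print(features)
print(pooled)
```

The 25 input pixels shrink to 16 feature values after the convolution and to 4 after pooling. This is the reduction in dimensionality described above, on a very small scale.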

Product recommendations may be based on embeddings of the words or phrases in the product description, on embeddings of the product images, or on both. Audio embeddings may be based on the Fourier transform of the audio (which gives the spectrum); on descriptions of the composer, genre, artist, tempo, rhythm, and loudness; or on both the spectrum and keywords. The field is evolving rapidly, so I expect new embedding techniques to appear in many application areas.
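For the audio case, the Fourier transform step can be sketched with the standard library alone. The example below computes the magnitude spectrum of a synthetic tone with a naive discrete Fourier transform. The signal, its length, and the function name are assumptions for illustration, and production code would normally use an FFT library over short overlapping windows of real audio.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Naive O(n^2) discrete Fourier transform; returns magnitudes of bins 0..n/2."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]

# Hypothetical input: 64 samples of a pure tone that completes 5 cycles in the window.
n = 64
tone = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]

spectrum = magnitude_spectrum(tone)
# Index of the strongest frequency bin.
peak_bin = max(range(len(spectrum)), key=lambda k: spectrum[k])

print(peak_bin)
```

The spectrum, or a summary of it, can then serve as the feature vector for a clip, alone or alongside the descriptive keywords.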

Copyright © 2021 IDG Communications, Inc.
