How to Generate Embeddings from a Server and Index Them Using FAISS, with an API

Introduction

In this blog post, we will demonstrate how to set up a simple server that generates embeddings with SentenceTransformer and then index these embeddings using the FAISS library. We will also show you how to build two API endpoints: one for adding documents to the FAISS index and one for searching it.

Setting Up the Embedding Server

First, we need to set up a server that generates embeddings for the input text. For this purpose, we will use the Flask framework and the SentenceTransformer library.

1. Install the required libraries:

pip install Flask sentence-transformers faiss-cpu requests numpy

2. Create a new file called embedding_server.py and paste the following code:

from flask import Flask, request
from sentence_transformers import SentenceTransformer

app = Flask(__name__)

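# Load the model once at startup; set device to 'cuda' to use a GPU.
device = 'cpu'
model = SentenceTransformer('all-MiniLM-L6-v2', device=device)

@app.route('/embedding', methods=['POST'])
def generate_embedding():
    # Expects a JSON body of the form {"query": "<text>"}.
    query = request.json['query']
    xc = model.encode(query)
    # encode() returns a NumPy array; convert it to a list so it can be serialized as JSON.
    return {'embedding': xc.tolist()}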

if __name__ == '__main__':
    app.run(port=8001)

This script creates a Flask server that listens for incoming POST requests on the /embedding route. When it receives a request, it uses the SentenceTransformer model to generate embeddings for the input query and returns them as a JSON response.
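To make the request and response shapes concrete, here is a minimal client sketch that uses only the standard library. The helper names build_embedding_request, parse_embedding_response, and fetch_embedding are illustrative and not part of the original scripts, and the URL assumes the server runs locally on port 8001.

```python
import json
import urllib.request

EMBEDDING_URL = "http://localhost:8001/embedding"  # assumed local address of the server


def build_embedding_request(text, url=EMBEDDING_URL):
    """Build a POST request whose JSON body matches what /embedding expects."""
    body = json.dumps({"query": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embedding_response(raw_body):
    """Extract the embedding (a list of floats) from the server's JSON reply."""
    payload = json.loads(raw_body)
    return [float(x) for x in payload["embedding"]]


def fetch_embedding(text):
    """Send the text to the running server and return its embedding."""
    with urllib.request.urlopen(build_embedding_request(text)) as resp:
        return parse_embedding_response(resp.read())
```

With the server running, fetch_embedding("hello world") should return a list of 384 floats, since all-MiniLM-L6-v2 produces 384-dimensional vectors.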

3. Run the server:

python embedding_server.py

Your embedding server is now up and running on port 8001.

Indexing Embeddings with FAISS

Now that we have our embedding server running, let's index the embeddings using FAISS.

Create a new file called index_embeddings.py and paste the following code:

import numpy as np
import faiss
import requests


index = faiss.IndexFlatL2(384)  # 384 is the embedding dimension of all-MiniLM-L6-v2
index = faiss.IndexIDMap(index)

index_file = "faiss_index_file.idx"

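def get_embeddings(query):
    # Ask the embedding server for the vector representing the query text.
    url = 'http://localhost:8001/embedding'
    payload = {'query': query}
    # json= serializes the payload and sets the Content-Type header automatically.
    response = requests.post(url, json=payload)
    response.raise_for_status()  # fail early if the embedding server returned an error
    return response.json()['embedding']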

def add_doc_with_id_to_faiss(document, docid):
    embedding = get_embeddings(document)
    # FAISS expects a 2-D float32 array of vectors and a 1-D int64 array of IDs.
    embeddings = np.array([embedding]).astype('float32')
    ids = np.array([docid], dtype='int64')  # docid must be an integer
    index.add_with_ids(embeddings, ids)

    # Save the FAISS index to a file (optional)
    faiss.write_index(index, index_file)

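This script creates a FAISS index and defines two functions. get_embeddings() retrieves an embedding from the embedding server, while add_doc_with_id_to_faiss() embeds a document and adds the resulting vector to the index under the given integer ID. IndexFlatL2 performs exact (brute-force) search by L2 distance, and the IndexIDMap wrapper lets us attach our own IDs to the vectors. Writing the index to disk after each insertion is optional; a saved file can be loaded again with faiss.read_index().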
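To illustrate what an IndexFlatL2 wrapped in an IndexIDMap computes, here is a toy pure-Python stand-in. The class ToyFlatL2Index is hypothetical and written only for explanation; it is not how FAISS is implemented and is far slower. It compares the query against every stored vector, ranks by squared L2 distance, and reports the caller-supplied IDs.

```python
class ToyFlatL2Index:
    """Illustrative stand-in for IndexIDMap(IndexFlatL2(dim)); not FAISS itself."""

    def __init__(self, dim):
        self.dim = dim
        self.vectors = []  # stored embeddings
        self.ids = []      # caller-supplied ID for each stored embedding

    def add_with_ids(self, vectors, ids):
        for vec, doc_id in zip(vectors, ids):
            if len(vec) != self.dim:
                raise ValueError(f"expected {self.dim} dimensions, got {len(vec)}")
            self.vectors.append(list(vec))
            self.ids.append(int(doc_id))

    def search(self, query, k):
        """Return (squared distances, ids) of the k nearest stored vectors."""
        scored = sorted(
            (sum((a - b) ** 2 for a, b in zip(vec, query)), doc_id)
            for vec, doc_id in zip(self.vectors, self.ids)
        )[:k]
        distances = [d for d, _ in scored]
        ids = [i for _, i in scored]
        # Pad the IDs with -1 when fewer than k vectors are stored, as FAISS does.
        # Padding the distances with infinity is a choice made for this toy only.
        missing = k - len(ids)
        return distances + [float("inf")] * missing, ids + [-1] * missing


toy = ToyFlatL2Index(dim=2)
toy.add_with_ids([[0.0, 0.0], [3.0, 4.0]], [101, 202])
distances, ids = toy.search([0.0, 1.0], k=3)
print(distances, ids)  # [1.0, 18.0, inf] [101, 202, -1]
```

The -1 in the last position shows why callers of a search with a fixed k should be prepared for placeholder IDs while the index is still small.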

Building APIs for Searching and Adding Documents

Now, let's create APIs for searching and adding documents to the FAISS index. Append the following code to the index_embeddings.py file:

import flask
from flask import request

app = flask.Flask(__name__)

@app.route('/search', methods=['GET'])
def search():
    query = request.json['query']
    embedding = np.array([get_embeddings(query)]).astype('float32')
    # Retrieve the 10 nearest neighbours; FAISS pads the IDs with -1 if fewer are indexed.
    distances, I = index.search(embedding, 10)
    ids = I.tolist()
    response = {'ids': ids[0]}
    # Return the results
    return flask.jsonify(response)

@app.route('/add-to-faiss', methods=['POST'])
def add_doc_with_id_to_faiss_api():
    document = request.json['comment']
    upostid = request.json['upostid']
    add_doc_with_id_to_faiss(document, upostid)
    return {'status': "indexed"}

if __name__ == '__main__':
    app.run(port=8002)

This code adds two API endpoints to the Flask application:

/search: A GET endpoint that accepts a JSON body with a query field, generates the query's embedding, and searches the FAISS index for the 10 nearest documents. It returns their IDs as a JSON response. Because some HTTP clients and proxies drop the body of a GET request, switching this route to POST is a common alternative.
/add-to-faiss: A POST endpoint that accepts a JSON body with the document text (comment) and its integer ID (upostid). It generates the document's embedding and adds it to the FAISS index under that ID.

Now you have two APIs to interact with the FAISS index.
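As a rough sketch of how a client might talk to these endpoints, the helpers below build the two requests with the standard library and clean up the search result. The names json_request, add_request, search_request, valid_ids, and send are illustrative and not part of the original scripts, and the base URL assumes the API runs locally on port 8002.

```python
import json
import urllib.request

INDEX_API = "http://localhost:8002"  # assumed local address of the index API


def json_request(method, path, payload):
    """Build a request with a JSON body for the index API."""
    return urllib.request.Request(
        INDEX_API + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


def add_request(comment, upostid):
    # Field names match what the /add-to-faiss handler reads.
    return json_request("POST", "/add-to-faiss", {"comment": comment, "upostid": upostid})


def search_request(query):
    # /search is registered for GET, so the JSON body is sent with a GET request.
    return json_request("GET", "/search", {"query": query})


def valid_ids(search_response):
    """Drop the -1 placeholders FAISS returns when fewer than 10 vectors are indexed."""
    return [i for i in search_response["ids"] if i != -1]


def send(req):
    """Send a prepared request to the running API and decode the JSON reply."""
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

With both servers running, send(add_request("great article", 42)) should return the status message from the add endpoint, and valid_ids(send(search_request("article"))) should return the IDs of the nearest indexed documents.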

Conclusion

In this blog post, we showed you how to set up an embedding server using Flask and SentenceTransformer, index embeddings with FAISS, and create APIs for searching and adding documents to the index. With these tools, you can build a simple semantic search service over a collection of text data. For very large collections, FAISS also offers approximate index types that trade some accuracy for speed.
