Reranker model

Brief information

This repository contains the reranker model bge-reranker-v2-m3, which you can run on HuggingFace Inference Endpoints.

For more details, please refer to the repository of the base model.

Supported architectures

  • Apple Silicon MPS
  • Nvidia GPU
  • HuggingFace Inference Endpoints (AWS)
    • CPU (Intel Sapphire Rapids, 4 vCPU, 8 GB)
    • GPU (Nvidia T4)
    • AWS Inferentia 2 (2 cores, 32 GB RAM)
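For local runs, code typically needs to pick one of these backends at startup. A minimal sketch, assuming PyTorch is the runtime; the helper name `pick_device` is illustrative and not part of this repository:

```python
# Minimal sketch: choose the best available torch backend.
# `pick_device` is an illustrative helper, not part of this repository.
try:
    import torch
except ImportError:  # torch not installed: fall back to CPU
    torch = None

def pick_device() -> str:
    if torch is not None and torch.cuda.is_available():
        return "cuda"  # Nvidia GPU
    mps = getattr(getattr(torch, "backends", None), "mps", None)
    if mps is not None and mps.is_available():
        return "mps"   # Apple Silicon
    return "cpu"

print(pick_device())
```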

Example usage

HuggingFace Inference Endpoints

โš ๏ธ When you will deploy this model in HuggingFace Inference endpoints plese select Settings -> Advanced settings -> Task: Sentence Similarity

curl "https://xxxxxxx.us-east-1.aws.endpoints.huggingface.cloud" \
-X POST \
-H "Accept: application/json" \
-H "Authorization: Bearer hf_yyyyyyy" \
-H "Content-Type: application/json" \
-d '{
  "inputs": {
    "source_sentence": "Hello, world!",
    "sentences": [
      "Hello! How are you?",
      "Cats and dogs",
      "The sky is blue"
      ]
  },
  "normalize": true
}'
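The same request can be issued from Python with the standard library; this is a sketch that only builds the request, since the endpoint URL and token above are placeholders:

```python
# Build the same request as the curl example above using only the
# standard library. API_URL and the token are placeholders.
import json
from urllib.request import Request

API_URL = "https://xxxxxxx.us-east-1.aws.endpoints.huggingface.cloud"

payload = {
    "inputs": {
        "source_sentence": "Hello, world!",
        "sentences": [
            "Hello! How are you?",
            "Cats and dogs",
            "The sky is blue",
        ],
    },
    "normalize": True,
}

req = Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Accept": "application/json",
        "Authorization": "Bearer hf_yyyyyyy",  # placeholder token
        "Content-Type": "application/json",
    },
    method="POST",
)
# To send it once real credentials are in place:
# from urllib.request import urlopen
# scores = json.load(urlopen(req))
```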

Local inference

from pydantic import BaseModel
from FlagEmbedding import FlagReranker

class RerankRequest(BaseModel):
    query: str
    documents: list[str]

request = RerankRequest(
    query="Hello, world!",
    documents=["Hello! How are you?", "Cats and dogs", "The sky is blue"],
)

# Build [query, document] pairs for the reranker
pairs = [[request.query, doc] for doc in request.documents]

# Inference: normalize=True maps raw logits to [0, 1] via sigmoid
reranker = FlagReranker('netandreus/bge-reranker-v2-m3', use_fp16=True)
scores = reranker.compute_score(pairs, normalize=True)
if not isinstance(scores, list):
    scores = [scores]
print(scores)  # one score per document, higher = more relevant
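A common next step is to order the documents by their scores. A pure-Python sketch; the score values here are illustrative, not real model output:

```python
# Rank documents by reranker score, highest first.
# Scores below are illustrative placeholders, not real model output.
documents = ["Hello! How are you?", "Cats and dogs", "The sky is blue"]
scores = [0.98, 0.03, 0.05]

ranked = sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)
for doc, score in ranked:
    print(f"{score:.2f}  {doc}")
```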