Reranker model

Brief information

This repository contains the reranker model bge-reranker-v2-m3, which you can run on HuggingFace Inference Endpoints.

For more details, please refer to the repository of the base model.

Supported architectures

  • Apple Silicon MPS
  • Nvidia GPU
  • HuggingFace Inference Endpoints (AWS)
    • CPU (Intel Sapphire Rapids, 4 vCPU, 8 GB)
    • GPU (Nvidia T4)
    • AWS Inferentia 2 (2 cores, 32 GB RAM)
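For local runs, code typically needs to pick one of these backends at startup. A minimal sketch, assuming PyTorch is the runtime; the helper name `pick_device` is illustrative and not part of this repository:

```python
# Minimal sketch: choose the best available torch backend.
# `pick_device` is an illustrative helper, not part of this repository.
try:
    import torch
except ImportError:  # torch not installed: fall back to CPU
    torch = None

def pick_device() -> str:
    if torch is not None and torch.cuda.is_available():
        return "cuda"  # Nvidia GPU
    mps = getattr(getattr(torch, "backends", None), "mps", None)
    if mps is not None and mps.is_available():
        return "mps"   # Apple Silicon
    return "cpu"

print(pick_device())
```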

Example usage

HuggingFace Inference Endpoints

โš ๏ธ When you will deploy this model in HuggingFace Inference endpoints plese select Settings -> Advanced settings -> Task: Sentence Similarity

curl "https://xxxxxxx.us-east-1.aws.endpoints.huggingface.cloud" \
-X POST \
-H "Accept: application/json" \
-H "Authorization: Bearer hf_yyyyyyy" \
-H "Content-Type: application/json" \
-d '{
  "inputs": {
    "source_sentence": "Hello, world!",
    "sentences": [
      "Hello! How are you?",
      "Cats and dogs",
      "The sky is blue"
      ]
  },
  "normalize": true
}'
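The same request can be issued from Python with the standard library; this is a sketch that only builds the request, since the endpoint URL and token above are placeholders:

```python
# Build the same request as the curl example above using only the
# standard library. API_URL and the token are placeholders.
import json
from urllib.request import Request

API_URL = "https://xxxxxxx.us-east-1.aws.endpoints.huggingface.cloud"

payload = {
    "inputs": {
        "source_sentence": "Hello, world!",
        "sentences": [
            "Hello! How are you?",
            "Cats and dogs",
            "The sky is blue",
        ],
    },
    "normalize": True,
}

req = Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Accept": "application/json",
        "Authorization": "Bearer hf_yyyyyyy",  # placeholder token
        "Content-Type": "application/json",
    },
    method="POST",
)
# To send it once real credentials are in place:
# from urllib.request import urlopen
# scores = json.load(urlopen(req))
```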

Local inference

from pydantic import BaseModel
from FlagEmbedding import FlagReranker

class RerankRequest(BaseModel):
    query: str
    documents: list[str]

request = RerankRequest(
    query="Hello, world!",
    documents=["Hello! How are you?", "Cats and dogs", "The sky is blue"],
)

# Build [query, document] pairs for the reranker
pairs = [[request.query, doc] for doc in request.documents]

# Inference: normalize=True maps raw logits to [0, 1] via sigmoid
reranker = FlagReranker('netandreus/bge-reranker-v2-m3', use_fp16=True)
scores = reranker.compute_score(pairs, normalize=True)
if not isinstance(scores, list):
    scores = [scores]
print(scores)  # one score per document, higher = more relevant
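A common next step is to order the documents by their scores. A pure-Python sketch; the score values here are illustrative, not real model output:

```python
# Rank documents by reranker score, highest first.
# Scores below are illustrative placeholders, not real model output.
documents = ["Hello! How are you?", "Cats and dogs", "The sky is blue"]
scores = [0.98, 0.03, 0.05]

ranked = sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)
for doc, score in ranked:
    print(f"{score:.2f}  {doc}")
```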