
This is a version of flan-t5-xl fine-tuned on the KELM Corpus. It takes a sentence as input and outputs subject-relation-object triplets for use in knowledge graph generation.

The model uses custom tokens to delimit triplets:

special_tokens = ['<triplet>', '</triplet>', '<relation>', '<object>']
tokenizer.add_tokens(special_tokens)

You can use it like this:

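import torch

# Assumes `model` and `tokenizer` are already loaded, with the special tokens above added.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
model.eval()

# A plain string: a trailing comma here would turn the input into a tuple.
new_input = "Hugging Face, Inc. is an American company that develops tools for building applications using machine learning."
inputs = tokenizer(new_input, return_tensors="pt").to(device)

with torch.no_grad():
    outputs = model.generate(input_ids=inputs["input_ids"])
    # Keep special tokens so the triplet delimiters stay visible in the output.
    print(tokenizer.batch_decode(outputs, skip_special_tokens=False)[0])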

Output: <pad><triplet> Hugging Face <relation> instance of <object> Business </triplet></s>
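To feed the result into a knowledge graph, the decoded string needs to be split back into its parts. Here is a minimal sketch of one way to do that with a regular expression. The `parse_triplets` helper and `TRIPLET_PATTERN` are hypothetical names, not part of this model or of any library, and the sketch assumes every triplet follows the single-subject, single-relation, single-object layout shown in the output above.

```python
import re

# Hypothetical pattern: assumes each triplet looks like
# "<triplet> subject <relation> relation <object> object </triplet>".
TRIPLET_PATTERN = re.compile(
    r"<triplet>(.*?)<relation>(.*?)<object>(.*?)</triplet>", re.DOTALL
)


def parse_triplets(decoded: str) -> list[tuple[str, str, str]]:
    """Return (subject, relation, object) tuples found in a decoded output string."""
    return [
        tuple(part.strip() for part in match.groups())
        for match in TRIPLET_PATTERN.finditer(decoded)
    ]


example = "<pad><triplet> Hugging Face <relation> instance of <object> Business </triplet></s>"
print(parse_triplets(example))
# → [('Hugging Face', 'instance of', 'Business')]
```

Text outside the `<triplet>` ... `</triplet>` delimiters, such as `<pad>` and `</s>`, is ignored, and malformed output simply yields an empty list.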

This model still isn't perfect, and may make mistakes! I'm working on fine-tuning it for longer and on a more diverse set of data.


Model: bew/t5_sentence_to_triplet_xl, an adapter on the base model google/flan-t5-xl.
