Avinash250325 committed · Commit e7ca9d6 · verified · 1 Parent(s): 23b0ba7

Update README.md

Files changed (1): README.md +117 -1
README.md CHANGED
@@ -1,3 +1,9 @@
+ ---
+ pipeline_tag: text2text-generation
+ widget:
+ - text: "<extra_id_97>short answer <extra_id_98>easy <extra_id_99> The sun is the center of our solar system."
+ ---
+
  ---
  license: mit
  datasets:
@@ -27,4 +33,114 @@ tags:
  - physics
  language:
  - en
- ---
+ ---
+ # T5-Based Question Generator Model
+
+ This model is a fine-tuned T5 model designed for **automatic question generation** from a given context or passage. It supports several question types, including **short answer**, **multiple choice**, and **true/false**, and lets you control the **difficulty level**: easy, medium, or hard.
+
+ ---
+
+ ## Why is this Project Important?
+
+ Educational tools, tutoring platforms, and self-learning systems need a way to **generate relevant questions** automatically from content. This model fills that gap with a flexible, robust question generation system that combines a **structured prompt format** with a **fine-tuned `t5-base` model**.
+
+ ### Key Features
+
+ - Supports **multiple question types**:
+   - Short answer
+   - Multiple choice
+   - True/false
+
+ - Questions are generated based on:
+   - The **provided context**
+   - The **type of question**
+   - The **difficulty level**
+
+ - Difficulty reflects the **reasoning depth** required (e.g., multi-hop inference for harder questions).
+
+ - Uses a **structured prompt format** with clearly defined tags, making it easy to use or integrate into other systems.
+
+ - Fine-tuned from the `t5-base` model:
+   - Lightweight and fast
+   - Easy to run on CPU
+   - Well suited to customization by teachers or educational platforms
+
+ ### Ideal For
+
+ - Teachers creating quizzes or exam material
+ - EdTech apps generating practice questions
+ - Developers building interactive learning tools
+ - Automated assessment and content enrichment
+
+ ### Bonus: Retrieval-Augmented Generation (RAG)
+
+ A **custom RAG function** is also provided, enabling question generation from larger content sources such as textbooks:
+
+ - Input can be a **subheading** or a **small excerpt** from a textbook.
+ - The model fetches relevant supporting context from the textbook using a retriever.
+ - Questions are generated grounded in the fetched material.
+
+ This extends the model beyond single-passage generation into more dynamic, scalable educational use cases; one possible retrieval step is sketched below.
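+
+ The custom RAG function itself is not reproduced here; the sketch below shows one way such a retrieval step could look, assuming a simple TF-IDF retriever over pre-split textbook chunks (`retrieve_context`, the chunk list, and the query are illustrative, not part of this repository):
+
+ ```python
+ # Illustrative RAG-style retrieval sketch (not the repository's actual
+ # implementation): rank textbook chunks by TF-IDF cosine similarity to
+ # a query, then use the best-matching chunks as generation context.
+ from sklearn.feature_extraction.text import TfidfVectorizer
+ from sklearn.metrics.pairwise import cosine_similarity
+
+ def retrieve_context(query, chunks, top_k=2):
+     """Return the top_k textbook chunks most similar to the query."""
+     vectorizer = TfidfVectorizer()
+     matrix = vectorizer.fit_transform(chunks + [query])  # last row = query
+     scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
+     best = scores.argsort()[::-1][:top_k]
+     return " ".join(chunks[i] for i in best)
+
+ # The retrieved text then becomes the context field of the structured
+ # prompt described in the next section.
+ chunks = [
+     "The sun is the center of our solar system.",
+     "Planets orbit the sun in elliptical paths.",
+ ]
+ context = retrieve_context("the solar system", chunks, top_k=1)
+ ```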
+
+ ---
+
+ ## Prompt Format
+
+ To generate good-quality questions, the model uses a **structured input prompt** with special tokens. This helps the model understand the intent and the expected output type.
+
+ ### Prompt Fields
+
+ - `<extra_id_97>` – followed by the **question type**:
+   - `short answer`, `multiple choice question`, or `true or false question`
+ - `<extra_id_98>` – followed by the **difficulty**:
+   - `easy`, `medium`, or `hard`
+ - `<extra_id_99>` – followed by **[optional answer] context**:
+   - `optional answer` – for targeted question generation; leave it blank otherwise
+   - `context` – the main passage/content from which questions are generated
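+
+ For example, both of the following are valid prompts (the bracketed answer in the second is illustrative):
+
+ ```
+ <extra_id_97>multiple choice question <extra_id_98>medium <extra_id_99> The sun is the center of our solar system.
+ <extra_id_97>short answer <extra_id_98>easy <extra_id_99>[the sun] The sun is the center of our solar system.
+ ```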
+
+ ### Helper Function to Create the Prompt
+
+ To simplify prompt construction, use this Python function:
+
+ ```python
+ def format_prompt(qtype, difficulty, context, answer=""):
+     """Format an input prompt for question generation."""
+     answer_part = f"[{answer}]" if answer else ""
+     return f"<extra_id_97>{qtype} <extra_id_98>{difficulty} <extra_id_99>{answer_part} {context}"
+ ```
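+
+ A quick check of the helper (the answer span here is illustrative):
+
+ ```python
+ # Targeted generation: the optional answer goes in brackets before the context.
+ prompt = format_prompt(
+     "multiple choice question",
+     "medium",
+     "The sun is the center of our solar system.",
+     answer="the sun",
+ )
+ print(prompt)
+ # <extra_id_97>multiple choice question <extra_id_98>medium <extra_id_99>[the sun] The sun is the center of our solar system.
+ ```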
+
+ ---
+
+ ## How to Use the Model
+
+ ```python
+ from transformers import T5Tokenizer, T5ForConditionalGeneration
+
+ # Load the model from the Hugging Face Hub
+ tokenizer = T5Tokenizer.from_pretrained("your-username/t5-question-gen")
+ model = T5ForConditionalGeneration.from_pretrained("your-username/t5-question-gen")
+
+ # Build the structured input prompt
+ def format_prompt(qtype, difficulty, context, answer=""):
+     answer_part = f"[{answer}]" if answer else ""
+     return f"<extra_id_97>{qtype} <extra_id_98>{difficulty} <extra_id_99>{answer_part} {context}"
+
+ context = "The sun is the center of our solar system."
+ prompt = format_prompt("short answer", "easy", context)
+
+ # Tokenize the prompt and generate a question
+ inputs = tokenizer(prompt, return_tensors="pt")
+ outputs = model.generate(**inputs, max_length=150)
+
+ # Decode the generated question
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
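+
+ To get several candidate questions for one prompt, generic `transformers` generation options such as beam search can be used (nothing here is specific to this model):
+
+ ```python
+ # Return three candidate questions via beam search
+ outputs = model.generate(
+     **inputs,
+     max_length=150,
+     num_beams=4,
+     num_return_sequences=3,
+     early_stopping=True,
+ )
+ for i, out in enumerate(outputs):
+     print(f"Q{i + 1}:", tokenizer.decode(out, skip_special_tokens=True))
+ ```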