amir22010 committed on
Commit d351d70 · verified · 1 Parent(s): 632facd

Update app.py

Files changed (1)
  1. app.py +37 -5
app.py CHANGED
@@ -103,17 +103,49 @@ async def greet(product,description):
     engine: groq
     model: llama-3.2-11b-text-preview

+  - type: llama_guard
+    engine: groq
+    model: llama-guard-3-8b
+
 rails:
   input:
     flows:
-      - self check input
-      - check blocked terms
+      - content safety check input $model=main

   output:
     flows:
-      - self check output
-      - check blocked terms
-      - self check facts
+      - content safety check output $model=llama_guard
+
+prompts:
+  - task: content_safety_check_input $model=main
+    content: |
+      You are a policy expert trying to help determine whether a user prompt is in
+      violation of the defined safety policies.
+
+      <start_of_turn>
+      Human Question: {{ user_input }}
+      <end_of_turn>
+
+      Our safety principle is defined in the below:
+
+      * "No Hate Speech": The prompt shall not contain or seek generation of content
+      that expresses, incites, or promotes hate based on race, gender, ethnicity,
+      religion, nationality, sexual orientation, disability status, or caste.
+      * "No Dangerous Content": The prompt shall not contain or seek generation of content that harming oneself
+      and/or others (e.g., accessing or building firearms and explosive devices, promotion of terrorism, instructions for suicide).
+      * "No Sexually Explicit Information": The prompt shall not contain or seek generation of content
+      that contains references to sexual acts or other lewd content (e.g., sexually graphic descriptions,
+      content aimed at causing arousal). Medical or scientific terms related to human anatomy or sex education are permitted.
+      * "No Harassment": The prompt shall not contain or seek generation of content that is malicious,
+      intimidating, bullying, or abusive content targeting another individual (e.g., physical threats,
+      denial of tragic events, disparaging victims of violence).
+      ===
+
+      Does the human prompt/question violate the above principle? Your answer must start with
+      'Yes' or 'No'. And then walk through step by step to be sure we answer
+      correctly.
+
+    output_parser: is_content_safe

 streaming: False
 """