alrope committed 29 days ago

Commit 08706da (verified)

1 Parent(s): 472397d

Update src/md.py

Files changed (1)

src/md.py

@@ -4,7 +4,7 @@ import pytz
 ABOUT_TEXT = """
 ## Overview
 HREF is evaluation benchmark that evaluates language models' capacity of following human instructions. It is consisted of 4,258 instructions covering 11 distinct categories, including Brainstorm ,Open QA ,Closed QA ,Extract ,Generation ,Rewrite ,Summarize ,Coding ,Classify ,Fact Checking or Attributed QA ,Multi-Document Synthesis , and Reasoning Over Numerical Data.
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/64dff1ddb5cc372803af964d/dSv3U11h936t_q-aiqbkV.png)
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/64dff1ddb5cc372803af964d/0TK6xku0gdJPDs_nfwzns.png)
 ## Generation Configuration
 For reproductability, we use greedy decoding for all model generation as default. We apply chat templates to the instructions if they are implemented in model's tokenizer or explicity recommanded by the model's creators. Please contact us if you would like to change this default configuration.
@@ -30,6 +30,6 @@ pacific_tz = pytz.timezone('America/Los_Angeles')
 current_time = datetime.now(pacific_tz).strftime("%H:%M %Z, %d %b %Y")
 TOP_TEXT = f"""# HREF: Human Reference Guided Evaluation for Instructiong Following
-[Code]() | [Validation Set]() | [Human Agreement Set]() | [Results]() | [Paper]() | Total models: {{}} | Last restart (PST): {current_time}
+[Code](https://github.com/allenai/href) | [Validation Set](https://huggingface.co/datasets/allenai/href) | [Human Agreement Set](https://huggingface.co/datasets/allenai/href_preference) | [Results](https://huggingface.co/datasets/allenai/href_results) | [Paper](https://arxiv.org/abs/2412.15524) | Total models: {{}} | Last restart (PST): {current_time}
 """