AIML-TUDA/VerifiableRewardsForScalableLogicalReasoning
Lukas Helff committed on Jun 25

Commit 1fe4885

1 Parent(s): ac97ee4

make eval config not obligatory

Files changed (1)

VerifiableRewardsForScalableLogicalReasoning.py

@@ -91,7 +91,7 @@ Args:
     references (`list` of `dict`): Each reference should contain:
         - 'validation_program' (`str`): Background knowledge in Prolog syntax
         - 'evaluation_config' (`dict`, optional): Configuration of predicates to use for evaluation.
-        Define: positive_predicate, and negative_predicate
+        Define: positive_predicate, and negative_predicate, the positive one should match the head of the rule to evaluate.
 Returns:
     accuracy (`float`): The proportion of predictions that correctly solve all examples. Value is between 0 and 1.
     partial_score (`float`): Average proportion of correctly classified examples across all predictions. Value is between 0 and 1.
@@ -261,10 +261,7 @@ class VerifiableRewardsForScalableLogicalReasoning(evaluate.Metric):
                 'predictions': datasets.Value('string'),
                 'references': {
                     'validation_program': datasets.Value('string'),
-                    'evaluation_config': {
-                        'positive_predicate': datasets.Value('string'),
-                        'negative_predicate': datasets.Value('string')
-                    }
+                    'evaluation_config': datasets.Value("dict", id=None)
                 },
             }),
             codebase_urls=["https://github.com/AIML-TUDA/SLR-Bench"],
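
For illustration, the reference layout described in the docstring above could be built as in the following sketch. The helper `make_reference`, the toy Prolog facts, and the predicate names `eastbound` and `westbound` are assumptions made for this example and are not part of the commit. The sketch only constructs the dictionaries and does not call the metric itself.

```python
from typing import Optional


def make_reference(
    validation_program: str,
    positive_predicate: Optional[str] = None,
    negative_predicate: Optional[str] = None,
) -> dict:
    """Hypothetical helper: build one entry for the `references` list."""
    reference = {"validation_program": validation_program}
    # 'evaluation_config' is optional after this commit, so the key is only
    # attached when both predicate names are supplied.
    if positive_predicate is not None and negative_predicate is not None:
        reference["evaluation_config"] = {
            # Should match the head of the rule being evaluated.
            "positive_predicate": positive_predicate,
            "negative_predicate": negative_predicate,
        }
    return reference


# Toy background knowledge in Prolog syntax (illustrative only).
program = "eastbound(train0).\nwestbound(train1)."

with_config = make_reference(program, "eastbound", "westbound")
without_config = make_reference(program)

print(sorted(with_config))
print(sorted(without_config))
```

Omitting the key entirely, rather than passing an empty dictionary, mirrors the "optional" wording in the docstring. How the metric behaves when the key is absent is not shown in this diff.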