---
base_model: unsloth/mistral-7b-instruct-v0.3-bnb-4bit
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
tags:
- bias-detection
- logical-fallacy
- critical-thinking
- rationality
- unsloth
language:
- en
---

# Model Card for Artvv/rationality-debugger-v1.0

This is a QLoRA adapter dedicated to identifying sophisms (logical fallacies) and cognitive biases.

Its current detection performance is 85%-100% for sophisms and 85%-100% for cognitive biases.

It was trained on a custom dataset of 14k lines.

## Model Details

### Model Description

- **Developed by:** Arthur Vigier
- **Model type:** QLoRA
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** mistral-7b-instruct-v0.3-bnb-4bit

## Uses

It is intended for anyone who wants to judge public discourse by the foundations of its language and the soundness of its reasoning.
It is also well suited to education and to strengthening critical thinking.

### API

Public API coming soon.

### Performance chart

### Sophism

[Chart: sophism detection performance]

### Cognitive Bias

[Chart: cognitive bias detection performance]

### Direct Use

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import re


class RationalityDebugger:
    # NOTE: the card metadata lists unsloth/mistral-7b-instruct-v0.3-bnb-4bit as the
    # base model; pass it as base_model if the default below does not match the adapter.
    def __init__(self, base_model="mistralai/Mistral-7B-v0.1", lora_model="Artvv/rationality-debugger-v1.0"):
        """
        Initialize the cognitive bias and logical fallacy detector.

        Args:
            base_model: Base model from Hugging Face
            lora_model: LoRA adapters for rationality analysis
        """
        print(f"Loading base model: {base_model}")
        self.tokenizer = AutoTokenizer.from_pretrained(base_model)

        # Options for optimized loading
        model_kwargs = {
            "torch_dtype": torch.float16,
            "device_map": "auto",
            "low_cpu_mem_usage": True
        }

        # Try first with 4-bit quantization to save memory
        try:
            from transformers import BitsAndBytesConfig
            quantization_config = BitsAndBytesConfig(
                load_in_4bit=True,
                bnb_4bit_compute_dtype=torch.float16,
                bnb_4bit_use_double_quant=True
            )
            model_kwargs["quantization_config"] = quantization_config
            self.base_model = AutoModelForCausalLM.from_pretrained(base_model, **model_kwargs)
        except Exception:
            # Fallback if bitsandbytes is not available. Drop the quantization
            # config so the retry does not fail for the same reason.
            print("4-bit quantization not available, using standard loading...")
            model_kwargs.pop("quantization_config", None)
            self.base_model = AutoModelForCausalLM.from_pretrained(base_model, **model_kwargs)

        print(f"Applying LoRA adapters: {lora_model}")
        self.model = PeftModel.from_pretrained(self.base_model, lora_model)
        self.model.eval()  # Evaluation mode

        self.prompt_template = """
        Analyze the following argument and identify any logical fallacies or cognitive biases:

        {text}

        ###OUTPUT FORMAT
        [Argument] Valid/Invalid
        → If Valid: Type: [ANALYTICAL / INDUCTIVE / ABDUCTIVE]
        [Sophisms] Yes/No
        → If Yes: Which: [List detected fallacies]
        → Extract(s): [Provide exact snippet(s)]
        [Biases] Yes/No
        → If Yes: Which: [List detected biases]
        → Extract(s): [Provide exact snippet(s)]
        [Short explanation]
        """

    def analyze(self, text, max_new_tokens=200, temperature=0.1):
        """
        Analyze text to detect cognitive biases and logical fallacies.

        Args:
            text: Text to analyze
            max_new_tokens: Maximum number of new tokens to generate
            temperature: Temperature for generation (lower = more deterministic)

        Returns:
            dict: Structured analysis result and raw text
        """
        prompt = self.prompt_template.format(text=text)

        inputs = self.tokenizer(prompt, return_tensors="pt").to(self.model.device)

        with torch.no_grad():
            outputs = self.model.generate(
                **inputs,
                max_new_tokens=max_new_tokens,
                temperature=temperature,
                top_p=0.9,
                do_sample=temperature > 0
            )

        # Extract only the generated part (not the prompt)
        generated_text = self.tokenizer.decode(
            outputs[0][inputs.input_ids.shape[1]:],
            skip_special_tokens=True
        )

        # Parse the response to extract the structure
        result = self._parse_response(generated_text)

        return {
            "raw_text": generated_text,
            "structured": result
        }

    def _parse_response(self, text):
        """Parse the model's response to extract structured information"""
        result = {
            "argument_valid": None,
            "argument_type": None,
            "has_sophisms": None,
            "detected_sophisms": [],
            "has_biases": None,
            "detected_biases": [],
            "too_short": False,
            "explanation": ""
        }

        # Simple parsing example - adapt as needed
        text_lower = text.lower()

        # Argument validity detection. Check "invalid" first, because
        # "invalid argument" contains "valid argument" as a substring.
        if "invalid argument" in text_lower or "[argument] invalid" in text_lower:
            result["argument_valid"] = False
        elif "valid argument" in text_lower or "[argument] valid" in text_lower:
            result["argument_valid"] = True

        # Argument type detection
        for arg_type in ["ANALYTICAL", "INDUCTIVE", "ABDUCTIVE"]:
            if arg_type.lower() in text_lower:
                result["argument_type"] = arg_type

        # Fallacy detection
        sophism_keywords = ["ad hominem", "straw man", "red herring", "false dilemma",
                            "slippery slope", "post hoc", "circular reasoning"]
        for sophism in sophism_keywords:
            if sophism in text_lower:
                result["detected_sophisms"].append(sophism)

        result["has_sophisms"] = len(result["detected_sophisms"]) > 0

        # Cognitive bias detection
        bias_keywords = ["confirmation bias", "availability bias", "anchoring bias", "hindsight bias",
                         "halo effect", "dunning-kruger"]
        for bias in bias_keywords:
            if bias in text_lower:
                result["detected_biases"].append(bias)

        result["has_biases"] = len(result["detected_biases"]) > 0

        # Explanation
        explanation_match = re.search(r"\[Short explanation\](.*?)(?=$|\[)", text, re.DOTALL)
        if explanation_match:
            result["explanation"] = explanation_match.group(1).strip()
        else:
            # If no explanation tag, take the whole text
            result["explanation"] = text

        return result


# --- Usage example ---
if __name__ == "__main__":
    # Create the analyzer
    analyzer = RationalityDebugger(
        base_model="mistralai/Mistral-7B-v0.1",
        lora_model="Artvv/rationality-debugger-v1.0"
    )

    # Analysis example
    argument = """
    All birds can fly. Penguins are birds. Therefore, penguins can fly.
    """

    result = analyzer.analyze(argument)

    # Display raw result
    print("\n=== RAW ANALYSIS ===")
    print(result["raw_text"])

    # Display structured result
    print("\n=== STRUCTURED ANALYSIS ===")
    print(f"Valid argument: {result['structured']['argument_valid']}")

    if result["structured"]["detected_sophisms"]:
        print("\nDetected fallacies:")
        for sophism in result["structured"]["detected_sophisms"]:
            print(f"- {sophism}")

    if result["structured"]["detected_biases"]:
        print("\nDetected cognitive biases:")
        for bias in result["structured"]["detected_biases"]:
            print(f"- {bias}")

    print("\nExplanation:")
    print(result["structured"]["explanation"])
```

### Out-of-Scope Use

It is not intended to be used to harass anyone or to be rude.

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

The model is very effective on the most common sophisms and cognitive biases, but it can be less reliable on more niche ones such as the frequency illusion.
It is mainly designed to detect sophisms and cognitive biases. It can also recognize valid reasoning, but that is not its main purpose.

## Model Card Contact

Email: arvigier@gmail.com

### Framework versions

- PEFT 0.14.0

[Unsloth](https://github.com/unslothai/unsloth)
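
### Parsing the output format

The `_parse_response` method in the Direct Use example matches a fixed list of keywords, so any fallacy or bias outside that list is missed. As a rough sketch, and assuming the model actually follows the `###OUTPUT FORMAT` layout from the prompt template, the bracketed sections could be split directly instead. The helper names `parse_sections` and `parse_which` and the `sample` text are hypothetical illustrations, not part of the released code and not real model output.

```python
import re

# Hypothetical helper, not part of the released code: matches the start of
# each bracketed section named in the prompt template.
SECTION_RE = re.compile(
    r"^\[(Argument|Sophisms|Biases|Short explanation)\][ \t]*(.*)$",
    re.MULTILINE,
)


def parse_sections(text):
    """Return {section name: body text} for each bracketed section found."""
    matches = list(SECTION_RE.finditer(text))
    sections = {}
    for i, match in enumerate(matches):
        # A section runs until the next section header (or the end of the text)
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[match.group(1)] = text[match.start(2):end].strip()
    return sections


def parse_which(body):
    """Extract the comma-separated names listed after 'Which:' in a section."""
    match = re.search(r"Which:\s*(.+)", body)
    if not match:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


# Hypothetical response text, written by hand to follow the template
sample = """[Argument] Invalid
[Sophisms] Yes
→ If Yes: Which: hasty generalization, false premise
→ Extract(s): "All birds can fly."
[Biases] No
[Short explanation] The first premise is false, so the conclusion does not follow."""

sections = parse_sections(sample)
print(sections["Argument"])
print(parse_which(sections["Sophisms"]))
print(sections["Short explanation"])
```

If the model drifts from the template, sections will simply be missing from the returned dictionary, so callers should check for each key before using it.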