# KwaiCoder-AutoThink-preview

KwaiCoder-AutoThink-preview is the first public AutoThink LLM released by the Kwaipilot team at Kuaishou.
The model merges thinking and non‑thinking abilities into a single checkpoint and dynamically adjusts its reasoning depth based on the input’s difficulty.
## ✨ Key Highlights
| Feature | What it means | Benefit |
|---|---|---|
| Auto Think | Diverse pre-think data teaches the model to predict task difficulty | Better choice of when to think |
| Step-SRPO | Token-wise GRPO variant with process-level rewards | More stable RL; higher accuracy in both "think" and "no-think" modes |
| Agentic Data | Automated CoT cold-start data generation | Stronger inference models before reinforcement learning |
| KD + MTP | One-teacher, multi-token-prediction distillation | Less than 1/30 of the pre-training cost |
## 📊 Evaluation Results
## 🔧 Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "Kwaipilot/KwaiCoder-AutoThink-preview"

# Load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# Prepare the model input
prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate the completion (sampling must be enabled for
# temperature/top_p to take effect)
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
content = tokenizer.decode(output_ids, skip_special_tokens=True).strip("\n")

print("prompt:\n", prompt)
print("content:\n", content)

"""
prompt:
Give me a short introduction to large language model.
content:
<judge>
This is a definitional query seeking a basic explanation, which can be answered with straightforward factual recall or a concise summary. Requires think-off mode.
</judge>
<think off>
Large Language Models (LLMs) are advanced artificial intelligence systems designed to understand and generate human-like text. They are trained on vast amounts of data to learn grammar, facts, reasoning, and context. Key features include:
- **Scale**: Billions (or even trillions) of parameters, enabling complex pattern recognition.
- **Versatility**: Can perform tasks like answering questions, writing code, summarizing text, and more.
- **Adaptability**: Fine-tuned for specific uses (e.g., customer support, creative writing).
Examples include OpenAI's GPT, Google's Gemini, and Meta's Llama. While powerful, LLMs may occasionally hallucinate or rely on outdated information. They're transforming industries by automating text-based tasks and enhancing human productivity.
Would you like a deeper dive into any aspect?
"""
```
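Since the model emits its difficulty judgment and think-mode marker inline, downstream code may want to separate them from the user-facing answer. Below is a minimal sketch, assuming the `<judge>…</judge>` and `<think on>`/`<think off>` tag layout shown in the sample output above; the exact format may differ in other releases, and the helper name is our own.

```python
import re

def split_autothink_output(text):
    """Split a completion into (judge, mode, answer), assuming the
    <judge>...</judge> block and <think on|off> marker seen in the
    sample output above."""
    judge = ""
    m = re.search(r"<judge>(.*?)</judge>", text, re.DOTALL)
    if m:
        judge = m.group(1).strip()
    # Detect which mode the model chose, if a marker is present.
    mode_match = re.search(r"<think (on|off)>", text)
    mode = mode_match.group(1) if mode_match else None
    # Everything after the mode marker is the user-facing answer.
    answer = re.split(r"<think (?:on|off)>", text, maxsplit=1)[-1].strip()
    return judge, mode, answer

sample = (
    "<judge>\nSimple definitional query. Requires think-off mode.\n</judge>\n"
    "<think off>\nLLMs are large neural networks trained on text."
)
judge, mode, answer = split_autothink_output(sample)
print(mode)    # → off
print(answer)  # → LLMs are large neural networks trained on text.
```

If no marker is found, the helper falls back to returning the whole text as the answer, so it degrades gracefully on outputs that skip the tags.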
## 🏗️ TODO

- A technical report will be released soon.
- An improved-performance release of the model is coming soon.
## 🚦 Limitations & Notes

- The preview checkpoint may occasionally over- or under-think on inputs outside the training distribution.
- Use responsibly; verify factual outputs, especially when disabling thought traces.
## 📜 License

This repository is licensed under the MIT License. The use of KwaiCoder-AutoThink models is subject to the Model License; KwaiCoder-AutoThink models support commercial use.
See LICENSE-MODEL for more details.
This is a preview release. We will publish the full training recipe, data, and benchmarks soon.