Nellyw888/VeriReason-codeLlama-7b-RTLCoder-Verilog-GRPO-reasoning-tb

Reinforcement Learning

text-generation

text-generation-inference

Nellyw888 committed on May 20

Commit cd2c227 · verified · 1 Parent(s): 15f3bae

Update README.md

Files changed (1)

README.md +16 -0

@@ -11,6 +11,9 @@ tags:
 For implementation details, visit our GitHub repository: [VeriReason](https://github.com/NellyW8/VeriReason)
+Check out our paper: [VeriReason: Reinforcement Learning with Testbench Feedback for Reasoning-Enhanced Verilog Generation](https://arxiv.org/abs/2505.11849)
 ## Update Log
 2025.05.17: Initial release of VeriReason-Llama-7b-RTLCoder-GRPO-reasoning-tb
@@ -73,6 +76,19 @@ The GRPO (Generative Reinforcement Learning from Preference Optimization) traini
    ```
 ## Citation
+Please cite our paper if you use our model or dataset:
+```bibtex
+@misc{wang2025verireason,
+      title={VeriReason: Reinforcement Learning with Testbench Feedback for Reasoning-Enhanced Verilog Generation},
+      author={Yiting Wang and Guoheng Sun and Wanghao Ye and Gang Qu and Ang Li},
+      year={2025},
+      eprint={2505.11849},
+      archivePrefix={arXiv},
+      primaryClass={cs.AI},
+      url={https://arxiv.org/abs/2505.11849},
+}
+```
 ## Acknowledgement
 This repo benefits from OpenR1 and LLamaFactory.