qianhuiwu committed
Commit cc3ad83 · verified · 1 Parent(s): c8a234a

update paper link.

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -13,7 +13,7 @@ This model was introduced in the paper [**GUI-Actor: Coordinate-Free Visual Grou
  It is developed based on [UI-TARS-2B-SFT](https://huggingface.co/ByteDance-Seed/UI-TARS-2B-SFT) and is designed to predict the correctness of an action position given a language instruction. This model is well-suited for **GUI-Actor**, as its attention map effectively provides diverse candidates for verification with only a single inference.
 
 
- For more details on model design and evaluation, please check: [🏠 Project Page](https://aka.ms/GUI-Actor) | [💻 Github Repo](https://github.com/microsoft/GUI-Actor) | [📑 Paper]().
 
 
  | Model List | Hugging Face Link |
@@ -194,9 +194,9 @@ answer = ground_only_positive(model, tokenizer, processor, instruction, image, p
  title={GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents},
  author={Qianhui Wu and Kanzhi Cheng and Rui Yang and Chaoyun Zhang and Jianwei Yang and Huiqiang Jiang and Jian Mu and Baolin Peng and Bo Qiao and Reuben Tan and Si Qin and Lars Liden and Qingwei Lin and Huan Zhang and Tong Zhang and Jianbing Zhang and Dongmei Zhang and Jianfeng Gao},
  year={2025},
- eprint={},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
- url={},
  }
  ```
 
  It is developed based on [UI-TARS-2B-SFT](https://huggingface.co/ByteDance-Seed/UI-TARS-2B-SFT) and is designed to predict the correctness of an action position given a language instruction. This model is well-suited for **GUI-Actor**, as its attention map effectively provides diverse candidates for verification with only a single inference.
 
 
+ For more details on model design and evaluation, please check: [🏠 Project Page](https://aka.ms/GUI-Actor) | [💻 Github Repo](https://github.com/microsoft/GUI-Actor) | [📑 Paper](https://www.arxiv.org/pdf/2506.03143).
 
 
  | Model List | Hugging Face Link |
 
  title={GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents},
  author={Qianhui Wu and Kanzhi Cheng and Rui Yang and Chaoyun Zhang and Jianwei Yang and Huiqiang Jiang and Jian Mu and Baolin Peng and Bo Qiao and Reuben Tan and Si Qin and Lars Liden and Qingwei Lin and Huan Zhang and Tong Zhang and Jianbing Zhang and Dongmei Zhang and Jianfeng Gao},
  year={2025},
+ eprint={2506.03143},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
+ url={https://www.arxiv.org/pdf/2506.03143},
  }
  ```