Update README.md
README.md CHANGED
@@ -55,7 +55,7 @@ Users (both direct and downstream) should be made aware of the risks, biases and
 ## Training Data
 
 The model creators note in the [GitHub Repository](https://github.com/SJTU-LIT/SynCSE/blob/main/README.md):
-> We train
+> We use 26.2k generated synthetic [data to] train SynCSE-partial-RoBERTa-base.
 
 
 ## Training Procedure
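The added note above says the encoder is trained contrastively on 26.2k synthetic pairs. For background, here is a minimal sketch of a SimCSE-style InfoNCE objective with in-batch negatives; it is a generic illustration under assumed conventions (temperature 0.05, one synthetic positive per sentence), not the SynCSE training code.

```python
# Generic SimCSE-style InfoNCE loss with in-batch negatives (illustrative
# sketch only, not the authors' training code). z1[i] and z2[i] are the
# embeddings of a positive pair, e.g. a sentence and its synthetic positive.
import torch
import torch.nn.functional as F

def info_nce_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.05) -> torch.Tensor:
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    sim = z1 @ z2.t() / temperature          # (batch, batch) cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)  # positives on the diagonal
    return F.cross_entropy(sim, labels)
```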
@@ -75,15 +75,6 @@ More information needed
 # Evaluation
 
 
-## Testing Data, Factors & Metrics
-
-### Testing Data
-
-The model creators note in the [associated paper](https://arxiv.org/pdf/2104.08821.pdf):
-> Our evaluation code for sentence embeddings is based on a modified version of [SentEval](https://github.com/facebookresearch/SentEval). It evaluates sentence embeddings on semantic textual similarity (STS) tasks and downstream transfer tasks.
-
-For STS tasks, our evaluation takes the "all" setting and reports Spearman's correlation. See the [associated paper](https://arxiv.org/pdf/2104.08821.pdf) (Appendix B) for evaluation details.
-
 
 ### Factors
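The removed evaluation note describes reporting Spearman's correlation between embedding similarities and gold STS scores. A hedged sketch of that metric, assuming CLS-token pooling and toy data (the actual evaluation uses a modified SentEval):

```python
# Illustrative STS-style scoring: Spearman's correlation between cosine
# similarities of sentence embeddings and gold scores. CLS pooling and the
# toy pairs/scores are assumptions, not the repo's SentEval setup.
import torch
from scipy.stats import spearmanr
from transformers import AutoTokenizer, AutoModel

name = "sjtu-lit/SynCSE-partial-RoBERTa-base"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

def embed(sentences):
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        return model(**batch).last_hidden_state[:, 0]  # CLS-token embedding

pairs = [
    ("A man is playing a guitar.", "Someone plays guitar."),
    ("A dog runs in the park.", "A dog is running outside."),
    ("A dog runs in the park.", "The stock market fell today."),
]
gold = [4.8, 4.2, 0.1]  # toy gold scores on the usual 0-5 STS scale

a = embed([s for s, _ in pairs])
b = embed([t for _, t in pairs])
pred = torch.nn.functional.cosine_similarity(a, b).tolist()
rho, _ = spearmanr(pred, gold)
print(f"Spearman's rho: {rho:.3f}")
```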
@@ -108,16 +99,6 @@ More information needed
 
 
 
-# Environmental Impact
-
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-
-- **Hardware Type:** Nvidia 3090 GPUs with CUDA 11
-- **Hours used:** More information needed
-- **Cloud Provider:** More information needed
-- **Compute Region:** More information needed
-- **Carbon Emitted:** More information needed
-
 # Technical Specifications [optional]
 
 ## Model Architecture and Objective
@@ -143,11 +124,11 @@ More information needed.
 **BibTeX:**
 
 ```bibtex
-@
-
-
-
-
+@article{zhang2023contrastive,
+  title={Contrastive Learning of Sentence Embeddings from Scratch},
+  author={Zhang, Junlei and Lan, Zhenzhong and He, Junxian},
+  journal={arXiv preprint arXiv:2305.15077},
+  year={2023}
 }
 ```
 
@@ -159,13 +140,11 @@ More information needed
 More information needed
 
 
-
-
-Princeton NLP group in collaboration with Ezi Ozoani and the Hugging Face team.
+
 
 # Model Card Contact
 
-If you have any questions related to the code or the paper, feel free to email
+If you have any questions related to the code or the paper, feel free to email Junlei (`zhangjunlei@westlake.edu.cn`). If you encounter any problems when using the code, or want to report a bug, you can open an issue. Please try to specify the problem with details so we can help you better and quicker!
 
 
 
@@ -179,9 +158,9 @@ Use the code below to get started with the model.
 ```python
 from transformers import AutoTokenizer, AutoModel
 
-tokenizer = AutoTokenizer.from_pretrained("
+tokenizer = AutoTokenizer.from_pretrained("sjtu-lit/SynCSE-partial-RoBERTa-base")
 
-model = AutoModel.from_pretrained("
+model = AutoModel.from_pretrained("sjtu-lit/SynCSE-partial-RoBERTa-base")
 ```
 </details>
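As a follow-up to the completed snippet above, a minimal similarity check; pooling via the CLS token is an assumption about how these embeddings are meant to be used:

```python
# Continues the get-started snippet: embed two sentences and compare them.
# CLS-token pooling is an assumption, not documented model-card behavior.
import torch

sentences = ["A man is playing a guitar.", "Someone plays guitar."]
batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    embeddings = model(**batch).last_hidden_state[:, 0]
score = torch.nn.functional.cosine_similarity(embeddings[0:1], embeddings[1:2])
print(score.item())  # cosine similarity in [-1, 1]
```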
|