Abstract
ChuXin, an open-source language model with 1.6 billion parameters, extends context length to 1M tokens and offers strong retrieval performance through lightweight continual pretraining.
In this report, we present ChuXin, an entirely open-source language model with 1.6 billion parameters. Unlike the majority of works that open-source only the model weights and architecture, we have made available everything needed to train the model, including the training data, the training process, and the evaluation code. Our goal is to empower and strengthen the open research community, fostering transparency and enabling a new wave of innovation in the field of language modeling. Furthermore, we extend the context length to 1M tokens through lightweight continual pretraining and demonstrate strong needle-in-a-haystack retrieval performance. The weights for both models are available on Hugging Face to download and use.
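Since the weights are distributed through Hugging Face, a minimal usage sketch with the `transformers` library is shown below. The repository ID and generation settings are assumptions for illustration, not names confirmed by the paper; substitute the actual repo IDs published with the release.

```python
# Minimal sketch for loading a released checkpoint with Hugging Face transformers.
# NOTE: the repo ID below is a hypothetical placeholder, not a confirmed name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "chuxin-llm/Chuxin-1.6B-Base"  # assumed repository ID

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype the weights were published in
    device_map="auto",    # place the model on GPU if one is available
    trust_remote_code=True,
)

# Simple greedy generation to verify the checkpoint loads and runs.
prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same loading code would apply to the long-context (1M-token) variant, with the repo ID swapped accordingly.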