Models trained on IPA-CHILDES and evaluated for phonological knowledge using the word segmentation task, which is linked to child language acquisition.
AI & ML interests
tokenization, CHILDES, word segmentation, phonemes, BabyLM
Organization Card
The IPA-CHILDES dataset along with the models and tokenizers used for phoneme-based language modeling for the 31 languages in CHILDES.
- IPA-CHILDES & G2P+: Feature-Rich Resources for Cross-Lingual Phonology and Phonemic Language Modeling (Paper • 2504.03036)
- phonemetransformers/IPA-CHILDES (12.5M • 189 • 2)
- phonemetransformers/ipa-childes-tokenizers
- phonemetransformers/ipa-childes-models
Models (36)
- phonemetransformers/ipa-childes-models-tiny
- phonemetransformers/ipa-childes-models-small
- phonemetransformers/ipa-childes-models-medium
- phonemetransformers/ipa-childes-models-large
- phonemetransformers/ipa-childes-tokenizers
- phonemetransformers/ipa-childes-english-size-comparison (0.0B • 32)
- phonemetransformers/ipa-childes-models
- phonemetransformers/babble-tokenizers
- phonemetransformers/childes-phoneme-tokenizers
- phonemetransformers/GPT2-85M-BPE-TXT (0.1B • 828 • 1)