BEE-spoke-data/tiny-random-MPNetForMaskedLM
A test model with a matching test tokenizer (reduced vocabulary of 1,125 tokens), intended for testing pipelines rather than real inference.
```
MPNetForMaskedLM(
  (mpnet): MPNetModel(
    (embeddings): MPNetEmbeddings(
      (word_embeddings): Embedding(1125, 64, padding_idx=1)
      (position_embeddings): Embedding(512, 64, padding_idx=1)
      (LayerNorm): LayerNorm((64,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): MPNetEncoder(
      (layer): ModuleList(
        (0-4): 5 x MPNetLayer(
          (attention): MPNetAttention(
            (attn): MPNetSelfAttention(
              (q): Linear(in_features=64, out_features=64, bias=True)
              (k): Linear(in_features=64, out_features=64, bias=True)
              (v): Linear(in_features=64, out_features=64, bias=True)
              (o): Linear(in_features=64, out_features=64, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (LayerNorm): LayerNorm((64,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (intermediate): MPNetIntermediate(
            (dense): Linear(in_features=64, out_features=64, bias=True)
            (intermediate_act_fn): GELUActivation()
          )
          (output): MPNetOutput(
            (dense): Linear(in_features=64, out_features=64, bias=True)
            (LayerNorm): LayerNorm((64,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
      (relative_attention_bias): Embedding(32, 4)
    )
  )
  (lm_head): MPNetLMHead(
    (dense): Linear(in_features=64, out_features=64, bias=True)
    (layer_norm): LayerNorm((64,), eps=1e-12, elementwise_affine=True)
    (decoder): Linear(in_features=64, out_features=1125, bias=True)
  )
)
```
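A minimal sketch of how an equivalent tiny random model could be rebuilt locally, assuming the hyperparameters read off the module tree above (vocab 1125, hidden size 64, 5 layers, 4 attention heads). This constructs a randomly initialised clone from an `MPNetConfig` rather than downloading the hosted checkpoint, so the weights will differ; it only mirrors the architecture.

```python
import torch
from transformers import MPNetConfig, MPNetForMaskedLM

# Hyperparameters inferred from the printed module tree above (assumptions,
# not read from the hosted config file).
config = MPNetConfig(
    vocab_size=1125,             # word_embeddings / lm_head decoder size
    hidden_size=64,              # every Linear layer is 64 -> 64
    num_hidden_layers=5,         # ModuleList (0-4): 5 x MPNetLayer
    num_attention_heads=4,       # relative_attention_bias: Embedding(32, 4)
    intermediate_size=64,        # MPNetIntermediate dense is 64 -> 64
    max_position_embeddings=512, # position_embeddings: Embedding(512, 64)
)
model = MPNetForMaskedLM(config)  # randomly initialised, like the hosted model
model.eval()

# Run a dummy batch: one logit per vocabulary entry at each position.
input_ids = torch.randint(0, config.vocab_size, (1, 8))
with torch.no_grad():
    logits = model(input_ids=input_ids).logits
print(logits.shape)  # (batch, sequence, vocab) = (1, 8, 1125)
```

Because the weights are random, the predicted tokens are meaningless; the model is only useful for exercising tokenization, shapes, and serving code paths cheaply.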
Downloads last month: 35