Models: 58

Active filters: arxiv:2306.11695

wang7776/Llama-2-7b-chat-hf-10-sparsity

Text Generation • Updated Feb 4, 2024 • 92

wang7776/Llama-2-7b-chat-hf-30-sparsity

Text Generation • Updated Feb 5, 2024 • 89

wang7776/Llama-2-7b-chat-hf-20-sparsity

Text Generation • Updated Feb 5, 2024 • 88

wang7776/Mistral-7B-Instruct-v0.2-sparsity-10

Text Generation • Updated Feb 5, 2024 • 91

wang7776/vicuna-7b-v1.3-sparsity-20

Text Generation • Updated Feb 5, 2024 • 73

wang7776/vicuna-7b-v1.3-sparsity-30

Text Generation • Updated Feb 5, 2024 • 10

wang7776/vicuna-7b-v1.3-sparsity-10

Text Generation • Updated Feb 5, 2024 • 75

wang7776/Mistral-7B-Instruct-v0.2-sparsity-30-v0.1

Text Generation • Updated Feb 5, 2024 • 99 • 1

wang7776/Mistral-7B-Instruct-v0.2-sparsity-20-v0.1

Text Generation • Updated Feb 5, 2024 • 102 • 1

wang7776/Llama-2-7b-chat-hf-20-attention-sparsity

Text Generation • Updated Feb 5, 2024 • 17

wang7776/vicuna-7b-v1.3-attention-sparsity-20

Text Generation • Updated Feb 5, 2024 • 18

wang7776/Mistral-7B-Instruct-v0.2-attention-sparsity-20

Text Generation • Updated Feb 5, 2024 • 14

wang7776/Llama-2-7b-chat-hf-10-attention-sparsity

Text Generation • Updated Feb 5, 2024 • 14

wang7776/Llama-2-7b-chat-hf-30-attention-sparsity

Text Generation • Updated Feb 5, 2024 • 11

wang7776/vicuna-7b-v1.3-attention-sparsity-10

Text Generation • Updated Feb 5, 2024 • 11

wang7776/vicuna-7b-v1.3-attention-sparsity-30

Text Generation • Updated Feb 5, 2024 • 10

wang7776/Mistral-7B-Instruct-v0.2-attention-sparsity-10

Text Generation • Updated Feb 5, 2024 • 18

wang7776/Mistral-7B-Instruct-v0.2-attention-sparsity-30

Text Generation • Updated Feb 5, 2024 • 18

IntelLabs/shears-llama-7b-50-math-heuristic-adapter

Updated Feb 12 • 6 • 3

IntelLabs/shears-llama-7b-50-math-super-adapter

Updated Feb 12 • 13 • 3

IntelLabs/shears-llama-13b-50-math-super-adapter

Updated Feb 12 • 10 • 4

IntelLabs/shears-llama-13b-50-math-heuristic-adapter

Updated Feb 12 • 14 • 3

IntelLabs/shears-llama-7b-50-cs-heuristic-adapter

Updated Feb 12 • 7 • 3

IntelLabs/shears-llama-7b-50-cs-super-adapter

Updated Feb 12 • 13 • 3

kettleguts/zephyr-7b-beta_sparse05

Text Generation • Updated Mar 27, 2024 • 12

IntelLabs/shears-mpt-7b-50-base

Text Generation • Updated Feb 12 • 116 • 2

IntelLabs/sqft-phi-3-mini-4k-50-base

Text Generation • Updated Feb 12 • 792 • 2

IntelLabs/sqft-phi-3-mini-4k-60-base

Text Generation • Updated Feb 12 • 119 • 2

IntelLabs/sqft-phi-3-mini-4k-30-base

Text Generation • Updated Feb 12 • 119 • 2

IntelLabs/sqft-phi-3-mini-4k-40-base

Text Generation • Updated Feb 12 • 121 • 2