Tags: Text Generation · Transformers · PyTorch · llama · uncensored · text-generation-inference
ehartford · TheBloke
Change use_cache to True, which significantly speeds up inference (#2)
ca45eff
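
The commit above sets `use_cache` to `true` in the model's `config.json`, so past key/value attention states are reused during autoregressive decoding instead of being recomputed for the whole prefix at every step. A minimal sketch of the effect when loading the model with Hugging Face transformers; the model id below is a placeholder, not the actual repository:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id for illustration only.
model_id = "some-org/some-llama-model"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Passing use_cache=True overrides the value in config.json;
# with the cache enabled, each decoding step processes only the
# newly generated token and reuses stored key/value tensors.
model = AutoModelForCausalLM.from_pretrained(model_id, use_cache=True)

inputs = tokenizer("Hello", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

With `use_cache=False`, every generated token would require re-running attention over the entire sequence so far, which is why flipping this flag noticeably speeds up generation.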