Instructions to use defog/sqlcoder-70b-alpha with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use defog/sqlcoder-70b-alpha with Transformers (a text-to-SQL prompting sketch follows the Local Apps section below):
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="defog/sqlcoder-70b-alpha")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("defog/sqlcoder-70b-alpha")
model = AutoModelForCausalLM.from_pretrained("defog/sqlcoder-70b-alpha")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use defog/sqlcoder-70b-alpha with vLLM:
Install from pip and serve the model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "defog/sqlcoder-70b-alpha"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "defog/sqlcoder-70b-alpha",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
docker model run hf.co/defog/sqlcoder-70b-alpha
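The same OpenAI-compatible endpoint can be called from Python. This is a minimal sketch using the requests library, assuming the server started above with `vllm serve` is running locally on port 8000:

```python
# Minimal sketch: query the local vLLM server's OpenAI-compatible
# completions endpoint (assumes the server above is running on port 8000
# and that the `requests` package is installed).
import requests

response = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "model": "defog/sqlcoder-70b-alpha",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5,
    },
)
print(response.json()["choices"][0]["text"])
```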
- SGLang
How to use defog/sqlcoder-70b-alpha with SGLang:
Install from pip and serve the model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "defog/sqlcoder-70b-alpha" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "defog/sqlcoder-70b-alpha",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "defog/sqlcoder-70b-alpha" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "defog/sqlcoder-70b-alpha",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use defog/sqlcoder-70b-alpha with Docker Model Runner:
docker model run hf.co/defog/sqlcoder-70b-alpha
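Since sqlcoder-70b-alpha is a text-to-SQL model, the pipeline loaded in the Transformers example above is typically prompted with a database schema and a natural-language question. The sketch below is illustrative only: the schema, question, and prompt layout are placeholders, not the prompt template documented in the model card.

```python
# Illustrative prompting sketch for the pipeline loaded above.
# The schema, question, and prompt layout are placeholders; consult the
# model card for the recommended prompt template.
from transformers import pipeline

pipe = pipeline("text-generation", model="defog/sqlcoder-70b-alpha")

schema = """CREATE TABLE orders (
  id INTEGER PRIMARY KEY,
  customer_id INTEGER,
  total NUMERIC,
  created_at DATE
);"""

question = "What was the total order value per customer in 2023?"

prompt = (
    "### Task\n"
    f"Generate a SQL query to answer: {question}\n\n"
    "### Database Schema\n"
    f"{schema}\n\n"
    "### SQL\n"
)

result = pipe(prompt, max_new_tokens=256, do_sample=False)
print(result[0]["generated_text"])
```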
Can't host with vLLM - "LlamaModel" architecture is not supported.
This model was fine-tuned from codellama/CodeLlama-70b-hf, which has this as the architecture in its config.json:
"LlamaForCausalLM"
The config.json for this model has this as the architecture:
"LlamaModel"
This causes an error when hosting with vLLM, since vLLM only supports LlamaForCausalLM. I tried changing the config.json of sqlcoder-70b-alpha to "LlamaForCausalLM", but that produced a KeyError:
KeyError: 'layers.11.input_layernorm.weight'
Is this a bug, or is the different architecture intentional? If it's not a bug, it seems vLLM cannot host this model, even though it supports LlamaForCausalLM from CodeLlama. Is there a recommended way to host this model, through something like vLLM, TGI, etc.?
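For reference, the declared architecture can be checked directly from the two configs. This is a quick sketch using transformers.AutoConfig; the printed values reflect what is reported above:

```python
# Compare the `architectures` field declared in each repo's config.json
# (sketch using transformers.AutoConfig; values shown are those reported above).
from transformers import AutoConfig

base = AutoConfig.from_pretrained("codellama/CodeLlama-70b-hf")
finetune = AutoConfig.from_pretrained("defog/sqlcoder-70b-alpha")

print(base.architectures)      # ['LlamaForCausalLM']
print(finetune.architectures)  # ['LlamaModel'] -- rejected by vLLM
```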
Hi there, we discovered a bizarre bug where the model's lm_head.weight was not included in the upload to HF. This is causing many integrations to break, and the model uploaded here is producing gibberish results.
Fix coming soon – hopefully in the next hour
Fixed with a reupload of the model weights! Apologies for the issue. You'll unfortunately have to re-download the model weights first (run rm -r ~/.cache/huggingface/hub/models--defog--sqlcoder-70b-alpha). Should work great after that.
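If you prefer to do this from Python, a roughly equivalent sketch using huggingface_hub (force_download re-fetches the files instead of reusing the cached copies):

```python
# Roughly equivalent to deleting the cached folder: force a fresh download
# of the re-uploaded weights with huggingface_hub.
from huggingface_hub import snapshot_download

snapshot_download("defog/sqlcoder-70b-alpha", force_download=True)
```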