Function response is non-topical to the prompt

#6
by orby - opened

I'm just trying to get some sample code running. When the following is executed, the responses are all over the place: sometimes it's talking about the digits of pi, other times about plastics or Coca-Cola. My setup is probably wrong. Note that the following is very slow on CPU; normally I can run mistral-instruct v0.1 on my 3090 in ~20 GB of VRAM, but this blows up VRAM immediately (see the note on precision after the code). A complete runnable main.py would be extremely appreciated.

from transformers import AutoModelForCausalLM, AutoTokenizer

FUNCTION_METADATA = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "This function gets the current weather in a given city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "The city, e.g., San Francisco"
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use."
                    }
                },
                "required": ["city"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_clothes",
            "description": "This function provides a suggestion of clothes to wear based on the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "temperature": {
                        "type": "string",
                        "description": "The temperature, e.g., 15 C or 59 F"
                    },
                    "condition": {
                        "type": "string",
                        "description": "The weather condition, e.g., 'Cloudy', 'Sunny', 'Rainy'"
                    }
                },
                "required": ["temperature", "condition"]
            }
        }
    }
]

prompt = [
    {
        "role": "function_metadata",
        "content": FUNCTION_METADATA
    },
    {
        "role": "user",
        "content": "What is the current weather in London?"
    },
    {
        "role": "function_call",
        "content": "{\n    \"name\": \"get_current_weather\",\n    \"arguments\": {\n        \"city\": \"London\"\n    }\n}"
    },
    {
        "role": "function_response",
        "content": "{\n    \"temperature\": \"15 C\",\n    \"condition\": \"Cloudy\"\n}"
    },
    {
        "role": "assistant",
        "content": "The current weather in London is Cloudy with a temperature of 15 Celsius"
    }
]

device = "cpu"  # or "cuda"; the device to load the model onto

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
#model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
tokenizer = AutoTokenizer.from_pretrained('Trelis/Mistral-7B-Instruct-v0.2-function-calling-v3', trust_remote_code=True)

prompt = tokenizer.apply_chat_template(prompt, tokenize=False)  # renders the message list to a single string
encodeds = tokenizer.apply_chat_template(prompt, return_tensors="pt")  # note: this applies the template a second time, now to a string rather than a message list
model_inputs = encodeds.to(device)
model.to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
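(An aside on the VRAM comment above: from_pretrained loads weights in float32 by default, so a 7B model needs roughly 28 GB before the KV cache is counted, which would explain the blow-up; requesting half precision brings it back to roughly 14 GB. A minimal sketch, assuming a CUDA device and that fp16 is acceptable here:)

import torch

# Load the weights in float16 so the 7B model fits in ~14 GB instead of ~28 GB
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.1",
    torch_dtype=torch.float16,
).to("cuda")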

Example output (it changes with each run):

Loading checkpoint shards: 100%|██████████████████| 3/3 [00:03<00:00,  1.12s/it]
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
<s>  [INST]  In response to a growing concern over the impact of plastics on the environment, several companies and organizations are taking steps to address the issue. One solution that is gaining popularity is the use of biodegradable plastics. Biodegradable plastics are designed to be broken down by microorganisms and other natural processes into water, carbon dioxide, and biomass. Let's take a closer look at some of the companies and organizations leading the way in this field and the progress they have made.

1. Coca-Cola: Coca-Cola is one of the largest consumer goods companies in the world and has acknowledged the problem of plastic waste. The company has set a goal to collect and recycle the equivalent of every bottle or can it sells globally by 2030. In addition, Coca-Cola is investing in biodegradable plastic technology and has already started using plant-based bottles made from 100% renewable raw materials in some markets.
2. Danone: Danone is a French multinational food-industry corporation, best known for its Activia and Actimel yogurt brands. The company is focusing on reducing the use of virgin plastic in its packaging and replacing it with recycled and biodegradable materials. Danone has set a target to reduce the amount of virgin plastic in its packaging by 40% by 2025.
3. Starbucks: Starbucks is a leading coffee chain with over 30,000 stores in 76 countries. The company has pledged to eliminate plastic straws, lids and other single-use plastic items from its stores by 2020. Starbucks is also exploring the use of biodegradable plastics for its cups and lids. In some markets, the company has already started testing compostable cups made from plant-based materials.
4. The Ellen MacArthur Foundation: The Ellen MacArthur Foundation is a global organization that aims to accelerate the circular economy. The organization is working on several initiatives to reduce plastic waste, including a collaboration with Holland & Barrett to create a refill system for plastic-free food and personal care products. The foundation is also supporting research into biodegradable plastics and is working with companies to develop closed-loop systems for plastic waste.
5. Unilever: Unilever is a British-Dutch multinational consumer goods company that owns brands such as Dove, Axe, and Lipton. The company is committed to reducing its use of plastic and has set a goal to halve its usage of virgin plastic by 2025. Unilever is also exploring the use of biodegradable plastics, including those made from sugarcane and other renewable sources.

These are just a few of the companies and organizations leading the way in the development and use of biodegradable plastics. As more businesses and governments recognize the need to reduce plastic waste, it is likely that we will see even more progress in this area in the coming years.</s>

Another run:

Loading checkpoint shards: 100%|██████████████████| 2/2 [00:25<00:00, 12.70s/it]
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
<s>  [INST] 0 - 1

20 - 20

-6 - 5

0 - 2

478 - 478

5 - 5

1 - 11

-8 - 7

1 - 0

2 - -19

1 - 0

3 - 3

-9 - 15

-2 - 9</s>
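
(Another aside: the two warnings in the logs come from calling generate() without an attention mask or a pad token id. For a single unpadded prompt they are usually harmless, but they can be silenced explicitly; a sketch reusing model_inputs and tokenizer from the snippet above:)

import torch

# Every position in a single unpadded prompt is a real token
attention_mask = torch.ones_like(model_inputs)
generated_ids = model.generate(
    model_inputs,
    attention_mask=attention_mask,
    pad_token_id=tokenizer.eos_token_id,  # silences the open-end generation warning
    max_new_tokens=1000,
    do_sample=True,
)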
Trelis org

Howdy, I'm a bit confused because I see you're using the v3 model in the tokenizer.

Did you mean to post this issue on the v3 model?

Also, I see you are loading the base model, not the function calling model... which would explain why that's not working. Here's what I see you loading:

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

Thanks for the quick reply.

I'm sure the model is where I'm getting confused. What is the function calling model? The only model reference I see on the page is to Instruct. Do you have the line of code handy I should be using?

Never mind, I figured it out:

model = AutoModelForCausalLM.from_pretrained('Trelis/Mistral-7B-Instruct-v0.2-function-calling-v3', trust_remote_code=True)

The response is still nonsense after making this change. I'm going to sit down and go through your videos before asking more questions.

If you do have a single example py file or a git repo handy, I'd really appreciate it.

Thank you!
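
(For reference, a minimal end-to-end main.py sketch combining the fixes discussed in this thread: model and tokenizer from the same function-calling repo, weights in half precision when a GPU is available, and the chat template applied exactly once; the original snippet applied it twice, once to the message list and again to the resulting string, which can mangle the prompt. FUNCTION_METADATA and prompt are the lists from the first post, and the sampling settings are kept as posted.)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Trelis/Mistral-7B-Instruct-v0.2-function-calling-v3"
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Model and tokenizer must come from the same function-calling repo
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True, torch_dtype=dtype).to(device)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)

# FUNCTION_METADATA and prompt: same lists as in the first post.
# Apply the chat template exactly once, to the message list.
model_inputs = tokenizer.apply_chat_template(prompt, return_tensors="pt").to(device)

generated_ids = model.generate(
    model_inputs,
    attention_mask=torch.ones_like(model_inputs),
    pad_token_id=tokenizer.eos_token_id,
    max_new_tokens=1000,
    do_sample=True,
)
print(tokenizer.batch_decode(generated_ids)[0])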

Trelis org

This issue relates to the wrong repo; see instead here.

RonanMcGovern changed discussion status to closed
