Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: use system_instruction conditionally #334

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

DinoChiesa
Copy link
Contributor

@DinoChiesa DinoChiesa commented Mar 24, 2025

This fix addresses #333 . It embeds the system_instruction portion of the POST body , conditionally.

With this fix, using most Google-hosted models, the POST body looks like:

{
  "system_instruction": {
    "parts": {
      "text": "You use markdown liberally to structure responses. Always show code snippets in markdown blocks with language labels."
    }
  },
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "text": "prompt from user input here"
        }
      ]
    }
  ],
  "generation_config": {
    "temperature": 1,
    "topP": 1
  }
}

and using Gemma, the POST looks like:

{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "text": "You use markdown liberally to structure responses. Always show code snippets in markdown blocks with language labels."
        }
      ]
    },
    {
      "role": "user",
      "parts": [
        {
          "text": "hello there"
        }
      ]
    }
  ],
  "generation_config": {
    "temperature": 1,
    "topP": 1
  }
}

@xenodium
Copy link
Owner

Thanks for the PR!

Can't we handle this in chatgpt-shell-google--validate-command like we do in chatgpt-shell-validate-no-system-prompt and guide the user to unset it? Otherwise, it feels a little misleading. The shell prompt is displaying a system value, but it's not actually used for anything.

image

@DinoChiesa
Copy link
Contributor Author

The shell prompt is displaying a system value, but it's not actually used for anything.

It is still being used. It's just placed in the user section of the prompt, rather than the system_instruction section of the prompt. See my example JSON payloads above.

What do you want to do?

BTW This is a bit of an edge-case. (a) Gemma is new and evolving and it will probably add system_instruction "real soon now." At which point this change will no longer be necessary. and (b) most people using chatgpt-shell I suppose won't use Gemma, which is mostly intended for use in embedded scenarios (on the phone, etc) for handling images and videos. I may be wrong about that; maybe chatgpt-shell is a good tool for handling images in prompts and responses. I have not explored this.

@xenodium
Copy link
Owner

It is still being used. It's just placed in the user section of the prompt, rather than the system_instruction section of the prompt. See my example JSON payloads above.

Thanks for clarifying. The way I was thinking about it is that if the model doesn't support system prompts, we don't add them to the payload at all, and guide users to unset if needed. I'm a little worried about what potential ambiguity it may cause. Maybe it's not a big deal and I'm overthinking it. Do you by chance know how other projects go about this?

@DinoChiesa
Copy link
Contributor Author

Do you by chance know how other projects go about this?

I don't. The space is moving so quickly. I suppose that's why this model doesn't support system instructions; Google's in a hurry. (I don't have any actual insight tho) And I suppose Gemma will support system instructions "at some point".

Maybe the optimal thing is to wait for someone else to scream about it, and if that happens, as you say, "guide users to unset" the behavior. The problem may just go away. But maybe you have higher aspirations for this project. I mean you may be concerned about users who try it and don't complain, but just .. silently give up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants