replaceSystemMessage model config option #3787

Open
wants to merge 2 commits into
base: main

Conversation

ferenci84
Contributor

Description

#3786

A model config option, named "replaceSystemMessage" (self-descriptive), default false.

Behavior:
When true, the model's systemMessage property, or the config's root systemMessage property will replace the default hard-coded system message (Something like "When generating new code:\n\n1. Always produce a single code block.\n2. Never separate the code into multiple code blocks.\n3. Only include the code that is being added.\n4. Replace existing code with a "lazy" comment like this ...." and also "When using tools, follow the following guidelines:\n- Avoid calling tools unless they are absolutely necessary...")

Setting it to false, or not setting it at all, keeps the current behavior.

Minor change:
The existing behavior, in which the model config's systemMessage takes precedence over the root config's systemMessage, was not documented. I added it to the documentation so that the new option is easier to understand.
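
As a rough illustration of the behavior described above, here is a minimal TypeScript sketch. The names `resolveSystemMessage`, `ModelConfig`, `RootConfig`, and `DEFAULT_SYSTEM_MESSAGE` are hypothetical and are not taken from the Continue codebase; the blank-line separator and the fallback when no custom message is set are assumptions as well.

```typescript
// Hypothetical sketch: none of these names come from Continue's actual code.
interface ModelConfig {
  systemMessage?: string;
  replaceSystemMessage?: boolean; // proposed option, defaults to false
}

interface RootConfig {
  systemMessage?: string;
}

// Stand-in for the hard-coded default instructions.
const DEFAULT_SYSTEM_MESSAGE = "When generating new code: ...";

function resolveSystemMessage(root: RootConfig, model: ModelConfig): string {
  // The model-level message takes precedence over the root-level one.
  const custom = model.systemMessage ?? root.systemMessage;

  if (custom === undefined) {
    // Assumption: with no custom message, the default is used unchanged.
    return DEFAULT_SYSTEM_MESSAGE;
  }
  if (model.replaceSystemMessage === true) {
    // Replace: only the user's message is sent.
    return custom;
  }
  // Current behavior: the user's message is appended to the default.
  // The "\n\n" separator is an assumption.
  return `${DEFAULT_SYSTEM_MESSAGE}\n\n${custom}`;
}
```

With `replaceSystemMessage: true` and a model-level `systemMessage`, the function returns only that message; without the flag, the same message comes back appended to the default.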

Checklist

  • The relevant docs, if any, have been updated or created
  • The relevant tests, if any, have been updated or created

Screenshots

[ For visual changes, include screenshots. ]

Testing instructions

Testing replaceSystemMessage = true

  1. Create a systemMessage in the default Claude model config.
  2. Add replaceSystemMessage: true to the model config.
  3. Send any message with the model.
  4. Check the Continue LLM Prompt/Completion on the Output tab.
  5. The system message in the sent request should now be just the one given in the model config.
  6. Remove replaceSystemMessage from the model config and send the message again.
  7. The custom system message should now be appended to the default one.


netlify bot commented Jan 20, 2025

Deploy Preview for continuedev ready!

Name Link
🔨 Latest commit b058b60
🔍 Latest deploy log https://app.netlify.com/sites/continuedev/deploys/678e90c06ade1c0008036da5
😎 Deploy Preview https://deploy-preview-3787--continuedev.netlify.app

@sestinj
Contributor

sestinj commented Jan 27, 2025

@ferenci84 thanks for the thoughtful PR! I'm thinking that there might be a deeper problem, which is just that the default system message isn't satisfactory. Can you share what the biggest motives for replacing it are? If our default was just problematic then I'd want to fix that first before adding additional complexity to config.json. At the end of the day, we're always going to be forced to have some level of opinion about prompting in order to have a functioning product, so a full replace would be somewhat misleading, or we wouldn't do it entirely

@ferenci84
Contributor Author

Hi sestinj, in my experience, especially with Claude, which is the best LLM available in my opinion (not without flaws, though), a small change in the system message can have a major effect on the results. I experimented a lot, starting with a quite complex system message containing many instructions, and I found that the highest-quality code was produced when I simplified it as much as possible. I also believe some LLMs may work best without any system message, and a system message that works with one model may not work with another.
The specific problem with the default system message was that its sample code has no path, so the model doesn't include paths in its responses either. However, that is not the question in this PR; the question is the ability to customize. For me, not being able to customize the system message is a deal breaker. I think it's simply impossible to craft a system message that works in all cases (and especially with all the different LLMs). A basic default with an option to customize should be sufficient, I think.
Right now I'm using the customized version of Continue, and I just excluded all instructions about how to format code and how to use the tools, and I'm happy with the result this way.

FYI, these are the instructions I arrived at after simplifying them to the point where it didn't affect Claude's code quality: "You are intelligent, helpful and an expert developer, who always gives the correct answer and follow instructions. Style your response using Github Flavored Markdown." (In one instance I had to remove "Style your response using Github Flavored Markdown.") I also got amazing results with these additional instructions, combined with a complex high-level task description in a prompt file: "User may use the following tags to provide additional context before describing the task: Project-level information like tools and versions used, or additional guidelines about the structure to follow when adding features to the project.\nBackground information about the current task we are implementing. Note that the current instructions may implement only a part of the whole task, but it may help to know the whole task in advance.\nAdditional information about the current step within the task, for example user may describe features already implemented and the features we are implementing now.\nUse the tags described above in order to find additional context, but only follow the instructions at the end of the user message. Context tags are optional, they may be missing." I also added this section, but it has no effect at all: "If the user submits a code block that contains a filename in the language specifier, always include the filename along with the path in any code block you generate based on that file. The filename and path should be on the same line as the language specifier in your code block."

I'm showing this only to point out that this is not a one-size-fits-all situation. These instructions work for me; I have run a lot of experiments, and I need to be able to carry them from one extension to another to get predictable results. I started with the Genie extension, then used CodeGPT in IntelliJ, and now Continue.

@ferenci84
Contributor Author

@sestinj Adding to my previous message: I'm planning to experiment with specific changes to the default system instructions, specifically adding paths to the test files, i.e. "src/test.js" instead of "test.js". I would create a separate PR for this modification.

Another possible improvement to this PR: instead of replacing, allow the user to include placeholders like {{modification_instructions}} and {{tools_instructions}}, or just {{default_instructions}}. This way it's possible to prepend, append, or insert the default, for example: "You are intelligent, helpful and an expert developer.\n\n{{default_instructions}}\n\nUser may use the following tags to provide additional context...". Maybe there could be two options, "append" (current behavior) and "placeholder", or just a boolean parameter "systemMessagePlaceholders" with its usage explained in the description of the setting. In addition to giving more freedom, this parameter naming may be better because it doesn't directly raise the question of whether replacing the system message is a good idea, yet it inherently gives the user the freedom to do so.
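
The placeholder idea could look roughly like the TypeScript sketch below. The three placeholder names come from the comment above; `expandPlaceholders`, the constants, and the choice to leave unknown placeholders untouched are assumptions made for illustration, not part of Continue.

```typescript
// Hypothetical sketch of the placeholder proposal; not Continue's implementation.
const MODIFICATION_INSTRUCTIONS = "When generating new code: ...";
const TOOLS_INSTRUCTIONS = "When using tools, follow the following guidelines: ...";

const PLACEHOLDERS = new Map<string, string>([
  ["modification_instructions", MODIFICATION_INSTRUCTIONS],
  ["tools_instructions", TOOLS_INSTRUCTIONS],
  // Assumption: {{default_instructions}} stands for both defaults joined together.
  ["default_instructions", `${MODIFICATION_INSTRUCTIONS}\n\n${TOOLS_INSTRUCTIONS}`],
]);

function expandPlaceholders(template: string): string {
  // A replacer function is used so that "$" in the defaults is inserted literally.
  // Unknown placeholders are left as written rather than silently dropped.
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) =>
    PLACEHOLDERS.get(name) ?? match,
  );
}
```

A template without any placeholder passes through unchanged, which amounts to a full replacement of the default; a template consisting only of `{{default_instructions}}` reproduces the default.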

I'm also thinking that it would be better for the user to be able to see the default instructions, change them if needed, and easily revert to the default. This would be possible by exposing them in the VS Code settings with a default value. That is better than hardcoding: it lets the user inspect the default, change it, and revert to it.

@OlhinAS

OlhinAS commented Feb 16, 2025

Any update on the PR? This feature is kinda useful since default systemMessage has some pretty specific rules:

When generating new code:

  1. Always produce a single code block.
  2. Never separate the code into multiple code blocks.
  3. Only include the code that is being added.
  4. Replace existing code with a "lazy" comment like this: "// ... existing code ..."
  5. The "lazy" comment must always be a valid comment in the current context (e.g. "<!-- ... existing code ... -->" for HTML, "// ... existing code ..." for JavaScript, "{/* ... existing code */}" for TSX, etc.)
  6. You must always provide 1-2 lines of context above and below a "lazy" comment
  7. If the user submits a code block that contains a filename in the language specifier, always include the filename in any code block you generate based on that file. The filename should be on the same line as the language specifier in your code block.
....
When using tools, follow the following guidelines:
Avoid calling tools unless they are absolutely necessary. For example, if you are asked a simple programming question you do not need web search. As another example, if the user asks you to explain something about code, do not create a new file.

I literally ran into the problem that I can't turn off "lazy" comments, so I periodically prompted the model to output code blocks with the complete code. This worked for 2-3 replies, but then the model started producing "lazy" comments again. "Lazy" comments make the "Insert at cursor" tool practically unusable: I have to select the changed sections of code above and below each lazy comment and copy/paste them manually. Also, this behavior (the forced system message and its actual content) is not covered in the docs!

I tried to remove it by forcing an empty systemMessage in the root config and customizing the model's system message, but it doesn't work.

Also, such an injection into the system message could ruin a user's system prompt that is, for example, structured to force CoT.

@ferenci84
Contributor Author

@OlhinAS Good to know that it's not just me who finds this useful. I'm thinking about an update to make the settings more intuitive and flexible.
It's an interesting point:

I tried to remove it by forcing an empty systemMessage in the root config and customizing the model's system message, but it doesn't work.

I tried the same out of desperation when I got annoyed with the default system message that cannot be removed.

The current behaviour, which is apparent only when you look into the code, is that if there is a systemMessage in the model config, the systemMessage in the root config is ignored.

What complicates this is that two defaults are added: one is the instructions about code changes and the other is the instructions about tool use. The code instructions are not added for every provider (for example, if I run Claude via AWS, they are not added), and the tool instructions are added only for models/providers with tool support, and only if any tools are enabled. I think a complete solution should allow the user to modify those two default instructions. This may also give the developers of this extension a more flexible way to experiment with different instructions in real-life projects without having to recompile everything.
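
To make the conditions above concrete, here is a minimal TypeScript sketch of how the final system message might be assembled. It is not Continue's code: `PromptContext`, `buildSystemMessage`, the field names, and the ordering of the parts are all assumptions for illustration.

```typescript
// Hypothetical sketch; field and function names are illustrative only.
interface PromptContext {
  providerAddsCodeInstructions: boolean; // assumption: differs per provider
  modelSupportsTools: boolean;
  enabledToolCount: number;
  userSystemMessage?: string; // model-level message, or the root one if absent
}

const CODE_INSTRUCTIONS = "When generating new code: ...";
const TOOL_INSTRUCTIONS = "When using tools, follow the following guidelines: ...";

function buildSystemMessage(ctx: PromptContext): string {
  const parts: string[] = [];

  // Code instructions are only added for some providers.
  if (ctx.providerAddsCodeInstructions) {
    parts.push(CODE_INSTRUCTIONS);
  }
  // Tool instructions need both tool support and at least one enabled tool.
  if (ctx.modelSupportsTools && ctx.enabledToolCount > 0) {
    parts.push(TOOL_INSTRUCTIONS);
  }
  // The user's own message is appended last (the ordering is an assumption).
  if (ctx.userSystemMessage) {
    parts.push(ctx.userSystemMessage);
  }
  return parts.join("\n\n");
}
```

The sketch shows why an empty custom message cannot remove the defaults: the two default parts are added independently of whatever the user supplies.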

@ferenci84
Contributor Author

Thinking about this configuration:

Bare-minimum version:
root config:
toolUseInstructions - ability to override tool use instructions
codeResponseInstructions - ability to override code response instructions

model config:
disableCodeResponseInstructions - ability to disable code response instructions for this model setup (useful for setups in which such instructions are not applicable)
disableToolUseInstructions - ability to disable tool use instructions for this model setup
(The options above could use an "enable" prefix instead if we want them to read more positively)
systemMessage - adding placeholders like "{code_response_instructions}{tool_use_instructions}"

Slightly more complicated version:

root config:
systemMessage - adding placeholders, default should be "{code_response_instructions}{tool_use_instructions}", this would
enableSystemMessagePlaceholders - enable placeholders - allows reverting to previous behaviour
toolUseInstructions - ability to override tool use instructions
codeResponseInstructions - ability to override code response instructions

model config:
disableCodeResponseInstructions - ability to disable code response instructions for this model setup (useful for setups in which such instructions are not applicable)
disableToolUseInstructions - ability to disable tool use instructions for this model setup
toolUseInstructions - ability to override tool use instructions for this model (empty string will force to be removed)
codeResponseInstructions - ability to override code response instructions for this model (empty string will force to be removed)
systemMessage - adding similar placeholders, but also adding {defaultSystemMessage} that will add the systemMessage from the root config.
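
One way to read the override rules in the more complicated version is sketched below in TypeScript. The option names follow the lists above (with the typos corrected); `resolveInstructions`, `pick`, the built-in constants, and the exact precedence order (disable flag, then model value, then root value, then built-in) are assumptions, since the proposal does not spell them out.

```typescript
// Hypothetical sketch of the proposed options; not an existing Continue API.
interface RootConfig {
  toolUseInstructions?: string;
  codeResponseInstructions?: string;
}

interface ModelConfig {
  disableCodeResponseInstructions?: boolean;
  disableToolUseInstructions?: boolean;
  toolUseInstructions?: string;
  codeResponseInstructions?: string;
}

const BUILT_IN_CODE = "When generating new code: ...";
const BUILT_IN_TOOL_USE = "When using tools, follow the following guidelines: ...";

function pick(
  disabled: boolean | undefined,
  modelValue: string | undefined,
  rootValue: string | undefined,
  builtIn: string,
): string {
  if (disabled) {
    return "";
  }
  // `??` rather than `||`: an explicit empty string at model level wins,
  // which is how "empty string will force to be removed" is modeled here.
  return modelValue ?? rootValue ?? builtIn;
}

function resolveInstructions(
  root: RootConfig,
  model: ModelConfig,
): { code: string; toolUse: string } {
  return {
    code: pick(
      model.disableCodeResponseInstructions,
      model.codeResponseInstructions,
      root.codeResponseInstructions,
      BUILT_IN_CODE,
    ),
    toolUse: pick(
      model.disableToolUseInstructions,
      model.toolUseInstructions,
      root.toolUseInstructions,
      BUILT_IN_TOOL_USE,
    ),
  };
}
```

The resolved `code` and `toolUse` strings would then be substituted for the {code_response_instructions} and {tool_use_instructions} placeholders in the systemMessage.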

@OlhinAS

OlhinAS commented Feb 18, 2025

@ferenci84 Thank you for such a detailed response.

In context of systemMessage only:
Don't you think that instead of adding more switches and complicating the configuration (of course it would be more flexible, but more parameters mean more chances for the config to go wrong), it would be better to generate the config on initial installation of the plugin with an editable, pre-filled root systemMessage? Simply move it from the hardcoded value into the config.

I am pretty sure that users writing systemMessages don't know about the additive nature of the hardcoded "Always produce a single code block...." instructions, so they think only about their own systemMessage. In that case, when they rewrite the root systemMessage or the model systemMessage (which has higher priority), they get only what they wrote.

If users want to add to it, they add; if not, they just rewrite it completely. Or leave it as is.


In context of ToolUse only:

I think it's not a good idea to completely disable the tool instructions. Prompting tool behaviour is quite tricky, since it can lead to an incorrect interpretation of the project root path and to attempts to write/use files, folders, or the terminal incorrectly. I have already faced this problem, and in such cases Continue gets stuck. That's not safe. But keeping a hardcoded prompt is not flexible either.

And I didn't get what disableCodeResponseInstructions is, sorry(

@ferenci84
Contributor Author

ferenci84 commented Feb 19, 2025

@OlhinAS Thank you for your inputs.

@sestinj Would you mind joining the conversation? I think any solution would be good that gives us control over the whole system message for specific model setups, if we want it, for any reason.

I tend to overcomplicate things. I agree with @OlhinAS that the bottom line is that if users write a custom systemMessage, they expect it to be used without any additions. They can just as easily copy-paste the default instructions into their custom system message if they want to, so there is no need for placeholders.

@sestinj
Contributor

sestinj commented Feb 21, 2025

Thanks for @ 'ing me to make sure I saw this

I agree with the principle of giving you full control. I'm going to take a first step of removing entirely the "lazy comments" system message in a release before next week. The reason I've taken my time in doing that is it's used for apply currently, so we needed a new solution there. Finally have one.

After that, I'd gladly revisit the further levels of control, though my strong preference (to really maximize simplicity) would be to just have a single system message that can be set by the user

@sestinj
Contributor

sestinj commented Feb 27, 2025

We've taken step #1 and removed the massive formatting prompt that was causing so many problems. I want to take the time to get more feedback on how this works before completely adding a new field to the config, but in the case that we do, I am heavily leaning toward the following:

models:
  - name: Example
    provider: anthropic
    model: claude-3-5-sonnet-latest
    apiKey: <API_KEY>
    promptTemplates:
      toolSystemMessage: "..."

We're already using this layout for a custom apply prompt for some models

@sestinj
Contributor

sestinj commented Feb 27, 2025

If possible, I'd like to move this conversation to issues for now

@ferenci84
Contributor Author

Replaced by #4507


3 participants