I had a look at the guiact-web-multi datasets (l1-l3).
The conversations I saw look like this (e.g. for uid_record_07674_step_00):
| [{'end_turn': None, 'from': 'system', 'recipient': None, 'value': 'You are a GUI agent. You are given a task and a screenshot of the screen. You need to perform a series of pyautogui actions to complete the task.\n\nYou have access to the following functions:\n- {"name": "answer", "description": "Answer a question", "parameters": {"type": "object", "properties": {"answer": {"type": "string", "description": "The answer to the question"}}, "required": ["answer"]}}\n'} {'end_turn': None, 'from': 'human', 'recipient': None, 'value': "\nPlease generate the next move according to the UI screenshot, instruction and previous actions.\n\nInstruction: What information does the site provide about Judith Lauand's career, works and exhibitions?\n\nPrevious actions:\nNone"} {'end_turn': False, 'from': 'gpt', 'recipient': 'all', 'value': "Observation: The screenshot shows a dropdown menu on MutualArt with search results for 'Judith Lauand'. There are multiple entries detailing her exhibitions and mentions, such as in São Paulo, London, and various art reviews.\nThought: The goal is to gather information about Judith Lauand’s career, works, and exhibitions. The list provides various leads, each likely directing to detailed pages about specific aspects of her career. Selecting an option from the dropdown is essential to access more detailed information.\nAction: Click on the link labeled 'Judith Lauand: Brazilian 1922-2022' to explore more about her career and exhibitions.\n"} {'end_turn': True, 'from': 'gpt', 'recipient': 'os', 'value': 'pyautogui.click(x=0.41, y=0.178)'}]
Why is there only the answer function specified, not the other ones (clicking, scrolling, ...)?
Hi, as mentioned in the paper, clicking and scrolling are built-in functions of pyautogui, so we don't need to list them explicitly in the system prompt: they are not additional knowledge. In other words, we train this knowledge into the model itself instead of teaching it to read it from the system prompt.
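To make the distinction concrete, here is a minimal sketch (not code from the repository) that separates the two kinds of actions in a conversation like the one quoted above: functions declared in the system prompt versus `pyautogui` calls the model is expected to know already. The helper names `declared_functions` and `classify_action` are hypothetical, and the bare-call form `answer(answer="...")` for a declared function is an assumption, since the sample only shows a `pyautogui.click` action.

```python
import ast
import json

# System prompt copied from the sample conversation in this issue.
SYSTEM_PROMPT = (
    "You are a GUI agent. You are given a task and a screenshot of the screen. "
    "You need to perform a series of pyautogui actions to complete the task.\n\n"
    "You have access to the following functions:\n"
    '- {"name": "answer", "description": "Answer a question", "parameters": '
    '{"type": "object", "properties": {"answer": {"type": "string", '
    '"description": "The answer to the question"}}, "required": ["answer"]}}\n'
)


def declared_functions(system_prompt):
    """Return the names of the functions listed as '- {json}' lines in the prompt."""
    names = []
    for line in system_prompt.splitlines():
        line = line.strip()
        if line.startswith("- {"):
            names.append(json.loads(line[2:])["name"])
    return names


def classify_action(action, declared):
    """Classify an action string as a pyautogui call, a declared function, or unknown."""
    node = ast.parse(action.strip(), mode="eval").body
    if not isinstance(node, ast.Call):
        raise ValueError(f"not a function call: {action!r}")
    func = node.func
    # pyautogui.<name>(...): built-in, never listed in the system prompt
    if (
        isinstance(func, ast.Attribute)
        and isinstance(func.value, ast.Name)
        and func.value.id == "pyautogui"
    ):
        return ("pyautogui", func.attr)
    # <name>(...): assumed call form for functions declared in the system prompt
    if isinstance(func, ast.Name) and func.id in declared:
        return ("declared", func.id)
    return ("unknown", action.strip())


declared = declared_functions(SYSTEM_PROMPT)
print(declared)
print(classify_action("pyautogui.click(x=0.41, y=0.178)", declared))
```

Under these assumptions, only `answer` shows up as a declared function, while the `pyautogui.click` action from the sample is recognized as a built-in call that needs no entry in the system prompt.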