Unable to reserve cache with key _state #1090

Closed
2 of 5 tasks
kamaradclimber opened this issue Sep 6, 2023 · 11 comments

@kamaradclimber

Description:

The processed-issue cache cannot be saved, so the action goes over the same PRs over and over.

Action version:

b69b346

Platform:

  • Ubuntu
  • macOS
  • Windows

Runner type:

  • Hosted
  • Self-hosted

Repro steps:

Just wait for the action to launch

name: 'Stale issue handler'
on:
  workflow_dispatch:
  schedule:
    - cron: '0 */2 * * *'

permissions:
  contents: read
  issues: read
  pull-requests: write

jobs:
  stale:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/stale@b69b346013879cedbf50c69f572cd85439a41936
        id: stale
        with:
          stale-pr-message: 'This PR is stale because it has been open 30 days with no activity. Remove stale label (or add the `never-stale` label) otherwise this PR will be closed in 30 days'
          close-pr-message: 'This PR has been closed because it remained without activity for 30 days and in "stale" state for 30 days. Of course, you can reopen it if necessary.'
          days-before-stale: 30
          days-before-close: 30
          close-pr-label: 'closed-reason: stale'
          exempt-pr-labels: 'never-stale'
          exempt-draft-pr: true # warning: this consumes 1 additional API request per PR
          operations-per-run: 100 # let’s increase batch size if necessary
      - name: Print outputs
        run: echo ${{ join(steps.stale.outputs.*, ',') }}

Expected behavior:

The cache is used so that successive runs eventually go over all PRs (1000+ in our repo).

Actual behavior:

The action goes over the same 500 PRs on every run.

In the action logs:

Github API rate remaining: 14887; reset at: Wed Sep 06 2023 06:27:06 GMT+0000 (Coordinated Universal Time)
state: persisting info about 459 issue(s)
/usr/bin/tar --posix -cf cache.tzst --exclude cache.tzst -P -C /home/runner/work/logs-backend/logs-backend --files-from manifest.txt --use-compress-program zstdmt
Failed to save: Unable to reserve cache with key _state, another job may be creating this cache. More details: Cache already exists. Scope: refs/heads/prod, Key: _state, Version: fa41d75081481069cfb6b92a5f83a94c6e06ef3ab2e6b762649ac5f86f46153f

The cache file _state does exist, but it was apparently created once and never updated.

@kamaradclimber added the bug (Something isn't working) and needs triage labels Sep 6, 2023
@dsame removed the needs triage label Sep 6, 2023
@dsame
Contributor

dsame commented Sep 6, 2023

Hello @kamaradclimber, I see the log message "another job may be creating this cache". Is that really the case?
As a first attempt to solve the problem, can you please remove the cache manually and confirm that the problem occurs again even though no other job runs the stale action?

To remove the cache manually, open the Actions tab, click Caches under the Management heading in the sidebar, and then click the delete button next to the _state cache.

@kamaradclimber
Author

Hello @dsame, thanks for your reply.
I’m relatively confident no other job is trying to update or create this file, since it was not present before I updated the action yesterday.
I already tried deleting the _state file once (yesterday, when seeing this for the first time) and let the action create it again (run n+1); it then failed again (run n+2 and following).

@kamaradclimber
Author

I got a bit further this time:

  1. Delete the cache file.
  2. Launch the action manually:
The saved state was not found, the process starts from the first issue.
...
Github API rate used: 101
Github API rate remaining: 14731; reset at: Wed Sep 06 2023 09:04:45 GMT+0000 (Coordinated Universal Time)
state: persisting info about 452 issue(s)
/usr/bin/tar --posix -cf cache.tzst --exclude cache.tzst -P -C /home/runner/work/logs-backend/logs-backend --files-from manifest.txt --use-compress-program zstdmt
Cache Size: ~0 MB (1168 B)
Cache saved successfully

The cache file now exists as expected.

  3. Relaunch the job a second time:

The saved state was not found, the process starts from the first issue.
...
Github API rate used: 101
Github API rate remaining: 14628; reset at: Wed Sep 06 2023 09:04:45 GMT+0000 (Coordinated Universal Time)
state: persisting info about 452 issue(s)
/usr/bin/tar --posix -cf cache.tzst --exclude cache.tzst -P -C /home/runner/work/logs-backend/logs-backend --files-from manifest.txt --use-compress-program zstdmt
Failed to save: Unable to reserve cache with key _state, another job may be creating this cache. More details: Cache already exists. Scope: refs/heads/prod, Key: _state, Version: fa41d75081481069cfb6b92a5f83a94c6e06ef3ab2e6b762649ac5f86f46153f

So my understanding is that the method checking for cache existence is not working as expected: the cache did exist, so I would have expected it to be found on the second run.

@jonstacks

I was recently hitting the same thing using the latest main and adding

permissions:
  actions: write

fixed it for me.

I'm not entirely sure, but what I think is happening is that these calls use an octokit client with the GITHUB_TOKEN, whose permissions I believe can vary based on user settings. However, @actions/cache uses an environment variable for uploading and restoring the cache.

I think giving it actions:write also grants it more permissions than it probably needs, but it worked for me. I haven't looked too deep but I wonder if those other 2 octokit calls could use the @actions/cache client and get their permissions from the ACTIONS_RUNTIME_TOKEN.
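For reference, a minimal sketch of this workaround applied to the permissions block from the original report. The first three entries are copied from the issue description; the only addition is `actions: write`. Whether that scope is broader than strictly needed remains an open question in this thread.

```yaml
permissions:
  contents: read
  issues: read
  pull-requests: write
  # Added as a workaround: the GITHUB_TOKEN-based cache calls appear to need this scope.
  actions: write
```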

@kamaradclimber
Author

Indeed the workaround is correct:

state: persisting info about 609 issue(s)
/usr/bin/tar --posix -cf cache.tzst --exclude cache.tzst -P -C /home/runner/work/logs-backend/logs-backend --files-from manifest.txt --use-compress-program zstdmt
Cache Size: ~0 MB (1512 B)
Cache saved successfully

Thanks @jonstacks . I’ll keep the permission until there is a fix in the action.

@dsame self-assigned this Sep 8, 2023
@dsame
Contributor

dsame commented Sep 8, 2023

Hello @kamaradclimber, I suspect that

permissions:
  contents: write
  issues: write

as mentioned in the README, could be enough, since the action was tested with these permissions.

Thank you @jonstacks for the note, but the ACTIONS_RUNTIME_TOKEN variable is not documented and should not be used. The set of required permissions is given in the README, so those should be tried first.

@kamaradclimber
Author

OK, I’ll try with those options as well. The README was unclear about the fact that those write permissions might be required for stateful operation.

@jonstacks

@dsame, sorry for the confusion. I worded it poorly; I didn't mean to suggest that the ACTIONS_RUNTIME_TOKEN variable should be used directly, since it is undocumented. My thought was that, instead of using an octokit client to check the cache, the native @actions/cache library (which internally makes use of the ACTIONS_RUNTIME_TOKEN and is already used) could be the only way the action interacts with the cache. It would simply handle the error if restoring from a particular cache key fails, rather than proactively using an octokit client with the GITHUB_TOKEN to check whether the cache exists.

From the GitHub docs:

The permissions for the GITHUB_TOKEN are initially set to the default setting for the enterprise, organization, or repository. If the default is set to the restricted permissions at any of these levels then this will apply to the relevant repositories. For example, if you choose the restricted default at the organization level then all repositories in that organization will use the restricted permissions as the default. The permissions are then adjusted based on any configuration within the workflow file, first at the workflow level and then at the job level. Finally, if the workflow was triggered by a pull request from a forked repository, and the Send write tokens to workflows from pull requests setting is not selected, the permissions are adjusted to change any write permissions to read only

I think this means that the GITHUB_TOKEN might work without adding actions: write in some organizations or repositories, but others will have issues if their organization policy uses restricted permissions. I think using only @actions/cache would bypass this issue.
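To illustrate the adjustment order described in the quoted docs (workflow level first, then job level), here is a hedged sketch. The workflow name, job name, and action reference are taken from the original report; the split between workflow-level and job-level blocks is an assumption made for illustration, not something tested in this thread.

```yaml
name: 'Stale issue handler'
on:
  workflow_dispatch:

# Workflow-level block: the default for every job that does not declare its own.
permissions:
  contents: read

jobs:
  stale:
    runs-on: ubuntu-latest
    # Job-level block: takes precedence over the workflow-level block for this job.
    # Scopes not listed in an explicit permissions block are set to none.
    permissions:
      contents: read
      issues: read
      pull-requests: write
      actions: write # workaround discussed above
    steps:
      - uses: actions/stale@b69b346013879cedbf50c69f572cd85439a41936
```

Declaring the scopes explicitly in the workflow file means the job does not depend on the organization or repository default, whether that default is restricted or permissive.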

@dsame
Contributor

dsame commented Sep 11, 2023

Hello @jonstacks, many thanks for the clarification. The problem is that @actions/cache uses an undocumented, limited API that is different from the GitHub API. Using two different clients is a known flaw caused by the limitations of the cache API, and a fix is planned. For now, though, granting actions: write or contents: write is the only solution.

Does this answer help?

@dsame
Contributor

dsame commented Oct 18, 2023

@jonstacks, I am closing the issue as resolved with a workaround, but please feel free to reopen it or create a new one if the problem still exists.

@dsame closed this as completed Oct 18, 2023
kamontat added a commit to kc-workspace/kcws-js that referenced this issue Jul 6, 2024
kevingranade added a commit to CleverRaven/Cataclysm-DDA that referenced this issue Jul 23, 2024
This is required to allow stateful operation as per actions/stale#1090
@PlayLikeNeverB4

For people coming from Google: I needed to add actions: write, but also #1136.
