Skip to content

toke.c - only try to shrink an SV buffer when conceivably possible #23293

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: blead
Choose a base branch
from

Conversation

richardleach
Copy link
Contributor

A few places within toke.c try to return unused string buffer space by doing something like:

if (SvCUR(sv) + 5 < SvLEN(sv)) {
    SvPV_shrink_to_cur(sv);
}

The rationale for the 5 is not commented upon. Maybe it's a random thumb in the air, maybe it was 1 byte for the trailing null plus a 4 byte pointer to account for allocation alignment?

Either way, on a platform with 8 byte pointers, that code could try to shrink buffers when there's no realistic chance of doing so. (The space between chunk sizes will be at least 1x PTRSIZE, often 2x PTRSIZE at smaller sizes and then progressively more.)

For a reallocation that is at least has the potential to be meaningful, the difference between CUR and LEN should be at least:

  • 1 byte for a trailing null byte +
  • 1 byte to enable the SV to be COWed +
  • 1x PTRSIZE

This commit changes makes that so.


  • This set of changes does not require a perldelta entry.

A few places within toke.c try to return unused string buffer space by
doing something like:

    if (SvCUR(sv) + 5 < SvLEN(sv)) {
        SvPV_shrink_to_cur(sv);
    }

The rationale for the 5 is not commented upon. Maybe it's a random
thumb in the air, maybe it was 1 byte for the trailing null plus
a 4 byte pointer to account for allocation alignment?

Either way, on a platform with 8 byte pointers, that code could
try to shrink buffers when there's no realistic chance of doing
so. (The space between chunk sizes will be at least 1x PTRSIZE,
often 2x PTRSIZE at smaller sizes and then progressively more.)

For a reallocation that is at least has the potential to be
meaningful, the difference between CUR and LEN should be at least:
* 1 byte for a trailing null byte +
* 1 byte to enable the SV to be COWed +
* 1x PTRSIZE

This commit changes makes that so.
@richardleach richardleach added the defer-next-dev This PR should not be merged yet, but await the next development cycle label May 15, 2025
@richardleach
Copy link
Contributor Author

#23111 is of the same theme.

As with that PR, only doing 1x PTRSIZE is quite conservative and perhaps we would be happy with a larger multiple.

I did a quick instrument of make perl for the change in S_scan_const - on a 64bit platform - and saw:

  • 137220 calls to SvPV_shrink_to_cur
  • 90409 to shrink by 1x PTRSIZE or less
  • 19088 to shrink by between 1-2x PTRSIZE
  • 22584 to shrink by between 2-3x PTRSIZE
  • 5161 to shrink by more than 3x PTRSIZE

@ilmari
Copy link
Member

ilmari commented May 15, 2025

Instead of repeating that logic at every callsite, why not move it into SvPV_shrink_to_cur or a new wrapper macro/function?

@richardleach
Copy link
Contributor Author

Instead of repeating that logic at every callsite, why not move it into SvPV_shrink_to_cur or a new wrapper macro/function?

Yeah, that's definitely an option.

There's one call in each of pp_hot.c and pp_sys.c that guard with if (SvLEN(sv) - SvCUR(sv) > 20), but there are no comments to explain why 20 specifically, or why more than the 5 in toke.c. If there's no real reason to have differing numbers, the check can definitely be moved into SvPV_shrink_to_cur.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
defer-next-dev This PR should not be merged yet, but await the next development cycle
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants