AJAX POST requests not replayed correctly - causing timestamp mismatch #988

@rosemlondon

Description

Expected behavior

We have one example of a URI where AJAX POST requests are captured but, on replay in PyWB, resolve to a capture of the resource with the wrong timestamp. The page in question has a ‘Load More’ button, and the result is that a chronological list of articles replays out of chronological order.

https://webarchive.nationalarchives.gov.uk/ukgwa/20250506041746/https://travelhealthpro.org.uk/outbreaks

The WARC data itself is intact and correctly captured. The problem can be isolated to the replay environment.

The page (https://travelhealthpro.org.uk/outbreaks) sends a POST request to https://travelhealthpro.org.uk/ajax_more.php for the "load more" functionality.

The crawler captured this with a text/html content type, matching the content type of the live response. However, the replay engine issues these requests with a content type of application/x-www-form-urlencoded; charset=UTF-8, which suggests that this mismatch is the likely cause of the issue.
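
To make the comparison reproducible, the following is a minimal sketch (not taken from PyWB or from the site's own JavaScript) that builds the same POST request for the live endpoint and for the replay endpoint without sending it. The replay URL is an assumption based on the prefix and timestamp pattern of the page capture above, and the form field `offset` is purely hypothetical, since the real form fields are not listed in this report. Note that the Content-Type on a request describes the POST body, whereas text/html describes the response, so it may be worth inspecting the two headers separately.

```python
from urllib.parse import urlencode
from urllib.request import Request

LIVE_ENDPOINT = "https://travelhealthpro.org.uk/ajax_more.php"
# Assumption: the replay endpoint uses the same prefix and timestamp
# as the page capture linked in this report.
REPLAY_PREFIX = "https://webarchive.nationalarchives.gov.uk/ukgwa/20250506041746/"


def build_load_more_request(endpoint, form_fields):
    """Build (but do not send) a form-encoded POST request."""
    body = urlencode(form_fields).encode("utf-8")
    req = Request(endpoint, data=body, method="POST")
    req.add_header(
        "Content-Type", "application/x-www-form-urlencoded; charset=UTF-8"
    )
    return req


# Hypothetical form field; the real field names are not given in this report.
fields = {"offset": "10"}

live_req = build_load_more_request(LIVE_ENDPOINT, fields)
replay_req = build_load_more_request(REPLAY_PREFIX + LIVE_ENDPOINT, fields)

for req in (live_req, replay_req):
    print(req.get_method(), req.full_url)
    print("  request Content-Type:", req.get_header("Content-type"))
    print("  body:", req.data.decode("utf-8"))
```

Passing either request to `urllib.request.urlopen` and reading `response.headers.get_content_type()` would then show the response content type for each endpoint, which can be compared against the text/html recorded in the WARC.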

What actually happened

  1. Open this particular capture instance: https://webarchive.nationalarchives.gov.uk/ukgwa/20250506041746/https://travelhealthpro.org.uk/outbreaks
  2. Observe that all the ‘Outbreaks’ listed on the first page are dated 01.05.2025. This is to be expected as the crawl was launched on 06.05.2025.
  3. However, when the ‘Load More’ button is clicked, the next batch of outbreaks is dated June 2025, after the crawl had finished.
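
The check in the steps above can be automated. The sketch below, under stated assumptions, extracts the 14-digit capture timestamp from the replay URL and flags any `dd.mm.yyyy` date in a replayed HTML fragment that falls after the capture date. The two HTML fragments are hypothetical stand-ins for the first page and the ‘Load More’ response; they are not copied from the site.

```python
import re
from datetime import date, datetime

REPLAY_URL = (
    "https://webarchive.nationalarchives.gov.uk/ukgwa/20250506041746/"
    "https://travelhealthpro.org.uk/outbreaks"
)


def capture_date(replay_url):
    """Return the capture date encoded as a 14-digit timestamp in the URL."""
    match = re.search(r"/(\d{14})/", replay_url)
    if match is None:
        raise ValueError("no 14-digit timestamp found in replay URL")
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S").date()


def dates_after_capture(html, captured_on):
    """Return all dd.mm.yyyy dates in `html` that are later than `captured_on`."""
    later = []
    for day, month, year in re.findall(r"\b(\d{2})\.(\d{2})\.(\d{4})\b", html):
        try:
            found = date(int(year), int(month), int(day))
        except ValueError:
            continue  # skip strings that look like dates but are not valid
        if found > captured_on:
            later.append(found)
    return later


# Hypothetical fragments standing in for the replayed responses.
first_page = "<li>01.05.2025 Example outbreak</li>"
load_more = "<li>12.06.2025 Example outbreak</li>"

captured_on = capture_date(REPLAY_URL)
print("captured on:", captured_on)
print("first page, dates after capture:", dates_after_capture(first_page, captured_on))
print("load more, dates after capture:", dates_after_capture(load_more, captured_on))
```

A non-empty list for a replayed response means the content cannot have come from the capture identified in the URL.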

Browser

  • OS: MacOS Tahoe 26.3.1
  • Browser: Chrome 146.0.7680.165