forked from ebroder/debothena
It looks like some RFCs use different markup from others, which causes the RFC fetcher to fail to extract the title.
For example, https://datatracker.ietf.org/doc/html/rfc3986 uses
<span class="h1">Uniform Resource Identifier (URI): Generic Syntax</span>
and is successfully matched by
title = xml.xpath('string(//span[@class="h1"])')
but e.g., https://datatracker.ietf.org/doc/html/rfc9110 uses
<h1 id="title">HTTP Semantics</h1>
and is not matched.
It's possible that adding
if not title:
    title = xml.xpath('string(//h1[@id="title"])')
to fetch_rfc in https://github.com/sipb/chiron/blob/master/chiron_bot/fetchers.py
might help, but I didn't test this.
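To illustrate the fallback idea in isolation, here is a minimal sketch. It is not the chiron code: it uses the standard library's xml.etree.ElementTree on small well-formed snippets instead of the parser behind `xml.xpath(...)` in fetch_rfc, and the names `TITLE_PATHS` and `extract_title` are hypothetical. The two snippets are trimmed-down copies of the markup quoted above, not the full pages.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of known title markups, tried in order.
TITLE_PATHS = (
    ".//span[@class='h1']",  # older rendering, e.g. rfc3986
    ".//h1[@id='title']",    # newer rendering, e.g. rfc9110
)


def extract_title(root):
    """Return the first non-empty title found, or '' if no path matches."""
    for path in TITLE_PATHS:
        node = root.find(path)
        if node is not None:
            # Join all nested text, similar in spirit to XPath string().
            title = "".join(node.itertext()).strip()
            if title:
                return title
    return ""


# Trimmed-down stand-ins for the two page styles quoted in this issue.
old_style = ET.fromstring(
    '<html><body><span class="h1">'
    "Uniform Resource Identifier (URI): Generic Syntax"
    "</span></body></html>"
)
new_style = ET.fromstring(
    '<html><body><h1 id="title">HTTP Semantics</h1></body></html>'
)

print(extract_title(old_style))
print(extract_title(new_style))
```

Keeping the paths in a tuple means a third markup variant, if one turns up, would only need one more entry rather than another `if not title:` block.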