Steve edited this page Nov 5, 2016 · 8 revisions

We get_rest results from the Wikipedia REST API: https://en.wikipedia.org/api/rest_v1/

/page/mobile-text/ => OK for lead HTML
    json["sections"][0]["items"]
    - ignore hatnote
    - prune references
    - prune IPA audio
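
The pruning steps above could be sketched roughly as follows. This is a minimal sketch, not the project's actual code: it assumes each entry in `json["sections"][0]["items"]` is a dict with a `"type"` and a `"text"` (HTML) field, that hatnotes carry `type == "hatnote"`, and that reference markers are `<sup class="reference">` elements. The function name `lead_items` is hypothetical. IPA audio pruning is left out because the markup for it is not described here.

```python
import re

# Assumption: reference markers look like <sup ... class="reference">...</sup>.
REF_SUP = re.compile(
    r'<sup[^>]*class="[^"]*\breference\b[^"]*"[^>]*>.*?</sup>',
    re.DOTALL,
)


def lead_items(data):
    """Return the lead section's HTML fragments with hatnotes and
    reference markers removed (hypothetical helper, assumed item shape)."""
    sections = data.get("sections") or [{}]
    fragments = []
    for item in sections[0].get("items", []):
        if item.get("type") == "hatnote":  # ignore hatnote
            continue
        text = item.get("text")
        if not text:  # skip items that carry no text (e.g. images)
            continue
        fragments.append(REF_SUP.sub("", text))  # prune references
    return fragments
```

Here `data` is the already-parsed JSON response, so the helper can be exercised against a saved response without any network access.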

/page/mobile-sections-lead/ => NOPE
    json["sections"][0]["text"]
    - first paragraph, followed by Infobox, lead section remainder
    - needs parsing to extract lead paragraphs
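
To illustrate the kind of parsing this endpoint would need, here is a rough sketch using only the standard library's `html.parser`. It assumes the Infobox is a `<table>` and that lead paragraphs are `<p>` elements outside any table. The names `LeadParagraphs` and `lead_paragraphs` are hypothetical, and the output is plain text rather than HTML.

```python
from html.parser import HTMLParser


class LeadParagraphs(HTMLParser):
    """Collect the text of <p> elements that are not inside a <table>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.table_depth = 0   # > 0 while inside an (assumed) Infobox table
        self.in_p = False
        self.buf = []
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_depth += 1
        elif tag == "p" and self.table_depth == 0:
            self.in_p = True
            self.buf = []

    def handle_endtag(self, tag):
        if tag == "table" and self.table_depth > 0:
            self.table_depth -= 1
        elif tag == "p" and self.in_p:
            text = "".join(self.buf).strip()
            if text:
                self.paragraphs.append(text)
            self.in_p = False

    def handle_data(self, data):
        if self.in_p and self.table_depth == 0:
            self.buf.append(data)


def lead_paragraphs(data):
    """Extract lead paragraphs from json["sections"][0]["text"]."""
    parser = LeadParagraphs()
    parser.feed(data["sections"][0]["text"])
    parser.close()
    return parser.paragraphs
```

The extra parsing pass is the cost that makes this endpoint less attractive than one that already returns the lead as separate items.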

/page/summary/ => NOPE
    json["extract"]
    - first few sentences of plain text (truncated lead)
    + close to Google text
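
For comparison, reading the summary extract needs no HTML handling at all. The sketch below builds the request URL from the base address given at the top of this page and pulls out `json["extract"]`. The helper names and the User-Agent string are placeholders, not part of the original notes.

```python
import json
import urllib.parse
import urllib.request

BASE = "https://en.wikipedia.org/api/rest_v1"


def summary_url(title):
    """Build the /page/summary/ URL for a title (spaces become underscores)."""
    slug = urllib.parse.quote(title.replace(" ", "_"), safe="")
    return f"{BASE}/page/summary/{slug}"


def extract_text(data):
    """Return the plain-text extract from a parsed summary response."""
    return data.get("extract", "")


def fetch_summary(title):
    """Fetch and parse the summary JSON (performs a network request)."""
    request = urllib.request.Request(
        summary_url(title),
        headers={"User-Agent": "lead-notes-example/0.1"},  # placeholder value
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `extract_text(fetch_summary("Alan Turing"))` would then return the truncated plain-text lead described above.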