Recover lost PTT data #27

Following https://github.com/disinfoRG/ZeroScraper/issues/105, we now have these articles snapshotted from PTTRead.  We also have the PTTRead parser ready with #25.  To get them into the datasets, we still need a way to switch between the PTT and PTTRead parsers for these snapshots.  Since the ZeroScraper project is concerned only with scraping, it seems more reasonable to leave the choice of parser to the ArticleParser project.  That means we should somehow replicate here the information in the `SnapshotLoss` table in the scraper db.

I think this will happen again in the future, so it is better to build a mechanism for it.  We can:

* Add a "parser" field in `publication_mapping.info`.
* Add a CLI option for `ap-parse.py` to manually choose a parser for one article, overriding the default parser.  This information should be recorded in `publication_mapping.info`.
* Have the program always check `publication_mapping.info` to see if a parser is specified when updating a publication; use the default parser if there is none.
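
The three points above could fit together roughly as in the sketch below. It is only an illustration under assumptions: `publication_mapping.info` is treated as a plain dict (for example a decoded JSON column), the `--parser` flag name, the `PARSERS` registry, and the `choose_parser` helper are all hypothetical, and the two parser functions are stand-ins rather than the real ArticleParser code.

```python
import argparse

# Stand-in parsers (hypothetical); the real PTT and PTTRead parsers live
# elsewhere in ArticleParser and will have different signatures.
def parse_ptt(snapshot: str) -> dict:
    return {"parsed_by": "ptt", "length": len(snapshot)}

def parse_pttread(snapshot: str) -> dict:
    return {"parsed_by": "pttread", "length": len(snapshot)}

# Hypothetical registry mapping a parser name to its function.
PARSERS = {"ptt": parse_ptt, "pttread": parse_pttread}

def choose_parser(info: dict, default: str, override=None):
    """Pick a parser name and return it with the (possibly updated) info.

    Priority: manual override, then the "parser" field recorded in info,
    then the default parser. An override is recorded in the returned info
    so that later updates of the same publication reuse it.
    """
    name = override or info.get("parser") or default
    if name not in PARSERS:
        raise ValueError(f"unknown parser: {name}")
    updated = dict(info)  # copy, so the caller's dict is left untouched
    if override is not None:
        updated["parser"] = override
    return name, updated

def build_arg_parser() -> argparse.ArgumentParser:
    # Assumed shape of the new option; ap-parse.py may define its CLI differently.
    ap = argparse.ArgumentParser(description="Parse one article snapshot")
    ap.add_argument("--parser", choices=sorted(PARSERS), default=None,
                    help="override the default parser for this article")
    return ap
```

With this shape, a manual run with `--parser pttread` would store `"parser": "pttread"` in the info, and a later update without the flag would still pick the PTTRead parser because the recorded field takes precedence over the default.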

