Description
Problem Statement
Individual SPLs sometimes list the same NDC more than once within the XML (and within the "Ingredients and Appearance" section). This prevents us from:
- Using NDC as a primary key
- Using NDC as a lookup_field
Example: https://dailymed.nlm.nih.gov/dailymed/drugInfo.cfm?setid=2524b253-069e-4028-819c-361b888df110
Criteria for Success
Either modify the Scrapy logic to remove duplicates (if appropriate), or modify Django REST Framework to use an auto-incrementing number as the primary key and stop using NDC as a lookup_field.
Success = 0 duplicate errors during Scrapy run.
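For the first option, here is a minimal sketch of order-preserving de-duplication. It assumes the spider collects the NDC strings for one SPL into a list before building the item. `dedupe_ndcs` is a hypothetical helper name, not existing project code.

```python
def dedupe_ndcs(ndcs):
    """Return the NDCs with repeats removed, keeping first-seen order."""
    # dict keys are unique and keep insertion order (Python 3.7+),
    # so this drops later repeats without reordering the list.
    return list(dict.fromkeys(ndcs))


# Hypothetical usage with values taken from the example SPL
print(dedupe_ndcs(["50458-178-00", "50458-178-15", "50458-178-00"]))
```

This only removes repeats within a single SPL. If the same NDC can also appear in a different SPL, the primary-key conflict would remain, and the second option is probably the safer route.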
Additional Information
The two XML documents I found are below:
- 3c8e9c87-0475-444b-bbe7-99620519a581.xml
- 8d4d72be-638b-11ea-918e-832cfc2ca371.xml
I pulled this from the API for one of them:
GET /spl/8d4d72be-638b-11ea-918e-832cfc2ca371/
HTTP 200 OK
Allow: GET, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept
{
"id": "8d4d72be-638b-11ea-918e-832cfc2ca371",
"set": "http://192.168.1.12:8000/set/2524b253-069e-4028-819c-361b888df110/",
"ndcs": [
"http://192.168.1.12:8000/ndc/50458-178-00/",
"http://192.168.1.12:8000/ndc/50458-178-15/",
"http://192.168.1.12:8000/ndc/50458-178-00/",
"http://192.168.1.12:8000/ndc/50458-178-20/",
"http://192.168.1.12:8000/ndc/50458-178-00/",
"http://192.168.1.12:8000/ndc/50458-178-12/",
"http://192.168.1.12:8000/ndc/50458-178-28/",
"http://192.168.1.12:8000/ndc/50458-178-06/",
"http://192.168.1.12:8000/ndc/50458-176-00/",
"http://192.168.1.12:8000/ndc/50458-176-15/",
"http://192.168.1.12:8000/ndc/50458-176-28/",
"http://192.168.1.12:8000/ndc/50458-176-06/"
]
}
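To make the duplication in the response above easier to see, here is a rough sketch that counts how often each NDC occurs. The list is the NDC portion of each URL in "ndcs", copied by hand; the variable names are illustrative.

```python
from collections import Counter

# NDC codes copied from the "ndcs" URLs in the response above, in order
ndcs = [
    "50458-178-00", "50458-178-15", "50458-178-00", "50458-178-20",
    "50458-178-00", "50458-178-12", "50458-178-28", "50458-178-06",
    "50458-176-00", "50458-176-15", "50458-176-28", "50458-176-06",
]

counts = Counter(ndcs)
# Keep only the NDCs that occur more than once
duplicates = {ndc: n for ndc, n in counts.items() if n > 1}
print(duplicates)  # {'50458-178-00': 3}
```

In this SPL, 12 entries collapse to 10 distinct NDCs, with 50458-178-00 listed three times.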