invalid values #229

@lwaldron

The following line in bugphyzzExports identifies invalid values and drops them. @sdgamboa, please raise such curation issues here so we can discuss whether each should be resolved by correcting the invalid values, adding them to the allowed vocabulary, or continuing to drop them. For some, dropping certainly does seem like the right choice for ASR, but for others (like aerophilicity and shapes) I'm not so sure.

https://github.com/waldronlab/bugphyzzExports/blob/a9fc18914cb3b1d9ea3a3d1c0121ccac5c8d482a/inst/scripts/export_bugphyzz.R#L126

[1] "Invalid values for aerophilicity: "
# A tibble: 3 × 2
  Attribute_group Attribute         
  <chr>           <chr>             
1 aerophilicity   facultative aerobe
2 aerophilicity   microaerotolerant 
3 aerophilicity   positive          
[1] "Invalid values for biosafety level: "
# A tibble: 6 × 2
  Attribute_group Attribute                                           
  <chr>           <chr>                                               
1 biosafety level "biosafety level Risk group (German classification)"
2 biosafety level "biosafety level 11o58'14.4\\\""                    
3 biosafety level "biosafety level Germany"                           
4 biosafety level "biosafety level 1+"                                
5 biosafety level "biosafety level 3**"                               
6 biosafety level "biosafety level L1"                                
[1] "Invalid values for disease association: "
# A tibble: 13 × 2
   Attribute_group     Attribute                                      
   <chr>               <chr>                                          
 1 disease association caries                                         
 2 disease association periodontal disorder                           
 3 disease association Infection caused by Escherichia coli (disorder)
 4 disease association Endocarditis                                   
 5 disease association Meningitis                                     
 6 disease association Periodontal Disorder                           
 7 disease association Infection                                      
 8 disease association arthritis                                      
 9 disease association meningitis septicemia                          
10 disease association septicemia arthritis                           
11 disease association Fever                                          
12 disease association urlnary tract infection                        
13 disease association Tetnus                                         
[1] "Invalid values for growth medium: "
# A tibble: 2,191 × 2
   Attribute_group Attribute                                                                                  
   <chr>           <chr>                                                                                      
 1 growth medium   NUTRIENT AGAR (DSMZ Medium 1)                                                              
 2 growth medium   Marine agar (MA)                                                                           
 3 growth medium   R2A MEDIUM (DSMZ Medium 830)                                                               
 4 growth medium   ACETIVIBRIO MEDIUM (DSMZ Medium 122)                                                       
 5 growth medium   Zobell marine agar (ZMA)                                                                   
 6 growth medium   MEDIUM 1 - for Acetobacter, Azotobacter, Gluconobacter, Gluconacetobacter, Mesorhizodium c
 7 growth medium   MEDIUM 85 - for Abiotrophia                                                                
 8 growth medium   GS2 agar plates                                                                            
 9 growth medium   TRYPTICASE SOY YEAST EXTRACT MEDIUM (DSMZ Medium 92)                                       
10 growth medium   MLO agar                                                                                   
# ℹ 2,181 more rows
# ℹ Use `print(n = ...)` to see more rows
[1] "Invalid values for shape: "
# A tibble: 20 × 2
   Attribute_group Attribute         
   <chr>           <chr>             
 1 shape           square            
 2 shape           vibriod cell      
 3 shape           rod-shaped        
 4 shape           coccus-shaped     
 5 shape           filament-shaped   
 6 shape           ellipsoidal       
 7 shape           pleomorphic-shaped
 8 shape           ovoid-shaped      
 9 shape           oval-shaped       
10 shape           other             
11 shape           sphere-shaped     
12 shape           spiral-shaped     
13 shape           curved-shaped     
14 shape           helical-shaped    
15 shape           vibrio-shaped     
16 shape           ring-shaped       
17 shape           spore-shaped      
18 shape           crescent-shaped   
19 shape           star-shaped       
20 shape           diplococcus-shaped
> 
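
To make the three options above concrete (correct the value, add it to the allowed vocabulary, or keep dropping it), here is a minimal base R sketch of how values for one attribute group might be triaged. This is not the code in export_bugphyzz.R: `triage_values`, `allowed_shapes`, and `shape_fixes` are hypothetical names, and the vocabulary and corrections are illustrative assumptions, not the actual bugphyzz vocabulary.

```r
# Hypothetical sketch; none of these names come from bugphyzzExports.
# Illustrative allowed vocabulary for the "shape" attribute group (assumed).
allowed_shapes <- c("rod", "coccus", "spiral")

# Illustrative corrections from invalid values to allowed ones (assumed).
shape_fixes <- c(
  "rod-shaped"    = "rod",
  "coccus-shaped" = "coccus",
  "spiral-shaped" = "spiral"
)

# Label each value as already valid, correctable, or unresolved.
triage_values <- function(values, allowed, fixes) {
  # Look up a correction by name; NA where no correction is defined.
  corrected <- unname(fixes[values])
  is_valid <- values %in% allowed
  data.frame(
    original = values,
    resolved = ifelse(is_valid, values, corrected),
    status = ifelse(
      is_valid, "valid",
      ifelse(!is.na(corrected), "corrected", "unresolved")
    ),
    stringsAsFactors = FALSE
  )
}

observed <- c("rod", "rod-shaped", "coccus-shaped", "star-shaped", "other")
result <- triage_values(observed, allowed_shapes, shape_fixes)
print(result)
```

Rows left as "unresolved" are the ones that would still need a decision here: extend the vocabulary or keep dropping them.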
