Note: csv_utf8-BOM_DH_VirusSeq_Portal.tsv starts with ...study, i.e. the bytes ef bb bf (the UTF-8 byte order mark), unlike the other two files.
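The leading ef bb bf bytes noted above can be checked for directly. The snippet below is a minimal sketch, not part of the portal code; the function name has_utf8_bom is hypothetical.

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file at `path` starts with the UTF-8 BOM (EF BB BF)."""
    # Read in binary mode so the BOM is not silently stripped by a text decoder.
    with open(path, "rb") as handle:
        return handle.read(3) == codecs.BOM_UTF8
```

Calling has_utf8_bom on the affected TSV should return True, and False on the files that upload successfully.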
Behaviour on clinical portal
The BOM prevents the file from being uploaded; however, if the encoding is fixed, the upload succeeds.
Solution
Suggestion
Add an encoding step to submission that converts the submitted TSV file (regardless of its encoding) into UTF-8 without a BOM.
Python script used
import os
import argparse


def convert_encoding(bom_file, new_file):
    with open(bom_file, 'r', encoding='utf-8-sig') as infile:
        content = infile.read()
    # Write the content back with utf-8 encoding (no BOM)
    with open(new_file, 'w', encoding='utf-8') as outfile:
        outfile.write(content)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Convert UTF-8-SIG encoded file to UTF-8.')
    parser.add_argument('-i', '--input_file', help='Path to the input file (UTF-8 with BOM)', required=True)
    parser.add_argument('-o', '--output_file', default=None, help='Path to save the output file (UTF-8 without BOM)')
    args = parser.parse_args()
    bom_file = os.path.abspath(args.input_file)
    if args.output_file:
        deBOMed_file = os.path.abspath(args.output_file)
    else:
        deBOMed_file = os.path.join(os.path.dirname(bom_file), "deBOMed_%s" % os.path.basename(bom_file))
    convert_encoding(bom_file, deBOMed_file)
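The script above works on files on disk and only handles a UTF-8 BOM. For the suggested submission step, the conversion might instead run on the uploaded bytes. The following is a rough sketch under stated assumptions: the name normalize_to_utf8 is hypothetical, it is not the portal's actual code, and it only covers BOM-marked UTF-8/UTF-16/UTF-32 input plus plain UTF-8. Truly arbitrary encodings cannot be detected reliably without extra information.

```python
import codecs

# BOM signatures mapped to a codec that strips the BOM while decoding.
# UTF-32 is checked before UTF-16 because the UTF-32-LE BOM (FF FE 00 00)
# begins with the UTF-16-LE BOM (FF FE).
_BOM_CODECS = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def normalize_to_utf8(raw: bytes) -> bytes:
    """Return `raw` re-encoded as UTF-8 without a BOM (hypothetical helper)."""
    for bom, encoding in _BOM_CODECS:
        if raw.startswith(bom):
            return raw.decode(encoding).encode("utf-8")
    # No BOM found: assume the input is already UTF-8.
    # Input in any other encoding raises UnicodeDecodeError here.
    return raw.decode("utf-8").encode("utf-8")
```

Raising on undecodable input, rather than guessing an encoding, keeps a bad file from being silently corrupted; the submitter would see a clear encoding error instead.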
Description
As a data submitter for clinical submission,
I want to submit clinical data files with UTF-8-BOM encoding,
but I am facing an error during upload.
Troubleshooting errors
Acceptance criteria
As a user, I should be allowed to upload files with no encoding restrictions on VirusSeq.