If I want to use my own text data to pre-train an ELECTRA model from scratch, what format should the text be in? Is sentence segmentation enough, or is more preprocessing needed? Please help.