School project exploring sequence-to-sequence (seq2seq) models for machine translation.
Development was done in the Google Colab environment.
main.ipynb - notebook containing the main logic for training, selecting, and evaluating models.
preprocessing.py - script with utility functions for data preprocessing.
models.py - script with utility functions for creating the different models.
wikipedia.ipynb - notebook that downloads and translates a Wikipedia article and saves the result to a CSV file.
Folder Data contains the datasets. The GloVe embeddings are not included, but they can be downloaded from https://nlp.stanford.edu/projects/glove/.
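Once a GloVe file has been downloaded, it can be read into a word-to-vector mapping. The snippet below is a minimal sketch, not the code used in this project: the function name `load_glove` is hypothetical, and it assumes the standard GloVe text format (one token per line followed by space-separated float values) with tokens that contain no spaces.

```python
from typing import Dict, List


def load_glove(path: str) -> Dict[str, List[float]]:
    """Parse a GloVe-style text file into a word -> vector mapping.

    Assumption: each line is "<token> <v1> <v2> ... <vN>", separated by
    single spaces, and the token itself contains no spaces.
    """
    embeddings: Dict[str, List[float]] = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip().split(" ")
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            word, values = parts[0], parts[1:]
            embeddings[word] = [float(v) for v in values]
    return embeddings
```

A call might look like `load_glove("Data/glove.6B.100d.txt")`, where the path is only an example; the actual file name depends on which GloVe archive is downloaded and where it is placed.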
Folder Results contains some training metrics and translations generated by the models.
Folder Models, which would contain the saved models, is not included here. The saved models are available at: https://drive.google.com/drive/folders/1OfbNaHTfRLlajUzOFnUElVZp4Q6lQ4Sn?usp=sharing