machine-translation

School project exploring sequence-to-sequence (seq2seq) models for machine translation.

Usage

Development was done in the Google Colab environment.

  • main.ipynb - notebook containing the main logic for training, selecting, and evaluating models.
  • preprocessing.py - utility functions for data preprocessing.
  • models.py - utility functions for creating the different models.
  • wikipedia.ipynb - notebook that downloads a Wikipedia article, translates it, and saves the result to a CSV file.

The Data folder contains the datasets. The GloVe embeddings are not included, but they can be downloaded from https://nlp.stanford.edu/projects/glove/.
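
Once downloaded, the GloVe files are plain text with one token per line followed by its space-separated vector components. The sketch below shows one way to parse that format into a dictionary. The helper name `load_glove` and the in-memory sample are illustrative assumptions, not code from this repository.

```python
import io


def load_glove(lines):
    """Parse GloVe text-format lines into a {word: [float, ...]} dict.

    `load_glove` is a hypothetical helper for illustration; it is not
    part of preprocessing.py or models.py.
    """
    embeddings = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, values = parts[0], parts[1:]
        embeddings[word] = [float(v) for v in values]
    return embeddings


# Tiny in-memory sample standing in for a real GloVe file.
sample = io.StringIO("the 0.1 0.2 0.3\ncat 0.4 0.5 0.6\n")
vectors = load_glove(sample)
```

With a real download, the same function can be given an open file handle, for example `open(path, encoding="utf-8")`, where `path` points to wherever the unpacked embeddings were placed.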

The Results folder contains some training metrics and translations generated by the models.

The Models folder, which should contain the saved models, is not included in the repository. It is available here: https://drive.google.com/drive/folders/1OfbNaHTfRLlajUzOFnUElVZp4Q6lQ4Sn?usp=sharing