ViLongT5

PRs welcome!

A pretrained Transformer-based encoder-decoder model for the multi-document text-summarization task in Vietnamese. The code is a non-framework implementation that combines flaxformer and t5x and is based purely on the JAX library.

ViLongT5 is trained on a large NewsCorpus of Vietnamese news texts. We benchmark ViLongT5 on multi-document text-summarization tasks, namely Abstractive Text Summarization and Named Entity Recognition. All experiments are described in our paper Pre-training LongT5 for Vietnamese Mass-Media Multi-document Summarization Task.

Pretrained Models

Vocabulary: ViLongT5_vocab / training-script

| Model          | Gin File Location  | Checkpoint Location             |
|----------------|--------------------|---------------------------------|
| ViLongT5-Large | ViLongT5_large.gin | ViLongt5-finetuned-large.tar.gz |
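
The fine-tuned checkpoint is shipped as a tar archive; a minimal sketch for unpacking it locally (the target directory `checkpoints/` is an arbitrary choice for this example, not a path required by the project):

```bash
# Unpack the released fine-tuned checkpoint into a local directory;
# "checkpoints/" is just an illustrative target path.
mkdir -p checkpoints
tar -xzvf ViLongt5-finetuned-large.tar.gz -C checkpoints
```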

📄 Example scripts based on the Flaxformer library for the model: fine-tuning / inferring / evaluating

Results

(Benchmark results figure.)

Datasets

List of datasets used in the experiments:

Installation

NOTE: a GPU is assumed as the computational device. This project has been tested under the following configuration:

Local Installation

  • Initialize a virtual environment and install the project dependencies:

```bash
virtualenv env --python=/usr/bin/python3.9
source env/bin/activate   # activate the environment so dependencies install into it
pip install -r dependencies.txt
```
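
To confirm that JAX can actually see the GPU after installation, a quick sanity check (this one-liner is only illustrative and is not part of the project's scripts):

```bash
# Print the accelerator devices visible to JAX; a GPU entry should appear,
# not only the CPU backend.
python -c "import jax; print(jax.devices())"
```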

Kaggle Installation

For testing under Kaggle, there is a separate tutorial.

Fine-tuning

We fine-tune the model on the training part of vims+vmds+vlsp as follows:

```bash
python -m t5x.train --gin_file="longt5_finetune_vims_vmds_vlsp_large.gin" --gin_search_paths='./configs'
```
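
t5x also accepts additional gin bindings on the command line, which is handy for redirecting the run to a specific model directory without editing the config. A sketch, assuming the project's gin file follows the standard t5x convention of a MODEL_DIR macro (the path below is a placeholder):

```bash
# Override the output/model directory with a gin binding; MODEL_DIR is the
# standard t5x macro and "/tmp/vilongt5_large" is only a placeholder path.
python -m t5x.train \
  --gin_file="longt5_finetune_vims_vmds_vlsp_large.gin" \
  --gin_search_paths='./configs' \
  --gin.MODEL_DIR="'/tmp/vilongt5_large'"
```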

Inferring
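
Inference with t5x goes through the t5x.infer entry point, analogous to training and evaluation. A minimal sketch, assuming an inference gin config exists under ./configs (the file name below is hypothetical and must be replaced with the project's actual config):

```bash
# Hypothetical inference invocation: the gin file name is an assumption;
# only the t5x.infer entry point and its flags are standard t5x.
python -m t5x.infer --gin_file="longt5_infer_vims_vmds_vlsp_large.gin" --gin_search_paths='./configs'
```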

Evaluation

Evaluation on the test part of vims+vmds+vlsp is run as follows:

```bash
python -m t5x.eval --gin_file="longt5_eval_vims_vmds_vlsp_large.gin" --gin_search_paths='./configs'
```

Evaluation on the validation part of vlsp is run as follows:

```bash
python -m t5x.eval --gin_file="configs/longt5_infer_vlsp_validation_large.gin" --gin_search_paths='./configs'
```
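
As with training, evaluation settings can be adjusted from the command line via gin bindings. A sketch, assuming the project's eval gin files follow the standard t5x convention of CHECKPOINT_PATH and EVAL_OUTPUT_DIR macros (both paths below are placeholders):

```bash
# Point evaluation at a specific checkpoint and output directory via gin bindings;
# CHECKPOINT_PATH and EVAL_OUTPUT_DIR are standard t5x macros, the paths are placeholders.
python -m t5x.eval \
  --gin_file="longt5_eval_vims_vmds_vlsp_large.gin" \
  --gin_search_paths='./configs' \
  --gin.CHECKPOINT_PATH="'/path/to/checkpoint'" \
  --gin.EVAL_OUTPUT_DIR="'/tmp/vilongt5_eval'"
```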

References

@inproceedings{rusnachenko2023pretraining,
    title = "Pre-training {LongT5} for Vietnamese Mass-Media Multi-document Summarization Task",
    author = "Rusnachenko, Nicolay and Le, The Anh and Nguyen, Ngoc Diep",
    booktitle = "Proceedings of Artificial Intelligence and Natural Language",
    year = "2023"
}