Adding word models

Corpora have the option to include word vectors. (This option is only supported for Python corpora.)

Textcavator visualisations are built for diachronic word models, showing how word meaning changes over time. As such, Textcavator expects that you trained models for different time intervals.

Expected file format

Word embeddings are expected to come with the following files:

_full.wv (contains gensim KeyedVectors for a model trained on the whole time period) For each time bin, it expects files of the format
_{startYear}_{endYear}.wv (contains gensim KeyedVectors for a model trained on the time bin)

Documentation

Please include documentation on the method and settings used to train a model. See the separate documentation on how to include documentation pages and how to write documenation pages.

Including word models

If your are adding newly trained word models, you will also need to specify in the corpus definition that they may be included. Set the word_models_path property in the corpus to the directory in which the word models are stored. See troonredes.py or uk.py for examples.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adding word models

Expected file format

Documentation

Including word models

FilesExpand file tree

Adding-word-models.md

Latest commit

History

Adding-word-models.md

File metadata and controls

Adding word models

Expected file format

Documentation

Including word models