Adding word models

Corpora have the option to include word vectors. (This option is only supported for Python corpora.)

Textcavator visualisations are built for diachronic word models, showing how word meaning changes over time. As such, Textcavator expects that you have trained a separate model for each time interval.

Expected file format

Word embeddings are expected to come with the following files:

  • _full.wv (contains gensim KeyedVectors for a model trained on the whole time period)

For each time bin, Textcavator expects a file of the format

  • _{startYear}_{endYear}.wv (contains gensim KeyedVectors for a model trained on the time bin)
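
The naming scheme above can be sketched as a small check that lists the file names expected for a set of time bins and reports which ones are absent from a directory. This is only an illustration: the function names, the `prefix` parameter and the example year ranges are assumptions made here, not part of Textcavator. The files themselves would typically be written with gensim's `KeyedVectors.save`.

```python
from pathlib import Path


def expected_word_model_files(time_bins, prefix=""):
    """List the file names expected for the given time bins.

    `time_bins` is a list of (start_year, end_year) tuples.
    `prefix` is a hypothetical parameter for whatever precedes the
    underscore in the file names; it defaults to an empty string.
    """
    names = [f"{prefix}_full.wv"]
    names += [f"{prefix}_{start}_{end}.wv" for start, end in time_bins]
    return names


def missing_word_model_files(directory, time_bins, prefix=""):
    """Return the expected file names that are not present in `directory`."""
    directory = Path(directory)
    return [
        name
        for name in expected_word_model_files(time_bins, prefix)
        if not (directory / name).is_file()
    ]


# Illustrative time bins; use the intervals your models were trained on.
bins = [(1800, 1849), (1850, 1899)]
print(expected_word_model_files(bins))
```

Running `missing_word_model_files` on the word model directory before adding it to a corpus gives a quick way to spot a time bin whose file was misnamed or never saved.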

Documentation

Please include documentation on the method and settings used to train a model. See the separate documentation on how to include and write documentation pages.

Including word models

If you are adding newly trained word models, you will also need to specify in the corpus definition that they may be included. Set the word_models_path property in the corpus to the directory in which the word models are stored. See troonredes.py or uk.py for examples.
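
As a rough sketch of what setting this property might look like, the class below is a hypothetical stand-in for a Python corpus definition. The class name, title and directory path are assumptions made for illustration; a real definition would subclass Textcavator's corpus definition class and would usually read the path from settings, so troonredes.py and uk.py remain the authoritative examples.

```python
import os


class ExampleCorpus:
    """Hypothetical stand-in for a Python corpus definition.

    Only the attribute relevant to word models is shown here.
    """

    title = "Example corpus"

    # Directory containing the .wv files for this corpus.
    # The path is a placeholder, not a real Textcavator location.
    word_models_path = os.path.join("/data", "example-corpus", "word_models")


print(ExampleCorpus.word_models_path)
```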