+ >The GloVe word embeddings include sets that were trained on billions of tokens, some up to 840 billion tokens. These algorithms exhibit stereotypical biases, such as gender bias which can be traced back to the original training data. For example certain occupations seem to be more biased towards a particular gender, reinforcing problematic stereotypes. The nearest solution to this problem are some de-biasing algorithms as the one presented in [this research article](https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1184/reports/6835575.pdf), which one can use on embeddings of their choice to mitigate bias, if present.
0 commit comments