Wikipedia word2vec models

I have just released some word2vec models that I trained on Wikipedia for my research on language models. One model contains 400-dimensional, high-quality embeddings for around 2.8M English words; the other contains 100-dimensional embeddings for 850K Dutch words.
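If the released files are in the standard word2vec binary format (an assumption on my part; the file name below is just a placeholder), they can be loaded with gensim along these lines:

```python
# Minimal sketch: loading pretrained word2vec embeddings with gensim.
# Assumes the model is stored in word2vec binary format; adjust the
# path and the binary flag to match the actual downloaded file.
from gensim.models import KeyedVectors

# Hypothetical file name for the 400-dimensional English model.
vectors = KeyedVectors.load_word2vec_format(
    "wikipedia_en_400d.bin", binary=True
)

# Look up a single word vector (400-dimensional numpy array).
vec = vectors["language"]
print(vec.shape)

# Find the nearest neighbours of a word in the embedding space.
for word, similarity in vectors.most_similar("language", topn=5):
    print(f"{word}\t{similarity:.3f}")
```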

The models are available on our research lab’s new Datashare website. It launched only today, but I hope to vastly extend the content there in the coming months and years. If you use the embeddings in your research or project, please cite our papers!