SupWSD

A software suite for SUPervised Word Sense Disambiguation

Java Framework

SupWSD Toolkit is an easy-to-use tool for the research community, designed to be modular, fast and scalable for training and testing on large datasets. Our framework includes an implementation of a state-of-the-art supervised WSD system together with an NLP pipeline.

Get Started

Web API

SupWSD API is a web service that gives you programmatic access to SupWSD. The service is available for English, French, German, Italian and Spanish, and supports two trained models (Train-O-Matic and WordNet Glosses), including Word Embedding.

Get Started

Translation

SupWSD provides a sense-based translation system that lets you retrieve headwords and word definitions in different languages. The system is fully integrated into our web API, supports the same models, and works with the same languages.

Get Started

Features

  • State-of-the-art accuracy
  • Low memory requirement
  • Optimized for larger datasets
  • Designed to be modular, extendable and scalable
  • 6 different parser types
  • Supports the most widely used NLP pipeline
  • New: Translation capabilities

Up to 6X FASTER than other systems

less memory consumption

We used a test corpus of 1M words with more than 250K target instances to disambiguate, and ran both frameworks with SemCor and OMSTI as training sets. The results show a considerable gain in execution time for SupWSD, which is around 3 times faster on SemCor and almost 6 times faster on OMSTI than other systems.
Furthermore, our system parallelizes the execution of the preprocessing module and implements lazy loading techniques to make it less memory-intensive on large datasets.
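
The combination of parallel preprocessing and lazy loading can be illustrated with a minimal Java sketch. This is not SupWSD's actual code: the class name, the `preprocess` method and the whitespace tokenizer are hypothetical stand-ins for a real NLP pipeline step. The sketch only shows the general idea of streaming a corpus line by line, so the whole file is never held in memory, while spreading the per-sentence work across cores.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyParallelPreprocessing {

    // Hypothetical stand-in for a real preprocessing step (tokenization,
    // lemmatization, POS tagging, ...): lowercases and splits on whitespace.
    public static List<String> preprocess(String sentence) {
        return Arrays.stream(sentence.trim().toLowerCase(Locale.ROOT).split("\\s+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }

    // Files.lines reads the file lazily, so the corpus is never loaded as a
    // whole; parallel() distributes the per-line preprocessing over the
    // common fork-join pool.
    public static long countTokens(Path corpus) throws IOException {
        try (Stream<String> lines = Files.lines(corpus, StandardCharsets.UTF_8)) {
            return lines.parallel()
                    .map(LazyParallelPreprocessing::preprocess)
                    .mapToLong(List::size)
                    .sum();
        }
    }

    public static void main(String[] args) throws IOException {
        // Toy two-sentence corpus written to a temporary file.
        Path corpus = Files.createTempFile("corpus", ".txt");
        Files.write(corpus,
                Arrays.asList("The bank raised interest rates",
                              "She sat on the river bank"),
                StandardCharsets.UTF_8);
        System.out.println("Tokens: " + countTokens(corpus));
        Files.delete(corpus);
    }
}
```

In a real system each line would be handed to a full pipeline rather than a whitespace splitter, but the memory behaviour is the same: only the lines currently being processed need to be resident.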

See Details

Benchmarks

We evaluated SupWSD on the evaluation framework of Raganato et al. (2017), which includes 5 test sets from the Senseval/SemEval series and two training corpora of different sizes, i.e. SemCor (Miller et al., 1993) and OMSTI (Taghipour and Ng, 2015). Our system achieves state-of-the-art performance in terms of F-Measure, sometimes even outperforming its competitors by a considerable margin.
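
For readers unfamiliar with the metric, the F-Measure is the harmonic mean of precision and recall. The Java sketch below uses the standard definition; the class name and the counts in `main` are hypothetical and are not taken from the benchmark. Note that when a system answers every target instance, precision, recall and F-Measure coincide.

```java
public class FMeasure {

    // Precision: correct answers over the instances the system attempted.
    public static double precision(int correct, int attempted) {
        return attempted == 0 ? 0.0 : (double) correct / attempted;
    }

    // Recall: correct answers over all target instances in the test set.
    public static double recall(int correct, int total) {
        return total == 0 ? 0.0 : (double) correct / total;
    }

    // F-Measure (F1): harmonic mean of precision and recall.
    public static double f1(int correct, int attempted, int total) {
        double p = precision(correct, attempted);
        double r = recall(correct, total);
        return (p + r) == 0.0 ? 0.0 : 2.0 * p * r / (p + r);
    }

    public static void main(String[] args) {
        // Hypothetical counts, for illustration only.
        int correct = 60, attempted = 80, total = 100;
        System.out.printf("P=%.3f R=%.3f F1=%.3f%n",
                precision(correct, attempted),
                recall(correct, total),
                f1(correct, attempted, total));
    }
}
```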


See Details

System        Senseval 2   Senseval 3   SemEval 7   SemEval 13   SemEval 15
SupWSD        72.6         68.9         60.2        65.8         70.0
SupWSD emb    73.8         70.8         64.2        67.2         71.8
IMS           72.8         69.3         61.3        65.3         69.5
Context2Vec   72.3         69.1         61.5        67.2         71.9
ELMo          71.6         69.6         62.2        66.2         71.3
GAS           72.0         70.0         *           66.7         71.6
MFS           66.6         66.0         54.5        63.8         67.1

Coming soon

Neural SupWSD

SupWSD meets Neural Networks!
Our goal is to build a new classifier capable of covering the entire sense inventory, using the definitions of the senses as annotations.

October 2020

Modular Toolkit GUI

Are you a student, a teacher or a researcher?
You may like to know that we are building a new toolkit with a graphical user interface to simplify model training and testing and to provide interesting new features.

Available now

Word Embedding

SupWSD now offers two models trained with Word Embedding that cover the entire WordNet sense inventory.