SupWSD

A software suite for SUPervised Word Sense Disambiguation

Java Framework

SupWSD Toolkit is an easy-to-use tool for the research community, designed to be modular, fast and scalable for training and testing on large datasets. Our framework includes an implementation of a state-of-the-art supervised WSD system together with an NLP pipeline.

Get Started

Web API

SupWSD API is a web service that gives you programmatic access to SupWSD. The service is available for English, French, German, Italian and Spanish and supports 5 different trained models (SemCor, OMSTI, Train-O-Matic, OneSec and WordNet Glosses).

Get Started

Pocket Edition

SupWSD Pocket is a lightweight version of SupWSD that lets you run disambiguation offline using the best-known configuration, without having to configure the toolkit pipeline. Language models are available for download.

Get Started
Try a live demo! English French German Italian Spanish

Features

  • State-of-the-art accuracy
  • Low memory requirement
  • Optimized for larger datasets
  • Designed to be modular, extendable and scalable
  • 6 different parser types
  • Supports the most widely used NLP pipeline
  • Pretrained word embeddings
  • Support for WordNet and BabelNet sense inventories

Up to 6X FASTER than other systems

Less memory consumption

We used a test corpus of 1M words with more than 250K target instances to disambiguate, and we trained both frameworks on SemCor and OMSTI. The results show a considerable gain in execution time for SupWSD, which is around 3 times faster on SemCor and almost 6 times faster on OMSTI than other systems.
Furthermore, our system parallelizes the execution of the preprocessing module and implements lazy loading techniques to make it less memory-intensive on large datasets.
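The combination of parallel preprocessing and lazy loading mentioned above can be illustrated with a minimal Java sketch. This is not SupWSD's actual code: the class `LazyParallelPreprocessing` and its `preprocess` step (a plain lowercase whitespace tokenizer) are hypothetical stand-ins chosen only to show the general pattern with standard Java streams.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative only: names and the tokenizer are assumptions, not SupWSD classes.
final class LazyParallelPreprocessing {

    // Hypothetical stand-in for a real preprocessing step (tokenization, tagging, ...).
    static List<String> preprocess(String sentence) {
        return Arrays.stream(sentence.trim().toLowerCase().split("\\s+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }

    // Pulls sentences from a lazily evaluated stream and preprocesses them in parallel.
    // Collecting an ordered stream to a list keeps the original sentence order.
    static List<List<String>> preprocessAll(Stream<String> sentences) {
        return sentences.parallel()
                .map(LazyParallelPreprocessing::preprocess)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Stream<String> sentences = Stream.of(
                "The bank raised interest rates",
                "He sat on the river bank");
        preprocessAll(sentences).forEach(System.out::println);
    }
}
```

For a corpus stored one sentence per line, the input stream could come from `java.nio.file.Files.lines(path)`, which reads the file lazily instead of loading it into memory at once.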

See Details

Benchmarks

We evaluated SupWSD on the evaluation framework of Raganato et al. (2017), which includes 5 test sets from the Senseval/SemEval series and two training corpora of different sizes, i.e. SemCor (Miller et al., 1993) and OMSTI (Taghipour and Ng, 2015). Our system achieves state-of-the-art performance in terms of F-Measure, sometimes even outperforming its competitors by a considerable margin.
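For readers unfamiliar with the metric, here is a minimal Java sketch of how an F-Measure is commonly computed for WSD: precision is the share of attempted instances answered correctly, recall is the share of all instances answered correctly, and the F-Measure is their harmonic mean. The class name `FMeasure` and the counts in the usage example are hypothetical and are not taken from the benchmark.

```java
// Illustrative only: a generic F1 computation, not code from SupWSD or the evaluation framework.
final class FMeasure {

    // correct:   instances tagged with the right sense
    // attempted: instances for which the system gave an answer
    // total:     all instances in the test set
    static double f1(int correct, int attempted, int total) {
        if (correct == 0 || attempted == 0 || total == 0) {
            return 0.0; // avoids division by zero when nothing is correct or attempted
        }
        double precision = (double) correct / attempted;
        double recall = (double) correct / total;
        return 2 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        // Made-up counts: 60 correct out of 80 attempted, 100 instances overall.
        System.out.println(100 * f1(60, 80, 100));
    }
}
```

When a system answers every instance, precision and recall coincide, so the F-Measure equals plain accuracy.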


See Details

System        Senseval 2  Senseval 3  SemEval 7  SemEval 13  SemEval 15
SupWSD        72.6        68.9        60.2       65.8        70.0
SupWSD emb    73.8        70.8        64.2       67.2        71.8
IMS           72.8        69.3        61.3       65.3        69.5
Context2Vec   72.3        69.1        61.5       67.2        71.9
ELMo          71.6        69.6        62.2       66.2        71.3
GAS           72.0        70.0        *          66.7        71.6
MFS           66.6        66.0        54.5       63.8        67.1

January 2020

Neural SupWSD

SupWSD meets Neural Networks!!
Our goal is to build a new classifier capable of covering the entire sense inventory, using the definitions of the senses as annotations.

March 2020

Modular Toolkit GUI

Are you a student, a teacher or a researcher?
You may be glad to know that we are building a new toolkit with a graphical user interface to simplify model training and testing and to provide interesting new features.

Coming soon

Universal API Client

Not very familiar with Java or Python?
Don't worry, we're preparing a new guide that describes how to use SupWSD with other languages or through the HTTP interface.