In many practical applications in the financial and legal domains, thousands of documents need to be annotated with one or more labels drawn from label sets that may contain tens or even thousands of labels. Beyond their size, these label sets are frequently updated, which makes it impractical to maintain the correct labels for each document by hand. One would therefore like to train document classifiers that assign labels automatically. Training such classifiers with machine learning methods is challenging, not only because of the number of labels and their volatility but also because of their highly imbalanced distribution. As a result, it is very difficult to obtain training data that adequately cover all classes. Our research focuses on text classification with few- and zero-shot learning capabilities to handle rare and unseen classes.