In many practical applications in the financial and legal domains, thousands of documents need to be annotated with one or more labels drawn from sets of tens or even thousands of labels. Beyond their size, these label sets are frequently updated, which makes it impractical to maintain correct labels for every document by hand. One would therefore like to train document classifiers that assign labels automatically. Training such classifiers with machine learning methods is challenging, not only because of the number of labels and their volatility but also because of their highly imbalanced distribution. As a result, it is very difficult to obtain training data that adequately cover all classes. Our research focuses on text classification with few- and zero-shot learning capabilities to handle rare and unseen classes.