What's the issue?
Currently, several decorator combinations (e.g. multitask + classification, higher-rank features + classification) are skipped because of dataset conflicts. We should instead provide datasets that work for each combination, so that every combination is actually tested. If two decorators are genuinely incompatible, we should raise an exception that says so rather than skipping the combination.
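One possible shape for this, as a rough sketch only: look up a dataset by the set of decorator tags applied, and fail loudly when no dataset exists or when the combination is declared incompatible. Every name here (`resolve_dataset`, `IncompatibleDecoratorsError`, the `DATASETS` and `INCOMPATIBLE` tables, the tag strings, and the dataset names) is hypothetical and not taken from the codebase; the "classification" + "regression" pair is only a placeholder for a combination that might turn out to be incompatible.

```python
# Hypothetical sketch: none of these names come from the project.


class IncompatibleDecoratorsError(Exception):
    """Raised when a decorator combination is declared incompatible."""


# Placeholder registry: maps a set of decorator tags to a dataset name.
DATASETS = {
    frozenset({"classification"}): "classification_dataset",
    frozenset({"multitask"}): "multitask_dataset",
    frozenset({"multitask", "classification"}): "multitask_classification_dataset",
    frozenset({"higher_rank", "classification"}): "higher_rank_classification_dataset",
}

# Placeholder list of tag pairs that can never be combined.
INCOMPATIBLE = [frozenset({"classification", "regression"})]


def resolve_dataset(tags):
    """Return the dataset for a tag combination, or raise instead of skipping."""
    tags = frozenset(tags)
    for pair in INCOMPATIBLE:
        if pair <= tags:
            raise IncompatibleDecoratorsError(
                f"decorators {sorted(pair)} cannot be combined"
            )
    if tags not in DATASETS:
        # A missing dataset is a gap in test coverage, not a reason to skip.
        raise LookupError(f"no dataset registered for {sorted(tags)}")
    return DATASETS[tags]
```

With something like this, a combination that is missing a dataset shows up as a test error rather than a skip, which makes the coverage gap visible.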