What's going on?
Auto-Sklearn has recently been under-maintained, and we appreciate that this has caused many users to face dependency issues as pinned dependencies slowly go out of date. While we support this project primarily through academic means, we are still proud of the community that has formed around it and are dedicated to pushing it forward.
Will Auto-Sklearn still be maintained?
Yes, auto-sklearn will be maintained and updated moving forward! We initially attempted some of these updates, e.g. #1611 and #1618, but there were larger issues at play. To address this, we are currently working on a major refactor of the tool, introducing more flexibility and long-wanted features, including pipeline export, flexible pipelines, and a modular design. We expect the first prototype to be available within the next 1-2 months.
Why the refactor?
Auto-Sklearn was initially built in the days of Python 2 and the earlier days of scikit-learn. Machine learning libraries and their ecosystem were still developing, and a lot has changed since then. There were also many lessons learned which, while easy in concept, are truly difficult to integrate into the current design.
Doing research with Auto-Sklearn has also become harder: as it grew into a robust and well-performing tool, performing novel research on top of it became more difficult.
What to expect?
... Not that much. It's a refactor to get back to where we were, but with the goal of making the tool more extensible.
We will still maintain the front-facing AutoSklearnClassifier and AutoSklearnRegressor, which will act primarily as they did before, staying very scikit-learn-like with their simple interface.
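To make that concrete, here is a minimal sketch of the interface as it exists today. The constructor arguments shown (`time_left_for_this_task` and `per_run_time_limit`) come from the current release and may shift slightly with the refactor, but the fit/predict workflow is intended to stay the same:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

from autosklearn.classification import AutoSklearnClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# The familiar scikit-learn-style estimator: construct, fit, predict.
automl = AutoSklearnClassifier(
    time_left_for_this_task=120,  # total optimization budget, in seconds
    per_run_time_limit=30,        # budget per candidate pipeline, in seconds
)
automl.fit(X_train, y_train)
predictions = automl.predict(X_test)
```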
This refactor will allow us to solve some long-standing issues that have arisen. We looked through all the issues and tried to categorize what this new refactor will enable. Not all of these issues will be solved upon release, but it will provide a tangible road towards them.
- We will have a new, flexible scheduling system, allowing users to hook into events as they happen (see the hypothetical sketch after this list), hopefully handling issues like:
- Error running "fit" with many cores. #1236
- [Question] Modify Stopping Criterion to Accuracy #1624
- callback function error! #1569
- Dask has stopped support for 3.7 #1522
- S3 support for auto-sklearn to store and load models and configurations for each run #986
- Making auto-sklearn fit methods be interactively stoppable and still give a fitted model #397
- A more flexible pipeline definition, allowing you to create your own or just modify the default, solving:
- When adding NoPreprocessing component to auto-sklearn, the lassoregression can run successfully, while the abess regression crashed #1661
- Feature Request: AutoSklearnOutlierDetector #578
- How to apply a custom preprocessor to only specified features #1110
- [Question] Is it possible to change the hyperparameter space of an algorithm? #1587
- [Question] How can I make sure AutoSklearn is always using StandardScaler for feature preprocessing? #1548
- Can Autosklearn handle Multi-Class/Multi-Label Classification and which classifiers will it use? #1429
- [Question] Are there any alternatives to One-hot encoding? #1268
- Is that possible set the initial value of hyperparams when use auto sklearn to search #577
- Enhancement: Make the Ordinal Encoder a encoder choice #1150
- Custom pipelines #379
- Auto-Sklearn will allow you to optimize your own custom sklearn pipelines and try its darn best to return pure, functioning sklearn pipelines (no auto-sklearn custom parts attached); see the sketch after this list. This means you will be able to use any library that supports sklearn pipelines. This should allow great strides towards:
- convert to scikit learn code. #388
- Allow autosklearn to export ONNX model #1006
- [Question] How to know the data and feature preprocessing used in the ensemble? #1633
- [Question] Are their any methods to get all models, not only used in ensemble? #1667
- [Question] Rebuilding Auto Sklearn pipelines with the parameter dictionary returned by .cv_results_ #1663
- [Question] Is to_sklearn() available now? #1641
- Can Autosklearn be used with SHAP? #1272
- [Question] Integration with sklearn-evaluation #1640
- [Question] How to get values of categorical variables from a fit model? #1634
- Third Party Components not shared with spawned child processes when n_jobs > 1 #1607
- [Question] Is it possible to use Scikit-learn version >=1.1.1 #1597
- [Question] Viewable Preprocessors and Regressors (internal mechanisms)? #1600
- How can I get/export a production model from a trained model (after refit) with autosklearn? #1467
- How to see the selected features when ensemble size = 1? #1102
- [Question] Is there any straight forward way to retrieve the solution and prediction vector during CV? #1448
- Is it possible to integrate a metric of imblearn as a scorer? #786
- By refactoring, we can also use newer features of sklearn that we previously tried to bolt on but which were never first-class citizens at the time of Auto-Sklearn's conception (see the short sketch after this list):
- How to give sample weights? #288
- [Maint] Specify `encoded_missing_value` to `OrdinalEncoder` #1615
- [Research] Use grouping of infrequent categories in `OneHotEncoder` #1614
- [Research] Use parameter `quantile` in `HistGradientBoostingRegressor` #1613
- How to weight a given class? [class balancing] #1596
- Transformer should accept `y` argument in the `transform` method #1494
- Update SGD regressor loss values for scikit learn 1.0 #1334
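On the scheduling system mentioned in the first group above: nothing is finalized, and all of the names below (`Scheduler`, `on_trial_end`, `report_trial`, `stop`) are purely illustrative stand-ins rather than any released API. The sketch only shows the kind of event hook we have in mind, e.g. stopping the search once a target accuracy is reached (#1624).

```python
# Purely hypothetical sketch: none of these names exist in auto-sklearn today.
# It only illustrates what "hooking into events" could look like.


class Scheduler:
    """Minimal stand-in for the kind of scheduling system we have in mind."""

    def __init__(self):
        self.stopped = False
        self._on_trial_end = []

    def on_trial_end(self, callback):
        self._on_trial_end.append(callback)

    def report_trial(self, accuracy):
        for callback in self._on_trial_end:
            callback(self, accuracy)

    def stop(self):
        self.stopped = True


def stop_at_target_accuracy(scheduler, accuracy, target=0.95):
    """User callback: stop the whole search once the target is reached (#1624)."""
    if accuracy >= target:
        scheduler.stop()


scheduler = Scheduler()
scheduler.on_trial_end(stop_at_target_accuracy)
scheduler.report_trial(accuracy=0.97)
assert scheduler.stopped
```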
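On pipeline export: the goal is that what you get back is an ordinary scikit-learn `Pipeline` that you can persist, inspect, or hand to other tools. The export API itself is not finalized; the sketch below is only an example of the kind of plain-sklearn object we aim to return, built by hand (with made-up column names) for illustration.

```python
# Illustration of the target output only: a plain scikit-learn Pipeline with no
# auto-sklearn components attached. The export mechanism that would produce it
# is still being designed; this object and its column names are made up by hand.
import joblib
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

exported = Pipeline(
    steps=[
        (
            "preprocess",
            ColumnTransformer(
                transformers=[
                    ("num", StandardScaler(), ["age", "income"]),
                    ("cat", OneHotEncoder(handle_unknown="ignore"), ["country"]),
                ]
            ),
        ),
        ("model", RandomForestClassifier(n_estimators=100)),
    ]
)

# Because it is pure sklearn, anything that understands sklearn pipelines
# (joblib, ONNX converters, SHAP, sklearn-evaluation, ...) can work with it.
joblib.dump(exported, "exported_pipeline.joblib")
```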
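Finally, on the newer scikit-learn features in the last group: these already exist upstream (roughly scikit-learn >= 1.1); the refactor is about adopting them as first-class options rather than bolting them on. A short sketch of the relevant parameters:

```python
# These parameters ship with recent scikit-learn releases (roughly >= 1.1);
# the refactor should let auto-sklearn expose them directly.
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# OrdinalEncoder: explicit encoding for missing values (#1615).
ordinal = OrdinalEncoder(
    handle_unknown="use_encoded_value",
    unknown_value=-1,
    encoded_missing_value=-2,
)

# OneHotEncoder: group infrequent categories into a single bucket (#1614).
onehot = OneHotEncoder(handle_unknown="infrequent_if_exist", min_frequency=10)

# HistGradientBoostingRegressor: quantile loss via the `quantile` parameter (#1613).
quantile_regressor = HistGradientBoostingRegressor(loss="quantile", quantile=0.9)
```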
What can I do?
Please let us know what you think and what you'd like to see from this rebuild!