Meta-Feature Taxonomy for Supporting Automatic Machine Learning

Date
2019-12-23
Abstract
Many automatic machine learning (AutoML) libraries have been developed recently to meet public demand for machine learning tools that can be used without an expert. A common tactic employed by these frameworks is to first generate meta-features, which then serve as an initial heuristic for further evaluation. In this thesis we provide a systematic categorization of the meta-features found in the AutoML literature. Current implementations of AutoML frameworks fail to provide reasoning for their meta-feature selection, so a taxonomic categorization is needed. We reviewed current AutoML frameworks and created a taxonomy of five categories into which any meta-feature can be placed. The result is a general framework with which any currently used meta-feature can be described, and we demonstrate some scenarios for their application. Additionally, we provide a runtime analysis of the wall-clock time required for meta-feature generation on 18 data collections from previous ChaLearn AutoML competitions, which took between 0:10:26.9 and 98:43:46.5. We also found that a sample percentage of 0.1 is sufficient for Sample Variant Landmark meta-feature generation when using the Nearest Neighbour, Elite Nearest Neighbour, Best Decision Node, and Random Decision Node landmarks, which indicates their potential use as meta-features in AutoML.
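As a rough illustration of the sample-variant landmarking idea mentioned in the abstract, the sketch below scores a 1-nearest-neighbour landmark on a random subset of a dataset. It is not the thesis's implementation: the function names (`sample_rows`, `one_nn_landmark`, `sampled_landmark`) are hypothetical, leave-one-out accuracy is an assumed scoring choice, and the abstract's "sample percentage of 0.1" is read here as a row fraction of 0.1, which is also an assumption.

```python
import random


def sample_rows(X, y, fraction, seed=0):
    """Draw a random subset of rows without replacement (at least 2 rows)."""
    rng = random.Random(seed)
    n = max(2, round(len(X) * fraction))
    idx = rng.sample(range(len(X)), n)
    return [X[i] for i in idx], [y[i] for i in idx]


def one_nn_landmark(X, y):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier.

    Assumed scoring choice for this sketch; the thesis may score
    its Nearest Neighbour landmark differently.
    """
    correct = 0
    for i, xi in enumerate(X):
        best_j, best_d = None, float("inf")
        for j, xj in enumerate(X):
            if i == j:
                continue
            # Squared Euclidean distance is enough for ranking neighbours.
            d = sum((a - b) ** 2 for a, b in zip(xi, xj))
            if d < best_d:
                best_d, best_j = d, j
        correct += int(y[best_j] == y[i])
    return correct / len(X)


def sampled_landmark(X, y, fraction=0.1, seed=0):
    """Landmark meta-feature computed on a sampled subset of the data.

    fraction=0.1 is an assumed reading of the abstract's sample percentage.
    """
    Xs, ys = sample_rows(X, y, fraction, seed)
    return one_nn_landmark(Xs, ys)


# Toy data: two well-separated clusters, built deterministically.
X = [[(i % 2) * 10 + (i % 7) * 0.1, (i % 2) * 10 + (i % 5) * 0.1]
     for i in range(200)]
y = [i % 2 for i in range(200)]

print("full-data landmark:   ", one_nn_landmark(X, y))
print("sampled landmark (0.1):", sampled_landmark(X, y, fraction=0.1))
```

Comparing the sampled value against the full-data value, as the two print statements do, is one simple way to check whether a small sample preserves the landmark's signal at a fraction of the cost.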
Keywords
Machine Learning, Meta-Feature, Automatic Machine Learning, AutoML
Citation
Cooper, T. S. (2019). Meta-Feature Taxonomy for Supporting Automatic Machine Learning (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.