Title: Multi-target Prediction: Basic and Advanced Techniques
Abstract: Increasingly often, data mining has to learn predictive models from big data, which may have many examples and many input/output dimensions or may be streaming at very high rates. When more than one target variable has to be predicted, we talk about multi-target prediction. Predictive modeling problems may also be complex in other ways: for example, they may involve incompletely or partially labelled data, or data placed in a network context.
The talk will first give an introduction to the different tasks of multi-target prediction, such as multi-target classification and regression, as well as hierarchical versions thereof. It will then present some basic methods, such as trees and tree ensembles for multi-target prediction, and move on to advanced methods for multi-target prediction, including feature importance estimation, semi-supervised learning and learning on data streams. Finally, it will review applications of multi-target prediction, ranging from gene function prediction, through image annotation, to space exploration.
Bio: Saso Dzeroski is a scientific councillor at the Jozef Stefan Institute and the Centre of Excellence for Integrated Approaches in Chemistry and Biology of Proteins, both in Ljubljana, Slovenia. He is also a full professor at the Jozef Stefan International Postgraduate School and the University of Ljubljana, Faculty of Computer and Information Sciences. His research group investigates machine learning and data mining (including structured output prediction and automated modeling of dynamic systems) and their applications (in environmental sciences, incl. ecology/ecological modelling, and life sciences, incl. systems biology/systems medicine).
The publication record of Professor Dzeroski includes 30 volumes (1 co-authored book, 4 co-edited research monographs, 8 conference proceedings published with reputed publishers, 10 workshop proceedings and 7 journal special issues), more than 40 book chapters, more than 150 journal papers (more than 125 in journals with impact factors), and more than 290 conference papers. The latest two research monographs he has co-edited are »Computational Discovery of Scientific Knowledge« (2007) and »Inductive Databases and Constraint-Based Data Mining« (2010). His work is highly cited, with more than 17800 citations and an h-index value of 59 (in Google Scholar on 31 OCT 2018).
He has participated in many international research projects and coordinated three of them in the past. Most recently, he led the FET XTrack project MAESTRA (Learning from Massive, Incompletely annotated, and Structured Data). He is currently one of the principal investigators in the FET Flagship Human Brain Project. He has been scientific and/or organizational chair of numerous international conferences, including ECML PKDD 2017, DS-2014 & 2019, MLSB-2009 & 2010, ECEM and EAML-2004, ICML-1999 & 2005 and ILP-1997 & 1999. ICML and ECML PKDD are two of the most prominent scientific events in the area of machine learning and data science worldwide.
Saso Dzeroski received his Ph.D. degree in computer science from the University of Ljubljana in 1995 and was awarded a Jozef Stefan Golden Emblem Prize for his outstanding doctoral dissertation. Immediately thereafter, he received a fellowship from ERCIM, the European Research Consortium for Informatics and Mathematics, awarded to 5% of applicants. He became a fellow of EurAI, the European Association of Artificial Intelligence (formerly ECCAI), in 2008, in recognition of his "Pioneering Work in the field of AI and Outstanding Service for the European AI community". In 2015, he was elected a foreign member of the Macedonian Academy of Sciences and Arts, and in 2016 a member of Academia Europaea (European Academy of Humanities, Letters and Sciences).
Recent relevant publications:
(1) Breskvar, M.; Kocev, D.; Dzeroski, S. Ensembles for multi-target regression with random output selections. Machine Learning 107 (11), 1673-1709 (2018).
(2) Simidjievski, N.; Tanevski, J.; Zenko, B.; Levnajic, Z.; Todorovski, L.; Dzeroski, S. Decoupling approximation robustly reconstructs directed dynamical networks. New Journal of Physics 20, 113003 (2018).
(3) Kuzmanovski, V.; Todorovski, L.; Dzeroski, S. Extensive evaluation of the generalized relevance network approach to inferring gene regulatory networks. GigaScience, DOI: 10.1093/gigascience/giy118 (2018).
(4) Vidulin, V.; Šmuc, T.; Džeroski, S.; Supek, F. The evolutionary signal in metagenome phyletic profiles predicts many gene functions. Microbiome 6 (1), 129 (2018).
(5) Korbee, CJ.; Heemskerk, MT.; Kocev, D.; Strijen, E.; Rabiee, O. Combined chemical genetics and data-driven bioinformatics approach identifies receptor tyrosine kinase inhibitors as host-directed antimicrobials. Nature Communications 9 (1), 358 (2018).