My research is on Machine Learning and
Data Mining, covering both their algorithmic aspects and their applications.
Years ago (pre-2000, mostly) I worked in the field known as
Computational Complexity.
These are my Google Scholar profile
and DBLP publication list.
Topics in the algorithmics of Machine Learning:
- Stream mining: In this scenario, the data to be analyzed is not available at once, but
arrives as a sequence of items, possibly in high volume and at high speed, so models need to adapt to new instances
and possibly forget the effect of old ones. I have worked on prediction and frequent pattern mining tasks
on data streams, with particular emphasis on streams whose distribution evolves over time.
Coauthors include Albert Bifet, Geoff Holmes, Bernhard Pfahringer, and Massimo Quadrana.
- Inference of latent-variable models: Models in which data is assumed to be generated as a (probabilistic) consequence
of some unobserved variables. Coauthors include Matteo Ruffini and Marta Casanellas.
- Inference of finite-state machines: A special case of the above in which the models are finite-state machines
that generate sequences of symbols. Coauthors include Jorge Castro, Borja Balle, Joelle Pineau, and Doina Precup.
- Causality: Most machine learning methods find correlations in the data, which is different from finding causes, and finding causes is often
the more interesting task. Causal graphs and do-calculus are two ways of representing and reasoning about causality from data.
Coauthors include Gilles Blondel and Marta Arias.
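
As a small illustration of the stream mining item above, here is a minimal sketch of an estimator that adapts to new instances and gradually forgets old ones. It uses plain exponential forgetting, chosen only for brevity; it is not meant to represent any specific algorithm from the publications. The names `FadingMean`, `decay`, and `demo_stream` are hypothetical, and the stream is synthetic.

```python
from typing import Iterator


class FadingMean:
    """Running mean with exponential forgetting (illustrative sketch only)."""

    def __init__(self, decay: float = 0.9) -> None:
        # decay close to 1 forgets slowly; decay close to 0 forgets quickly.
        if not 0.0 < decay < 1.0:
            raise ValueError("decay must be in (0, 1)")
        self.decay = decay
        self._weighted_sum = 0.0
        self._weight = 0.0

    def update(self, x: float) -> float:
        """Process one stream item and return the current estimate."""
        # Older items are down-weighted by `decay` each time a new item arrives.
        self._weighted_sum = self.decay * self._weighted_sum + x
        self._weight = self.decay * self._weight + 1.0
        return self._weighted_sum / self._weight


def demo_stream() -> Iterator[float]:
    """Synthetic stream (an assumption): the value shifts from 0 to 1 halfway."""
    for _ in range(200):
        yield 0.0
    for _ in range(200):
        yield 1.0


if __name__ == "__main__":
    estimator = FadingMean(decay=0.9)
    estimate = 0.0
    for item in demo_stream():
        estimate = estimator.update(item)
    # The estimate tracks the recent part of the stream rather than its
    # overall average of 0.5.
    print(f"estimate after the change: {estimate:.3f}")
```

Each item is seen once and only two numbers are kept in memory, which is the usual constraint in this setting. A fixed decay rate is a crude choice, though: it forgets at the same speed whether or not the distribution is actually changing.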
Concerning applications, I am particularly interested in
Data Science for Social Good.
Some that I am working on, or have worked on, include:
- Analysis of Healthcare Data: This is my main endeavor these days. Healthcare systems of the developed world are facing
a tsunami of chronic, complex disease related to aging that threatens their very survival. Making full use of the data collected
by healthcare agencies and organizations to improve care processes and resource usage is essential. Techniques
such as predictive modeling, clustering, and frequent pattern mining are useful, but they need to be tailored
to this domain and made accessible to healthcare professionals to be practical. Maximizing the real impact of this research
is the motivation for the Amalfi Analytics spinoff of UPC. I also collaborate
with a number of public and private healthcare organizations; see Activities.
- Sports Analytics, in collaboration with Football Club Barcelona,
including the Industrial Doctorate of Javier Fernández.
- Prediction of Urban and Road Traffic via Machine Learning: This was the topic of the PhD thesis of Rafael Mena at
Aimsun (defended March 2020).
- Improved Management of Computer Systems using Machine Learning (2007-2013): In collaboration with researchers at
the Barcelona Supercomputing Center, including Jordi Torres, Josep-Lluís Berral, Javier Alonso,
David Carrera, and Nicolas Poggi. Topics included dealing with software aging, maximizing throughput
in Web workloads, and energy-efficient supercomputing (Green Computing).
This is just a summary. I am easily tempted by problems with an algorithmic flavor or a social good component.