Short bio

Robson L. F. Cordeiro received the BSc degree in Computer Science (CS) from the University of Oeste Paulista, Brazil, in 2002, the MSc degree in CS from the Federal University of Rio Grande do Sul, Brazil, in 2005, and the PhD degree in CS from the University of São Paulo, Brazil, in 2011. His PhD program included a one-year visiting period at Carnegie Mellon University, USA, from 2009 to 2010. He was also a Postdoctoral Researcher at the University of São Paulo, Brazil, from 2011 to 2013. His PhD dissertation won the 'best CS Dissertation Award' in 2012 from the Brazilian Computer Society (SBC), and led to a book published by Springer that was chosen by ACM Computing Reviews as one of the 'Notable Computing Books and Articles of 2013'. Robson is currently an Assistant Professor at the University of São Paulo, Brazil. His research interests include mining and managing Big Data of moderate-to-high dimensionality, complex data, and large graphs. He is a member of the IEEE, ACM, and SBC.

Data Mining in Large Sets of Complex Data

This book was chosen by ACM Computing Reviews as one
of the 'Notable Computing Books and Articles of 2013'.

Publisher: Springer
Series: SpringerBriefs in Computer Science
Authors: Cordeiro, Robson L. F.; Faloutsos, Christos; and Traina Júnior, Caetano
Year: 2013
Pages: 116 p.
Keywords: Analysis of Breast Cancer Data; Analysis of Large Graphs from Social Networks; Analysis of Satellite Imagery; Big Data; Correlation Clustering; Terabyte-scale Data Analysis with MapReduce; Data Mining; Linear or Quasi-linear Complexity; Low-labor Labeling; Summarization and Attention Routing

About this book: The amount and the complexity of the data gathered by current enterprises are increasing at an exponential rate. Consequently, the analysis of Big Data is nowadays a central challenge in Computer Science, especially for complex data. For example, given a satellite image database containing tens of Terabytes, how can we find regions in order to identify native rainforests, deforestation or reforestation? Can it be done automatically? Based on the work discussed in this book, the answers to both questions are a resounding “yes”, and the results can be obtained in just minutes. In fact, results that used to require days or weeks of hard work from human specialists can now be obtained in minutes with high precision. Data Mining in Large Sets of Complex Data discusses new algorithms that take steps forward from traditional data mining (especially for clustering) by considering large, complex datasets. Usually, other works focus on only one aspect, either data size or complexity. This work considers both: it enables mining complex data from high-impact applications, such as breast cancer diagnosis, region classification in satellite images, support for climate change forecasting, and recommendation systems for the Web and social networks; the data are large, at the Terabyte scale rather than the usual Gigabyte scale; and very accurate results are found in just minutes. Thus, it provides a crucial and well-timed contribution toward the creation of real-time applications that deal with Big Data of high complexity, in which mining on the fly can make an immeasurable difference, such as supporting cancer diagnosis or detecting deforestation.