  • CIS
    Length: 00:49:06
08 Jul 2018

Bernadette Bouchon-Meunier Keynote Talk (FUZZ-IEEE) at WCCI 2018
Abstract: "The management of big data is certainly one of the most important challenges in the modern digital society. Beyond the problems of Volume, Velocity and Variety (or heterogeneity), classically mentioned as the three V’s in all analyses of big data, it is important to pay attention to the fourth V, usually called Veracity in a broad sense, which relates to uncertainty in data. In this regard, we differentiate data quality from information quality. The first depends on the completeness, accuracy, errors and validity of the available data. The second is based on the truth attached to pieces of information as a function of the confidence of sources in the information they provide, their reliability, the level of inconsistency in the obtained information, and its suitability for the end user's needs. The analysis of data and information quality is complex and depends on intertwined objective and subjective factors, according to the nature of the data: open data or temporal data, collaborative information, news streams or data acquired from connected devices, for instance.
Statistics and statistical machine learning appear preeminent in so-called data science. We highlight the importance of non-statistical models to cope with the drawbacks we mentioned, mainly fuzzy set and possibility-based methods, which are particularly useful for dealing with subjective criteria and for providing easily interpretable information. We also mention solutions based on evidence-based methods, interval computation or non-classical logics. We review existing methods and provide examples of non-statistical models, pointing out the value of opening new possibilities for solving the difficult problem of quality in big data and related information."