- Massimo Mecella - Sapienza Università di Roma
Proceedings and Publicity Chairs
- Andrea Marrella - Sapienza Università di Roma
- Roberto De Virgilio - Università Roma Tre
Abstract: Our goal is to bridge the gap between theoretical and practical
approaches to answering queries over databases with nulls, and to
suggest ways in which SQL's handling of nulls can be modified to
provide some correctness guarantees. Theoretical research has long
ago identified the notion of correctness of query answering over
incomplete data: one needs to find certain answers, which are true
regardless of how incomplete information is interpreted. In practice,
on the other hand, query answering must be very efficient, and to
achieve this, SQL uses three-valued logic for evaluating queries on
databases with nulls. Due to the complexity mismatch, the two
approaches cannot coincide, but perhaps they are related in some way?
For instance, does SQL always produce answers we can be certain about?
This is not so: SQL's semantics and the certain answers semantics can be totally unrelated, and SQL can produce both false negatives (i.e., miss some tuples that are in the answer) and, even worse, false positives (i.e., return tuples that should not be in the answer). We show, however, that a slight modification of the three-valued semantics for queries can provide certainty guarantees: false positives no longer occur. The key point of the new scheme is to fully utilize the three-valued semantics, and to classify answers not as certain or non-certain, as was done before, but rather as certainly true, certainly false, or unknown. This requires relatively small changes to the evaluation procedure, which we consider at the level of both declarative (relational calculus) and procedural (relational algebra) queries. We also introduce a new notion of certain answers with nulls, which properly accounts for queries returning tuples containing null values.
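The mismatch the abstract describes can be sketched with a toy three-valued evaluator (an illustrative sketch, not the talk's algorithm; the relation, condition, and all names below are made up). The classic example is a tautological condition such as "a = 1 OR a <> 1", which is certainly true for every row, yet SQL drops the NULL row because the condition evaluates to unknown rather than true:

```python
# SQL-style three-valued logic: comparisons with NULL yield UNKNOWN.
# SQL's WHERE keeps only rows whose condition is TRUE; the modified
# scheme instead classifies each row as true, false, or unknown.

TRUE, FALSE, UNKNOWN = "true", "false", "unknown"

def eq(a, b):
    """SQL equality: any comparison involving NULL (None) is UNKNOWN."""
    if a is None or b is None:
        return UNKNOWN
    return TRUE if a == b else FALSE

def not3(v):
    return {TRUE: FALSE, FALSE: TRUE, UNKNOWN: UNKNOWN}[v]

def or3(a, b):
    if TRUE in (a, b):
        return TRUE
    if a == FALSE and b == FALSE:
        return FALSE
    return UNKNOWN

# Rows of a relation R(a); None stands for SQL's NULL.
rows = [1, 2, None]

# Tautology "a = 1 OR a <> 1": true under every interpretation of the null.
def cond(a):
    return or3(eq(a, 1), not3(eq(a, 1)))

sql_answer = [a for a in rows if cond(a) == TRUE]  # SQL: keep TRUE rows only
classified = {a: cond(a) for a in rows}            # three-way classification

print(sql_answer)  # [1, 2] -- the NULL row is a false negative
print(classified)  # {1: 'true', 2: 'true', None: 'unknown'}
```

Under the three-way classification, the NULL row is reported as unknown rather than silently dropped, which is the kind of distinction the modified semantics makes available.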
Bio: Leonid Libkin is Professor of Foundations of Data Management in the School of Informatics at the University of Edinburgh. He was previously a Professor at the University of Toronto and a member of research staff at Bell Laboratories in Murray Hill. He received his PhD from the University of Pennsylvania in 1994. His main research interests are in the areas of data management and applications of logic in computer science. He has written five books and over 180 technical papers. His awards include a Marie Curie Chair Award and five Best Paper Awards. He has chaired programme committees of major database conferences (ACM PODS, ICDT) and was the conference chair of the 2010 Federated Logic Conference. He has given many invited conference talks and has served on multiple program committees and editorial boards. He is an ACM fellow and a fellow of the Royal Society of Edinburgh.
Abstract: Big Data has become a hot topic in the last few years in both industry and the research community. For the most part, these developments were initially triggered by the requirements of Web 2.0 companies. Both technical and non-technical issues have continued to fuel the rapid pace of developments in the Big Data space. Open source and non-traditional software entities have played key roles in the latter. As always happens with any emerging technology, there is a fair amount of hype that accompanies the work being done in the name of Big Data. The clear-cut distinctions that were initially made between Big Data systems and traditional database management systems are being blurred as the needs of the broader set of ("real world") users and developers have come into sharper focus in the last couple of years. In this talk, I will survey the developments in Big Data and try to distill reality from the hype!
Bio: Dr. C. Mohan has been an IBM researcher for 33 years in the information management area, impacting numerous IBM and non-IBM products, the research and academic communities, and standards, especially with his invention of the ARIES family of database locking and recovery algorithms, and the Presumed Abort commit protocol. This IBM, ACM and IEEE Fellow has also served as the IBM India Chief Scientist for 3 years. In addition to receiving the ACM SIGMOD Innovation Award (1996), the VLDB 10 Year Best Paper Award (1999) and numerous IBM awards, Mohan was elected to the US and Indian National Academies of Engineering, and was named an IBM Master Inventor. This Distinguished Alumnus of IIT Madras received his PhD from the University of Texas at Austin. He is an inventor of 45 patents. He has served on the advisory board of IEEE Spectrum and on the IBM Software Group Architecture Board's Council. Mohan is a frequent speaker in North America, Europe and India, and has given talks in 40 countries. More information can be found on his home page at http://bit.ly/CMohan
Abstract: Techniques for efficiently managing Semantic Web data have attracted
significant interest from the data management and knowledge representation
communities. In particular, as RDF is the most widely used model for Semantic
Web data, a great deal of effort has been invested, especially in the database
community, into algorithms and tools for efficient RDF query evaluation.
Semantic Web data can be seen as a collection of facts enriched with
ontological schemas, or semantic constraints, based on which reasoning can be
applied to infer new information. Taking into account this implicit
information is crucial for answering queries.
The literature provides two classes of techniques for implementing reasoning,
namely query reformulation and database saturation. The performance of the
respective algorithms depends on the expressive power of the ontological
schema language, as well as on the characteristics of the data and queries.
While saturation appears simple and robust, it is not always feasible and it
may also perform poorly, especially in a distributed setting.
This talk describes recent work on efficiently evaluating reformulated queries
by adequately translating them to first-order logic. This translation enables
their evaluation by relational database management systems (RDBMSs), thus
leveraging their efficient storage and processing engines for the benefit of
complex reformulated queries, which, in the absence of our optimizations,
RDBMSs evaluate very poorly or fail to evaluate.
Joint work with Damian Bursztyn and François Goasdoué
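The two reasoning strategies contrasted above can be illustrated with a toy example for a single RDFS rule (if (x, type, C) and (C, subClassOf, D) then (x, type, D)). This is a hedged sketch, not the speakers' system; the data and function names are invented. Saturation materializes the inferred triples up front, while reformulation expands the query into a union over subclasses and evaluates it on the original data:

```python
# Toy comparison of database saturation vs. query reformulation for
# one RDFS entailment rule over triples stored as Python tuples.

def saturate(triples):
    """Apply the subClassOf rule to a fixpoint, materializing inferences."""
    triples = set(triples)
    changed = True
    while changed:
        changed = False
        inferred = {(x, "type", d)
                    for (x, p1, c) in triples if p1 == "type"
                    for (c2, p2, d) in triples
                    if p2 == "subClassOf" and c2 == c}
        if not inferred <= triples:
            triples |= inferred
            changed = True
    return triples

def reformulate(cls, triples):
    """Expand the query 'instances of cls' into cls plus all its subclasses."""
    classes, frontier = {cls}, {cls}
    while frontier:
        frontier = {c for (c, p, d) in triples
                    if p == "subClassOf" and d in classes} - classes
        classes |= frontier
    return classes

data = {("alice", "type", "Student"),
        ("Student", "subClassOf", "Person")}

# Saturation: query the materialized triples directly.
sat = saturate(data)
print(("alice", "type", "Person") in sat)  # True

# Reformulation: evaluate the expanded query on the original data.
targets = reformulate("Person", data)      # {'Person', 'Student'}
answers = {x for (x, p, c) in data if p == "type" and c in targets}
print(answers)                             # {'alice'}
```

Both strategies return alice as an instance of Person; the trade-off sketched in the abstract is that saturation pays the inference cost at load time (and must be maintained under updates), whereas reformulation pays it at query time, producing larger queries whose efficient evaluation is precisely what the talk addresses.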
Bio: Ioana Manolescu is a senior researcher at Inria Saclay, and the lead of the joint team OAK (https://team.inria.fr/oak) between Inria and Université de Paris Sud in Orsay, France. She has been a post-doctoral fellow and visiting professor at Politecnico di Milano and obtained her PhD in 2001 from Université de Versailles Saint-Quentin and Inria Rocquencourt. Her main research interests include algebraic and storage optimizations for semistructured data, in particular data models for the Semantic Web, novel data models and languages for complex data management, data models and algorithms for fact-checking, and distributed architectures for complex large data. More detail at http://pages.saclay.inria.fr/ioana.manolescu/
- By car: If you drive from Rome, take Highway A1 (autostrada A1) towards Naples, exit at Cassino and follow the directions for Gaeta, passing through Formia. From Rome it is also possible to take the S.S. n.148 (Pontina) or the S.S. n.7 (Appia); once you have passed Terracina, take the S.S. 213 (Flacca) to Gaeta. If you drive from Naples, take Highway A1 (autostrada A1) towards Rome, exit at Cassino and follow the directions for Gaeta, passing through Formia.
- By train: The closest train station is Formia, located just 5 kilometers away (www.trenitalia.com); a taxi service from Formia to Gaeta is available and can be reserved at http://www.gaetataxiservice.it/
- By plane: Being halfway between Rome and Naples, you can fly either to Fiumicino/Ciampino (Rome) or to Capodichino (Naples).