SEBD 2015 keynotes
Leonid Libkin
Professor, University of Edinburgh
C. Mohan
IBM Fellow, Almaden Research Center
Ioana Manolescu
Senior Inria researcher
Title: "SQL's Handling of Nulls: Can It Be Fixed?"
Speaker: Leonid Libkin
Abstract: Our goal is to bridge the difference between theoretical and practical
approaches to answering queries over databases with nulls, and to
suggest ways in which SQL's handling of nulls can be modified to
provide some correctness guarantees. Theoretical research long
ago identified the notion of correctness of query answering over
incomplete data: one needs to find certain answers, which are true
regardless of how incomplete information is interpreted. In practice,
on the other hand, query answering must be very efficient, and to
achieve this, SQL uses three-valued logic for evaluating queries on
databases with nulls. Due to the complexity mismatch, the two
approaches cannot coincide, but perhaps they are related in some way?
For instance, does SQL always produce answers we can be certain about?
This is not so: SQL's and certain answers semantics could be totally
unrelated, and SQL can produce both false negatives (i.e., miss some
tuples that are in the answer) and even worse, false positives (i.e.,
return tuples that should not be in the answer). We show, however,
that a slight modification of the three-valued semantics for queries
can provide certainty guarantees: false positives no longer occur. The
key point of the new scheme is to fully utilize the three-valued
semantics, and classify answers not into certain or non-certain, as
was done before, but rather into certainly true, certainly false, or
unknown. This yields relatively small changes to the evaluation
procedure, which we consider at the level of both declarative
(relational calculus) and procedural (relational algebra) queries. We
also introduce a new notion of certain answers with nulls, which
properly accounts for queries returning tuples containing null values.
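
As a small illustration of the false positives and false negatives mentioned in the abstract, the following sketch runs two queries over tables containing nulls. It is not taken from the talk: the tables R, S and T, their columns, and the use of SQLite through Python's sqlite3 module are all assumptions made for the example.

```python
import sqlite3

# Hypothetical tables: R and S have one column A; T has columns id and A.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE R (A INTEGER);
    CREATE TABLE S (A INTEGER);
    CREATE TABLE T (id INTEGER, A INTEGER);
    INSERT INTO R VALUES (1);
    INSERT INTO S VALUES (NULL);
    INSERT INTO T VALUES (10, NULL);
""")

# False positive: the difference R - S written with NOT EXISTS.
# If the null in S stands for 1, the difference is empty, so 1 is not a
# certain answer. SQL still returns it, because NULL = 1 is unknown and
# the subquery therefore selects no rows.
false_positive = cur.execute("""
    SELECT R.A FROM R
    WHERE NOT EXISTS (SELECT * FROM S WHERE S.A = R.A)
""").fetchall()

# False negative: A = 1 OR A <> 1 holds for every value the null could
# take, so id 10 is a certain answer. SQL evaluates the condition to
# unknown and drops the row.
false_negative = cur.execute("""
    SELECT T.id FROM T WHERE T.A = 1 OR T.A <> 1
""").fetchall()

print(false_positive, false_negative)
conn.close()
```

The first query returns a tuple that is not certain, and the second misses one that is, which matches the two kinds of error the abstract describes.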
Bio: Leonid Libkin is Professor of Foundations of Data Management in the
School of Informatics at the University of Edinburgh. He was
previously a Professor at the University of Toronto and a member of
research staff at Bell Laboratories in Murray Hill. He received his
PhD from the University of Pennsylvania in 1994. His main research
interests are in the areas of data management and applications of
logic in computer science. He has written five books and over 180
technical papers. His awards include a Marie Curie Chair Award and
five Best Paper Awards. He has chaired programme committees of major
database conferences (ACM PODS, ICDT) and was the conference chair of
the 2010 Federated Logic Conference. He has given many invited
conference talks and has served on multiple program committees and
editorial boards. He is an ACM Fellow and a Fellow of the Royal
Society of Edinburgh.
=======================================================
Title: "Big Data: Hype and Reality"
Speaker: Dr. C. Mohan
Abstract: Big Data has become a hot topic in the last few years in both
industry and the research community. For the most part, these developments
were initially triggered by the requirements of Web 2.0 companies. Both
technical and non-technical issues have continued to fuel the rapid pace of
developments in the Big Data space. Open source and non-traditional software
entities have played key roles in the latter. As always happens with any
emerging technology, there is a fair amount of hype that accompanies the work
being done in the name of Big Data. The clear-cut distinctions that
were made initially between Big Data systems and traditional database
management systems are being blurred as the needs of the broader set of
(“real world”) users and developers have come into sharper focus in the last
couple of years. In this talk, I will survey the developments in Big Data and
try to distill reality from the hype!
Bio: Dr. C. Mohan has been an IBM researcher for 33 years in the information management area,
impacting numerous IBM and non-IBM products, the research and academic communities, and standards,
especially with his invention of the ARIES family of database locking and recovery algorithms,
and the Presumed Abort commit protocol. This IBM, ACM and IEEE Fellow has also served as the
IBM India Chief Scientist for 3 years. In addition to receiving the ACM SIGMOD Innovation Award (1996),
the VLDB 10 Year Best Paper Award (1999) and numerous IBM awards,
Mohan was elected to the US and Indian National Academies of Engineering,
and was named an IBM Master Inventor. This Distinguished Alumnus of IIT
Madras received his PhD at the University of Texas at Austin. He is an inventor of 45 patents.
He has served on the advisory board of IEEE Spectrum and on the IBM Software Group Architecture
Board’s Council. Mohan is a frequent speaker in North America, Europe and India,
and has given talks in 40 countries. More information can be found on his
home page at http://bit.ly/CMohan
=======================================================
Title: "Database Optimization Techniques for Semantic Queries"
Speaker: Ioana Manolescu
Abstract: Techniques for efficiently managing Semantic Web data have attracted
significant interest from the data management and knowledge representation
communities. In particular, as RDF is the most widely used model for Semantic
Web data, a great deal of effort has been invested, especially in the database
community, into algorithms and tools for efficient RDF query evaluation.
Semantic Web data can be seen as a collection of facts enriched with
ontological schemas, or semantic constraints, based on which reasoning can be
applied to infer new information. Taking into account this implicit
information is crucial for answering queries.
The literature provides two classes of techniques for implementing reasoning,
namely query reformulation and database saturation. The performance of the
respective algorithms depends on the expressive power of the ontological
schema language, as well as on the characteristics of the data and queries.
While saturation appears simple and robust, it is not always feasible and it
may also perform poorly, especially in a distributed setting.
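
The difference between the two classes of techniques can be sketched on a toy example. This is only an illustration under simplifying assumptions, not the algorithms discussed in the talk: the triples, the class names, and the helper functions saturate, reformulate and instances_of are hypothetical, and the only constraint considered is a subclass relationship.

```python
# Toy data: (subject, property, object) triples. All names are hypothetical.
facts = {
    ("alice", "type", "Professor"),
    ("bob", "type", "Student"),
    ("carol", "type", "Researcher"),
}
# Semantic constraints as (subclass, superclass) pairs.
subclass_of = {
    ("Professor", "Researcher"),
    ("Researcher", "Person"),
    ("Student", "Person"),
}

def instances_of(triples, cls):
    """Evaluate the query 'all x such that x has type cls' with no reasoning."""
    return {s for s, p, o in triples if p == "type" and o == cls}

def saturate(triples, constraints):
    """Saturation: add all implied type facts to the data until a fixpoint."""
    saturated = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(saturated):
            if p != "type":
                continue
            for sub, sup in constraints:
                if sub == o and (s, "type", sup) not in saturated:
                    saturated.add((s, "type", sup))
                    changed = True
    return saturated

def reformulate(cls, constraints):
    """Reformulation: rewrite the query into a union over cls and its subclasses."""
    classes = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in constraints:
            if sup == current and sub not in classes:
                classes.add(sub)
                frontier.append(sub)
    return classes

# Saturation changes the data and keeps the query as it is.
via_saturation = instances_of(saturate(facts, subclass_of), "Researcher")

# Reformulation changes the query and leaves the data untouched.
via_reformulation = set().union(
    *(instances_of(facts, c) for c in reformulate("Researcher", subclass_of))
)

print(sorted(via_saturation), sorted(via_reformulation))
```

Both strategies return the same answers here. Saturation pays its cost up front in extra stored facts, while reformulation pays at query time with a larger query.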
This talk describes recent work on the topic of efficiently evaluating
reformulated queries by adequately translating them to first-order logic. This
translation enables their evaluation by relational database management systems
(RDBMSs), thus leveraging their efficient storage and processing engines for
the benefit of complex reformulated queries on which they perform very poorly
or fail, in the absence of our optimizations.
Joint work with Damian Bursztyn and François Goasdoué.
Bio: Ioana Manolescu is a senior researcher at Inria Saclay, and the lead of the
joint team OAK (https://team.inria.fr/oak) between Inria and Université de
Paris Sud in Orsay, France.
She has been a post-doctoral fellow and visiting professor at Politecnico di
Milano, and obtained her PhD in 2001 from Université de Versailles
Saint-Quentin and Inria Rocquencourt. Her main research interests include algebraic
and storage optimizations for semistructured data and in particular data
models for the Semantic Web, novel data models and languages for complex data
management, data models and algorithms for fact-checking, and distributed
architectures for complex large data. More details at
http://pages.saclay.inria.fr/ioana.manolescu/