
Author: Al Marri, Wadha J.
Author: Malluhi, Qutaibah
Author: Ouzzani, Mourad
Author: Tang, Mingjie
Author: Aref, Walid G.
Editor: Traina, Agma Juci Machado
Editor: Traina Jr., Caetano
Editor: Cordeiro, Robson Leonardo Ferreira
Available date: 2016-05-01T13:33:02Z
Publication Date: 2014
Publication Name: Similarity Search and Applications: 7th International Conference, SISAP 2014, Los Cabos, Mexico, October 29-31, 2014. Proceedings
Resource: Scopus
Citation: Al Marri, W.J., Malluhi, Q., Ouzzani, M., Tang, M., Aref, W.G. "The similarity-aware relational intersect database operator" (2014) Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8821, pp. 164-175.
ISBN: 978-3-319-11987-8
ISBN: 978-3-319-11988-5 (Online)
URI: http://dx.doi.org/10.1007/978-3-319-11988-5_15
URI: http://hdl.handle.net/10576/4483
Abstract: Identifying similarities in large datasets is an essential operation in many applications such as bioinformatics, pattern recognition, and data integration. To make the underlying database system similarity-aware, the core relational operators have to be extended. Several similarity-aware relational operators have been proposed that introduce similarity processing at the database engine level, e.g., similarity joins and similarity group-by. This paper extends the semantics of the set intersection operator to operate over similar values. The paper describes the semantics of the similarity-based set intersection operator and develops an efficient query processing algorithm for evaluating it. The proposed operator is implemented inside an open-source database system, namely PostgreSQL. Several queries from the TPC-H benchmark are extended to include similarity-based set intersection predicates. Performance results demonstrate up to three orders of magnitude speedup over equivalent queries that employ only regular operators.
Sponsor: NPRP grant 4-1534-1-247 from the Qatar National Research Fund and National Science Foundation grants IIS 0916614, IIS 1117766, and IIS 0964639.
Language: en
Publisher: Springer International Publishing
Series relation: Lecture Notes in Computer Science
Subject: bioinformatics
data integration
pattern recognition
query processing
semantics
database operators
query processing algorithms
regular operators
relational operator
set intersection
similarity group-by
three orders of magnitude
TPC-H benchmarks
Title: The similarity-aware relational intersect database operator
Type: Conference Paper
Pagination: 164-175
Volume Number: 8821

