Show simple item record

Author: Nassar, Mohamed
Author: Malluhi, Qutaibah
Author: Atallah, Mikhail
Author: Shikfa, Abdullatif
Available date: 2024-07-17T07:14:46Z
Publication Date: 2019
Publication Name: IEEE Transactions on Cloud Computing
Resource: Scopus
Identifier: http://dx.doi.org/10.1109/TCC.2017.2682860
ISSN: 2168-7161
URI: http://hdl.handle.net/10576/56756
Abstract: This paper addresses the problem of sharing person-specific genomic sequences without violating the privacy of their data subjects to support large-scale biomedical research projects. The proposed method builds on the framework proposed by Kantarcioglu et al. [1] but extends the results in a number of ways. One improvement is that our scheme is deterministic, with zero probability of a wrong answer (as opposed to a low probability). We also provide a new operating point in the space-time tradeoff, by offering a scheme that is twice as fast as theirs but uses twice the storage space. This point is motivated by the fact that storage is cheaper than computation in current cloud computing pricing plans. Moreover, our encoding of the data makes it possible for us to handle a richer set of queries than exact matching between the query and each sequence of the database, including: (i) counting the number of matches between the query symbols and a sequence; (ii) logical OR matches where a query symbol is allowed to match a subset of the alphabet, thereby making it possible to handle (as a special case) a "not equal to" requirement for a query symbol (e.g., "not a G"); (iii) support for the extended alphabet of nucleotide base codes that encompasses ambiguities in DNA sequences (this happens on the DNA sequence side instead of the query side); (iv) queries that specify the number of occurrences of each kind of symbol in the specified sequence positions (e.g., two 'A' and four 'C' and one 'G' and three 'T', occurring in any order in the query-specified sequence positions); (v) a threshold query whose answer is 'yes' if the number of matches exceeds a query-specified threshold (e.g., "7 or more matches out of the 15 query-specified positions"). (vi) For all query types, we can hide the answers from the decrypting server, so that only the client learns the answer. (vii) In all cases, the client deterministically learns only the query's answer, except for query type (v) where we quantify the (very small) statistical leakage to the client of the actual count.
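As a rough illustration of what query types (i) to (v) ask for, the following Python sketch evaluates them on plaintext sequences. It is not the paper's encrypted scheme, which the abstract does not describe; the function names, the query representation (a mapping from position to allowed bases), and the rule that an ambiguous sequence symbol matches when any base it may stand for is allowed are all assumptions made for this example. The ambiguity table is the standard IUPAC nucleotide code.

```python
from collections import Counter

# Standard IUPAC nucleotide codes: each code stands for a set of bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def count_matches(seq, query):
    """Types (i)-(iii): count the query positions that match.

    `query` maps a position to a string of allowed bases, so "ACT"
    expresses "not a G". Assumption: an ambiguous sequence symbol
    matches if any base it may stand for is allowed.
    """
    return sum(
        1 for pos, allowed in query.items()
        if set(IUPAC[seq[pos]]) & set(allowed)
    )

def symbol_counts_match(seq, positions, wanted):
    """Type (iv): the given positions hold exactly the wanted number of
    each symbol, in any order. `wanted` lists positive counts only."""
    return Counter(seq[p] for p in positions) == Counter(wanted)

def threshold_match(seq, query, t):
    """Type (v): 'yes' when at least t query positions match
    (read as "t or more", following the abstract's example)."""
    return count_matches(seq, query) >= t

seq = "ACGTAC"
query = {0: "A", 2: "CT", 3: "ACT", 5: "G"}  # position 3: "not a G"
print(count_matches(seq, query))              # positions 0 and 3 match
print(threshold_match(seq, query, 2))
print(symbol_counts_match(seq, [0, 1, 4, 5], {"A": 2, "C": 2}))
```

In the setting the abstract describes, these answers would be computed over encoded, encrypted data so that the servers do not see the sequences or (per item vi) the answers; the sketch only pins down the plaintext meaning of each query.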
Sponsor: This work was supported in part by NPRP grants from the Qatar National Research Fund (award numbers NPRP 09-622-1-090 and NPRP X-063-1-014); by National Science Foundation Grants CPS-1329979 and CNS-0915436; and by sponsors of the Center for Education and Research in Information Assurance and Security. The statements made herein are solely the responsibility of the authors. This work was initiated while Mohamed Nassar was at Qatar University.
Language: en
Publisher: Institute of Electrical and Electronics Engineers Inc.
Subject: cloud security; DNA databases; secure outsourcing
Title: Securing Aggregate Queries for DNA Databases
Type: Article
Pagination: 827-837
Issue Number: 3
Volume Number: 7


