Securing Aggregate Queries for DNA Databases
Author | Nassar, Mohamed |
Author | Malluhi, Qutaibah |
Author | Atallah, Mikhail |
Author | Shikfa, Abdullatif |
Date available | 2024-07-17T07:14:46Z |
Publication date | 2019 |
Publication name | IEEE Transactions on Cloud Computing |
Source | Scopus |
Identifier | http://dx.doi.org/10.1109/TCC.2017.2682860 |
ISSN | 21687161 |
Abstract | This paper addresses the problem of sharing person-specific genomic sequences without violating the privacy of their data subjects to support large-scale biomedical research projects. The proposed method builds on the framework proposed by Kantarcioglu et al. [1] but extends the results in a number of ways. One improvement is that our scheme is deterministic, with zero probability of a wrong answer (as opposed to a low probability). We also provide a new operating point in the space-time tradeoff, by offering a scheme that is twice as fast as theirs but uses twice the storage space. This point is motivated by the fact that storage is cheaper than computation in current cloud computing pricing plans. Moreover, our encoding of the data makes it possible for us to handle a richer set of queries than exact matching between the query and each sequence of the database, including: (i) counting the number of matches between the query symbols and a sequence; (ii) logical OR matches where a query symbol is allowed to match a subset of the alphabet, thereby making it possible to handle (as a special case) a "not equal to" requirement for a query symbol (e.g., "not a G"); (iii) support for the extended alphabet of nucleotide base codes that encompasses ambiguities in DNA sequences (this happens on the DNA sequence side instead of the query side); (iv) queries that specify the number of occurrences of each kind of symbol in the specified sequence positions (e.g., two 'A' and four 'C' and one 'G' and three 'T', occurring in any order in the query-specified sequence positions); (v) a threshold query whose answer is 'yes' if the number of matches exceeds a query-specified threshold (e.g., "7 or more matches out of the 15 query-specified positions"). (vi) For all query types, we can hide the answers from the decrypting server, so that only the client learns the answer. (vii) In all cases, the client deterministically learns only the query's answer, except for query type (v) where we quantify the (very small) statistical leakage to the client of the actual count. |
Sponsor | This work was supported in part by NPRP grants from the Qatar National Research Fund (award number NPRP 09-622-1-090 and NPRP X-063-1-014); by National Science Foundation Grants CPS-1329979, CNS-0915436; and by sponsors of the Center for Education and Research in Information Assurance and Security. The statements made herein are solely the responsibility of the authors. This work was initiated while Mohamed Nassar was at Qatar University. |
Language | en |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Subject | cloud security; DNA databases; secure outsourcing |
Type | Article |
Pages | 827-837 |
Issue number | 3 |
Volume number | 7 |
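The query types listed in the abstract can be illustrated on unencrypted data. The following is a minimal sketch of the plaintext semantics only, for query types (i), (ii), (iv), and (v); it does not implement or reflect the paper's encoding or encryption scheme. All function names, the query representation (a mapping from sequence position to a set of allowed symbols), and the sample sequences are assumptions made for illustration. The threshold test uses "at least", following the abstract's "7 or more" example.

```python
from collections import Counter

def count_matches(seq, query):
    """Types (i)/(ii): query maps position -> set of allowed symbols.
    A one-element set is an exact match; a larger set is a logical OR."""
    return sum(1 for pos, allowed in query.items() if seq[pos] in allowed)

def threshold_match(seq, query, threshold):
    """Type (v): 'yes' when the match count is at least `threshold`."""
    return count_matches(seq, query) >= threshold

def composition_match(seq, positions, wanted):
    """Type (iv): the symbols at `positions` occur with the wanted
    multiplicities, in any order."""
    counts = Counter(seq[p] for p in positions)
    return all(counts[s] == wanted.get(s, 0) for s in set(counts) | set(wanted))

def aggregate_count(database, predicate):
    """Aggregate query: number of sequences satisfying the predicate."""
    return sum(1 for seq in database if predicate(seq))

# Hypothetical toy database of equal-length sequences.
db = ["ACGTAC", "ACGTTT", "TTGTAC"]

# 'A' at position 0, 'C' at position 1, and "not a G" at position 4.
query = {0: {"A"}, 1: {"C"}, 4: {"A", "C", "T"}}

per_sequence = [count_matches(s, query) for s in db]
at_least_two = aggregate_count(db, lambda s: threshold_match(s, query, 2))
one_of_each = aggregate_count(
    db, lambda s: composition_match(s, [0, 1, 2, 3], {"A": 1, "C": 1, "G": 1, "T": 1})
)
print(per_sequence, at_least_two, one_of_each)
```

Representing each query position as a set of allowed symbols lets exact matching, OR matching, and "not equal to" share one code path; in the scheme described by the abstract, the corresponding computation would be carried out over encoded, encrypted data instead.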
This item appears in the following collections
- Computer Science and Engineering [2402 items]
- Information Intelligence [93 items]