Author: Yu, Yongyang
Author: Tang, Mingjie
Author: Aref, Walid G.
Author: Malluhi, Qutaibah M.
Author: Abbas, Mostafa M.
Author: Ouzzani, Mourad
Date available: 2020-09-20T08:35:38Z
Publication date: 2017
Publication name: Proceedings - International Conference on Data Engineering
Source: Scopus
ISSN: 10844627
URI: http://dx.doi.org/10.1109/ICDE.2017.150
URI: http://hdl.handle.net/10576/16192
Abstract: The use of large-scale machine learning and data mining methods is becoming ubiquitous in many application domains ranging from business intelligence and bioinformatics to self-driving cars. These methods heavily rely on matrix computations, and it is hence critical to make these computations scalable and efficient. These matrix computations are often complex and involve multiple steps that need to be optimized and sequenced properly for efficient execution. This paper presents new efficient and scalable matrix processing and optimization techniques for in-memory distributed clusters. The proposed techniques estimate the sparsity of intermediate matrix-computation results and optimize communication costs. An evaluation plan generator for complex matrix computations is introduced as well as a distributed plan optimizer that exploits dynamic cost-based analysis and rule-based heuristics to optimize the cost of matrix computations in an in-memory distributed environment. The result of a matrix operation will often serve as an input to another matrix operation, thus defining the matrix data dependencies within a matrix program. The matrix query plan generator produces query execution plans that minimize memory usage and communication overhead by partitioning the matrix based on the data dependencies in the execution plan. We implemented the proposed matrix processing and optimization techniques in Spark, a distributed in-memory computing platform. Experiments on both real and synthetic data demonstrate that our proposed techniques achieve up to an order-of-magnitude performance improvement over state-of-the-art distributed matrix computation systems on a wide range of applications.
Sponsor: This work was supported by NPRP grant 4-1534-1-247 from the Qatar National Research Fund and by National Science Foundation grant IIS 1117766.
Language: en
Publisher: IEEE Computer Society
Subject: Distributed computing
Subject: Matrix computation
Subject: Query optimization
Title: In-memory distributed matrix computation processing & optimization
Type: Conference Paper
Pages: 1047-1058


Files in this item

There are no files associated with this item.
