News

Dr. Zizhong Chen of the University of California, Riverside, Visits the Center

2016-12-08

On December 8, 2016, Dr. Zizhong Chen of the University of California, Riverside, visited the center and gave an academic talk titled “Reliable Matrix Computations via Algorithm-Based Fault Tolerance”.

 


Abstract: Errors are common in today's computer systems. When an error occurs, if the affected application continues, we call it a fail-continue error. Otherwise, we call it a fail-stop error. In this talk, I will discuss our recent work on algorithm-based fault tolerance for reliable matrix computations. We have developed some highly efficient error correction techniques for selected widely used matrix computation algorithms to tolerate both fail-continue and fail-stop errors according to their specific algorithmic characteristics. By leveraging the algorithmic characteristics of these algorithms, the proposed techniques can achieve much higher efficiency than the traditional general techniques (i.e., Triple Modular Redundancy for fail-continue errors and checkpoint/restart for fail-stop errors).
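
For readers unfamiliar with algorithm-based fault tolerance, the general idea can be illustrated with the classic checksum scheme for matrix multiplication introduced by Huang and Abraham in 1984. The sketch below is not taken from the talk and does not represent the specific techniques Dr. Chen presented; the function names are hypothetical, and it uses small integer matrices so that checksums can be compared exactly. It shows how one silently corrupted element of a product (a fail-continue error in the abstract's terms) can be located and repaired from row and column checksums, without recomputing the product.

```python
def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add_column_checksum(a):
    """Append one extra row holding the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]


def add_row_checksum(b):
    """Append to every row of b the sum of that row."""
    return [row + [sum(row)] for row in b]


def locate_and_correct(cf):
    """Check a full-checksum product and repair a single bad data element.

    cf is the product of a column-checksum matrix and a row-checksum
    matrix, so its last column holds row sums and its last row holds
    column sums. Returns the (row, col) that was repaired, or None if
    all checksums agree. Exact comparison is used because the demo is
    integer-only; floating-point code would compare within a tolerance.
    """
    n, m = len(cf) - 1, len(cf[0]) - 1
    bad_rows = [i for i in range(n) if sum(cf[i][:m]) != cf[i][m]]
    bad_cols = [j for j in range(m)
                if sum(cf[i][j] for i in range(n)) != cf[n][j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) != 1 or len(bad_cols) != 1:
        raise ValueError("error pattern is not a single data element")
    i, j = bad_rows[0], bad_cols[0]
    # Row checksum minus the other elements of the row gives the true value.
    cf[i][j] = cf[i][m] - (sum(cf[i][:m]) - cf[i][j])
    return (i, j)


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
cf = matmul(add_column_checksum(A), add_row_checksum(B))
cf[1][0] += 7                      # simulate a silent error in one element
repaired = locate_and_correct(cf)  # finds and fixes element (1, 0)
```

The checksums add one extra row and one extra column to the product, which is why this kind of scheme can cost far less than running the whole computation three times as in Triple Modular Redundancy.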

Biography: Dr. Zizhong (Jeffrey) Chen is a faculty member in the Department of Computer Science and Engineering at the University of California, Riverside. His research interests include high performance computing, parallel and distributed systems, big data analytics, cluster and cloud computing, algorithm-based fault tolerance (ABFT), power and energy efficient computing, numerical algorithms and software, and large scale computer simulations. His research has been supported by the National Science Foundation, the Department of Energy, the CMG Reservoir Simulation Foundation, Abu Dhabi National Oil Company, Nvidia, and Microsoft Corporation. He has published over 70 papers, many of them in highly competitive conferences and journals such as HPDC, PPoPP, SC, ICS, IPDPS, TPDS, TC, JPDC, PARCO, SIMAX, SISC, and IBMRD. He has received a CAREER Award from the U.S. National Science Foundation and a Best Paper Award from the International Supercomputing Conference. Dr. Chen is a Senior Member of the IEEE and a Life Member of the ACM. He currently serves as Subject Area Editor for an Elsevier journal and Associate Editor for IEEE Transactions on Parallel and Distributed Systems.

 

Click here to download the poster.