
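The article "Static Deadlock Detection in MPI Synchronization Communication" examines how to detect and prevent deadlocks effectively in high-performance computing (HPC), in particular when the Message Passing Interface (MPI) is used for synchronous communication. MPI is a programming interface widely used in distributed computing environments; it lets programmers pass messages between multiple processes to implement parallel computation. However, when several processes depend on one another and form a circular wait in which none can proceed, a deadlock occurs, which seriously harms the efficiency and stability of the system.

A deadlock model is the basis for understanding and solving deadlock problems. In MPI synchronous communication, a deadlock usually arises when several processes wait for one another to release a resource or complete a particular operation, producing a stalemate that cannot be broken. The article may analyze this model in detail, including the four necessary conditions for deadlock (mutual exclusion, hold and wait, no preemption, and circular wait), and explain how these conditions appear in an MPI environment.

The deadlock detection algorithms for the MPI synchronous communication model are the core of the article. These algorithms may include graph-based methods, such as building a dependency graph between processes and searching it for cycles, or using a resource allocation graph to analyze how resources are allocated and requested. The article may also discuss static detection strategies, which predict and prevent possible deadlocks before the program runs; compared with dynamic detection, static detection finds problems before execution and avoids run-time performance overhead. Possible implementations include graph algorithms such as depth-first search and topological sorting for identifying potential circular waits. The complexity and efficiency of these algorithms may be a focus of the discussion, because efficient deadlock detection is essential for large-scale MPI applications.

To support its theory and algorithms, the article very likely includes case studies and experimental results that validate the proposed detection method. The experiments may be run on different MPI applications or in simulated environments, comparing the detection accuracy and time overhead of different methods to demonstrate the advantage of static detection in MPI synchronous communication.

The article provides a theoretical basis and practical solutions for deadlock problems in MPI synchronous communication and is a valuable reference for researchers and developers working on parallel computing and MPI programming. By studying its theory and algorithms, readers can improve the efficiency of parallel program design and avoid computation stalls caused by deadlock.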


Static Deadlock Detection in MPI Synchronization Communication

Liao Ming-Xue, He Xiao-Xin, Fan Zhi-Hua

Institute of Software, the Chinese Academy of Sciences, 100080, China

liaomingxue@sohu.com

Abstract

It is very common to use dynamic methods to detect deadlocks in MPI programs because static methods have some restrictions. To guarantee high reliability for important MPI-based application software, a model of MPI synchronization communication is abstracted and a static method is devised to examine deadlocks in this model. The model has three forms of differing complexity: the sequential model, the single-loop model and the nested-loop model. The sequential model is the basis for all models. The single-loop model must be treated with a special type of equation group, and the nested-loop model extends the methods for the other two models. A standard Java-based software framework derived from these methods has been constructed to determine whether MPI programs are free from synchronization communication deadlocks. Our practice shows that the software framework is better than tools using dynamic methods because it can find all synchronization communication deadlocks before an MPI-based program starts running.

Keywords: Message Passing Interface (MPI); deadlock; static method; model; synchronization communication

1. Introduction

Deadlock is a very common problem in software design, and it may cause a running program to break down. Deadlock in a large application may even result in great loss; one example is the deadlock that happened in the control software on NASA's Pathfinder, which landed on Mars [0]. In 1971 Coffman addressed three strategies for handling deadlocks: deadlock prevention, deadlock avoidance, and deadlock detection and recovery [0]. Deadlock prevention and avoidance have many deficiencies, so they are mostly used in systems that require high reliability [0][0][0]. MPI [0] is a library specification for message passing, proposed as a standard by a broadly based committee of vendors and users. It was designed for high performance on both massively parallel machines and workstation clusters; however, it is very difficult to debug programs based on it [0][0].

Currently there are a few tools based on dynamic methods for debugging errors in MPI programs, especially for detecting deadlocks in them [0][0][0][0]. Both [0] and [0] need to insert hand-shake code into the user's source programs to gather the status of nodes in order to determine a deadlock. W. Haque uses the MPI profiling interface to intercept all MPI routine calls in order to check for deadlocks [0]. [0] also uses the MPI profiling interface, but its interest covers not only deadlocks but also other kinds of errors in MPI programs.

However, these dynamic deadlock detection methods have a serious deficiency: when a deadlock happens, we can do nothing but suffer the resulting disaster. Systems requiring high reliability cannot tolerate this deficiency. For example, if an on-satellite cluster for monitoring rural floods or crops breaks down because of a deadlock, the loss of life and the economic loss will be enormous. Static methods therefore need to be developed for such environments.

This paper introduces a static method to detect deadlocks in MPI synchronization communication. This static method is entirely different from the static method in [0], which is based on techniques for searching finite state machines.

Section 2 defines a model of MPI synchronization communication. The model takes three forms: the sequential model (S-Model), the single-loop model (L0) and the nested-loop model (L2). S-Model is the most basic model. To detect deadlocks in L0 and L2, we need to transform them into S-Model at an appropriate time. Detecting deadlocks in L0 involves a special type of equation group called a ratio equation group. The algorithm for L2 deadlock detection combines the methods for L0 and S-Model.

Section 3 demonstrates how to examine deadlocks in a sequential model, and Sections 4 and 5 discuss L0 and L2. Section 6 gives an overview of our software framework for determining whether MPI programs are free from synchronization communication deadlocks and concludes the paper.

2. Modeling MPI programs

The MPI program studied in this paper is in Fig 1.

<program> ::= <node-program>+
    /* An MPI program consists of the programs running on each node */
<node-program> ::= < <nodeID> , <statements> >
    /* A node program includes a node ID and statements */
<statements> ::= <statement>+
    /* Statements are a non-null set of statements */
<statement> ::= <sequence-statement> | <for-statement>
    /* A statement is either a sequential one or a loop one */
<sequence-statement> ::= <send> | <receive>
    /* A sequential statement is either an MPI synchronization send API call or a receive one */
<for-statement> ::= for( <n> ){ <statements> }
    /* A loop statement includes the loop count n and statements */

Fig 1 MPI Synchronization Communication Model L2
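The grammar in Fig 1 can be mirrored by a small data structure. The following Python sketch is an illustration only: the class names, field names and the `unroll` helper are hypothetical and do not come from the paper. In particular, `unroll` expands loops literally to show what a `for` statement denotes; it is not the paper's method, which handles loops with ratio equation groups.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Send:
    msg: str   # message name, e.g. "a"
    to: int    # ID of the receiving node


@dataclass
class Recv:
    msg: str   # message name
    frm: int   # ID of the sending node


@dataclass
class For:
    n: int                    # loop count
    body: List["Statement"]   # non-empty list of statements


# A statement is a send, a receive, or a loop (no conditionals in L2).
Statement = Union[Send, Recv, For]


@dataclass
class NodeProgram:
    node_id: int
    statements: List[Statement]


def unroll(statements):
    """Naively expand every loop into a flat list of send/recv statements.

    Illustration only (an assumption of this sketch, not the paper's method).
    """
    flat = []
    for stmt in statements:
        if isinstance(stmt, For):
            flat.extend(unroll(stmt.body) * stmt.n)
        else:
            flat.append(stmt)
    return flat
```

A node program with a nested loop, such as `NodeProgram(0, [Send("a", 1), For(2, [Recv("b", 1), For(2, [Send("c", 1)])])])`, is in L2; a program whose statements contain no `For` is already in the sequential form discussed in Section 3.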

Programs of the form in Fig 1 may contain multi-layer nested loops but no conditional statements. Such programs belong to a model called L2. To explain why conditional statements are not covered in L2, consider MPI program (1):

Process (machine) 0                 Process (machine) 1
if (condition 1)                    if (condition 2)
    send Msg a To Process1              recv Msg a From Process0        (1)

Detecting deadlocks in (1) requires dynamic techniques, which are not covered in this paper.

3. Sequential model

Model L2 is called a sequential model (S-Model) if it has no loop statements. Program (2) is an example of S-Model:

P0                     P1                      P2
send Msg a To P2       recv Msg b From P0      recv Msg c From P1
send Msg b To P1       send Msg c To P2        recv Msg a From P0        (2)

To detect deadlocks in (2), the first step is usually to build its Message Dependence Graph (MDG). The MDG of (2) is shown in Fig. 2:

[Figure: directed graph with nodes send a, send b, recv b, send c, recv c, recv a]

Fig. 2 MDG of (2)

This figure represents a directed graph. A line without an arrow is bidirectional. A cycle with length greater than 2 is "send a → send b → recv b → send c → recv c → recv a → send a".

The MDG in Fig. 2 contains a cycle whose length, 6, is greater than 2, so we declare a deadlock in this MDG. As a result, program (2) has a deadlock. The deadlock in the MDG of (2) indicates the following situation: none of the three processes can advance a single step, because their waits for one another form a cycle. Theorem 1 states this fact:

Theorem 1 An S-Model has no deadlock if and only if its MDG has no cycle with length greater than 2.

Cycle detection is often used to check for deadlocks. Our MDG includes a special cycle between each pair of matching messages; this special cycle (whose length is 2) does not indicate a deadlock, so Theorem 1 excludes this case. In fact, an MDG contains all temporal relationships among all messages.

How to build an S-Model's MDG is not covered in this paper. Moreover, this paper does not explain why Theorem 1 holds; a strict proof of Theorem 1 needs many definitions. Intuitively, the theorem is correct.
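Since the MDG construction is not given here, the following Python sketch assumes one that is consistent with the cycle listed for Fig. 2: a directed edge between consecutive statements of the same process, plus a bidirectional edge between each matching send/recv pair. Under that assumption, any cycle longer than 2 must pass through some program-order edge u → v, so it is enough to test whether u is reachable from v. The function names and the tuple encoding are hypothetical, each message is assumed to be sent once and received once, and program (2) is encoded as read from that cycle.

```python
from collections import defaultdict, deque


def build_mdg(programs):
    """programs: {process: [(op, msg), ...]} with op in {"send", "recv"}.

    Returns (edges, order_edges). The edge rules are an assumption of this
    sketch, not a construction taken from the paper.
    """
    edges = defaultdict(set)
    order_edges = []
    for stmts in programs.values():
        # Program order: each statement precedes the next one in its process.
        for u, v in zip(stmts, stmts[1:]):
            edges[u].add(v)
            order_edges.append((u, v))
    # Matching pairs: a bidirectional edge, i.e. the harmless length-2 cycle.
    for msg in {m for stmts in programs.values() for _, m in stmts}:
        send, recv = ("send", msg), ("recv", msg)
        edges[send].add(recv)
        edges[recv].add(send)
    return edges, order_edges


def reachable(edges, src, dst):
    """Breadth-first search: is dst reachable from src?"""
    seen, todo = {src}, deque([src])
    while todo:
        node = todo.popleft()
        if node == dst:
            return True
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return False


def has_long_cycle(programs):
    """True if the assumed MDG has a cycle of length greater than 2."""
    edges, order_edges = build_mdg(programs)
    return any(reachable(edges, v, u) for u, v in order_edges)


program_2 = {
    "P0": [("send", "a"), ("send", "b")],
    "P1": [("recv", "b"), ("send", "c")],
    "P2": [("recv", "c"), ("recv", "a")],
}
print(has_long_cycle(program_2))  # True: the length-6 cycle of Fig. 2
```

Swapping the two receives in P2 removes the long cycle, and `has_long_cycle` then returns `False`.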

However, using Theorem 1 directly to check an S-Model's MDG is costly, because searching for a cycle of length greater than 2 is not easy and building an MDG is costly too. MDG checks are very frequent in our software framework for detecting MPI deadlocks, so we developed a very efficient algorithm that finds deadlocks in the MPI S-Model without searching for cycles in the MDG. Below is a brief introduction to the algorithm.

Firstly, the sequential model is mapped into a set of character strings and its deadlock detection problem is translated into an equivalent multi-queue string matching problem. The following step is a loop that repeats until all queues become empty or no queue can be updated any further. If the loop ends because no queue can be updated, we declare a deadlock. Each time the loop starts, we update two
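
The description of the update step is cut off in this excerpt, so the following Python sketch is only one plausible reading of the loop and not the authors' algorithm. It assumes that an update removes the heads of two queues when one head is a synchronous send and the other is the matching receive, and it skips the mapping to character strings. The function name and the `(op, msg, peer)` tuple encoding are hypothetical.

```python
from collections import deque


def detect_deadlock(programs):
    """programs: {proc: [(op, msg, peer), ...]} with op in {"send", "recv"}.

    Returns True if the queues get stuck before all of them are empty.
    """
    queues = {proc: deque(stmts) for proc, stmts in programs.items()}
    while any(queues.values()):          # stop when every queue is empty
        progressed = False
        for proc, queue in queues.items():
            if not queue:
                continue
            op, msg, peer = queue[0]
            if op != "send":
                continue
            peer_queue = queues.get(peer)
            # A synchronous send completes only if the peer's next
            # statement is the matching receive.
            if peer_queue and peer_queue[0] == ("recv", msg, proc):
                queue.popleft()
                peer_queue.popleft()
                progressed = True
        if not progressed:               # no queue could be updated
            return True
    return False


program_2 = {
    "P0": [("send", "a", "P2"), ("send", "b", "P1")],
    "P1": [("recv", "b", "P0"), ("send", "c", "P2")],
    "P2": [("recv", "c", "P1"), ("recv", "a", "P0")],
}
print(detect_deadlock(program_2))  # True: no pair of queue heads matches
```

On program (2) the first pass finds no matching pair of heads, so the sketch reports a deadlock without building an MDG; with the two receives of P2 swapped, all three queues drain and it returns `False`.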
