没有合适的资源?快使用搜索试试~ 我知道了~
一篇分析诊断被"hang"住数据库的资料(Oracle Performance Diagnostic Gui...
需积分: 10 11 下载量 148 浏览量
2013-08-09
13:19:25
上传
评论
收藏 180KB PDF 举报
温馨提示
试读
45页
分析、定位oracle数据库被"hang"住的资料。
资源推荐
资源详情
资源评论
Oracle Performance Diagnostic Guide
Hang/Locking
Version 3.1.0 January 13, 2009
Welcome to the Oracle Performance Diagnostic Guide This guide is intended to help you resolve query tuning, hang/locking, and slow database
issues. The guide is not an automated tool but rather seeks to show methodologies, techniques, common causes, and solutions to performance
problems.
Most of the guide is finished but portions of the content under the Hang/Locking tab is still under development.
Your feedback is very valuable to us - please email your comments to:
Vickie.Carbonneau@oracle.com
Contents
Hang/Locking > Identify the Issue > Overview
Recognize a Hang or Locking Issue
Clarify the Issue
Verify the Issue
Next Step - Data Collection
Hang/Locking > Identify the Issue >Data Collection
Gather Operating System (OS) Performance Data
Gather Hanganalyze and SystemState
Gather V$ View Data
Next Step - Analyze
Hang/Locking > Identify the Issue > Analysis
Verify OS Resource Usage
Verify the Database is Hung
Next Step - Determine a Cause
Would You Like to Stop and Log a Service Request?
Hang/Locking > Determine a Cause >Overview
Hang/Locking > Determine a Cause >Data Collection
"True" Database Hang
"Stuck" or "Locked" Sessions
Next Step - Analyze
Hang/Locking > Determine a Cause >Analysis
Identify Blocker and Waiter Data
"True" Database Hang
"Stuck" or "Locked" Sessions
Open a Service Request with Oracle Support Services
Give Us Your Feedback
Feedback
We look forward to your feedback. Please email any comments, suggestion to help improve this guide, or any issues that you have encountered
with the tool usage to Vickie.Carbonneau@oracle.com, Technical Advisor, Center of Expertise (CoE).
Hang/Locking > Identify the Issue > Overview
To properly identify the issue we want to resolve, we must do three things:
● Recognize a query hang or locking issue
● Clarify the details surrounding the issue
● Verify that the issue is indeed the problem.
Recognize a Hang or Locking Issue
What is a Hang or Locking issue?
A "true" database hang issue can manifest itself as:
● A database that is no longer allowing users to connect
● A database that is no longer performing work
● Select 1 from dual does not produce output
● Create table does not complete
A locking issue can manifest itself as:
● One or more sessions that have completely stopped functioning
You might have identified the hang or lock from:
● benchmarking/testing
● user complaints
● systemstate or hanganalyze showing stuck sessions
● a query appearing to hang
● session consuming a large amount of CPU
● ORA-60 errors appearing in the alert log
These problems might appear after:
● schema changes
● changes in database parameters
● changes in application
● database upgrades
Clarify the Issue
A clear problem statement is critical. You need to be clear on exactly what the problem is. It may be that
in subsequent phases of working through the issue, the real problem becomes clearer and you have to
revisit and re-clarify the issue.
To clarify the issue, you must know as much as possible of the following:
● The affected users
● The sequence of events leading up to the problem
● Where/how it was noticed
● The significance of the problem
● What IS working
● What is the expected or acceptable result
● What have you done to try to resolve the problem
As an example:
● Many sessions do not appear to be completing the requests that were made
Notes
● ODM Reference: Identify the Issue
How-To
● How to Identify Resource Intensive
SQL for Tuning
Case Studies
● Resolving High CPU usage in Oracle
Servers
● It was noticed by end users.
● New connections cannot be made to the database
● Nothing seems to be working at the moment
● Normally, the user's requests run in less than 2 seconds.
● We checked the alert log for ORA-60 errors
Why is this step mandatory?
Skipping this step will be risky because you might attack the wrong problem and waste significant time
and effort. A clear problem statement is critical to begin finding the cause and solution to the problem.
Verify the Issue
Our objective in this step of the diagnostic process is to ensure a database hang or lock is actually the
query at the root of the performance problem. At this point, you need to collect data that verifies the
existence of a problem.
To verify the existence of the issue you must collect evidence of the hang or locking.
Example:
SQL> set timing on
SQL> SELECT 1 from dual;
If this query does not return results, chances are this is a "true" database hang.
Further examples and advice on what diagnostic information will be needed to resolve the problem will be
discussed in the DATA COLLECTION section.
Once the data is collected, you will review it to either verify there is a database hang or locking issue, or
decide it is a different issue.
Why is this step mandatory?
If you skip this step, you might have identified the wrong issue and waste significant time and effort
before you realize it. For example, the issue may not actually be a database hang or lock. Instead the
issue might be due to an OS performance problem In this case, hang/locking will not help solve this
problem.
Notes
● ODM Reference: Identify the Issue
Next Step - Data Collection
When you have done the above, click "NEXT" to get some guidance on collecting data that will help to
validate that you are looking at the right problem and that will help in diagnosing the cause of the
problem.
Hang/Locking > Identify the Issue >Data Collection
In this step, we will collect data to verify the database hang is due to the database and not external to it. Note: Collect data when the database is
hung.
Gather Operating System (OS) Performance Data
OS data is needed to see the overall performance of the machine(s) where Oracle is running.
Automatically Using Tools
This section will discuss how to gather data using scripts or tools. Describe when/why you'd use the
particular tool, but use the sidebar refs to point to a doc that gives the details on how to setup and
use.
10g or higher: If you have obtained a 10gR2 or higher statspack report, you do not need to collect
detailed OS data as described below to verify CPU or memory saturation. However, the data
captured using the following tools are thorough and may improve the quality of the diagnosis in some
cases.
1. OS Watcher (OSW) (Preferred Method)
OS Watcher (OSW) is a collection of UNIX shell scripts intended to collect and archive
operating system and network metrics to aid support in diagnosing performance issues.
OSW operates as a set of background processes on the server and gathers OS data on a
regular basis, invoking such Unix utilities as vmstat, netstat and iostat.
OSW is the preferred way of gathering data on Unix-based systems because it is very
simple to install and will collect files that can be analyzed later by Oracle engineers.
Please read the
OS Watcher User's Guide for more information on setting up OSW and
collecting data. Use OSWg to graph the data for quick analysis.
2. LTOM
The Lite Onboard Monitor (LTOM) is a java program designed as a real-time diagnostic
platform for deployment to a customer site. LTOM differs from other support tools, as it
is proactive rather than reactive. LTOM provides real-time automatic problem detection
and data collection. LTOM is very well suited to transient and unpredictable performance
issues..
Please read the
LTOM - The On-Board Monitor User's Guide for more information.
3. Enterprise Manager
Enterprise Manager's performance management pages in Grid Control or DB Control
include charts that show CPU and memory performance data. To see these charts, click
on the host target name (in the "General" section) and then click on the "Performance"
page link. From here you will see charts for CPU, memory, and I/O utilization as well as
detailed process information.
Enterprise Manager is very good for real-time analysis; however, screen captures will be
required for later analysis by Oracle Support.
Manually
OS Watcher is preferred over manual methods. But if you are unable to use OS Watcher and wish to
manually collect OS data, please read
How to use OS commands to diagnose Database
Performance issues for more information.
Note:
Data should be gathered at the same time as the database hang data is gathered
How-To
● How to use OS commands to
diagnose Database Performance issues
Scripts and Tools
● OS Watcher User's Guide
● LTOM - The On-Board Monitor
User's Guide
Gather Hanganalyze and SystemState
This section will discuss how to collect diagnostic data using the HANGANALYZE and Systemstate
commands.
About Hanganalyze
HANGANALYZE uses internal kernel calls to determine if a session is waiting for a resource, and
reports the relationships between blockers and waiters. In addition, it determines which processes
are "interesting" to be dumped, and may perform automatic PROCESSSTATE dumps and
ERRORSTACKS on those processes, based on the level used while executing the HANGANALYZE.
The "HANGANALYZE" command has been available since Oracle Release 8.1.6. In Oracle9i it was
enhanced to provide "cluster wide" information in Real Application Cluster (RAC) environments on a
single shot. The meaning of this is that it will generate information for all the sessions in the cluster
regardless of the instance that issued the command.
HANGANALYZE syntax:
3 Syntax Examples:
ALTER SESSION SET EVENTS 'immediate trace name HANGANALYZE level <level>';
ORADEBUG hanganalyze <level>
ORADEBUG -g def hanganalyze <level>
(Cluster wide syntax)
The <level> sets the amount of additional information that will be extracted from the processes
found by HANGANALYZE (ERROSTACK dump) based on the "STATE" of the node.
The levels are defined as follows:
10 Dump all processes (IGN state)
5 Level 4 + Dump all processes involved in wait chains (NLEAF state)
4 Level 3 + Dump leaf nodes (blockers) in wait chains (LEAF,LEAF_NW,IGN_DMP state)
3 Level 2 + Dump only processes thought to be in a hang (IN_HANG state)
1-2 Only HANGANALYZE output, no process dump at all
It is advisable not to use levels higher than 3 due to the potentially large number of trace files that
may be produced (and could overwhelm the I/O subsystem). Since HANGANALYZE will be mostly
used to diagnose "true" hangs, a level 3 will dump the processes that are involved in the hang
condition - this usually involves fewer than 4 processes.
NOTE: In some cases hanganalyze could appear to be hung. Hanganalyze will
only return when all processes are dumped. To verify that hanganalyze is
working on non-clustered environments, monitor the size of the trace files in the
user_dump_destination. If the trace files are growing in size, hanganalyze is not
hung. To verify that hanganalyze is working on clustered environments, monitor
the size of the "diag" trace file in the background_dump_destination.
About Systemstate
A system state is a set of process states for all processes on the instance when the dump is taken.
A system state dump is useful in determining the interaction between processes. A systemstate
dump will report on resources that are being held by each process.
SYSTEMSTATE syntax:
3 Syntax Examples:
ALTER SESSION SET EVENTS 'immediate trace name SYSTEMSTATE level <level>';
ORADEBUG dump systemstate <level>
ORADEBUG -g all dump systemstate <level>
(Cluster wide syntax)
The <level> sets the amount of additional information that will be extracted from the processes
found by SYSTEMSTATE dump based on the "STATE" of the node.
The levels are defined as follows:
1 Very basic process information only
2 process + session state objects
10 Most common level - includes state object trees for all processes
Notes
● Steps to generate HANGANALYZE
trace files
● Interpreting HANGANALYZE trace
files to diagnose hanging and
performance problems
Scripts and Tools
● LTOM User Guide
● HANGFG User Guide
剩余44页未读,继续阅读
资源评论
lhdz_bj
- 粉丝: 6469
- 资源: 20
上传资源 快速赚钱
- 我的内容管理 展开
- 我的资源 快来上传第一个资源
- 我的收益 登录查看自己的收益
- 我的积分 登录查看自己的积分
- 我的C币 登录后查看C币余额
- 我的收藏
- 我的下载
- 下载帮助
安全验证
文档复制为VIP权益,开通VIP直接复制
信息提交成功