Case Study: Resolving High CPU Usage on Oracle Servers
Author: Hector Pujol, Consulting Technical Advisor, Center of Expertise, Oracle USA
Skill Level Rating for this Case Study: Intermediate
About Oracle Case Studies
Oracle Case Studies are intended as learning tools and for sharing information or
knowledge related to a complex event, process, procedure, or to a series of related events.
Each case study is written based upon the experience that the writer(s) encountered.
Each Case Study contains a skill level rating. The rating provides an indication of what
skill level the reader should have as it relates to the information in the case study. Ratings
are:
• Expert: significant experience with the subject matter
• Intermediate: some experience with the subject matter
• Beginner: little experience with the subject matter
Case Study Abstract
This case study demonstrates a top-down tuning methodology that was applied to a
performance problem involving high CPU usage on a database server. The diagnostics that
were collected and analyzed for this problem include operating system (OS) data such as
vmstat and top, statspack reports, and extended SQL trace (event 10046) data with TKProf output. The conclusion from
analyzing the collected data was that a specific query was consuming most of the CPU and
needed to be tuned. We present various ways of tuning the query as well as verification that
the recommended solution solved the problem.
Case History
A large university reported severe performance problems with their email application and
other applications that relied upon an Oracle RDBMS server for authentication services.
The Oracle database was used solely for user lookups, authentication, and other assorted
LDAP queries. It was hosted on a 2-CPU Linux server that also hosted Oracle’s
Application Server (iAS) and related daemons.
The email application was installed approximately 6 months prior and ran without incident
until some application security patches were installed. The day after the patches were
installed, the performance problems started occurring.
The DBAs noticed that this authentication server was often seeing load averages over 20
processes (as shown in uptime or top). Normally, prior to the patch, the load averages
were seldom above 1 or 2 processes. The performance seemed to be much worse when
“mass mailing” occurred. These mass-mailing jobs sent out email to thousands of recipients
and relied upon the authentication/lookup services provided by the database.
The main symptoms reported by users when there was a performance problem were that
they were unable to log in to their email accounts and they experienced timeouts in the
application (visible as browser timeouts or client error messages, depending on the mail
client used).
The database server version was 9.0.1.5; it had been upgraded from 9.0.1.4 during the same
outage period when the application security patches were installed.
Analysis
Summary
Our strategy in diagnosing this problem was to take a “top-down” approach and follow the
trail of highest CPU consumption down to the offending processes, and eventually the
offending sessions and query. The top-down approach can be summarized as follows
1
:
1. Verify CPU Consumption on the server
• Verify the observation of high CPU consumption on the affected server and collect
top and vmstat data for a period of time (from 30 minutes up to a few hours)
• Find which processes are using the most CPU
• If the top CPU consumers are Oracle processes, then find out which database
sessions correspond to these processes[2]
2. Collect Extended SQL Trace Data (Event 10046)
• Gather SID, SERIAL#, and other information from V$SESSION about these
sessions
• Trace the sessions using an extended SQL trace via event 10046, level 12
• Produce a TKProf report and look for the queries having the longest elapsed time
[1] The steps have been categorized according to the Oracle Diagnostic Method (ODM); see MetaLink Doc ID 312789.1.
[2] If the processes were not Oracle RDBMS processes, then we would have to stop and find out what kind of processes they are and how to diagnose them.
3. Build a Test Case Script for the Problem Query
• Extract the high-impact queries with their bind values and create a test script that
will allow us to benchmark changes to the query and verify our progress in tuning it.
• Run the test script on the production system (with the 10046 event set) to validate
that it reflects what was seen in step two.
4. Obtain Statspack Snapshots
Collect statspack snapshots during the same time period as the OS data was collected
and look for correlations with the session-level data collected in step two. This
could easily be done before steps 1 through 3 as well.
5. Tune the Query
• Tune the query and verify performance improvements using the test script
• Implement the changes in production and verify that the solution worked.
• Verify the performance has improved and meets the customer’s requirements
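To illustrate step 1, the mapping from top CPU-consuming OS processes to database sessions uses the standard join between V$PROCESS.SPID and V$SESSION.PADDR. Below is a small Python helper (hypothetical, not from the case study) that builds that lookup query for a list of PIDs taken from top:

```python
def session_lookup_sql(pids):
    """Build the usual Oracle query mapping OS PIDs (from top) to
    database sessions via the V$PROCESS.SPID -> V$SESSION.PADDR join."""
    spids = ", ".join(f"'{p}'" for p in pids)
    return (
        "SELECT s.sid, s.serial#, s.username, s.sql_hash_value\n"
        "FROM v$session s, v$process p\n"
        "WHERE p.addr = s.paddr\n"
        f"  AND p.spid IN ({spids})"
    )

# Example: PIDs 12345 and 12346 observed at the top of the top output.
print(session_lookup_sql([12345, 12346]))
```

The SID and SERIAL# returned by this query are exactly what step 2 needs in order to attach the extended SQL trace to the right sessions.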
In steps 1 and 2 above, one can see that the overall effort focuses attention down
to a particular session, with the ultimate goal of collecting the extended SQL trace
information. We are interested in the extended SQL trace because it reports
CPU times, wait times, elapsed times, and bind values[3]. By having the CPU, wait, and
elapsed times, we can account for how the time was spent during each query's execution.
The bind values allow us to put together a reproducible test case using actual values;
they will be extremely useful later when we tune the query.
[3] See Cary Millsap's book Optimizing Oracle Performance for an in-depth look at using event 10046.
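The time-accounting idea can be sketched in a few lines of Python. This is an illustration only; the row values and field names are invented, but the arithmetic mirrors how a TKProf profile is read: elapsed time should be roughly CPU time plus instrumented wait time, and a large remainder often means time spent queued for a CPU:

```python
# Hypothetical TKProf-style summaries: elapsed/cpu/wait in seconds.
# The names and numbers are illustrative, not from the case study.

def rank_by_elapsed(rows):
    """Sort trace summaries by elapsed time, highest first, as one does
    when scanning a TKProf report for the worst offenders."""
    return sorted(rows, key=lambda r: r["elapsed"], reverse=True)

def unaccounted(row):
    """Time not explained by CPU plus instrumented waits; a large
    remainder usually points at CPU-run-queue delay."""
    return row["elapsed"] - (row["cpu"] + row["wait"])

rows = [
    {"sql": "lookup", "elapsed": 12.4, "cpu": 11.9, "wait": 0.2},
    {"sql": "insert", "elapsed": 0.8, "cpu": 0.1, "wait": 0.6},
]
top = rank_by_elapsed(rows)[0]
print(top["sql"], round(unaccounted(top), 2))  # → lookup 0.3
```

In the invented sample, the "lookup" query dominates elapsed time and nearly all of it is CPU, which is the same shape the real trace data in this case study exhibits.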
Detailed Analysis
1. Verify CPU Consumption on the server
Before we can start looking at why this problem is occurring we must first understand the
problem and verify what the customer is reporting (its symptoms). This can be done by
collecting OS statistics during a recent spike in activity due to a mass mailing. The data
was collected using Oracle Support's OSWatcher tool[4].
The vmstat output during this time looked like this:
(Column key: r = number of processes in the run queue; us = CPU user mode; sy = CPU system mode; id = CPU idle.)
zzz ***Thu Mar 9 18:30:00 PST 2006
cpu procs memory swap io system
r b w swpd free buff cache si so bi bo in cs us sy id
3 0 0 920 8596 342908 4088164 0 0 10 514 501 1790 43 7 50
5 0 0 920 8844 343024 4089564 0 0 8 79 319 1751 29 4 67
21 0 1 920 5332 343160 4093964 0 0 18 320 390 1690 97 3 0
22 0 0 920 22864 343304 4057500 0 0 11 149 370 2563 97 3 0
25 0 3 920 9728 343404 4059732 0 0 77 412 400 2412 95 5 0
23 0 0 920 8416 343688 4044128 0 0 2 2 254 1308 97 3 0
21 0 1 920 6760 343520 4045256 0 0 56 2428 704 2382 89 10 1
24 0 0 920 14856 343812 4045968 0 0 54 66 278 2785 95 5 0
21 0 2 920 17600 343956 4048296 0 0 17 70 249 1677 96 4 0
20 0 0 920 6028 344080 4050368 0 0 8 52 265 1461 94 6 0
22 0 1 920 27172 344184 4053608 0 0 25 49 263 2513 96 4 0
14 0 2 920 58264 344364 4057036 0 0 73 125 614 3821 79 20 1
1 1 0 920 65756 344476 4070080 0 0 358 1964 410 2118 12 4 84
0 0 0 920 48628 344580 4080864 0 0 17 106 229 1190 9 2 88
Each line represents a sample collected at 1-minute intervals.
Notice how the run queue[5] rapidly increases from 3 to 25 as the CPU utilization
increases from about 50 percent to 100 percent. This shows the importance of knowing the
length of the run queue as well as the CPU utilization. When the CPU is pegged at 100%
utilization, the severity of the CPU starvation won't be reflected in the percentage of CPU
utilization (it's pegged at 100%), but the run queue will clearly show the impact. Similarly,
knowing only the run queue will not tell you the exact CPU usage, nor how the busy time
is split between system and user modes.
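The relationship between run queue and idle CPU can be checked mechanically. The following is a minimal Python sketch (our own helper, not part of OSWatcher; the saturation thresholds are assumptions to be tuned per host) that parses vmstat data lines like those above and flags samples where the server appears CPU-starved:

```python
def parse_vmstat(lines):
    """Extract run queue (r) and CPU us/sy/id from Linux vmstat data lines.

    Skips headers and timestamps; assumes the layout shown above, where
    r is the first field and us/sy/id are the last three."""
    samples = []
    for line in lines:
        fields = line.split()
        if len(fields) < 4 or not all(f.isdigit() for f in fields):
            continue  # header, blank, or OSWatcher timestamp line
        samples.append({
            "r": int(fields[0]),
            "us": int(fields[-3]),
            "sy": int(fields[-2]),
            "id": int(fields[-1]),
        })
    return samples

def cpu_starved(sample, ncpus=2, idle_pct=5):
    """Heuristic: almost no idle CPU *and* more runnable processes than CPUs."""
    return sample["id"] <= idle_pct and sample["r"] > ncpus

data = """\
zzz ***Thu Mar 9 18:30:00 PST 2006
r b w swpd free buff cache si so bi bo in cs us sy id
3 0 0 920 8596 342908 4088164 0 0 10 514 501 1790 43 7 50
21 0 1 920 5332 343160 4093964 0 0 18 320 390 1690 97 3 0
""".splitlines()

samples = parse_vmstat(data)
print([cpu_starved(s) for s in samples])  # → [False, True]
```

On the two embedded sample lines, only the second (run queue 21 at 0% idle) is flagged, which matches the visual reading of the vmstat output above.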
CPU utilization is classified into three types: system (sy), user (us), and idle (id). System
mode CPU utilization occurs whenever a process requires resources from the system; e.g.,
I/O or memory allocation calls. In typical OLTP systems, the percentage of system mode
CPU utilization is often less than 10 percent; in data warehouses, more I/O calls are
performed and a higher percentage of system mode CPU utilization is common. User mode
CPU utilization accounts for the rest of the time the CPU is busy and not running in system
mode. Idle CPU is essentially the time that the CPU is not busy and is waiting for work to do.

[4] OSWatcher may be downloaded from MetaLink by viewing Doc ID 301137.1.
[5] The run queue is a queue of processes that are ready to run but must wait for their turn on a CPU; a run queue of 20 means that 20 processes are currently waiting to execute.
In this case, we can quickly see that the percentage of CPU utilization in system mode is
usually less than 10 percent. This indicates that we are probably not suffering from memory
shortages or doing excessive I/Os[6]. We can further dismiss memory shortages by looking at
the paging and swapping statistics in vmstat (only swap-ins (si) and swap-outs (so) are
shown here). If no pages are being swapped out during peak time, then there can't possibly
be a memory shortage.
If we were seeing system-mode CPU utilization higher than 15 percent and we weren’t
seeing any memory problems, we might begin to suspect excessive I/O calls, very frequent
database connections/disconnections, or some anomaly (we would keep this in mind as we
collect more data for individual processes; e.g., using the truss command or while looking
at RDBMS metrics like logons cumulative per second).
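This triage logic can be encoded as a small helper. The thresholds below are our own rough encoding of the percentages discussed in the text, not Oracle guidance, and should be adjusted per workload:

```python
def classify_cpu(us, sy, idle, sys_threshold=15):
    """Rough triage of one CPU sample (percentages from vmstat).

    Thresholds are illustrative: >15% system mode suggests I/O calls,
    paging, or connect storms; otherwise a busy box is user-mode bound
    and the application (here, Oracle sessions) is the place to look."""
    if idle > 20:
        return "not CPU-bound"
    if sy > sys_threshold:
        return "system-mode heavy: suspect I/O calls, paging, or connect storms"
    return "user-mode heavy: look inside the application (Oracle sessions)"

# The saturated samples in the vmstat output above (e.g., us=97, sy=3, id=0)
# classify as user-mode heavy, matching the analysis in the text.
print(classify_cpu(97, 3, 0))
```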
The fact that most CPU utilization is occurring in user mode tells us that whatever is
happening is happening within the application code the process is running, not as part of a
system call or kernel action. The question now is: which processes are using the CPU?
To answer this question, we can look at the output of the top command. This command is
useful because it shows at a glance the overall CPU statistics and the processes consuming
the most CPU (ranked in descending order of CPU usage). The following shows a typical
output of the top command during the performance problem:
[6] This is just a casual observation; we would need to look at the process in detail (at least at the statspack level for the database) to conclusively decide that excessive I/O is not being performed.