Download presentation
Presentation is loading. Please wait.
1
GTSC UNIVERSITY SQL Server 2005 性能调优
2
哪些因素影响性能? 等待系统资源 错误的配置 不优化的查询 & 设计 索引问题 不好的执行计划 和客户端网络交互太多 内存,CPU,IO
硬件 & 软件 不优化的查询 & 设计 写法不好, 设计不周 索引问题 和工作量与资源配置没有关系 不好的执行计划 和客户端网络交互太多 While the goal is to maximize resources, there are many ways to fall short of reaching this goal. Subsystem resources are normally measured by PERFMON counters. Queues indicate when resources cannot keep up with requests. There are other counters that measure the performance of the subsystem such as the time it takes to complete a read or write operation. SQL Server is an application insofar as it needs to utilize system resources. If hardware is not properly configured, you may have inadequate memory or IO performance. Database and index design should complement the workload requirements e.g. queries.
3
课程安排 SQL 2005对系统资源的使用和监控方法 语句执行计划优化 执行计划预测与选择 执行计划重用与重编译 有用的性能监视指标
4
有效地搜集信息 SQL Trace Performance Monitor DMV & DMF (动态管理视图和函数)
SQL Profiler对性能的影响可能比较大 Server Side Trace Performance Monitor DMV & DMF (动态管理视图和函数) 数据库定义以及数据统计信息
5
使用SQLDiag 在SQL 2000里面的PSSDiag 缺省在SQL 2005里面就有安装 可以自动搜集 附带两个模板
SQL Trace Performance Log SQLDiag report Windows Event Logs 附带两个模板 SD_Detailed.XML SD_General.XML
6
Lab 1 使用SQLDiag监视SQL 2005的运行
7
共享的资源 数据库级 共享的资源 服务器级 共享的资源 机器级 共享的资源
数据库性能受到日志文件最大写入能力的限制, 日志的写入必须是串行的! 可以通过以下改进性能 增加更多的物理硬盘 增加数据库的数目,以增加日志的数目 服务器级 共享的资源 TEMPDB Memory (64-bit) Memory (32-bit) 32-bit AWE扩展的内存只能cache数据页面 Proc cache, locks, user connections, sorting 还是只能使用 2-3GB 的地址空间 可以通过在一台机器上安装多个instance解决 机器级 共享的资源 CPU 和网络 Shared resources can be a bottleneck if improperly used Transaction log Tempdb Data Cache turnover CPU Network Memory
8
数据库的scalability 受到日志文件最大写入能力的限制
Disk I/O 实例(Instance)的 scalability 受到进程最大资源数目的限制 Memory 服务器的 scalability受到机器能力的限制 CPU 网络带宽
9
硬盘读写性能问题 确定问题的特征 测试某个硬件配置的IO能力(不使用SQLServer) : 特殊考虑: 写瓶颈 读瓶颈 SQLIOSim
日志文件 (~100% 串行写) Lazy Writer (随机) 读瓶颈 随机 vs. 串行 测试某个硬件配置的IO能力(不使用SQLServer) : SQLIOSim 特殊考虑: 日志文件 一颗CPU一个Tempdb 数据文件 (328551) - When the tempdb database is heavily used, SQL Server may experience contention when it tries to allocate pages. From the sysprocesses system table output, the wait resource may show up as "2:1:1" (PFS Page) or "2:1:3" (SGAM Page). The first step in evaluating IO issues is to remove SQL Server from the equation. Thus, file copies, utilities such as SQLIO or IOMeter will establish whether the disk subsystem reaches manufacturer specifications.
10
硬盘读写瓶颈 I/O 瓶颈通常比较容易发现 对日志文件一定要小心 RAID5写性能问题: 使用独立的设备 使用RAID 10
Each RAID5 write = 2 READS + 2 WRITES ! 最近的测试结果是RAID5的写性能比RAID 0+1要差50% Spreading the transaction log across multiple spindles will increase performance, up to the limit of spindles RAID5 has a significant write penalty. 下面加上发现硬盘瓶颈的快速方法。
11
阻塞是由于并发的连接争抢共同的资源, 但是没有形成死锁 检测工具
阻塞问题 阻塞是由于并发的连接争抢共同的资源, 但是没有形成死锁 检测工具 SQL 2005 Profiler就可以检测出 DMVs Sp_who2 & sp_lock Snapshot Isolation - Row Versioning 读不阻塞写 Locks ensure data consistency. Writers can block other writers. Ordinarily, reads and writers block each other due to incompatible shared and exclusive locks. Profiler can be used to identify long blocks. New transaction isolation level read committed snapshot means shared locks are not taken. Thus, while the read is transactionally consistent, it does not block other writing activity. Overhead for row versioning is in tempdb 3605 sends dbcc results to errorlog 3604 sends results to the user
12
DMF sys.dm_db_index_operational_stats() 可以看出资源争抢的对象
检测阻塞 DMF sys.dm_db_index_operational_stats() 可以看出资源争抢的对象 Row locks counts (行锁申请数目) Row lock waits counts (行锁等待次数) Total wait time for blocks (总共被阻塞的时间) 可以算出发生阻塞的百分比和平均等待时间 row_lock_wait_count / row_lock_count row_lock_wait_in_ms / row_lock_wait_count DMVs can be used to identify the points of index contention. If a lock cannot be granted, a lock wait is recorded. You can figure out the frequency of blocks as well as the average block time.
13
Sp_lock & sp_who2 – lists real time blocks
检测阻塞 Sp_lock & sp_who2 – lists real time blocks Trace – for historical analysis Capture long blocks using the Trace Event “Block Process Report” Sp_configure “blocked process threshold”,15 (seconds) This stored procedure with display real time blocks including the actual blocked statement and the blocker proc or batch.
14
Lab 2 确认SQL Server内部的阻塞问题
15
Tempdb资源争抢 Tempdb 在 SS2005里使用量更大 使用方式 Tempdb 的配置更加重要
1 DBCC CHECKDB - small change Internal objects: work file (hash join, SORT_IN_TEMPDB) Internal objects: work table (cursor, spool) - small changes 4 Large object (LOB) variables 5 Service Broker 6 Temporary objects: global/local temp table, table variables 7 Temporary objects: SPs and cursors - small changes 8 Version store: General 9 Version store: MARS 10 Version store: Online index 11 Version store: Row version based isolation levels 12 Version store: Triggers 13 XML SQL Server 2005 has more potential uses for tempdb including index sorts, row versioning, online index etc.
16
Tempdb使用预测 On line 索引重建: Versioning:
2x-3x size of index – Sort size, temp index and rollback Versioning: [Size of Version Store] = 2 * [Version store data generated per minute] * [Longest running time (minutes) of your transaction] * number of concurrent transactions/users 可以通过Performance Monitor里面的counter进行监视 事先就设置好一个合适的大小,tempdb自动增长会严重影响性能 Sizing tempdb correctly is important for performance.
17
如果存储过程大量地使用Create Table and Create Index,就要考虑使用.
Tempdb – Trace Flag 1118 能够减少分配页的争抢 如果存储过程大量地使用Create Table and Create Index,就要考虑使用. Shared Global Allocation map – each SGAM records the status of mixed extents.
18
检查tempdb使用情况DMV select
sum(user_object_reserved_page_count)*8 as user_objects_kb, sum(internal_object_reserved_page_count)*8 as internal_objects_kb, sum(version_store_reserved_page_count)*8 as version_store_kb, sum(unallocated_extent_page_count)*8 as freespace_kb from sys.dm_db_file_space_usage where database_id = 2 This helps track the size of tempdb, and breaks down by function or component.
19
检查tempdb使用情况DMV SELECT t1.session_id,
(t1.internal_objects_alloc_page_count + task_alloc) as allocated, (t1.internal_objects_dealloc_page_count + task_dealloc) as deallocated , t3.sql_handle, t3.statement_start_offset , t3.statement_end_offset, t3.plan_handle from sys.dm_db_session_space_usage as t1, sys.dm_exec_requests t3, (select session_id, sum(internal_objects_alloc_page_count) as task_alloc, sum (internal_objects_dealloc_page_count) as task_dealloc from sys.dm_db_task_space_usage group by session_id) as t2 where t1.session_id = t2.session_id and t1.session_id >50 and t1.database_id = tempdb is database_id=2 and t1.session_id = t3.session_id order by allocated DESC See SEAS06PT – Tempdb usage by statement.sql Displays use of tempdb including sql_handle, statement using start_offset, end_offset, & plan_handle. To find overall use of tempdb: Select SUM (user_object_reserved_page_count)*8 as user_objects_kb, SUM (internal_object_reserved_page_count)*8 as internal_objects_kb, SUM (version_store_reserved_page_count)*8 as version_store_kb, SUM (unallocated_extent_page_count)*8 as freespace_kb From sys.dm_db_file_space_usage Where database_id = 2
20
编程时的注意事项 使用正确的,并且是尽量短的数据类型 一次提交的命令长度要合适 尽量使用RPC Call,而不是执行命令行
程序中声明的类型必须和数据库中的一致 一次提交的命令长度要合适 尽量使用RPC Call,而不是执行命令行 {call dbo.qi ('M01', 'M ')} –adds ADHOC query plans due to SQL string parsing 尽量避免使用数据库端游标 Server has two distinct and optimized code paths Goal is to utilize the correct code path! Language event Every statement not being a (stored) procedure extra parsing required to figure out what is in the string Adhoc query plans for string (in addition to Stored Proc plans) Generic code which executes procedures via a language event, for example OSQL, Query Analyzer etc. RPC event Stored procedure invocations using {call} syntax Increases performance eliminates parameter processing and statement parsing Lessons: Don’t send more bytes than necessary Batch Packet size should correlate with batch Avoid binding for large batches Use RPC {Call} syntax, not language events Trading floor 65K trades per second on 4way hyper- threaded box
21
连接在Runnable状态,完全是在等待CPU时间片执行 CPU使用
执行计划的compile和recompile Sort, Join, Aggregation 通常情况下,SQL的CPU使用量不会太大。相对于CPU, memory和disk更容易成为系统的瓶颈 Notes: High plan re-use is desirable for OLPT where there are high volumes of identical transactions. Plan re-use can be determined as: Plan Re-use = (Batch requests - Initial compiles) / batch requests Initial compiles = compiles/sec – recompiles/sec
22
编程时的注意事项 尽量避免从数据库中取出大量的数据 一次把所有的数据都从数据库里取走
业务逻辑可以用stored procedure完成 一次把所有的数据都从数据库里取走 没有取走的数据在数据库中会用active的游标的方式处理,影响并发度 如果客户端放弃取走所有数据,服务器还要清理这些数据 SET NOCOUNT ON 避免INSERT, UPDATE and DELETE 语句导致的不必要的网络传输 Completely fetching a result set is important. Otherwise SQL server is buffering a result set, that is never fetched by the application.
23
执行计划与它的生成
24
执行计划重用 引用该缓存对象的其他缓存对象数 自开始以来使用该缓存对象的次数
Master..Sys.dm_exec_cached_plans 包含存储过程和语句 引用该缓存对象的其他缓存对象数 自开始以来使用该缓存对象的次数 可以重用的执行计划 Procs, Triggers, Views Defaults, Check constraints, rules adhoc SQL, sp_executesql Query plans are divided into two pieces starting with SQL7 – Compile and execution. Plan re-use is critical for OLTP performance. Look at sys.dm_exec_cached_plans to see usecounts for plan re-use. USE master; GO SELECT * FROM sys.dm_exec_cached_plans cp CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) cross apply sys.dm_exec_sql_text(plan_handle) as q order by usecounts desc ;
25
检查计划重用情况 SQL Batch requests/sec SQL Compilations/sec
包含初始的compiles AND re-compiles 去掉re-compilations,能大致算出初始compiles的数目 在Sys.dm_exec_cached_plans里面找出usecounts最低的SQL语句 SQL Re-compilations/sec 语句一级的Recompiles Check profiler for sp:recompile event to identify SQL statement. --- SEAS06PT: Worst Plan Re-use by statement --- --- lowest 50 Usecounts by statement SELECT TOP 50 cp.cacheobjtype ,cp.usecounts ,size=cp.size_in_bytes ,stmt_start=qs.statement_start_offset ,stmt_end=qs.statement_end_offset ,qt.dbid ,qt.objectid ,qt.text ,SUBSTRING(qt.text,qs.statement_start_offset/2, (case when qs.statement_end_offset = -1 then len(convert(nvarchar(max), qt.text)) * 2 else qs.statement_end_offset end -qs.statement_start_offset)/2) as statement ,qs.sql_handle ,qs.plan_handle FROM sys.dm_exec_query_stats qs cross apply sys.dm_exec_sql_text(qs.sql_handle) as qt inner join sys.dm_exec_cached_plans as cp on qs.plan_handle=cp.plan_handle where cp.plan_handle=qs.plan_handle and qt.dbid is NULL ORDER BY [usecounts],[statement] asc
26
监视执行计划的compile和recompile
Perfmon: SQLServer:SQL Statistics Batch requests / sec { >1000’s/sec server is busy} SQL Compilations / sec {>10s/sec could be problem} SQL Recompilations / sec {OLTP should avoid high recomps} Ratio of compiles / requests is important Compiles – recompiles = initial compiles Plan re-use = (Batch requests – initial compiles) / Batch requests Recompile 的原因: 表格的定义发生变化 先前的并发计划需要串行执行 统计值更新过了 表格更新的行数超过了限度 – sys.sysindexes.rowmodctr
27
使用SQL Trace观察执行计划重用量 主要的event有: SP:Starting 标志一个stored procedure开始执行
SP:CacheMiss (event ID 34 in Profiler) SP:CacheInsert (event ID 35 in Profiler) SP:CacheRemove (event ID 36 in Profiler) SP:Recompile (event ID 37 in Profiler) SP:CacheHit (event ID 38 in Profiler) SP:Starting 标志一个stored procedure开始执行 SP:StmtStarting 标志单个语句开始执行 Example: sequence is SP:StmtStarting SP:CacheMiss (no plan found) SP:CacheInsert (plan created) 注意:使用SQL Profiler可能会影响SQL性能! 加入Eventsubclass 字段可以显示recompile的原因 Cache events indicate when plans are created, recompiled or discarded. You can also see some cache events using “Performance Statistics” event. New SQL is indicated by eventsubclass=0 New Plans are indicated by eventsubclass in (1,2)
28
Eventsubclass 字段显示recompile的原因
Description 1 Schema changed. 2 Statistics changed. 3 Deferred compile. 4 SET option changed. 5 Temporary table changed. 6 Remote rowset changed. 7 FOR BROWSE permission changed. 8 Query notification environment changed. 9 Partitioned view changed. 10 Cursor options changed. 11 OPTION (RECOMPILE) requested. Use Profiler to display SQL statements & reasons for recompilation, in the eventsubclass column.
29
Lab 3 观察执行计划重用情况
30
有用的性能监视器指标 Memory: Page faults/sec Memory: pages/sec
Physical Disk: Avg. Disk Queue Length Physical Disk: Avg. Disk sec/Transfer Physical Disk: Avg. Disk sec/Read Physical Disk: Avg. Disk sec/Write Physical Disk: Current Disk Queue Length Processor: %Processor Time SS Access Methods: Forwarded Records/sec SS Access Methods: Full Scans/sec SS Access Methods: Index Searches/sec SS Access Methods: Page Splits/sec SS Access Methods: Range Scans/sec SS Access Methods: Table Lock escalations/sec SS Buffer Manager: Checkpoint pages/sec SS Buffer Manager: Lazy writes/sec SS Buffer Manager: Page Life expectancy SS Buffer Node:Foreign Pages SS Buffer Node:Page Life expectancy SS Buffer Node:Stolen Pages SS Databases: Log Flush Wait time SS Databases: Log Flush Waits/sec SS General Statistics: User Connections SS Latches: Average Latch Wait Time(ms) SS Latches: Latch Waits/sec SS Latches: Total Latch Wait Time (ms) SS Locks: Average Wait Time(ms) SS Locks: Lock requests/sec SS Locks: Lock Wait Time (ms) SS Locks: Lock Waits/sec SS Memory Manager: Memory grants pending SS SQL Statistics: Auto-Params attempts/sec SS SQL Statistics: Batch requests/sec SS SQL Statistics: Safe Auto-Params/sec SS SQL Statistics: SQL Compilations/sec SS SQL Statistics: SQL Re-Compilations/sec System: Processor Queue Length
31
OLTP数据库设计指标 1 2 3 4 Description Source DB Design Problem Description
Resource Issue Rule Description Source Problem Description DB Design 1 经常有多个表格join的语句出现 Sys.dm_exec_sql_text Sys.dm_exec_cached_plans 表格join次数太多,会影响OLTP系统的性能 2 经常更新的表格上有很多index Sys.indexes sys.dm_db_operational_index_stats Index数目多,可以提高查询速度,但是会降低修改速度,增加阻塞和死锁发生的机率 3 作了太多的硬盘读写 range scans table scans Perfmon object SQL Server Access Methods Sys.dm_exec_query_stats Missing index, flushes cache 4 没有用的index * Sys.dm_db_index_usage_stats Index maintenance for unused indexes While the values can be debated, there are a good starting point to indicate performance issues. OLTP should not have big IOs. select r.*, t.dbid, t.objectid, t.encrypted, substring(t.text, statement_start_offset / 2, (case when statement_end_offset = -1 then datalength(t.text) else statement_end_offset end - statement_start_offset) / 2) as query_text, p.* from sys.dm_exec_requests r cross apply sys.dm_exec_sql_text(sql_handle) as t cross apply sys.dm_exec_query_plan(plan_handle) as p Top 50 IO contributer. select top 50 (total_logical_reads/execution_count) as avg_logical_reads, (total_logical_writes/execution_count) as avg_logical_writes, (total_physical_reads/execution_count) as avg_phys_reads, Execution_count, statement_start_offset as stmt_start_offset, statement_end_offset as stmt_end_offset, sql_handle, substring(sql_text.text, (statement_start_offset/2), case when (statement_end_offset -statement_start_offset)/2 <=0 then 64000 else (statement_end_offset -statement_start_offset)/2 end) as exec_statement, sql_text.text, plan_handle, plan_text.* from sys.dm_exec_query_stats cross apply sys.dm_exec_sql_text(sql_handle) as sql_text cross apply sys.dm_exec_query_plan(plan_handle) as plan_text order by (total_logical_reads + total_logical_writes) /Execution_count Desc --(total_logical_reads + total_logical_writes) Desc 1. The following DMV query can be used to get useful information about the index usage for all objects in all databases. select object_id, index_id, user_seeks, user_scans, user_lookups from sys.dm_db_index_usage_stats order by object_id, index_id _______________________________________________________ 2. All indexes which haven't been used yet can be retrieved with the following statement: select object_name(object_id), i.name from sys.indexes i where i.index_id NOT IN (select s.index_id from sys.dm_db_index_usage_stats s where s.object_id=i.object_id and i.index_id=s.index_id and database_id = <dbid> ) order by object_name(object_id) asc
32
检查系统是否有IO瓶颈 1 Perfmon object Physical Disk 2 3 >1 4 Description
Resource Issue Rule Description Value Source Problem Description IO 1 Avg Disk seconds / read > 10 ms Perfmon object Physical Disk Reads should take 4-8ms with NO IO pressure 2 Avg Disk seconds / write Writes (sequential) can be as fast as 1ms for transaction log. 3 Big IOs range scans table scans >1 SQL Server Access Methods Missing index, flushes cache 4 If Top 2 values for Wait stats includes: ASYNCH_IO_COMPLETION IO_COMPLETION LOGMGR WRITELOG PAGEIOLATCH_x Top 2 Sys.dm_os_wait_stats If top 2 wait_stats values include IO, there is an IO bottleneck While the values can be debated, there are a good starting point to indicate performance issues. OLTP should not have big IOs. select * from Sys.dm_os_wait_stats order by wait_time_ms
33
判断系统是否有阻塞发生 1 Block percentage >2% 2 Block process report 3 4
Resource Issue Rule Description Value Source Problem Description Blocking 1 Block percentage >2% Sys.dm_db_index_operational_stats Frequency of blocks 2 Block process report 30 sec Sp_configure “blocked process threshold” profiler “blocked process report” Report of long blocks e.g. statements 3 Avg Row Lock Waits > 100ms Duration of blocks 4 If Top 2 values for wait stats are any of the following: 1. LCK_x Top 2 Sys.dm_os_wait_stats If top 2 wait_stats values include locking, there is a blocking bottleneck OLTP transactions should be short. However, in a high concurrency environment, blocking can be significant. Issues are writers will block writers, readers and writers block each other. SELECT * FROM sys.dm_db_index_operational_stats(NULL, NULL, NULL, NULL)
34
是否有CPU瓶颈 1 2 Plan Re-use Perfmon object SQL Server Statistics 3
Resource Issue Rule Description Value Source Problem Description CPU 1 SUSPENDED state > 25% sys.dm_os_workers Time in runnable queue is pure CPU wait. 2 Plan Re-use < 90% Perfmon object SQL Server Statistics OLTP identical transactions should ideally have >95% plan re-use 3 Parallelism: CXPACKET waits > 5% Sys.dm_os_wait_stats Parallelism reduces OLTP throughput Signal waits are the time spent waiting for CPU. SELECT t1.session_id, CONVERT(varchar(10), t1.status) AS status, CONVERT(varchar(15), t1.command) AS command, CONVERT(varchar(10), t2.state) AS worker_state, w_suspended = CASE t2.wait_started_ms_ticks WHEN 0 THEN 0 ELSE t3.ms_ticks - t2.wait_started_ms_ticks END, w_runnable = CASE t2.wait_resumed_ms_ticks WHEN 0 THEN 0 ELSE t3.ms_ticks - t2.wait_resumed_ms_ticks END FROM sys.dm_exec_requests AS t1 INNER JOIN sys.dm_os_workers AS t2 ON t2.task_address = t1.task_address CROSS JOIN sys.dm_os_sys_info AS t3 WHERE t1.scheduler_id IS NOT NULL; In the output, when w_runnable and w_suspended are equal, this represents the time that the worker is in the SUSPENDED state. Otherwise, w_runnable represents the time that is spent by the worker in the RUNNABLE state. Plan re-use avoids CPU usage in plan creation. Parallelism should be avoided in OLTP.
35
是否有内存瓶颈 Average Page Life Expectancy 1 2 3 SQL Server Memory Manager
Resource Issue Rule Description Value Source Problem Description Memory 1 Average Page Life Expectancy < 300 sec Perfmon object SQL Server Buffer Mgr SQL Server Buffer Nodes Cache flush,due to big read Possible missing index 2 Drops by 50% 3 Memory Grants Pending > 1 SQL Server Memory Manager Current number of processes waiting for a workspace memory grant Cache turnover is the key. Normally you don’t expect much cache turnover in OLTP because the transactions are small. If you have big drops in Page Life, this indicates a big scan.
36
Lab 4 分析SQLDiag搜集的日志文件
37
Q&A
38
GTSC UNIVERSTIY Thank You!
Similar presentations