Wednesday, January 21, 2009

Monitoring Open and Cached Cursors

http://www.orafaq.com/node/758

Submitted by Natalka Roshak on Thu, 2005-12-01 23:55

RDBMS Server

Just about every DBA has had to deal with ora-1000 errors, "Maximum open cursors exceeded." This article will discuss initialization parameters that affect open cursors, the difference between open and cached cursors, closing cursors, and monitoring open and cached cursors.

Open cursors

Open cursors take up space in the shared pool, in the library cache. To keep a renegade session from filling up the library cache, or clogging the CPU with millions of parse requests, we set the parameter OPEN_CURSORS.

OPEN_CURSORS sets the maximum number of cursors each session can have open at one time. For example, if OPEN_CURSORS is set to 1000, then each session can have up to 1000 cursors open at once. If a session already has OPEN_CURSORS cursors open, it will get an ora-1000 error when it tries to open one more.

The default value for OPEN_CURSORS is 50, but Oracle recommends that you set this to at least 500 for most applications. Some applications may need more, e.g. web applications that have dozens to hundreds of users sharing a pool of sessions. Tom Kyte recommends setting it around 1000.

Session cached cursors

There are two main initialization parameters that affect cursors, and many folks get them confused. One is OPEN_CURSORS, and the other is SESSION_CACHED_CURSORS.

SESSION_CACHED_CURSORS sets the number of cached closed cursors each session can have. You can set SESSION_CACHED_CURSORS to higher than OPEN_CURSORS, lower than OPEN_CURSORS, or anywhere in between. This parameter has no effect on ora-1000's or on the number of cursors a session will have open. Conversely, OPEN_CURSORS has no effect on the number of cursors cached. There's no relationship between the two parameters.

If SESSION_CACHED_CURSORS is not set, it defaults to 0 and no cursors will be cached for your session. (Your cursors will still be cached in the shared pool, but your session will have to find them there.) If it is set, then when a parse request is issued, Oracle checks the library cache to see whether more than 3 parse requests have been issued for that statement. If so, Oracle moves the session cursor associated with that statement into the session cursor cache. Subsequent parse requests for that statement by the same session are then filled from the session cursor cache, thus avoiding even a soft parse. (Technically, a parse can't be completely avoided; a "softer" soft parse is done that's faster and requires less CPU.)

In the session cursor cache, Oracle manages the cached cursors using an LRU list. Once more than SESSION_CACHED_CURSORS closed cursors are cached, Oracle starts dropping cached cursors off the LRU end of the list whenever it needs to make room to cache a new cursor.
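The LRU behaviour described above can be pictured with a small toy model. This is only an illustrative sketch in Python, not Oracle's implementation: the class name `SessionCursorCache` and its methods are hypothetical, and the "more than 3 parse requests" admission rule is left out for brevity.

```python
from collections import OrderedDict


class SessionCursorCache:
    """Toy model of a per-session cache of closed cursors with LRU eviction."""

    def __init__(self, session_cached_cursors):
        self.capacity = session_cached_cursors
        self.cursors = OrderedDict()  # least recently used first

    def lookup(self, sql_text):
        """Return True on a cache hit and mark the cursor as most recently used."""
        if sql_text in self.cursors:
            self.cursors.move_to_end(sql_text)
            return True
        return False

    def cache(self, sql_text):
        """Cache a closed cursor, evicting from the LRU end when over capacity."""
        if self.capacity <= 0:
            return  # a setting of 0 means nothing is cached for the session
        self.cursors[sql_text] = True
        self.cursors.move_to_end(sql_text)
        while len(self.cursors) > self.capacity:
            self.cursors.popitem(last=False)  # drop the least recently used


cache = SessionCursorCache(session_cached_cursors=2)
cache.cache("select * from emp")
cache.cache("select * from dept")
cache.lookup("select * from emp")        # emp becomes most recently used
cache.cache("select sysdate from dual")  # cache is full, so dept is evicted
print(list(cache.cursors))
# ['select * from emp', 'select sysdate from dual']
```

Raising the capacity in this model only helps if the same statements keep coming back, which matches the tuning advice later in the article.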

Why cache cursors?

The obvious advantage to caching cursors by session is reduced parse times, which leads to faster overall execution times. This is especially so for applications like Oracle Forms applications, where switching from one form to another will close all the session cursors opened for the first form. Switching back then opens identical cursors. So caching cursors by session really cuts down on reparsing.

There's another advantage, though. Since a session doesn't have to go looking in the library cache for previously parsed SQL, caching cursors by session results in less use of the library cache and shared pool latches. These are often points of contention for busy OLTP systems. Cutting down on latch use cuts down on latch waits, providing not only an increase in speed but an increase in scalability.

Monitoring open cursors

I believe a lot of the confusion about open cursors vs. cached cursors comes from the names of the Oracle dynamic performance views used to monitor them. v$open_cursor shows cached cursors, not currently open cursors, by session. If you're wondering how many cursors a session has open, don't look in v$open_cursor. It shows the cursors in the session cursor cache for each session, not cursors that are actually open.

To monitor open cursors, query v$sesstat where name='opened cursors current'. This will give the number of currently opened cursors, by session:

--total cursors open, by session
select a.value, s.username, s.sid, s.serial#
from v$sesstat a, v$statname b, v$session s
where a.statistic# = b.statistic#  and s.sid=a.sid
and b.name = 'opened cursors current';

If you're running several N-tiered applications with multiple webservers, you may find it useful to monitor open cursors by username and machine:

--total cursors open, by username & machine
select sum(a.value) total_cur, avg(a.value) avg_cur, max(a.value) max_cur,
s.username, s.machine
from v$sesstat a, v$statname b, v$session s
where a.statistic# = b.statistic#  and s.sid=a.sid
and b.name = 'opened cursors current'
group by s.username, s.machine
order by 1 desc;

Tuning OPEN_CURSORS

The best advice for tuning OPEN_CURSORS is not to tune it. Set it high enough that you won't have to worry about it. If your sessions are running close to the limit you've set for OPEN_CURSORS, raise it. Your goal in tuning this parameter is to set it high enough that you never get an ora-1000 during normal operations.

If you set OPEN_CURSORS to a high value, this doesn't mean that every session will have that number of cursors open. Cursors are opened on an as-needed basis. And if one of your applications has a cursor leak, it will eventually show up even with OPEN_CURSORS set high.

To see if you've set OPEN_CURSORS high enough, monitor v$sesstat for the maximum opened cursors current. If your sessions are running close to the limit, up the value of OPEN_CURSORS.

SQL> select max(a.value) as highest_open_cur, p.value as max_open_cur
 2> from v$sesstat a, v$statname b, v$parameter p
 3> where a.statistic# = b.statistic#
 4> and b.name = 'opened cursors current'
 5> and p.name= 'open_cursors'
 6> group by p.value;

HIGHEST_OPEN_CUR MAX_OPEN_CUR
---------------- ------------
           1953         2500

After you've increased the value of OPEN_CURSORS, keep an eye on v$sesstat to see if opened cursors current keeps increasing for any of your sessions. If you have an application session whose opened cursors current always increases to catch up with OPEN_CURSORS, then you've likely got a cursor leak in your application code: your application is opening cursors and not closing them when it's done.

There is nothing you, as a DBA, can do to fix a cursor leak. The application developers need to go through the code, find the cursors that are being left open, and close them. As a stopgap, the most you can do is raise OPEN_CURSORS very high and schedule times when all the application sessions will be closed and reopened (e.g. by kicking the webserver).

How not to tell if you're closing all your cursors

Frustratingly for developers, the session statistic 'opened cursors current' can include some cursors that the application has closed. When application code calls for a cursor to be closed, Oracle actually marks the cursor as "closeable". The cursor may not actually be closed until Oracle needs the space for another cursor.

So it's not possible to test whether a complex application is closing all its cursors by starting a session, running a test, and then checking whether 'opened cursors current' has gone down to 1. Even if the application is closing all its cursors properly, 'opened cursors current' may report that some "closeable" cursors are still open.

One way for application developers to tell if an application is closing all its cursors is to do a single test run, on a dedicated development box, while monitoring "opened cursors cumulative" in v$sesstat for the session that's running the test. Then set OPEN_CURSORS to a value a little bit higher than the peak cursors open during your test run, start a new session, and run through multiple iterations of the same test run. If your application still has a cursor leak, you will see the value of 'opened cursors current' going up, and you may hit an ORA-1000 after a reasonable number of iterations. (Don't set OPEN_CURSORS too low or it may be used up by recursive SQL; if your single test run opens very few cursors, consider making your test run longer rather than setting OPEN_CURSORS unreasonably low.)

Monitoring the session cursor cache

v$sesstat also provides a statistic to monitor the number of cursors each session has in its session cursor cache.

--session cached cursors, by session
select a.value, s.username, s.sid, s.serial#
from v$sesstat a, v$statname b, v$session s
where a.statistic# = b.statistic#  and s.sid=a.sid
and b.name = 'session cursor cache count' ;

You can also see directly what is in the session cursor cache by querying v$open_cursor. v$open_cursor lists session cached cursors by SID, and includes the first few characters of the statement and the sql_id, so you can actually tell what the cursors are for.

select c.user_name, c.sid, sql.sql_text
from v$open_cursor c, v$sql sql
where c.sql_id=sql.sql_id  -- for 9i and earlier use: c.address=sql.address
and c.sid=&sid
;

Tuning SESSION_CACHED_CURSORS

If you choose to use SESSION_CACHED_CURSORS to help out an application that is continually closing and reopening cursors, you can monitor its effectiveness via two more statistics in v$sesstat. The statistic "session cursor cache hits" reflects the number of times that a statement the session sent for parsing was found in the session cursor cache, meaning it didn't have to be reparsed and your session didn't have to search through the library cache for it. You can compare this to the statistic "parse count (total)"; subtract "session cursor cache hits" from "parse count (total)" to see the number of parses that actually occurred.

SQL> select cach.value cache_hits, prs.value all_parses,
 2> prs.value-cach.value sess_cur_cache_not_used
 3> from v$sesstat cach, v$sesstat prs, v$statname nm1, v$statname nm2
 4> where cach.statistic# = nm1.statistic# 
 5> and nm1.name = 'session cursor cache hits'
 6> and prs.statistic#=nm2.statistic#
 7> and nm2.name= 'parse count (total)'
 8> and cach.sid= &sid and prs.sid= cach.sid ;

Enter value for sid: 947
old   8: and cach.sid= &sid and prs.sid= cach.sid
new   8: and cach.sid= 947 and prs.sid= cach.sid

CACHE_HITS ALL_PARSES SESS_CUR_CACHE_NOT_USED
---------- ---------- -----------------------
      106        210                     104

Monitor this in conjunction with the session cursor cache count.

--session cached cursors, for a given SID, compared to max
select a.value curr_cached, p.value max_cached, s.username, s.sid, s.serial#
from v$sesstat a, v$statname b, v$session s, v$parameter2 p
where a.statistic# = b.statistic#  and s.sid=a.sid and a.sid=&sid
and p.name='session_cached_cursors'
and b.name = 'session cursor cache count' ;

If the session cursor cache count is maxed out, "session cursor cache hits" is low compared to all parses, and you suspect that the application is re-submitting the same queries for parsing repeatedly, then increasing SESSION_CACHED_CURSORS may help with latch contention and give a slight boost to performance. Note that if your application is not resubmitting the same queries for parsing repeatedly, then "session cursor cache hits" will be low and the session cursor cache count may be maxed out, but caching cursors by session won't help at all. For example, if your application is using a lot of unsharable SQL, raising this parameter won't get you anything.

Conclusion

We've covered the difference between open cursors and session cached cursors, their initialization parameters, and how to monitor and tune them.

About the author

Natalka Roshak is a senior Oracle and Sybase database administrator, analyst, and architect. She is based in Kingston, Ontario. More of her scripts and tips can be found in her online DBA toolkit at http://toolkit.rdbms-insight.com/.


Thursday, January 15, 2009

The "Too many open files" problem returns

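In the earlier post "Linux's 1024 file descriptor limit" we raised the handle limit with ulimit -HSn 65536, but the "Too many open files" problem has recently come back. Running lsof -p $java_pid|wc -l showed that the errors start appearing in bulk once the count reaches about 1200. A fairly detailed article found online helped me understand the cause much more thoroughly.

The "Too many open files" problem shows up in two situations:
The first is during searching, usually because the index was moved after it was built. If the error does not occur while building the index, it generally does not occur while searching either. If it does, there are two remedies. One is to adjust the merge factor and the minimum merge factor and optimize the index with IndexWriter.Optimize(), which brings the number of index files back under the file system limit. The other is to raise the operating system's open-file limit, as follows:
1. Configure the system for the maximum number of open files you need, and confirm that the maximum has been set correctly by checking /proc/sys/fs/file-max.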

# cat /proc/sys/fs/file-max
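If the value is too small, set the variable in /etc/sysctl.conf to a suitable value so that it takes effect after every reboot. If the value is already large enough, skip the next step.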

# echo 2048 > /proc/sys/fs/file-max

Edit /etc/sysctl.conf and insert the following line.

fs.file-max = 8192
2. Set the maximum number of open files in /etc/security/limits.conf by adding the following line:

* - nofile 8192

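This line sets the default open-file limit for every user to 8192. Note that the "nofile" item has two possible limits, hard and soft. Both must be set for the new maximum to take effect. Using the "-" character sets the hard and soft limits at the same time.

The hard limit is the maximum value the soft limit can be set to. The soft limit is the value currently in effect. An ordinary user can lower the hard limit but cannot raise it. The soft limit cannot be set higher than the hard limit. Only root can raise the hard limit.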
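The soft/hard relationship can be checked from inside a process. The following is a minimal Python sketch using the standard `resource` module (Unix only); the helper name `raise_soft_nofile` and the target value 8192 are chosen here for illustration and are not part of the original post.

```python
import resource


def raise_soft_nofile(target):
    """Raise the soft open-file limit toward target, never above the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft, hard  # already unlimited, nothing to raise
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)  # an unprivileged process cannot exceed hard
    if target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


before = resource.getrlimit(resource.RLIMIT_NOFILE)
after = raise_soft_nofile(8192)
print("before (soft, hard):", before)
print("after  (soft, hard):", after)
```

The hard limit is left untouched: raising it requires root, which is why the nofile entry in /etc/security/limits.conf is needed for ordinary users.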

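When raising the file limit, a simple rule is to double the current value. For example, to raise the default of 1024, go to 2048; if that is still not enough, go to 4096.
The second situation is during index creation, and it also has two possible causes. One is a merge factor that is too small, so that the number of files created exceeds the operating system limit; here you can either change the merge factor or raise the operating system's open-file limit. The other is that the merge factor is constrained by JVM memory and cannot be raised any further while the number of docs to index is very large; in that case the only option is to raise the operating system's open-file limit.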

On top of this, I also modified the following configuration file:

vi /etc/sysctl.conf

Add:

# Decrease the time default value for tcp_fin_timeout connection
net.ipv4.tcp_fin_timeout = 30
# Decrease the time default value for tcp_keepalive_time connection
net.ipv4.tcp_keepalive_time = 1800
# Turn off tcp_window_scaling
net.ipv4.tcp_window_scaling = 0
# Turn off the tcp_sack
net.ipv4.tcp_sack = 0
# Turn off tcp_timestamps
net.ipv4.tcp_timestamps = 0

Then run service network restart. These are all optimizations related to TCP sockets.

In addition, add the following to /etc/rc.d/rc.local so that the settings take effect on reboot.

echo "30">/proc/sys/net/ipv4/tcp_fin_timeout
echo "1800">/proc/sys/net/ipv4/tcp_keepalive_time
echo "0">/proc/sys/net/ipv4/tcp_window_scaling
echo "0">/proc/sys/net/ipv4/tcp_sack
echo "0">/proc/sys/net/ipv4/tcp_timestamps

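Because not every program runs as root, Linux distinguishes between hard and soft open-file limits. An ordinary user is bound by the hard limit: no matter how high a value is passed to ulimit -n, it cannot exceed the nofile value in /etc/security/limits.conf.

After these changes, lsof -p $java_pid|wc -l can climb past 4000 without any "too many open files" errors.

However, as the detailed article referenced below makes clear, this treats the symptom rather than the cause. What matters most is finding which Java code fails to close its file descriptors, or why it is being requested so heavily.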

http://www.ftponline.com/weblogicpro/2005_01/magazine/columns/troubleshootingdiary/

Wednesday, January 14, 2009

Speeding Up Websites: Cache Is King

From: http://blog.sina.com.cn/iyangjian

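I. Cache is king
II. Basic principles of caching
III. My three refresh levels
IV. A small innovation of mine on the HTTP protocol (?maxage=6000000)
V. Comments on YSlow's 14 rules for website performance
VI. Going live != Finished
VII. Summary of ways to gain speed while saving cost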
-----------------------------------------------------------------------------------------

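I. Cache is king

I believe system architecture should not just mean building a tough back end that can withstand huge concurrent load. Front-end pages need architecture too, and it matters just as much; a back-end engineer who does not understand the front end is not qualified. The Chinese tradition values combining hardness with softness: a tough back end only shows that your fundamentals are deep, whereas a cleverly used front end moves a thousand pounds with four ounces.

Back-end engineers rarely care how the front end uses their resources, and front-end engineers do not realize how much a simple usage pattern can affect the back end. I will give some numbers that should open your eyes.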

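II. Basic principles of caching (see the Caching Tutorial)

Why use a cache?
1. Reduce latency, making your site faster and improving the user experience.
2. Avoid network congestion by reducing request volume and outbound bandwidth.
One more caching principle: a resource that has not changed should not trigger another HTTP request; if a request is forced anyway, see whether it can be answered with a 304.

Types of cache
Browser cache, proxy cache, gateway cache.
On the back end there are also disk cache, server cache, and PHP cache, but those are outside today's discussion.

How does a cache work?
1. If the response headers tell the cache not to store the response, the cache does not store it.
2. If the request is authenticated or secure, the response is not cached.
3. If the response has no validator such as an ETag or Last-Modified header, and no explicit freshness information, it is considered uncacheable.
4. A cached item is considered fresh (that is, it can be sent to the client without checking with the origin server) in the following cases:
It has an expiry time or another age-controlling response header set, and has not yet expired.
The browser cache holds the item and is configured to check it only once per session.
A proxy cache holds the item and it was last modified a relatively long time ago.
In these cases the data is considered fresh and is served straight from the cache without querying the origin server.
5. If an item is stale, the origin server is asked to validate it, that is, to tell the cache whether its copy is still usable.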

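How do you control your cache?
Meta tags: specified in the HTML page. Only a few browsers honor this method, and proxies generally do not read the contents of your HTML before making caching decisions.

Pragma: no-cache: commonly misused in HTTP response headers, where it has no effect. It should only be used in request headers. That said, Google's server (Server: GFE/1.3) uses it in responses, so perhaps they misuse it too.

Date: the current GMT time on the host.

Last-Modified: the GMT time the file was last updated. When I include this in the response headers, the browser usually adds If-Modified-Since to requests it sends within the cache period, which lets us decide whether to retransmit the file or just return a 304 telling the browser that the resource has not changed. Any server that needs a caching policy must support this. With this mechanism, HEAD requests are of little use any more, except perhaps when debugging over telnet.

If-Modified-Since: used in request headers; see Last-Modified.

ETag: identifies whether a resource has changed. Algorithms for generating ETags vary; a common one hashes the file's inode, size, and last-modified time, and you can choose whatever suits your application. Last-Modified only has one-second resolution, so if several updates happen within one second, ETag comes in handy. Few people seem to need that precision, and it costs bytes in the HTTP header, so I suggest not using it.

Expires: the absolute GMT time at which the cached copy expires; this has existed since HTTP 1.0. It has some drawbacks. First, it causes problems when the server and browser clocks disagree. Second, if you forget to set a new expiry time after it lapses, the cache stops working. Third, the server has to compute the value from the current Date plus the relative cache lifetime, which costs CPU. I do not recommend it.

Cache-Control:
Introduced in HTTP 1.1 to address the shortcomings of Expires; very few browsers lack HTTP 1.1 support today.
max-age: the relative cache lifetime in seconds. With max-age=0 or a negative value, the browser sets Expires for the cached entry to 1970-01-01 08:00:00. Although its semantics are not entirely transparent, it is the directive I recommend most.
s-maxage: like max-age, but applies only to shared caches such as proxies.
public: responses that require HTTP authentication are normally not cacheable; adding public makes them cacheable.
no-cache: forces the browser to submit an HTTP request to the origin server for validation before using a cached copy. This is very useful for authentication and is honored fairly reliably (consider combining it with public). It is also useful for keeping a resource always up to date without losing all the benefits of caching: a local copy exists but is validated before each use, so the number of HTTP requests does not drop, but a response body may be saved.
no-store: tells the browser never to cache the response and to keep no local copy.
must-revalidate: forces the browser to strictly obey the caching rules you set.
proxy-revalidate: forces proxies to strictly obey the caching rules you set.
Example: Cache-Control: max-age=3600, must-revalidate

A few other things to note when using caches: do not use POST and do not use SSL, because they cannot be cached, and keep URLs consistent. Use cookies only where necessary, usually on dynamic pages, because cookies are hard to cache. How Apache supports caching and how to set cache headers with PHP's header function are not covered here; plenty of material is available online.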

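How do you choose a sensible cache lifetime?
http://image2.sinajs.cn/newchart/min/n/sz000609.gif?1230015976759
Take our intraday chart as an example. The update frequency we need is one minute. But to get the latest resource every time, we appended a random number that changes on every refresh, even within the same second. Our JavaScript keeps this well under control and requests only once a minute, but what if the user clicks the refresh button? Such calls bypass the cache completely, with no chance even of returning a 304.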

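Imagine many people going out through the same proxy: every request passes straight through the proxy, and with bad luck the network administrator blocks us. If we cache for just one second, users who hit the origin server directly see little difference, but for the proxy the request volume may drop from 10000 req/min to 60 req/min, a reduction of roughly 167 times.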

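For a case like our quote charts, with a refresh frequency of one minute, a better approach is to replace the trailing random number (num) with num = t - t%60, where t is the current timestamp in seconds. The URL then stays the same however often you refresh within a minute, and in the next minute num advances by 60, producing a new request. My max-age defaults to 59 seconds, though even 120 seconds would make little difference. You may object that near the minute boundary the user might miss the latest data. In fact, from the user's point of view the ever-changing random number and my minute-level number look the same. Let me walk through it. When the user opens our intraday page, the current number is new to them, so they get the latest image. If they then click refresh, an HTTP request is sent even though the URL has not changed; if the server has a newer image it returns it, otherwise it returns 304. A minute later the JavaScript refreshes the image, the number has moved on to the next minute, and a completely new resource is fetched. Is that any different from the ever-changing random number? The user gets the latest data either way, but we also gain the benefits of caching and take a lot of load off the back end.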
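The num = t - t%60 rule can be sketched in a few lines of Python. The function names `minute_bucket` and `chart_url` are hypothetical, and the timestamp is assumed to be in whole seconds (the sample URL above carries milliseconds).

```python
import time


def minute_bucket(t, period=60):
    """Round a Unix timestamp (seconds) down to the start of its period."""
    return t - t % period


def chart_url(base, t=None, period=60):
    """Build a URL whose query string changes only once per period."""
    if t is None:
        t = int(time.time())
    return f"{base}?{minute_bucket(int(t), period)}"


base = "http://image2.sinajs.cn/newchart/min/n/sz000609.gif"
print(chart_url(base, 1230015976))
# http://image2.sinajs.cn/newchart/min/n/sz000609.gif?1230015960
```

Every request made within the same minute produces the same URL, so browsers and proxies can answer from cache or revalidate with a 304; the URL changes only when the minute rolls over.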

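III. My three refresh levels

Terminology: a brand-new request is one where the URL has changed, so the browser treats it as a new resource (the request carries no If-Modified-Since).

1. Enter http://sports.sinajs.cn/today.js?maxage=11 in the address bar and press Enter. Repeat n times: no request is sent until the 11-second cache lifetime has passed, and that request is brand new, without If-Modified-Since.

2. Press F5 to refresh. After a brand-new request, pressing F5 produces a request with If-Modified-Since. If it returns 304, no further requests are sent until the cache lifetime set by the first request expires, after which a brand-new request is sent.

3. Ctrl+F5 always sends a brand-new request.

The following demonstrates refreshing with F5: http://sports.sinajs.cn/today.js?maxage=11
(If this value exceeds the browser's maximum cache time, the browser's maximum applies.)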

----------------------------------------------------------A brand-new request is sent
GET /today.js?maxage=11 HTTP/1.1
Host: sports.sinajs.cn
Connection: keep-alive

HTTP/1.x 200 OK
Server: Cloudia
Last-Modified: Mon, 24 Nov 2008 11:03:02 GMT
Cache-Control: max-age=11 (the browser caches the page content and sets the cache expiry to the current time + 11 seconds)
Content-Length: 312
Connection: Keep-Alive
---------------------------------------------------------- Press F5 to refresh
GET /today.js?maxage=11 HTTP/1.1
Host: sports.sinajs.cn
Connection: keep-alive
If-Modified-Since: Mon, 24 Nov 2008 11:03:02 GMT (on F5, If-Modified-Since carries the Last-Modified time the server sent last time)
Cache-Control: max-age=0

HTTP/1.x 304 Not Modified
Server: Cloudia
Connection: Keep-Alive
Cache-Control: max-age=11 (this max-age is somewhat redundant; on Not Modified the browser uses its local cached data but does not reset the local expiry time)
----------------------------------------------------------
Keep pressing F5 n more times...

No HTTP requests are sent during these 11 seconds, until the 11 seconds have passed...
----------------------------------------------------------Press F5 to refresh
GET /today.js?maxage=11 HTTP/1.1
Host: sports.sinajs.cn
Connection: keep-alive (after the cache expires, a brand-new request is sent, without If-Modified-Since)

HTTP/1.x 200 OK
Server: Cloudia
Last-Modified: Mon, 24 Nov 2008 11:03:02 GMT
Cache-Control: max-age=11
Content-Encoding: deflate
Content-Length: 312
Connection: Keep-Alive
Content-Type: application/x-javascript
----------------------------------------------------------Press F5 to refresh
GET /today.js?maxage=11 HTTP/1.1
Host: sports.sinajs.cn
Connection: keep-alive
If-Modified-Since: Mon, 24 Nov 2008 11:03:02 GMT
Cache-Control: max-age=0

HTTP/1.x 304 Not Modified
Server: Cloudia
Connection: Keep-Alive
Cache-Control: max-age=11
----------------------------------------------------------
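The server side of the exchange traced above (a 200 with Last-Modified, then a 304 when If-Modified-Since matches) can be sketched as a pure function. This is an illustrative Python sketch, not the author's server: the function name `respond` is hypothetical, and only the headers relevant to caching are produced.

```python
from email.utils import formatdate, parsedate_to_datetime


def respond(last_modified_ts, max_age, if_modified_since=None):
    """Return (status, headers) for a GET, honoring If-Modified-Since."""
    headers = {"Cache-Control": f"max-age={max_age}"}
    if if_modified_since is not None:
        try:
            since_ts = parsedate_to_datetime(if_modified_since).timestamp()
        except (TypeError, ValueError):
            since_ts = None  # unparseable date: fall through to a full response
        if since_ts is not None and int(last_modified_ts) <= since_ts:
            return 304, headers  # the client's copy is current, send no body
    headers["Last-Modified"] = formatdate(last_modified_ts, usegmt=True)
    return 200, headers


LAST_MODIFIED = "Mon, 24 Nov 2008 11:03:02 GMT"
lm_ts = parsedate_to_datetime(LAST_MODIFIED).timestamp()

print(respond(lm_ts, 11))                                   # first request
print(respond(lm_ts, 11, if_modified_since=LAST_MODIFIED))  # F5 refresh
```

The first call corresponds to the brand-new request in the trace and the second to the F5 refresh that carries If-Modified-Since.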

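IV. A small innovation of mine on the HTTP protocol (?maxage=6000000)

You saw the ?maxage=xx suffix on the URLs above. It is not an ordinary parameter, and it does more than it appears to. It has at least the following benefits:

1. It controls the max-age value in the HTTP headers.
2. It lets users set a precise cache lifetime for each resource.
3. It can act as a resource version number.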

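First, the effect on the back end:
The server no longer needs to load extra modules such as mod_expires or mod_headers, nor load rules and check which directory or file type a resource belongs to and how long it should be cached; all of those operations have a cost.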

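Next, the effect on the front end:
Take the same intraday quote chart: our intraday page needs it updated every minute, while some home pages are fine with every three minutes. Without JavaScript control, what cache lifetime should I set? With maxage, this kind of per-use customization becomes possible.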

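Another case: we gave an image a permanent cache lifetime, but a requirement forces us to update it. How do users get the updated image? Yahoo's material and everything else I could find describe the same method: rename the file and reference the new resource. I find that clumsy. After renaming, the old file cannot be deleted because something may still reference it, so the same resource may be stored twice; change it again and you need another name and a third copy. Inodes are a resource too. I do not do that: I simply change maxage=6000000 to maxage=6000001 and the problem is solved.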
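How a server might turn the ?maxage= parameter into a response header can be sketched as follows. This is a hypothetical Python illustration rather than the author's server code: the helper names are invented, and the default of 120 seconds is the value the article gives for requests without the parameter.

```python
from urllib.parse import parse_qs, urlsplit

DEFAULT_MAX_AGE = 120  # the article's default when no maxage parameter is given


def max_age_for(url, default=DEFAULT_MAX_AGE):
    """Read the cache lifetime in seconds from the ?maxage= query parameter."""
    values = parse_qs(urlsplit(url).query).get("maxage")
    if not values:
        return default
    try:
        return max(0, int(values[0]))
    except ValueError:
        return default  # malformed value: fall back to the default


def cache_control_header(url):
    """Build the Cache-Control header line for a request URL."""
    return f"Cache-Control: max-age={max_age_for(url)}"


print(cache_control_header("http://sports.sinajs.cn/today.js?maxage=11"))
# Cache-Control: max-age=11
print(cache_control_header("http://sports.sinajs.cn/today.js"))
# Cache-Control: max-age=120
```

Because browsers and proxies key their caches on the full URL, ?maxage=6000000 and ?maxage=6000001 are distinct cache entries, which is what lets the parameter double as a version number.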

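The power of maxage=6000000 (memory block usage down 250 times, requests down 37 times):
The sports channel was launching a new feature. At first it fetched its data dynamically; I thought that wasted dynamic pool resources, so I had them move the XML files onto my JS pool. For convenience they also put an 84 KB Flash file there, one that every user has to download. Frankly I do not welcome such heavyweights: it is incompressible, so by normal standards it is equivalent to a 3 MB file. My server is designed so that whatever cannot be sent in one go is held in memory, in 10 KB memory blocks, and the default when no parameter is given is maxage=120. I found that, because of this file, I was using 10,000 memory blocks at 100,000 connections. My own algorithm for allocating contiguous memory also costs CPU. After the first send of an 84 KB file, the remainder should fit in 64 KB, so I changed the minimum memory block size to 64 KB. That brought usage down to about 1,500 memory blocks at 100,000 connections. Total memory consumption did not shrink much, but 64 KB of contiguous memory can be obtained faster, which also saves CPU. Next I had meijun append maxage=6000000 (about 69 days; a browser cache that lasts this long is already doing well) to every reference to the Flash resource. At 100,000 connections we then used fewer than 40 memory blocks; in other words, memory block usage fell 250 times and requests fell 37 times. At 350,000+ connections and 56,700 req/s it used only about 100 blocks, far less than linear growth. It was this discovery that made me want to give this technical talk; everything else is covered in passing.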

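V. Comments on YSlow's 14 rules for website performance

The items in black are closely tied to the back end; we have already covered them here, and in more depth. Of the items in blue, 5, 6, and 7 concern page execution speed and are things the people building front-end pages should watch. Item 11 is a practice to avoid. I will focus on the items in red: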

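gzip: I do not recommend it, because some early versions of IE support it poorly. The symptom is that accessing the file directly in IE works, but when it is embedded via JavaScript it is not decompressed correctly. About 2% of users are affected. I tracked this problem for nearly a month and almost gave up on compression. Then I found that files I had previously compressed with deflate were served correctly, and switching to deflate solved the problem. Apache 1.x uses mod_gzip and 2.x switched to mod_deflate; I do not know whether that is related. Also, for small files deflate saves quite a few bytes compared with gzip.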

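Reduce DNS lookups: there is a trade-off here too. Browsers generally open at most two connections per host name. If a page embeds many images from image.xx.com, you will see them appear one by one from top to bottom, because they are queued in the browser. We can raise concurrency by adding host names, for example image0.xx.com, image1.xx.com, image2.xx.com, and image3.xx.com. That raises concurrency but invalidates many cache entries. The fix is simple: sum the characters of the file name and take the result mod 4, which guarantees that a given image is always served from the same host name. That said, I am also firmly against pages that request dozens of host names, many of which serve only one or two resources; the time cost is not worth it.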
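The "sum the file name, take it mod 4" rule can be sketched as below. The function name `shard_host` is hypothetical, the host pattern reuses the article's placeholder domain, and summing the UTF-8 bytes of the name is one possible reading of "sum the characters".

```python
def shard_host(filename, shards=4, pattern="image{}.xx.com"):
    """Map a file name to one of several image hosts, always the same one."""
    index = sum(filename.encode("utf-8")) % shards
    return pattern.format(index)


for name in ("a.gif", "logo.png", "banner.jpg"):
    print(name, "->", shard_host(name))
# a.gif -> image1.xx.com  (byte sum 453, and 453 % 4 == 1)
```

Because the mapping depends only on the file name, every page references a given image through the same host, so the browser cache entry is reused while requests for different images are still spread across the hosts.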

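In addition, I would add a 15th rule here: stagger resource request times to avoid queuing in the browser.
With Ajax in wide use, dynamic refreshing is everywhere. One page in the sports live coverage called six files under one of my domains: three JS and three XML. Roughly, two refreshed every 10 seconds, two every 30 seconds, and two were loaded once. I observed that normal response times were around 7 ms, but every so often one went above 100 ms, which puzzled me because the server load was very light. meijun helped me stagger the refresh intervals to 11, 9, and 31 seconds, and the likelihood of responses above 100 ms dropped several-fold. This is what people mean when they say details decide success or failure.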

1. Make as few HTTP requests as possible [content]
2. Use a CDN (Content Delivery Network) [server]
3. Add an Expires header (or Cache-Control) [server]
4. Gzip components [server]
5. Put CSS at the top of the page [css]
6. Move scripts to the bottom (including inline ones) [javascript]
7. Avoid CSS expressions [css]
8. Make JavaScript and CSS external files [javascript] [css]
9. Reduce DNS lookups [content]
10. Minify JavaScript and CSS (including inline) [javascript] [css]
11. Avoid redirects [server]
12. Remove duplicate scripts [javascript]
13. Configure entity tags (ETags) [server]
14. Make AJAX cacheable

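VI. Going live != Finished

During the Olympics I designed a backup system sized for 15 to 20 million online connections. In hindsight, had the user count really reached that figure I would have been in serious danger, because some servers had been brought in running 32-bit CentOS 5, which had not been proven in production, and at the time I simply assumed it would behave like CentOS 4. So now I am very cautious about libraries and new versions that have not been fully tested. Nothing should be concluded lightly without verification in a real environment.

Many project teams seem perpetually busy: build a new project, go live, move on to the next one, and occasionally turn back to fix old bugs. If a project's user base keeps growing after launch, it is time to think about optimization. The effect of a small change on the back end is very different with one visitor than with a million. Resources that need not be requested should stay cached on the user's disk: the user's access gets faster and you save resources. Going live only means the work can be handed in; for engineers, continued tracking and optimization of an important project is necessary.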

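VII. Summary of ways to gain speed while saving cost

1. Write an economical HTTP server (clearly faster under high load; saves 5 to 10 times the servers)
Tailor-make it for certain important servers, or pick an efficient open-source package and optimize it.

2. Mix different services on the same machines (saves 1 to 2 times the servers)
If one of our servers handles only 300,000 connections, the remaining 75% of CPU, 95% of memory, and almost all of the disk can host the dynamic pool. I believe a DB's consumption of network card interrupts is limited, so I do not need to buy new network cards either.

3. Use a new domain for pure data (faster; saves at least 1x upstream bandwidth)
For example, we bought sinajs.cn separately for data services, to avoid cookies and save bandwidth. Cookies not only waste server-side processing capacity, they are also sent upstream, and upstream is usually slower than downstream.

4. Use persistent connections (clearly faster; saves more than 2x bandwidth; cuts network congestion by 3x to many times)
For applications that request several resources at once or make follow-up requests within a short interval, persistent connections clearly improve the user experience, reduce network congestion, and reduce the cost of establishing new connections on the back end.

5. Separate data from presentation, and static data from dynamic data (clearly faster; saves 3x bandwidth)
After separating data and presentation with div+css, file size reportedly drops to one third of what it was.
Split out the JS files a page references, and separate the dynamic parts from the static parts.

6. Use the deflate compression algorithm (clearly faster; saves 3.33x bandwidth)
Compressed files are generally less than 30% of their original size.
Compress the data separated out above (cumulative bandwidth saving: 10x).

7. Let users cache as many of your resources as possible (clearly faster; saves 3 to 50 times the server and bandwidth resources)
Cache the CSS and the rarely changing JS data separated out above for a suitable length of time (ideally, a cumulative bandwidth saving of 30 to 500 times).

Together, these improvements greatly increase speed while saving 5 to 20 times the server resources, cutting network congestion by 3x to many times, saving at least 1x upstream bandwidth, and saving 30 to 500 times downstream bandwidth, or even more.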

How to install and configure Apache, MySQL, and PHP on Linux with support for GD, JPEG, PNG, and FreeType

I. Preparation
Required software packages and versions (all of the following are tar.gz packages)
1. PHP Version 5.1.6
2. Apache/2.2.3 (Unix)
3. FreeType Version 2.2.1
4. GD Version 2.0 or higher
5. libXML2 Version 2.6.26
6. Mysql version 5.0.27
7. zlib Version 1.2.3
8. Zend Optimizer v3.0.2
9. libpng Version 1.2.12
10. jpegsrc.v6b

II. Installation
To simplify installation and configuration, first stop the firewall: /etc/init.d/iptables stop
1. Install MySQL
You will see that MySQL comes in many builds, for example
mysql-max-5.0.27-linux-i686-glibc23.tar.gz and
mysql-max-5.0.27-linux-i686.tar.gz
To decide which of the two to choose, run the following command:
# rpm -qa | grep glibc
glibc-kernheaders-2.4-8.10
glibc-common-2.3.2-11.9
glibc-2.3.2-11.9
glibc-devel-2.3.2-11.9
If you see output like the above, choose mysql-max-5.0.27-linux-i686-glibc23.tar.gz.

# tar -zxvf mysql-max-5.0.27-linux-i686-glibc23.tar.gz
# mv mysql-max-5.0.27-linux-i686-glibc23 mysql
# cd mysql
# groupadd mysql
# useradd -g mysql mysql
# scripts/mysql_install_db --user=mysql
# chown -R root .
# chown -R mysql data
# chgrp -R mysql .
# bin/mysqld_safe --user=mysql &
If MySQL starts normally, continue with the steps below. Otherwise run
# killall -TERM mysqld
to kill all MySQL processes, delete the mysql directory, and repeat the installation steps above.

2. Install Apache
# tar -zxvf httpd-2.2.3.tar.gz
# cd httpd-2.2.3
# ./configure --prefix=/usr/local/httpd --enable-so --enable-track-vars --enable-modules=most
# make
# make install

Pay attention to the installation order from here on: install the supporting packages first.
3. Install zlib
# tar -zxvf zlib-1.2.3.tar.gz
# cd zlib-1.2.3
# ./configure
# make
# make install

4. Install FreeType
# tar xzvf freetype-2.1.5.tar.gz
# cd freetype-2.1.5
# ./configure --prefix=/usr/local/freetype
# make
# make install

5. Install libpng
Do not use --prefix to set a custom install directory; it interferes with the GD installation.
# tar -xzvf libpng-1.2.12-no-config.tar.gz
# cd libpng-1.2.12
# cp scripts/makefile.std makefile
# make test
# make install

6. Install jpeg
# mkdir /usr/local/modules
# mkdir /usr/local/jpeg6
# mkdir /usr/local/jpeg6/bin
# mkdir /usr/local/jpeg6/lib
# mkdir /usr/local/jpeg6/include
# mkdir /usr/local/jpeg6/man
# mkdir /usr/local/jpeg6/man/man1
# tar -xzvf jpegsrc.v6b.tar.gz
# cd jpeg-6b
# ./configure --prefix=/usr/local/jpeg6 --enable-shared --enable-static
# make
# make install

7. Install GD
# tar xzvf gd-2.0.33.tar.gz
# cd gd-2.0.33
# ./configure --prefix=/usr/local/gd --with-jpeg=/usr/local/jpeg6 --with-png --with-zlib --with-freetype=/usr/local/freetype --with-xpm
# make
# make install

8. Install PHP

# tar -xzvf php-5.1.6.tar.gz
# cd php-5.1.6
# ./configure --with-apxs2=/usr/local/httpd/bin/apxs --with-mysql=/usr/local/mysql --with-zlib --with-jpeg-dir=/usr/local/jpeg6 --with-png --with-freetype-dir=/usr/local/freetype --with-xpm --enable-ftp --enable-sockets --with-gd=/usr/local/gd --enable-gd-native-ttf --with-ttf --enable-track-vars --enable-magic-quotes --with-iconv --enable-mbstring --with-config-file-path=/usr/local/php/etc

# make
# make install
# cp php.ini-dist /usr/local/php/etc/php.ini

Edit the Apache configuration file and add a few lines so that Apache can interpret PHP programs.
Find the line AddType application/x-tar .tgz and add the following below it:
AddType application/x-httpd-php .php
AddType application/x-httpd-php .php3
AddType application/x-httpd-php .phtml
AddType application/x-httpd-php-source .phps

Find the following line and append index.php to it, so that index.php can also serve as the site's default page:

DirectoryIndex index.html index.html.var index.php

Configure php.ini
Find safe_mode=Off and change it to safe_mode=On
(1) Find max_execution_time = 30 and change it to max_execution_time = 600
(2) Find max_input_time = 60 and change it to max_input_time = 600
(3) Find memory_limit = 8M and change it to memory_limit = 20M
(4) Find display_errors = On and change it to display_errors = Off
(5) Find register_globals = Off and change it to register_globals = On
(6) Find post_max_size = 8M and change it to post_max_size = 20M
(7) Find upload_max_filesize = 2M and change it to upload_max_filesize = 20M
(8) Find session.auto_start = 0 and change it to session.auto_start = 1
Save and exit; this completes the php.ini configuration.

老曾的博客 (Lao Zeng's Blog)