A First Try at systemtap

February 4, 2013

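I had long heard of systemtap, had read about it on and off, and had tried the systemtap commands bundled with CentOS a few times, with good results. This post covers installing systemtap and gives an example of tracing a user-space application. Before that, some background on systemtap and related material.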

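Solaris has a famous dynamic tracing tool, DTrace. It is an excellent tool that won the gold award in The Wall Street Journal's 2006 Technology Innovation Awards. Linux used to have no counterpart, but that is no longer the case: Linux now has systemtap.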

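Like the ZFS file system, DTrace long could not be ported to Linux because of licensing issues. In February 2012, however, Oracle (which had acquired Sun) announced a beta of DTrace for Linux, porting the Solaris dynamic tracing tool to its Unbreakable Enterprise Kernel (2.6.39), so Linux users can finally use DTrace as well. DTrace is not the subject of this post, though, so back to systemtap; for a comparison of systemtap and DTrace, see the references at the end.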

Mainstream Linux distributions such as Fedora, openSUSE, and CentOS already ship with full systemtap support. Here is a trial run on my machine:

[root@localhost ~]# uname -a
Linux localhost.localdomain 2.6.32-71.el6.x86_64 #1 SMP Fri May 20 03:51:51 BST 2011 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost ~]# cat /etc/issue
CentOS Linux release 6.0 (Final)
Kernel \r on an \m

[root@localhost ~]# mkdir -p /home/work/systemtap/
[root@localhost ~]# cd !$
cd /home/work/systemtap/
[root@localhost systemtap]# vi lsprob.stp
[root@localhost systemtap]# cat lsprob.stp
probe process("/usr/local/nginx/sbin/nginx").function("*")
{
    printf("%s(%s)\n", probefunc(), $$parms)
}
[root@localhost systemtap]# /usr/local/nginx/sbin/nginx
[root@localhost systemtap]# stap -v lsprob.stp
Pass 1: parsed user script and 71 library script(s) using 87816virt/24540res/2428shr kb, in 150usr/80sys/366real ms.
Pass 2: analyzed script: 1370 probe(s), 260 function(s), 1 embed(s), 0 global(s) using 119184virt/53660res/5280shr kb, in 210usr/140sys/443real ms.
Pass 3: translated to C into "/tmp/stap6xNnxj/stap_bf1d7c57b21463aadca831f0f58503b1_586504.c" using 119152virt/53960res/5544shr kb, in 80usr/970sys/2227real ms.
Pass 4, preamble: (re)building SystemTap's version of uprobes.
Pass 4: compiled C into "stap_bf1d7c57b21463aadca831f0f58503b1_586504.ko" in 10620usr/11870sys/27102real ms.
Pass 5: starting run.
ngx_time_update()
ngx_gmtime(t=0x51015b28 tp=0x7fff7c70ff80)
ngx_sprintf(buf=0x68761a fmt=0x460c58)
ngx_vslprintf(buf=0x68761a last=0xffffffffffffffff fmt=0x460c58 args=0x7fff7c70fe60)
...
ngx_destroy_pool(pool=0xef4a10)
ngx_pool_cleanup_file(data=0xef53f8)
ngx_event_add_timer(timer=0xfde8 ev=0xf08600)
ngx_rbtree_insert(tree=0x689340 node=0xf08628)
ngx_handle_read_event(rev=0xf08600 flags=0x0)
ngx_pfree(pool=0xefe970 p=0xef4110)
ngx_pfree(pool=0xefe970 p=0xef4600)
ngx_reusable_connection(c=0x7fa37adcc180 reusable=0x1)
ngx_http_run_posted_requests(c=0x7fa37adcc180)
ngx_event_expire_timers()
ngx_rbtree_min(sentinel=? node=0xf08628)
ngx_event_process_posted(cycle=0xeea070 posted=0x689358)
ngx_http_keepalive_handler(rev=0xf08600)
ngx_palloc(pool=0xefe970 size=0x400)
ngx_palloc_large(pool=0xefe970 size=0x400)
ngx_alloc(size=0x400 log=0xefe9d0)
ngx_unix_recv(c=0x7fa37adcc180 buf=0xef4110 size=0x400)
ngx_http_close_connection(c=0x7fa37adcc180)
ngx_close_connection(c=0x7fa37adcc180)
ngx_event_del_timer(ev=0xf08600)
ngx_rbtree_delete(tree=0x689340 node=0xf08628)
ngx_epoll_del_connection(c=0x7fa37adcc180 flags=0x1)
ngx_reusable_connection(c=0x7fa37adcc180 reusable=0x0)
ngx_free_connection(c=0x7fa37adcc180)
ngx_destroy_pool(pool=0xefe970)
ngx_process_events_and_timers(cycle=0xeea070)
ngx_event_find_timer()
ngx_epoll_process_events(cycle=0xeea070 timer=0xffffffffffffffff flags=0x1)
^CPass 5: run completed in 20usr/150sys/28527real ms.
[root@localhost systemtap]#

While stap is running, open another terminal and send an HTTP request to nginx:

[root@localhost ~]# curl 127.0.0.1
<html>
<head>
<title>Welcome to nginx!</title>
</head>
<body bgcolor="white" text="black">
<center><h1>Welcome to nginx!</h1></center>
</body>
</html>

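As the output above shows, on CentOS 6.0 the systemtap that ships with the system captures nginx's execution in full. Next, I install systemtap from source and also switch to a newer vanilla kernel.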

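I downloaded the latest systemtap source package (as of 2013-01-26) from ftp://sources.redhat.com/pub/systemtap/releases/. Since the goal here is to trace user-space applications, first look at the utrace part of the README in the source package: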

Building a kernel.org kernel:

- Consider applying the utrace kernel patches, if you wish to probe
user-space applications. http://sourceware.org/systemtap/wiki/utrace
Or if your kernel is near 3.5, apply the uprobes and related patches
(see NEWS). Or if your kernel is >= 3.5, enjoy the built-in uprobes.

- Build the kernel using your normal procedures. Enable
CONFIG_DEBUG_INFO, CONFIG_KPROBES, CONFIG_RELAY, CONFIG_DEBUG_FS,
CONFIG_MODULES, CONFIG_MODULE_UNLOAD, CONFIG_UTRACE if able
- % make modules_install install headers_install
- Boot into the kernel.

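The README text above describes three cases for tracing applications with systemtap:
1. If the kernel is fairly old (2.6.37, for example), download and apply the matching utrace patch.
2. If the kernel is newer but still below 3.5, apply the uprobes and related patch series (three of them); see systemtap's NEWS file for details.
3. If the kernel is 3.5 or later, user-space probing is supported out of the box.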
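
The 3.5 cutoff from the README can be checked mechanically. The following is a minimal sketch, not part of systemtap: the function name `userspace_probe_support` and its two output strings are made up for illustration, and it only distinguishes "3.5 or later" from "older", because the README does not give an exact boundary between the utrace and uprobes-patch cases.

```shell
#!/bin/sh
# Hypothetical helper: classify a kernel release string by the README's
# 3.5 cutoff for built-in uprobes. Defaults to the running kernel.
userspace_probe_support() {
    release=${1:-$(uname -r)}
    major=${release%%.*}                 # "3.7.0" -> "3"
    case $release in
        *.*)
            rest=${release#*.}           # "3.7.0" -> "7.0"
            minor=${rest%%[!0-9]*}       # "7.0"   -> "7"
            ;;
        *)
            minor=0
            ;;
    esac
    minor=${minor:-0}
    if [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 5 ]; }; then
        echo "builtin-uprobes"           # README: ">= 3.5, enjoy the built-in uprobes"
    else
        echo "needs-patches"             # utrace or uprobes patches; see README and NEWS
    fi
}
```

Calling `userspace_probe_support "$(uname -r)"` on the CentOS 6.0 stock kernel shown earlier (2.6.32-71.el6.x86_64) would report `needs-patches` under this rule. Distribution kernels may carry backported patches, so the version number alone is only a first hint.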

To save effort, I downloaded the 3.7.0 kernel source and built it:

[root@localhost ~]# cd /usr/src/
[root@localhost src]# tar xjf linux-3.7.tar.bz2
[root@localhost src]# cd linux-3.7/
[root@localhost src]# make menuconfig

Make sure this option is selected:

Kernel hacking  --->
[*] Tracers  --->
[*]   Enable uprobes-based dynamic events

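I searched the kernel configuration menu for a long time without finding CONFIG_UTRACE. That option no longer exists in newer kernels, so it can be ignored. Before running make, check the .config file to confirm the following options are present: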

[root@localhost src]# cat .config | grep CONFIG_DEBUG_INFO
[root@localhost src]# cat .config | grep CONFIG_KPROBES
[root@localhost src]# cat .config | grep CONFIG_RELAY
[root@localhost src]# cat .config | grep CONFIG_DEBUG_FS
[root@localhost src]# cat .config | grep CONFIG_MODULES
[root@localhost src]# cat .config | grep CONFIG_MODULE_UNLOAD
[root@localhost src]# make
[root@localhost src]# make modules
[root@localhost src]# make modules_install install headers_install
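
A plain `grep CONFIG_MODULES` also matches longer option names that contain the same text, and lines such as `# CONFIG_... is not set`. As a hedged alternative, the sketch below checks that each of the six options named in the README is set to `=y`. The function name `check_kernel_config` and its output format are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: verify that the options the systemtap README lists
# are enabled (=y) in a kernel config file. Returns 1 if any is missing.
check_kernel_config() {
    config=${1:-.config}
    missing=0
    for opt in CONFIG_DEBUG_INFO CONFIG_KPROBES CONFIG_RELAY \
               CONFIG_DEBUG_FS CONFIG_MODULES CONFIG_MODULE_UNLOAD; do
        # Anchor the pattern so "# CONFIG_X is not set" and longer
        # option names sharing the prefix do not count as a match.
        if grep -q "^${opt}=y\$" "$config"; then
            echo "ok      $opt"
        else
            echo "MISSING $opt"
            missing=1
        fi
    done
    return "$missing"
}
```

Run from the kernel source directory as `check_kernel_config .config`; a non-zero exit status means at least one option still needs to be enabled in `make menuconfig`.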

The build and reboot went smoothly.

Now try the systemtap that ships with the system:

[root@localhost ~]# cd /home/work/systemtap/
[root@localhost systemtap]# cat lsprob.stp
probe process("/usr/local/nginx/sbin/nginx").function("*")
{
    printf("%s(%s)\n", probefunc(), $$parms)
}
[root@localhost systemtap]# /usr/local/nginx/sbin/nginx
[root@localhost systemtap]# uname -a
Linux localhost.localdomain 3.7.0 #1 SMP Wed Jan 9 04:46:12 CST 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost systemtap]# cat /etc/issue
CentOS Linux release 6.0 (Final)
Kernel \r on an \m

[root@localhost systemtap]# stap -v lsprob.stp
Pass 1: parsed user script and 71 library script(s) using 87952virt/24532res/2424shr kb, in 200usr/110sys/310real ms.
semantic error: process probes not available without kernel CONFIG_UTRACE while resolving probe point process("/usr/local/nginx/sbin/nginx").function("*")
Pass 2: analyzed script: 0 probe(s), 0 function(s), 0 embed(s), 0 global(s) using 88612virt/25324res/2560shr kb, in 10usr/0sys/8real ms.
Pass 2: analysis failed.  Try again with another '--vp 01' option.
[root@localhost systemtap]#

The error is expected: the bundled systemtap is an old version that still depends on and checks for the CONFIG_UTRACE option:

[root@localhost systemtap]# stap -V
SystemTap translator/driver (version 1.2/0.148 non-git sources)
Copyright (C) 2005-2010 Red Hat, Inc. and others
This is free software; see the source for copying conditions.

Try a newer systemtap instead. Installation is simple:

[root@localhost systemtap]# ls
lsprob.stp  systemtap-2.0.tar.gz
[root@localhost systemtap]# tar xzf systemtap-2.0.tar.gz
[root@localhost systemtap]# cd systemtap-2.0
[root@localhost systemtap-2.0]# ./configure --prefix=/usr/local/systemtap/
[root@localhost systemtap-2.0]# make; make install;
[root@localhost systemtap-2.0]# cd ..
[root@localhost systemtap]# /usr/local/systemtap/bin/stap -V
Systemtap translator/driver (version 2.0/0.148, non-git sources)
Copyright (C) 2005-2012 Red Hat, Inc. and others
This is free software; see the source for copying conditions.
enabled features: LIBRPM LIBSQLITE3 NSS BOOST_SHARED_PTR TR1_UNORDERED_MAP NLS
[root@localhost systemtap]# /usr/local/systemtap/bin/stap -v lsprob.stp
Pass 1: parsed user script and 90 library script(s) using 186812virt/21572res/2744shr/19532data kb, in 190usr/50sys/244real ms.
Pass 2: analyzed script: 1370 probe(s), 272 function(s), 3 embed(s), 0 global(s) using 338640virt/37872res/6068shr/32224data kb, in 1690usr/2180sys/3630real ms.
Pass 3: translated to C into "/tmp/stapKIcR2o/stap_b06a01dc8793131b49ab0ed87f010c1c_610081_src.c" using 339056virt/38520res/6072shr/33036data kb, in 240usr/280sys/520real ms.
In file included from /usr/local/systemtap/share/systemtap/runtime/linux/task_finder.c:17,
                 from /usr/local/systemtap/share/systemtap/runtime/linux/runtime.h:169,
                 from /usr/local/systemtap/share/systemtap/runtime/runtime.h:17,
                 from /tmp/stapKIcR2o/stap_b06a01dc8793131b49ab0ed87f010c1c_610081_src.c:21:
/usr/local/systemtap/share/systemtap/runtime/linux/task_finder2.c: In function ‘__stp_get_mm_path’:
/usr/local/systemtap/share/systemtap/runtime/linux/task_finder2.c:441: error: ‘VM_EXECUTABLE’ undeclared (first use in this function)
/usr/local/systemtap/share/systemtap/runtime/linux/task_finder2.c:441: error: (Each undeclared identifier is reported only once
/usr/local/systemtap/share/systemtap/runtime/linux/task_finder2.c:441: error: for each function it appears in.)
In file included from /tmp/stapKIcR2o/stap_b06a01dc8793131b49ab0ed87f010c1c_610081_src.c:88294:
/usr/local/systemtap/share/systemtap/runtime/linux/uprobes-inode.c: In function ‘stapiu_change_plus’:
/usr/local/systemtap/share/systemtap/runtime/linux/uprobes-inode.c:410: error: ‘VM_EXECUTABLE’ undeclared (first use in this function)
/usr/local/systemtap/share/systemtap/runtime/linux/uprobes-inode.c: In function ‘stapiu_get_task_inode’:
/usr/local/systemtap/share/systemtap/runtime/linux/uprobes-inode.c:512: error: ‘VM_EXECUTABLE’ undeclared (first use in this function)
make[1]: *** [/tmp/stapKIcR2o/stap_b06a01dc8793131b49ab0ed87f010c1c_610081_src.o] Error 1
make: *** [_module_/tmp/stapKIcR2o] Error 2
WARNING: kbuild exited with status: 2
Pass 4: compiled C into "stap_b06a01dc8793131b49ab0ed87f010c1c_610081.ko" in 8080usr/4360sys/12363real ms.
Pass 4: compilation failed.  Try again with another '--vp 0001' option.
[root@localhost systemtap]#

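Even more errors? The cause is that the kernel is too new for this release: as the compiler output shows, VM_EXECUTABLE is no longer declared in the 3.7 kernel headers, yet the systemtap 2.0 runtime still references it. A newer systemtap snapshot is needed, systemtap-20121215.tar.bz2. Unpack and install it: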

[root@localhost systemtap]# ls
lsprob.stp  systemtap-2.0  systemtap-20121215.tar.bz2  systemtap-2.0.tar.gz
[root@localhost systemtap]# tar xjf systemtap-20121215.tar.bz2
[root@localhost systemtap]# cd systemtap-20121215.tar.bz2
-bash: cd: systemtap-20121215.tar.bz2: Not a directory
[root@localhost systemtap]# cd src/
[root@localhost src]# ./configure --prefix=/usr/local/systemtap/
[root@localhost src]# make; make install

Try again:

[root@localhost systemtap]# /usr/local/systemtap/bin/stap -V
Systemtap translator/driver (version 2.1/0.148, non-git sources)
Copyright (C) 2005-2012 Red Hat, Inc. and others
This is free software; see the source for copying conditions.
enabled features: LIBRPM LIBSQLITE3 NSS BOOST_SHARED_PTR TR1_UNORDERED_MAP NLS
[root@localhost systemtap]# /usr/local/systemtap/bin/stap -v lsprob.stp
Pass 1: parsed user script and 90 library script(s) using 189748virt/24468res/2736shr/22460data kb, in 170usr/40sys/209real ms.
Pass 2: analyzed script: 1370 probe(s), 265 function(s), 3 embed(s), 0 global(s) using 341640virt/40840res/6092shr/35216data kb, in 1540usr/2000sys/3328real ms.
Pass 3: translated to C into "/tmp/stap0lA0uk/stap_31d31fafe19083e2b9ad121ef41879bd_607455_src.c" using 342052virt/41460res/6072shr/36024data kb, in 240usr/400sys/655real ms.
Pass 4: compiled C into "stap_31d31fafe19083e2b9ad121ef41879bd_607455.ko" in 24160usr/6630sys/30387real ms.
Pass 5: starting run.
ngx_time_update()
ngx_gmtime(t=0x50ecb971 tp=0x7fff6851cb80)
ngx_sprintf(buf=0x68761a fmt=0x460c58)
...
ngx_event_expire_timers()
ngx_event_expire_timers(sentinel=? node=0x1f41628)
ngx_event_process_posted(cycle=0x1f23070 posted=0x689358)
ngx_http_keepalive_handler(rev=0x1f41600)
ngx_palloc(pool=0x1f37970 size=0x400)
ngx_palloc_large(pool=0x1f37970 size=0x400)
ngx_alloc(size=0x400 log=0x1f379d0)
ngx_unix_recv(c=0x7fdd38af2180 buf=0x1f2d110 size=0x400)
ngx_handle_read_event(rev=0x1f41600 flags=0x0)
ngx_process_events_and_timers(cycle=0x1f23070)
ngx_event_find_timer()
ngx_event_find_timer(sentinel=0x688c00 node=0x1f41628)
ngx_epoll_process_events(cycle=0x1f23070 timer=0xfde8 flags=0x1)
ngx_time_update()
ngx_http_keepalive_handler(rev=0x1f41600)
ngx_unix_recv(c=0x7fdd38af2180 buf=0x1f2d110 size=0x400)
ngx_http_close_connection(c=0x7fdd38af2180)
ngx_close_connection(c=0x7fdd38af2180)
ngx_close_connection(ev=0x1f41600)
ngx_rbtree_delete(tree=0x689340 node=0x1f41628)
ngx_epoll_del_connection(c=0x7fdd38af2180 flags=0x1)
ngx_reusable_connection(c=0x7fdd38af2180 reusable=0x0)
ngx_free_connection(c=0x7fdd38af2180)
ngx_destroy_pool(pool=0x1f37970)
ngx_event_expire_timers()
ngx_process_events_and_timers(cycle=0x1f23070)
ngx_event_find_timer()
ngx_epoll_process_events(cycle=0x1f23070 timer=0xffffffffffffffff flags=0x1)
^CPass 5: run completed in 50usr/170sys/108234real ms.
[root@localhost systemtap]#

It finally works.
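
A trace like the one above grows quickly, so a per-function call count can be easier to read than the raw stream. The sketch below is one way to summarize it, assuming the trace was saved to a file whose lines have the `name(args)` shape printed by lsprob.stp. The function name `count_calls` and the file name `trace.log` are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: count how often each function appears in a saved
# trace whose lines look like "ngx_pfree(pool=0xefe970 p=0xef4110)".
# Reads the named files, or standard input when no file is given.
count_calls() {
    awk -F'(' '
        # Only lines that start with an identifier followed by "(";
        # this skips "Pass N:" status lines and "..." placeholders.
        /^[A-Za-z_][A-Za-z0-9_]*\(/ { n[$1]++ }
        END { for (f in n) printf "%d %s\n", n[f], f }
    ' "$@" | sort -k1,1nr -k2,2
}
```

For instance, after saving the output with something like `stap lsprob.stp > trace.log`, running `count_calls trace.log` lists the most frequently called functions first.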

Also, if configure reports an error like this:
configure: error: missing elfutils development headers/libraries (install elfutils-devel, libebl-dev, libdw-dev and/or libebl-devel)
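try downloading the matching elfutils source package from https://fedorahosted.org/releases/e/l/elfutils/ and then configure systemtap as follows:
[root@localhost stap]# ./configure --prefix=/home/stap/install/ --with-elfutils=/home/elfutils-x.xxx
Note: do not try to install elfutils itself, or you may end up with a messed-up system environment.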

References:
Oracle releases DTrace for Linux beta

Oracle releases its Linux kernel update; the DTrace dynamic tracing framework draws attention

https://lists.linux-foundation.org/pipermail/ksummit-2008-discuss/2008-June/000192.html

http://dtrace.org/blogs/ahl/2011/10/05/dtrace-for-linux-2/

http://redmonk.com/sogrady/2008/07/01/dtrace-vs-systemtap-redux/

When reposting, please keep the original address: http://lenky.info/archives/2013/02/04/2200 (http://lenky.info/?p=2200)

