How to Trace a Function's Complete Call Sequence
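How do you trace the complete call sequence of a function? For programmers this is an interesting question, and there happens to be a discussion of exactly this on Stack Overflow. This post works through that discussion hands-on and retells its content.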
We know two commands, ltrace and strace: ltrace traces and records calls to dynamic library functions, while strace traces and records system calls. Here is an example:
[root@localhost trace]# uname -a
Linux localhost.localdomain 3.7.0 #1 SMP Wed Jan 9 04:46:12 CST 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost trace]# cat /etc/issue
CentOS Linux release 6.0 (Final)
Kernel \r on an \m
[root@localhost trace]# cat test.c
#include <stdio.h>

int triple (int x)
{
  return 3 * x;
}

int main (void)
{
  printf("%d\n", triple(10));
  return 0;
}
[root@localhost trace]# gcc test.c -o test
[root@localhost trace]# ltrace ./test
__libc_start_main(0x4004d6, 1, 0x7fffd53fc748, 0x400520, 0x400510 <unfinished ...>
printf("%d\n", 3030
) = 3
+++ exited (status 0) +++
[root@localhost trace]# strace ./test
execve("./test", ["./test"], [/* 31 vars */]) = 0
brk(0) = 0x1d63000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f01b3073000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=141335, ...}) = 0
mmap(NULL, 141335, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f01b3050000
close(3) = 0
open("/lib64/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\355\1\272>\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1838296, ...}) = 0
mmap(0x3eba000000, 3664040, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3eba000000
mprotect(0x3eba175000, 2097152, PROT_NONE) = 0
mmap(0x3eba375000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x175000) = 0x3eba375000
mmap(0x3eba37a000, 18600, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3eba37a000
close(3) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f01b304f000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f01b304e000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f01b304d000
arch_prctl(ARCH_SET_FS, 0x7f01b304e700) = 0
mprotect(0x3eba375000, 16384, PROT_READ) = 0
mprotect(0x3eb9a1e000, 4096, PROT_READ) = 0
munmap(0x7f01b3050000, 141335) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f01b3072000
write(1, "30\n", 330
) = 3
exit_group(0) = ?
[root@localhost trace]#
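As shown above, ltrace traces only the call to the dynamic library function printf, while strace traces every system call, such as execve and brk. Note that printf is only a library function; internally it uses the write system call. (The program's own output, 30, is interleaved with the trace output, which is why the printf and write lines appear to contain 3030 and 330.)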
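To make the relationship between printf and write more concrete, here is a minimal sketch in C of a helper that formats an integer itself and hands the bytes directly to write. The name print_int_via_write is hypothetical and does not come from the original discussion; it is meant to be called from main in test.c in place of printf.

```c
#include <stdio.h>   /* snprintf */
#include <unistd.h>  /* write, ssize_t */

/*
 * Hypothetical helper (not from the original post): format `value` as
 * decimal text plus a newline, then pass the bytes to write(2) on `fd`.
 * Returns the number of bytes written, or -1 on error.
 */
ssize_t print_int_via_write(int fd, int value)
{
    char buf[32];
    int len = snprintf(buf, sizeof buf, "%d\n", value);

    /* Reject formatting errors and (unexpected) truncation. */
    if (len < 0 || (size_t)len >= sizeof buf)
        return -1;

    /* A single write is assumed to be enough for such a short buffer. */
    return write(fd, buf, (size_t)len);
}
```

Calling print_int_via_write(1, triple(10)) should produce the same write(1, "30\n", 3) line under strace, while ltrace would be expected to report something like snprintf and write (the libc wrapper) rather than printf.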
Strictly speaking, neither ltrace nor strace traces a function's complete call sequence, so let's look at some other methods.
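The first method looks a bit clumsy, but it is the simplest one that works, because the tool it relies on, gdb, is available almost everywhere. The steps are as follows: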
First, collect all the function symbols in the binary:
[root@localhost trace]# readelf -s ./test | gawk '
{
  if($4 == "FUNC" && $2 != 0) {
    print "# code for " $NF;
    print "b *0x" $2;
    print "commands";
    print "silent";
    print "bt 1";
    print "c";
    print "end";
    print "";
  }
}' > sym;
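gawk is used here to generate the corresponding gdb commands. The commands are quite clever: they set a breakpoint at every function symbol. For example, "b *0x00000000004004c4" sets a breakpoint at the function triple. As soon as execution reaches that address, the program stops and gdb runs the attached commands block. Within it, silent makes the stop quiet, that is, it tells gdb not to print its usual extra messages; "bt 1" is chosen to suit the current need, since showing only the innermost frame is enough to name the current function; and "c" resumes execution. The generated file looks like this: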
[root@localhost trace]# cat sym
# code for call_gmon_start
b *0x000000000040040c
commands
silent
bt 1
c
end

...
# code for triple
b *0x00000000004004c4
commands
silent
bt 1
c
end

# code for main
b *0x00000000004004d6
commands
silent
bt 1
c
end

# code for _init
b *0x0000000000400390
commands
silent
bt 1
c
end
Next, run gdb and pass the generated commands in through the --command option:
[root@localhost trace]# gdb --command=sym ./test -q
Reading symbols from /home/gqk/work/trace/test...done.
Breakpoint 1 at 0x40040c
Breakpoint 2 at 0x400430
Breakpoint 3 at 0x4004a0
Breakpoint 4 at 0x4005b0
Breakpoint 5 at 0x400510
Breakpoint 6 at 0x4003e0
Breakpoint 7 at 0x4005e8
Breakpoint 8 at 0x400520
Breakpoint 9 at 0x4004c4: file test.c, line 4.
Breakpoint 10 at 0x4004d6: file test.c, line 9.
Breakpoint 11 at 0x400390
(gdb)
The output above shows that all the breakpoints have been set. Now issue the r command:
(gdb) r
Starting program: /home/gqk/work/trace/test
#0 0x00000000004003e0 in _start ()
#0 0x0000000000400520 in __libc_csu_init ()
#0 0x0000000000400390 in _init ()
#0 0x000000000040040c in call_gmon_start ()
#0 0x00000000004004a0 in frame_dummy ()
#0 0x00000000004005b0 in __do_global_ctors_aux ()
#0 main () at test.c:9
#0 triple (x=0) at test.c:4
30
#0 0x00000000004005e8 in _fini ()
#0 0x0000000000400430 in __do_global_dtors_aux ()

Program exited normally.
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.7.el6.x86_64
(gdb)
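That traces out the entire sequence of function calls. Not bad, is it? (triple is shown with x=0 rather than 10, most likely because the breakpoint sits on the function's very first instruction, before the prologue has stored the argument where gdb looks for it.)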
The second method uses systemtap. systemtap is so powerful that tracing a function's complete call sequence with it is remarkably simple:
[root@localhost trace]# cat lsprob.stp
probe process("/home/gqk/work/trace/test").function("*")
{
  printf("%s(%s)\n", probefunc(), $$parms)
}
[root@localhost trace]# /usr/local/systemtap/bin/stap lsprob.stp -c /home/gqk/work/trace/test
30
main()
triple(x=0x0)
[root@localhost trace]#
This captured only the application's own main and triple calls, but that is essentially what we wanted.
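The third method covers the various home-grown tools built on the -finstrument-functions compiler option, such as etrace (chapter 2 of the book 《深入剖析Nginx》 also describes a custom approach of this kind), along with a range of other tools of varying reliability: ftrace, Valgrind's Callgrind tool, and so on.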
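As a rough illustration of how tools built on -finstrument-functions work, the following is a minimal sketch of the two hook functions that GCC calls on function entry and exit when that option is enabled. It is not etrace or the method from the book; the trace_depth counter and the output format are assumptions made for this example, and the file is meant to be compiled alongside test.c rather than on its own.

```c
#include <stdio.h>  /* fprintf, stderr */

/* Current nesting depth; `trace_depth` is a name chosen for this sketch. */
static int trace_depth = 0;

/*
 * GCC calls these two hooks on entry to and exit from every function
 * compiled with -finstrument-functions. They must not be instrumented
 * themselves, otherwise they would recurse without end.
 */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));

void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    /* Indent by depth so that nested calls read as a tree. */
    fprintf(stderr, "%*senter %p (called from %p)\n",
            trace_depth * 2, "", this_fn, call_site);
    trace_depth++;
}

void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)call_site;  /* not used on exit */
    if (trace_depth > 0)
        trace_depth--;
    fprintf(stderr, "%*sexit  %p\n", trace_depth * 2, "", this_fn);
}
```

Saved as, say, trace.c, this could be built together with test.c using gcc -finstrument-functions test.c trace.c -o test. On a non-PIE build like the one in this article, the printed addresses can then be mapped back to function names with a tool such as addr2line -f -e ./test.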
Full reference:
http://stackoverflow.com/questions/311840/tool-to-trace-local-function-calls-in-linux/324709
Other links:
http://blog.superadditive.com/2007/12/01/call-graphs-using-the-gnu-project-debugger/
https://github.com/yotamr/traces/wiki/Introduction
http://sourceware.org/frysk/
When reposting, please keep the address: http://lenky.info/archives/2013/02/05/2202 or http://lenky.info/?p=2202
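Note: unless otherwise stated, the content of these articles reflects Lenky's own genuine understanding and is not idle speculation meant to mislead. My ability is limited, and although I strive for accuracy, mistakes are hard to avoid. Please bear with them, leave a comment to let me know if you can, and feel free to join the discussion. Some of Lenky's articles and passages draw on what others have generously shared online, especially the articles marked with a full reference. The linked material there may well be more direct and more complete, and I have only summarized and retold it, for which I extend my thanks. All technical articles on this site may be reposted under the CC (Creative Commons) license; the more personal essays are best not reposted.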