$ g++ main.cc -std=c++11 -L. -lfoo
/tmp/cc39RCrN.o: In function `Base::Base()':
main.cc:(.text._ZN4BaseC2Ev[_ZN4BaseC5Ev]+0x9): undefined reference to `vtable for Base'
/tmp/cc39RCrN.o: In function `Derived::Derived()':
main.cc:(.text._ZN7DerivedC2Ev[_ZN7DerivedC5Ev]+0x19): undefined reference to `vtable for Derived'
collect2: error: ld returned 1 exit status
Inspecting the symbol table
Use nm to inspect the symbol table:
$ nm libbase.so | egrep "(Base|Derived)"
0000000000000cda W _ZN4BaseD0Ev
0000000000000ca4 W _ZN4BaseD1Ev
0000000000000ca4 W _ZN4BaseD2Ev
0000000000000d42 W _ZN7DerivedD0Ev
0000000000000d00 W _ZN7DerivedD1Ev
0000000000000d00 W _ZN7DerivedD2Ev
0000000000000c20 T _ZNK7Derived5printERSo
                 U _ZTI4Base
0000000000201058 V _ZTI7Derived
0000000000000dc8 V _ZTS7Derived
                 U _ZTV4Base
0000000000201030 V _ZTV7Derived

0000000000000dda W _ZN4BaseD0Ev
0000000000000da4 W _ZN4BaseD1Ev
0000000000000da4 W _ZN4BaseD2Ev
0000000000000e42 W _ZN7DerivedD0Ev
0000000000000e00 W _ZN7DerivedD1Ev
0000000000000e00 W _ZN7DerivedD2Ev
0000000000000d20 T _ZNK7Derived5printERSo
00000000002010b8 V _ZTI4Base
00000000002010a0 V _ZTI7Derived
0000000000000ed1 V _ZTS4Base
0000000000000ec8 V _ZTS7Derived
0000000000201078 V _ZTV4Base
0000000000201050 V _ZTV7Derived

0000000000000d66 T _ZN4BaseD0Ev
0000000000000d30 T _ZN4BaseD1Ev
0000000000000d30 T _ZN4BaseD2Ev
0000000000000eb2 W _ZN7DerivedD0Ev
0000000000000e70 W _ZN7DerivedD1Ev
0000000000000e70 W _ZN7DerivedD2Ev
0000000000000dec T _ZNK7Derived5printERSo
0000000000201138 V _ZTI4Base
0000000000201170 V _ZTI7Derived
0000000000000f29 V _ZTS4Base
0000000000000f38 V _ZTS7Derived
0000000000201110 V _ZTV4Base
0000000000201148 V _ZTV7Derived
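The main difference:
The three _ZN4BaseD<i>Ev symbols (i = 0, 1, 2) changed from W (weak symbol) to T (text section).
After changing print back to a function that is declared but not defined, the symbol table became: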
0000000000000d96 T _ZN4BaseD0Ev
0000000000000d60 T _ZN4BaseD1Ev
0000000000000d60 T _ZN4BaseD2Ev
0000000000000ee2 W _ZN7DerivedD0Ev
0000000000000ea0 W _ZN7DerivedD1Ev
0000000000000ea0 W _ZN7DerivedD2Ev
                 U _ZNK4Base5printERSo
0000000000000e1c T _ZNK7Derived5printERSo
0000000000201168 V _ZTI4Base
00000000002011a0 V _ZTI7Derived
0000000000000f59 V _ZTS4Base
0000000000000f68 V _ZTS7Derived
0000000000201140 V _ZTV4Base
0000000000201178 V _ZTV7Derived
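The biggest difference is that the U entry here is _ZNK4Base5printERSo, which is clearly the print method of the base class Base. I have forgotten a good deal about linking (time to brush up), but looking back over these four symbol tables, it is still possible to see roughly why the information changes once the destructor is split out into its own source file.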
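One way to read the four tables, offered as a hedged sketch rather than the post's actual source (the class definitions are not shown in this section): under the Itanium C++ ABI that GCC follows, the vtable is emitted only in the translation unit that defines the class's key function, that is, the first virtual function that is neither pure nor inline. The class layout below is an assumption chosen to match the last table, and render and check_key_function_sketch are hypothetical helper names.

```cpp
// Sketch only: the class layout is assumed, not taken from the original sources.
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

struct Base {
    // Declared in the class, defined out of line below. As the first
    // non-inline, non-pure virtual function it is the key function, so the
    // vtable for Base is emitted wherever this definition is compiled.
    virtual ~Base();
    // In the failing scenario this is declared but never defined. Once the
    // vtable is emitted, it refers to this function by name.
    virtual void print(std::ostream& os) const;
};

Base::~Base() {}  // out-of-line definition: listed as T rather than W by nm

// Defined here so that the sketch links. Removing this definition would
// leave Base::print(std::ostream&) const as an undefined symbol.
void Base::print(std::ostream& os) const { os << "Base"; }

struct Derived : Base {
    void print(std::ostream& os) const override { os << "Derived"; }
};

// Hypothetical helper: calls print through the vtable and returns the text.
std::string render(const Base& b) {
    std::ostringstream oss;
    b.print(oss);
    return oss.str();
}

// Hypothetical self-check for the sketch.
void check_key_function_sketch() {
    Derived d;
    Base b;
    assert(render(d) == "Derived");
    assert(render(b) == "Base");
}
```

With an inline destructor and an undefined print, print itself would be the key function, so no vtable is emitted at all and the linker can only complain about the vtable. Moving the destructor out of line makes the vtable appear, and the missing function is then named directly.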
0000000000000b30 W _ZN7DerivedD0Ev
0000000000000b20 W _ZN7DerivedD1Ev
0000000000000b20 W _ZN7DerivedD2Ev
0000000000000b00 T _ZNK7Derived5printERSo
                 U _ZTI4Base
0000000000200cb0 V _ZTI7Derived
0000000000000bc0 V _ZTS7Derived
0000000000200cc8 V _ZTV7Derived
Compared with the default (-O0) build:
0000000000000cda W _ZN4BaseD0Ev
0000000000000ca4 W _ZN4BaseD1Ev
0000000000000ca4 W _ZN4BaseD2Ev
0000000000000d42 W _ZN7DerivedD0Ev
0000000000000d00 W _ZN7DerivedD1Ev
0000000000000d00 W _ZN7DerivedD2Ev
0000000000000c20 T _ZNK7Derived5printERSo
                 U _ZTI4Base
0000000000201058 V _ZTI7Derived
0000000000000dc8 V _ZTS7Derived
                 U _ZTV4Base
0000000000201030 V _ZTV7Derived
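First, the first three Base symbols (the D0/D1/D2 destructors) have been inlined away. _ZTV4Base (the vtable for Base) is gone as well, and compiling main.cc now reports fewer errors: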
$ g++ main.cc -std=c++11 -L. -lbase -O2
./libbase.so: undefined reference to `typeinfo for Base'
collect2: error: ld returned 1 exit status

$ nm -C libbase.so | egrep "(Base|Derived)"
0000000000000b30 W Derived::~Derived()
0000000000000b20 W Derived::~Derived()
0000000000000b20 W Derived::~Derived()
0000000000000b00 T Derived::print(std::ostream&) const
                 U typeinfo for Base
0000000000200cb0 V typeinfo for Derived
0000000000000bc0 V typeinfo name for Derived
0000000000200cc8 V vtable for Derived
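PS: Well, treat the earlier part as a record of the pitfalls I stumbled into. I can't be bothered to rewrite it.
In addition, when the destructor is defined in the source file, the symbol table gains these entries: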
0000000000000d70 T Base::~Base()
0000000000000d60 T Base::~Base()
0000000000000d60 T Base::~Base()
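The corresponding mangled symbols are the ones carrying a D suffix, namely D0/D1/D2. As for why there are three destructors, I don't know. For comparison, an ordinary class has only two (D1/D2).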
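For what it's worth, the Itanium C++ ABI names these variants: D0 is the deleting destructor (it runs the destructor and then calls operator delete), D1 is the complete-object destructor, and D2 is the base-object destructor. D0 is only needed when the destructor is virtual, which would explain why an ordinary class shows just D1/D2. The sketch below only illustrates the situations in which each variant is used. The types and the dtor_log, destroy_via_base and check_destructor_variants helpers are hypothetical and do not come from the original code.

```cpp
// Sketch only: hypothetical types illustrating when each destructor variant is used.
#include <cassert>
#include <string>
#include <vector>

// Records destructor calls so that the order can be inspected afterwards.
std::vector<std::string>& dtor_log() {
    static std::vector<std::string> log;
    return log;
}

struct Base {
    // Virtual destructor: the compiler also emits a deleting variant (D0).
    virtual ~Base() { dtor_log().push_back("~Base"); }
};

struct Derived : Base {
    // Destroying a Derived runs this body, then the Base subobject's
    // destructor (the base-object variant, D2, of Base).
    ~Derived() override { dtor_log().push_back("~Derived"); }
};

struct Plain {
    // No virtual functions: no deleting variant is required.
    ~Plain() { dtor_log().push_back("~Plain"); }
};

// delete through a base pointer dispatches virtually to the deleting
// destructor of the dynamic type, which then frees the memory.
void destroy_via_base(Base* p) { delete p; }

// Hypothetical self-check for the sketch.
void check_destructor_variants() {
    dtor_log().clear();
    destroy_via_base(new Derived);
    assert(dtor_log().size() == 2);
    assert(dtor_log()[0] == "~Derived");
    assert(dtor_log()[1] == "~Base");
    { Plain p; }  // automatic object: complete-object destructor (D1)
    assert(dtor_log().back() == "~Plain");
}
```

D1 and D2 often share one address, as in the tables above, because the two variants are identical for classes without virtual bases.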
$ g++ -o libbase.so base.cc derived.cc -std=c++11 -fPIC -shared -O2
$ nm -C libbase.so | egrep "(Base|Derived)"
0000000000000d70 W Base::~Base()
0000000000000d60 W Base::~Base()
0000000000000d60 W Base::~Base()
0000000000000de0 W Derived::~Derived()
0000000000000dd0 W Derived::~Derived()
0000000000000dd0 W Derived::~Derived()
0000000000000d50 T Base::f() const
                 U Base::print(std::ostream&) const
0000000000000db0 T Derived::print(std::ostream&) const
0000000000201038 V typeinfo for Base
0000000000201078 V typeinfo for Derived
0000000000000e68 V typeinfo name for Base
0000000000000e78 V typeinfo name for Derived
0000000000201048 V vtable for Base
0000000000201090 V vtable for Derived
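As you can see, the undefined (U) symbol here is no longer the vtable but the virtual function itself.
Summary
Unresolved symbol problems in C++ are quite common, and even after hitting this pitfall N times I can still slip up through a small mistake. This post mainly described how to troubleshoot them with nm. When the only clue is a missing vtable/typeinfo, which is hard to trace, try adding a virtual function that has an implementation (such as f above) and then inspect the symbol table again.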
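A minimal sketch of that trick follows, assuming f is declared before print and that the destructor is inline, which is what the last nm listing suggests but the post does not show. The body of f, print_to_string and check_diagnostic_sketch are hypothetical.

```cpp
// Sketch only: assumed declaration order, not the original base.h.
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

struct Base {
    virtual ~Base() {}       // inline, so it cannot be the key function
    // Diagnostic virtual function. It is the first non-inline, non-pure
    // virtual function, which makes it the key function of Base.
    virtual void f() const;
    virtual void print(std::ostream& os) const;
};

// Defining f causes the vtable for Base to be emitted in this translation
// unit, so any other virtual function without a definition is now reported
// by name instead of as a missing vtable.
void Base::f() const {}

// Defined here so that the sketch links. Commenting it out would leave
// Base::print(std::ostream&) const undefined, as in the listing above.
void Base::print(std::ostream& os) const { os << "Base::print"; }

// Hypothetical helper: calls print through the vtable and returns the text.
std::string print_to_string(const Base& b) {
    std::ostringstream oss;
    b.print(oss);
    return oss.str();
}

// Hypothetical self-check for the sketch.
void check_diagnostic_sketch() {
    Base b;
    b.f();
    assert(print_to_string(b) == "Base::print");
}
```

Once the real culprit has been identified and fixed, the temporary f can be removed again.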