Emulating RCU (read-copy-update) with shared_ptr

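RCU performs better than a reader-writer lock (RWL): its read operations are almost free. A writer first makes a copy of the original data and then modifies the copy (only one writer is allowed at any moment); when it is done, it replaces the original pointer (if the swap is not an atomic operation, it needs the protection of a critical section); afterwards, a reclaimer frees the memory occupied by the original data at an appropriate time. As a result, a writer is not blocked by the presence of readers, as it would be with an RWL. Like RWL, RCU is best suited to read-mostly workloads.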

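RCU is mostly found in operating system kernels, although there is also a user-space implementation to refer to (liburcu). My feeling is that if the reclaimer's work is handed over to whichever reader or writer happens to be last, it should be quite easy to implement with the help of tr1::shared_ptr. The code below is my implementation. I have little experience with concurrent programming and am never quite sure whether I got it right, so I am posting it here and would welcome corrections!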

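The basic idea is as follows. A reader obtains a shared_ptr to the current data (call it Generation 1, G1) through get_reading_copy(); however, it has to block and wait while m_is_swapping is 1. A writer obtains a copy of the current data (call it Generation 2, G2) through get_updating_copy(), which is guarded by m_is_writing so that only one writer may enter. When the modification is finished, the writer calls update() to hand back the shared_ptr to this copy; a swap then makes m_data_ptr point to the new data (G2), after which m_is_writing and m_is_swapping are released. After the call to update(), the shared_ptr held by the writer points to the original data (G1), and so do the shared_ptr(s) held by earlier reader(s); once all of these shared_ptr(s) have been destroyed, the memory occupied by the original data is freed (this may happen in a writer or in a reader). New readers get a shared_ptr to the new data. If another writer enters, it gets a copy of the new data (call it Generation 3, G3). Therefore, as long as the shared_ptr(s) pointing to G1 have not all been destroyed, data copies of several different generations may exist at the same time.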

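// LockGuard and CRWLock are helper classes that are not shown here.
// std::tr1::shared_ptr comes from <tr1/memory> with gcc, or <memory> with VC++.
template <typename T>
class rcu_protected
{
public:
    typedef T                                   type;
    typedef const T                             const_type;
    typedef std::tr1::shared_ptr<type>          rcu_pointer;
    typedef std::tr1::shared_ptr<const_type>    rcu_const_pointer;

    rcu_protected() : m_data_ptr (new type()) {}

    rcu_protected(T* data) : m_data_ptr (data) {}

    // Reader: copy the current pointer under the read lock.
    rcu_const_pointer get_reading_copy ()
    {
        LockGuard< CRWLock,
                  &CRWLock::read_lock,
                  &CRWLock::read_unlock> l_guard (m_ptr_lock);

        return m_data_ptr;
    }

    // Writer: admit a single writer (m_writer_lock stays held until
    // update() is called), then return a private copy of the data.
    rcu_pointer get_updating_copy ()
    {
        m_writer_lock.write_lock();

        LockGuard< CRWLock,
                  &CRWLock::read_lock,
                  &CRWLock::read_unlock> l_guard (m_ptr_lock);

        return rcu_pointer(new type(*m_data_ptr));
    }

    // Writer: publish the modified copy. The argument is taken by reference
    // so that, after the swap, the caller's pointer refers to the old data
    // and the old data is not destroyed inside update().
    void update (rcu_pointer& new_data_ptr)
    {
        LockGuard< CRWLock,
                  &CRWLock::write_lock,
                  &CRWLock::write_unlock> l_guard (m_ptr_lock);

        m_data_ptr.swap (new_data_ptr);

        m_writer_lock.write_unlock();
    }

private:
    CRWLock     m_ptr_lock;     // protects reads and the swap of m_data_ptr
    CRWLock     m_writer_lock;  // serializes writers
    rcu_pointer m_data_ptr;     // the current generation
};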

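Notes to self:

  1. From VC2005 onward (VC2005 included), the compiler automatically adds fences to accesses of volatile variables, so those two barrier(s) can be omitted.
  2. In update(), simply assigning new_data_ptr to m_data_ptr should also work, but then the data that m_data_ptr pointed to might be destroyed inside update(), and we want update() to finish as quickly as possible.
  3. Thanks to Dmitry Vyukov for the review. I missed a simple case: when a reader has waited until m_is_swapping is 0 and is about to copy the pointer, a writer thread may suddenly come in to update the pointer, which could lead to a crash... Reads and writes should be protected with a rw_lock... Lesson learned: any read or write of critical data should be protected from start to finish...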
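
For readers without CRWLock and LockGuard at hand, the same idea can be sketched with C++11 facilities. This is only an illustration under assumptions, not the code above: the class name rcu_cell, the use of std::mutex in place of a reader-writer lock, and the callback-style update() are all made up for this sketch.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical C++11 variant of rcu_protected; std::mutex stands in for CRWLock.
template <typename T>
class rcu_cell {
public:
    typedef std::shared_ptr<T>       pointer;
    typedef std::shared_ptr<const T> const_pointer;

    rcu_cell() : m_data(std::make_shared<T>()) {}

    // Reader: take a snapshot of the current generation.
    const_pointer read() const {
        std::lock_guard<std::mutex> guard(m_ptr_mutex);
        return m_data;
    }

    // Writer: copy the current generation, let `modify` change the copy,
    // then publish it. m_writer_mutex admits one writer at a time and is
    // released by its guard even if the copy or `modify` throws.
    template <typename F>
    void update(F modify) {
        std::lock_guard<std::mutex> writer(m_writer_mutex);
        pointer fresh = std::make_shared<T>(*read());
        modify(*fresh);
        {
            std::lock_guard<std::mutex> guard(m_ptr_mutex);
            m_data.swap(fresh);
        }
        // `fresh` now refers to the old generation. It is dropped here,
        // outside m_ptr_mutex; the data is freed only when the last
        // reader holding it lets go.
    }

private:
    mutable std::mutex m_ptr_mutex;    // protects m_data
    std::mutex         m_writer_mutex; // serializes writers
    pointer            m_data;         // the current generation
};
```

A reader that called read() before an update keeps seeing its own generation, while later readers see the new one. Passing the modification as a callback keeps the writer lock inside one scope, at the cost of an interface that differs from get_updating_copy()/update().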

C++ exceptions and shared objects

In many applications, shared objects are loaded as sub-modules via dlopen(3C). If such a module is written in C++, you need to be careful about throwing exceptions from it. The client may try to unload the module (by dlclose(3C)) inside the exception-handling block, in which case the application may crash when the C++ runtime tries to clean up the exception object (by calling its destructor) before leaving the catch block. In practice, the exception class is usually defined in a header file, so its code may be included in both the application and the sub-module. Even in this case, the linker or loader may choose the copy in the sub-module (e.g., with the -Bdirect flag of ld(1)).

The conclusion is: do not unload a shared object while handling an exception thrown from it, or do not throw exceptions from a shared object at all. This was the root cause of the SCIM x11 frontend crashing on its second launch.
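
The second option, not letting exceptions leave the shared object, might look like the sketch below. The function names plugin_parse and parse_positive and the error-buffer convention are invented for illustration; the point is only that the exported entry point catches everything and hands the caller plain data that it owns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <exception>
#include <stdexcept>

// Hypothetical module-internal routine that reports failure by throwing.
static int parse_positive(int value) {
    if (value <= 0)
        throw std::invalid_argument("value must be positive");
    return value * 2;
}

// Hypothetical exported entry point. No exception leaves the module:
// a failure becomes an error code plus a message copied into a buffer
// owned by the caller, so no object created by the module has to be
// destroyed after dlclose().
extern "C" int plugin_parse(int value, int* result,
                            char* errbuf, std::size_t errlen) {
    try {
        *result = parse_positive(value);
        return 0;
    } catch (const std::exception& e) {
        if (errbuf && errlen > 0) {
            std::strncpy(errbuf, e.what(), errlen - 1);
            errbuf[errlen - 1] = '\0';
        }
        return -1;
    } catch (...) {
        if (errbuf && errlen > 0)
            errbuf[0] = '\0';
        return -1;
    }
}
```

With this shape the client can call dlclose(3C) at any point after the call returns, because the exception object has already been destroyed inside the module.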

boost, shared_ptr and bcp

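As everyone knows, STL containers support only "value semantics", not "reference semantics". Using a pointer to the object as the template argument is not a perfect solution, because we have to free the objects ourselves in addition to calling erase() or remove(). The solution almost everyone recommends is shared_ptr, the smart pointer from boost or TR1 (gcc 4.x already supports TR1). However, boost is big (about 40-odd MB), and it is not easy to compile smoothly with Sun's C++ compiler (unless you use libstl-port). For the recent state of Boost on Solaris, see http://blogs.sun.com/sga/category/Boost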

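Later I learned about the bcp tool, which can extract the module(s) you need so that they can be distributed on their own. So I downloaded the boost source package, built bjam and bcp, and used bcp to extract shared_ptr. The result gave me a start: shared_ptr and the files it depends on number more than 300, about 3.5 MB in total. It looks like I will have to write a simplified ad hoc version myself, modeled on the original...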
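
Such a simplified ad hoc version could start from something like the sketch below. It is an assumption about what "simplified" might mean, not a replacement for the boost class: the name simple_shared_ptr is made up, the counter is not thread-safe, and there are no weak pointers or custom deleters.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal reference-counted pointer (single-threaded only).
template <typename T>
class simple_shared_ptr {
public:
    explicit simple_shared_ptr(T* p = 0) : m_ptr(p), m_count(0) {
        if (p) {
            // If allocating the counter fails, do not leak the object.
            try { m_count = new std::size_t(1); }
            catch (...) { delete p; throw; }
        }
    }

    simple_shared_ptr(const simple_shared_ptr& other)
        : m_ptr(other.m_ptr), m_count(other.m_count) {
        if (m_count) ++*m_count;
    }

    simple_shared_ptr& operator=(const simple_shared_ptr& other) {
        simple_shared_ptr tmp(other);   // copy-and-swap
        swap(tmp);
        return *this;
    }

    ~simple_shared_ptr() { release(); }

    void swap(simple_shared_ptr& other) {
        T* p = m_ptr; m_ptr = other.m_ptr; other.m_ptr = p;
        std::size_t* c = m_count; m_count = other.m_count; other.m_count = c;
    }

    T& operator*() const { return *m_ptr; }
    T* operator->() const { return m_ptr; }
    T* get() const { return m_ptr; }
    std::size_t use_count() const { return m_count ? *m_count : 0; }

private:
    // Drop one reference; the last owner deletes object and counter.
    void release() {
        if (m_count && --*m_count == 0) {
            delete m_ptr;
            delete m_count;
        }
    }

    T*           m_ptr;
    std::size_t* m_count;
};
```

Because it is copyable and assignable, an object of this type can be stored in an STL container, and the pointee is freed when the last copy is erased or destroyed.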

The C++ books on my bookshelf

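I really like buying books. Having suffered from not being able to find out-of-print books, I buy them and keep them, even though I read few of them from cover to cover. The picture above shows the C++ books I have bought over the years. It does not include encyclopedic volumes such as "The C++ Programming Language" and "C++ Primer", nor domain-specific C++ books such as "Advanced Corba Programming with C++", "Data Structure and Algorithm Analysis with C++" and "XML programming with C++", let alone the MFC books from long ago. Some of them were bought at a discount. This boxful of books makes up about 4% of all the books currently in my home, so the whole collection is only a thousand-odd volumes, worth tens of thousands of yuan.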

Function Pointer as Template Parameter in SunStudio C++

//extern "C" void foo (void*);
extern void foo (void*);

template <typename T, void(*cb)(T*)> class Test {};
typedef Test<void, foo> TestVoid;

The SunStudio C++ compiler fails if I use the 1st prototype (the commented-out extern "C" one), and reports:

line 5: Error: Template parameter cb requires an expression of type void(*)(void*).
1 Error(s) detected.

No idea why this happens... 
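
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the C++ standard makes language linkage part of a function's type, so a pointer to an extern "C" function is formally a different type from void(*)(void*) with C++ linkage, and SunStudio enforces this distinction where g++ does not. If that is the cause, giving the template parameter a C-linkage pointer type should help. The sketch below fixes the argument type to void* instead of keeping the T parameter, uses a made-up typedef name c_callback, and has not been tried with SunStudio.

```cpp
#include <cassert>

// A pointer-to-function type with C language linkage (hypothetical name).
extern "C" { typedef void (*c_callback)(void*); }

// Stand-in for the extern "C" version of foo; the body is invented
// so that the call can be observed.
extern "C" void foo(void* p) { *static_cast<int*>(p) += 1; }

// The non-type parameter now carries C linkage in its type, like foo.
template <c_callback cb>
struct Test {
    static void call(void* p) { cb(p); }
};

typedef Test<foo> TestVoid;
```

The typedef has to live outside the template because templates cannot be declared inside an extern "C" block, which is why the T parameter was dropped here.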

C or C++ ?

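Recently Linus and Dmitry fought it out on the Git developers' mailing list over whether to use C or C++. Practitioners in China have started a lively discussion as well; see Liu Jiang's (刘江) blog post "Linux之父炮轰C++:糟糕程序员的垃圾语言" ("The father of Linux blasts C++: a rubbish language for bad programmers"), which also mentions articles by Meng Yan (孟岩) and Fengyun (风云). My impression from reading them is that Chinese people lean toward moderation, rarely go to extremes, and advocate making use of the strengths of both.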

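I too once spent a great deal of effort learning this extremely complex language, so I am quite attached to it; almost every important C++ book on the market is lined up on my bookshelf. But I have to admit that the range of areas where C++ fits has been getting narrower and narrower. If people do not understand the mechanisms behind the C++ language and compiler, they cannot write high-quality, or even correct, code. In addition, the poor portability of C++ has long been criticized. The mozilla community, for example, imposes various restrictions on the use of C++ to ensure portability (see the "C++ portability guide"). On the other hand, implementing a reasonably complete OO system in C, such as GObject, is far less convenient and intuitive than in C++.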

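Recently I read Liu Weipeng's (刘未鹏) series "C++ 0x漫谈" ("Ramblings on C++0x"), which shows the effort the C++ community is putting into improving the language itself. C++ really does need a major overhaul. But even if the new standard is approved, who knows how long it will take before the compilers fully support it. I hope that day comes soon.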

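So, in the 21st century, should you still learn C++? I think that a professional programmer who is serious about software development should still study C++ in depth. C++ cultivates abilities and skills in many areas, such as OOA/D and GP. I believe an excellent C++ programmer is certainly capable of writing good (or even better) C code. Besides, experienced C++ programmers are rather hard to hire these days, and the pay prospects are looking up ;).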

Be careful about the associative containers in Cstd of SunStudio

If your program has a lot of std::set<T> or std::map<K, V> instances, most of which are small (e.g., 1 to 5 elements), you had better change them to std::vector or another non-associative container type if you build your code with Cstd in SunStudio.

The genpyt utility in the sunpinyin/slm module takes over 800M of virtual memory, of which about 400~500M is allocated for unused __rb_tree_node buffers. If your machine has less than 1GB of memory, paging in and out (to and from swap) will make the program (and your system) run very slowly.

For example:

std::set<TNode*> nodeset;
nodeset.insert (new TNode());

During std::set<TNode*>::insert(TNode*), the underlying __rb_tree structure allocates 32 __rb_tree_node objects (20 bytes each), 640 bytes in total, while you just want it to hold a 4- or 8-byte pointer. If your program has a lot of such small set objects, the "wasted" memory is quite large.

I tested g++, and SunStudio with stlport4; they do not have this problem (or feature ;)).
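
As an illustration of the suggested change, a small set can be kept as a sorted std::vector. The wrapper below is a sketch under assumptions: the name small_set and its three-method interface are made up, and it trades O(n) insertion for a compact layout, which is reasonable only while the sets stay small.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical small-set wrapper: a sorted std::vector without duplicates.
// std::less is used so that ordering is also well defined for pointers.
template <typename T>
class small_set {
public:
    // Returns true if the value was inserted, false if already present.
    bool insert(const T& value) {
        std::less<T> less;
        typename std::vector<T>::iterator it =
            std::lower_bound(m_items.begin(), m_items.end(), value, less);
        if (it != m_items.end() && !less(value, *it))
            return false;               // equivalent element found
        m_items.insert(it, value);      // keeps the vector sorted
        return true;
    }

    bool contains(const T& value) const {
        return std::binary_search(m_items.begin(), m_items.end(),
                                  value, std::less<T>());
    }

    std::size_t size() const { return m_items.size(); }

private:
    std::vector<T> m_items;
};
```

How much memory this saves depends on the vector's own growth policy in the library in use, so it is worth measuring before converting a large code base.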

C++ template class inheritance and name lookup

1 #include <stdio.h>
2
3 template <int N>
4 struct Base {
5     int base[N];
6     Base() {
7         for (int i=0; i<N; ++i)
8             base[i] = i;
9     }
10 };
11
12 int base[1024];
13
14 template <int N>
15 struct Foo : public Base<N> {
16     void dump () {
17         for (int i=0; i<N; ++i)
18             printf ("%d ", base[i]);
19         printf ("\n");
20     }
21 };
22
23 int main (int argc, char **argv) {
24     Foo<10> foo;
25     foo.dump ();
26 }

This test program gives different output with SunStudio CC and g++.

g++:          0 0 0 0 0 0 0 0 0 0
sunstudio CC: 0 1 2 3 4 5 6 7 8 9

And, if we comment out line #12, i.e., the definition of ::base[], g++ would complain:

test.cpp: In member function `void Foo<N>::dump()':
test.cpp:18: error: `base' undeclared (first use this function)

Is it a bug in g++? Actually, no, it is the correct behavior. Refer to the release notes of gcc 3.4:

> In a template definition, unqualified names will no longer find
> members of a dependent base (as specified by [temp.dep]/3 in
> the C++ standard).
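
To reach the member of the dependent base regardless of the compiler, the name has to be made dependent, either with this-> or by qualifying it with the base class. The sketch below reuses Base and Foo from the test program; the three accessor names are made up for illustration.

```cpp
#include <cassert>

template <int N>
struct Base {
    int base[N];
    Base() {
        for (int i = 0; i < N; ++i)
            base[i] = i;
    }
};

int base[1024];   // the global that an unqualified lookup finds

template <int N>
struct Foo : public Base<N> {
    // this-> makes the name dependent, so lookup is deferred until
    // instantiation and finds Base<N>::base.
    int member_at(int i) const { return this->base[i]; }

    // Qualifying with the base class has the same effect.
    int member_at_qualified(int i) const { return Base<N>::base[i]; }

    // Unqualified: bound to ::base when the template is defined.
    int global_at(int i) const { return base[i]; }
};
```

With a conforming compiler, member_at(9) on a Foo<10> yields the member's value 9, while global_at(9) reads the zero-initialized global.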

2 Tips on C++ Programming with const

1. Implicit type conversion and copy constructor

If you want to define your own copy constructor, you should define it like this:

class Foo {
public:
  Foo (const Foo& obj) {}
};

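Pay attention to the const in the parameter declaration. If your copy constructor omits the const (Sun's C++ compiler does NOT complain about the definition itself), an implicit type conversion through a converting constructor will fail to compile, even though the copy constructor is not actually invoked during the conversion. The reason is that the temporary Foo produced by the conversion cannot bind to a non-const reference, so no usable copy constructor is found.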

class Foo {
public:
  Foo (Foo& obj) {}
  Foo (Bar *obj) {}
};

Foo test () { return new Bar(); }

On Linux with g++, you will get the following error messages:
  error: no matching function for call to ‘Foo::Foo(Foo)’
  ... ...
  error:   initializing temporary from result of ‘Foo::Foo(Bar*)’

On Solaris with the Sun Studio C++ compiler, you will get:
  Error: Cannot use Bar* to initialize Foo without "Foo::Foo(const Foo&)"
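
For comparison, here is a version that compiles, as a minimal sketch. The contents of Bar, the m_value member, and the decision that Foo deletes the Bar it is given are assumptions made only to keep the example self-contained.

```cpp
#include <cassert>

// Hypothetical payload type.
struct Bar {
    int value;
    Bar() : value(7) {}
};

class Foo {
public:
    // const reference: able to bind to the temporary created below.
    Foo(const Foo& obj) : m_value(obj.m_value) {}

    // Converting constructor; in this sketch it takes ownership of obj.
    Foo(Bar* obj) : m_value(obj->value) { delete obj; }

    int value() const { return m_value; }

private:
    int m_value;
};

// The return value is initialized from a temporary built by Foo(Bar*).
// Before C++17, the rules require an accessible Foo(const Foo&) here
// even when the compiler elides the copy.
Foo test() { return new Bar(); }
```

From C++17 on the copy in this return statement is elided by rule, so the non-const variant may be accepted there; the compilers discussed in this post predate that.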

2. The reference of pointer and "this"

Look at this piece of code:

void test (Foo* &obj) {}
//void test (Foo* const &obj) {}

class Foo {
public:
  void bar () { test (this); }
};

On Linux with g++, you will get the following error messages:
  invalid initialization of non-const reference of type ‘Foo*&’ from a temporary of type ‘Foo* const’
  in passing argument 1 of ‘void test(Foo*&)’

On Solaris with the Sun Studio C++ compiler, you will get:
  Error: Formal argument obj of type Foo*& in call to test(Foo*&) requires an lvalue.

Both messages say the same thing: Foo* &obj is a non-const reference, which requires an lvalue, while 'this' is an rvalue and cannot be bound to it (the & in "&obj" declares a reference; it is not the address-of operator). If we uncomment the 2nd test () (the one taking Foo* const &), then in function bar () we must call test ((Foo *const) this); alternatively, just comment out the 1st test (), so that the code compiles.

In non-const member functions, the type of 'this' is reported as Foo *const (see the g++ message above); in const member functions, it is const Foo *const.
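
The two ways of passing 'this' can be put side by side in a short sketch. The helper names is_same and reset, and the members bar and baz with their return values, are made up so that the behavior can be checked.

```cpp
#include <cassert>

class Foo;

// Reference to a const pointer: may bind to the rvalue `this`.
inline bool is_same(Foo* const& obj, Foo* expected) {
    return obj == expected;
}

// Non-const reference: needs a named pointer variable (an lvalue).
inline void reset(Foo*& obj) { obj = 0; }

class Foo {
public:
    bool bar() { return is_same(this, this); }

    bool baz() {
        Foo* self = this;   // copy `this` into an lvalue first
        reset(self);        // modifies the copy, never `this` itself
        return self == 0;
    }
};
```

If the callee really has to modify the pointer, copying 'this' into a local variable first is the only option; if it only reads the pointer, the Foo* const & signature (or passing Foo* by value) is enough.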