http://www.cppblog.com/lymons/archive/2010/08/01/120638.aspx
8 Mar 2004 7:00 AM
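The rule for static variables at block scope (as opposed to static variables at global scope) is that they are initialized the first time execution reaches their declaration.
Find the race condition:

    int ComputeSomething()
    {
        static int cachedResult = ComputeSomethingSlowly();
        return cachedResult;
    }

The intent of this code is to compute something expensive the first time the function is called, and to cache the result so that later calls simply return it.
A variation on this basic technique is advocated on the web as a way to avoid the "static initialization order fiasco". (The fiasco is described very well on that page, so I suggest you read it and understand it.)
The problem is that this code is not thread-safe. The compiler internally converts statics with local scope into something like this:

    int ComputeSomething()
    {
        static bool cachedResult_computed = false;
        static int cachedResult;
        if (!cachedResult_computed) {
            cachedResult_computed = true;
            cachedResult = ComputeSomethingSlowly();
        }
        return cachedResult;
    }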
Now the race condition is easier to see.
Suppose two threads call this function at the same moment. The first thread executes cachedResult_computed = true and is then pre-empted. The second thread now sees that cachedResult_computed is true, skips the body of the "if" branch, and returns an uninitialized variable.
What you see here is not a compiler bug. This behavior is required by the C++ standard.
You can also write a variation that creates an even worse problem:

    class Something { … };

    int ComputeSomething()
    {
        static Something s;
        return s.ComputeIt();
    }

Likewise, the compiler rewrites this internally as (this time using pseudo-C++):

    class Something { … };

    int ComputeSomething()
    {
        static bool s_constructed = false;
        static uninitialized Something s;
        if (!s_constructed) {
            s_constructed = true;
            new(&s) Something; // construct it
            atexit(DestructS);
        }
        return s.ComputeIt();
    }

    // Destruct s at process termination
    void DestructS()
    {
        ComputeSomething::s.~Something();
    }
Notice that there are multiple race conditions here. As before, one thread can run ahead of the other and use "s" before it has been constructed.
Even worse, the first thread can be pre-empted after testing s_constructed but before setting it to "true". In that case, the object s is constructed twice and destructed twice.
That can't be good.
But wait, that's not all. Now (the original says "Not", which I take to be a typo for "Now") look at what happens if you have two runtime-initialized local statics:

    class Something { … };

    int ComputeSomething()
    {
        static Something s(0);
        static Something t(1);
        return s.ComputeIt() + t.ComputeIt();
    }

The compiler converts this into the following pseudo-C++:

    class Something { … };

    int ComputeSomething()
    {
        static char constructed = 0;
        static uninitialized Something s;
        if (!(constructed & 1)) {
            constructed |= 1;
            new(&s) Something(0); // construct it
            atexit(DestructS);
        }
        static uninitialized Something t;
        if (!(constructed & 2)) {
            constructed |= 2;
            new(&t) Something(1); // construct it
            atexit(DestructT);
        }
        return s.ComputeIt() + t.ComputeIt();
    }
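To save space, the compiler packs the two "x_constructed" variables into a bitfield. Now there are multiple non-interlocked read-modify-store operations on the variable "constructed".
Now consider what happens if one thread tries to execute "constructed |= 1" at the same time another thread tries to execute "constructed |= 2".
On x86, these statements likely assemble into

    or constructed, 1
    …
    or constructed, 2

without any "lock" prefix. On a multiprocessor machine, both operations can read the same old value and then clobber each other with conflicting values.
On ia64 and alpha, the clobbering is even more obvious, because they have no single read-modify-store instruction; the operation is coded as three instructions:

    ldl  t1,0(a0)   ; load
    addl t1,1,t1    ; modify
    stl  t1,0(a0)   ; store

If the thread is pre-empted between the load and the store, the value it stores may no longer agree with the value it is overwriting.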
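So now consider the following problematic sequence of execution:
Thread A tests "constructed", finds it zero, and is about to set the value to 1, but it gets pre-empted.
Thread B enters the same function, sees that "constructed" is zero, and goes on to construct both "s" and "t", leaving "constructed" equal to 3.
Thread A resumes execution and completes its read-modify-store sequence, setting "constructed" to 1, then constructs "s" (a second time).
Thread A then goes on to construct "t" (a second time) and sets "constructed" (finally) to 3.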
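Now, you might think you can wrap the runtime initialization in a critical section:

    int ComputeSomething()
    {
        EnterCriticalSection(…);
        static int cachedResult = ComputeSomethingSlowly();
        LeaveCriticalSection(…);
        return cachedResult;
    }

After all, you have now placed the one-time initialization inside a critical section and made it thread-safe.
But what if the function is called again from the same thread? ("We've traced the call; it's coming from inside the thread!") This happens if ComputeSomethingSlowly() itself calls ComputeSomething(), perhaps indirectly. Because that thread already owns the critical section, it enters without blocking and once again returns an uninitialized variable.
Conclusion: When you see a local static variable being initialized at run time, be very careful.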
————————————————————————————————–
Original English article: http://blogs.msdn.com/b/oldnewthing/archive/2004/03/08/85901.aspx
————————————————————————————————–
C++ scoped static initialization is not thread-safe, on purpose!
8 Mar 2004 7:00 AM
The rule for static variables at block scope (as opposed to static variables with global scope) is that they are initialized the first time execution reaches their declaration.
Find the race condition:
    int ComputeSomething()
    {
        static int cachedResult = ComputeSomethingSlowly();
        return cachedResult;
    }
The intent of this code is to compute something expensive the first time the function is called, and then cache the result to be returned by future calls to the function.
A variation on this basic technique is advocated by this web site to avoid the "static initialization order fiasco". (Said fiasco is well-described on that page so I encourage you to read it and understand it.)
The problem is that this code is not thread-safe. Statics with local scope are internally converted by the compiler into something like this:
    int ComputeSomething()
    {
        static bool cachedResult_computed = false;
        static int cachedResult;
        if (!cachedResult_computed) {
            cachedResult_computed = true;
            cachedResult = ComputeSomethingSlowly();
        }
        return cachedResult;
    }
Now the race condition is easier to see.
Suppose two threads both call this function for the first time. The first thread gets as far as setting cachedResult_computed = true, and then gets pre-empted. The second thread now sees that cachedResult_computed is true and skips over the body of the "if" branch and returns an uninitialized variable.
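To make that interleaving concrete, here is a minimal single-threaded replay of it. This is only a sketch: the ExpandedStatic struct and the ReplayRace function are hypothetical names introduced for illustration, ComputeSomethingSlowly is a stand-in that returns 42, and the two "threads" are stepped by hand so the outcome is deterministic rather than timing-dependent.

```cpp
#include <cassert>

// Hypothetical stand-in for the article's ComputeSomethingSlowly().
int ComputeSomethingSlowly() { return 42; }

// The compiler-expanded state from the article, made explicit so the
// two "threads" can be stepped by hand on one real thread.
struct ExpandedStatic {
    bool cachedResult_computed = false;
    int cachedResult = 0;  // stays zero until the slow result is stored
};

// Replays the bad interleaving and returns what thread B's call returns.
int ReplayRace(ExpandedStatic& st) {
    // Thread A: tests the flag, sets it, and is pre-empted before the assignment.
    bool a_must_compute = !st.cachedResult_computed;
    if (a_must_compute) st.cachedResult_computed = true;

    // Thread B: sees the flag already set, skips the "if" body, and returns.
    if (!st.cachedResult_computed) {
        st.cachedResult_computed = true;
        st.cachedResult = ComputeSomethingSlowly();
    }
    int b_result = st.cachedResult;

    // Thread A resumes and finally stores the real value, too late for B.
    if (a_must_compute) st.cachedResult = ComputeSomethingSlowly();
    return b_result;
}
```

In this replay thread B gets back 0, the value the variable held before the slow computation was stored, even though the cache ends up holding 42.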
What you see here is not a compiler bug. This behavior is required by the C++ standard.
You can write variations on this theme to create even worse problems:
    class Something { … };

    int ComputeSomething()
    {
        static Something s;
        return s.ComputeIt();
    }
This gets rewritten internally as (this time, using pseudo-C++):
    class Something { … };

    int ComputeSomething()
    {
        static bool s_constructed = false;
        static uninitialized Something s;
        if (!s_constructed) {
            s_constructed = true;
            new(&s) Something; // construct it
            atexit(DestructS);
        }
        return s.ComputeIt();
    }

    // Destruct s at process termination
    void DestructS()
    {
        ComputeSomething::s.~Something();
    }
Notice that there are multiple race conditions here. As before, it’s possible for one thread to run ahead of the other thread and use "s" before it has been constructed.
Even worse, it’s possible for the first thread to get pre-empted immediately after testing s_constructed but before setting it to "true". In this case, the object s gets double-constructed and double-destructed.
That can’t be good.
But wait, that’s not all. Now look at what happens if you have two runtime-initialized local statics:
    class Something { … };

    int ComputeSomething()
    {
        static Something s(0);
        static Something t(1);
        return s.ComputeIt() + t.ComputeIt();
    }
This is converted by the compiler into the following pseudo-C++:
    class Something { … };

    int ComputeSomething()
    {
        static char constructed = 0;
        static uninitialized Something s;
        if (!(constructed & 1)) {
            constructed |= 1;
            new(&s) Something(0); // construct it
            atexit(DestructS);
        }
        static uninitialized Something t;
        if (!(constructed & 2)) {
            constructed |= 2;
            new(&t) Something(1); // construct it
            atexit(DestructT);
        }
        return s.ComputeIt() + t.ComputeIt();
    }
To save space, the compiler placed the two "x_constructed" variables into a bitfield. Now there are multiple non-interlocked read-modify-store operations on the variable "constructed".
Now consider what happens if one thread attempts to execute "constructed |= 1" at the same time another thread attempts to execute "constructed |= 2".
On an x86, the statements likely assemble into
    or constructed, 1
    …
    or constructed, 2
without any "lock" prefixes. On multiprocessor machines, it is possible for the two stores both to read the old value and clobber each other with conflicting values.
On ia64 and alpha, this clobbering is much more obvious since they do not have a single read-modify-store instruction; the three steps must be explicitly coded:
    ldl  t1,0(a0)   ; load
    addl t1,1,t1    ; modify
    stl  t1,0(a0)   ; store
If the thread gets pre-empted between the load and the store, the value stored may no longer agree with the value being overwritten.
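The lost update can be replayed deterministically by writing out the load, modify, and store steps and interleaving them by hand. The sketch below does that; ReplayLostUpdate and InterlockedOr are hypothetical names, and the second function uses std::atomic from C++11 (which postdates the article) as one assumed way to express an interlocked OR.

```cpp
#include <atomic>
#include <cassert>

// Replays "constructed |= 1" and "constructed |= 2" as explicit
// load / modify / store steps, interleaved the bad way.
unsigned char ReplayLostUpdate() {
    unsigned char constructed = 0;
    unsigned char a = constructed;  // thread A: load (reads 0)
    unsigned char b = constructed;  // thread B: load (also reads 0)
    a |= 1;                         // thread A: modify
    b |= 2;                         // thread B: modify
    constructed = a;                // thread A: store 1
    constructed = b;                // thread B: store 2, clobbering A's bit
    return constructed;
}

// Interlocked variant: each fetch_or is one indivisible read-modify-write,
// so neither bit can be lost regardless of interleaving.
unsigned char InterlockedOr() {
    std::atomic<unsigned char> constructed{0};
    constructed.fetch_or(1);
    constructed.fetch_or(2);
    return constructed.load();
}
```

The replay ends with the value 2 instead of 3. Note that an interlocked OR only protects the flag bits themselves; it does not close the gap between testing a bit and constructing the object.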
So now consider the following insane sequence of execution:
Thread A tests "constructed" and finds it zero and prepares to set the value to 1, but it gets pre-empted.
Thread B enters the same function, sees "constructed" is zero and proceeds to construct both "s" and "t", leaving "constructed" equal to 3.
Thread A resumes execution and completes its load-modify-store sequence, setting "constructed" to 1, then constructs "s" (a second time).
Thread A then proceeds to construct "t" as well (a second time) setting "constructed" (finally) to 3.
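The four steps above can be traced in code. The following is a sketch under stated assumptions: Trace and ReplaySequence are hypothetical names, construction is modeled as incrementing a counter, and the two threads are interleaved by hand on a single real thread.

```cpp
#include <cassert>

// Outcome of the replay: how often each object was "constructed",
// plus the final value of the shared flag byte.
struct Trace {
    int s_constructions = 0;
    int t_constructions = 0;
    unsigned constructed = 0;
};

Trace ReplaySequence() {
    Trace tr;

    // Thread A: tests bit 1, finds it clear, loads the old value (0),
    // and is pre-empted before it can store.
    bool a_constructs_s = !(tr.constructed & 1);
    unsigned a_loaded = tr.constructed;

    // Thread B: runs the whole function, constructing s and t.
    if (!(tr.constructed & 1)) { tr.constructed |= 1; ++tr.s_constructions; }
    if (!(tr.constructed & 2)) { tr.constructed |= 2; ++tr.t_constructions; }
    // constructed is now 3

    // Thread A resumes: stores its stale value with bit 1 set, wiping
    // out bit 2, then constructs s a second time.
    if (a_constructs_s) { tr.constructed = a_loaded | 1; ++tr.s_constructions; }

    // Thread A: bit 2 now looks clear, so t is constructed a second time.
    if (!(tr.constructed & 2)) { tr.constructed |= 2; ++tr.t_constructions; }
    return tr;
}
```

Both counters end at 2 and the flag byte ends at 3, matching the sequence described above.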
Now, you might think you can wrap the runtime initialization in a critical section:
    int ComputeSomething()
    {
        EnterCriticalSection(…);
        static int cachedResult = ComputeSomethingSlowly();
        LeaveCriticalSection(…);
        return cachedResult;
    }
After all, you’ve now placed the one-time initialization inside a critical section and made it thread-safe.
But what if the second call comes from within the same thread? ("We’ve traced the call; it’s coming from inside the thread!") This can happen if ComputeSomethingSlowly() itself calls ComputeSomething(), perhaps indirectly. Since that thread already owns the critical section, the code enters it just fine and you once again end up returning an uninitialized variable.
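The re-entrancy problem needs no second thread, so it can be shown directly. In this sketch std::recursive_mutex is an assumption standing in for a Win32 critical section (both let the owning thread re-enter), g_innerResult is a hypothetical variable that records what the nested call returned, and the static is written in the hand-expanded form from earlier in the article, because recursively re-entering the initialization of a real local static is undefined behavior in standard C++.

```cpp
#include <cassert>
#include <mutex>

std::recursive_mutex g_lock;  // stand-in for the critical section
int g_innerResult = -1;       // what the nested call returned

int ComputeSomething();

// Hypothetical slow computation that (indirectly, in real code) calls back in.
int ComputeSomethingSlowly() {
    g_innerResult = ComputeSomething();  // re-enters on the same thread
    return 42;
}

int ComputeSomething() {
    // The owning thread re-acquires the lock without blocking.
    std::lock_guard<std::recursive_mutex> hold(g_lock);

    // Hand-expanded form of:
    //   static int cachedResult = ComputeSomethingSlowly();
    static bool cachedResult_computed = false;
    static int cachedResult;  // zero until the slow result is stored
    if (!cachedResult_computed) {
        cachedResult_computed = true;
        cachedResult = ComputeSomethingSlowly();
    }
    return cachedResult;
}
```

The outer call returns 42, but the nested call sails through the lock, sees the flag already set, and returns 0, the not-yet-computed value.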
Conclusion: When you see runtime initialization of a local static variable, be very concerned.
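For readers on a C++11 or later toolchain, one way to get one-time initialization that is safe across threads is std::call_once. This goes beyond the 2004 article and is offered only as a sketch: g_slowCalls and CallFromThreads are hypothetical helpers added to observe the behavior, and ComputeSomethingSlowly is a stand-in that returns 42. C++11 also requires the language's own block-scope static initialization to make concurrent callers wait, but neither mechanism rescues the recursive case described above.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

std::atomic<int> g_slowCalls{0};  // counts runs of the slow computation

// Hypothetical stand-in for the article's slow computation.
int ComputeSomethingSlowly() { ++g_slowCalls; return 42; }

int ComputeSomething() {
    static std::once_flag once;   // constant-initialized, no runtime guard needed
    static int cachedResult = 0;  // constant-initialized as well
    // Exactly one caller runs the lambda; the others wait until it finishes.
    std::call_once(once, [] { cachedResult = ComputeSomethingSlowly(); });
    return cachedResult;
}

// Calls ComputeSomething() from several threads and sums the results.
int CallFromThreads(int threadCount) {
    std::vector<int> results(threadCount, 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < threadCount; ++i)
        threads.emplace_back([&results, i] { results[i] = ComputeSomething(); });
    for (auto& th : threads) th.join();
    int sum = 0;
    for (int r : results) sum += r;
    return sum;
}
```

Every thread sees the fully computed value and the slow function runs once. ComputeSomethingSlowly() must still not call back into ComputeSomething(), since re-entering call_once on the same flag from inside its own callable is not supported.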