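The union was originally designed so that values of different types could be managed in one and the same location. Some quantities really do come in more than one type, and that need is what gave rise to the union.
Saving space, and interpreting the same data in different ways, are just flexibility that falls out of this design. Many things, when first designed, were never expected to be used in so many flexible ways. Take Dream of the Red Chamber: Cao Xueqin probably did not think things through in such depth when he wrote it, and he probably was not aware of everything that today's Redology scholars have dug out of it.
Consider what would happen without union: we would have to define several separate variables to hold this one quantity, which wastes space and complicates management.
Now look at interpreting the same data in different ways. This is more of a hacker-level use than the normal use of a union. Using a union comes with the following restriction: the type read must be the type most recently stored, otherwise the result depends on the implementation. Clearly, interpreting the same data in different ways violates this restriction and lands the code in implementation-defined territory. The endianness test works precisely by exploiting the different behavior under different byte orders in order to tell them apart. This illustrates a point: even implementation-defined behavior has its uses; at the very least it can be used to distinguish between implementations.
Of course, there is no denying that over time union has developed many new idioms. Also, if there is only one target platform and portability across implementations is not a concern, writing code that relies on implementation-defined behavior is acceptable. For example: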
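The endianness test mentioned above can be sketched roughly as follows. This is only a minimal illustration, not code from any particular project: the names EndianProbe and is_little_endian are hypothetical, and the sketch assumes uint32_t is available on the target.

```c
#include <stdint.h>

/* Hypothetical probe type: the same storage viewed as one 32-bit word
   or as its individual bytes. */
union EndianProbe {
    uint32_t word;
    unsigned char bytes[sizeof(uint32_t)];
};

/* Returns 1 if the lowest-addressed byte holds the least significant
   byte of the word (little-endian), 0 otherwise. */
int is_little_endian(void)
{
    union EndianProbe probe;
    probe.word = 1u;             /* store through the integer member */
    return probe.bytes[0] == 1;  /* read back through the byte member */
}
```

Reading through unsigned char is the safest form of this trick, since every object can be inspected as a sequence of bytes; which value the bytes hold is exactly the implementation-specific part being tested.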
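// PLL control register bit definitions:
struct PLLCR_BITS {        // bits  description
    Uint16 DIV:4;          // 3:0   Set clock ratio for the PLL
    Uint16 rsvd1:12;       // 15:4  reserved
};
union PLLCR_REG {
    Uint16 all;                // whole-register view
    struct PLLCR_BITS bit;     // bit-field view
};
union PLLCR_REG pll_reg;   // in C the union keyword is required here
pll_reg.all = 0x000A;      // write the whole register at once
pll_reg.bit.DIV = 0x0A;    // or write only the DIV field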
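The union provides whole-register access through all, while bit provides access to individual bit fields.
Now look at the following example, which uses the convenience of a union to store different packet types.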
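For comparison, when the code has to run on more than one compiler, the same DIV field can be reached with masks instead of bit-fields. The following is only a sketch under assumptions: the helper names and the PLLCR_DIV_MASK macro are hypothetical, and the field position (bits 3:0) is taken from the comments in the register definition above.

```c
#include <stdint.h>

/* Assumed layout, taken from the register comments: DIV occupies bits 3:0. */
#define PLLCR_DIV_MASK 0x000Fu

/* Hypothetical helper: returns reg with its DIV field replaced by div. */
uint16_t pllcr_set_div(uint16_t reg, unsigned div)
{
    return (uint16_t)((reg & ~PLLCR_DIV_MASK) | (div & PLLCR_DIV_MASK));
}

/* Hypothetical helper: extracts the DIV field from reg. */
unsigned pllcr_get_div(uint16_t reg)
{
    return reg & PLLCR_DIV_MASK;
}
```

This version does not depend on how the compiler orders bit-fields inside a storage unit, at the cost of being less readable than pll_reg.bit.DIV.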
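struct Ctrl1Packet {
    double a;
    // ×××× (other members elided)
};
struct Ctrl2Packet {
    float a;
    // ×××× (other members elided)
};
struct Ctrl3Packet {
    char* s;
    // ×××× (other members elided)
};
struct Packet {
    int type;                  // tag: which member of pack is currently valid
    union CtrlPacket {
        struct Ctrl1Packet p1;
        struct Ctrl2Packet p2;
        struct Ctrl3Packet p3;
    } pack;                    // C does not allow a typedef inside a struct, so the member is declared directly
};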
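How the type field is meant to be used together with the union can be sketched as below. This is a simplified, hypothetical version: the enum PacketType, the function describe_packet, and the reduced one-member packet structs are assumptions made for illustration, since the real packet members are elided above.

```c
#include <stdio.h>

/* Hypothetical tag values; the original only shows an int type field. */
enum PacketType { PACKET_CTRL1, PACKET_CTRL2, PACKET_CTRL3 };

/* Simplified stand-ins for the packet structs above. */
struct Ctrl1Packet { double a; };
struct Ctrl2Packet { float a; };
struct Ctrl3Packet { const char *s; };

struct Packet {
    enum PacketType type;          /* says which member of pack is valid */
    union CtrlPacket {
        struct Ctrl1Packet p1;
        struct Ctrl2Packet p2;
        struct Ctrl3Packet p3;
    } pack;
};

/* Writes a short description of the active member into buf.
   Returns what snprintf returns, or -1 for an unknown tag. */
int describe_packet(const struct Packet *p, char *buf, size_t len)
{
    switch (p->type) {
    case PACKET_CTRL1:
        return snprintf(buf, len, "ctrl1 a=%.1f", p->pack.p1.a);
    case PACKET_CTRL2:
        return snprintf(buf, len, "ctrl2 a=%.1f", (double)p->pack.p2.a);
    case PACKET_CTRL3:
        return snprintf(buf, len, "ctrl3 s=%s", p->pack.p3.s);
    }
    return -1;
}
```

Because the reader only ever touches the member named by type, this usage stays within the "read what was last stored" rule and does not rely on any implementation-specific behavior.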
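Also note that the standard no longer restricts accessing the same data through a different type; the result depends entirely on the implementation, that is, on how the values are represented: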
from C90 to C99 was to remove any restriction on accessing one member of a union when the last store was to a different one. The rationale was that the behaviour would then depend on the representations of the values. Since this point is often misunderstood, it might well be worth making it clear in the Standard.
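Finally, looking at the C standard: although the newer standard has removed the restriction on accessing the same data through different types, it still provides the following guarantee for structures in a union whose leading members have compatible types.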
[#5] One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the complete type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.
This rule mainly provides a safety guarantee for accessing a union through different structure types.
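A minimal sketch of what this guarantee permits is shown below. The names Header1, Header2, Message, and message_type are hypothetical and chosen only for illustration.

```c
/* Two structures whose first member has the same type: they share a
   common initial sequence consisting of the member type. */
struct Header1 { int type; double value; };
struct Header2 { int type; const char *text; };

union Message {
    struct Header1 num;
    struct Header2 str;
};

/* The complete type of union Message is visible here, so type may be
   inspected through num even when str was the member last stored. */
int message_type(const union Message *m)
{
    return m->num.type;
}
```

Only the shared leading part is covered: reading m->num.value after storing through str is outside the guarantee.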
union:http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_257.htm
The document linked above proposes an improvement to this rule, mainly for the following two reasons:
1. The implementation may put padding between structure members. This rule is necessary to ensure that the common initial sequence uses the same padding in both places, so that the corresponding members occupy the same location.
2. If we consider part of the second example in 6.5.2.3#8:
struct t1 { int m; };
struct t2 { int m; };
int f(struct t1 * p1, struct t2 * p2)
{
if (p1->m < 0)
p2->m = -p2->m;
return p1->m;
}
the rule is necessary for an implementation to realize that p1 and p2 might refer the same location.
Suggested improvement:
To address the wider point about visibility, change the first part of 6.5.2.3#5 to read:
[#5] One special guarantee is made in order to simplify the use of unions: if several structure types share a common initial sequence (see below), then corresponding members are required to lie at the same offset from the start of the union. Therefore if a union contains two or more such structures, the common initial part may be inspected using any of them, no matter which one was used to store the value.
To address issues about "similar" types raised in point (2) above, change the second part of #5 to read:
Two structures share a common initial sequence if corresponding members have matching types for a sequence of one or more initial members. Two types, in turn, are matching if they are:
compatible types (and, for bit-fields, the same widths)
signed and unsigned versions of the same integer type
qualified or unqualified versions of matching types, or
pointers to matching types.