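I have recently been working on a data format analysis and conversion project. It was my first contact with low-level binary storage and I was completely lost. After reading this post I understood much more about how data is stored on Windows systems, so I am sharing the original text here:
Original post: http://www.cppblog.com/aaxron/archive/2011/12/03/161347.html
Memory layout (storage format) of the float and double types in C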
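In terms of storage structure and algorithm, double and float work the same way. The only difference is that float is 32 bits wide and double is 64 bits wide, so double can store values with higher precision.
All data in memory is stored sequentially in binary (0s and 1s); each 1 or 0 is called a bit, and on x86 CPUs one byte is 8 bits. For example, if a 16-bit (2-byte) short int variable holds the value 1000, its binary representation is: 00000011 11101000. Because Intel CPUs are little-endian, the bytes are stored in reverse order, so in memory it should look like this: 11101000 00000011. That is the in-memory layout of the fixed-point number 1000.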
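The byte order described above can be checked with a small sketch. The helper name bytes_of_short is made up for this illustration, and the result assumes a little-endian machine such as x86; on a big-endian machine the two bytes would come out swapped.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy the in-memory bytes of a 16-bit integer,
// lowest address first, so the storage order can be inspected.
std::array<unsigned char, 2> bytes_of_short(std::int16_t value) {
    std::array<unsigned char, 2> out{};
    std::memcpy(out.data(), &value, sizeof value);
    return out;
}
```

Calling bytes_of_short(1000) on x86 should give 0xE8 (11101000) at the lower address and 0x03 (00000011) after it.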
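Mainstream C/C++ compilers currently follow the floating-point representation defined by the IEEE (IEEE 754) for float and double arithmetic, although the language standards themselves do not strictly require it.
This format is a kind of scientific notation that uses a sign, an exponent and a mantissa, with the base fixed at 2. In other words, a floating-point number is represented as the mantissa multiplied by 2 raised to the exponent, with the sign attached.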
The concrete specifications are as follows (all widths in bits):
Type                        Sign  Exponent  Mantissa  Total
float                       1     8         23        32
double                      1     11        52        64
extended (temporary real)   1     15        64        80
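Since C compilers normally treat floating-point constants as double by default, double is used as the example below: 64 bits in total, which is 8 bytes.
From the highest bit to the lowest these are bits 63, 62, 61, ..., 0. Bit 63, the highest, is the sign bit: 1 means the number is negative, 0 means positive. Bits 62-52, 11 bits in total, are the exponent. Bits 51-0, 52 bits in total, are the mantissa.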
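The three fields can be pulled apart with shifts and masks. The following is only a minimal sketch: the names DoubleFields and split_double are invented for this example, and it assumes the platform's double is the 64-bit IEEE 754 format described above.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical container for the three fields of a 64-bit double.
struct DoubleFields {
    std::uint64_t sign;      // bit 63
    std::uint64_t exponent;  // bits 62-52, still biased by 1023
    std::uint64_t mantissa;  // bits 51-0, without the hidden leading 1
};

DoubleFields split_double(double value) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch expects a 64-bit double");
    std::uint64_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);  // reinterpret the bytes safely

    DoubleFields fields;
    fields.sign = bits >> 63;
    fields.exponent = (bits >> 52) & 0x7FF;          // 11-bit mask
    fields.mantissa = bits & 0xFFFFFFFFFFFFFULL;     // 52-bit mask
    return fields;
}
```

For instance, split_double(1.0) has sign 0, biased exponent 1023 and mantissa 0, because 1.0 is exactly 1.0 times 2 to the power 0.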
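Following the IEEE floating-point representation, we will now convert the double value 38414.4 into hexadecimal.
Handle the integer part and the fractional part separately. The integer part converts directly to hexadecimal: 960E. The fractional part: 0.4 = 0.5*0 + 0.25*1 + 0.125*1 + 0.0625*0 + ... In fact this expansion never terminates (the bit pattern 0110 repeats forever)! This is the well-known floating-point precision problem. So we only compute until, together with the integer part, we have 53 significant bits, rounding the last one (hidden-bit technique: the leading 1 is not written to memory).
If you are patient enough to work out all 53 bits by hand, the result should be: 38414.4(10) = 1001011000001110.0110011001100110011001100110011001101(2), where the final bit has been rounded up.
In scientific notation this is 1.001... times 2 to the power 15. The exponent is 15! Now look at the exponent field: it has 11 bits, and for normalized numbers the exponent ranges from -1022 to 1023. Because the exponent can be negative, a bias of 1023 is always added first to simplify handling; here, 15+1023=1038.
In binary this is 100 00001110. Sign bit: positive, so 0. Putting everything together (dropping the leading 1 of the mantissa): 01000000 11100010 11000001 11001100 11001100 11001100 11001100 11001101. Stored in reversed byte order, the hexadecimal bytes are: CD CC CC CC CC C1 E2 40.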
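The hand calculation can be compared against what the machine actually stores. This is a sketch only: double_bytes_hex is a name made up here, and the byte order it reports is that of the host (reversed, as described, on little-endian x86).

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: format the bytes of a double in memory order
// (lowest address first) as space-separated uppercase hex.
std::string double_bytes_hex(double value) {
    unsigned char bytes[sizeof(double)];
    std::memcpy(bytes, &value, sizeof bytes);

    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < sizeof bytes; ++i) {
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(bytes[i]));
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}
```

Passing 38414.4 to double_bytes_hex and printing the returned string gives the stored bytes, which can be checked one by one against the sign, exponent and mantissa worked out by hand.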
memset usage and caveats
Required header: <memory.h> or <string.h>
Function prototype: void *memset(void *s, int ch, size_t n);
Example programs:
1.
unsigned char (*p)[40]=new unsigned char [100][40];
memset(p,1,100*40);
2.
float (*p)[40]=new float[100][40];
memset(p,1,100*40);
3.
float (*p)[40]=new float[100][40];
memset(p,0,100*40);
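Program 1 achieves the intended result, but programs 2 and 3 do not. Why?
Because memset assigns one byte at a time: every byte is filled with the value 1 (the character whose ASCII code is 1), which in binary is 00000001 and occupies one byte. A float element is 4 bytes, so together they become 00000001 00000001 00000001 00000001 (0x01010101), which read as a float is roughly 2.37e-38, clearly not the 1.0 we wanted. Program 3 fails for a different reason: it does not clear all of the memory, since 100*40 bytes cover only a quarter of the 100*40*sizeof(float) bytes in the array (program 2 has this size problem as well).
So using memset to assign initial values to non-character arrays is not advisable! Zeroing the memory, however, does work, and the correct way to do it is shown in 4: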
4.
float (*p)[40]=new float[100][40];
memset(p,0,100*40*sizeof(float));
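The behaviour of the four programs can be reproduced with a short sketch. All function names below are invented for this example, and it assumes a 4-byte IEEE 754 float; std::fill is shown as one common alternative when a nonzero initial value is wanted, not as the original author's method.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Bit pattern a float ends up with after memset(..., 1, ...):
// every one of its four bytes becomes 0x01.
std::uint32_t float_bits_after_memset_one() {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch expects a 4-byte float");
    float f = 0.0f;
    std::memset(&f, 1, sizeof f);
    std::uint32_t bits = 0;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Zeroing works when the byte count includes sizeof(float), as in program 4.
bool zeroed_by_memset(std::size_t count) {
    std::vector<float> v(count, 3.5f);
    if (!v.empty()) {
        std::memset(v.data(), 0, v.size() * sizeof(float));
    }
    return std::all_of(v.begin(), v.end(), [](float x) { return x == 0.0f; });
}

// For a nonzero value, assign element by element instead of byte by byte.
bool filled_with_one(std::size_t count) {
    std::vector<float> v(count);
    std::fill(v.begin(), v.end(), 1.0f);
    return std::all_of(v.begin(), v.end(), [](float x) { return x == 1.0f; });
}
```

The first function makes the failure of program 2 visible as a bit pattern, while the other two show the two safe cases: zeroing with the full byte count, and element-wise assignment for any other value.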
For more details, see the reference: http://baike.baidu.com/view/982208.htm
Is it legal to use memset(,0,) on array of doubles?


7 Answers
[10 votes] The C99 standard Annex F says:
And, immediately after,
So, if IEC 60559 is basically IEEE 754-1985 and this specifies that 8 zero bytes mean 0.0 (as @David Heffernan said), this means that if you find


[8 votes] If you are talking about IEEE754 then the standard defines +0.0 to double precision as 8 zero bytes. If you know that you are backed by IEEE754 floating point then this is well-defined. As for Intel, I can't think of a compiler that doesn't use IEEE754 on Intel x86/x64.
[3 votes] David Heffernan has given a good answer for part (2) of your question. For part (1): The C99 standard makes no guarantees about the representation of floating-point values in the general case. §6.2.6.1 says:
...and that subclause makes no further mention of floating point. You said:
Indeed - there is a difference between "undefined behaviour", "unspecified behaviour" and "implementation-defined behaviour":
and so, as floating point representation is unspecified behaviour, it can vary in an undocumented manner from platform to platform (where "platform" here means "the combination of hardware and compiler" rather than just "hardware"). (I'm not sure how useful the guarantee that a
[2 votes] As Matteo Italia says, that's legal according to the standard, but I wouldn't use it. Something like is at least twice faster.
[0 votes] Well, I think the zeroing is "legal" (after all, it's zeroing a regular buffer), but I have no idea if the standard lets you assume anything about the resulting logical value. My guess would be that the C standard leaves it as undefined.
[0 votes] Even though it is unlikely that you encounter a machine where this has problems, you may also avoid this relatively easily if you are really talking of arrays as you indicate in the question title, and if these arrays are of known length at compile time (that is not VLA), then just initializing them is probably even more convenient: should always work. If you'd have to zero such an array again, later, and your compiler is compliant to modern C (C99) you can do this with a compound literal on any modern compiler this should be as efficient as
[0 votes] It's "legal" to use memset. The issue is whether it produces a bit pattern where array[x] == 0.0 is true. While the basic C standard doesn't require that to be true, I'd be interested in hearing examples where it isn't! It appears memset is equivalent to 0.0 on IBM-AIX, HP-UX (PARISC), HP-UX (IA-64), Linux (IA-64, I think).
__STDC_IEC_559__
you can be sure that it's IEC 60559 aka IEEE 754. – Matteo Italia Jan 7 '11 at 23:29
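The point made in the answers, that zeroing with memset is only guaranteed to give 0.0 when the floating-point format is IEC 60559, could be expressed in code roughly as follows. This is a sketch under assumptions, not anything taken from the answers: the function name zero_doubles is invented, and it uses the C++ trait std::numeric_limits<double>::is_iec559 as a stand-in for the C macro __STDC_IEC_559__.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <limits>

// Hypothetical helper: zero an array of doubles, using memset only when
// the implementation reports an IEC 60559 (IEEE 754) double, where
// all-zero bytes represent +0.0.
void zero_doubles(double* data, std::size_t count) {
    if (count == 0) {
        return;  // nothing to do, and avoids touching a possibly null pointer
    }
    if (std::numeric_limits<double>::is_iec559) {
        std::memset(data, 0, count * sizeof(double));
    } else {
        // Portable fallback: assign the value 0.0 element by element.
        std::fill(data, data + count, 0.0);
    }
}
```

In practice a plain std::fill is usually enough on its own, since compilers commonly turn it into the same block-zeroing code; the branch is kept here only to make the condition explicit.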