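Class file structure: study notes
Section 1 summarizes the first half of Chapter 6 of 《深入理解Java虚拟机》 (Understanding the Java Virtual Machine in Depth); only Section 2 is original. Its goal is to parse the contents of a class file according to the specification in Section 1. I stopped after parsing the constant pool; turning the binary representation of floating-point numbers into decimal took a while.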
1 Structure of the Class file
Class file format:
Type | Name | Count |
---|---|---|
u4 | magic | 1 |
u2 | minor_version | 1 |
u2 | major_version | 1 |
u2 | constant_pool_count | 1 |
cp_info | constant_pool | constant_pool_count - 1 |
u2 | access_flags | 1 |
u2 | this_class | 1 |
u2 | super_class | 1 |
u2 | interfaces_count | 1 |
u2 | interfaces | interfaces_count |
u2 | fields_count | 1 |
field_info | fields | fields_count |
u2 | methods_count | 1 |
method_info | methods | methods_count |
u2 | attributes_count | 1 |
attribute_info | attributes | attributes_count |
1.1 Magic number and Class file version
Magic number: CAFEBABE
Major version: 50 corresponds to JDK 6, 51 to JDK 7
Version: major version followed by minor version, e.g. 52.0
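The mapping from major version to Java release can be sketched as below. This is a minimal illustration, not part of the original notes: the class name `ClassFileVersion` and its methods are hypothetical, and the rule "release = major - 44" is assumed to hold only for major versions 49 and above (49 is Java 5, 50 is Java 6, 52 is Java 8).

```java
final class ClassFileVersion {
    // Assumption: for major >= 49 the Java release number is major - 44.
    static int javaRelease(int major) {
        if (major < 49) {
            throw new IllegalArgumentException("older naming scheme (JDK 1.x): " + major);
        }
        return major - 44;
    }

    // Formats the version the way the notes write it: major.minor
    static String format(int major, int minor) {
        return major + "." + minor;
    }

    public static void main(String[] args) {
        System.out.println(format(52, 0) + " -> Java " + javaRelease(52));
    }
}
```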
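1.2 Constant pool
- Constant pool indices start at 1. For example, a constant_pool_count of `00 16` is 22 in decimal, which means the pool holds 21 constants with indices 1 to 21.
- It mainly stores literals and symbolic references.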
Constant pool entry types:
Name | Tag | Description |
---|---|---|
CONSTANT_Utf8_info | 1 | UTF-8 encoded string |
CONSTANT_Integer_info | 3 | int literal |
CONSTANT_Float_info | 4 | float literal |
CONSTANT_Long_info | 5 | long literal |
CONSTANT_Double_info | 6 | double literal |
CONSTANT_Class_info | 7 | Symbolic reference to a class or interface |
CONSTANT_String_info | 8 | String literal |
CONSTANT_Fieldref_info | 9 | Symbolic reference to a field |
CONSTANT_Methodref_info | 10 | Symbolic reference to a method of a class |
CONSTANT_InterfaceMethodref_info | 11 | Symbolic reference to a method of an interface |
CONSTANT_NameAndType_info | 12 | Partial symbolic reference to a field or method |
CONSTANT_MethodHandle_info | 15 | A method handle |
CONSTANT_MethodType_info | 16 | A method type |
CONSTANT_Dynamic_info | 17 | A dynamically computed constant |
CONSTANT_InvokeDynamic_info | 18 | A dynamically computed call site |
CONSTANT_Module_info | 19 | A module |
CONSTANT_Package_info | 20 | A package opened or exported by a module |
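Structure of a CONSTANT_Class_info constant:
Type | Name | Count |
---|---|---|
u1 | tag | 1 |
u2 | name_index | 1 |
tag is the flag that distinguishes the constant type; name_index is a constant pool index pointing to a CONSTANT_Utf8_info constant that holds the fully qualified name.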
Structure of a CONSTANT_Utf8_info constant:
Type | Name | Count |
---|---|---|
u1 | tag | 1 |
u2 | length | 1 |
u1 | bytes | length |
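- Strings are stored in an abbreviated (modified) UTF-8 encoding: characters from `\u0001` to `\u007f` take one byte, `\u0080` to `\u07ff` take two bytes, and `\u0800` to `\uffff` take three bytes.
- Because length is a u2, bytes can hold at most 65535 bytes, which is also the maximum length of a Java method or field name.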
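The byte ranges above can be turned into a small length calculator. This is only a sketch under stated assumptions: the class `ModifiedUtf8` and its method names are made up for illustration, and it additionally assumes the modified UTF-8 rule that `\u0000` is written as two bytes.

```java
final class ModifiedUtf8 {
    // Number of bytes one UTF-16 char occupies in a CONSTANT_Utf8_info entry.
    static int encodedLength(char c) {
        if (c >= 0x0001 && c <= 0x007F) {
            return 1;
        }
        if (c <= 0x07FF) {
            return 2; // also covers '\u0000', which modified UTF-8 stores in two bytes
        }
        return 3;
    }

    // Total encoded length of a string; it must fit in the u2 length field.
    static int encodedLength(String s) {
        int total = 0;
        for (int i = 0; i < s.length(); i++) {
            total += encodedLength(s.charAt(i));
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(encodedLength("add"));
    }
}
```

A name is representable only if `encodedLength(name)` does not exceed 65535.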
Using javap:
javap -verbose TestClass
Summary of the structures of the 17 constant pool types (the Description column gives the tag value):
Name | Item | Type | Description |
---|---|---|---|
CONSTANT_Utf8_info | tag | u1 | 1 |
| | length | u2 | |
| | bytes | u1 | |
CONSTANT_Integer_info | tag | u1 | 3 |
| | bytes | u4 | |
CONSTANT_Float_info | tag | u1 | 4 |
| | bytes | u4 | |
CONSTANT_Long_info | tag | u1 | 5 |
| | bytes | u8 | |
CONSTANT_Double_info | tag | u1 | 6 |
| | bytes | u8 | |
CONSTANT_Class_info | tag | u1 | 7 |
| | index | u2 | |
CONSTANT_String_info | tag | u1 | 8 |
| | index | u2 | |
CONSTANT_Fieldref_info | tag | u1 | 9 |
| | index | u2 | |
| | index | u2 | |
CONSTANT_Methodref_info | tag | u1 | 10 |
| | index | u2 | |
| | index | u2 | |
CONSTANT_InterfaceMethodref_info | tag | u1 | 11 |
| | index | u2 | |
| | index | u2 | |
CONSTANT_NameAndType_info | tag | u1 | 12 |
| | index | u2 | |
| | index | u2 | |
CONSTANT_MethodHandle_info | tag | u1 | 15 |
| | reference_kind | u1 | |
| | reference_index | u2 | |
CONSTANT_MethodType_info | tag | u1 | 16 |
| | descriptor_index | u2 | |
CONSTANT_Dynamic_info | tag | u1 | 17 |
| | bootstrap_method_attr_index | u2 | |
| | name_and_type_index | u2 | |
CONSTANT_InvokeDynamic_info | tag | u1 | 18 |
| | bootstrap_method_attr_index | u2 | |
| | name_and_type_index | u2 | |
CONSTANT_Module_info | tag | u1 | 19 |
| | name_index | u2 | |
CONSTANT_Package_info | tag | u1 | 20 |
| | name_index | u2 | |
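1.3 Access flags
The 2 bytes immediately after the constant pool are the access flags (access_flags), which identify access information at the class or interface level.
Flag name | Value | Meaning |
---|---|---|
ACC_PUBLIC | 0x0001 | Declared public |
ACC_FINAL | 0x0010 | Declared final; only classes can set this |
ACC_SUPER | 0x0020 | Whether the new semantics of the invokespecial bytecode instruction are allowed |
ACC_INTERFACE | 0x0200 | Marks an interface |
ACC_ABSTRACT | 0x0400 | Declared abstract; true for interfaces and abstract classes, false for other types |
ACC_SYNTHETIC | 0x1000 | Marks a class that was not generated from user code |
ACC_ANNOTATION | 0x2000 | Marks an annotation |
ACC_ENUM | 0x4000 | Marks an enum |
ACC_MODULE | 0x8000 | Marks a module |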
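Because each flag is a single bit, an access_flags value can be decoded by testing it against every mask in the table. The sketch below illustrates this; the class `ClassAccessFlags` and its `decode` method are hypothetical names, not part of the parser in these notes.

```java
import java.util.ArrayList;
import java.util.List;

final class ClassAccessFlags {
    // Masks and names copied from the class-level access flag table.
    private static final int[] MASKS = {
        0x0001, 0x0010, 0x0020, 0x0200, 0x0400, 0x1000, 0x2000, 0x4000, 0x8000
    };
    private static final String[] NAMES = {
        "ACC_PUBLIC", "ACC_FINAL", "ACC_SUPER", "ACC_INTERFACE", "ACC_ABSTRACT",
        "ACC_SYNTHETIC", "ACC_ANNOTATION", "ACC_ENUM", "ACC_MODULE"
    };

    // Returns the names of all flags whose bit is set in accessFlags.
    static List<String> decode(int accessFlags) {
        List<String> set = new ArrayList<>();
        for (int k = 0; k < MASKS.length; k++) {
            if ((accessFlags & MASKS[k]) != 0) {
                set.add(NAMES[k]);
            }
        }
        return set;
    }

    public static void main(String[] args) {
        System.out.println(decode(0x0021)); // prints [ACC_PUBLIC, ACC_SUPER]
    }
}
```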
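1.4 Class index, superclass index and interface index collection
The class index (this_class) and superclass index (super_class) are each a u2 value; the interface index collection (interfaces) is a set of u2 values, ordered left to right as they appear after the implements keyword.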
1.5 Field table collection
The field table (field_info) describes the variables declared in an interface or class.
Field table structure:
Name | Type | Count |
---|---|---|
access_flags | u2 | 1 |
name_index | u2 | 1 |
descriptor_index | u2 | 1 |
attributes_count | u2 | 1 |
attributes | attribute_info | attributes_count |
Field access flags:
Flag name | Value | Meaning |
---|---|---|
ACC_PUBLIC | 0x0001 | Field is public |
ACC_PRIVATE | 0x0002 | Field is private |
ACC_PROTECTED | 0x0004 | Field is protected |
ACC_STATIC | 0x0008 | Field is static |
ACC_FINAL | 0x0010 | Field is final |
ACC_VOLATILE | 0x0040 | Field is volatile |
ACC_TRANSIENT | 0x0080 | Field is transient |
ACC_SYNTHETIC | 0x1000 | Field was generated automatically by the compiler |
ACC_ENUM | 0x4000 | Field is an enum constant |
- At most one of ACC_PUBLIC, ACC_PRIVATE and ACC_PROTECTED may be set.
- ACC_FINAL and ACC_VOLATILE may not both be set.
- Fields in an interface must have the ACC_PUBLIC, ACC_STATIC and ACC_FINAL flags.
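name_index and descriptor_index both refer to constant pool entries; they give the field's simple name and the field's descriptor, respectively (methods use descriptors in the same way).
Meaning of the descriptor characters:
Character | Meaning |
---|---|
B | byte |
C | char |
D | double |
F | float |
I | int |
J | long |
S | short |
Z | boolean |
V | The special type void |
L | Object type, e.g. Ljava/lang/Object; |
- Arrays are marked with one `[` per dimension: `java.lang.String[][]` is recorded as `[[Ljava/lang/String;` and `int[]` is recorded as `[I`.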
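To make the descriptor rules concrete, here is a small sketch that converts a field descriptor back into a Java type name. The class `FieldDescriptor` and the method `toJavaType` are illustrative names only; the sketch handles field descriptors and does not attempt method descriptors.

```java
final class FieldDescriptor {
    // Converts e.g. "[[Ljava/lang/String;" to "java.lang.String[][]".
    static String toJavaType(String descriptor) {
        int dims = 0;
        while (dims < descriptor.length() && descriptor.charAt(dims) == '[') {
            dims++; // one '[' per array dimension
        }
        String element = descriptor.substring(dims);
        String base;
        if (element.length() == 1) {
            switch (element.charAt(0)) {
                case 'B': base = "byte"; break;
                case 'C': base = "char"; break;
                case 'D': base = "double"; break;
                case 'F': base = "float"; break;
                case 'I': base = "int"; break;
                case 'J': base = "long"; break;
                case 'S': base = "short"; break;
                case 'Z': base = "boolean"; break;
                default: throw new IllegalArgumentException("bad descriptor: " + descriptor);
            }
        } else if (element.startsWith("L") && element.endsWith(";")) {
            // Object type: strip 'L' and ';', then turn '/' into '.'
            base = element.substring(1, element.length() - 1).replace('/', '.');
        } else {
            throw new IllegalArgumentException("bad descriptor: " + descriptor);
        }
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < dims; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toJavaType("[[Ljava/lang/String;")); // prints java.lang.String[][]
    }
}
```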
2 Parsing the Class file
The ultimate goal of the program below is to read a .class file and convert it into a .java file.
2.1 Java source file
After compilation, the class file for the Java source below is located at /Users/z8g/Test/build/classes/HelloWorld.class

    public class HelloWorld {
        private final int age = 19971015;
        private final float len = -25.125f;
        private final double d = -25.125;

        public int add(int i) {
            return i + 1;
        }
    }
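2.2 UnsignedByte: an unsigned byte class
The bytes of a class file need to be converted to unsigned values, which can be done with the following formula:

    byte b;
    short value = (short) ((short) b & 0xFF);

The unsigned byte class is implemented as follows:

    class UnsignedByte {
        private short value;   // unsigned value, 0..255
        private byte rawValue; // original signed byte

        private UnsignedByte() {
        }

        public static UnsignedByte from(byte b) {
            UnsignedByte ub = new UnsignedByte();
            ub.rawValue = b;
            ub.value = (short) ((short) b & 0xFF);
            return ub;
        }
    }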
On top of this, a few helper methods are added:

    public static UnsignedByte[] from(byte[] bytes) {
        UnsignedByte[] result = new UnsignedByte[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            result[i] = from(bytes[i]);
        }
        return result;
    }
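Getting the value:

    public short value() {
        return value;
    }

    // Combines one or two unsigned bytes (high byte first) into one value.
    public static long value(UnsignedByte[] bytes) {
        if (bytes.length == 1) {
            return bytes[0].value(); // index 0, not 1: the array has a single element
        }
        if (bytes.length == 2) {
            // The high byte is shifted by 8 bits (one byte), not 32.
            return (bytes[0].value() << 8) + bytes[1].value();
        }
        throw new IllegalArgumentException();
    }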
Converting back to byte:

    public static byte[] to(UnsignedByte[] b) {
        byte[] result = new byte[b.length];
        for (int i = 0; i < result.length; i++) {
            result[i] = (byte) (b[i].rawValue & 0xFF);
        }
        return result;
    }

    public static byte to(UnsignedByte b) {
        return b.rawValue;
    }

    public static byte to(short i) {
        return (byte) i;
    }
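As a point of comparison, since Java 8 the standard library offers `Byte.toUnsignedInt`, which performs the same `b & 0xFF` conversion. The sketch below shows how u1 and u2 values could be obtained without a wrapper class; `UnsignedBytes`, `u1` and `u2` are hypothetical names chosen to match the type names in Section 1.

```java
final class UnsignedBytes {
    // u1: one unsigned byte, 0..255
    static int u1(byte b) {
        return Byte.toUnsignedInt(b);
    }

    // u2: two bytes, high byte first, 0..65535
    static int u2(byte high, byte low) {
        return (u1(high) << 8) | u1(low);
    }

    public static void main(String[] args) {
        System.out.println(u1((byte) 0xCA));             // prints 202
        System.out.println(u2((byte) 0x00, (byte) 0x24)); // prints 36
    }
}
```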
2.3 File byte stream

    String classFilePath = "/Users/z8g/Test/build/classes/HelloWorld.class";
    File classFile = new File(classFilePath);
    FileInputStream fis = new FileInputStream(classFile);

With the FileInputStream instance `fis` in hand, the following sections call `fis.read` repeatedly to read the class file contents into byte arrays.
2.4 Reading the magic number
The magic number takes 4 bytes, so declare a 4-byte array, read into it, and convert the result to an unsigned byte array.

    // Read the [magic number]
    byte[] magic = new byte[4];
    fis.read(magic);
    UnsignedByte[] magicUB = UnsignedByte.from(magic);
    System.out.printf("[magic] %02x%02x %02x%02x\n",
            magicUB[0].value(), magicUB[1].value(),
            magicUB[2].value(), magicUB[3].value());

Console output:

    [magic] cafe babe
2.5 Reading the minor and major version
Keep calling `read` on fis. The minor and major version each take 2 bytes, so they are read the same way.

    // Read the [minor version] and [major version]
    byte[] minor_version = new byte[2];
    byte[] major_version = new byte[2];
    fis.read(minor_version);
    fis.read(major_version);
    UnsignedByte[] minor_versionUB = UnsignedByte.from(minor_version);
    UnsignedByte[] major_versionUB = UnsignedByte.from(major_version);
    System.out.printf("[minor_version] %02x%02x\n",
            minor_versionUB[0].value(), minor_versionUB[1].value());
    System.out.printf("[major_version] %02x%02x\n",
            major_versionUB[0].value(), major_versionUB[1].value());

Console output:

    [minor_version] 0000
    [major_version] 0034

`0034` is hexadecimal, which is 52 in decimal; the minor version is 0, so the version is 52.0, which indicates that the class file was compiled by JDK 8.
2.6 Reading the constant pool count

    // Read the [constant pool count]
    byte[] constant_pool_count = new byte[2];
    fis.read(constant_pool_count);
    UnsignedByte[] constant_pool_countUB = UnsignedByte.from(constant_pool_count);
    System.out.printf("[constant_pool_count] %02x %02x\n",
            constant_pool_countUB[0].value(), constant_pool_countUB[1].value());
    long constantPoolCount = UnsignedByte.value(constant_pool_countUB);
    System.out.printf("(constantPoolCount: %d)\n", constantPoolCount);

Console output:

    [constant_pool_count] 00 24
    (constantPoolCount: 36)

The constant pool index range is [1, constant_pool_count - 1], so the length of this constant pool is 35.
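The header fields read so far can also be read with `java.io.DataInputStream`, which decodes big-endian values directly and throws `EOFException` on a short read, whereas `InputStream.read(byte[])` is not guaranteed to fill the array. The sketch below is an alternative to the approach in these notes, not the author's code; `HeaderReader` and `readHeader` are hypothetical names, and the bytes in `main` are the header values printed above.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class HeaderReader {
    // Returns {magic, minor_version, major_version, constant_pool_count}.
    static int[] readHeader(byte[] classBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            int magic = in.readInt();             // u4
            int minor = in.readUnsignedShort();   // u2
            int major = in.readUnsignedShort();   // u2
            int cpCount = in.readUnsignedShort(); // u2
            return new int[] {magic, minor, major, cpCount};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] header = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, // magic
            0x00, 0x00,                                         // minor_version
            0x00, 0x34,                                         // major_version
            0x00, 0x24                                          // constant_pool_count
        };
        int[] h = readHeader(header);
        System.out.printf("magic=%08x minor=%d major=%d constant_pool_count=%d%n",
                h[0], h[1], h[2], h[3]);
    }
}
```

For a real file, the byte array could be replaced by wrapping a `FileInputStream` in the `DataInputStream`.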
2.7 Reading the constant pool
With the constant pool count `constant_pool_count` known, iterate over [1, constant_pool_count - 1]; each time, first read the `tag` (1 byte), then choose how to read the rest based on the tag. Long and Double entries occupy two constant pool indices, so the loop skips one index after reading either of them.

    // Read the [constant pool entries]
    for (long i = 1; i <= constantPoolCount - 1; i++) {
        System.out.println("#" + i);
        byte[] tag = new byte[1];
        fis.read(tag);
        UnsignedByte tagUB = UnsignedByte.from(tag[0]);
        short tagValue = tagUB.value();
        System.out.printf("[tag] %02x ", tagUB.value());
        switch (tagValue) {
            case 1:
                System.out.println("(1 CONSTANT_Utf8_info) ");
                read_CONSTANT_Utf8_info(fis);
                break;
            case 3:
                System.out.println("(3 CONSTANT_Integer_info) ");
                read_CONSTANT_Integer_info(fis);
                break;
            case 4:
                System.out.println("(4 CONSTANT_Float_info) ");
                read_CONSTANT_Float_info(fis);
                break;
            case 5:
                System.out.println("(5 CONSTANT_Long_info) ");
                read_CONSTANT_Long_info(fis);
                i++; // a long occupies two constant pool indices
                break;
            case 6:
                System.out.println("(6 CONSTANT_Double_info) ");
                read_CONSTANT_Double_info(fis);
                i++; // a double occupies two constant pool indices
                break;
            case 7:
                System.out.println("(7 CONSTANT_Class_info) ");
                read_CONSTANT_Class_info(fis);
                break;
            case 8:
                System.out.println("(8 CONSTANT_String_info) ");
                read_CONSTANT_String_info(fis);
                break;
            case 9:
                System.out.println("(9 CONSTANT_Fieldref_info) ");
                read_CONSTANT_Fieldref_info(fis);
                break;
            case 10:
                System.out.println("(10 CONSTANT_Methodref_info) ");
                read_CONSTANT_Methodref_info(fis);
                break;
            case 11:
                System.out.println("(11 CONSTANT_InterfaceMethodref_info) ");
                read_CONSTANT_InterfaceMethodref_info(fis);
                break;
            case 12:
                System.out.println("(12 CONSTANT_NameAndType_info) ");
                read_CONSTANT_NameAndType_info(fis);
                break;
            case 15:
                System.out.println("(15 CONSTANT_MethodHandle_info) ");
                read_CONSTANT_MethodHandle_info(fis);
                break;
            case 16:
                System.out.println("(16 CONSTANT_MethodType_info) ");
                read_CONSTANT_MethodType_info(fis);
                break;
            case 17:
                System.out.println("(17 CONSTANT_Dynamic_info) ");
                read_CONSTANT_Dynamic_info(fis);
                break;
            case 18:
                System.out.println("(18 CONSTANT_InvokeDynamic_info) ");
                read_CONSTANT_InvokeDynamic_info(fis);
                break;
            case 19:
                System.out.println("(19 CONSTANT_Module_info) ");
                read_CONSTANT_Module_info(fis);
                break;
            case 20:
                System.out.println("(20 CONSTANT_Package_info) ");
                read_CONSTANT_Package_info(fis);
                break;
        }
    }
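One detail that matters when walking the pool: according to the JVM specification, CONSTANT_Long_info and CONSTANT_Double_info entries each take up two indices, and the index after such an entry is unusable. The sketch below computes the index assigned to each entry from the sequence of tags in file order; `PoolIndexing`, `slots` and `indices` are hypothetical names used only for this illustration.

```java
final class PoolIndexing {
    // Number of constant pool indices an entry with the given tag occupies.
    static int slots(int tag) {
        return (tag == 5 || tag == 6) ? 2 : 1; // 5 = Long, 6 = Double
    }

    // Given the tags in file order, returns the pool index of each entry.
    static int[] indices(int[] tags) {
        int[] result = new int[tags.length];
        int next = 1; // constant pool indices start at 1
        for (int k = 0; k < tags.length; k++) {
            result[k] = next;
            next += slots(tags[k]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Methodref, Double, Utf8 land at indices 1, 2 and 4.
        System.out.println(java.util.Arrays.toString(indices(new int[] {10, 6, 1})));
    }
}
```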
Reading 1 CONSTANT_Utf8_info:
- When `tag == 1`, the entry is a UTF-8 encoded string.
- The next 2 bytes are `length`, the length of `bytes`.
- Then simply read the next `length` bytes.

      private static void read_CONSTANT_Utf8_info(FileInputStream fis) throws IOException {
          byte[] length = new byte[2];
          fis.read(length);
          UnsignedByte[] lengthUB = UnsignedByte.from(length);
          long lengthValue = UnsignedByte.value(lengthUB);
          System.out.print("[utf8] ");
          for (long k = 0; k < lengthValue; k++) {
              byte[] bytes = new byte[1];
              fis.read(bytes);
              UnsignedByte bytesUB = UnsignedByte.from(bytes[0]);
              // Prints one char per byte, so only single-byte (ASCII) characters display correctly.
              System.out.printf("%c", bytesUB.value());
          }
          System.out.println();
      }
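The layout of a CONSTANT_Utf8_info entry after its tag (a u2 length followed by modified UTF-8 bytes) is the same format that `DataInputStream.readUTF` reads, so multi-byte characters can be decoded without handling them byte by byte. A minimal sketch, assuming the tag byte has already been consumed; `Utf8Entry` and `decode` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class Utf8Entry {
    // Input: the u2 length followed by that many modified UTF-8 bytes.
    static String decode(byte[] lengthAndBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(lengthAndBytes))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // length = 3, bytes = 'a' 'd' 'd'
        System.out.println(decode(new byte[] {0, 3, 'a', 'd', 'd'})); // prints add
    }
}
```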
Reading 3 CONSTANT_Integer_info:
- When `tag == 3`, the entry is an int literal.
- The next 4 bytes are `bytes`, the int value stored high byte first.
- An int is a 32-bit integer, so its value can be recovered by shifting each byte into place and adding.

      private static void read_CONSTANT_Integer_info(FileInputStream fis) throws IOException {
          byte[] bytes = new byte[4];
          fis.read(bytes);
          UnsignedByte[] bytesUB = UnsignedByte.from(bytes);
          int integerValue = (bytesUB[0].value() << (8 * 3))
                  + (bytesUB[1].value() << (8 * 2))
                  + (bytesUB[2].value() << (8 * 1))
                  + bytesUB[3].value();
          System.out.printf("[int] %d\n", integerValue);
      }
Reading 5 CONSTANT_Long_info:
- When `tag == 5`, the entry is a long literal.
- The next 8 bytes are `bytes`, the long value stored high byte first.
- A long is a 64-bit integer, so its value can be recovered by shifting each byte into place and adding. Each byte must be widened to long before shifting; shifting an int by 32 or more bits would give the wrong result.

      private static void read_CONSTANT_Long_info(FileInputStream fis) throws IOException {
          byte[] bytes = new byte[8];
          fis.read(bytes);
          UnsignedByte[] bytesUB = UnsignedByte.from(bytes);
          long longValue = ((long) bytesUB[0].value() << (8 * 7))
                  + ((long) bytesUB[1].value() << (8 * 6))
                  + ((long) bytesUB[2].value() << (8 * 5))
                  + ((long) bytesUB[3].value() << (8 * 4))
                  + ((long) bytesUB[4].value() << (8 * 3))
                  + ((long) bytesUB[5].value() << (8 * 2))
                  + ((long) bytesUB[6].value() << (8 * 1))
                  + bytesUB[7].value();
          System.out.printf("[long] %d\n", longValue);
      }
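The shift-and-add pattern used for int and long can be written once as a loop over the bytes. The sketch below is an illustration under that idea, with hypothetical names `BigEndian` and `toLong`; the sample bytes `01 30 BB C7` are the big-endian encoding of 19971015, the `age` constant in HelloWorld.

```java
final class BigEndian {
    // Interprets up to 8 bytes as a big-endian (high byte first) value.
    static long toLong(byte[] bytes) {
        if (bytes.length > 8) {
            throw new IllegalArgumentException("at most 8 bytes fit in a long");
        }
        long value = 0;
        for (byte b : bytes) {
            value = (value << 8) | (b & 0xFFL); // mask keeps the byte unsigned
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] age = {0x01, 0x30, (byte) 0xBB, (byte) 0xC7};
        System.out.println(toLong(age)); // prints 19971015
    }
}
```

For a 4-byte int constant, casting the result with `(int)` restores the sign of negative values.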
Reading 7 CONSTANT_Class_info:
- When `tag == 7`, the entry is a symbolic reference to a class or interface.
- The next 2 bytes are `index`, the index of the constant pool entry that holds the fully qualified name.

      private static void read_CONSTANT_Class_info(FileInputStream fis) throws IOException {
          byte[] index = new byte[2];
          fis.read(index);
          UnsignedByte[] indexUB = UnsignedByte.from(index);
          System.out.printf("[index] %d\n", UnsignedByte.value(indexUB));
      }
The read methods for the other tags follow; they are implemented much like the one for `CONSTANT_Class_info`:

    private static void read_CONSTANT_String_info(FileInputStream fis) throws IOException {
        byte[] index = new byte[2];
        fis.read(index);
        UnsignedByte[] indexUB = UnsignedByte.from(index);
        System.out.printf("[index] %d\n", UnsignedByte.value(indexUB));
    }

    private static void read_CONSTANT_Fieldref_info(FileInputStream fis) throws IOException {
        byte[] index1 = new byte[2];
        byte[] index2 = new byte[2];
        fis.read(index1);
        fis.read(index2);
        UnsignedByte[] index1UB = UnsignedByte.from(index1);
        UnsignedByte[] index2UB = UnsignedByte.from(index2); // was from(index1)
        System.out.printf("[index1] %d\n", UnsignedByte.value(index1UB));
        System.out.printf("[index2] %d\n", UnsignedByte.value(index2UB));
    }

    private static void read_CONSTANT_Methodref_info(FileInputStream fis) throws IOException {
        byte[] index1 = new byte[2];
        byte[] index2 = new byte[2];
        fis.read(index1);
        fis.read(index2);
        UnsignedByte[] index1UB = UnsignedByte.from(index1);
        UnsignedByte[] index2UB = UnsignedByte.from(index2); // was from(index1)
        System.out.printf("[index1] %d\n", UnsignedByte.value(index1UB));
        System.out.printf("[index2] %d\n", UnsignedByte.value(index2UB));
    }

    private static void read_CONSTANT_InterfaceMethodref_info(FileInputStream fis) throws IOException {
        byte[] index1 = new byte[2];
        byte[] index2 = new byte[2];
        fis.read(index1);
        fis.read(index2);
        UnsignedByte[] index1UB = UnsignedByte.from(index1);
        UnsignedByte[] index2UB = UnsignedByte.from(index2); // was from(index1)
        System.out.printf("[index1] %d\n", UnsignedByte.value(index1UB));
        System.out.printf("[index2] %d\n", UnsignedByte.value(index2UB));
    }

    private static void read_CONSTANT_NameAndType_info(FileInputStream fis) throws IOException {
        byte[] index1 = new byte[2];
        byte[] index2 = new byte[2];
        fis.read(index1);
        fis.read(index2);
        UnsignedByte[] index1UB = UnsignedByte.from(index1);
        UnsignedByte[] index2UB = UnsignedByte.from(index2); // was from(index1)
        System.out.printf("[index1] %d\n", UnsignedByte.value(index1UB));
        System.out.printf("[index2] %d\n", UnsignedByte.value(index2UB));
    }

    private static void read_CONSTANT_MethodHandle_info(FileInputStream fis) throws IOException {
        byte[] reference_kind = new byte[1]; // between 1 and 9; determines the kind of method handle
        byte[] reference_index = new byte[2];
        fis.read(reference_kind);
        fis.read(reference_index);
        UnsignedByte[] reference_kindUB = UnsignedByte.from(reference_kind);
        UnsignedByte[] reference_indexUB = UnsignedByte.from(reference_index);
        System.out.printf("[reference_kind] %d\n", UnsignedByte.value(reference_kindUB));
        System.out.printf("[reference_index] %d\n", UnsignedByte.value(reference_indexUB));
    }

    private static void read_CONSTANT_MethodType_info(FileInputStream fis) throws IOException {
        byte[] descriptor_index = new byte[2];
        fis.read(descriptor_index);
        UnsignedByte[] descriptor_indexUB = UnsignedByte.from(descriptor_index);
        System.out.printf("[descriptor_index] %d\n", UnsignedByte.value(descriptor_indexUB));
    }

    private static void read_CONSTANT_Dynamic_info(FileInputStream fis) throws IOException {
        byte[] bootstrap_method_attr_index = new byte[2];
        byte[] name_and_type_index = new byte[2]; // second u2 field; skipping it would desynchronize the stream
        fis.read(bootstrap_method_attr_index);
        fis.read(name_and_type_index);
        UnsignedByte[] bootstrap_method_attr_indexUB = UnsignedByte.from(bootstrap_method_attr_index);
        UnsignedByte[] name_and_type_indexUB = UnsignedByte.from(name_and_type_index);
        System.out.printf("[bootstrap_method_attr_index] %d\n", UnsignedByte.value(bootstrap_method_attr_indexUB));
        System.out.printf("[name_and_type_index] %d\n", UnsignedByte.value(name_and_type_indexUB));
    }

    private static void read_CONSTANT_InvokeDynamic_info(FileInputStream fis) throws IOException {
        byte[] bootstrap_method_attr_index = new byte[2];
        byte[] name_and_type_index = new byte[2]; // second u2 field; skipping it would desynchronize the stream
        fis.read(bootstrap_method_attr_index);
        fis.read(name_and_type_index);
        UnsignedByte[] bootstrap_method_attr_indexUB = UnsignedByte.from(bootstrap_method_attr_index);
        UnsignedByte[] name_and_type_indexUB = UnsignedByte.from(name_and_type_index);
        System.out.printf("[bootstrap_method_attr_index] %d\n", UnsignedByte.value(bootstrap_method_attr_indexUB));
        System.out.printf("[name_and_type_index] %d\n", UnsignedByte.value(name_and_type_indexUB));
    }

    private static void read_CONSTANT_Module_info(FileInputStream fis) throws IOException {
        byte[] name_index = new byte[2];
        fis.read(name_index);
        UnsignedByte[] name_indexUB = UnsignedByte.from(name_index);
        System.out.printf("[name_index] %d\n", UnsignedByte.value(name_indexUB));
    }

    private static void read_CONSTANT_Package_info(FileInputStream fis) throws IOException {
        byte[] name_index = new byte[2];
        fis.read(name_index);
        UnsignedByte[] name_indexUB = UnsignedByte.from(name_index);
        System.out.printf("[name_index] %d\n", UnsignedByte.value(name_indexUB));
    }
The methods for reading 4 CONSTANT_Float_info and 6 CONSTANT_Double_info deserve a closer look:
- When `tag == 4`, the entry is a float literal.
- The next 4 bytes are `bytes`, the float value stored high byte first.

      private static void read_CONSTANT_Float_info(FileInputStream fis) throws IOException {
          byte[] bytes = new byte[4];
          fis.read(bytes);
          UnsignedByte[] bytesUB = UnsignedByte.from(bytes);
          System.out.printf("[bytes] %s %s %s %s\n",
                  Integer.toBinaryString(bytesUB[0].value()),
                  Integer.toBinaryString(bytesUB[1].value()),
                  Integer.toBinaryString(bytesUB[2].value()),
                  Integer.toBinaryString(bytesUB[3].value()));
      }

Console output:

    [tag] 04 (4 CONSTANT_Float_info)
    [bytes] 11000001 11001001 0 0
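The printed bytes are the binary storage representation of the float. The rest of this section explains in detail how to convert that binary form to decimal. The table below shows the binary storage layout of float and double (widths in bits):
Type | Sign | Exponent | Mantissa | Total |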
---|---|---|---|---|
float | 1 | 8 | 23 | 32 |
double | 1 | 11 | 52 | 64 |
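Here is an example of converting the binary form of a `float` to decimal:

    Binary form of -25.125f:
        11000001 11001001 00000000 00000000
    Split into sign, exponent and mantissa according to their widths:
        1 10000011 10010010000000000000000
    The sign bit 1 means the number is negative.
    The exponent 10000011 is 128 + 2 + 1 = 131 in decimal;
    subtracting 127 gives 4, the number of places the binary point moves right.
    The remaining 23 bits are a pure binary fraction:
        10010010000000000000000 stands for 0.10010010000000000000000
    Prepending 1 gives 1.10010010000000000000000
    Moving the binary point 4 places right gives 11001.0010000000000000000
    In decimal this is (1 + 8 + 16) + 1/8 = 25.125
    The leading sign bit 1 means negative, so the value is -25.125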
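The same decomposition can be written with bit masks on a single 32-bit int. The sketch below is only an illustration of the worked example: `FloatFields` and its methods are hypothetical names, it handles normal values only (no zero, subnormal, infinity or NaN), and `0xC1C90000` is the hex form of the four bytes shown above.

```java
final class FloatFields {
    static int sign(int bits) {
        return bits >>> 31; // top bit
    }

    static int exponent(int bits) {
        return ((bits >>> 23) & 0xFF) - 127; // 8 exponent bits minus the bias
    }

    static int fraction(int bits) {
        return bits & 0x7FFFFF; // low 23 bits
    }

    // Rebuilds the value as (-1)^sign * (1 + fraction / 2^23) * 2^exponent.
    static float value(int bits) {
        float significand = 1.0f + fraction(bits) / (float) (1 << 23);
        float magnitude = Math.scalb(significand, exponent(bits));
        return sign(bits) == 0 ? magnitude : -magnitude;
    }

    public static void main(String[] args) {
        int bits = 0xC1C90000; // 11000001 11001001 00000000 00000000
        System.out.println(sign(bits) + " " + exponent(bits) + " " + value(bits));
    }
}
```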
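The float's bytes have already been read into the 4-element `bytesUB` array.
- Shifting the value of bytesUB[0] right by 7 bits leaves only its 8th bit (counting from the right, here and below), which is the sign bit:

      boolean positive = (bytesUB[0].value() >> 7) == 0;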
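- The exponent is made up of the low 7 bits of `bytesUB[0].value()` followed by the highest bit of `bytesUB[1].value()`. Finally the bias 127, i.e. `Math.pow(2, 7) - 1`, is subtracted:

      int exponent = bytesUB[0].value() & 0b00000000_01111111; // drop the sign bit
      exponent <<= (8 - 7); // make room for one more bit
      exponent += (bytesUB[1].value() >> (16 - 1 - 8)); // append the top bit of the second byte
      exponent -= 127; // subtract the bias 2^7 - 1 = 127 (not 2^8 - 1 = 255)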
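- The mantissa is computed as follows:

      int mantissa = bytesUB[1].value() & 0b00000000_01111111; // ignore the first bit, which belongs to the exponent
      mantissa <<= (8 + 8); // shift left 16 bits to make room
      mantissa += (bytesUB[2].value() << 8) + bytesUB[3].value(); // add the last 2 bytes
      mantissa += 0b1000_0000____0000_0000____0000_0000; // prepend the implicit leading 1 (bit 23)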
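- The mantissa contains both the integer part and the fractional part; the position of the binary point is determined by the exponent. This split only works when the exponent is between 0 and 23:

      int left = mantissa >> (23 - exponent); // bits before the binary point
      int right = mantissa & ((1 << (23 - exponent)) - 1); // bits after the binary point; a mask avoids sign extension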
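- The fractional part is computed differently from the integer part: the bit at position k after the binary point contributes 2^-k. For example, binary `0.001` is 1/8, which is `0.125` in decimal. Dividing 1.0 by a single accumulated divisor is only correct when exactly one fractional bit is set, so each set bit is added up separately here:

      float fraction = calcFraction(right, 23 - exponent);

- Here is the implementation of calcFraction:

      // Decimal value of the bits after the binary point, e.g. .001 gives 1/8 = 0.125.
      // rightBits holds the fractional bits; len is the number of fractional bits.
      private static float calcFraction(int rightBits, int len) {
          float result = 0f;
          while (rightBits != 0) {
              int lastBit = rightBits & 1; // 1 if the lowest remaining bit is set, otherwise 0
              result += lastBit / (float) Math.pow(2, len); // that bit is worth 2^-len
              rightBits >>= 1;
              len--;
          }
          return result;
      }

- Finally, the decimal value is:

      float floatValue = left + fraction;
      floatValue = positive ? floatValue : -floatValue;
      System.out.printf("[float] %f\n", floatValue);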
That concludes the `read_CONSTANT_Float_info` method. The `read_CONSTANT_Double_info` method is similar; the difference lies in the widths of the sign, exponent and mantissa fields.
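For double, the field widths from the table are 1, 11 and 52 bits, and the exponent bias is 1023. The sketch below shows one way the decoding could look on a 64-bit long; it is not the author's `read_CONSTANT_Double_info`, the names `DoubleFields` and `value` are hypothetical, and only normal values are handled. `0xC039200000000000L` is the bit pattern of -25.125 as a double.

```java
final class DoubleFields {
    // Rebuilds the value as (-1)^sign * (1 + fraction / 2^52) * 2^exponent.
    static double value(long bits) {
        long sign = bits >>> 63;                             // 1 sign bit
        int exponent = (int) ((bits >>> 52) & 0x7FF) - 1023; // 11 exponent bits minus the bias
        long fraction = bits & 0xFFFFFFFFFFFFFL;             // low 52 bits
        double significand = 1.0 + fraction / (double) (1L << 52);
        double magnitude = Math.scalb(significand, exponent);
        return sign == 0 ? magnitude : -magnitude;
    }

    public static void main(String[] args) {
        long bits = 0xC039200000000000L;
        System.out.println(value(bits));
        // Cross-check against the standard library conversion.
        System.out.println(Double.longBitsToDouble(bits));
    }
}
```

In a real parser, `Float.intBitsToFloat` and `Double.longBitsToDouble` perform this conversion directly and also cover the special values.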
3 Source code and output
3.1 ClassFileParser.java
import java.io.File; import java.io.FileInputStream; import java.io.FileNotFoundException; import java.io.IOException; /** * * @author z8g */ public class ClassFileParser { public static void main(String[] args) throws FileNotFoundException, IOException { String classFilePath = "/Users/z8g/Test/build/classes/HelloWorld.class"; File classFile = new File(classFilePath); FileInputStream fis = new FileInputStream(classFile); // 读取[魔数] byte[] magic = new byte[4]; fis.read(magic); UnsignedByte[] magicUB = UnsignedByte.from(magic); System.out.printf("[magic] %02x%02x %02x%02x\n", magicUB[0].value(), magicUB[1].value(), magicUB[2].value(), magicUB[3].value()); // 读取[次版本号]和[主版本号] byte[] minor_version = new byte[2]; byte[] major_version = new byte[2]; fis.read(minor_version); fis.read(major_version); UnsignedByte[] minor_versionUB = UnsignedByte.from(minor_version); UnsignedByte[] major_versionUB = UnsignedByte.from(major_version); System.out.printf("[minor_version] %02x%02x\n", minor_versionUB[0].value(), minor_versionUB[1].value()); System.out.printf("[major_version] %02x%02x\n", major_versionUB[0].value(), major_versionUB[1].value()); // 读取[常量池长度] byte[] constant_pool_count = new byte[2]; fis.read(constant_pool_count); UnsignedByte[] constant_pool_countUB = UnsignedByte.from(constant_pool_count); System.out.printf("[constant_pool_count] %02x %02x\n", constant_pool_countUB[0].value(), constant_pool_countUB[1].value()); long constantPoolCount = UnsignedByte.value(constant_pool_countUB); System.out.printf("(constantPoolCount: %d)\n", constantPoolCount); // 读取[常量池项] for (long i = 1; i <= constantPoolCount - 1; i++) { System.out.println("#" + i); byte[] tag = new byte[1]; fis.read(tag); UnsignedByte tagUB = UnsignedByte.from(tag[0]); short tagValue = tagUB.value(); System.out.printf("[tag] %02x ", tagUB.value()); switch (tagValue) { case 1: System.out.println("(1 CONSTANT_Utf8_info) "); read_CONSTANT_Utf8_info(fis); break; case 3: System.out.println("(3 CONSTANT_Integer_info) 
"); read_CONSTANT_Integer_info(fis); break; case 4: System.out.println("(4 CONSTANT_Float_info) "); read_CONSTANT_Float_info(fis); break; case 5: System.out.println("(5 CONSTANT_Long_info) "); read_CONSTANT_Long_info(fis); break; case 6: System.out.println("(6 CONSTANT_Double_info) "); read_CONSTANT_Double_info(fis); break; case 7: System.out.println("(7 CONSTANT_Class_info) "); read_CONSTANT_Class_info(fis); break; case 8: System.out.println("(8 CONSTANT_String_info) "); read_CONSTANT_String_info(fis); break; case 9: System.out.println("(9 CONSTANT_Fieldref_info) "); read_CONSTANT_Fieldref_info(fis); break; case 10: System.out.println("(10 CONSTANT_Methodref_info) "); read_CONSTANT_Methodref_info(fis); break; case 11: System.out.println("(11 CONSTANT_InterfaceMethodref_info) "); read_CONSTANT_InterfaceMethodref_info(fis); break; case 12: System.out.println("(12 CONSTANT_NameAndType_info) "); read_CONSTANT_NameAndType_info(fis); break; case 15: System.out.println("(15 CONSTANT_MethodHandle_info) "); read_CONSTANT_MethodHandle_info(fis); break; case 16: System.out.println("(16 CONSTANT_MethodType_info) "); read_CONSTANT_MethodType_info(fis); break; case 17: System.out.println("(17 CONSTANT_Dynamic_info) "); read_CONSTANT_Dynamic_info(fis); break; case 18: System.out.println("(18 CONSTANT_InvokeDynamic_info) "); read_CONSTANT_InvokeDynamic_info(fis); break; case 19: System.out.println("(19 CONSTANT_Module_info) "); read_CONSTANT_Module_info(fis); break; case 20: System.out.println("(20 CONSTANT_Package_info) "); read_CONSTANT_Package_info(fis); break; } } } private static void read_CONSTANT_Utf8_info(FileInputStream fis) throws IOException { byte[] length = new byte[2]; fis.read(length); UnsignedByte[] lengthUB = UnsignedByte.from(length); long lengthValue = UnsignedByte.value(lengthUB); System.out.print("[utf8] "); for (long k = 0; k < lengthValue; k++) { byte[] bytes = new byte[1]; fis.read(bytes); UnsignedByte bytesUB = UnsignedByte.from(bytes[0]); 
System.out.printf("%c", bytesUB.value()); } System.out.println(); } private static void read_CONSTANT_Integer_info(FileInputStream fis) throws IOException { byte[] bytes = new byte[4]; fis.read(bytes); UnsignedByte[] bytesUB = UnsignedByte.from(bytes); int integerValue = (bytesUB[0].value() << (8 * 3)) + (bytesUB[1].value() << (8 * 2)) + (bytesUB[2].value() << (8 * 1)) + bytesUB[3].value(); System.out.printf("[int] %d\n", integerValue); } private static void read_CONSTANT_Float_info(FileInputStream fis) throws IOException { byte[] bytes = new byte[4]; fis.read(bytes); UnsignedByte[] bytesUB = UnsignedByte.from(bytes); System.out.printf("[bytes] %s %s %s %s\n", Integer.toBinaryString(bytesUB[0].value()), Integer.toBinaryString(bytesUB[1].value()), Integer.toBinaryString(bytesUB[2].value()), Integer.toBinaryString(bytesUB[3].value())); /* -25.125f的二进制形式 11000001 11001001 00000000 00000000 1 10000011 10010010000000000000000 1表示该浮点数为负数 10000011的十进制为:128 + 2 + 1 = 131 减去127得到4,表示小数点右移的数位 剩下23位是纯二进制小数 10010010000000000000000表示0.10010010000000000000000 前面加1得到1.10010010000000000000000 小数点右移4位得到11001.0010000000000000000 变为10进制得到 (1 + 8 + 16).(1/8) = 25.125 前面的1表示负数,所以该浮点数为-25.125 */ boolean positive = (bytesUB[0].value() >> 7) == 0; int exponent = bytesUB[0].value() & 0b00000000_01111111; // 去掉符号位 exponent <<= (8 - 7); //增加一位 exponent += (bytesUB[1].value() >> (16 - 1 - 8)); //接上其首位 exponent -= (Math.pow(2, 8) - 1); // 减去127 int mantissa = bytesUB[1].value() & 0b00000000_01111111; // 忽略第一位 mantissa <<= (8 + 8);//先左移16位腾出位置 mantissa += (bytesUB[2].value() << 8) + bytesUB[3].value();//加上后2字节 mantissa += 0b1000_0000____0000_0000____0000_0000;// 首位加个1 int left = mantissa >> (23 - exponent); int right = mantissa << (32 - 23 + exponent) >> (32 - 23 + exponent); int dividend = calcDividend(right, 23 - exponent); float floatValue = (left + 1.0f / dividend); floatValue = positive ? 
floatValue : -floatValue; // System.out.println("dividend: " + dividend); // System.out.println("mantissa: " + Integer.toBinaryString(mantissa)); // System.out.println("left: " + Integer.toBinaryString(left)); // System.out.println("right: " + Integer.toBinaryString(right)); // // System.out.println("positive: " + positive); // System.out.println("exponent: " + exponent); // System.out.println("mantissa: " + mantissa); // // System.out.printf("[bytes] %s %s %s %s\n", // (bytesUB[0].value()), // (bytesUB[1].value()), // (bytesUB[2].value()), // (bytesUB[3].value())); System.out.printf("[float] %f\n", floatValue); } // 计算小数点后的二进制的十进制表示,如.001表示8,再用1/8得到0.125 private static int calcDividend(int rightBits, int len) { int result = 0; while (rightBits != 0) { int lastBits = rightBits & 1; // 最后一位是1,则结果为1,否则为0 result += lastBits * Math.pow(2, len); rightBits >>= 1; len--; } return result; } private static void read_CONSTANT_Long_info(FileInputStream fis) throws IOException { byte[] bytes = new byte[8]; fis.read(bytes); UnsignedByte[] bytesUB = UnsignedByte.from(bytes); int longValue = (bytesUB[0].value() << (8 * 7)) + (bytesUB[1].value() << ((8 * 6))) + (bytesUB[2].value() << ((8 * 5))) + (bytesUB[3].value() << (8 * 4)) + (bytesUB[4].value() << (8 * 3)) + (bytesUB[5].value() << (8 * 2)) + (bytesUB[6].value() << (8 * 1)) + bytesUB[7].value(); System.out.printf("[long] %d\n", longValue); } // 符号位 阶码 尾数 长度 // float 1 8 23 32 // double 1 11 52 64 private static void read_CONSTANT_Double_info(FileInputStream fis) throws IOException { byte[] bytes = new byte[8]; fis.read(bytes); UnsignedByte[] bytesUB = UnsignedByte.from(bytes); System.out.printf("[bytes] %s %s %s %s %s %s %s %s\n", Integer.toBinaryString(bytesUB[0].value()), Integer.toBinaryString(bytesUB[1].value()), Integer.toBinaryString(bytesUB[2].value()), Integer.toBinaryString(bytesUB[3].value()), Integer.toBinaryString(bytesUB[4].value()), Integer.toBinaryString(bytesUB[5].value()), 
Integer.toBinaryString(bytesUB[6].value()), Integer.toBinaryString(bytesUB[7].value())); /* -25.125f的二进制形式 11000000 111001 100000 0 0 0 0 0 11000000 00111001 00100000 00000000 00000000 00000000 00000000 00000000 1 10000000011 1001 00100000 00000000 00000000 00000000 00000000 00000000 1表示该浮点数为负数 10000000011的十进制为:128 + 2 + 1 = 131 减去127得到4,表示小数点右移的数位 剩下23位是纯二进制小数 10010010000000000000000表示0.10010010000000000000000 前面加1得到1.10010010000000000000000 小数点右移4位得到11001.0010000000000000000 变为10进制得到 (1 + 8 + 16).(1/8) = 25.125 前面的1表示负数,所以该浮点数为-25.125 */ boolean positive = (bytesUB[0].value() >> 7) == 0; int exponent = bytesUB[0].value() & 0b00000000_01111111; // 去掉符号位 exponent <<= (11 - 7); //增加一位 exponent += (bytesUB[1].value() >> (16 - 1 - 11)); //接上第二字节的前4位 exponent -= (Math.pow(2, 11) - 1); int mantissa = bytesUB[1].value() & 0b00000000_00001111; // 忽略前4位 //!!! mantissa <<= (8 + 8); mantissa += (bytesUB[2].value() << 8) + bytesUB[3].value(); mantissa += 0b1000_0000____0000_0000____0000_0000;// 首位加个1 int left = mantissa >> (23 - exponent); int right = mantissa << (32 - 23 + exponent) >> (32 - 23 + exponent); int dividend = calcDividend(right, 23 - exponent); float floatValue = (left + 1.0f / dividend); floatValue = positive ? 
                floatValue : -floatValue;
        // System.out.println("dividend: " + dividend);
        // System.out.println("mantissa: " + Integer.toBinaryString(mantissa));
        // System.out.println("left: " + Integer.toBinaryString(left));
        // System.out.println("right: " + Integer.toBinaryString(right));
        //
        // System.out.println("positive: " + positive);
        // System.out.println("exponent: " + exponent);
        // System.out.println("mantissa: " + mantissa);
        //
        // System.out.printf("[bytes] %s %s %s %s\n",
        //         (bytesUB[0].value()),
        //         (bytesUB[1].value()),
        //         (bytesUB[2].value()),
        //         (bytesUB[3].value()));
        // Placeholder: the literal 0.0 is printed here, not a decoded double.
        System.out.printf("[double] %f\n", .0);
    }

    private static void read_CONSTANT_Class_info(FileInputStream fis) throws IOException {
        byte[] index = new byte[2];
        fis.read(index);
        UnsignedByte[] indexUB = UnsignedByte.from(index);
        System.out.printf("[index] %d\n", UnsignedByte.value(indexUB));
    }

    private static void read_CONSTANT_String_info(FileInputStream fis) throws IOException {
        byte[] index = new byte[2];
        fis.read(index);
        UnsignedByte[] indexUB = UnsignedByte.from(index);
        System.out.printf("[index] %d\n", UnsignedByte.value(indexUB));
    }

    private static void read_CONSTANT_Fieldref_info(FileInputStream fis) throws IOException {
        byte[] index1 = new byte[2];
        byte[] index2 = new byte[2];
        fis.read(index1);
        fis.read(index2);
        UnsignedByte[] index1UB = UnsignedByte.from(index1);
        // The second index must be decoded from index2, not index1.
        UnsignedByte[] index2UB = UnsignedByte.from(index2);
        System.out.printf("[index1] %d\n", UnsignedByte.value(index1UB));
        System.out.printf("[index2] %d\n", UnsignedByte.value(index2UB));
    }

    private static void read_CONSTANT_Methodref_info(FileInputStream fis) throws IOException {
        byte[] index1 = new byte[2];
        byte[] index2 = new byte[2];
        fis.read(index1);
        fis.read(index2);
        UnsignedByte[] index1UB = UnsignedByte.from(index1);
        UnsignedByte[] index2UB = UnsignedByte.from(index2);
        System.out.printf("[index1] %d\n", UnsignedByte.value(index1UB));
        System.out.printf("[index2] %d\n", UnsignedByte.value(index2UB));
    }

    private static void read_CONSTANT_InterfaceMethodref_info(FileInputStream fis) throws IOException {
        byte[] index1 = new byte[2];
        byte[] index2 = new byte[2];
        fis.read(index1);
        fis.read(index2);
        UnsignedByte[] index1UB = UnsignedByte.from(index1);
        UnsignedByte[] index2UB = UnsignedByte.from(index2);
        System.out.printf("[index1] %d\n", UnsignedByte.value(index1UB));
        System.out.printf("[index2] %d\n", UnsignedByte.value(index2UB));
    }

    private static void read_CONSTANT_NameAndType_info(FileInputStream fis) throws IOException {
        byte[] index1 = new byte[2];
        byte[] index2 = new byte[2];
        fis.read(index1);
        fis.read(index2);
        UnsignedByte[] index1UB = UnsignedByte.from(index1);
        UnsignedByte[] index2UB = UnsignedByte.from(index2);
        System.out.printf("[index1] %d\n", UnsignedByte.value(index1UB));
        System.out.printf("[index2] %d\n", UnsignedByte.value(index2UB));
    }

    private static void read_CONSTANT_MethodHandle_info(FileInputStream fis) throws IOException {
        byte[] reference_kind = new byte[1]; // 1 to 9; determines the kind of method handle
        byte[] reference_index = new byte[2];
        fis.read(reference_kind);
        fis.read(reference_index);
        UnsignedByte[] reference_kindUB = UnsignedByte.from(reference_kind);
        UnsignedByte[] reference_indexUB = UnsignedByte.from(reference_index);
        System.out.printf("[reference_kind] %d\n", UnsignedByte.value(reference_kindUB));
        System.out.printf("[reference_index] %d\n", UnsignedByte.value(reference_indexUB));
    }

    private static void read_CONSTANT_MethodType_info(FileInputStream fis) throws IOException {
        byte[] descriptor_index = new byte[2];
        fis.read(descriptor_index);
        UnsignedByte[] descriptor_indexUB = UnsignedByte.from(descriptor_index);
        System.out.printf("[descriptor_index] %d\n", UnsignedByte.value(descriptor_indexUB));
    }

    private static void read_CONSTANT_Dynamic_info(FileInputStream fis) throws IOException {
        // The structure is u2 bootstrap_method_attr_index followed by u2 name_and_type_index;
        // both must be consumed or every later entry is read from the wrong offset.
        byte[] bootstrap_method_attr_index = new byte[2];
        byte[] name_and_type_index = new byte[2];
        fis.read(bootstrap_method_attr_index);
        fis.read(name_and_type_index);
        UnsignedByte[] bootstrap_method_attr_indexUB = UnsignedByte.from(bootstrap_method_attr_index);
        UnsignedByte[] name_and_type_indexUB = UnsignedByte.from(name_and_type_index);
        System.out.printf("[bootstrap_method_attr_index] %d\n", UnsignedByte.value(bootstrap_method_attr_indexUB));
        System.out.printf("[name_and_type_index] %d\n", UnsignedByte.value(name_and_type_indexUB));
    }

    private static void read_CONSTANT_InvokeDynamic_info(FileInputStream fis) throws IOException {
        // Same layout as CONSTANT_Dynamic_info: two u2 fields.
        byte[] bootstrap_method_attr_index = new byte[2];
        byte[] name_and_type_index = new byte[2];
        fis.read(bootstrap_method_attr_index);
        fis.read(name_and_type_index);
        UnsignedByte[] bootstrap_method_attr_indexUB = UnsignedByte.from(bootstrap_method_attr_index);
        UnsignedByte[] name_and_type_indexUB = UnsignedByte.from(name_and_type_index);
        System.out.printf("[bootstrap_method_attr_index] %d\n", UnsignedByte.value(bootstrap_method_attr_indexUB));
        System.out.printf("[name_and_type_index] %d\n", UnsignedByte.value(name_and_type_indexUB));
    }

    private static void read_CONSTANT_Module_info(FileInputStream fis) throws IOException {
        byte[] name_index = new byte[2];
        fis.read(name_index);
        UnsignedByte[] name_indexUB = UnsignedByte.from(name_index);
        System.out.printf("[name_index] %d\n", UnsignedByte.value(name_indexUB));
    }

    private static void read_CONSTANT_Package_info(FileInputStream fis) throws IOException {
        byte[] name_index = new byte[2];
        fis.read(name_index);
        UnsignedByte[] name_indexUB = UnsignedByte.from(name_index);
        System.out.printf("[name_index] %d\n", UnsignedByte.value(name_indexUB));
    }
}

class UnsignedByte {
    private short value;
    private byte rawValue;

    private UnsignedByte() {
    }

    public static UnsignedByte from(byte b) {
        UnsignedByte ub = new UnsignedByte();
        ub.rawValue = b;
        ub.value = (short) ((short) b & 0xFF);
        return ub;
    }

    public static UnsignedByte[] from(byte[] bytes) {
        UnsignedByte[] result = new UnsignedByte[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            result[i] = from(bytes[i]);
        }
        return result;
    }

    public short value() {
        return value;
    }

    // Combines one or two unsigned bytes (big-endian) into a single value.
    public static long value(UnsignedByte[] bytes) {
        if (bytes.length == 1) {
            return bytes[0].value();
        }
        if (bytes.length == 2) {
            // The high byte is shifted by 8 bits, the width of one byte.
            return (bytes[0].value() << 8) + bytes[1].value();
        }
        throw new IllegalArgumentException();
    }

    public static byte[] to(UnsignedByte[] b) {
        byte[] result = new byte[b.length];
        for (int i = 0; i < result.length; i++) {
            result[i] = (byte) (b[i].rawValue & 0xFF);
        }
        return result;
    }

    public static byte to(UnsignedByte b) {
        return b.rawValue;
    }

    public static byte to(short i) {
        return (byte) i;
    }
}
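The `[double]` line in the reader above is printed from the literal `.0`, so the decoded value never appears. As a minimal sketch of one way to finish that step, assuming the 4 or 8 constant bytes have already been read in file order (big-endian), the standard `Float.intBitsToFloat` and `Double.longBitsToDouble` methods can do the conversion. The class and method names below (`Ieee754Sketch`, `toLong`, `decodeFloat`, `decodeDouble`, `decodeDoubleByHand`) are made up for this illustration and are not part of the original reader. The by-hand variant only handles normal numbers (no zero, subnormal, infinity or NaN) and is there to show the sign / exponent / fraction split.

```java
final class Ieee754Sketch {
    // Combines bytes in file order (big-endian) into one long.
    static long toLong(byte[] bytes) {
        long v = 0;
        for (byte b : bytes) {
            v = (v << 8) | (b & 0xFF);
        }
        return v;
    }

    // CONSTANT_Float_info: 4 bytes holding an IEEE 754 single.
    static float decodeFloat(byte[] bytes) {
        return Float.intBitsToFloat((int) toLong(bytes));
    }

    // CONSTANT_Double_info: 8 bytes holding an IEEE 754 double.
    static double decodeDouble(byte[] bytes) {
        return Double.longBitsToDouble(toLong(bytes));
    }

    // Manual decoding, normal numbers only:
    // value = sign * (2^52 + fraction) * 2^(exponent - 1023 - 52)
    static double decodeDoubleByHand(byte[] bytes) {
        long bits = toLong(bytes);
        int sign = (bits >>> 63) == 0 ? 1 : -1;
        int exponent = (int) ((bits >>> 52) & 0x7FF);
        long fraction = bits & ((1L << 52) - 1);
        long mantissa = fraction | (1L << 52); // restore the implicit leading 1
        return sign * mantissa * Math.pow(2.0, exponent - 1075);
    }

    public static void main(String[] args) {
        byte[] f = {(byte) 0xC1, (byte) 0xC9, 0, 0};
        byte[] d = {(byte) 0xC0, 0x39, 0x20, 0, 0, 0, 0, 0};
        System.out.println(decodeFloat(f));        // prints -25.125
        System.out.println(decodeDouble(d));       // prints -25.125
        System.out.println(decodeDoubleByHand(d)); // prints -25.125
    }
}
```

The two byte arrays in `main` are the IEEE 754 encodings of `-25.125f` and `-25.125`, the same values used as field initializers in HelloWorld.java.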
3.2 HelloWorld.java
public class HelloWorld {
    private final int age = 19971015;
    private final float len = -25.125f;
    private final double d = -25.125;

    public int add(int i) {
        return i + 1;
    }
}
3.3 Output
[magic] cafe babe
[minor_version] 0000
[major_version] 0034
[constant_pool_count] 00 24 (constantPoolCount: 36)
#1 [tag] 0a (10 CONSTANT_Methodref_info) [index1] 10 [index2] 10
#2 [tag] 03 (3 CONSTANT_Integer_info) [int] 19971015
#3 [tag] 09 (9 CONSTANT_Fieldref_info) [index1] 9 [index2] 9
#4 [tag] 04 (4 CONSTANT_Float_info) [bytes] 11000001 11001001 0 0 [float] -25.125000
#5 [tag] 09 (9 CONSTANT_Fieldref_info) [index1] 9 [index2] 9
#6 [tag] 06 (6 CONSTANT_Double_info) [bytes] 11000000 111001 100000 0 0 0 0 0 [double] 0.000000
#7 [tag] 09 (9 CONSTANT_Fieldref_info) [index1] 9 [index2] 9
#8 [tag] 07 (7 CONSTANT_Class_info) [index] 34
#9 [tag] 07 (7 CONSTANT_Class_info) [index] 35
#10 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] age
#11 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] I
#12 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] ConstantValue
#13 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] len
#14 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] F
#15 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] d
#16 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] D
#17 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] <init>
#18 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] ()V
#19 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] Code
#20 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] LineNumberTable
#21 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] LocalVariableTable
#22 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] this
#23 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] LHelloWorld;
#24 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] add
#25 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] (I)I
#26 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] i
#27 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] SourceFile
#28 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] HelloWorld.java
#29 [tag] 0c (12 CONSTANT_NameAndType_info) [index1] 18 [index2] 18
#30 [tag] 0c (12 CONSTANT_NameAndType_info) [index1] 11 [index2] 11
#31 [tag] 0c (12 CONSTANT_NameAndType_info) [index1] 14 [index2] 14
#32 [tag] 0c (12 CONSTANT_NameAndType_info) [index1] 16 [index2] 16
#33 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] HelloWorld
#34 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] java/lang/Object
#35 [tag] 00
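Two details of this recorded run do not match the class-file format and are worth reading with care. First, the listing ends at `#35 [tag] 00`, and 0 is not a valid tag. The two Class entries point at 34 and 35, yet the listing numbers `HelloWorld` as #33 and `java/lang/Object` as #34, so every number after #6 looks shifted by one. A likely explanation is the rule that a CONSTANT_Long_info or CONSTANT_Double_info entry occupies two constant-pool indexes: the Double at #6 also takes #7, the next structure is really #8, and the byte shown as `#35` would then belong to whatever follows the pool. This assumes the reading loop numbers one index per structure, which the listing suggests but this section does not show. Second, every two-index entry prints the same value for `[index1]` and `[index2]`, where two separate u2 fields would be expected to differ.

The sketch below illustrates both points. `PoolIndexSketch`, `assignIndexes` and `readU2` are hypothetical names, and the four bytes used for the NameAndType example are made up for illustration, not taken from the class file.

```java
import java.util.Arrays;

final class PoolIndexSketch {
    static final int TAG_LONG = 5;
    static final int TAG_DOUBLE = 6;

    // Given the tags in file order, returns the constant-pool index of each
    // structure. Long and Double entries advance the index by 2, others by 1.
    static int[] assignIndexes(int[] tagsInFileOrder) {
        int[] indexes = new int[tagsInFileOrder.length];
        int next = 1; // the constant pool is numbered from 1
        for (int i = 0; i < tagsInFileOrder.length; i++) {
            indexes[i] = next;
            int tag = tagsInFileOrder[i];
            next += (tag == TAG_LONG || tag == TAG_DOUBLE) ? 2 : 1;
        }
        return indexes;
    }

    // Reads one big-endian u2 starting at the given offset.
    static int readU2(byte[] buf, int offset) {
        return ((buf[offset] & 0xFF) << 8) | (buf[offset + 1] & 0xFF);
    }

    public static void main(String[] args) {
        // Tags of the first nine structures in the listing above.
        int[] tags = {10, 3, 9, 4, 9, 6, 9, 7, 7};
        System.out.println(Arrays.toString(assignIndexes(tags)));
        // prints [1, 2, 3, 4, 5, 6, 8, 9, 10]

        // Hypothetical NameAndType payload: two separate u2 fields.
        byte[] payload = {0x00, 0x12, 0x00, 0x13};
        System.out.println(readU2(payload, 0) + " " + readU2(payload, 2));
        // prints 18 19
    }
}
```

With this numbering, a pool whose constant_pool_count is 36 and which contains one Double holds 34 structures, not 35.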