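Study Notes on the Class File Structure
Section 1 summarizes the first half of Chapter 6 of 《深入理解Java虚拟机》 (Understanding the Java Virtual Machine in Depth). Section 2 is original work: it parses the contents of a class file according to the specification in Section 1. I stopped after parsing the constant pool; converting a binary floating-point number to its decimal form took some effort.
1 Structure of the Class File
Class file format:
| Type | Name | Count |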
|---|---|---|
| u4 | magic | 1 |
| u2 | minor_version | 1 |
| u2 | major_version | 1 |
| u2 | constant_pool_count | 1 |
| cp_info | constant_pool | constant_pool_count - 1 |
| u2 | access_flags | 1 |
| u2 | this_class | 1 |
| u2 | super_class | 1 |
| u2 | interfaces_count | 1 |
| u2 | interfaces | interfaces_count |
| u2 | fields_count | 1 |
| field_info | fields | fields_count |
| u2 | methods_count | 1 |
| method_info | methods | methods_count |
| u2 | attributes_count | 1 |
| attribute_info | attributes | attributes_count |
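1.1 Magic Number and Class File Version
Magic number: CAFEBABE
Major version: 50 corresponds to JDK 6, 51 to JDK 7
Version number: major version followed by minor version, e.g. 52.0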
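The header layout above can be checked with a few lines of Java. The following is a minimal sketch, not part of the original notes: the class name ClassVersionSketch and both method names are made up for illustration, and the rule "release = major - 44" is an assumption extrapolated from the 50 → JDK 6 and 51 → JDK 7 pairs (it holds from JDK 5, major version 49, onward).

```java
import java.nio.ByteBuffer;

final class ClassVersionSketch {
    static final int MAGIC = 0xCAFEBABE;

    /** Returns "major.minor" for the first 8 bytes of a class file. */
    static String version(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header); // ByteBuffer is big-endian by default
        if (buf.getInt() != MAGIC) {              // u4 magic
            throw new IllegalArgumentException("not a class file: bad magic number");
        }
        int minor = buf.getShort() & 0xFFFF;      // u2 minor_version
        int major = buf.getShort() & 0xFFFF;      // u2 major_version
        return major + "." + minor;
    }

    /** Maps a major version to a Java release, e.g. 50 -> 6, 52 -> 8 (assumed rule). */
    static int javaRelease(int major) {
        return major - 44;
    }
}
```

Note that the file stores minor_version before major_version, while the version number is written major first.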
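1.2 Constant Pool
- Constant pool entries are counted from 1. For example, 00 16 is 22 in decimal, which means the pool holds 21 constants with index values 1 to 21.
- It mainly stores literals (Literal) and symbolic references (Symbolic References).
Constant pool entry types: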
| Name | Tag | Description |
|---|---|---|
| CONSTANT_Utf8_info | 1 | UTF-8 encoded string |
| CONSTANT_Integer_info | 3 | Integer literal |
| CONSTANT_Float_info | 4 | Floating-point literal |
| CONSTANT_Long_info | 5 | Long integer literal |
| CONSTANT_Double_info | 6 | Double-precision floating-point literal |
| CONSTANT_Class_info | 7 | Symbolic reference to a class or interface |
| CONSTANT_String_info | 8 | String literal |
| CONSTANT_Fieldref_info | 9 | Symbolic reference to a field |
| CONSTANT_Methodref_info | 10 | Symbolic reference to a method of a class |
| CONSTANT_InterfaceMethodref_info | 11 | Symbolic reference to a method of an interface |
| CONSTANT_NameAndType_info | 12 | Partial symbolic reference to a field or method |
| CONSTANT_MethodHandle_info | 15 | A method handle |
| CONSTANT_MethodType_info | 16 | A method type |
| CONSTANT_Dynamic_info | 17 | A dynamically computed constant |
| CONSTANT_InvokeDynamic_info | 18 | A dynamic method call site |
| CONSTANT_Module_info | 19 | A module |
| CONSTANT_Package_info | 20 | A package opened or exported by a module |
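Structure of a CONSTANT_Class_info constant:
| Type | Name | Count |
|---|---|---|
| u1 | tag | 1 |
| u2 | name_index | 1 |
tag is the flag byte that distinguishes the constant types. name_index is a constant pool index that points to a CONSTANT_Utf8_info constant holding the fully qualified name.
Structure of a CONSTANT_Utf8_info constant:
| Type | Name | Count |
|---|---|---|
| u1 | tag | 1 |
| u2 | length | 1 |
| u1 | bytes | length |
- Strings use an abbreviated (modified) UTF-8 encoding: characters from \u0001 to \u007f take one byte, from \u0080 to \u07ff two bytes, and from \u0800 to \uffff three bytes.
- Because length is a u2, the maximum is 65535 bytes, which is also the maximum length of a Java method or field name.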
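The length and bytes fields of CONSTANT_Utf8_info happen to match what java.io.DataInputStream.readUTF expects: a two-byte length followed by modified UTF-8 data. The sketch below relies on that; the class name Utf8EntrySketch and the method decode are hypothetical, and the input is assumed to start right after the tag byte.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class Utf8EntrySketch {
    /** Decodes the length + bytes fields of a CONSTANT_Utf8_info entry (tag already consumed). */
    static String decode(byte[] lengthAndBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(lengthAndBytes))) {
            return in.readUTF(); // reads a u2 length, then that many bytes of modified UTF-8
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Unlike printing one character per byte, this also handles the two-byte and three-byte forms.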
Using javap:
javap -verbose TestClass
Summary of the structures of the 17 constant pool data types:
| Constant | Field | Type | Description |
|---|---|---|---|
| CONSTANT_Utf8_info | tag | u1 | Value is 1 |
| | length | u2 | |
| | bytes | u1 | |
| CONSTANT_Integer_info | tag | u1 | Value is 3 |
| | bytes | u4 | |
| CONSTANT_Float_info | tag | u1 | Value is 4 |
| | bytes | u4 | |
| CONSTANT_Long_info | tag | u1 | Value is 5 |
| | bytes | u8 | |
| CONSTANT_Double_info | tag | u1 | Value is 6 |
| | bytes | u8 | |
| CONSTANT_Class_info | tag | u1 | Value is 7 |
| | index | u2 | |
| CONSTANT_String_info | tag | u1 | Value is 8 |
| | index | u2 | |
| CONSTANT_Fieldref_info | tag | u1 | Value is 9 |
| | index | u2 | |
| | index | u2 | |
| CONSTANT_Methodref_info | tag | u1 | Value is 10 |
| | index | u2 | |
| | index | u2 | |
| CONSTANT_InterfaceMethodref_info | tag | u1 | Value is 11 |
| | index | u2 | |
| | index | u2 | |
| CONSTANT_NameAndType_info | tag | u1 | Value is 12 |
| | index | u2 | |
| | index | u2 | |
| CONSTANT_MethodHandle_info | tag | u1 | Value is 15 |
| | reference_kind | u1 | |
| | reference_index | u2 | |
| CONSTANT_MethodType_info | tag | u1 | Value is 16 |
| | descriptor_index | u2 | |
| CONSTANT_Dynamic_info | tag | u1 | Value is 17 |
| | bootstrap_method_attr_index | u2 | |
| | name_and_type_index | u2 | |
| CONSTANT_InvokeDynamic_info | tag | u1 | Value is 18 |
| | bootstrap_method_attr_index | u2 | |
| | name_and_type_index | u2 | |
| CONSTANT_Module_info | tag | u1 | Value is 19 |
| | name_index | u2 | |
| CONSTANT_Package_info | tag | u1 | Value is 20 |
| | name_index | u2 | |
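1.3 Access Flags
The 2 bytes immediately after the constant pool are the access flags (access_flags), which identify access information at the class or interface level.
| Flag name | Value | Meaning |
|---|---|---|
| ACC_PUBLIC | 0x0001 | Whether the type is public |
| ACC_FINAL | 0x0010 | Whether it is declared final; only classes can set this flag |
| ACC_SUPER | 0x0020 | Whether the newer semantics of the invokespecial bytecode instruction are allowed |
| ACC_INTERFACE | 0x0200 | Marks an interface |
| ACC_ABSTRACT | 0x0400 | Whether the type is abstract; true for interfaces and abstract classes, false for other types |
| ACC_SYNTHETIC | 0x1000 | Marks a class that was not generated from user code |
| ACC_ANNOTATION | 0x2000 | Marks an annotation |
| ACC_ENUM | 0x4000 | Marks an enum |
| ACC_MODULE | 0x8000 | Marks a module |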
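Since access_flags is a bit mask, decoding it is a matter of testing each flag value from the table. Here is a small sketch of that idea; the class AccessFlagsSketch and its decode method are illustrative names, not something defined in these notes.

```java
import java.util.ArrayList;
import java.util.List;

final class AccessFlagsSketch {
    // Flag values and names, in the same order as the class-level table above
    private static final int[] VALUES = {
        0x0001, 0x0010, 0x0020, 0x0200, 0x0400, 0x1000, 0x2000, 0x4000, 0x8000
    };
    private static final String[] NAMES = {
        "ACC_PUBLIC", "ACC_FINAL", "ACC_SUPER", "ACC_INTERFACE", "ACC_ABSTRACT",
        "ACC_SYNTHETIC", "ACC_ANNOTATION", "ACC_ENUM", "ACC_MODULE"
    };

    /** Returns the names of all flags set in a u2 access_flags value. */
    static List<String> decode(int accessFlags) {
        List<String> set = new ArrayList<>();
        for (int i = 0; i < VALUES.length; i++) {
            if ((accessFlags & VALUES[i]) != 0) { // the bit for this flag is set
                set.add(NAMES[i]);
            }
        }
        return set;
    }
}
```

For an ordinary public class the two bytes are typically 00 21, which this sketch decodes as ACC_PUBLIC plus ACC_SUPER.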
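1.4 Class Index, Superclass Index, and Interface Index Collection
The class index (this_class) and the superclass index (super_class) are each a u2 value. The interface index collection (interfaces) is a set of u2 values, ordered left to right as the interfaces appear after the implements keyword.
1.5 Field Table Collection
The field table (field_info) describes the variables declared in an interface or class.
Field table structure:
| Name | Type | Count |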
|---|---|---|
| access_flags | u2 | 1 |
| name_index | u2 | 1 |
| descriptor_index | u2 | 1 |
| attributes_count | u2 | 1 |
| attributes | attribute_info | attributes_count |
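Field access flags:
| Flag name | Value | Meaning |
|---|---|---|
| ACC_PUBLIC | 0x0001 | Whether the field is public |
| ACC_PRIVATE | 0x0002 | Whether the field is private |
| ACC_PROTECTED | 0x0004 | Whether the field is protected |
| ACC_STATIC | 0x0008 | Whether the field is static |
| ACC_FINAL | 0x0010 | Whether the field is final |
| ACC_VOLATILE | 0x0040 | Whether the field is volatile |
| ACC_TRANSIENT | 0x0080 | Whether the field is transient |
| ACC_SYNTHETIC | 0x1000 | Whether the field was generated automatically by the compiler |
| ACC_ENUM | 0x4000 | Whether the field is an enum |
- At most one of ACC_PUBLIC, ACC_PRIVATE, and ACC_PROTECTED may be set.
- ACC_FINAL and ACC_VOLATILE cannot both be set.
- Fields in an interface must have the ACC_PUBLIC, ACC_STATIC, and ACC_FINAL flags.
name_index and descriptor_index are both references to constant pool entries; they give the field's simple name and the field's descriptor, respectively.
Meaning of the descriptor characters:
| Character | Meaning |
|---|---|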
| B | byte |
| C | char |
| D | double |
| F | float |
| I | int |
| J | long |
| S | short |
| Z | boolean |
| V | The special type void |
| L | Object type, e.g. Ljava/lang/Object; |
- Arrays are marked with one [ per dimension. For example, java.lang.String[][] is recorded as [[Ljava/lang/String; and int[] is recorded as [I.
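The descriptor rules above are easy to turn into a small converter. The following sketch handles field descriptors only (no method descriptors); the class DescriptorSketch and the method toJavaType are names invented for this example.

```java
final class DescriptorSketch {
    /** Converts a field descriptor such as "[[Ljava/lang/String;" to "java.lang.String[][]". */
    static String toJavaType(String descriptor) {
        int dims = 0;
        while (descriptor.charAt(dims) == '[') {
            dims++; // each leading '[' adds one array dimension
        }
        String base;
        switch (descriptor.charAt(dims)) {
            case 'B': base = "byte"; break;
            case 'C': base = "char"; break;
            case 'D': base = "double"; break;
            case 'F': base = "float"; break;
            case 'I': base = "int"; break;
            case 'J': base = "long"; break;
            case 'S': base = "short"; break;
            case 'Z': base = "boolean"; break;
            case 'V': base = "void"; break;
            case 'L':
                // drop the leading 'L' and the trailing ';', then switch '/' to '.'
                base = descriptor.substring(dims + 1, descriptor.length() - 1).replace('/', '.');
                break;
            default:
                throw new IllegalArgumentException("unknown descriptor: " + descriptor);
        }
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < dims; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }
}
```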
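2 Parsing the Class File
The ultimate goal of the program below is to read a .class file and convert it into a .java file.
2.1 Java Source File
After the Java source file below is compiled, its class file is located at: /Users/z8g/Test/build/classes/HelloWorld.class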
public class HelloWorld {
private final int age = 19971015;
private final float len = -25.125f;
private final double d = -25.125;
public int add(int i) {
return i + 1;
}
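}
2.2 UnsignedByte: an Unsigned Byte Class
The bytes of a class file need to be converted to unsigned values, which can be done with the following expression: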
byte b; short value = (short) ((short) b & 0xFF)
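The mask is needed because Java's byte is signed: the first magic byte 0xCA, for instance, is -54 when held in a byte. The sketch below compares the masking expression with the standard library's Byte.toUnsignedInt (available since Java 8); the class and method names are illustrative only.

```java
final class UnsignedByteSketch {
    /** The masking approach: keep only the low 8 bits. */
    static int viaMask(byte b) {
        return b & 0xFF;
    }

    /** The same conversion using the standard library. */
    static int viaLibrary(byte b) {
        return Byte.toUnsignedInt(b);
    }
}
```

Both return 202 for (byte) 0xCA, so the wrapper class in these notes could also be replaced by plain int values if preferred.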
Below is the implementation of the unsigned byte class:
class UnsignedByte {
private short value;
private byte rawValue;
private UnsignedByte() {
}
public static UnsignedByte from(byte b) {
UnsignedByte ub = new UnsignedByte();
ub.rawValue = b;
ub.value = (short) ((short) b & 0xFF);
return ub;
}
}
A few helper methods are built on top of this:
public static UnsignedByte[] from(byte[] bytes) {
UnsignedByte[] result = new UnsignedByte[bytes.length];
for (int i = 0; i < bytes.length; i++) {
result[i] = from(bytes[i]);
}
return result;
}
Getting the value:
public short value() {
return value;
}
// Combines one or two unsigned bytes into a single value, high byte first
public static long value(UnsignedByte[] bytes) {
if (bytes.length == 1) {
return bytes[0].value();
}
if (bytes.length == 2) {
return (bytes[0].value() << 8) + bytes[1].value();
}
throw new IllegalArgumentException();
}
Converting back to byte:
public static byte[] to(UnsignedByte[] b) {
byte[] result = new byte[b.length];
for (int i = 0; i < result.length; i++) {
result[i] = (byte) (b[i].rawValue & 0xFF);
}
return result;
}
public static byte to(UnsignedByte b) {
return b.rawValue;
}
public static byte to(short i) {
return (byte) i;
}
2.3 File Byte Stream
String classFilePath = "/Users/z8g/Test/build/classes/HelloWorld.class";
File classFile = new File(classFilePath);
FileInputStream fis = new FileInputStream(classFile);
With the FileInputStream instance fis in hand, the following sections call fis.read repeatedly to read the class file contents into byte arrays.
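One caveat worth keeping in mind: InputStream.read(byte[]) is allowed to return fewer bytes than the array holds. For a small local file this rarely matters, but a stricter variant is easy to write with DataInputStream.readFully. The helper below is a sketch under that assumption; ReadFullySketch and readExactly are hypothetical names, not part of the parser in these notes.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ReadFullySketch {
    /** Reads exactly n bytes from the stream, or fails if the stream ends first. */
    static byte[] readExactly(InputStream in, int n) {
        try {
            byte[] buf = new byte[n];
            // DataInputStream does not buffer, so wrapping per call loses no data;
            // readFully throws EOFException when fewer than n bytes remain
            new DataInputStream(in).readFully(buf);
            return buf;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A call such as readExactly(fis, 4) could stand in for the pair "new byte[4]" and "fis.read(...)" used in the following sections.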
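2.4 Reading the Magic Number
The magic number occupies 4 bytes, so declare a 4-byte array, read into it, and convert the result to an unsigned byte array.
// Read [magic number]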
byte[] magic = new byte[4];
fis.read(magic);
UnsignedByte[] magicUB = UnsignedByte.from(magic);
System.out.printf("[magic] %02x%02x %02x%02x\n",
magicUB[0].value(), magicUB[1].value(),
magicUB[2].value(), magicUB[3].value());
Console output:
[magic] cafe babe
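2.5 Reading the Minor and Major Version Numbers
Keep calling fis.read after that. The minor and major version numbers each occupy 2 bytes, so they can be read the same way.
// Read [minor version] and [major version]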
byte[] minor_version = new byte[2];
byte[] major_version = new byte[2];
fis.read(minor_version);
fis.read(major_version);
UnsignedByte[] minor_versionUB = UnsignedByte.from(minor_version);
UnsignedByte[] major_versionUB = UnsignedByte.from(major_version);
System.out.printf("[minor_version] %02x%02x\n",
minor_versionUB[0].value(), minor_versionUB[1].value());
System.out.printf("[major_version] %02x%02x\n",
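major_versionUB[0].value(), major_versionUB[1].value());
Console output:
[minor_version] 0000
[major_version] 0034
0034 is hexadecimal; its decimal value is 52. The minor version is 0, so the version number is 52.0, which corresponds to JDK 8.
2.6 Reading the Constant Pool Count
// Read [constant pool count]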
byte[] constant_pool_count = new byte[2];
fis.read(constant_pool_count);
UnsignedByte[] constant_pool_countUB = UnsignedByte.from(constant_pool_count);
System.out.printf("[constant_pool_count] %02x %02x\n",
constant_pool_countUB[0].value(), constant_pool_countUB[1].value());
long constantPoolCount = UnsignedByte.value(constant_pool_countUB);
System.out.printf("(constantPoolCount: %d)\n", constantPoolCount);
Console output:
[constant_pool_count] 00 24
(constantPoolCount: 36)
The constant pool index range is [1, constant_pool_count - 1], so the valid indexes of this pool run from 1 to 35.
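One detail of the class file format matters when walking those indexes: a CONSTANT_Long_info or CONSTANT_Double_info entry occupies two index slots, so the entry after it sits two indexes later. HelloWorld contains a double field, so this applies here. The sketch below shows the index bookkeeping only; PoolSlotsSketch and its methods are illustrative names.

```java
final class PoolSlotsSketch {
    /** Returns how many index slots a constant with the given tag occupies. */
    static int slots(int tag) {
        return (tag == 5 || tag == 6) ? 2 : 1; // Long and Double take two slots
    }

    /** Returns the constant pool index of each entry, given the tags in file order. */
    static int[] indexes(int[] tags) {
        int[] result = new int[tags.length];
        int index = 1; // the constant pool is counted from 1
        for (int i = 0; i < tags.length; i++) {
            result[i] = index;
            index += slots(tags[i]);
        }
        return result;
    }
}
```

A reading loop that counts from 1 to constant_pool_count - 1 therefore has to skip one extra index after each tag 5 or tag 6 entry.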
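2.7 Reading the Constant Pool
The previous step produced constant_pool_count, so iterate over [1, constant_pool_count - 1]. Each iteration first reads the tag (1 byte) and then picks a reading strategy based on it.
// Read [constant pool entries]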
for (long i = 1; i <= constantPoolCount - 1; i++) {
System.out.println("#" + i);
byte[] tag = new byte[1];
fis.read(tag);
UnsignedByte tagUB = UnsignedByte.from(tag[0]);
short tagValue = tagUB.value();
System.out.printf("[tag] %02x ", tagUB.value());
switch (tagValue) {
case 1:
System.out.println("(1 CONSTANT_Utf8_info) ");
read_CONSTANT_Utf8_info(fis);
break;
case 3:
System.out.println("(3 CONSTANT_Integer_info) ");
read_CONSTANT_Integer_info(fis);
break;
case 4:
System.out.println("(4 CONSTANT_Float_info) ");
read_CONSTANT_Float_info(fis);
break;
case 5:
System.out.println("(5 CONSTANT_Long_info) ");
read_CONSTANT_Long_info(fis);
i++; // a long occupies two constant pool slots
break;
case 6:
System.out.println("(6 CONSTANT_Double_info) ");
read_CONSTANT_Double_info(fis);
i++; // a double occupies two constant pool slots
break;
case 7:
System.out.println("(7 CONSTANT_Class_info) ");
read_CONSTANT_Class_info(fis);
break;
case 8:
System.out.println("(8 CONSTANT_String_info) ");
read_CONSTANT_String_info(fis);
break;
case 9:
System.out.println("(9 CONSTANT_Fieldref_info) ");
read_CONSTANT_Fieldref_info(fis);
break;
case 10:
System.out.println("(10 CONSTANT_Methodref_info) ");
read_CONSTANT_Methodref_info(fis);
break;
case 11:
System.out.println("(11 CONSTANT_InterfaceMethodref_info) ");
read_CONSTANT_InterfaceMethodref_info(fis);
break;
case 12:
System.out.println("(12 CONSTANT_NameAndType_info) ");
read_CONSTANT_NameAndType_info(fis);
break;
case 15:
System.out.println("(15 CONSTANT_MethodHandle_info) ");
read_CONSTANT_MethodHandle_info(fis);
break;
case 16:
System.out.println("(16 CONSTANT_MethodType_info) ");
read_CONSTANT_MethodType_info(fis);
break;
case 17:
System.out.println("(17 CONSTANT_Dynamic_info) ");
read_CONSTANT_Dynamic_info(fis);
break;
case 18:
System.out.println("(18 CONSTANT_InvokeDynamic_info) ");
read_CONSTANT_InvokeDynamic_info(fis);
break;
case 19:
System.out.println("(19 CONSTANT_Module_info) ");
read_CONSTANT_Module_info(fis);
break;
case 20:
System.out.println("(20 CONSTANT_Package_info) ");
read_CONSTANT_Package_info(fis);
break;
}
}
Reading 1 CONSTANT_Utf8_info:
- When tag == 1, the constant is a UTF-8 encoded string.
- The next 2 bytes are length, the length of bytes.
- Then read the following length bytes.
private static void read_CONSTANT_Utf8_info(FileInputStream fis) throws IOException {
byte[] length = new byte[2];
fis.read(length);
UnsignedByte[] lengthUB = UnsignedByte.from(length);
long lengthValue = UnsignedByte.value(lengthUB);
System.out.print("[utf8] ");
// Each byte is printed as one character, so only ASCII text displays correctly
for (long k = 0; k < lengthValue; k++) {
byte[] bytes = new byte[1];
fis.read(bytes);
UnsignedByte bytesUB = UnsignedByte.from(bytes[0]);
System.out.printf("%c", bytesUB.value());
}
System.out.println();
}
Reading 3 CONSTANT_Integer_info:
- When tag == 3, the constant is an integer literal.
- The next 4 bytes are bytes, an int value stored high byte first.
- An Integer is a 32-bit integer, so its value can be recovered by shifting each byte and summing.
private static void read_CONSTANT_Integer_info(FileInputStream fis) throws IOException {
byte[] bytes = new byte[4];
fis.read(bytes);
UnsignedByte[] bytesUB = UnsignedByte.from(bytes);
int integerValue = (bytesUB[0].value() << (8 * 3))
+ (bytesUB[1].value() << (8 * 2))
+ (bytesUB[2].value() << (8 * 1))
+ bytesUB[3].value();
System.out.printf("[int] %d\n", integerValue);
}
Reading 5 CONSTANT_Long_info:
- When tag == 5, the constant is a long integer literal.
- The next 8 bytes are bytes, a long value stored high byte first.
- A Long is a 64-bit integer, so its value can be recovered by shifting each byte and summing.
private static void read_CONSTANT_Long_info(FileInputStream fis) throws IOException {
byte[] bytes = new byte[8];
fis.read(bytes);
UnsignedByte[] bytesUB = UnsignedByte.from(bytes);
// Widen each byte to long before shifting; an int shift wraps around at 32 bits
long longValue = ((long) bytesUB[0].value() << (8 * 7))
+ ((long) bytesUB[1].value() << (8 * 6))
+ ((long) bytesUB[2].value() << (8 * 5))
+ ((long) bytesUB[3].value() << (8 * 4))
+ ((long) bytesUB[4].value() << (8 * 3))
+ ((long) bytesUB[5].value() << (8 * 2))
+ ((long) bytesUB[6].value() << (8 * 1))
+ bytesUB[7].value();
System.out.printf("[long] %d\n", longValue);
}
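The shift-and-combine idea for both int and long can be cross-checked against java.nio.ByteBuffer, which reads big-endian values by default. This is a sketch with made-up names (BigEndianSketch, toInt, toLong), working on plain byte arrays rather than the UnsignedByte wrapper.

```java
final class BigEndianSketch {
    /** Assembles a 32-bit int from 4 bytes stored high byte first. */
    static int toInt(byte[] b) {
        return ((b[0] & 0xFF) << 24)
                | ((b[1] & 0xFF) << 16)
                | ((b[2] & 0xFF) << 8)
                | (b[3] & 0xFF);
    }

    /** Assembles a 64-bit long from 8 bytes stored high byte first. */
    static long toLong(byte[] b) {
        long v = 0;
        for (int i = 0; i < 8; i++) {
            v = (v << 8) | (b[i] & 0xFF); // shift in one byte at a time, as a long
        }
        return v;
    }
}
```

The key point for the long case is that the accumulator is a long from the start, since shifting an int by 32 or more wraps around.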
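Reading 7 CONSTANT_Class_info:
- When tag == 7, the constant is a symbolic reference to a class or interface.
- The next 2 bytes are index, the constant pool index of the entry that holds the fully qualified name.
private static void read_CONSTANT_Class_info(FileInputStream fis) throws IOException {
byte[] index = new byte[2];
fis.read(index);
UnsignedByte[] indexUB = UnsignedByte.from(index);
System.out.printf("[index] %d\n", UnsignedByte.value(indexUB));
}
The reading methods for the other tags follow; they are implemented much like the one for CONSTANT_Class_info: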
private static void read_CONSTANT_String_info(FileInputStream fis) throws IOException {
byte[] index = new byte[2];
fis.read(index);
UnsignedByte[] indexUB = UnsignedByte.from(index);
System.out.printf("[index] %d\n", UnsignedByte.value(indexUB));
}
private static void read_CONSTANT_Fieldref_info(FileInputStream fis) throws IOException {
byte[] index1 = new byte[2];
byte[] index2 = new byte[2];
fis.read(index1);
fis.read(index2);
UnsignedByte[] index1UB = UnsignedByte.from(index1);
UnsignedByte[] index2UB = UnsignedByte.from(index2);
System.out.printf("[index1] %d\n", UnsignedByte.value(index1UB));
System.out.printf("[index2] %d\n", UnsignedByte.value(index2UB));
}
private static void read_CONSTANT_Methodref_info(FileInputStream fis) throws IOException {
byte[] index1 = new byte[2];
byte[] index2 = new byte[2];
fis.read(index1);
fis.read(index2);
UnsignedByte[] index1UB = UnsignedByte.from(index1);
UnsignedByte[] index2UB = UnsignedByte.from(index2);
System.out.printf("[index1] %d\n", UnsignedByte.value(index1UB));
System.out.printf("[index2] %d\n", UnsignedByte.value(index2UB));
}
private static void read_CONSTANT_InterfaceMethodref_info(FileInputStream fis) throws IOException {
byte[] index1 = new byte[2];
byte[] index2 = new byte[2];
fis.read(index1);
fis.read(index2);
UnsignedByte[] index1UB = UnsignedByte.from(index1);
UnsignedByte[] index2UB = UnsignedByte.from(index2);
System.out.printf("[index1] %d\n", UnsignedByte.value(index1UB));
System.out.printf("[index2] %d\n", UnsignedByte.value(index2UB));
}
private static void read_CONSTANT_NameAndType_info(FileInputStream fis) throws IOException {
byte[] index1 = new byte[2];
byte[] index2 = new byte[2];
fis.read(index1);
fis.read(index2);
UnsignedByte[] index1UB = UnsignedByte.from(index1);
UnsignedByte[] index2UB = UnsignedByte.from(index2);
System.out.printf("[index1] %d\n", UnsignedByte.value(index1UB));
System.out.printf("[index2] %d\n", UnsignedByte.value(index2UB));
}
private static void read_CONSTANT_MethodHandle_info(FileInputStream fis) throws IOException {
byte[] reference_kind = new byte[1]; // between 1 and 9; determines the kind of method handle
byte[] reference_index = new byte[2];
fis.read(reference_kind);
fis.read(reference_index);
UnsignedByte[] reference_kindUB = UnsignedByte.from(reference_kind);
UnsignedByte[] reference_indexUB = UnsignedByte.from(reference_index);
System.out.printf("[reference_kind] %d\n", UnsignedByte.value(reference_kindUB));
System.out.printf("[reference_index] %d\n", UnsignedByte.value(reference_indexUB));
}
private static void read_CONSTANT_MethodType_info(FileInputStream fis) throws IOException {
byte[] descriptor_index = new byte[2];
fis.read(descriptor_index);
UnsignedByte[] descriptor_indexUB = UnsignedByte.from(descriptor_index);
System.out.printf("[descriptor_index] %d\n", UnsignedByte.value(descriptor_indexUB));
}
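private static void read_CONSTANT_Dynamic_info(FileInputStream fis) throws IOException {
byte[] bootstrap_method_attr_index = new byte[2];
byte[] name_and_type_index = new byte[2]; // the entry has a second u2 field
fis.read(bootstrap_method_attr_index);
fis.read(name_and_type_index);
UnsignedByte[] bootstrap_method_attr_indexUB = UnsignedByte.from(bootstrap_method_attr_index);
UnsignedByte[] name_and_type_indexUB = UnsignedByte.from(name_and_type_index);
System.out.printf("[bootstrap_method_attr_index] %d\n", UnsignedByte.value(bootstrap_method_attr_indexUB));
System.out.printf("[name_and_type_index] %d\n", UnsignedByte.value(name_and_type_indexUB));
}
private static void read_CONSTANT_InvokeDynamic_info(FileInputStream fis) throws IOException {
byte[] bootstrap_method_attr_index = new byte[2];
byte[] name_and_type_index = new byte[2]; // the entry has a second u2 field
fis.read(bootstrap_method_attr_index);
fis.read(name_and_type_index);
UnsignedByte[] bootstrap_method_attr_indexUB = UnsignedByte.from(bootstrap_method_attr_index);
UnsignedByte[] name_and_type_indexUB = UnsignedByte.from(name_and_type_index);
System.out.printf("[bootstrap_method_attr_index] %d\n", UnsignedByte.value(bootstrap_method_attr_indexUB));
System.out.printf("[name_and_type_index] %d\n", UnsignedByte.value(name_and_type_indexUB));
}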
private static void read_CONSTANT_Module_info(FileInputStream fis) throws IOException {
byte[] name_index = new byte[2];
fis.read(name_index);
UnsignedByte[] name_indexUB = UnsignedByte.from(name_index);
System.out.printf("[name_index] %d\n", UnsignedByte.value(name_indexUB));
}
private static void read_CONSTANT_Package_info(FileInputStream fis) throws IOException {
byte[] name_index = new byte[2];
fis.read(name_index);
UnsignedByte[] name_indexUB = UnsignedByte.from(name_index);
System.out.printf("[name_index] %d\n", UnsignedByte.value(name_indexUB));
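}
The rest of this section focuses on reading 4 CONSTANT_Float_info and 6 CONSTANT_Double_info:
- When tag == 4, the constant is a floating-point literal.
- The next 4 bytes are bytes, a float value stored high byte first.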
private static void read_CONSTANT_Float_info(FileInputStream fis) throws IOException {
byte[] bytes = new byte[4];
fis.read(bytes);
UnsignedByte[] bytesUB = UnsignedByte.from(bytes);
System.out.printf("[bytes] %s %s %s %s\n",
Integer.toBinaryString(bytesUB[0].value()),
Integer.toBinaryString(bytesUB[1].value()),
Integer.toBinaryString(bytesUB[2].value()),
Integer.toBinaryString(bytesUB[3].value()));
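}
Console output:
[tag] 04 (4 CONSTANT_Float_info)
[bytes] 11000001 11001001 0 0
The printed bytes are the binary storage form of the float. The following explains in detail how to convert that binary form to decimal. The table below shows the binary layout of float and double:
| Type | Sign bit | Exponent | Mantissa | Total bits |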
|---|---|---|---|---|
| float | 1 | 8 | 23 | 32 |
| double | 1 | 11 | 52 | 64 |
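Here is an example of converting the binary form of a float to decimal:
Binary form of -25.125f:
11000001 11001001 00000000 00000000
Split according to the widths of the sign bit, exponent, and mantissa:
1 10000011 10010010000000000000000
The leading 1 means the number is negative.
10000011 in decimal is 128 + 2 + 1 = 131.
Subtracting 127 gives 4, the number of places the binary point moves to the right.
The remaining 23 bits are a pure binary fraction:
10010010000000000000000 stands for 0.10010010000000000000000
Prepending 1 gives 1.10010010000000000000000
Moving the binary point 4 places to the right gives 11001.0010000000000000000
In decimal this is (1 + 8 + 16) + 1/8 = 25.125
The sign bit is 1, so the number is -25.125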
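The worked example can be verified in code. The sketch below splits the 32 bits into sign, exponent, and fraction with masks, then rebuilds the value as significand × 2^(exponent - 127 - 23); it covers normal numbers only (not zero, subnormals, infinity, or NaN). FloatBitsSketch and decode are hypothetical names; Float.intBitsToFloat is the standard library method that performs the full conversion.

```java
final class FloatBitsSketch {
    /** Decodes a normal IEEE 754 single-precision value from its 32 bits. */
    static float decode(int bits) {
        int sign = bits >>> 31;                 // 1 sign bit
        int exponent = (bits >>> 23) & 0xFF;    // 8 exponent bits, biased by 127
        int fraction = bits & 0x7FFFFF;         // 23 mantissa bits
        int significand = fraction | (1 << 23); // prepend the implicit leading 1
        // The significand is an integer here, so the binary point sits 23 places
        // to the left: value = significand * 2^(exponent - 127 - 23)
        float magnitude = Math.scalb((float) significand, exponent - 127 - 23);
        return sign == 0 ? magnitude : -magnitude;
    }
}
```

For the bytes C1 C9 00 00 from the example, decode(0xC1C90000) and Float.intBitsToFloat(0xC1C90000) both give -25.125.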
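The float has already been read into bytesUB, an array of length 4.
- Shifting the value of bytesUB[0] right by 7 bits leaves only its bit 8 (counting from the right, here and below), which is the sign bit:
boolean positive = (bytesUB[0].value() >> 7) == 0;
- The exponent is made up of the low 7 bits of bytesUB[0].value and bit 8 (the highest bit) of bytesUB[1].value. Finally subtract the bias 127, i.e. (Math.pow(2, 7) - 1).
int exponent = bytesUB[0].value() & 0b00000000_01111111; // drop the sign bit
exponent <<= (8 - 7); // make room for one more bit
exponent += (bytesUB[1].value() >> (16 - 1 - 8)); // append the highest bit of the second byte
exponent -= (Math.pow(2, 7) - 1); // subtract 127
- The mantissa is computed as follows:
int mantissa = bytesUB[1].value() & 0b00000000_01111111; // ignore the highest bit
mantissa <<= (8 + 8); // shift left 16 bits to make room
mantissa += (bytesUB[2].value() << 8) + bytesUB[3].value(); // add the last 2 bytes
mantissa += 0b1000_0000____0000_0000____0000_0000; // prepend the leading 1
- The mantissa holds both the integer part and the fractional part; the position of the binary point is determined by the exponent:
int left = mantissa >> (23 - exponent);
int right = mantissa << (32 - 23 + exponent) >>> (32 - 23 + exponent); // unsigned shift keeps the fraction bits non-negative
- The fractional part is computed differently from the integer part. For example, binary 0.001 is 0.125 in decimal: first compute 8, then divide 1.0 by 8. This shortcut is exact only when exactly one fraction bit is set, as in this example.
int dividend = calcDividend(right, 23 - exponent);
- The calcDividend method is implemented as follows:
// Turns the binary digits after the point into a divisor, e.g. .001 gives 8, and 1/8 is 0.125
private static int calcDividend(int rightBits, int len) {
int result = 0;
while (rightBits != 0) {
int lastBits = rightBits & 1; // 1 if the lowest bit is set, otherwise 0
result += lastBits * Math.pow(2, len);
rightBits >>= 1;
len--;
}
return result;
}
- Finally, the decimal value is:
float floatValue = (left + 1.0f / dividend);
floatValue = positive ? floatValue : -floatValue;
System.out.printf("[float] %f\n", floatValue);
The read_CONSTANT_Float_info method ends here. read_CONSTANT_Double_info works similarly; the differences are the widths of the sign bit, exponent, and mantissa.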
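For the double case, the same steps apply with 11 exponent bits, 52 mantissa bits, and a bias of 1023. The sketch below is one possible way to do it, assuming normal numbers only; DoubleBitsSketch and decode are names chosen for this example, and Double.longBitsToDouble is the standard library equivalent.

```java
final class DoubleBitsSketch {
    /** Decodes a normal IEEE 754 double from 8 bytes stored high byte first. */
    static double decode(byte[] b) {
        long bits = 0;
        for (int i = 0; i < 8; i++) {
            bits = (bits << 8) | (b[i] & 0xFF);       // assemble the 64 bits
        }
        int sign = (int) (bits >>> 63);               // 1 sign bit
        int exponent = (int) ((bits >>> 52) & 0x7FF); // 11 exponent bits, biased by 1023
        long fraction = bits & 0xFFFFFFFFFFFFFL;      // 52 mantissa bits
        long significand = fraction | (1L << 52);     // prepend the implicit leading 1
        // value = significand * 2^(exponent - 1023 - 52)
        double magnitude = Math.scalb((double) significand, exponent - 1023 - 52);
        return sign == 0 ? magnitude : -magnitude;
    }
}
```

The field d = -25.125 in HelloWorld is stored as C0 39 20 00 00 00 00 00: the exponent bits 10000000011 are 1027, and 1027 - 1023 = 4, the same shift as in the float example.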
3 Source Code and Output
3.1 ClassFileParser.java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
/**
*
* @author z8g
*/
public class ClassFileParser {
public static void main(String[] args) throws FileNotFoundException, IOException {
String classFilePath = "/Users/z8g/Test/build/classes/HelloWorld.class";
File classFile = new File(classFilePath);
FileInputStream fis = new FileInputStream(classFile);
// Read [magic number]
byte[] magic = new byte[4];
fis.read(magic);
UnsignedByte[] magicUB = UnsignedByte.from(magic);
System.out.printf("[magic] %02x%02x %02x%02x\n",
magicUB[0].value(), magicUB[1].value(),
magicUB[2].value(), magicUB[3].value());
// Read [minor version] and [major version]
byte[] minor_version = new byte[2];
byte[] major_version = new byte[2];
fis.read(minor_version);
fis.read(major_version);
UnsignedByte[] minor_versionUB = UnsignedByte.from(minor_version);
UnsignedByte[] major_versionUB = UnsignedByte.from(major_version);
System.out.printf("[minor_version] %02x%02x\n",
minor_versionUB[0].value(), minor_versionUB[1].value());
System.out.printf("[major_version] %02x%02x\n",
major_versionUB[0].value(), major_versionUB[1].value());
// Read [constant pool count]
byte[] constant_pool_count = new byte[2];
fis.read(constant_pool_count);
UnsignedByte[] constant_pool_countUB = UnsignedByte.from(constant_pool_count);
System.out.printf("[constant_pool_count] %02x %02x\n",
constant_pool_countUB[0].value(), constant_pool_countUB[1].value());
long constantPoolCount = UnsignedByte.value(constant_pool_countUB);
System.out.printf("(constantPoolCount: %d)\n", constantPoolCount);
// Read [constant pool entries]
for (long i = 1; i <= constantPoolCount - 1; i++) {
System.out.println("#" + i);
byte[] tag = new byte[1];
fis.read(tag);
UnsignedByte tagUB = UnsignedByte.from(tag[0]);
short tagValue = tagUB.value();
System.out.printf("[tag] %02x ", tagUB.value());
switch (tagValue) {
case 1:
System.out.println("(1 CONSTANT_Utf8_info) ");
read_CONSTANT_Utf8_info(fis);
break;
case 3:
System.out.println("(3 CONSTANT_Integer_info) ");
read_CONSTANT_Integer_info(fis);
break;
case 4:
System.out.println("(4 CONSTANT_Float_info) ");
read_CONSTANT_Float_info(fis);
break;
case 5:
System.out.println("(5 CONSTANT_Long_info) ");
read_CONSTANT_Long_info(fis);
i++; // a long occupies two constant pool slots
break;
case 6:
System.out.println("(6 CONSTANT_Double_info) ");
read_CONSTANT_Double_info(fis);
i++; // a double occupies two constant pool slots
break;
case 7:
System.out.println("(7 CONSTANT_Class_info) ");
read_CONSTANT_Class_info(fis);
break;
case 8:
System.out.println("(8 CONSTANT_String_info) ");
read_CONSTANT_String_info(fis);
break;
case 9:
System.out.println("(9 CONSTANT_Fieldref_info) ");
read_CONSTANT_Fieldref_info(fis);
break;
case 10:
System.out.println("(10 CONSTANT_Methodref_info) ");
read_CONSTANT_Methodref_info(fis);
break;
case 11:
System.out.println("(11 CONSTANT_InterfaceMethodref_info) ");
read_CONSTANT_InterfaceMethodref_info(fis);
break;
case 12:
System.out.println("(12 CONSTANT_NameAndType_info) ");
read_CONSTANT_NameAndType_info(fis);
break;
case 15:
System.out.println("(15 CONSTANT_MethodHandle_info) ");
read_CONSTANT_MethodHandle_info(fis);
break;
case 16:
System.out.println("(16 CONSTANT_MethodType_info) ");
read_CONSTANT_MethodType_info(fis);
break;
case 17:
System.out.println("(17 CONSTANT_Dynamic_info) ");
read_CONSTANT_Dynamic_info(fis);
break;
case 18:
System.out.println("(18 CONSTANT_InvokeDynamic_info) ");
read_CONSTANT_InvokeDynamic_info(fis);
break;
case 19:
System.out.println("(19 CONSTANT_Module_info) ");
read_CONSTANT_Module_info(fis);
break;
case 20:
System.out.println("(20 CONSTANT_Package_info) ");
read_CONSTANT_Package_info(fis);
break;
}
}
}
private static void read_CONSTANT_Utf8_info(FileInputStream fis) throws IOException {
byte[] length = new byte[2];
fis.read(length);
UnsignedByte[] lengthUB = UnsignedByte.from(length);
long lengthValue = UnsignedByte.value(lengthUB);
System.out.print("[utf8] ");
for (long k = 0; k < lengthValue; k++) {
byte[] bytes = new byte[1];
fis.read(bytes);
UnsignedByte bytesUB = UnsignedByte.from(bytes[0]);
System.out.printf("%c", bytesUB.value());
}
System.out.println();
}
private static void read_CONSTANT_Integer_info(FileInputStream fis) throws IOException {
byte[] bytes = new byte[4];
fis.read(bytes);
UnsignedByte[] bytesUB = UnsignedByte.from(bytes);
int integerValue = (bytesUB[0].value() << (8 * 3))
+ (bytesUB[1].value() << (8 * 2))
+ (bytesUB[2].value() << (8 * 1))
+ bytesUB[3].value();
System.out.printf("[int] %d\n", integerValue);
}
private static void read_CONSTANT_Float_info(FileInputStream fis) throws IOException {
byte[] bytes = new byte[4];
fis.read(bytes);
UnsignedByte[] bytesUB = UnsignedByte.from(bytes);
System.out.printf("[bytes] %s %s %s %s\n",
Integer.toBinaryString(bytesUB[0].value()),
Integer.toBinaryString(bytesUB[1].value()),
Integer.toBinaryString(bytesUB[2].value()),
Integer.toBinaryString(bytesUB[3].value()));
/*
Binary form of -25.125f
11000001 11001001 00000000 00000000
1 10000011 10010010000000000000000
The leading 1 means the number is negative
10000011 in decimal is 128 + 2 + 1 = 131
Subtracting 127 gives 4, the number of places the binary point moves right
The remaining 23 bits are a pure binary fraction
10010010000000000000000 stands for 0.10010010000000000000000
Prepending 1 gives 1.10010010000000000000000
Moving the binary point 4 places right gives 11001.0010000000000000000
In decimal this is (1 + 8 + 16) + 1/8 = 25.125
The sign bit is 1, so the number is -25.125
*/
boolean positive = (bytesUB[0].value() >> 7) == 0;
int exponent = bytesUB[0].value() & 0b00000000_01111111; // drop the sign bit
exponent <<= (8 - 7); // make room for one more bit
exponent += (bytesUB[1].value() >> (16 - 1 - 8)); // append the highest bit of the second byte
exponent -= (Math.pow(2, 7) - 1); // subtract 127
int mantissa = bytesUB[1].value() & 0b00000000_01111111; // ignore the highest bit
mantissa <<= (8 + 8); // shift left 16 bits to make room
mantissa += (bytesUB[2].value() << 8) + bytesUB[3].value(); // add the last 2 bytes
mantissa += 0b1000_0000____0000_0000____0000_0000; // prepend the leading 1
int left = mantissa >> (23 - exponent);
int right = mantissa << (32 - 23 + exponent) >>> (32 - 23 + exponent); // unsigned shift keeps the fraction bits non-negative
int dividend = calcDividend(right, 23 - exponent);
float floatValue = (left + 1.0f / dividend);
floatValue = positive ? floatValue : -floatValue;
// System.out.println("dividend: " + dividend);
// System.out.println("mantissa: " + Integer.toBinaryString(mantissa));
// System.out.println("left: " + Integer.toBinaryString(left));
// System.out.println("right: " + Integer.toBinaryString(right));
//
// System.out.println("positive: " + positive);
// System.out.println("exponent: " + exponent);
// System.out.println("mantissa: " + mantissa);
//
// System.out.printf("[bytes] %s %s %s %s\n",
// (bytesUB[0].value()),
// (bytesUB[1].value()),
// (bytesUB[2].value()),
// (bytesUB[3].value()));
System.out.printf("[float] %f\n", floatValue);
}
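// Turns the binary digits after the point into a divisor, e.g. .001 gives 8, and 1/8 is 0.125.
// Exact only when exactly one fraction bit is set.
private static int calcDividend(int rightBits, int len) {
int result = 0;
while (rightBits != 0) {
int lastBits = rightBits & 1; // 1 if the lowest bit is set, otherwise 0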
result += lastBits * Math.pow(2, len);
rightBits >>= 1;
len--;
}
return result;
}
private static void read_CONSTANT_Long_info(FileInputStream fis) throws IOException {
byte[] bytes = new byte[8];
fis.read(bytes);
UnsignedByte[] bytesUB = UnsignedByte.from(bytes);
// Widen each byte to long before shifting; an int shift wraps around at 32 bits
long longValue = ((long) bytesUB[0].value() << (8 * 7))
+ ((long) bytesUB[1].value() << (8 * 6))
+ ((long) bytesUB[2].value() << (8 * 5))
+ ((long) bytesUB[3].value() << (8 * 4))
+ ((long) bytesUB[4].value() << (8 * 3))
+ ((long) bytesUB[5].value() << (8 * 2))
+ ((long) bytesUB[6].value() << (8 * 1))
+ bytesUB[7].value();
System.out.printf("[long] %d\n", longValue);
}
//          sign bit  exponent  mantissa  total bits
// float       1         8         23        32
// double      1        11         52        64
private static void read_CONSTANT_Double_info(FileInputStream fis) throws IOException {
byte[] bytes = new byte[8];
fis.read(bytes);
UnsignedByte[] bytesUB = UnsignedByte.from(bytes);
System.out.printf("[bytes] %s %s %s %s %s %s %s %s\n",
Integer.toBinaryString(bytesUB[0].value()),
Integer.toBinaryString(bytesUB[1].value()),
Integer.toBinaryString(bytesUB[2].value()),
Integer.toBinaryString(bytesUB[3].value()),
Integer.toBinaryString(bytesUB[4].value()),
Integer.toBinaryString(bytesUB[5].value()),
Integer.toBinaryString(bytesUB[6].value()),
Integer.toBinaryString(bytesUB[7].value()));
/*
Binary form of -25.125 as a double
11000000 00111001 00100000 00000000 00000000 00000000 00000000 00000000
1 10000000011 1001 00100000 00000000 00000000 00000000 00000000 00000000
The leading 1 means the number is negative
10000000011 in decimal is 1024 + 2 + 1 = 1027
Subtracting 1023 gives 4, the number of places the binary point moves right
The remaining 52 bits are a pure binary fraction; prepending 1 gives 1.1001001
Moving the binary point 4 places right gives 11001.001
In decimal this is (1 + 8 + 16) + 1/8 = 25.125
The sign bit is 1, so the number is -25.125
*/
boolean positive = (bytesUB[0].value() >> 7) == 0;
int exponent = bytesUB[0].value() & 0b00000000_01111111; // drop the sign bit
exponent <<= (11 - 7); // make room for four more bits
exponent += (bytesUB[1].value() >> (16 - 1 - 11)); // append the top 4 bits of the second byte
exponent -= 1023; // subtract the bias, 2^10 - 1
long mantissa = bytesUB[1].value() & 0b00000000_00001111; // ignore the top 4 bits
for (int k = 2; k < 8; k++) {
mantissa = (mantissa << 8) | bytesUB[k].value(); // append the remaining 6 bytes
}
mantissa |= 1L << 52; // prepend the leading 1
// value = mantissa * 2^(exponent - 52); handles normal numbers only
double doubleValue = Math.scalb((double) mantissa, exponent - 52);
doubleValue = positive ? doubleValue : -doubleValue;
System.out.printf("[double] %f\n", doubleValue);
}
private static void read_CONSTANT_Class_info(FileInputStream fis) throws IOException {
byte[] index = new byte[2];
fis.read(index);
UnsignedByte[] indexUB = UnsignedByte.from(index);
System.out.printf("[index] %d\n", UnsignedByte.value(indexUB));
}
private static void read_CONSTANT_String_info(FileInputStream fis) throws IOException {
byte[] index = new byte[2];
fis.read(index);
UnsignedByte[] indexUB = UnsignedByte.from(index);
System.out.printf("[index] %d\n", UnsignedByte.value(indexUB));
}
private static void read_CONSTANT_Fieldref_info(FileInputStream fis) throws IOException {
byte[] index1 = new byte[2];
byte[] index2 = new byte[2];
fis.read(index1);
fis.read(index2);
UnsignedByte[] index1UB = UnsignedByte.from(index1);
UnsignedByte[] index2UB = UnsignedByte.from(index2);
System.out.printf("[index1] %d\n", UnsignedByte.value(index1UB));
System.out.printf("[index2] %d\n", UnsignedByte.value(index2UB));
}
private static void read_CONSTANT_Methodref_info(FileInputStream fis) throws IOException {
byte[] index1 = new byte[2];
byte[] index2 = new byte[2];
fis.read(index1);
fis.read(index2);
UnsignedByte[] index1UB = UnsignedByte.from(index1);
UnsignedByte[] index2UB = UnsignedByte.from(index2);
System.out.printf("[index1] %d\n", UnsignedByte.value(index1UB));
System.out.printf("[index2] %d\n", UnsignedByte.value(index2UB));
}
private static void read_CONSTANT_InterfaceMethodref_info(FileInputStream fis) throws IOException {
byte[] index1 = new byte[2];
byte[] index2 = new byte[2];
fis.read(index1);
fis.read(index2);
UnsignedByte[] index1UB = UnsignedByte.from(index1);
UnsignedByte[] index2UB = UnsignedByte.from(index2);
System.out.printf("[index1] %d\n", UnsignedByte.value(index1UB));
System.out.printf("[index2] %d\n", UnsignedByte.value(index2UB));
}
private static void read_CONSTANT_NameAndType_info(FileInputStream fis) throws IOException {
byte[] index1 = new byte[2];
byte[] index2 = new byte[2];
fis.read(index1);
fis.read(index2);
UnsignedByte[] index1UB = UnsignedByte.from(index1);
UnsignedByte[] index2UB = UnsignedByte.from(index2);
System.out.printf("[index1] %d\n", UnsignedByte.value(index1UB));
System.out.printf("[index2] %d\n", UnsignedByte.value(index2UB));
}
private static void read_CONSTANT_MethodHandle_info(FileInputStream fis) throws IOException {
byte[] reference_kind = new byte[1]; // between 1 and 9; determines the kind of method handle
byte[] reference_index = new byte[2];
fis.read(reference_kind);
fis.read(reference_index);
UnsignedByte[] reference_kindUB = UnsignedByte.from(reference_kind);
UnsignedByte[] reference_indexUB = UnsignedByte.from(reference_index);
System.out.printf("[reference_kind] %d\n", UnsignedByte.value(reference_kindUB));
System.out.printf("[reference_index] %d\n", UnsignedByte.value(reference_indexUB));
}
private static void read_CONSTANT_MethodType_info(FileInputStream fis) throws IOException {
byte[] descriptor_index = new byte[2];
fis.read(descriptor_index);
UnsignedByte[] descriptor_indexUB = UnsignedByte.from(descriptor_index);
System.out.printf("[descriptor_index] %d\n", UnsignedByte.value(descriptor_indexUB));
}
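private static void read_CONSTANT_Dynamic_info(FileInputStream fis) throws IOException {
byte[] bootstrap_method_attr_index = new byte[2];
byte[] name_and_type_index = new byte[2]; // second u2 field of CONSTANT_Dynamic_info
fis.read(bootstrap_method_attr_index);
fis.read(name_and_type_index); // must be consumed, otherwise every later entry is read misaligned
UnsignedByte[] bootstrap_method_attr_indexUB = UnsignedByte.from(bootstrap_method_attr_index);
UnsignedByte[] name_and_type_indexUB = UnsignedByte.from(name_and_type_index);
System.out.printf("[bootstrap_method_attr_index] %d\n", UnsignedByte.value(bootstrap_method_attr_indexUB));
System.out.printf("[name_and_type_index] %d\n", UnsignedByte.value(name_and_type_indexUB));
}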
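private static void read_CONSTANT_InvokeDynamic_info(FileInputStream fis) throws IOException {
byte[] bootstrap_method_attr_index = new byte[2];
byte[] name_and_type_index = new byte[2]; // second u2 field of CONSTANT_InvokeDynamic_info
fis.read(bootstrap_method_attr_index);
fis.read(name_and_type_index); // must be consumed, otherwise every later entry is read misaligned
UnsignedByte[] bootstrap_method_attr_indexUB = UnsignedByte.from(bootstrap_method_attr_index);
UnsignedByte[] name_and_type_indexUB = UnsignedByte.from(name_and_type_index);
System.out.printf("[bootstrap_method_attr_index] %d\n", UnsignedByte.value(bootstrap_method_attr_indexUB));
System.out.printf("[name_and_type_index] %d\n", UnsignedByte.value(name_and_type_indexUB));
}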
private static void read_CONSTANT_Module_info(FileInputStream fis) throws IOException {
byte[] name_index = new byte[2];
fis.read(name_index);
UnsignedByte[] name_indexUB = UnsignedByte.from(name_index);
System.out.printf("[name_index] %d\n", UnsignedByte.value(name_indexUB));
}
private static void read_CONSTANT_Package_info(FileInputStream fis) throws IOException {
byte[] name_index = new byte[2];
fis.read(name_index);
UnsignedByte[] name_indexUB = UnsignedByte.from(name_index);
System.out.printf("[name_index] %d\n", UnsignedByte.value(name_indexUB));
}
}
class UnsignedByte {
private short value;
private byte rawValue;
private UnsignedByte() {
}
public static UnsignedByte from(byte b) {
UnsignedByte ub = new UnsignedByte();
ub.rawValue = b;
ub.value = (short) ((short) b & 0xFF);
return ub;
}
public static UnsignedByte[] from(byte[] bytes) {
UnsignedByte[] result = new UnsignedByte[bytes.length];
for (int i = 0; i < bytes.length; i++) {
result[i] = from(bytes[i]);
}
return result;
}
public short value() {
return value;
}
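public static long value(UnsignedByte[] bytes) {
if (bytes.length == 1) {
return bytes[0].value(); // u1: the only element is at index 0
}
if (bytes.length == 2) {
// u2 is big-endian: the high byte is shifted left by 8 bits.
// (An int shifted by 32 is shifted by 0 in Java, so << 32 would just add the two bytes.)
return (bytes[0].value() << 8) + bytes[1].value();
}
throw new IllegalArgumentException();
}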
public static byte[] to(UnsignedByte[] b) {
byte[] result = new byte[b.length];
for (int i = 0; i < result.length; i++) {
result[i] = (byte) (b[i].rawValue & 0xFF);
}
return result;
}
public static byte to(UnsignedByte b) {
return b.rawValue;
}
public static byte to(short i) {
return (byte) i;
}
}
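For comparison, the JDK can do the same unsigned conversions without a wrapper type. The following is a minimal sketch, not part of the parser above: the class name `UnsignedReadDemo` and the helpers `u1`/`u2` are hypothetical names chosen for illustration. `Byte.toUnsignedInt` widens one byte, shifting the high byte left by 8 combines a big-endian u2, and `DataInputStream.readUnsignedShort` reads a u2 directly from a stream.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class UnsignedReadDemo {
    // Widen one signed byte to its unsigned value (0..255).
    static int u1(byte b) {
        return Byte.toUnsignedInt(b);
    }

    // Combine two bytes, high byte first (class files are big-endian).
    static int u2(byte high, byte low) {
        return (u1(high) << 8) | u1(low);
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = {0x00, 0x24, (byte) 0xCA, (byte) 0xFE};
        System.out.println(u2(raw[0], raw[1])); // 36
        System.out.println(u2(raw[2], raw[3])); // 51966
        // The same u2 read straight from a stream.
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            System.out.println(in.readUnsignedShort()); // 36
        }
    }
}
```

The bytes `00 24` are the constant_pool_count of the HelloWorld class file, so the first line matches the value 36 reported by the parser.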
3.2 HelloWorld.java
public class HelloWorld {
private final int age = 19971015;
private final float len = -25.125f;
private final double d = -25.125;
public int add(int i) {
return i + 1;
}
}
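The `len` and `d` constants are stored in the constant pool as big-endian IEEE 754 bit patterns. As a cross-check for the hand-written binary-to-decimal conversion, here is a minimal sketch that lets the JDK decode them. The class name `Ieee754Demo` and its two helpers are hypothetical; the byte values are the bit patterns of -25.125f and -25.125.

```java
class Ieee754Demo {
    // Assemble 4 big-endian bytes into an int, then reinterpret the bits as a float.
    static float toFloat(byte[] b) {
        int bits = 0;
        for (int i = 0; i < 4; i++) {
            bits = (bits << 8) | (b[i] & 0xFF);
        }
        return Float.intBitsToFloat(bits);
    }

    // Assemble 8 big-endian bytes into a long, then reinterpret the bits as a double.
    static double toDouble(byte[] b) {
        long bits = 0L;
        for (int i = 0; i < 8; i++) {
            bits = (bits << 8) | (b[i] & 0xFFL);
        }
        return Double.longBitsToDouble(bits);
    }

    public static void main(String[] args) {
        byte[] f = {(byte) 0xC1, (byte) 0xC9, 0x00, 0x00};
        byte[] d = {(byte) 0xC0, 0x39, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00};
        System.out.println(toFloat(f));  // -25.125
        System.out.println(toDouble(d)); // -25.125
    }
}
```

Masking each byte with `0xFF` (or `0xFFL` for the long) before shifting is what keeps a negative byte from sign-extending into the higher bits.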
3.3 Output
Three things to note when reading this output: every [index2] repeats [index1]; [double] prints 0.000000 although the bytes encode -25.125; and because a CONSTANT_Double_info takes two constant-pool slots, entries printed after #6 are numbered one lower than their real index, so the final `#35 [tag] 00` is a byte past the end of the pool rather than a constant.
[magic] cafe babe
[minor_version] 0000
[major_version] 0034
[constant_pool_count] 00 24 (constantPoolCount: 36)
#1 [tag] 0a (10 CONSTANT_Methodref_info) [index1] 10 [index2] 10
#2 [tag] 03 (3 CONSTANT_Integer_info) [int] 19971015
#3 [tag] 09 (9 CONSTANT_Fieldref_info) [index1] 9 [index2] 9
#4 [tag] 04 (4 CONSTANT_Float_info) [bytes] 11000001 11001001 0 0 [float] -25.125000
#5 [tag] 09 (9 CONSTANT_Fieldref_info) [index1] 9 [index2] 9
#6 [tag] 06 (6 CONSTANT_Double_info) [bytes] 11000000 111001 100000 0 0 0 0 0 [double] 0.000000
#7 [tag] 09 (9 CONSTANT_Fieldref_info) [index1] 9 [index2] 9
#8 [tag] 07 (7 CONSTANT_Class_info) [index] 34
#9 [tag] 07 (7 CONSTANT_Class_info) [index] 35
#10 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] age
#11 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] I
#12 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] ConstantValue
#13 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] len
#14 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] F
#15 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] d
#16 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] D
#17 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] <init>
#18 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] ()V
#19 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] Code
#20 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] LineNumberTable
#21 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] LocalVariableTable
#22 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] this
#23 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] LHelloWorld;
#24 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] add
#25 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] (I)I
#26 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] i
#27 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] SourceFile
#28 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] HelloWorld.java
#29 [tag] 0c (12 CONSTANT_NameAndType_info) [index1] 18 [index2] 18
#30 [tag] 0c (12 CONSTANT_NameAndType_info) [index1] 11 [index2] 11
#31 [tag] 0c (12 CONSTANT_NameAndType_info) [index1] 14 [index2] 14
#32 [tag] 0c (12 CONSTANT_NameAndType_info) [index1] 16 [index2] 16
#33 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] HelloWorld
#34 [tag] 01 (1 CONSTANT_Utf8_info) [utf8] java/lang/Object
#35 [tag] 00
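When walking the constant pool, the index has to advance by 2 after a CONSTANT_Long_info (tag 5) or CONSTANT_Double_info (tag 6), since each occupies two slots. A minimal sketch of that numbering rule, assuming a hypothetical `SlotDemo.indices` helper that takes the tags in file order and returns the real constant-pool index of each entry:

```java
import java.util.ArrayList;
import java.util.List;

class SlotDemo {
    // Returns the constant-pool index of each entry, given the tags in file order.
    static List<Integer> indices(int[] tags) {
        List<Integer> result = new ArrayList<>();
        int index = 1; // the constant pool is numbered from 1
        for (int tag : tags) {
            result.add(index);
            // Long (5) and Double (6) take two slots; everything else takes one.
            index += (tag == 5 || tag == 6) ? 2 : 1;
        }
        return result;
    }

    public static void main(String[] args) {
        // Tags of the first eight entries in the HelloWorld output above.
        int[] tags = {10, 3, 9, 4, 9, 6, 9, 7};
        System.out.println(indices(tags)); // [1, 2, 3, 4, 5, 6, 8, 9]
    }
}
```

With this numbering the CONSTANT_Class_info entries land on #9 and #10, and their name indexes 34 and 35 resolve to HelloWorld and java/lang/Object.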