This document describes changes to the Java Virtual Machine Specification to support strict field initialization in the JVM, a preview feature introduced by JEP draft 8350458.
Revision history:
Jan 2026: added a discussion of initialization states to 2.9.1 and 2.9.2; revised some terminology; factored out some existing issues with
StackMapTableinitialization state inference into bug JDK-8375481; identified exception types thrown by static field checks; aligned with latest JVMS textApril 2025: initial proposed changes, extracted from earlier drafts of JVMS changes for Value Classes & Objects
Chapter 2: The Structure of the Java Virtual Machine
2.9 Special Methods
2.9.1 Instance Initialization Methods
A class has zero or more
instance initialization methods, used to initialize
instances of the class. each Each
typically corresponding corresponds to a
constructor written in the Java programming language.
If a class does not have an instance initialization method, its instances cannot be initialized or used.
A method is an instance initialization method if all of the following are true:
It is defined in a class (not an interface).
It has the special name
<init>.It is
void(4.3.3).
In a class, any
non-void method named <init> is not an
instance initialization method. In an interface, any method named
<init> is not an instance initialization method. Such
methods cannot be invoked by any Java Virtual Machine instruction (4.4.2, 4.9.2) and are rejected by format
checking (4.6, 4.8).
The declaration and use of an
instance initialization method is constrained by the Java Virtual
Machine. For the declaration, the method's access_flags
item and code array are constrained (4.6, 4.9.2). For a use, an instance
initialization method may be invoked only by the invokespecial
instruction on an uninitialized class instance (4.10.1.9).
Because the name
<init>is not a valid identifier in the Java programming language, it cannot be used directly in a program written in the Java programming language.
Class instances are created by the new instruction (6.5.new), and then must proceed through a sequence of initialization states to be made available for general use. These states are:
Uninitialized: The object has been created by
new, but has not yet attempted initialization.Early larval: The object is currently being initialized, and limited operations are available.
Late larval: The object is currently being initialized, but can be used without restriction.
Initialized: The object has successfully completed initialization.
Erroneous: The object failed initialization and may not be used.
A early larval state is
entered when an instance initialization method is invoked on an object
created by new. Most operations, including method
invocations, are not allowed on an object in an early larval
state, and the object may not be shared with other code. However, its
instance fields may be set with the putfield instruction (6.5.putfield).
Strictly-initialized instance fields (4.5)
must be set while the object is in this state.
While in the early larval
state, each instance initialization method delegates to another instance
initialization method with invokespecial (6.5.invokespecial).
The recursion bottoms out at the invocation of an
Object.<init> method. If any of these instance
initialization method invocations fails with an exception, the object
transitions to the erroneous state and can never become
initialized.
Once execution reaches
Object.<init>, the instance transitions to the
late larval state, and its fields and methods are available for
normal use. In this state, the object may be freely shared with other
code. However, that code must handle the object carefully, because it is
not yet fully initialized. Starting with
Object.<init>, each instance initialization method in
the call stack completes its initialization work and returns. If every
instance initialization method invocation returns successfully, the
object transitions to the initialized state and is available
for general use.
2.9.2 Class Initialization Methods
A class or interface has at most one class or interface initialization method and is initialized by the Java Virtual Machine invoking that method (5.5).
If a class or interface does not have a class or interface initialization method, initialization of the class or interface proceeds without executing any bytecode in the class or interface.
A method is a class or interface initialization method if all of the following are true:
It has the special name
<clinit>.It is
void(4.3.3).In a
classfile whose version number is 51.0 or above, the method has itsACC_STATICflag set and takes no arguments (4.6).The requirement for
ACC_STATICwas introduced in Java SE 7, and for taking no arguments in Java SE 9. In a class file whose version number is 50.0 or below, a method named<clinit>that isvoidis considered the class or interface initialization method regardless of the setting of itsACC_STATICflag or whether it takes arguments.
Other methods named
<clinit> in a class file are not class
or interface initialization methods. They are never invoked by the Java
Virtual Machine itself, cannot be invoked by any Java Virtual Machine
instruction (4.9.1), and are
rejected by format checking (4.6, 4.8).
Because the name
<clinit>is not a valid identifier in the Java programming language, it cannot be used directly in a program written in the Java programming language.
After a class has been loaded, it proceeds through a sequence of initialization states to be made available for general use. These states are:
Uninitialized: The class is loaded but has not yet attempted initialization.
Larval: The class is currently being initialized.
Initialized: The class has successfully completed initialization, and can be used without restriction.
Erroneous: The class failed initialization and may not be used.
Class initialization methods are executed while the class is in a larval state. In this state, the fields and methods of the class may be freely accessed in the current thread, while attempts to access the class from other threads are blocked. Strictly-initialized static fields (4.5) must be set while the class is in a larval state.
If the class initialization method invocation fails with an exception, the class transitions to the erroneous state and can never become initialized. If the class initialziation method returns successfully, the class transitions to the initialized state and is available for general use.
Chapter 4: The class File Format
4.5 Fields
Each field is described by a
field_info structure.
No two fields in one
class file may have the same name and descriptor (4.3.2).
The structure has the following format:
field_info {
u2 access_flags;
u2 name_index;
u2 descriptor_index;
u2 attributes_count;
attribute_info attributes[attributes_count];
}
The items of the
field_info structure are as follows:
- access_flags
-
The value of the
access_flagsitem is a mask of flags used to denote access permission to and properties of this field. The interpretation of each flag, when set, is specified in Table 4.5-A.Table 4.5-A. Field access and property flags
Flag Name Value Interpretation ACC_PUBLIC0x0001 Declared public; may be accessed from outside its package.ACC_PRIVATE0x0002 Declared private; accessible only within the defining class and other classes belonging to the same nest (5.4.4).ACC_PROTECTED0x0004 Declared protected; may be accessed within subclasses.ACC_STATIC0x0008 Declared static.ACC_FINAL0x0010 Declared final; never directly assigned to after object construction (JLS §17.5).ACC_VOLATILE0x0040 Declared volatile; cannot be cached.ACC_TRANSIENT0x0080 Declared transient; not written or read by a persistent object manager.ACC_STRICT_INIT0x0800 A strictly-initialized field; must be assigned before it can be read. ACC_SYNTHETIC0x1000 Declared synthetic; not present in the source code. ACC_ENUM0x4000 Declared as an element of an enumclass.Fields of classes may set any of the flags in Table 4.5-A. However, each field of a class may have at most one of its
ACC_PUBLIC,ACC_PRIVATE, andACC_PROTECTEDflags set (JLS §8.3.1), and must not have both itsACC_FINALandACC_VOLATILEflags set (JLS §8.3.1.4).Fields of interfaces must have their
ACC_PUBLIC,ACC_STATIC, andACC_FINALflags set; they may have theirACC_STRICT_INITorACC_SYNTHETICflag set, and must not have any of the other flags in Table 4.5-A set (JLS §9.3).If the class file version number is not
71.65535, the 0x0800 bit is ignored, and theACC_STRICT_INITflag is considered not to be set.The
ACC_SYNTHETICflag indicates that this field was generated by a compiler and does not appear in source code.The
ACC_ENUMflag indicates that this field is used to hold an element of an enum class (JLS §8.9).All bits of the
access_flagsitem not assigned in Table 4.5-A are reserved for future use. They should be set to zero in generatedclassfiles and should be ignored by Java Virtual Machine implementations. - name_index
-
The value of the
name_indexitem must be a valid index into theconstant_pooltable. Theconstant_poolentry at that index must be aCONSTANT_Utf8_infostructure (4.4.7) which represents a valid unqualified name denoting a field (4.2.2). - descriptor_index
-
The value of the
descriptor_indexitem must be a valid index into theconstant_pooltable. Theconstant_poolentry at that index must be aCONSTANT_Utf8_infostructure (4.4.7) which represents a valid field descriptor (4.3.2). - attributes_count
-
The value of the
attributes_countitem indicates the number of additional attributes of this field. - attributes[]
-
Each value of the
attributestable must be anattribute_infostructure (4.7).A field can have any number of optional attributes associated with it.
The attributes defined by this specification as appearing in the
attributestable of afield_infostructure are listed in Table 4.7-C.The rules concerning attributes defined to appear in the
attributestable of afield_infostructure are given in 4.7.The rules concerning non-predefined attributes in the
attributestable of afield_infostructure are given in 4.7.1.
4.7 Attributes
4.7.4 The StackMapTable Attribute
The StackMapTable
attribute is a variable-length attribute in the attributes
table of a Code attribute (4.7.3). A
StackMapTable attribute is used during the process of
verification by type checking (4.10.1).
There may be at most one
StackMapTable attribute in the attributes
table of a Code attribute.
In a class file whose
version number is 50.0 or above, if a method's Code
attribute does not have a StackMapTable attribute, it has
an implicit stack map attribute (4.10.1). This implicit stack map
attribute is equivalent to a StackMapTable attribute with
number_of_entries equal to zero.
The StackMapTable
attribute has the following format:
StackMapTable_attribute {
u2 attribute_name_index;
u4 attribute_length;
u2 number_of_entries;
stack_map_frame entries[number_of_entries];
}
The items of the
StackMapTable_attribute structure are as follows:
- attribute_name_index
-
The value of the
attribute_name_indexitem must be a valid index into theconstant_pooltable. Theconstant_poolentry at that index must be aCONSTANT_Utf8_infostructure (4.4.7) representing the string "StackMapTable". - attribute_length
-
The value of the
attribute_lengthitem indicates the length of the attribute, excluding the initial six bytes. - number_of_entries
-
The value of the
number_of_entriesitem gives the number ofstack_map_frameentries in theentriestable. - entries[]
-
Each entry in the
entriestable describes one stack map frame of the method. The order of the stack map frames in theentriestable is significant.
A stack map frame
specifies (either explicitly or implicitly) the bytecode offset at which
it applies, and the verification types of local variables and operand
stack entries for that offset. Verification by type checking
requires that a stack map frame be associated with every bytecode offset
of the code array that is used as a branch target or as the
start of an exception handler (4.10.1.6).
A stack map frame is
restricted if it specifies a local variable or operand stack
entry of type uninitializedThis. Restricted stack map
frames are used to describe the type state of instance initialization
methods operating on a new class instance in the early larval
initialization state (2.9.1). These frames
always specify, either explicitly or implicitly, a list of unset
strictly-initialized fields that must be set before instance
initialization can proceed.
Each stack map frame described in
the entries table relies on the previous frame for some of
its semantics. The first stack map frame of a method is implicit, and
computed from the method descriptor by the type checker (4.10.1.5). The stack_map_frame
structure at entries[0] therefore describes the second
stack map frame of the method.
The bytecode offset at
which a stack map frame applies is calculated by taking the value
offset_delta specified in the frame (either explicitly or
implicitly), and adding offset_delta + 1 to the bytecode
offset of the previous frame, unless the previous frame is the initial
frame of the method. In that case, the bytecode offset at which the
stack map frame applies is the value offset_delta specified
in the frame.
By using an offset delta rather than storing the actual bytecode offset, we ensure, by definition, that stack map frames are in the correctly sorted order. Furthermore, by consistently using the formula
offset_delta + 1for all explicit frames (as opposed to the implicit first frame), we guarantee the absence of duplicates.
We say that an instruction in the
bytecode has a corresponding stack map frame if the instruction
starts at offset i in the code array of a
Code attribute, and the Code attribute has a
StackMapTable attribute whose entries
array table contains a stack map frame that
applies at bytecode offset i.
A verification type
specifies the type of either one or two locations, where a
location is either a single local variable or a single operand
stack entry. A verification type is represented by a discriminated
union, verification_type_info, that consists of a one-byte
tag, indicating which item of the union is in use, followed by zero or
more bytes, giving more information about the tag.
union verification_type_info {
Top_variable_info;
Integer_variable_info;
Float_variable_info;
Long_variable_info;
Double_variable_info;
Null_variable_info;
UninitializedThis_variable_info;
Object_variable_info;
Uninitialized_variable_info;
}
A verification type that specifies
one location in the local variable array or in the operand stack is
represented by the following items of the
verification_type_info union:
The
Top_variable_infoitem indicates that the local variable has the verification typetop.Top_variable_info { u1 tag = ITEM_Top; /* 0 */ }The
Integer_variable_infoitem indicates that the location has the verification typeint.Integer_variable_info { u1 tag = ITEM_Integer; /* 1 */ }The
Float_variable_infoitem indicates that the location has the verification typefloat.Float_variable_info { u1 tag = ITEM_Float; /* 2 */ }The
Null_variable_infotype indicates that the location has the verification typenull.Null_variable_info { u1 tag = ITEM_Null; /* 5 */ }The
UninitializedThis_variable_infoitem indicates that the location has the verification typeuninitializedThis.UninitializedThis_variable_info { u1 tag = ITEM_UninitializedThis; /* 6 */ }The
Object_variable_infoitem indicates that the location has the verification type which is the class represented by theCONSTANT_Class_infostructure (4.4.1) found in theconstant_pooltable at the index given bycpool_index.Object_variable_info { u1 tag = ITEM_Object; /* 7 */ u2 cpool_index; }The
Uninitialized_variable_infoitem indicates that the location has the verification typeuninitialized(Offset). TheOffsetitem indicates the offset, in thecodearray of theCodeattribute that contains thisStackMapTableattribute, of the new instruction (6.5.new) that created the object being stored in the location.Uninitialized_variable_info { u1 tag = ITEM_Uninitialized; /* 8 */ u2 offset; }
A verification type that specifies
two locations in the local variable array or in the operand stack is
represented by the following items of the
verification_type_info union:
The
Long_variable_infoitem indicates that the first of two locations has the verification typelong.Long_variable_info { u1 tag = ITEM_Long; /* 4 */ }The
Double_variable_infoitem indicates that the first of two locations has the verification typedouble.Double_variable_info { u1 tag = ITEM_Double; /* 3 */ }The
Long_variable_infoandDouble_variable_infoitems indicate the verification type of the second of two locations as follows:If the first of the two locations is a local variable, then:
It must not be the local variable with the highest index.
The next higher numbered local variable has the verification type
top.
If the first of the two locations is an operand stack entry, then:
It must not be the topmost location of the operand stack.
The next location closer to the top of the operand stack has the verification type
top.
A stack map frame is represented by
a discriminated union, stack_map_frame, which consists of a
one-byte tag, indicating which item of the union is in use, followed by
zero or more bytes, giving more information about the tag. An
early_larval_frame is used to explicitly list the unset
fields of a restricted stack map frame. All other stack map frames have
the type base_frame.
union stack_map_frame {
base_frame;
early_larval_frame;
}
union stack_map_frame {
union base_frame {
same_frame;
same_locals_1_stack_item_frame;
same_locals_1_stack_item_frame_extended;
chop_frame;
same_frame_extended;
append_frame;
full_frame;
}
The tag indicates the frame type of the stack map frame:
The frame type
same_frameis represented by tags in the range [0-63]. This frame type indicates that the frame has exactly the same local variables as the previous frame and that the operand stack is empty. Theoffset_deltavalue for the frame is the value of the tag item,frame_type.same_frame { u1 frame_type = SAME; /* 0-63 */ }The frame type
same_locals_1_stack_item_frameis represented by tags in the range [64, 127]. This frame type indicates that the frame has exactly the same local variables as the previous frame and that the operand stack has one entry. Theoffset_deltavalue for the frame is given by the formulaframe_type - 64. The verification type of the one stack entry appears after the frame type.same_locals_1_stack_item_frame { u1 frame_type = SAME_LOCALS_1_STACK_ITEM; /* 64-127 */ verification_type_info stack[1]; }Tags in the range [128-246] are reserved for future use.The frame type
same_locals_1_stack_item_frame_extendedis represented by the tag 247. This frame type indicates that the frame has exactly the same local variables as the previous frame and that the operand stack has one entry. Theoffset_deltavalue for the frame is given explicitly, unlike in the frame typesame_locals_1_stack_item_frame. The verification type of the one stack entry appears afteroffset_delta.same_locals_1_stack_item_frame_extended { u1 frame_type = SAME_LOCALS_1_STACK_ITEM_EXTENDED; /* 247 */ u2 offset_delta; verification_type_info stack[1]; }The frame type
chop_frameis represented by tags in the range [248-250]. This frame type indicates that the frame has the same local variables as the previous frame except that the last k local variables are absent, and that the operand stack is empty. The value of k is given by the formula251 - frame_type. Theoffset_deltavalue for the frame is given explicitly.chop_frame { u1 frame_type = CHOP; /* 248-250 */ u2 offset_delta; }Assume the verification types of local variables in the previous frame are given by
locals, an array structured as in thefull_frameframe type. Iflocals[M-1]in the previous frame represented local variable X andlocals[M]represented local variable Y, then the effect of removing one local variable is thatlocals[M-1]in the new frame represents local variable X andlocals[M]is undefined.It is an error if k is larger than the number of local variables in
localsfor the previous frame, that is, if the number of local variables in the new frame would be less than zero.The frame type
same_frame_extendedis represented by the tag 251. This frame type indicates that the frame has exactly the same local variables as the previous frame and that the operand stack is empty. Theoffset_deltavalue for the frame is given explicitly, unlike in the frame typesame_frame.same_frame_extended { u1 frame_type = SAME_FRAME_EXTENDED; /* 251 */ u2 offset_delta; }The frame type
append_frameis represented by tags in the range [252-254]. This frame type indicates that the frame has the same locals as the previous frame except that k additional locals are defined, and that the operand stack is empty. The value of k is given by the formulaframe_type - 251. Theoffset_deltavalue for the frame is given explicitly.append_frame { u1 frame_type = APPEND; /* 252-254 */ u2 offset_delta; verification_type_info locals[frame_type - 251]; }The 0th entry in
localsrepresents the verification type of the first additional local variable. Iflocals[M]represents local variableN, then:locals[M+1]represents local variableN+1iflocals[M]is one ofTop_variable_info,Integer_variable_info,Float_variable_info,Null_variable_info,UninitializedThis_variable_info,Object_variable_info, orUninitialized_variable_info; andlocals[M+1]represents local variableN+2iflocals[M]is eitherLong_variable_infoorDouble_variable_info.
It is an error if, for any index i,
locals[i]represents a local variable whose index is greater than the maximum number of local variables for the method.The frame type
full_frameis represented by the tag 255. Theoffset_deltavalue for the frame is given explicitly.full_frame { u1 frame_type = FULL_FRAME; /* 255 */ u2 offset_delta; u2 number_of_locals; verification_type_info locals[number_of_locals]; u2 number_of_stack_items; verification_type_info stack[number_of_stack_items]; }The 0th entry in
localsrepresents the verification type of local variable- If
locals[M]represents local variableN, then:
locals[M+1]represents local variableN+1iflocals[M]is one ofTop_variable_info,Integer_variable_info,Float_variable_info,Null_variable_info,UninitializedThis_variable_info,Object_variable_info, orUninitialized_variable_info; andlocals[M+1]represents local variableN+2iflocals[M]is eitherLong_variable_infoorDouble_variable_info.
It is an error if, for any index i,
locals[i]represents a local variable whose index is greater than the maximum number of local variables for the method.The 0th entry in
stackrepresents the verification type of the bottom of the operand stack, and subsequent entries instackrepresent the verification types of stack entries closer to the top of the operand stack. We refer to the bottom of the operand stack as stack entry 0, and to subsequent entries of the operand stack as stack entry 1, 2, etc. Ifstack[M]represents stack entryN, then:stack[M+1]represents stack entryN+1ifstack[M]is one ofTop_variable_info,Integer_variable_info,Float_variable_info,Null_variable_info,UninitializedThis_variable_info,Object_variable_info, orUninitialized_variable_info; andstack[M+1]represents stack entryN+2ifstack[M]is eitherLong_variable_infoorDouble_variable_info.
It is an error if, for any index i,
stack[i]represents a stack entry whose index is greater than the maximum operand stack size for the method.- If
The frame type
early_larval_frameis represented by the tag 246. This frame type wraps a restrictedbase_frameand specifies that, for thatbase_frame, the listed strict-initialized fields should be treated as unset. The list may be empty to indicate that no strictly-initialized fields are unset.The
offset_deltaof anearly_larval_frameis given by thebase_framethat it wraps.early_larval_frame { u1 frame_type = EARLY_LARVAL; /* 246 */ u2 number_of_unset_fields; u2 unset_fields[number_of_unset_fields]; base_frame base; }Each entry in the
unset_fieldsarray must be a valid index into theconstant_pooltable. The constant_pool entry at that index must be aCONSTANT_NameAndType_info, and the given name and descriptor must match that of a field declared by the current class with itsACC_STRICT_INITflag set and itsACC_STATICflag not set.It is an error if the wrapped
base_framedoes not have a local variable or operand stack entry of typeuninitializedThis.When a restricted frame is not wrapped in an
early_larval_frame, the unset fields of that frame are inferred: if the previous frame was restricted, the unset fields are the same as the unset fields of the previous frame; otherwise, the list of unset fields is empty.The implicit first stack map frame of an instance initialization method is restricted, with the list of unset fields consisting of every stricitly-initialized instance field declared by the class (4.10.1.5).
See also JDK-8375481, which ensures that the operand stack types are considered when determining whether a frame is restricted.
Tags in the range [128-245] are reserved for future use.
4.9 Constraints on Java Virtual Machine Code
4.9.2 Structural Constraints
The structural constraints on the
code array specify constraints on relationships between
Java Virtual Machine instructions. The structural constraints are as
follows:
Each instruction must only be executed with the appropriate type and number of arguments in the operand stack and local variable array, regardless of the execution path that leads to its invocation.
As noted in 2.3.4 and 2.11.1, the Java Virtual Machine implicitly converts values of types
boolean,byte,short, andcharto typeint, allowing instructions expecting values of typeintto operate on them.If an instruction can be executed along several different execution paths, the operand stack must have the same depth (2.6.2) prior to the execution of the instruction, regardless of the path taken.
At no point during execution can the operand stack grow to a depth greater than that implied by the
max_stackitem.At no point during execution can more values be popped from the operand stack than it contains.
At no point during execution can the order of the local variable pair holding a value of type
longordoublebe reversed or the pair split up. At no point can the local variables of such a pair be operated on individually.No local variable (or local variable pair, in the case of a value of type
longordouble) can be accessed before it is assigned a value.Each invokespecial instruction must name one of the following:
an instance initialization method (2.9.1)
a method in the current class or interface
a method in a superclass of the current class
a method in a direct superinterface of the current class or interface
a method in
Object
If an invokespecial instruction names an instance initialization method, then the target reference on the operand stack must be an uninitialized class instance. An instance initialization method must never be invoked on an initialized class instance. In addition:
If the target reference on the operand stack is an uninitialized class instance for the current class, then invokespecial must name an instance initialization method from the current class or its direct superclass.
If the target reference on the operand stack is a class instance created by an earlier new instruction, then invokespecial must name an instance initialization method from the class of that class instance.
In either case, if the instruction is covered by an exception handler, then any copy of the target reference stored in a local variable must be treated as unusable by the exception handler code.
If an invokespecial instruction names a method which is not an instance initialization method, then the target reference on the operand stack must be a class instance whose type is assignment compatible with the current class (JLS §5.2).
The general rule for invokespecial is that the class or interface named by invokespecial must be be "above" the caller class or interface, while the receiver object targeted by invokespecial must be "at" or "below" the caller class or interface. The latter clause is especially important: a class or interface can only perform invokespecial on its own objects. See 4.10.1.9.invokespecial for an explanation of how the latter clause is implemented in Prolog.
Each instance initialization method, except for the instance initialization method derived from the constructor of class
Object, must call either another instance initialization method ofthisor an instance initialization method of its direct superclasssuperbefore its instance members are accessed, and before the calling instance initialization method returns.However, instance fields of
thisthat are declared in the current class may be assigned by putfield before calling any instance initialization method.If an instance field of
thisis declared in the current class with itsACC_STRICT_INITflag set, then that field must be assigned by a putfield instruction before invoking an instance initialization method of the direct superclass. Further, if the field is bothACC_STRICT_INITandACC_FINAL, then it may not be assigned after invoking another instance initialization method.When any instance method is invoked or when any instance variable is accessed, the class instance that contains the instance method or instance variable must already be initialized.
There must never be an uninitialized class instance on the operand stack or in a local variable when a jsr or jsr_w instruction is executed.
The type of every class instance that is the target of a method invocation instruction (that is, the type of the target reference on the operand stack) must be assignment compatible with the class or interface type specified in the instruction.
The types of the arguments to each method invocation must be method invocation compatible with the method descriptor (JLS §5.3, 4.3.3).
Each return instruction must match its method's return type:
If the method returns a
boolean,byte,char,short, orint, only the ireturn instruction may be used.If the method returns a
float,long, ordouble, only an freturn, lreturn, or dreturn instruction, respectively, may be used.If the method returns a
referencetype, only an areturn instruction may be used, and the type of the returned value must be assignment compatible with the return descriptor of the method (4.3.3).All instance initialization methods, class or interface initialization methods, and methods declared to return
voidmust use only the return instruction.
The type of every class instance accessed by a getfield instruction or modified by a putfield instruction (that is, the type of the target reference on the operand stack) must be assignment compatible with the class type specified in the instruction.
The type of every value stored by a putfield or putstatic instruction must be compatible with the descriptor of the field (4.3.2) of the class instance or class being stored into:
If the descriptor type is
boolean,byte,char,short, orint, then the value must be anint.If the descriptor type is
float,long, ordouble, then the value must be afloat,long, ordouble, respectively.If the descriptor type is a
referencetype, then the value must be of a type that is assignment compatible with the descriptor type.
The type of every value stored into an array by an aastore instruction must be a
referencetype.The component type of the array being stored into by the aastore instruction must also be a
referencetype.Each athrow instruction must throw only values that are instances of class
Throwableor of subclasses ofThrowable.Each class mentioned in a
catch_typeitem of theexception_tablearray of the method'sCode_attributestructure must beThrowableor a subclass ofThrowable.If getfield or putfield is used to access a
protectedfield declared in a superclass that is a member of a different run-time package than the current class, then the type of the class instance being accessed (that is, the type of the target reference on the operand stack) must be assignment compatible with the current class.If invokevirtual or invokespecial is used to access a
protectedmethod declared in a superclass that is a member of a different run-time package than the current class, then the type of the class instance being accessed (that is, the type of the target reference on the operand stack) must be assignment compatible with the current class.Execution never falls off the bottom of the
codearray.No return address (a value of type
returnAddress) may be loaded from a local variable.The instruction following each jsr or jsr_w instruction may be returned to only by a single ret instruction.
No jsr or jsr_w instruction that is returned to may be used to recursively call a subroutine if that subroutine is already present in the subroutine call chain. (Subroutines can be nested when using
try-finallyconstructs from within afinallyclause.)Each instance of type
returnAddresscan be returned to at most once.If a ret instruction returns to a point in the subroutine call chain above the ret instruction corresponding to a given instance of type
returnAddress, then that instance can never be used as a return address.
4.10 Verification of class Files
4.10.1 Verification by Type Checking
4.10.1.1 Accessors for Java Virtual Machine Artifacts
We stipulate the existence of 17
Prolog predicates ("accessors") that have certain expected behavior but
whose formal definitions are not given in this specification. The Prolog
term Class, as used here, represents a binary class or
interface that has been successfully loaded (5.3); the Prolog term
Method represents a method of a Class; and the
Prolog term Loader represents a class loader of a
Class. This specification does not mandate a precise
structure for these entities.
- classClassName(Class, ClassName)
-
Extracts the name,
ClassName, of the classClass. - classIsInterface(Class)
-
True iff the class,
Class, is an interface. - classIsNotFinal(Class)
-
True iff the class,
Class, is not afinalclass. - classSuperClassName(Class, SuperClassName)
-
Extracts the name,
SuperClassName, of the superclass of classClass. - classInterfaceNames(Class, InterfaceNames)
-
Extracts a list,
InterfaceNames, of the names of the direct superinterfaces of the classClass. - classMethods(Class, Methods)
-
Extracts a list,
Methods, of the methods declared in the classClass. - classAttributes(Class, Attributes)
-
Extracts a list,
Attributes, of the attributes of the classClass.Each attribute is represented as a functor application of the form
attribute(AttributeName, AttributeContents), whereAttributeNameis the name of the attribute. The format of the attribute's contents is unspecified. - classDeclaresMember(Class, MemberName, MemberDescriptor, AccessFlags)
-
Asserts that a class,
Class, declares a field or method with nameMemberName,anddescriptorMemberDescriptor, and access flagsAccessFlags. This assertion does not consider members declared in the superclasses or superinterfaces ofClass. - classDefiningLoader(Class, Loader)
-
Extracts the defining class loader,
Loader, of the classClass. - isBootstrapLoader(Loader)
-
True iff the class loader
Loaderis the bootstrap class loader. - loadedClass(Name, InitiatingLoader, ClassDefinition)
-
True iff there exists a class named
Namewhose representation (in accordance with this specification) when loaded by the class loaderInitiatingLoaderisClassDefinition. - methodName(Method, Name)
-
Extracts the name,
Name, of the methodMethod. - methodAccessFlags(Method, AccessFlags)
-
Extracts the access flags,
AccessFlags, of the methodMethod. - methodDescriptor(Method, Descriptor)
-
Extracts the descriptor,
Descriptor, of the methodMethod. - methodAttributes(Method, Attributes)
-
Extracts a list,
Attributes, of the attributes of the methodMethod.Each attribute is represented as a functor application of the form
attribute(AttributeName, AttributeContents), whereAttributeNameis the name of the attribute. The format of the attribute's contents is unspecified. - isInit(Method)
-
True iff
Method(regardless of class) is<init>. - isNotInit(Method)
-
True iff
Method(regardless of class) is not<init>. - isNotFinal(Method, Class)
-
True iff
Methodin classClassis notfinal. - isStatic(Method, Class)
-
True iff
Methodin classClassisstatic. - isNotStatic(Method, Class)
-
True iff
Methodin classClassis notstatic. - isPrivate(Method, Class)
-
True iff
Methodin classClassisprivate. - isNotPrivate(Method, Class)
-
True iff
Methodin classClassis notprivate. - isProtected(MemberClass, MemberName, MemberDescriptor)
-
True iff there is a member named
MemberNamewith descriptorMemberDescriptorin the classMemberClassand it isprotected. - isNotProtected(MemberClass, MemberName, MemberDescriptor)
-
True iff there is a member named
MemberNamewith descriptorMemberDescriptorin the classMemberClassand it is notprotected. - parseFieldDescriptor(Descriptor, Type)
-
Converts a field descriptor,
Descriptor, into the corresponding verification typeType(4.10.1.2).The verification type derived from descriptor types
byte,short,boolean, andchar, when not used as array component types, isint. - parseMethodDescriptor(Descriptor, ArgTypeList, ReturnType)
-
Converts a method descriptor,
Descriptor, into a list of verification types,ArgTypeList, corresponding to the method argument types, and a verification type,ReturnType, corresponding to the return type.The verification type derived from descriptor types
byte,short,boolean, andchar, when not used as array component types, isint. A void return is represented with the special symbolvoid. - parseCodeAttribute(Class, Method, FrameSize, MaxStack, ParsedCode, Handlers, StackMap)
-
Extracts the instruction stream,
ParsedCode, of the methodMethodinClass, as well as the maximum operand stack size,MaxStack, the maximal number of local variables,FrameSize, the exception handlers,Handlers, and the stack mapStackMap.The representation of the instruction stream and stack map attribute must be as specified in 4.10.1.3 and 4.10.1.4.
- samePackageName(Class1, Class2)
-
True iff the package names of
Class1andClass2are the same. - differentPackageName(Class1, Class2)
-
True iff the package names of
Class1andClass2are different.
The above accessors are used to
define loadedSuperclasses, which produces a list of a
class's superclasses by recurring on each class's direct superclass
until Object, which has no superclass, is reached.
loadedSuperclasses(Class, [ Superclass | Rest ]) :-
classSuperClassName(Class, SuperclassName),
classDefiningLoader(Class, L),
loadedClass(SuperclassName, L, Superclass),
loadedSuperclasses(Superclass, Rest).
loadedSuperclasses(Class, []) :-
\+ classSuperClassName(Class, _).
When type checking a method's body, it is convenient to access information about the method. For this purpose, we define an environment, a six-tuple consisting of:
a class
a method
the declared return type of the method (or
void)the instructions in a method
the maximal size of the operand stack
a list of exception handlers
We specify accessors to extract information from the environment.
allInstructions(Environment, Instructions) :-
Environment = environment(_Class, _Method, _ReturnType,
Instructions, _, _).
exceptionHandlers(Environment, Handlers) :-
Environment = environment(_Class, _Method, _ReturnType,
_Instructions, _, Handlers).
maxOperandStackLength(Environment, MaxStack) :-
Environment = environment(_Class, _Method, _ReturnType,
_Instructions, MaxStack, _Handlers).
currentClassLoader(Environment, Loader) :-
Environment = environment(Class, _Method, _ReturnType,
_Instructions, _, _),
classDefiningLoader(Class, Loader).
thisClass(Environment, Class) :-
Environment = environment(Class, _Method, _ReturnType,
_Instructions, _, _).
thisType(Environment, class(ClassName, L)) :-
Environment = environment(Class, _Method, _ReturnType,
_Instructions, _, _),
classDefiningLoader(Class, L),
classClassName(Class, ClassName).
thisMethodReturnType(Environment, ReturnType) :-
Environment = environment(_Class, _Method, ReturnType,
_Instructions, _, _).
offsetStackFrame(Environment, Offset, StackFrame) :-
allInstructions(Environment, Instructions),
member(stackMap(Offset, StackFrame), Instructions).
Finally, we specify a general predicate used throughout the type rules:
notMember(_, []).
notMember(X, [A | More]) :- X \= A, notMember(X, More).
The principle guiding the determination as to which accessors are stipulated and which are fully specified is that we do not want to over-specify the representation of the
classfile. Providing specific accessors to theClassorMethodterm would force us to completely specify the format for a Prolog term representing theclassfile.
4.10.1.4 Stack Map Frames and Type Transitions
Stack map frames are represented in Prolog as a list of terms of the form:
stackMap(Offset, TypeState)
where:
Offsetis an integer indicating the bytecode offset at which the stack map frame applies (4.7.4).The order of bytecode offsets in this list must be the same as in the
classfile.TypeStateis the expected incoming type state for the instruction atOffset.
A type state is a mapping from locations in the operand stack and local variables of a method to verification types. It has the form:
frame(Locals, OperandStack, Flags)
frame(Locals, OperandStack, InitState)
where:
Localsis a list of verification types, such that the i'th element of the list (with 0-based indexing) represents the type of local variable i.Types of size 2 (
longanddouble) are represented by two local variables (2.6.1), with the first local variable being the type itself and the second local variable beingtop(4.10.1.7).OperandStackis a list of verification types, such that the first element of the list represents the type of the top of the operand stack, and the types of stack entries below the top follow in the list in the appropriate order.Types of size 2 (
longanddouble) are represented by two stack entries, with the first entry beingtopand the second entry being the type itself.For example, a stack with a
doublevalue, anintvalue, and alongvalue is represented in a type state as a stack with five entries:topanddoubleentries for thedoublevalue, anintentry for theintvalue, andtopandlongentries for thelongvalue. Accordingly,OperandStackis the list[top, double, int, top, long].Flagsis a list which may either be empty or have the single elementflagThisUninit.If any local variable in
Localshas the typeuninitializedThis, thenFlagshas the single elementflagThisUninit, otherwiseFlagsis an empty list.flagThisUninitis used in constructors to mark type states where initialization ofthishas not yet been completed. In such type states, it is illegal to return from the method.InitStateexpresses, if relevant, the instance initialization state (2.9.1) of a class instance being initialized. It may be any of the following:restricted(UnsetFields), whereUnsetFieldsis a list of functor applications of the formunsetField(Name, Descriptor). This expresses an early larval state in which the given strictly-initialized instance fields must be set before initialization can proceed.erroneous, expressing an erroneous state in which initialization has failed and an exception must be propagated to the creator of the new instance.unrestricted, if neither of the above apply. This includes instance initialization method code that runs in the late larval initialization state, and the code of all other methods.
Subtyping of verification types
is extended pointwise to type states. The local variable array of a
method has a fixed length by construction (see
methodInitialStackFrame in 4.10.1.5), but the operand stack grows and
shrinks, so we require an explicit check on the length of the operand
stacks whose assignability is desired for subtyping. Subtyping
only allows a transition between different initialization states if both
are restricted states, and the latter does not remove any
unset fields.
frameIsAssignable(frame(Locals1, StackMap1, Flags1),
frame(Locals2, StackMap2, Flags2)) :-
frameIsAssignable(frame(Locals1, StackMap1, InitState1),
frame(Locals2, StackMap2, InitState2)) :-
length(StackMap1, StackMapLength),
length(StackMap2, StackMapLength),
maplist(isAssignable, Locals1, Locals2),
maplist(isAssignable, StackMap1, StackMap2),
subset(Flags1, Flags2).
initStateTransition(InitState1, InitState2).
initStateTransition(InitState, InitState).
initStateTransition(restricted(UnsetFields1),
restricted(UnsetFields2)) :-
subset(UnsetFields1, UnsetFields2).
Most of the type rules for individual instructions (4.10.1.9) depend on the notion of a valid type transition. A type transition is valid if one can pop a list of expected types off the incoming type state's operand stack and replace them with an expected result type, resulting in a new type state where the length of the operand stack does not exceed its declared maximum size.
validTypeTransition(Environment, ExpectedTypesOnStack, ResultType,
frame(Locals, InputOperandStack, Flags),
frame(Locals, NextOperandStack, Flags)) :-
frame(Locals, InputOperandStack, InitState),
frame(Locals, NextOperandStack, InitState)) :-
popMatchingList(InputOperandStack, ExpectedTypesOnStack,
InterimOperandStack),
pushOperandStack(InterimOperandStack, ResultType, NextOperandStack),
operandStackHasLegalLength(Environment, NextOperandStack).
Pop a list of types off the stack.
popMatchingList(OperandStack, [], OperandStack).
popMatchingList(OperandStack, [P | Rest], NewOperandStack) :-
popMatchingType(OperandStack, P, TempOperandStack, _ActualType),
popMatchingList(TempOperandStack, Rest, NewOperandStack).
Pop an individual type off the
stack. The exact behavior depends on the stack contents. If the logical
top of the stack is some subtype of the specified type,
Type, then pop it. If a type occupies two stack entries,
then the logical top of the stack is really the type just below the top,
and the top of the stack is the unusable type top.
popMatchingType([ActualType | OperandStack],
Type, OperandStack, ActualType) :-
sizeOf(Type, 1),
isAssignable(ActualType, Type).
popMatchingType([top, ActualType | OperandStack],
Type, OperandStack, ActualType) :-
sizeOf(Type, 2),
isAssignable(ActualType, Type).
sizeOf(X, 2) :- isAssignable(X, twoWord).
sizeOf(X, 1) :- isAssignable(X, oneWord).
sizeOf(top, 1).
Push a logical type onto the
stack. The exact behavior varies with the size of the type. If the
pushed type is of size 1, we just push it onto the stack. If the pushed
type is of size 2, we push it, and then push top.
pushOperandStack(OperandStack, 'void', OperandStack).
pushOperandStack(OperandStack, Type, [Type | OperandStack]) :-
sizeOf(Type, 1).
pushOperandStack(OperandStack, Type, [top, Type | OperandStack]) :-
sizeOf(Type, 2).
The length of the operand stack must not exceed the declared maximum size.
operandStackHasLegalLength(Environment, OperandStack) :-
length(OperandStack, Length),
maxOperandStackLength(Environment, MaxStack),
Length =< MaxStack.
The dup instructions pop expected types off the incoming type state's operand stack and replace them with predefined result types, resulting in a new type state. However, these instructions are not defined in terms of type transitions because there is no need to match types by means of the subtyping relation. Instead, the dup instructions manipulate the operand stack entirely in terms of the category of types on the stack (2.11.1).
Category 1 types occupy a single
stack entry. Popping a logical type of category 1, Type,
off the stack is possible if the top of the stack is Type
and Type is not top (otherwise it could denote
the upper half of a category 2 type). The result is the incoming stack,
with the top entry popped off.
popCategory1([Type | Rest], Type, Rest) :-
Type \= top,
sizeOf(Type, 1).
Category 2 types occupy two
stack entries. Popping a logical type of category 2, Type,
off the stack is possible if the top of the stack is type
top, and the entry directly below it is Type.
The result is the incoming stack, with the top two entries popped
off.
popCategory2([top, Type | Rest], Type, Rest) :-
sizeOf(Type, 2).
The dup instructions push a list of types onto the stack in essentially the same way as when a type is pushed for a valid type transition.
canSafelyPush(Environment, InputOperandStack, Type, OutputOperandStack) :-
pushOperandStack(InputOperandStack, Type, OutputOperandStack),
operandStackHasLegalLength(Environment, OutputOperandStack).
canSafelyPushList(Environment, InputOperandStack, Types,
OutputOperandStack) :-
canPushList(InputOperandStack, Types, OutputOperandStack),
operandStackHasLegalLength(Environment, OutputOperandStack).
canPushList(InputOperandStack, [], InputOperandStack).
canPushList(InputOperandStack, [Type | Rest], OutputOperandStack) :-
pushOperandStack(InputOperandStack, Type, InterimOperandStack),
canPushList(InterimOperandStack, Rest, OutputOperandStack).
Many of the type rules for individual instructions use the following clause to easily pop a list of types off the stack.
canPop(frame(Locals, OperandStack, Flags), Types,
frame(Locals, PoppedOperandStack, Flags)) :-
canPop(frame(Locals, OperandStack, InitState), Types,
frame(Locals, PoppedOperandStack, InitState)) :-
popMatchingList(OperandStack, Types, PoppedOperandStack).
Finally, certain array instructions (4.10.1.9.aaload, 4.10.1.9.arraylength, 4.10.1.9.baload, 4.10.1.9.bastore) peek at types on the operand stack in order to check they are array types. The following clause accesses the i'th element of the operand stack from a type state.
nth1OperandStackIs(*i*, frame(_Locals, OperandStack, _Flags), Element) :-
nth1OperandStackIs(*i*, frame(_, OperandStack, _), Element) :-
nth1(*i*, OperandStack, Element).
4.10.1.5 Type Checking Methods
A method with a Code attribute is type safe if it is
possible to merge the code and the stack map frames into a single stream
such that each stack map frame precedes the instruction it corresponds
to, and the merged stream is type correct. The method's exception
handlers, if any, must also be legal.
methodIsTypeSafe(Class, Method) :-
parseCodeAttribute(Class, Method, FrameSize, MaxStack,
ParsedCode, Handlers, StackMap),
mergeStackMapAndCode(StackMap, ParsedCode, MergedCode),
methodInitialStackFrame(Class, Method, FrameSize, StackFrame, ReturnType),
Environment = environment(Class, Method, ReturnType, MergedCode,
MaxStack, Handlers),
handlersAreLegal(Environment),
mergedCodeIsTypeSafe(Environment, MergedCode, StackFrame).
Let us consider exception handlers first.
An exception handler is represented by a functor application of the form:
handler(Start, End, Target, ClassName)
whose arguments are, respectively, the start and end of the range of instructions covered by the handler, the first instruction of the handler code, and the name of the exception class that this handler is designed to handle.
An exception handler is legal if its start
(Start) is less than its end (End), there
exists an instruction whose offset is equal to Start, there
exists an instruction whose offset equals End, and the
handler's exception class is assignable to the class
Throwable. The exception class of a handler is
Throwable if the handler's class entry is 0, otherwise it
is the class named in the handler.
handlersAreLegal(Environment) :-
exceptionHandlers(Environment, Handlers),
checklist(handlerIsLegal(Environment), Handlers).
handlerIsLegal(Environment, Handler) :-
Handler = handler(Start, End, Target, _),
Start < End,
allInstructions(Environment, Instructions),
member(instruction(Start, _), Instructions),
offsetStackFrame(Environment, Target, _),
instructionsIncludeEnd(Instructions, End),
currentClassLoader(Environment, CurrentLoader),
handlerExceptionClass(Handler, ExceptionClass, CurrentLoader),
isBootstrapLoader(BL),
isAssignable(ExceptionClass, class('java/lang/Throwable', BL)).
instructionsIncludeEnd(Instructions, End) :-
member(instruction(End, _), Instructions).
instructionsIncludeEnd(Instructions, End) :-
member(endOfCode(End), Instructions).
handlerExceptionClass(handler(_, _, _, 0),
class('java/lang/Throwable', BL), _) :-
isBootstrapLoader(BL).
handlerExceptionClass(handler(_, _, _, Name),
class(Name, L), L) :-
Name \= 0.
Let us now turn to the stream of instructions and stack map frames.
Merging instructions and stack map frames into a single stream involves four cases:
Merging an empty
StackMapand a list of instructions yields the original list of instructions.mergeStackMapAndCode([], CodeList, CodeList).Given a list of stack map frames beginning with the type state for the instruction at
Offset, and a list of instructions beginning atOffset, the merged list is the head of the stack map frame list, followed by the head of the instruction list, followed by the merge of the tails of the two lists.mergeStackMapAndCode([stackMap(Offset, Map) | RestMap], [instruction(Offset, Parse) | RestCode], [stackMap(Offset, Map), instruction(Offset, Parse) | RestMerge]) :- mergeStackMapAndCode(RestMap, RestCode, RestMerge).Otherwise, given a list of stack map frames beginning with the type state for the instruction at
OffsetM, and a list of instructions beginning atOffsetP, then, ifOffsetP < OffsetM, the merged list consists of the head of the instruction list, followed by the merge of the stack map frame list and the tail of the instruction list.mergeStackMapAndCode([stackMap(OffsetM, Map) | RestMap], [instruction(OffsetP, Parse) | RestCode], [instruction(OffsetP, Parse) | RestMerge]) :- OffsetP < OffsetM, mergeStackMapAndCode([stackMap(OffsetM, Map) | RestMap], RestCode, RestMerge).Otherwise, the merge of the two lists is undefined. Since the instruction list has monotonically increasing offsets, the merge of the two lists is not defined unless every stack map frame offset has a corresponding instruction offset and the stack map frames are in monotonically increasing order.
To determine if the merged stream for a method is type correct, we first infer the method's initial type state.
The initial type state of a method consists of an empty operand stack
and local variable types derived from the type of this and
the arguments, as well as the appropriate flag, depending on
whether instance initialization state, if this is
an <init> method.
methodInitialStackFrame(Class, Method, FrameSize, frame(Locals, [], Flags),
ReturnType):-
methodInitialStackFrame(Class, Method, FrameSize,
frame(Locals, [], InitState), ReturnType):-
methodDescriptor(Method, Descriptor),
parseMethodDescriptor(Descriptor, RawArgs, ReturnType),
expandTypeList(RawArgs, Args),
methodInitialThisType(Class, Method, ThisList),
flags(ThisList, Flags),
methodInitialInitState(Class, Method, InitState),
append(ThisList, Args, ThisArgs),
expandToLength(ThisArgs, FrameSize, top, Locals).
Given a list of types, the following clause produces a list where
every type of size 2 has been substituted by two entries: one for
itself, and one top entry. The result then corresponds to
the representation of the list as 32-bit words in the Java Virtual
Machine.
expandTypeList([], []).
expandTypeList([Item | List], [Item | Result]) :-
sizeOf(Item, 1),
expandTypeList(List, Result).
expandTypeList([Item | List], [Item, top | Result]) :-
sizeOf(Item, 2),
expandTypeList(List, Result).
flags([uninitializedThis], [flagThisUninit]).
flags(X, []) :- X \= [uninitializedThis].
expandToLength(List, Size, _Filler, List) :-
length(List, Size).
expandToLength(List, Size, Filler, Result) :-
length(List, ListLength),
ListLength < Size,
Delta is Size - ListLength,
length(Extra, Delta),
checklist(=(Filler), Extra),
append(List, Extra, Result).
For the initial type state of an instance method, we compute the type
of this and put it in a list. The type of this
in an <init> method of a class with a superclass
(that is, of any class except Object) is
uninitializedThis; the type of this in any
other instance method, including an <init> method of
Object, is class(N, L) where N is
the name of the class containing the method and L is its
defining class loader.
For the initial type state of a static method, this is
irrelevant, so the list is empty.
methodInitialThisType(_Class, Method, []) :-
methodAccessFlags(Method, AccessFlags),
member(static, AccessFlags),
methodName(Method, MethodName),
MethodName \= '`<init>`'.
methodInitialThisType(Class, Method, [This]) :-
methodAccessFlags(Method, AccessFlags),
notMember(static, AccessFlags),
instanceMethodInitialThisType(Class, Method, This).
instanceMethodInitialThisType(Class, Method, uninitializedThis) :-
isSubclassInstanceInit(Class, Method).
instanceMethodInitialThisType(Class, Method, class(ClassName, L)) :-
\+ isSubclassInstanceInit(Class, Method),
classClassName(Class, ClassName),
classDefiningLoader(Class, L).
isSubclassInstanceInit(Class, Method) :-
methodName(Method, '`<init>`'),
classSuperClassName(Class, _).
The starting initialization state (4.10.1.4) of an <init>
method (other than Object.<init>) is
restricted, with each strictly-initialized instance field
of the class listed as unset. The starting initialization state of any
other method is unrestricted.
methodInitialInitState(Class, Method, restricted(UnsetFields)) :-
isSubclassInstanceInit(Class, Method),
findall(unsetField(FieldName, FieldDescriptor),
fieldRequiresAssignment(Class, FieldName, FieldDescriptor),
UnsetFields).
fieldRequiresAssignment(Class, FieldName, FieldDescriptor) :-
classDeclaresMember(Class, FieldName, FieldDescriptor, FieldFlags),
member(strict_init, FieldFlags),
notMember(static, FieldFlags).
methodInitialInitState(Class, Method, unrestricted) :-
\+ isSubclassInstanceInit(Class, Method).
4.10.1.6 Type Checking Code Streams
Given an incoming type state, the mergedCodeIsTypeSafe
predicate checks that a merged stream of instructions and stack map
frames is type correct:
If we have a stack map frame and an incoming type state, the type state must be assignable to the one in the stack map frame. We may then proceed to type check the rest of the stream with the type state given in the stack map frame.
mergedCodeIsTypeSafe(Environment, [stackMap(Offset, MapFrame) | MoreCode], frame(Locals, OperandStack, Flags)) :- frameIsAssignable(frame(Locals, OperandStack, Flags), MapFrame),mergedCodeIsTypeSafe(Environment, [stackMap(Offset, MapFrame) | MoreCode], frame(Locals, OperandStack, InitState)) :- frameIsAssignable(frame(Locals, OperandStack, InitState), MapFrame),mergedCodeIsTypeSafe(Environment, MoreCode, MapFrame).A merged code stream is type safe relative to an incoming type state
Tif it begins with an instructionIthat is type safe relative toT, andIsatisfies its exception handlers (see below), and the tail of the stream is type safe given the type state following that execution ofI.NextStackFrameindicates what falls through to the following instruction. For an unconditional branch instruction, it will have the special valueafterGoto.ExceptionStackFrameindicates what is passed to exception handlers.mergedCodeIsTypeSafe(Environment, [instruction(Offset, Parse) | MoreCode], frame(Locals, OperandStack, Flags)) :- instructionIsTypeSafe(Parse, Environment, Offset, frame(Locals, OperandStack, Flags), NextStackFrame, ExceptionStackFrame),mergedCodeIsTypeSafe(Environment, [instruction(Offset, Parse) | MoreCode], frame(Locals, OperandStack, InitState)) :- instructionIsTypeSafe(Parse, Environment, Offset, frame(Locals, OperandStack, InitState), NextStackFrame, ExceptionStackFrame),instructionSatisfiesHandlers(Environment, Offset, ExceptionStackFrame), mergedCodeIsTypeSafe(Environment, MoreCode, NextStackFrame).After an unconditional branch (indicated by an incoming type state of
afterGoto), if we have a stack map frame giving the type state for the following instructions, we can proceed and type check them using the type state provided by the stack map frame.mergedCodeIsTypeSafe(Environment, [stackMap(Offset, MapFrame) | MoreCode], afterGoto) :- mergedCodeIsTypeSafe(Environment, MoreCode, MapFrame).It is illegal to have code after an unconditional branch without a stack map frame being provided for it.
mergedCodeIsTypeSafe(_Environment, [instruction(_, _) | _MoreCode], afterGoto) :- write_ln('No stack frame after unconditional branch'), fail.If we have an unconditional branch at the end of the code, stop.
mergedCodeIsTypeSafe(_Environment, [endOfCode(Offset)], afterGoto).
Branching to a target is type safe if the target has an associated
stack frame, Frame, and the current stack frame,
StackFrame, is assignable to Frame.
targetIsTypeSafe(Environment, StackFrame, Target) :-
offsetStackFrame(Environment, Target, Frame),
frameIsAssignable(StackFrame, Frame).
An instruction satisfies its exception handlers if it satisfies every exception handler that is applicable to the instruction.
instructionSatisfiesHandlers(Environment, Offset, ExceptionStackFrame) :-
exceptionHandlers(Environment, Handlers),
sublist(isApplicableHandler(Offset), Handlers, ApplicableHandlers),
checklist(instructionSatisfiesHandler(Environment, ExceptionStackFrame),
ApplicableHandlers).
An exception handler is applicable to an instruction if the offset of the instruction is greater or equal to the start of the handler's range and less than the end of the handler's range.
isApplicableHandler(Offset, handler(Start, End, _Target, _ClassName)) :-
Offset >= Start,
Offset < End.
An instruction satisfies an exception handler if the
instructions's outgoing type state is ExcStackFrame, and
the handler's target (the initial instruction of the handler code) is
type safe assuming an incoming type state T. The type state
T is derived from ExcStackFrame by replacing
the operand stack with a stack whose sole element is the handler's
exception class.
instructionSatisfiesHandler(Environment, ExcStackFrame, Handler) :-
Handler = handler(_, _, Target, _),
currentClassLoader(Environment, CurrentLoader),
handlerExceptionClass(Handler, ExceptionClass, CurrentLoader),
/* The stack consists of just the exception. */
ExcStackFrame = frame(Locals, _, Flags),
TrueExcStackFrame = frame(Locals, [ ExceptionClass ], Flags),
ExcStackFrame = frame(Locals, _, InitState),
TrueExcStackFrame = frame(Locals, [ ExceptionClass ], InitState),
operandStackHasLegalLength(Environment, TrueExcStackFrame),
targetIsTypeSafe(Environment, TrueExcStackFrame, Target).
4.10.1.7 Type Checking Load and Store Instructions
All load instructions are variations on a common pattern, varying the type of the value that the instruction loads.
Loading a value of type
Type from local variable Index is type safe,
if the type of that local variable is ActualType,
ActualType is assignable to Type, and pushing
ActualType onto the incoming operand stack is a valid type
transition (4.10.1.4) that yields a new
type state NextStackFrame. After execution of the load
instruction, the type state will be NextStackFrame.
loadIsTypeSafe(Environment, Index, Type, StackFrame, NextStackFrame) :-
StackFrame = frame(Locals, _OperandStack, _Flags),
StackFrame = frame(Locals, _OperandStack, _InitState),
nth0(Index, Locals, ActualType),
isAssignable(ActualType, Type),
validTypeTransition(Environment, [], ActualType, StackFrame,
NextStackFrame).
All store instructions are variations on a common pattern, varying the type of the value that the instruction stores.
In general, a store instruction
is type safe if the local variable it references is of a type that is a
supertype of Type, and the top of the operand stack is of a
subtype of Type, where Type is the type the
instruction is designed to store.
More precisely, the store is
type safe if one can pop a type ActualType that "matches"
Type (that is, is a subtype of Type) off the
operand stack (4.10.1.4), and then legally
assign that type the local variable LIndex.
storeIsTypeSafe(_Environment, Index, Type,
frame(Locals, OperandStack, Flags),
frame(NextLocals, NextOperandStack, Flags)) :-
storeIsTypeSafe(_Environment, Index, Type,
frame(Locals, OperandStack, InitState),
frame(NextLocals, NextOperandStack, InitState)) :-
popMatchingType(OperandStack, Type, NextOperandStack, ActualType),
modifyLocalVariable(Index, ActualType, Locals, NextLocals).
Given local variables Locals, modifying
Index to have type Type results in the local
variable list NewLocals. The modifications are somewhat
involved, because some values (and their corresponding types) occupy two
local variables. Hence, modifying LN may require modifying
LN+1 (because the type will occupy both the N
and N+1 slots) or LN-1 (because local
N used to be the upper half of the two word value/type
starting at local N-1, and so local N-1 must
be invalidated), or both. This is described further below. We start at
L0 and count up.
modifyLocalVariable(Index, Type, Locals, NewLocals) :-
modifyLocalVariable(0, Index, Type, Locals, NewLocals).
Given LocalsRest, the suffix of the local variable list
starting at index I, modifying local variable
Index to have type Type results in the local
variable list suffix NextLocalsRest.
If I < Index-1, just copy the input to the output and
recurse forward. If I = Index-1, the type of local
I may change. This can occur if LI has a type
of size 2. Once we set LI+1 to the new type (and the
corresponding value), the type/value of LI will be
invalidated, as its upper half will be trashed. Then we recurse
forward.
modifyLocalVariable(I, Index, Type,
[Locals1 | LocalsRest],
[Locals1 | NextLocalsRest] ) :-
I < Index - 1,
I1 is I + 1,
modifyLocalVariable(I1, Index, Type, LocalsRest, NextLocalsRest).
modifyLocalVariable(I, Index, Type,
[Locals1 | LocalsRest],
[NextLocals1 | NextLocalsRest] ) :-
I =:= Index - 1,
modifyPreIndexVariable(Locals1, NextLocals1),
modifyLocalVariable(Index, Index, Type, LocalsRest, NextLocalsRest).
When we find the variable, and it only occupies one word, we change
it to Type and we're done. When we find the variable, and
it occupies two words, we change its type to Type and the
next word to top.
modifyLocalVariable(Index, Index, Type,
[_ | LocalsRest], [Type | LocalsRest]) :-
sizeOf(Type, 1).
modifyLocalVariable(Index, Index, Type,
[_, _ | LocalsRest], [Type, top | LocalsRest]) :-
sizeOf(Type, 2).
We refer to a local whose index immediately precedes a local whose
type will be modified as a pre-index variable. The future type
of a pre-index variable of type InputType is
Result. If the type, Type, of the pre-index
local is of size 1, it doesn't change. If the type of the pre-index
local, Type, is 2, we need to mark the lower half of its
two word value as unusable, by setting its type to top.
modifyPreIndexVariable(Type, Type) :- sizeOf(Type, 1).
modifyPreIndexVariable(Type, top) :- sizeOf(Type, 2).
4.10.1.8 Type Checking for protected
Members
All instructions that access
members must contend with the rules concerning protected
members. This section describes the protected check that
corresponds to JLS §6.6.2.1.
The protected check
applies only to protected members of superclasses of the
current class. protected members in other classes will be
caught by the access checking done at resolution (5.4.4). There are four cases:
If the referenced type is not a class type with the same name as a superclass, it cannot be a superclass, and so it can safely be ignored.
passesProtectedCheck(Environment, class(MemberClassName, _), MemberName, MemberDescriptor, StackFrame) :- thisClass(Environment, CurrentClass), \+ hasSuperclassWithName(CurrentClass, MemberClassName). passesProtectedCheck(Environment, arrayOf(_), MemberName, MemberDescriptor, StackFrame). hasSuperclassWithName(Class, SuperclassName) :- loadedSuperclasses(Class, Supers), member(Super, Supers), classClassName(Super, SuperclassName).If the
MemberClassNameis the same as the name of a superclass, the class being resolved may indeed be a superclass. In this case, if no superclass namedMemberClassNamein a different run-time package has aprotectedmember namedMemberNamewith descriptorMemberDescriptor, theprotectedcheck does not apply.This is because the actual class being resolved will either be one of these superclasses, in which case we know that it is either in the same run-time package, and the access is legal; or the member in question is not
protectedand the check does not apply; or it will be a subclass, in which case the check would succeed anyway; or it will be some other class in the same run-time package, in which case the access is legal and the check need not take place; or the verifier need not flag this as a problem, since it will be caught anyway because resolution will per force fail.passesProtectedCheck(Environment, class(MemberClassName, _), MemberName, MemberDescriptor, StackFrame) :- thisClass(Environment, CurrentClass), hasSuperclassWithName(CurrentClass, MemberClassName), loadedSuperclasses(CurrentClass, Supers), classesInOtherPkgWithProtectedMember( CurrentClass, MemberName, MemberDescriptor, MemberClassName, Supers, []).If there does exist a
protectedsuperclass member in a different run-time package, then loadMemberClassName; if the member in question is notprotected, the check does not apply. (Using a superclass member that is notprotectedis trivially correct.)passesProtectedCheck(Environment, class(MemberClassName, Loader), MemberName, MemberDescriptor, StackFrame) :- thisClass(Environment, CurrentClass), hasSuperclassWithName(CurrentClass, MemberClassName), loadedSuperclasses(CurrentClass, Supers), classesInOtherPkgWithProtectedMember( CurrentClass, MemberName, MemberDescriptor, MemberClassName, Supers, List), List \= [], loadedClass(MemberClassName, Loader, ReferencedClass), isNotProtected(ReferencedClass, MemberName, MemberDescriptor).Otherwise, use of a member of an object of type
Targetrequires thatTargetbe assignable to the type of the current class.passesProtectedCheck(Environment, class(MemberClassName, Loader), MemberName, MemberDescriptor,frame(_Locals, [Target | _Rest], _Flags)) :-frame(_Locals, [Target | _Rest], _InitState)) :-thisClass(Environment, CurrentClass), hasSuperclassWithName(CurrentClass, MemberClassName), loadedSuperclasses(CurrentClass, Supers), classesInOtherPkgWithProtectedMember( CurrentClass, MemberName, MemberDescriptor, MemberClassName, Supers, List), List \= [], loadedClass(MemberClassName, Loader, ReferencedClass), isProtected(ReferencedClass, MemberName, MemberDescriptor), thisType(Environment, ThisType), isAssignable(Target, ThisType).
The predicate
classesInOtherPkgWithProtectedMember(Class, MemberName, MemberDescriptor, MemberClassName, Supers, List)
is true if List is the set of classes in
Supers with name MemberClassName that are in a
different run-time package than Class which declare a
protected member named MemberName with
descriptor MemberDescriptor.
classesInOtherPkgWithProtectedMember(_, _, _, _, [], []).
classesInOtherPkgWithProtectedMember(Class, MemberName,
MemberDescriptor, MemberClassName,
[Super | Tail], [Super | T]) :-
classClassName(Super, MemberClassName),
\+ sameRuntimePackage(Class, Super),
isProtected(Super, MemberName, MemberDescriptor),
classesInOtherPkgWithProtectedMember(
Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
classesInOtherPkgWithProtectedMember(Class, MemberName,
MemberDescriptor, MemberClassName,
[Super | Tail], T) :-
classClassName(Super, MemberClassName),
\+ sameRuntimePackage(Class, Super),
isNotProtected(Super, MemberName, MemberDescriptor),
classesInOtherPkgWithProtectedMember(
Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
classesInOtherPkgWithProtectedMember(Class, MemberName,
MemberDescriptor, MemberClassName,
[Super | Tail], T) :-
classClassName(Super, MemberClassName),
sameRuntimePackage(Class, Super),
classesInOtherPkgWithProtectedMember(
Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
classesInOtherPkgWithProtectedMember(Class, MemberName,
MemberDescriptor, MemberClassName,
[Super | Tail], T) :-
classClassName(Super, SuperClassName),
SuperClassName \= MemberClassName,
classesInOtherPkgWithProtectedMember(
Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
sameRuntimePackage(Class1, Class2) :-
classDefiningLoader(Class1, L),
classDefiningLoader(Class2, L),
samePackageName(Class1, Class2).
4.10.1.9 Type Checking Instructions
In general, the type rule for an
instruction is given relative to an environment Environment
that defines the class and method in which the instruction occurs (4.10.1.1), and the offset Offset
within the method at which the instruction occurs. The rule states that
if the incoming type state StackFrame fulfills certain
requirements, then:
The instruction is type safe.
It is provable that the type state after the instruction completes normally has a particular form given by
NextStackFrame, and that the type state after the instruction completes abruptly is given byExceptionStackFrame.The type state after an instruction completes abruptly is the same as the incoming type state, except that the operand stack is empty.
exceptionStackFrame(StackFrame, ExceptionStackFrame) :-StackFrame = frame(Locals, _OperandStack, Flags), ExceptionStackFrame = frame(Locals, [], Flags).StackFrame = frame(Locals, _OperandStack, InitState), ExceptionStackFrame = frame(Locals, [], InitState).
Many instructions have type
rules that are completely isomorphic to the rules for other
instructions. If an instruction b1 is isomorphic to another
instruction b2, then the type rule for b1 is
the same as the type rule for b2.
instructionIsTypeSafe(Instruction, Environment, Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
instructionHasEquivalentTypeRule(Instruction, IsomorphicInstruction),
instructionIsTypeSafe(IsomorphicInstruction, Environment, Offset,
StackFrame, NextStackFrame,
ExceptionStackFrame).
The English language description of each rule is intended to be readable, intuitive, and concise. As such, the description avoids repeating all the contextual assumptions given above. In particular:
The description does not explicitly mention the environment.
When the description speaks of the operand stack or local variables in the following, it is referring to the operand stack and local variable components of a type state: either the incoming type state or the outgoing one.
The type state after the instruction completes abruptly is almost always identical to the incoming type state. The description only discusses the type state after the instruction completes abruptly when that is not the case.
The description speaks of popping and pushing types onto the operand stack, and does not explicitly discuss issues of stack underflow or overflow. The description assumes these operations can be completed successfully, but the Prolog clauses for operand stack manipulation ensure that the necessary checks are made.
The description discusses only the manipulation of logical types. In practice, some types take more than one word. The description abstracts from these representation details, but the Prolog clauses that manipulate data do not.
Any ambiguities can be resolved by referring to the formal Prolog clauses.
dup
A dup instruction
is type safe iff one can validly replace a category 1 type,
Type, with the types Type, Type,
yielding the outgoing type state.
instructionIsTypeSafe(dup, Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
StackFrame = frame(Locals, InputOperandStack, Flags),
StackFrame = frame(Locals, InputOperandStack, InitState),
popCategory1(InputOperandStack, Type, _),
canSafelyPush(Environment, InputOperandStack, Type, OutputOperandStack),
NextStackFrame = frame(Locals, OutputOperandStack, Flags),
NextStackFrame = frame(Locals, OutputOperandStack, InitState),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
iinc
An iinc
instruction with first operand Index is type safe iff
LIndex has type int. The iinc
instruction does not change the type state.
instructionIsTypeSafe(iinc(Index, _Value), _Environment, _Offset,
StackFrame, StackFrame, ExceptionStackFrame) :-
StackFrame = frame(Locals, _OperandStack, _Flags),
StackFrame = frame(Locals, _OperandStack, _InitState),
nth0(Index, Locals, int),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
invokeinterface
An invokeinterface instruction is type safe iff all of the following are true:
Its first operand,
CP, refers to a constant pool entry denoting an interface method with descriptorDescriptorthat is a member of a typeMethodClassType.MethodNameis not<init>.MethodNameis not<clinit>.Its second operand,
Count, is a valid count operand (see below).One can validly replace types matching
MethodClassTypeand the argument types given inDescriptoron the incoming operand stack with the return type given inDescriptor, yielding the outgoing type state.
instructionIsTypeSafe(invokeinterface(CP, Count, 0), Environment, _Offset,
StackFrame, NextStackFrame, ExceptionStackFrame) :-
CP = imethod(MethodClassType, MethodName, Descriptor),
MethodName \= '`<init>`',
MethodName \= '`<clinit>`',
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
reverse([MethodClassType | OperandArgList], StackArgList),
canPop(StackFrame, StackArgList, TempFrame),
validTypeTransition(Environment, [], ReturnType,
TempFrame, NextStackFrame),
countIsValid(Count, StackFrame, TempFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
The
Count operand of an invokeinterface instruction is
valid if it equals the size of the arguments to the instruction. This is
equal to the difference between the size of InputFrame and
OutputFrame.
countIsValid(Count, InputFrame, OutputFrame) :-
InputFrame = frame(_Locals1, OperandStack1, _Flags1),
OutputFrame = frame(_Locals2, OperandStack2, _Flags2),
InputFrame = frame(_, OperandStack1, _),
OutputFrame = frame(_, OperandStack2, _),
length(OperandStack1, Length1),
length(OperandStack2, Length2),
Count =:= Length1 - Length2.
invokespecial
An invokespecial instruction is type safe iff all of the following are true:
Its first operand,
CP, refers to a constant pool entry denoting a method namedMethodNamewith descriptorDescriptorthat is a member of a typeMethodClassType.Either:
MethodNameis not<init>.MethodNameis not<clinit>.MethodClassTypeis the current class, a superclass of the current class, or a direct superinterface of the current class.One can validly replace types matching the current class and the argument types given in
Descriptoron the incoming operand stack with the return type given inDescriptor, yielding the outgoing type state.One can validly replace types matching the class
MethodClassNameand the argument types given inDescriptoron the incoming operand stack with the return type given inDescriptor.
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
(CP = method(MethodClassType, MethodName, Descriptor) ;
CP = imethod(MethodClassType, MethodName, Descriptor)),
MethodName \= '`<init>`',
MethodName \= '`<clinit>`',
validSpecialMethodClassType(Environment, MethodClassType),
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
thisType(Environment, ThisType),
reverse([ThisType | OperandArgList], StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
validSpecialMethodClassType(Environment, class(MethodClassName, _)) :-
thisClass(Environment, ThisClass),
classClassName(ThisClass, MethodClassName).
validSpecialMethodClassType(Environment, class(MethodClassName, L)) :-
thisClass(Environment, ThisClass),
loadedSuperclasses(ThisClass, Supers),
member(Super, Supers),
classClassName(Super, MethodClassName),
loadedClass(MethodClassName, L, Super).
validSpecialMethodClassType(Environment, class(MethodClassName, _)) :-
thisClass(Environment, ThisClass),
classInterfaceNames(ThisClass, InterfaceNames),
member(MethodClassName, InterfaceNames).
The
validSpecialMethodClassTypeclause enforces the structural constraint that invokespecial, for other than an instance initialization method, must name a method in the current class/interface or a superclass/superinterface.
The
validTypeTransitionclause enforces the structural constraint that invokespecial, for other than an instance initialization method, targets a receiver object of the current class or deeper. To see why, consider thatStackArgListsimulates the list of types on the operand stack expected by the method, starting with the current class (the class performing invokespecial). The actual types on the operand stack are inStackFrame. The effect ofvalidTypeTransitionis to pop the first type from the operand stack inStackFrameand check it is a subtype of the first term ofStackArgList, namely the current class. Thus, the actual receiver type is compatible with the current class.
A sharp-eyed reader might notice that enforcing this structural constraint supercedes the structural constraint pertaining to invokespecial of a
protectedmethod. Thus, the Prolog code above makes no reference topassesProtectedCheck(4.10.1.8), whereas the Prolog code for invokespecial of an instance initialization method usespassesProtectedCheckto ensure the actual receiver type is compatible with the current class when certainprotectedinstance initialization methods are named.
Or:
MethodName is
<init>.Descriptorspecifies avoidreturn type.One can validly pop types matching the argument types given in
DescriptoranduninitializedThisoff the incoming operand stack, yieldingOperandStack.MethodClassTypeis the current class or the direct superclass of the current class.The instance initialization state is
restricted. IfMethodClassTypeis the direct superclass of the current class, then the initialization state does not contain any unset fields.The outgoing type state is derived from the incoming type state by first replacing the incoming operand stack with
OperandStackand then replacing all instances ofuninitializedThiswith the type of the current class. The outgoing initialization state isunrestricted.The exceptional type state is derived from the incoming type state by replacing all instances of
uninitializedThiswith the typetop. The exceptional instance initialization state iserroneous.
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
(CP = method(MethodClassType, '`<init>`', Descriptor) ;
CP = imethod(MethodClassType, '`<init>`', Descriptor)),
parseMethodDescriptor(Descriptor, OperandArgList, void),
reverse([uninitializedThis | OperandArgList], StackArgList),
canPop(StackFrame, StackArgList, frame(Locals, OperandStack, Flags)),
validThisInitClassType(Environment, MethodClassType),
canPop(StackFrame, StackArgList, frame(Locals, OperandStack, InitState)),
validThisInitCall(Environment, MethodClassType, InitState),
thisType(Environment, This),
substitute(uninitializedThis, This, OperandStack, NextOperandStack),
substitute(uninitializedThis, This, Locals, NextLocals),
substitute(uninitializedThis, top, Locals, ExceptionLocals),
NextStackFrame = frame(NextLocals, NextOperandStack, []),
ExceptionStackFrame = frame(ExceptionLocals, [], Flags).
NextStackFrame = frame(NextLocals, NextOperandStack, unrestricted),
ExceptionStackFrame = frame(ExceptionLocals, [], erroneous).
validThisInitClassType(Environment, class(MethodClassName, _)) :-
validThisInitCall(Environment, class(MethodClassName, _), restricted(_)) :-
thisClass(Environment, ThisClass),
classClassName(ThisClass, MethodClassName).
validThisInitClassType(Environment, class(MethodClassName, _)) :-
validThisInitCall(Environment, class(MethodClassName, _), restricted([])) :-
thisClass(Environment, ThisClass),
classSuperClassName(ThisClass, MethodClassName).
substitute(_Old, _New, [], []).
substitute(Old, New, [Old | FromRest], [New | ToRest]) :-
substitute(Old, New, FromRest, ToRest).
substitute(Old, New, [From1 | FromRest], [From1 | ToRest]) :-
From1 \= Old,
substitute(Old, New, FromRest, ToRest).
Or:
MethodName is
<init>.Descriptorspecifies avoidreturn type.One can validly pop types matching the argument types given in
Descriptorand an uninitialized type,uninitialized(Address), off the incoming operand stack, yieldingOperandStack.The instruction at
Addresscreated a newMethodClassType.The outgoing type state is derived from the incoming type state by first replacing the incoming operand stack with
OperandStackand then replacing all instances ofuninitialized(Address)withMethodClassType.The exceptional type state is derived from the incoming type state by replacing all instances of
uninitialized(Address)with the typetop.If the method is
protected, the usage conforms to the special rules governing access toprotectedmembers (4.10.1.8).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
(CP = method(MethodClassType, '`<init>`', Descriptor) ;
CP = imethod(MethodClassType, '`<init>`', Descriptor)),
parseMethodDescriptor(Descriptor, OperandArgList, void),
reverse([uninitialized(Address) | OperandArgList], StackArgList),
canPop(StackFrame, StackArgList, frame(Locals, OperandStack, Flags)),
canPop(StackFrame, StackArgList, frame(Locals, OperandStack, InitState)),
allInstructions(Environment, Instructions),
member(instruction(Address, new(MethodClassType)), Instructions),
substitute(uninitialized(Address), MethodClassType,
OperandStack, NextOperandStack),
substitute(uninitialized(Address), MethodClassType, Locals, NextLocals),
substitute(uninitialized(Address), top, Locals, ExceptionLocals),
NextStackFrame = frame(NextLocals, NextOperandStack, Flags),
ExceptionStackFrame = frame(ExceptionLocals, [], Flags),
NextStackFrame = frame(NextLocals, NextOperandStack, InitState),
ExceptionStackFrame = frame(ExceptionLocals, [], InitState),
passesProtectedCheck(Environment, MethodClassType, '`<init>`',
Descriptor, NextStackFrame).
The rule for invokespecial of an
<init>method is the sole motivation for passing back a distinct exception stack frame. The concern is that when initializing an object, the invokespecial invocation could fail, leaving the object in a partially initialized, permanently unusable state. To prevent repeated initialization attempts after an object fails to initialize the first time, an exception handler must consider any references to the object stored in local variables to have typetoprather thanuninitializedThisoruninitialized(Offset).
In the special case of initializing the current object (that is, when invoking
<init>for typeuninitializedThis), the original frame typically holds an uninitialized object in local variable 0 and has flagflagThisUninit. Normal termination of invokespecial initializes the uninitialized object and turns off theflagThisUninitflag. But if the invokespecial invocation throws an exception, the exception frame contains the broken object (with typetop) and theflagThisUninitflag (the old flag). There is no way to perform a return instruction given that type state, so the handler would have to throw another exception (or loop forever). In fact, it's not even possible to express a handler with that type state, because there is no way for a stack frame, as expressed by theStackMapTableattribute (4.7.4), to carryflagThisUninitwithout any accompanying use of typeuninitializedThis.
If not for
thesethis specialconstraintsconstraint on object initialization, the local variable typesand flagsof the exception stack frame would always be the same as the local variable typesand flagsof the input stack frame.
new
A new instruction
with operand CP at offset Offset is type safe
iff CP refers to a constant pool entry denoting a class or
interface type, the type uninitialized(Offset) does not
appear in the incoming operand stack, and one can validly push
uninitialized(Offset) onto the incoming operand stack and
replace uninitialized(Offset) with top in the
incoming local variables yielding the outgoing type state.
instructionIsTypeSafe(new(CP), Environment, Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
StackFrame = frame(Locals, OperandStack, Flags),
StackFrame = frame(Locals, OperandStack, InitState),
CP = class(_, _),
NewItem = uninitialized(Offset),
notMember(NewItem, OperandStack),
substitute(NewItem, top, Locals, NewLocals),
validTypeTransition(Environment, [], NewItem,
frame(NewLocals, OperandStack, Flags),
frame(NewLocals, OperandStack, InitState),
NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
The
substitutepredicate is defined in the rule for invokespecial (4.10.1.9.invokespecial).
pop, pop2
A pop instruction is type safe iff one can validly pop a category 1 type off the incoming operand stack yielding the outgoing type state.
instructionIsTypeSafe(pop, _Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
StackFrame = frame(Locals, [Type | Rest], Flags),
StackFrame = frame(Locals, [Type | Rest], InitState),
popCategory1([Type | Rest], Type, Rest),
NextStackFrame = frame(Locals, Rest, Flags),
NextStackFrame = frame(Locals, Rest, InitState),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
A pop2 instruction is type safe iff it is a type safe form of the pop2 instruction.
instructionIsTypeSafe(pop2, _Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
StackFrame = frame(Locals, InputOperandStack, Flags),
StackFrame = frame(Locals, InputOperandStack, InitState),
pop2SomeFormIsTypeSafe(InputOperandStack, OutputOperandStack),
NextStackFrame = frame(Locals, OutputOperandStack, Flags),
NextStackFrame = frame(Locals, OutputOperandStack, InitState),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
A pop2 instruction is a type safe form of the pop2 instruction iff it is a type safe form 1 pop2 instruction or a type safe form 2 pop2 instruction.
pop2SomeFormIsTypeSafe(InputOperandStack, OutputOperandStack) :-
pop2Form1IsTypeSafe(InputOperandStack, OutputOperandStack).
pop2SomeFormIsTypeSafe(InputOperandStack, OutputOperandStack) :-
pop2Form2IsTypeSafe(InputOperandStack, OutputOperandStack).
A pop2 instruction is a type safe form 1 pop2 instruction iff one can validly pop two types of size 1 off the incoming operand stack yielding the outgoing type state.
pop2Form1IsTypeSafe([Type1, Type2 | Rest], Rest) :-
popCategory1([Type1 | Rest], Type1, Rest),
popCategory1([Type2 | Rest], Type2, Rest).
A pop2 instruction is a type safe form 2 pop2 instruction iff one can validly pop a type of size 2 off the incoming operand stack yielding the outgoing type state.
pop2Form2IsTypeSafe([top, Type | Rest], Rest) :-
popCategory2([top, Type | Rest], Type, Rest).
putfield
A putfield
instruction with operand CP is type safe iff all of the
following are true:
Its first operand,
CP, refers to a constant pool entry denoting a field whose declared type isFieldTypethat is a member of a typeFieldClassType.Either:
The referenced field, if it is declared by the current class, does not have its
ACC_FINALandACC_STRICT_INITflags set.One can validly pop types matching
FieldTypeandFieldClassTypeoff the incoming operand stack yielding the outgoing type state.protectedfields are subject to additional checks (4.10.1.8).
instructionIsTypeSafe(putfield(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = field(FieldClassType, FieldName, FieldDescriptor),
\+ currentClassStrictFinalField(Environment, CP),
parseFieldDescriptor(FieldDescriptor, FieldType),
canPop(StackFrame, [FieldType], PoppedFrame),
passesProtectedCheck(Environment, FieldClassType, FieldName,
FieldDescriptor, PoppedFrame),
canPop(StackFrame, [FieldType, FieldClassType], NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
currentClassStrictFinalField(Environment, CP) :-
CP = field(FieldClassName, FieldName, FieldDescriptor),
thisClass(Environment, CurrentClass),
classClassName(CurrentClass, FieldClassName),
classDeclaresMember(CurrentClass, FieldName, FieldDescriptor, FieldFlags),
member(strict_init, FieldFlags),
member(final, FieldFlags).
The \+ operator is commonly used in Prolog to indicate
negation. The new clause prevents this rule from applying when the given
field is declared ACC_FINAL and
ACC_STRICT_INIT in the current class. (And note that any
attempts to write to an ACC_FINAL field declared in another
class will fail with a linkage error.)
Or:
- If the instruction
occurs in an instance initialization method of the class named by
FieldClassTypeand assigns to a field declared by the class, then one can validly pop types matchingFieldTypeanduninitializedThisoff the incoming operand stack yielding the outgoing type state. This allows instance fields ofthisthat are declared in the current class to be assigned prior to complete initialization ofthis.
The referenced field is declared by the current class.
One can validly pop types matching
FieldTypeanduninitializedThisoff the incoming operand stack yielding the outgoing stack types.The incoming initialization state is
restricted. The referenced field is removed, if present, from the list of unset fields in the outgoing initialization state.
- If the instruction
occurs in an instance initialization method of the class named by
instructionIsTypeSafe(putfield(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = field(FieldClassType, FieldName, FieldDescriptor),
thisType(Environment, FieldClassType),
Environment = environment(CurrentClass, CurrentMethod, _, _, _, _),
methodName(CurrentMethod, '`<init>`'),
classDeclaresMember(CurrentClass, FieldName, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, FieldType),
canPop(StackFrame, [FieldType, uninitializedThis], NextStackFrame),
canPop(StackFrame, [FieldType, uninitializedThis], TempFrame),
TempFrame = frame(NextLocals, NextStack, restricted(Unset)),
exclude(=(unsetField(FieldName, FieldDescriptor)), Unset, NextUnset),
NextStackFrame = frame(NextLocals, NextStack, restricted(NextUnset)),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
return
A return
instruction is type safe if the enclosing method declares a
void return type, and either: the
incoming instance initialization state is
unrestricted.
The enclosing method is not an
<init>method, orthishas already been completely initialized at the point where the instruction occurs.
instructionIsTypeSafe(return, Environment, _Offset, StackFrame,
afterGoto, ExceptionStackFrame) :-
thisMethodReturnType(Environment, void),
StackFrame = frame(_Locals, _OperandStack, Flags),
notMember(flagThisUninit, Flags),
StackFrame = frame(_Locals, _OperandStack, unrestricted),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
swap
A swap instruction
is type safe iff one can validly replace two category 1 types,
Type1 and Type2, on the incoming operand stack
with the types Type2 and Type1 yielding the
outgoing type state.
instructionIsTypeSafe(swap, _Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
StackFrame = frame(_Locals, [Type1, Type2 | Rest], _Flags),
StackFrame = frame(_, [Type1, Type2 | Rest], _),
popCategory1([Type1 | Rest], Type1, Rest),
popCategory1([Type2 | Rest], Type2, Rest),
NextStackFrame = frame(_Locals, [Type2, Type1 | Rest], _Flags),
NextStackFrame = frame(_, [Type2, Type1 | Rest], _),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
Chapter 5: Loading, Linking, and Initializing
5.5 Initialization
Initialization of a class or
interface involves assigning any ConstantValue attribute
values to its static fields and executing any declared
class or interface initialization method (2.9.2).
A class or interface C may be initialized only as a result of:
The execution of any one of the Java Virtual Machine instructions new, getstatic, putstatic, or invokestatic that references C (6.5.new, 6.5.getstatic, 6.5.putstatic, 6.5.invokestatic).
Upon execution of a new instruction, the class to be initialized is the class referenced by the instruction.
Upon execution of a getstatic, putstatic, or invokestatic instruction, the class or interface to be initialized is the class or interface that declares the resolved field or method.
The first invocation of a
java.lang.invoke.MethodHandleinstance which was the result of method handle resolution (5.4.3.5) for a method handle of kind 2 (REF_getStatic), 4 (REF_putStatic), 6 (REF_invokeStatic), or 8 (REF_newInvokeSpecial).This implies that the class of a bootstrap method is initialized when the bootstrap method is invoked for an invokedynamic instruction (6.5.invokedynamic), as part of the continuing resolution of the call site specifier.
Invocation of certain reflective methods in the class library (2.12), for example, in class
Classor in packagejava.lang.reflect.If C is a class, the initialization of one of its subclasses.
If C is an interface that declares a non-
abstract, non-staticmethod, the initialization of a class that implements C directly or indirectly.Its designation as the initial class or interface at Java Virtual Machine startup (5.2).
Prior to initialization, a class or interface must be linked, that is, verified, prepared, and optionally resolved.
Because the Java Virtual Machine is multithreaded, initialization of a class or interface requires careful synchronization, since some other thread may be trying to initialize the same class or interface at the same time. There is also the possibility that initialization of a class or interface may be requested recursively as part of the initialization of that class or interface. The implementation of the Java Virtual Machine is responsible for taking care of synchronization and recursive initialization by using the following procedure. It assumes that the class or interface has already been verified and prepared, and that the class or interface contains state that indicates one of four situations:
Uninitialized: This class or interface is verified and prepared but not initialized.
Larval: This class or interface is being initialized by some particular thread. A larval initialization state tracks whenever a strictly-initialized field of the class is set or read.
Initialized: This class or interface is fully initialized and ready for use.
Erroneous: This class or interface
is in an erroneous state, perhaps because initialization was attempted and failedattempted but failed to complete initialization.
The precise form of the initialization state is left to the discretion of the JVM implementation.
For each class or interface
C, there is a unique initialization lock LC. The
mapping from C to LC is also left to the discretion of
the Java Virtual Machine implementation. For example, LC could
be the Class object for C, or the monitor
associated with that Class object. The procedure for
initializing C is then as follows:
Synchronize on the initialization lock, LC, for C. This involves waiting until the current thread can acquire LC.
If the initialization state of C indicates that initialization is in progress for C by some other thread, then release LC and block the current thread until informed that the in-progress initialization has completed, at which time repeat this procedure.
Thread interrupt status is unaffected by execution of the initialization procedure.
If the initialization state of C
indicates that initialization is in progress for C byis larval in the current thread, then this must be a recursive request for initialization. Release LC and complete normally.If the initialization state of C indicates that C has already been initialized, then no further action is required. Release LC and complete normally.
If the initialization state of C is in an erroneous state, then initialization is not possible. Release LC and throw a
NoClassDefFoundError.Otherwise,
record the fact that initialization of C is in progress bytransition C to a larval state in the current thread, and release LC.Then, initialize each
staticfield of C with the constant value in itsConstantValueattribute (4.7.2), in the order the fields appear in theClassFilestructure, and if the field is strictly-initialized, record in the initialization state that it is set.Next, if C is a class rather than an interface, then let SC be its superclass and let SI1, ..., SIn be all superinterfaces of C (whether direct or indirect) that declare at least one non-
abstract, non-staticmethod. The order of superinterfaces is given by a recursive enumeration over the superinterface hierarchy of each interface directly implemented by C. For each interface I directly implemented by C (in the order of theinterfacesarray of C), the enumeration recurs on I's superinterfaces (in the order of theinterfacesarray of I) before returning I.For each S in the list [ SC, SI1, ..., SIn ], if S has not yet been initialized, then recursively perform this entire procedure for S. If necessary, verify and prepare S first.
If the initialization of S completes abruptly because of a thrown exception, then acquire LC,
label C as erroneoustransition C to the erroneous state, notify all waiting threads, release LC, and complete abruptly, throwing the same exception that resulted from initializing S.Next, determine whether assertions are enabled for C by querying its defining loader.
Next, if C declares a class or interface initialization method, execute that method.
If the execution of the class or interface initialization method completes normally, or if C declares no class or interface initialization method, then acquire LC, label C as fully initialized, notify all waiting threads, release LC, and complete this procedure normally.
Otherwise,If the execution of the class or interface initialization methodmust have completedcompletes abruptly by throwing some exception E., then initialization has failed. If the class of E isErroror one of its subclasses, the failure is for reason E. If E has some other class,If the class of E is notthen create a new instance of the classErroror one of its subclasses,ExceptionInInitializerErrorwith E as the argument, and use this objectin place of E in the following stepas the reason for failure. If a new instance ofExceptionInInitializerErrorcannot be created because anOutOfMemoryErroroccurs, then use anOutOfMemoryErrorobjectin place of E in the following stepas the reason for failure. In any case, acquire LC, transition C to the erroneous state, notify all waiting threads, release LC, and complete this procedure abruptly with the reason described above.Acquire LC, label C as erroneous, notify all waiting threads, release LC, and complete this procedure abruptly with reason E or its replacement as determined in the previous step.
Otherwise, execution of the class or interface initialization method, if any, has completed normally. Check whether each strictly-initialized
staticfield declared by C has been set. If any such field has been left unset, acquire LC, transition C to the erroneous state, notify all waiting threads, release LC, and complete this procedure abruptly with anIllegalStateExceptionas the reason.The precise exception class to be thrown may change. Some sort of
Errormay be appropriate.If all strictly-initialized
staticfields declared by C have been set, then acquire C, transition C to the initialized state, notify all waiting threads, release LC, and complete this procedure normally.
A Java Virtual Machine implementation may optimize this procedure by eliding the lock acquisition in step 1 (and release in step 4/5) when it can determine that the initialization of the class has already completed, provided that, in terms of the Java memory model, all happens-before orderings (JLS §17.4.5) that would exist if the lock were acquired, still exist when the optimization is performed.
Chapter 6: The Java Virtual Machine Instruction Set
6.5 Instructions
getstatic
- Operation
-
Get
staticfield from class - Format
-
getstatic
indexbyte1
indexbyte2 - Forms
-
getstatic = 178 (0xb2)
- Operand Stack
-
..., →
..., value
- Description
-
The unsigned indexbyte1 and indexbyte2 are used to construct an index into the run-time constant pool of the current class (2.6), where the value of the index is (indexbyte1
<<8) | indexbyte2. The run-time constant pool entry at the index must be a symbolic reference to a field (5.1), which gives the name and descriptor of the field as well as a symbolic reference to the class or interface in which the field is to be found. The referenced field is resolved (5.4.3.2).On successful resolution of the field, the class or interface that declared the resolved field is initialized if that class or interface has not already been initialized (5.5). If that class or interface is in a larval initialization state, and if the resolved field is strictly-initialized, then the initialization state is updated to reflect that the field has been read.
TheFinally, the value of the class or interface field is fetched and pushed onto the operand stack. - Linking Exceptions
-
During resolution of the symbolic reference to the class or interface field, any of the exceptions pertaining to field resolution (5.4.3.2) can be thrown.
Otherwise, if the resolved field is not a
static(class) field or an interface field, getstatic throws anIncompatibleClassChangeError. - Run-time Exception
-
Otherwise, if execution of this getstatic instruction causes initialization of the
referencedresolved field's declaring class or interface, getstatic may throw anErroras detailed in 5.5. -
Otherwise, if the resolved field is strictly-initialized, and the resolved field's declaring class or interface is in a larval initialization state, and that state indicates that the field is unset, getstatic throws an
IllegalStateException.
The precise exception class to be thrown may change.
It is the intent that these rules apply equally to all reflective attempts to read a static field. TBD whether something more needs to be said somewhere in order to enforce that intention.
putstatic
- Operation
-
Set static field in class
- Format
-
putstatic
indexbyte1
indexbyte2 - Forms
-
putstatic = 179 (0xb3)
- Operand Stack
-
..., value →
...
- Description
-
The unsigned indexbyte1 and indexbyte2 are used to construct an index into the run-time constant pool of the current class (2.6), where the value of the index is (indexbyte1
<<8) | indexbyte2. The run-time constant pool entry at the index must be a symbolic reference to a field (5.1), which gives the name and descriptor of the field as well as a symbolic reference to the class or interface in which the field is to be found. The referenced field is resolved (5.4.3.2).On successful resolution of the field, the class or interface that declared the resolved field is initialized if that class or interface has not already been initialized (5.5). If that class or interface is in a larval initialization state, and if the resolved field is strictly-initialized, then the initialization state is updated to reflect that the field has been set.
The type of a value stored by a putstatic instruction must be compatible with the descriptor of the referenced field (4.3.2). If the field descriptor type is
boolean,byte,char,short, orint, then the value must be anint. If the field descriptor type isfloat,long, ordouble, then the value must be afloat,long, ordouble, respectively. If the field descriptor type is a class type or an array type, then the value must be a value of the field descriptor type. If the field isfinal, it must be declared in the current class or interface, and the instruction must occur in the class or interface initialization method of the current class or interface (2.9.2).The value is popped from the operand stack.
If the value is of type
intand the field descriptor type is one ofbyte,char,short, orboolean, then theintvalue is converted to the field descriptor type as follows. If the field descriptor type isbyte,char, orshort, then theintvalue is truncated to a value of the field descriptor type, value'. If the field descriptor type isboolean, then theintvalue is narrowed by taking the bitwise AND of value and 1, resulting in value'. The referenced field in the class or interface is set to value'.Otherwise, the referenced field in the class or interface is set to value.
- Linking Exceptions
-
During resolution of the symbolic reference to the class or interface field, any of the exceptions pertaining to field resolution (5.4.3.2) can be thrown.
Otherwise, if the resolved field is not a
static(class) field or an interface field, putstatic throws anIncompatibleClassChangeError.Otherwise, if the resolved field is
final, it must be declared in the current class or interface, and the instruction must occur in the class or interface initialization method of the current class or interface. Otherwise, anIllegalAccessErroris thrown. - Run-time Exception
-
Otherwise, if execution of this putstatic instruction causes initialization of the referenced class or interface, putstatic may throw an
Erroras detailed in 5.5. -
Otherwise, if the resolved field is strictly-initialized and
final, and the resolved field's declaring class or interface is in a larval initialization state, and that state indicates that the field has been read, putstatic throws anIllegalStateException.
The precise exception class to be thrown may change.
It is the intent that these rules apply equally to all reflective attempts to write a static field. TBD whether something more needs to be said somewhere in order to enforce that intention.
- Notes
-
A putstatic instruction may be used only to set the value of an interface field on the initialization of that field. Interface fields may be assigned to only once, on execution of an interface variable initialization expression when the interface is initialized (5.5, JLS §9.3.1).