Strict Field Initialization in the JVM

Changes to the Java® Virtual Machine Specification • Version 27-jep401ea3+1-1

This document describes changes to the Java Virtual Machine Specification to support strict field initialization in the JVM, a preview feature introduced by JEP draft 8350458.

Revision history:

Chapter 2: The Structure of the Java Virtual Machine

2.9 Special Methods

2.9.1 Instance Initialization Methods

A class has zero or more instance initialization methods, used to initialize instances of the class. each Each typically corresponding corresponds to a constructor written in the Java programming language.

If a class does not have an instance initialization method, its instances cannot be initialized or used.

A method is an instance initialization method if all of the following are true:

In a class, any non-void method named <init> is not an instance initialization method. In an interface, any method named <init> is not an instance initialization method. Such methods cannot be invoked by any Java Virtual Machine instruction (4.4.2, 4.9.2) and are rejected by format checking (4.6, 4.8).

The declaration and use of an instance initialization method is constrained by the Java Virtual Machine. For the declaration, the method's access_flags item and code array are constrained (4.6, 4.9.2). For a use, an instance initialization method may be invoked only by the invokespecial instruction on an uninitialized class instance (4.10.1.9).

Because the name <init> is not a valid identifier in the Java programming language, it cannot be used directly in a program written in the Java programming language.

Class instances are created by the new instruction (6.5.new), and then must proceed through a sequence of initialization states to be made available for general use. These states are:

A early larval state is entered when an instance initialization method is invoked on an object created by new. Most operations, including method invocations, are not allowed on an object in an early larval state, and the object may not be shared with other code. However, its instance fields may be set with the putfield instruction (6.5.putfield). Strictly-initialized instance fields (4.5) must be set while the object is in this state.

While in the early larval state, each instance initialization method delegates to another instance initialization method with invokespecial (6.5.invokespecial). The recursion bottoms out at the invocation of an Object.<init> method. If any of these instance initialization method invocations fails with an exception, the object transitions to the erroneous state and can never become initialized.

Once execution reaches Object.<init>, the instance transitions to the late larval state, and its fields and methods are available for normal use. In this state, the object may be freely shared with other code. However, that code must handle the object carefully, because it is not yet fully initialized. Starting with Object.<init>, each instance initialization method in the call stack completes its initialization work and returns. If every instance initialization method invocation returns successfully, the object transitions to the initialized state and is available for general use.

2.9.2 Class Initialization Methods

A class or interface has at most one class or interface initialization method and is initialized by the Java Virtual Machine invoking that method (5.5).

If a class or interface does not have a class or interface initialization method, initialization of the class or interface proceeds without executing any bytecode in the class or interface.

A method is a class or interface initialization method if all of the following are true:

Other methods named <clinit> in a class file are not class or interface initialization methods. They are never invoked by the Java Virtual Machine itself, cannot be invoked by any Java Virtual Machine instruction (4.9.1), and are rejected by format checking (4.6, 4.8).

Because the name <clinit> is not a valid identifier in the Java programming language, it cannot be used directly in a program written in the Java programming language.

After a class has been loaded, it proceeds through a sequence of initialization states to be made available for general use. These states are:

Class initialization methods are executed while the class is in a larval state. In this state, the fields and methods of the class may be freely accessed in the current thread, while attempts to access the class from other threads are blocked. Strictly-initialized static fields (4.5) must be set while the class is in a larval state.

If the class initialization method invocation fails with an exception, the class transitions to the erroneous state and can never become initialized. If the class initialziation method returns successfully, the class transitions to the initialized state and is available for general use.

Chapter 4: The class File Format

4.5 Fields

Each field is described by a field_info structure.

No two fields in one class file may have the same name and descriptor (4.3.2).

The structure has the following format:

field_info {
    u2             access_flags;
    u2             name_index;
    u2             descriptor_index;
    u2             attributes_count;
    attribute_info attributes[attributes_count];
}

The items of the field_info structure are as follows:

access_flags

The value of the access_flags item is a mask of flags used to denote access permission to and properties of this field. The interpretation of each flag, when set, is specified in Table 4.5-A.

Table 4.5-A. Field access and property flags

Flag Name Value Interpretation
ACC_PUBLIC 0x0001 Declared public; may be accessed from outside its package.
ACC_PRIVATE 0x0002 Declared private; accessible only within the defining class and other classes belonging to the same nest (5.4.4).
ACC_PROTECTED 0x0004 Declared protected; may be accessed within subclasses.
ACC_STATIC 0x0008 Declared static.
ACC_FINAL 0x0010 Declared final; never directly assigned to after object construction (JLS §17.5).
ACC_VOLATILE 0x0040 Declared volatile; cannot be cached.
ACC_TRANSIENT 0x0080 Declared transient; not written or read by a persistent object manager.
ACC_STRICT_INIT 0x0800 A strictly-initialized field; must be assigned before it can be read.
ACC_SYNTHETIC 0x1000 Declared synthetic; not present in the source code.
ACC_ENUM 0x4000 Declared as an element of an enum class.

Fields of classes may set any of the flags in Table 4.5-A. However, each field of a class may have at most one of its ACC_PUBLIC, ACC_PRIVATE, and ACC_PROTECTED flags set (JLS §8.3.1), and must not have both its ACC_FINAL and ACC_VOLATILE flags set (JLS §8.3.1.4).

Fields of interfaces must have their ACC_PUBLIC, ACC_STATIC, and ACC_FINAL flags set; they may have their ACC_STRICT_INIT or ACC_SYNTHETIC flag set, and must not have any of the other flags in Table 4.5-A set (JLS §9.3).

If the class file version number is not 71.65535, the 0x0800 bit is ignored, and the ACC_STRICT_INIT flag is considered not to be set.

The ACC_SYNTHETIC flag indicates that this field was generated by a compiler and does not appear in source code.

The ACC_ENUM flag indicates that this field is used to hold an element of an enum class (JLS §8.9).

All bits of the access_flags item not assigned in Table 4.5-A are reserved for future use. They should be set to zero in generated class files and should be ignored by Java Virtual Machine implementations.

name_index

The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure (4.4.7) which represents a valid unqualified name denoting a field (4.2.2).

descriptor_index

The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure (4.4.7) which represents a valid field descriptor (4.3.2).

attributes_count

The value of the attributes_count item indicates the number of additional attributes of this field.

attributes[]

Each value of the attributes table must be an attribute_info structure (4.7).

A field can have any number of optional attributes associated with it.

The attributes defined by this specification as appearing in the attributes table of a field_info structure are listed in Table 4.7-C.

The rules concerning attributes defined to appear in the attributes table of a field_info structure are given in 4.7.

The rules concerning non-predefined attributes in the attributes table of a field_info structure are given in 4.7.1.

4.7 Attributes

4.7.4 The StackMapTable Attribute

The StackMapTable attribute is a variable-length attribute in the attributes table of a Code attribute (4.7.3). A StackMapTable attribute is used during the process of verification by type checking (4.10.1).

There may be at most one StackMapTable attribute in the attributes table of a Code attribute.

In a class file whose version number is 50.0 or above, if a method's Code attribute does not have a StackMapTable attribute, it has an implicit stack map attribute (4.10.1). This implicit stack map attribute is equivalent to a StackMapTable attribute with number_of_entries equal to zero.

The StackMapTable attribute has the following format:

StackMapTable_attribute {
    u2              attribute_name_index;
    u4              attribute_length;
    u2              number_of_entries;
    stack_map_frame entries[number_of_entries];
}

The items of the StackMapTable_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure (4.4.7) representing the string "StackMapTable".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.

number_of_entries

The value of the number_of_entries item gives the number of stack_map_frame entries in the entries table.

entries[]

Each entry in the entries table describes one stack map frame of the method. The order of the stack map frames in the entries table is significant.

A stack map frame specifies (either explicitly or implicitly) the bytecode offset at which it applies, and the verification types of local variables and operand stack entries for that offset. Verification by type checking requires that a stack map frame be associated with every bytecode offset of the code array that is used as a branch target or as the start of an exception handler (4.10.1.6).

A stack map frame is restricted if it specifies a local variable or operand stack entry of type uninitializedThis. Restricted stack map frames are used to describe the type state of instance initialization methods operating on a new class instance in the early larval initialization state (2.9.1). These frames always specify, either explicitly or implicitly, a list of unset strictly-initialized fields that must be set before instance initialization can proceed.

Each stack map frame described in the entries table relies on the previous frame for some of its semantics. The first stack map frame of a method is implicit, and computed from the method descriptor by the type checker (4.10.1.5). The stack_map_frame structure at entries[0] therefore describes the second stack map frame of the method.

The bytecode offset at which a stack map frame applies is calculated by taking the value offset_delta specified in the frame (either explicitly or implicitly), and adding offset_delta + 1 to the bytecode offset of the previous frame, unless the previous frame is the initial frame of the method. In that case, the bytecode offset at which the stack map frame applies is the value offset_delta specified in the frame.

By using an offset delta rather than storing the actual bytecode offset, we ensure, by definition, that stack map frames are in the correctly sorted order. Furthermore, by consistently using the formula offset_delta + 1 for all explicit frames (as opposed to the implicit first frame), we guarantee the absence of duplicates.

We say that an instruction in the bytecode has a corresponding stack map frame if the instruction starts at offset i in the code array of a Code attribute, and the Code attribute has a StackMapTable attribute whose entries array table contains a stack map frame that applies at bytecode offset i.

A verification type specifies the type of either one or two locations, where a location is either a single local variable or a single operand stack entry. A verification type is represented by a discriminated union, verification_type_info, that consists of a one-byte tag, indicating which item of the union is in use, followed by zero or more bytes, giving more information about the tag.

union verification_type_info {
    Top_variable_info;
    Integer_variable_info;
    Float_variable_info;
    Long_variable_info;
    Double_variable_info;
    Null_variable_info;
    UninitializedThis_variable_info;
    Object_variable_info;
    Uninitialized_variable_info;
}

A verification type that specifies one location in the local variable array or in the operand stack is represented by the following items of the verification_type_info union:

A verification type that specifies two locations in the local variable array or in the operand stack is represented by the following items of the verification_type_info union:

A stack map frame is represented by a discriminated union, stack_map_frame, which consists of a one-byte tag, indicating which item of the union is in use, followed by zero or more bytes, giving more information about the tag. An early_larval_frame is used to explicitly list the unset fields of a restricted stack map frame. All other stack map frames have the type base_frame.

union stack_map_frame {
    base_frame;
    early_larval_frame;
}
union stack_map_frame {
union base_frame {
    same_frame;
    same_locals_1_stack_item_frame;
    same_locals_1_stack_item_frame_extended;
    chop_frame;
    same_frame_extended;
    append_frame;
    full_frame;
}

The tag indicates the frame type of the stack map frame:

4.9 Constraints on Java Virtual Machine Code

4.9.2 Structural Constraints

The structural constraints on the code array specify constraints on relationships between Java Virtual Machine instructions. The structural constraints are as follows:

4.10 Verification of class Files

4.10.1 Verification by Type Checking

4.10.1.1 Accessors for Java Virtual Machine Artifacts

We stipulate the existence of 17 Prolog predicates ("accessors") that have certain expected behavior but whose formal definitions are not given in this specification. The Prolog term Class, as used here, represents a binary class or interface that has been successfully loaded (5.3); the Prolog term Method represents a method of a Class; and the Prolog term Loader represents a class loader of a Class. This specification does not mandate a precise structure for these entities.

classClassName(Class, ClassName)

Extracts the name, ClassName, of the class Class.

classIsInterface(Class)

True iff the class, Class, is an interface.

classIsNotFinal(Class)

True iff the class, Class, is not a final class.

classSuperClassName(Class, SuperClassName)

Extracts the name, SuperClassName, of the superclass of class Class.

classInterfaceNames(Class, InterfaceNames)

Extracts a list, InterfaceNames, of the names of the direct superinterfaces of the class Class.

classMethods(Class, Methods)

Extracts a list, Methods, of the methods declared in the class Class.

classAttributes(Class, Attributes)

Extracts a list, Attributes, of the attributes of the class Class.

Each attribute is represented as a functor application of the form attribute(AttributeName, AttributeContents), where AttributeName is the name of the attribute. The format of the attribute's contents is unspecified.

classDeclaresMember(Class, MemberName, MemberDescriptor, AccessFlags)

Asserts that a class, Class, declares a field or method with name MemberName, and descriptor MemberDescriptor, and access flags AccessFlags. This assertion does not consider members declared in the superclasses or superinterfaces of Class.

classDefiningLoader(Class, Loader)

Extracts the defining class loader, Loader, of the class Class.

isBootstrapLoader(Loader)

True iff the class loader Loader is the bootstrap class loader.

loadedClass(Name, InitiatingLoader, ClassDefinition)

True iff there exists a class named Name whose representation (in accordance with this specification) when loaded by the class loader InitiatingLoader is ClassDefinition.

methodName(Method, Name)

Extracts the name, Name, of the method Method.

methodAccessFlags(Method, AccessFlags)

Extracts the access flags, AccessFlags, of the method Method.

methodDescriptor(Method, Descriptor)

Extracts the descriptor, Descriptor, of the method Method.

methodAttributes(Method, Attributes)

Extracts a list, Attributes, of the attributes of the method Method.

Each attribute is represented as a functor application of the form attribute(AttributeName, AttributeContents), where AttributeName is the name of the attribute. The format of the attribute's contents is unspecified.

isInit(Method)

True iff Method (regardless of class) is <init>.

isNotInit(Method)

True iff Method (regardless of class) is not <init>.

isNotFinal(Method, Class)

True iff Method in class Class is not final.

isStatic(Method, Class)

True iff Method in class Class is static.

isNotStatic(Method, Class)

True iff Method in class Class is not static.

isPrivate(Method, Class)

True iff Method in class Class is private.

isNotPrivate(Method, Class)

True iff Method in class Class is not private.

isProtected(MemberClass, MemberName, MemberDescriptor)

True iff there is a member named MemberName with descriptor MemberDescriptor in the class MemberClass and it is protected.

isNotProtected(MemberClass, MemberName, MemberDescriptor)

True iff there is a member named MemberName with descriptor MemberDescriptor in the class MemberClass and it is not protected.

parseFieldDescriptor(Descriptor, Type)

Converts a field descriptor, Descriptor, into the corresponding verification type Type (4.10.1.2).

The verification type derived from descriptor types byte, short, boolean, and char, when not used as array component types, is int.

parseMethodDescriptor(Descriptor, ArgTypeList, ReturnType)

Converts a method descriptor, Descriptor, into a list of verification types, ArgTypeList, corresponding to the method argument types, and a verification type, ReturnType, corresponding to the return type.

The verification type derived from descriptor types byte, short, boolean, and char, when not used as array component types, is int. A void return is represented with the special symbol void.

parseCodeAttribute(Class, Method, FrameSize, MaxStack, ParsedCode, Handlers, StackMap)

Extracts the instruction stream, ParsedCode, of the method Method in Class, as well as the maximum operand stack size, MaxStack, the maximal number of local variables, FrameSize, the exception handlers, Handlers, and the stack map StackMap.

The representation of the instruction stream and stack map attribute must be as specified in 4.10.1.3 and 4.10.1.4.

samePackageName(Class1, Class2)

True iff the package names of Class1 and Class2 are the same.

differentPackageName(Class1, Class2)

True iff the package names of Class1 and Class2 are different.

The above accessors are used to define loadedSuperclasses, which produces a list of a class's superclasses by recurring on each class's direct superclass until Object, which has no superclass, is reached.

loadedSuperclasses(Class, [ Superclass | Rest ]) :-
    classSuperClassName(Class, SuperclassName),
    classDefiningLoader(Class, L),
    loadedClass(SuperclassName, L, Superclass),
    loadedSuperclasses(Superclass, Rest).

loadedSuperclasses(Class, []) :-
    \+ classSuperClassName(Class, _).

When type checking a method's body, it is convenient to access information about the method. For this purpose, we define an environment, a six-tuple consisting of:

We specify accessors to extract information from the environment.

allInstructions(Environment, Instructions) :-
    Environment = environment(_Class, _Method, _ReturnType,
                              Instructions, _, _).

exceptionHandlers(Environment, Handlers) :-
    Environment = environment(_Class, _Method, _ReturnType,
                              _Instructions, _, Handlers).

maxOperandStackLength(Environment, MaxStack) :-
    Environment = environment(_Class, _Method, _ReturnType,
                              _Instructions, MaxStack, _Handlers).

currentClassLoader(Environment, Loader) :-
    Environment = environment(Class, _Method, _ReturnType,
                              _Instructions, _, _),
    classDefiningLoader(Class, Loader).

thisClass(Environment, Class) :-
    Environment = environment(Class, _Method, _ReturnType,
                              _Instructions, _, _).

thisType(Environment, class(ClassName, L)) :-
    Environment = environment(Class, _Method, _ReturnType,
                              _Instructions, _, _),
    classDefiningLoader(Class, L),
    classClassName(Class, ClassName).

thisMethodReturnType(Environment, ReturnType) :-
    Environment = environment(_Class, _Method, ReturnType,
                              _Instructions, _, _).

offsetStackFrame(Environment, Offset, StackFrame) :-
    allInstructions(Environment, Instructions),
    member(stackMap(Offset, StackFrame), Instructions).

Finally, we specify a general predicate used throughout the type rules:

notMember(_, []).
notMember(X, [A | More]) :- X \= A, notMember(X, More).

The principle guiding the determination as to which accessors are stipulated and which are fully specified is that we do not want to over-specify the representation of the class file. Providing specific accessors to the Class or Method term would force us to completely specify the format for a Prolog term representing the class file.

4.10.1.4 Stack Map Frames and Type Transitions

Stack map frames are represented in Prolog as a list of terms of the form:

stackMap(Offset, TypeState)

where:

A type state is a mapping from locations in the operand stack and local variables of a method to verification types. It has the form:

frame(Locals, OperandStack, Flags)
frame(Locals, OperandStack, InitState)

where:

Subtyping of verification types is extended pointwise to type states. The local variable array of a method has a fixed length by construction (see methodInitialStackFrame in 4.10.1.5), but the operand stack grows and shrinks, so we require an explicit check on the length of the operand stacks whose assignability is desired for subtyping. Subtyping only allows a transition between different initialization states if both are restricted states, and the latter does not remove any unset fields.

frameIsAssignable(frame(Locals1, StackMap1, Flags1),
                  frame(Locals2, StackMap2, Flags2)) :-
frameIsAssignable(frame(Locals1, StackMap1, InitState1),
                  frame(Locals2, StackMap2, InitState2)) :-
    length(StackMap1, StackMapLength),
    length(StackMap2, StackMapLength),
    maplist(isAssignable, Locals1, Locals2),
    maplist(isAssignable, StackMap1, StackMap2),
    subset(Flags1, Flags2).
    initStateTransition(InitState1, InitState2).

initStateTransition(InitState, InitState).

initStateTransition(restricted(UnsetFields1),
                    restricted(UnsetFields2)) :-
    subset(UnsetFields1, UnsetFields2).

Most of the type rules for individual instructions (4.10.1.9) depend on the notion of a valid type transition. A type transition is valid if one can pop a list of expected types off the incoming type state's operand stack and replace them with an expected result type, resulting in a new type state where the length of the operand stack does not exceed its declared maximum size.

validTypeTransition(Environment, ExpectedTypesOnStack, ResultType,
                    frame(Locals, InputOperandStack, Flags),
                    frame(Locals, NextOperandStack, Flags)) :-
                    frame(Locals, InputOperandStack, InitState),
                    frame(Locals, NextOperandStack, InitState)) :-
    popMatchingList(InputOperandStack, ExpectedTypesOnStack,
                    InterimOperandStack),
    pushOperandStack(InterimOperandStack, ResultType, NextOperandStack),
    operandStackHasLegalLength(Environment, NextOperandStack).

Pop a list of types off the stack.

popMatchingList(OperandStack, [], OperandStack).
popMatchingList(OperandStack, [P | Rest], NewOperandStack) :-
    popMatchingType(OperandStack, P, TempOperandStack, _ActualType),
    popMatchingList(TempOperandStack, Rest, NewOperandStack).

Pop an individual type off the stack. The exact behavior depends on the stack contents. If the logical top of the stack is some subtype of the specified type, Type, then pop it. If a type occupies two stack entries, then the logical top of the stack is really the type just below the top, and the top of the stack is the unusable type top.

popMatchingType([ActualType | OperandStack],
                Type, OperandStack, ActualType) :-
    sizeOf(Type, 1),
    isAssignable(ActualType, Type).

popMatchingType([top, ActualType | OperandStack],
                Type, OperandStack, ActualType) :-
    sizeOf(Type, 2),
    isAssignable(ActualType, Type).

sizeOf(X, 2) :- isAssignable(X, twoWord).
sizeOf(X, 1) :- isAssignable(X, oneWord).
sizeOf(top, 1).

Push a logical type onto the stack. The exact behavior varies with the size of the type. If the pushed type is of size 1, we just push it onto the stack. If the pushed type is of size 2, we push it, and then push top.

pushOperandStack(OperandStack, 'void', OperandStack).
pushOperandStack(OperandStack, Type, [Type | OperandStack]) :-
    sizeOf(Type, 1).
pushOperandStack(OperandStack, Type, [top, Type | OperandStack]) :-
    sizeOf(Type, 2).

The length of the operand stack must not exceed the declared maximum size.

operandStackHasLegalLength(Environment, OperandStack) :-
    length(OperandStack, Length),
    maxOperandStackLength(Environment, MaxStack),
    Length =< MaxStack.

The dup instructions pop expected types off the incoming type state's operand stack and replace them with predefined result types, resulting in a new type state. However, these instructions are not defined in terms of type transitions because there is no need to match types by means of the subtyping relation. Instead, the dup instructions manipulate the operand stack entirely in terms of the category of types on the stack (2.11.1).

Category 1 types occupy a single stack entry. Popping a logical type of category 1, Type, off the stack is possible if the top of the stack is Type and Type is not top (otherwise it could denote the upper half of a category 2 type). The result is the incoming stack, with the top entry popped off.

popCategory1([Type | Rest], Type, Rest) :-
    Type \= top,
    sizeOf(Type, 1).

Category 2 types occupy two stack entries. Popping a logical type of category 2, Type, off the stack is possible if the top of the stack is type top, and the entry directly below it is Type. The result is the incoming stack, with the top two entries popped off.

popCategory2([top, Type | Rest], Type, Rest) :-
    sizeOf(Type, 2).

The dup instructions push a list of types onto the stack in essentially the same way as when a type is pushed for a valid type transition.

canSafelyPush(Environment, InputOperandStack, Type, OutputOperandStack) :-
    pushOperandStack(InputOperandStack, Type, OutputOperandStack),
    operandStackHasLegalLength(Environment, OutputOperandStack).

canSafelyPushList(Environment, InputOperandStack, Types,
                  OutputOperandStack) :-
    canPushList(InputOperandStack, Types, OutputOperandStack),
    operandStackHasLegalLength(Environment, OutputOperandStack).

canPushList(InputOperandStack, [], InputOperandStack).
canPushList(InputOperandStack, [Type | Rest], OutputOperandStack) :-
    pushOperandStack(InputOperandStack, Type, InterimOperandStack),
    canPushList(InterimOperandStack, Rest, OutputOperandStack).

Many of the type rules for individual instructions use the following clause to easily pop a list of types off the stack.

canPop(frame(Locals, OperandStack, Flags), Types,
       frame(Locals, PoppedOperandStack, Flags)) :-
canPop(frame(Locals, OperandStack, InitState), Types,
       frame(Locals, PoppedOperandStack, InitState)) :-
    popMatchingList(OperandStack, Types, PoppedOperandStack).

Finally, certain array instructions (4.10.1.9.aaload, 4.10.1.9.arraylength, 4.10.1.9.baload, 4.10.1.9.bastore) peek at types on the operand stack in order to check they are array types. The following clause accesses the i'th element of the operand stack from a type state.

nth1OperandStackIs(*i*, frame(_Locals, OperandStack, _Flags), Element) :-
nth1OperandStackIs(*i*, frame(_, OperandStack, _), Element) :-
    nth1(*i*, OperandStack, Element).
4.10.1.5 Type Checking Methods

A method with a Code attribute is type safe if it is possible to merge the code and the stack map frames into a single stream such that each stack map frame precedes the instruction it corresponds to, and the merged stream is type correct. The method's exception handlers, if any, must also be legal.

methodIsTypeSafe(Class, Method) :-
    parseCodeAttribute(Class, Method, FrameSize, MaxStack,
                       ParsedCode, Handlers, StackMap),
    mergeStackMapAndCode(StackMap, ParsedCode, MergedCode),
    methodInitialStackFrame(Class, Method, FrameSize, StackFrame, ReturnType),
    Environment = environment(Class, Method, ReturnType, MergedCode,
                              MaxStack, Handlers),
    handlersAreLegal(Environment),
    mergedCodeIsTypeSafe(Environment, MergedCode, StackFrame).

Let us consider exception handlers first.

An exception handler is represented by a functor application of the form:

handler(Start, End, Target, ClassName)

whose arguments are, respectively, the start and end of the range of instructions covered by the handler, the first instruction of the handler code, and the name of the exception class that this handler is designed to handle.

An exception handler is legal if its start (Start) is less than its end (End), there exists an instruction whose offset is equal to Start, there exists an instruction whose offset equals End, and the handler's exception class is assignable to the class Throwable. The exception class of a handler is Throwable if the handler's class entry is 0, otherwise it is the class named in the handler.

handlersAreLegal(Environment) :-
    exceptionHandlers(Environment, Handlers),
    checklist(handlerIsLegal(Environment), Handlers).

handlerIsLegal(Environment, Handler) :-
    Handler = handler(Start, End, Target, _),
    Start < End,
    allInstructions(Environment, Instructions),
    member(instruction(Start, _), Instructions),
    offsetStackFrame(Environment, Target, _),
    instructionsIncludeEnd(Instructions, End),
    currentClassLoader(Environment, CurrentLoader),
    handlerExceptionClass(Handler, ExceptionClass, CurrentLoader),
    isBootstrapLoader(BL),
    isAssignable(ExceptionClass, class('java/lang/Throwable', BL)).

instructionsIncludeEnd(Instructions, End) :-
    member(instruction(End, _), Instructions).
instructionsIncludeEnd(Instructions, End) :-
    member(endOfCode(End), Instructions).

handlerExceptionClass(handler(_, _, _, 0),
                      class('java/lang/Throwable', BL), _) :-
    isBootstrapLoader(BL).

handlerExceptionClass(handler(_, _, _, Name),
                      class(Name, L), L) :-
    Name \= 0.

Let us now turn to the stream of instructions and stack map frames.

Merging instructions and stack map frames into a single stream involves four cases:

To determine if the merged stream for a method is type correct, we first infer the method's initial type state.

The initial type state of a method consists of an empty operand stack and local variable types derived from the type of this and the arguments, as well as the appropriate flag, depending on whether instance initialization state, if this is an <init> method.

methodInitialStackFrame(Class, Method, FrameSize, frame(Locals, [], Flags),
                        ReturnType):-
methodInitialStackFrame(Class, Method, FrameSize,
                        frame(Locals, [], InitState), ReturnType):-
    methodDescriptor(Method, Descriptor),
    parseMethodDescriptor(Descriptor, RawArgs, ReturnType),
    expandTypeList(RawArgs, Args),
    methodInitialThisType(Class, Method, ThisList),
    flags(ThisList, Flags),
    methodInitialInitState(Class, Method, InitState),
    append(ThisList, Args, ThisArgs),
    expandToLength(ThisArgs, FrameSize, top, Locals).

Given a list of types, the following clause produces a list where every type of size 2 has been substituted by two entries: one for itself, and one top entry. The result then corresponds to the representation of the list as 32-bit words in the Java Virtual Machine.

expandTypeList([], []).
expandTypeList([Item | List], [Item | Result]) :-
    sizeOf(Item, 1),
    expandTypeList(List, Result).
expandTypeList([Item | List], [Item, top | Result]) :-
    sizeOf(Item, 2),
    expandTypeList(List, Result).
flags([uninitializedThis], [flagThisUninit]).
flags(X, []) :- X \= [uninitializedThis].
expandToLength(List, Size, _Filler, List) :-
    length(List, Size).
expandToLength(List, Size, Filler, Result) :-
    length(List, ListLength),
    ListLength < Size,
    Delta is Size - ListLength,
    length(Extra, Delta),
    checklist(=(Filler), Extra),
    append(List, Extra, Result).

For the initial type state of an instance method, we compute the type of this and put it in a list. The type of this in an <init> method of a class with a superclass (that is, of any class except Object) is uninitializedThis; the type of this in any other instance method, including an <init> method of Object, is class(N, L) where N is the name of the class containing the method and L is its defining class loader.

For the initial type state of a static method, this is irrelevant, so the list is empty.

methodInitialThisType(_Class, Method, []) :-
    methodAccessFlags(Method, AccessFlags),
    member(static, AccessFlags),
    methodName(Method, MethodName),
    MethodName \= '`<init>`'.

methodInitialThisType(Class, Method, [This]) :-
    methodAccessFlags(Method, AccessFlags),
    notMember(static, AccessFlags),
    instanceMethodInitialThisType(Class, Method, This).

instanceMethodInitialThisType(Class, Method, uninitializedThis) :-
    isSubclassInstanceInit(Class, Method).

instanceMethodInitialThisType(Class, Method, class(ClassName, L)) :-
    \+ isSubclassInstanceInit(Class, Method),
    classClassName(Class, ClassName),
    classDefiningLoader(Class, L).

isSubclassInstanceInit(Class, Method) :-
    methodName(Method, '`<init>`'),
    classSuperClassName(Class, _).

The starting initialization state (4.10.1.4) of an <init> method (other than Object.<init>) is restricted, with each strictly-initialized instance field of the class listed as unset. The starting initialization state of any other method is unrestricted.

methodInitialInitState(Class, Method, restricted(UnsetFields)) :-
    isSubclassInstanceInit(Class, Method),
    findall(unsetField(FieldName, FieldDescriptor),
            fieldRequiresAssignment(Class, FieldName, FieldDescriptor),
            UnsetFields).

fieldRequiresAssignment(Class, FieldName, FieldDescriptor) :-
    classDeclaresMember(Class, FieldName, FieldDescriptor, FieldFlags),
    member(strict_init, FieldFlags),
    notMember(static, FieldFlags).

methodInitialInitState(Class, Method, unrestricted) :-
    \+ isSubclassInstanceInit(Class, Method).
4.10.1.6 Type Checking Code Streams

Given an incoming type state, the mergedCodeIsTypeSafe predicate checks that a merged stream of instructions and stack map frames is type correct:

Branching to a target is type safe if the target has an associated stack frame, Frame, and the current stack frame, StackFrame, is assignable to Frame.

targetIsTypeSafe(Environment, StackFrame, Target) :-
    offsetStackFrame(Environment, Target, Frame),
    frameIsAssignable(StackFrame, Frame).

An instruction satisfies its exception handlers if it satisfies every exception handler that is applicable to the instruction.

instructionSatisfiesHandlers(Environment, Offset, ExceptionStackFrame) :-
    exceptionHandlers(Environment, Handlers),
    sublist(isApplicableHandler(Offset), Handlers, ApplicableHandlers),
    checklist(instructionSatisfiesHandler(Environment, ExceptionStackFrame),
              ApplicableHandlers).

An exception handler is applicable to an instruction if the offset of the instruction is greater or equal to the start of the handler's range and less than the end of the handler's range.

isApplicableHandler(Offset, handler(Start, End, _Target, _ClassName)) :-
    Offset >= Start,
    Offset < End.

An instruction satisfies an exception handler if the instructions's outgoing type state is ExcStackFrame, and the handler's target (the initial instruction of the handler code) is type safe assuming an incoming type state T. The type state T is derived from ExcStackFrame by replacing the operand stack with a stack whose sole element is the handler's exception class.

instructionSatisfiesHandler(Environment, ExcStackFrame, Handler) :-
    Handler = handler(_, _, Target, _),
    currentClassLoader(Environment, CurrentLoader),
    handlerExceptionClass(Handler, ExceptionClass, CurrentLoader),
    /* The stack consists of just the exception. */
    ExcStackFrame = frame(Locals, _, Flags),
    TrueExcStackFrame = frame(Locals, [ ExceptionClass ], Flags),
    ExcStackFrame = frame(Locals, _, InitState),
    TrueExcStackFrame = frame(Locals, [ ExceptionClass ], InitState),
    operandStackHasLegalLength(Environment, TrueExcStackFrame),
    targetIsTypeSafe(Environment, TrueExcStackFrame, Target).
4.10.1.7 Type Checking Load and Store Instructions

All load instructions are variations on a common pattern, varying the type of the value that the instruction loads.

Loading a value of type Type from local variable Index is type safe, if the type of that local variable is ActualType, ActualType is assignable to Type, and pushing ActualType onto the incoming operand stack is a valid type transition (4.10.1.4) that yields a new type state NextStackFrame. After execution of the load instruction, the type state will be NextStackFrame.

loadIsTypeSafe(Environment, Index, Type, StackFrame, NextStackFrame) :-
    StackFrame = frame(Locals, _OperandStack, _Flags),
    StackFrame = frame(Locals, _OperandStack, _InitState),
    nth0(Index, Locals, ActualType),
    isAssignable(ActualType, Type),
    validTypeTransition(Environment, [], ActualType, StackFrame,
                        NextStackFrame).

All store instructions are variations on a common pattern, varying the type of the value that the instruction stores.

In general, a store instruction is type safe if the local variable it references is of a type that is a supertype of Type, and the top of the operand stack is of a subtype of Type, where Type is the type the instruction is designed to store.

More precisely, the store is type safe if one can pop a type ActualType that "matches" Type (that is, is a subtype of Type) off the operand stack (4.10.1.4), and then legally assign that type the local variable LIndex.

storeIsTypeSafe(_Environment, Index, Type,
                frame(Locals, OperandStack, Flags),
                frame(NextLocals, NextOperandStack, Flags)) :-
storeIsTypeSafe(_Environment, Index, Type,
                frame(Locals, OperandStack, InitState),
                frame(NextLocals, NextOperandStack, InitState)) :-
    popMatchingType(OperandStack, Type, NextOperandStack, ActualType),
    modifyLocalVariable(Index, ActualType, Locals, NextLocals).

Given local variables Locals, modifying Index to have type Type results in the local variable list NewLocals. The modifications are somewhat involved, because some values (and their corresponding types) occupy two local variables. Hence, modifying LN may require modifying LN+1 (because the type will occupy both the N and N+1 slots) or LN-1 (because local N used to be the upper half of the two word value/type starting at local N-1, and so local N-1 must be invalidated), or both. This is described further below. We start at L0 and count up.

modifyLocalVariable(Index, Type, Locals, NewLocals) :-
    modifyLocalVariable(0, Index, Type, Locals, NewLocals).

Given LocalsRest, the suffix of the local variable list starting at index I, modifying local variable Index to have type Type results in the local variable list suffix NextLocalsRest.

If I < Index-1, just copy the input to the output and recurse forward. If I = Index-1, the type of local I may change. This can occur if LI has a type of size 2. Once we set LI+1 to the new type (and the corresponding value), the type/value of LI will be invalidated, as its upper half will be trashed. Then we recurse forward.

modifyLocalVariable(I, Index, Type,
                    [Locals1 | LocalsRest],
                    [Locals1 | NextLocalsRest] ) :-
    I < Index - 1,
    I1 is I + 1,
    modifyLocalVariable(I1, Index, Type, LocalsRest, NextLocalsRest).

modifyLocalVariable(I, Index, Type,
                    [Locals1 | LocalsRest],
                    [NextLocals1 | NextLocalsRest] ) :-
    I =:= Index - 1,
    modifyPreIndexVariable(Locals1, NextLocals1),
    modifyLocalVariable(Index, Index, Type, LocalsRest, NextLocalsRest).

When we find the variable, and it only occupies one word, we change it to Type and we're done. When we find the variable, and it occupies two words, we change its type to Type and the next word to top.

modifyLocalVariable(Index, Index, Type,
                    [_ | LocalsRest], [Type | LocalsRest]) :-
    sizeOf(Type, 1).

modifyLocalVariable(Index, Index, Type,
                    [_, _ | LocalsRest], [Type, top | LocalsRest]) :-
    sizeOf(Type, 2).

We refer to a local whose index immediately precedes a local whose type will be modified as a pre-index variable. The future type of a pre-index variable of type InputType is Result. If the type, Type, of the pre-index local is of size 1, it doesn't change. If the type of the pre-index local, Type, is 2, we need to mark the lower half of its two word value as unusable, by setting its type to top.

modifyPreIndexVariable(Type, Type) :- sizeOf(Type, 1).
modifyPreIndexVariable(Type, top) :- sizeOf(Type, 2).
4.10.1.8 Type Checking for protected Members

All instructions that access members must contend with the rules concerning protected members. This section describes the protected check that corresponds to JLS §6.6.2.1.

The protected check applies only to protected members of superclasses of the current class. protected members in other classes will be caught by the access checking done at resolution (5.4.4). There are four cases:

The predicate classesInOtherPkgWithProtectedMember(Class, MemberName, MemberDescriptor, MemberClassName, Supers, List) is true if List is the set of classes in Supers with name MemberClassName that are in a different run-time package than Class which declare a protected member named MemberName with descriptor MemberDescriptor.

classesInOtherPkgWithProtectedMember(_, _, _, _, [], []).

classesInOtherPkgWithProtectedMember(Class, MemberName,
                                     MemberDescriptor, MemberClassName,
                                     [Super | Tail], [Super | T]) :-
    classClassName(Super, MemberClassName),
    \+ sameRuntimePackage(Class, Super),
    isProtected(Super, MemberName, MemberDescriptor),
    classesInOtherPkgWithProtectedMember(
      Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).

classesInOtherPkgWithProtectedMember(Class, MemberName,
                                     MemberDescriptor, MemberClassName,
                                     [Super | Tail], T) :-
    classClassName(Super, MemberClassName),
    \+ sameRuntimePackage(Class, Super),
    isNotProtected(Super, MemberName, MemberDescriptor),
    classesInOtherPkgWithProtectedMember(
      Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).

classesInOtherPkgWithProtectedMember(Class, MemberName,
                                     MemberDescriptor, MemberClassName,
                                     [Super | Tail], T) :-
    classClassName(Super, MemberClassName),
    sameRuntimePackage(Class, Super),
    classesInOtherPkgWithProtectedMember(
      Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).

classesInOtherPkgWithProtectedMember(Class, MemberName,
                                     MemberDescriptor, MemberClassName,
                                     [Super | Tail], T) :-
    classClassName(Super, SuperClassName),
    SuperClassName \= MemberClassName,
    classesInOtherPkgWithProtectedMember(
      Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).

sameRuntimePackage(Class1, Class2) :-
    classDefiningLoader(Class1, L),
    classDefiningLoader(Class2, L),
    samePackageName(Class1, Class2).
4.10.1.9 Type Checking Instructions

In general, the type rule for an instruction is given relative to an environment Environment that defines the class and method in which the instruction occurs (4.10.1.1), and the offset Offset within the method at which the instruction occurs. The rule states that if the incoming type state StackFrame fulfills certain requirements, then:

Many instructions have type rules that are completely isomorphic to the rules for other instructions. If an instruction b1 is isomorphic to another instruction b2, then the type rule for b1 is the same as the type rule for b2.

instructionIsTypeSafe(Instruction, Environment, Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    instructionHasEquivalentTypeRule(Instruction, IsomorphicInstruction),
    instructionIsTypeSafe(IsomorphicInstruction, Environment, Offset,
                          StackFrame, NextStackFrame,
                          ExceptionStackFrame).

The English language description of each rule is intended to be readable, intuitive, and concise. As such, the description avoids repeating all the contextual assumptions given above. In particular:

Any ambiguities can be resolved by referring to the formal Prolog clauses.

dup

A dup instruction is type safe iff one can validly replace a category 1 type, Type, with the types Type, Type, yielding the outgoing type state.

instructionIsTypeSafe(dup, Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    StackFrame = frame(Locals, InputOperandStack, Flags),
    StackFrame = frame(Locals, InputOperandStack, InitState),
    popCategory1(InputOperandStack, Type, _),
    canSafelyPush(Environment, InputOperandStack, Type, OutputOperandStack),
    NextStackFrame = frame(Locals, OutputOperandStack, Flags),
    NextStackFrame = frame(Locals, OutputOperandStack, InitState),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).
iinc

An iinc instruction with first operand Index is type safe iff LIndex has type int. The iinc instruction does not change the type state.

instructionIsTypeSafe(iinc(Index, _Value), _Environment, _Offset,
                      StackFrame, StackFrame, ExceptionStackFrame) :-
    StackFrame = frame(Locals, _OperandStack, _Flags),
    StackFrame = frame(Locals, _OperandStack, _InitState),
    nth0(Index, Locals, int),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).
invokeinterface

An invokeinterface instruction is type safe iff all of the following are true:

instructionIsTypeSafe(invokeinterface(CP, Count, 0), Environment, _Offset,
                      StackFrame, NextStackFrame, ExceptionStackFrame) :-
    CP = imethod(MethodClassType, MethodName, Descriptor),
    MethodName \= '`<init>`',
    MethodName \= '`<clinit>`',
    parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
    reverse([MethodClassType | OperandArgList], StackArgList),
    canPop(StackFrame, StackArgList, TempFrame),
    validTypeTransition(Environment, [], ReturnType,
                        TempFrame, NextStackFrame),
    countIsValid(Count, StackFrame, TempFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

The Count operand of an invokeinterface instruction is valid if it equals the size of the arguments to the instruction. This is equal to the difference between the size of InputFrame and OutputFrame.

countIsValid(Count, InputFrame, OutputFrame) :-
    InputFrame = frame(_Locals1, OperandStack1, _Flags1),
    OutputFrame = frame(_Locals2, OperandStack2, _Flags2),
    InputFrame = frame(_, OperandStack1, _),
    OutputFrame = frame(_, OperandStack2, _),
    length(OperandStack1, Length1),
    length(OperandStack2, Length2),
    Count =:= Length1 - Length2.
invokespecial

An invokespecial instruction is type safe iff all of the following are true:

instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    (CP = method(MethodClassType, MethodName, Descriptor) ;
     CP = imethod(MethodClassType, MethodName, Descriptor)),
    MethodName \= '`<init>`',
    MethodName \= '`<clinit>`',
    validSpecialMethodClassType(Environment, MethodClassType),
    parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
    thisType(Environment, ThisType),
    reverse([ThisType | OperandArgList], StackArgList),
    validTypeTransition(Environment, StackArgList, ReturnType,
                        StackFrame, NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

validSpecialMethodClassType(Environment, class(MethodClassName, _)) :-
    thisClass(Environment, ThisClass),
    classClassName(ThisClass, MethodClassName).

validSpecialMethodClassType(Environment, class(MethodClassName, L)) :-
    thisClass(Environment, ThisClass),
    loadedSuperclasses(ThisClass, Supers),
    member(Super, Supers),
    classClassName(Super, MethodClassName),
    loadedClass(MethodClassName, L, Super).

validSpecialMethodClassType(Environment, class(MethodClassName, _)) :-
    thisClass(Environment, ThisClass),
    classInterfaceNames(ThisClass, InterfaceNames),
    member(MethodClassName, InterfaceNames).

The validSpecialMethodClassType clause enforces the structural constraint that invokespecial, for other than an instance initialization method, must name a method in the current class/interface or a superclass/superinterface.

The validTypeTransition clause enforces the structural constraint that invokespecial, for other than an instance initialization method, targets a receiver object of the current class or deeper. To see why, consider that StackArgList simulates the list of types on the operand stack expected by the method, starting with the current class (the class performing invokespecial). The actual types on the operand stack are in StackFrame. The effect of validTypeTransition is to pop the first type from the operand stack in StackFrame and check it is a subtype of the first term of StackArgList, namely the current class. Thus, the actual receiver type is compatible with the current class.

A sharp-eyed reader might notice that enforcing this structural constraint supercedes the structural constraint pertaining to invokespecial of a protected method. Thus, the Prolog code above makes no reference to passesProtectedCheck (4.10.1.8), whereas the Prolog code for invokespecial of an instance initialization method uses passesProtectedCheck to ensure the actual receiver type is compatible with the current class when certain protected instance initialization methods are named.

instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    (CP = method(MethodClassType, '`<init>`', Descriptor) ;
     CP = imethod(MethodClassType, '`<init>`', Descriptor)),
    parseMethodDescriptor(Descriptor, OperandArgList, void),
    reverse([uninitializedThis | OperandArgList], StackArgList),
    canPop(StackFrame, StackArgList, frame(Locals, OperandStack, Flags)),
    validThisInitClassType(Environment, MethodClassType),
    canPop(StackFrame, StackArgList, frame(Locals, OperandStack, InitState)),
    validThisInitCall(Environment, MethodClassType, InitState),
    thisType(Environment, This),
    substitute(uninitializedThis, This, OperandStack, NextOperandStack),
    substitute(uninitializedThis, This, Locals, NextLocals),
    substitute(uninitializedThis, top, Locals, ExceptionLocals),
    NextStackFrame = frame(NextLocals, NextOperandStack, []),
    ExceptionStackFrame = frame(ExceptionLocals, [], Flags).
    NextStackFrame = frame(NextLocals, NextOperandStack, unrestricted),
    ExceptionStackFrame = frame(ExceptionLocals, [], erroneous).
validThisInitClassType(Environment, class(MethodClassName, _)) :-
validThisInitCall(Environment, class(MethodClassName, _), restricted(_)) :-
    thisClass(Environment, ThisClass),
    classClassName(ThisClass, MethodClassName).
validThisInitClassType(Environment, class(MethodClassName, _)) :-
validThisInitCall(Environment, class(MethodClassName, _), restricted([])) :-
    thisClass(Environment, ThisClass),
    classSuperClassName(ThisClass, MethodClassName).
substitute(_Old, _New, [], []).
substitute(Old, New, [Old | FromRest], [New | ToRest]) :-
    substitute(Old, New, FromRest, ToRest).
substitute(Old, New, [From1 | FromRest], [From1 | ToRest]) :-
    From1 \= Old,
    substitute(Old, New, FromRest, ToRest).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    (CP = method(MethodClassType, '`<init>`', Descriptor) ;
     CP = imethod(MethodClassType, '`<init>`', Descriptor)),
    parseMethodDescriptor(Descriptor, OperandArgList, void),
    reverse([uninitialized(Address) | OperandArgList], StackArgList),
    canPop(StackFrame, StackArgList, frame(Locals, OperandStack, Flags)),
    canPop(StackFrame, StackArgList, frame(Locals, OperandStack, InitState)),
    allInstructions(Environment, Instructions),
    member(instruction(Address, new(MethodClassType)), Instructions),
    substitute(uninitialized(Address), MethodClassType,
               OperandStack, NextOperandStack),
    substitute(uninitialized(Address), MethodClassType, Locals, NextLocals),
    substitute(uninitialized(Address), top, Locals, ExceptionLocals),
    NextStackFrame = frame(NextLocals, NextOperandStack, Flags),
    ExceptionStackFrame = frame(ExceptionLocals, [], Flags),
    NextStackFrame = frame(NextLocals, NextOperandStack, InitState),
    ExceptionStackFrame = frame(ExceptionLocals, [], InitState),
    passesProtectedCheck(Environment, MethodClassType, '`<init>`',
                         Descriptor, NextStackFrame).

The rule for invokespecial of an <init> method is the sole motivation for passing back a distinct exception stack frame. The concern is that when initializing an object, the invokespecial invocation could fail, leaving the object in a partially initialized, permanently unusable state. To prevent repeated initialization attempts after an object fails to initialize the first time, an exception handler must consider any references to the object stored in local variables to have type top rather than uninitializedThis or uninitialized(Offset).

In the special case of initializing the current object (that is, when invoking <init> for type uninitializedThis), the original frame typically holds an uninitialized object in local variable 0 and has flag flagThisUninit. Normal termination of invokespecial initializes the uninitialized object and turns off the flagThisUninit flag. But if the invokespecial invocation throws an exception, the exception frame contains the broken object (with type top) and the flagThisUninit flag (the old flag). There is no way to perform a return instruction given that type state, so the handler would have to throw another exception (or loop forever). In fact, it's not even possible to express a handler with that type state, because there is no way for a stack frame, as expressed by the StackMapTable attribute (4.7.4), to carry flagThisUninit without any accompanying use of type uninitializedThis.

If not for these this special constraints constraint on object initialization, the local variable types and flags of the exception stack frame would always be the same as the local variable types and flags of the input stack frame.

new

A new instruction with operand CP at offset Offset is type safe iff CP refers to a constant pool entry denoting a class or interface type, the type uninitialized(Offset) does not appear in the incoming operand stack, and one can validly push uninitialized(Offset) onto the incoming operand stack and replace uninitialized(Offset) with top in the incoming local variables yielding the outgoing type state.

instructionIsTypeSafe(new(CP), Environment, Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    StackFrame = frame(Locals, OperandStack, Flags),
    StackFrame = frame(Locals, OperandStack, InitState),
    CP = class(_, _),
    NewItem = uninitialized(Offset),
    notMember(NewItem, OperandStack),
    substitute(NewItem, top, Locals, NewLocals),
    validTypeTransition(Environment, [], NewItem,
                        frame(NewLocals, OperandStack, Flags),
                        frame(NewLocals, OperandStack, InitState),
                        NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

The substitute predicate is defined in the rule for invokespecial (4.10.1.9.invokespecial).

pop, pop2

A pop instruction is type safe iff one can validly pop a category 1 type off the incoming operand stack yielding the outgoing type state.

instructionIsTypeSafe(pop, _Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    StackFrame = frame(Locals, [Type | Rest], Flags),
    StackFrame = frame(Locals, [Type | Rest], InitState),
    popCategory1([Type | Rest], Type, Rest),
    NextStackFrame = frame(Locals, Rest, Flags),
    NextStackFrame = frame(Locals, Rest, InitState),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

A pop2 instruction is type safe iff it is a type safe form of the pop2 instruction.

instructionIsTypeSafe(pop2, _Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    StackFrame = frame(Locals, InputOperandStack, Flags),
    StackFrame = frame(Locals, InputOperandStack, InitState),
    pop2SomeFormIsTypeSafe(InputOperandStack, OutputOperandStack),
    NextStackFrame = frame(Locals, OutputOperandStack, Flags),
    NextStackFrame = frame(Locals, OutputOperandStack, InitState),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

A pop2 instruction is a type safe form of the pop2 instruction iff it is a type safe form 1 pop2 instruction or a type safe form 2 pop2 instruction.

pop2SomeFormIsTypeSafe(InputOperandStack, OutputOperandStack) :-
    pop2Form1IsTypeSafe(InputOperandStack, OutputOperandStack).

pop2SomeFormIsTypeSafe(InputOperandStack, OutputOperandStack) :-
    pop2Form2IsTypeSafe(InputOperandStack, OutputOperandStack).

A pop2 instruction is a type safe form 1 pop2 instruction iff one can validly pop two types of size 1 off the incoming operand stack yielding the outgoing type state.

pop2Form1IsTypeSafe([Type1, Type2 | Rest], Rest) :-
    popCategory1([Type1 | Rest], Type1, Rest),
    popCategory1([Type2 | Rest], Type2, Rest).

A pop2 instruction is a type safe form 2 pop2 instruction iff one can validly pop a type of size 2 off the incoming operand stack yielding the outgoing type state.

pop2Form2IsTypeSafe([top, Type | Rest], Rest) :-
    popCategory2([top, Type | Rest], Type, Rest).
putfield

A putfield instruction with operand CP is type safe iff all of the following are true:

instructionIsTypeSafe(putfield(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    CP = field(FieldClassType, FieldName, FieldDescriptor),
    \+ currentClassStrictFinalField(Environment, CP),
    parseFieldDescriptor(FieldDescriptor, FieldType),
    canPop(StackFrame, [FieldType], PoppedFrame),
    passesProtectedCheck(Environment, FieldClassType, FieldName,
                         FieldDescriptor, PoppedFrame),
    canPop(StackFrame, [FieldType, FieldClassType], NextStackFrame),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).
currentClassStrictFinalField(Environment, CP) :-
    CP = field(FieldClassName, FieldName, FieldDescriptor),
    thisClass(Environment, CurrentClass),
    classClassName(CurrentClass, FieldClassName),
    classDeclaresMember(CurrentClass, FieldName, FieldDescriptor, FieldFlags),
    member(strict_init, FieldFlags),
    member(final, FieldFlags).

The \+ operator is commonly used in Prolog to indicate negation. The new clause prevents this rule from applying when the given field is declared ACC_FINAL and ACC_STRICT_INIT in the current class. (And note that any attempts to write to an ACC_FINAL field declared in another class will fail with a linkage error.)

instructionIsTypeSafe(putfield(CP), Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    CP = field(FieldClassType, FieldName, FieldDescriptor),
    thisType(Environment, FieldClassType),
    Environment = environment(CurrentClass, CurrentMethod, _, _, _, _),
    methodName(CurrentMethod, '`<init>`'),
    classDeclaresMember(CurrentClass, FieldName, FieldDescriptor),
    parseFieldDescriptor(FieldDescriptor, FieldType),
    canPop(StackFrame, [FieldType, uninitializedThis], NextStackFrame),
    canPop(StackFrame, [FieldType, uninitializedThis], TempFrame),
    TempFrame = frame(NextLocals, NextStack, restricted(Unset)),
    exclude(=(unsetField(FieldName, FieldDescriptor)), Unset, NextUnset),
    NextStackFrame = frame(NextLocals, NextStack, restricted(NextUnset)),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).
return

A return instruction is type safe if the enclosing method declares a void return type, and either: the incoming instance initialization state is unrestricted.

instructionIsTypeSafe(return, Environment, _Offset, StackFrame,
                      afterGoto, ExceptionStackFrame) :-
    thisMethodReturnType(Environment, void),
    StackFrame = frame(_Locals, _OperandStack, Flags),
    notMember(flagThisUninit, Flags),
    StackFrame = frame(_Locals, _OperandStack, unrestricted),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).
swap

A swap instruction is type safe iff one can validly replace two category 1 types, Type1 and Type2, on the incoming operand stack with the types Type2 and Type1 yielding the outgoing type state.

instructionIsTypeSafe(swap, _Environment, _Offset, StackFrame,
                      NextStackFrame, ExceptionStackFrame) :-
    StackFrame = frame(_Locals, [Type1, Type2 | Rest], _Flags),
    StackFrame = frame(_, [Type1, Type2 | Rest], _),
    popCategory1([Type1 | Rest], Type1, Rest),
    popCategory1([Type2 | Rest], Type2, Rest),
    NextStackFrame = frame(_Locals, [Type2, Type1 | Rest], _Flags),
    NextStackFrame = frame(_, [Type2, Type1 | Rest], _),
    exceptionStackFrame(StackFrame, ExceptionStackFrame).

Chapter 5: Loading, Linking, and Initializing

5.5 Initialization

Initialization of a class or interface involves assigning any ConstantValue attribute values to its static fields and executing any declared class or interface initialization method (2.9.2).

A class or interface C may be initialized only as a result of:

Prior to initialization, a class or interface must be linked, that is, verified, prepared, and optionally resolved.

Because the Java Virtual Machine is multithreaded, initialization of a class or interface requires careful synchronization, since some other thread may be trying to initialize the same class or interface at the same time. There is also the possibility that initialization of a class or interface may be requested recursively as part of the initialization of that class or interface. The implementation of the Java Virtual Machine is responsible for taking care of synchronization and recursive initialization by using the following procedure. It assumes that the class or interface has already been verified and prepared, and that the class or interface contains state that indicates one of four situations:

The precise form of the initialization state is left to the discretion of the JVM implementation.

For each class or interface C, there is a unique initialization lock LC. The mapping from C to LC is also left to the discretion of the Java Virtual Machine implementation. For example, LC could be the Class object for C, or the monitor associated with that Class object. The procedure for initializing C is then as follows:

  1. Synchronize on the initialization lock, LC, for C. This involves waiting until the current thread can acquire LC.

  2. If the initialization state of C indicates that initialization is in progress for C by some other thread, then release LC and block the current thread until informed that the in-progress initialization has completed, at which time repeat this procedure.

    Thread interrupt status is unaffected by execution of the initialization procedure.

  3. If the initialization state of C indicates that initialization is in progress for C by is larval in the current thread, then this must be a recursive request for initialization. Release LC and complete normally.

  4. If the initialization state of C indicates that C has already been initialized, then no further action is required. Release LC and complete normally.

  5. If the initialization state of C is in an erroneous state, then initialization is not possible. Release LC and throw a NoClassDefFoundError.

  6. Otherwise, record the fact that initialization of C is in progress by transition C to a larval state in the current thread, and release LC.

    Then, initialize each static field of C with the constant value in its ConstantValue attribute (4.7.2), in the order the fields appear in the ClassFile structure, and if the field is strictly-initialized, record in the initialization state that it is set.

  7. Next, if C is a class rather than an interface, then let SC be its superclass and let SI1, ..., SIn be all superinterfaces of C (whether direct or indirect) that declare at least one non-abstract, non-static method. The order of superinterfaces is given by a recursive enumeration over the superinterface hierarchy of each interface directly implemented by C. For each interface I directly implemented by C (in the order of the interfaces array of C), the enumeration recurs on I's superinterfaces (in the order of the interfaces array of I) before returning I.

    For each S in the list [ SC, SI1, ..., SIn ], if S has not yet been initialized, then recursively perform this entire procedure for S. If necessary, verify and prepare S first.

    If the initialization of S completes abruptly because of a thrown exception, then acquire LC, label C as erroneous transition C to the erroneous state, notify all waiting threads, release LC, and complete abruptly, throwing the same exception that resulted from initializing S.

  8. Next, determine whether assertions are enabled for C by querying its defining loader.

  9. Next, if C declares a class or interface initialization method, execute that method.

  10. If the execution of the class or interface initialization method completes normally, or if C declares no class or interface initialization method, then acquire LC, label C as fully initialized, notify all waiting threads, release LC, and complete this procedure normally.

  11. Otherwise, If the execution of the class or interface initialization method must have completed completes abruptly by throwing some exception E., then initialization has failed. If the class of E is Error or one of its subclasses, the failure is for reason E. If E has some other class, If the class of E is not Error or one of its subclasses, then create a new instance of the class ExceptionInInitializerError with E as the argument, and use this object in place of E in the following step as the reason for failure. If a new instance of ExceptionInInitializerError cannot be created because an OutOfMemoryError occurs, then use an OutOfMemoryError object in place of E in the following step as the reason for failure. In any case, acquire LC, transition C to the erroneous state, notify all waiting threads, release LC, and complete this procedure abruptly with the reason described above.

  12. Acquire LC, label C as erroneous, notify all waiting threads, release LC, and complete this procedure abruptly with reason E or its replacement as determined in the previous step.

  13. Otherwise, execution of the class or interface initialization method, if any, has completed normally. Check whether each strictly-initialized static field declared by C has been set. If any such field has been left unset, acquire LC, transition C to the erroneous state, notify all waiting threads, release LC, and complete this procedure abruptly with an IllegalStateException as the reason.

    The precise exception class to be thrown may change. Some sort of Error may be appropriate.

  14. If all strictly-initialized static fields declared by C have been set, then acquire C, transition C to the initialized state, notify all waiting threads, release LC, and complete this procedure normally.

A Java Virtual Machine implementation may optimize this procedure by eliding the lock acquisition in step 1 (and release in step 4/5) when it can determine that the initialization of the class has already completed, provided that, in terms of the Java memory model, all happens-before orderings (JLS §17.4.5) that would exist if the lock were acquired, still exist when the optimization is performed.

Chapter 6: The Java Virtual Machine Instruction Set

6.5 Instructions

getstatic

Operation

Get static field from class

Format

getstatic
indexbyte1
indexbyte2

Forms

getstatic = 178 (0xb2)

Operand Stack

...,

..., value

Description

The unsigned indexbyte1 and indexbyte2 are used to construct an index into the run-time constant pool of the current class (2.6), where the value of the index is (indexbyte1 << 8) | indexbyte2. The run-time constant pool entry at the index must be a symbolic reference to a field (5.1), which gives the name and descriptor of the field as well as a symbolic reference to the class or interface in which the field is to be found. The referenced field is resolved (5.4.3.2).

On successful resolution of the field, the class or interface that declared the resolved field is initialized if that class or interface has not already been initialized (5.5). If that class or interface is in a larval initialization state, and if the resolved field is strictly-initialized, then the initialization state is updated to reflect that the field has been read.

The Finally, the value of the class or interface field is fetched and pushed onto the operand stack.

Linking Exceptions

During resolution of the symbolic reference to the class or interface field, any of the exceptions pertaining to field resolution (5.4.3.2) can be thrown.

Otherwise, if the resolved field is not a static (class) field or an interface field, getstatic throws an IncompatibleClassChangeError.

Run-time Exception

Otherwise, if execution of this getstatic instruction causes initialization of the referenced resolved field's declaring class or interface, getstatic may throw an Error as detailed in 5.5.

Otherwise, if the resolved field is strictly-initialized, and the resolved field's declaring class or interface is in a larval initialization state, and that state indicates that the field is unset, getstatic throws an IllegalStateException.

The precise exception class to be thrown may change.

It is the intent that these rules apply equally to all reflective attempts to read a static field. TBD whether something more needs to be said somewhere in order to enforce that intention.

putstatic

Operation

Set static field in class

Format

putstatic
indexbyte1
indexbyte2

Forms

putstatic = 179 (0xb3)

Operand Stack

..., value

...

Description

The unsigned indexbyte1 and indexbyte2 are used to construct an index into the run-time constant pool of the current class (2.6), where the value of the index is (indexbyte1 << 8) | indexbyte2. The run-time constant pool entry at the index must be a symbolic reference to a field (5.1), which gives the name and descriptor of the field as well as a symbolic reference to the class or interface in which the field is to be found. The referenced field is resolved (5.4.3.2).

On successful resolution of the field, the class or interface that declared the resolved field is initialized if that class or interface has not already been initialized (5.5). If that class or interface is in a larval initialization state, and if the resolved field is strictly-initialized, then the initialization state is updated to reflect that the field has been set.

The type of a value stored by a putstatic instruction must be compatible with the descriptor of the referenced field (4.3.2). If the field descriptor type is boolean, byte, char, short, or int, then the value must be an int. If the field descriptor type is float, long, or double, then the value must be a float, long, or double, respectively. If the field descriptor type is a class type or an array type, then the value must be a value of the field descriptor type. If the field is final, it must be declared in the current class or interface, and the instruction must occur in the class or interface initialization method of the current class or interface (2.9.2).

The value is popped from the operand stack.

If the value is of type int and the field descriptor type is one of byte, char, short, or boolean, then the int value is converted to the field descriptor type as follows. If the field descriptor type is byte, char, or short, then the int value is truncated to a value of the field descriptor type, value'. If the field descriptor type is boolean, then the int value is narrowed by taking the bitwise AND of value and 1, resulting in value'. The referenced field in the class or interface is set to value'.

Otherwise, the referenced field in the class or interface is set to value.

Linking Exceptions

During resolution of the symbolic reference to the class or interface field, any of the exceptions pertaining to field resolution (5.4.3.2) can be thrown.

Otherwise, if the resolved field is not a static (class) field or an interface field, putstatic throws an IncompatibleClassChangeError.

Otherwise, if the resolved field is final, it must be declared in the current class or interface, and the instruction must occur in the class or interface initialization method of the current class or interface. Otherwise, an IllegalAccessError is thrown.

Run-time Exception

Otherwise, if execution of this putstatic instruction causes initialization of the referenced class or interface, putstatic may throw an Error as detailed in 5.5.

Otherwise, if the resolved field is strictly-initialized and final, and the resolved field's declaring class or interface is in a larval initialization state, and that state indicates that the field has been read, putstatic throws an IllegalStateException.

The precise exception class to be thrown may change.

It is the intent that these rules apply equally to all reflective attempts to write a static field. TBD whether something more needs to be said somewhere in order to enforce that intention.

Notes

A putstatic instruction may be used only to set the value of an interface field on the initialization of that field. Interface fields may be assigned to only once, on execution of an interface variable initialization expression when the interface is initialized (5.5, JLS §9.3.1).