Class Float16
java.lang.Object
java.lang.Number
jdk.incubator.vector.Float16
- All Implemented Interfaces:
Serializable, Comparable<Float16>
The Float16 class holds 16-bit data in IEEE 754 binary16 format.
Binary16 Format:
S EEEEE MMMMMMMMMM
Sign - 1 bit
Exponent - 5 bits
Significand - 10 bits (does not include the implicit bit inferred from the exponent, see PRECISION)
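As an illustration of the layout above, the following sketch decodes a raw 16-bit pattern by hand into the double it denotes. The class name Binary16Layout and the method decode are hypothetical and are not part of the Float16 API; the exponent bias of 15 and the treatment of subnormals are the standard IEEE 754 binary16 rules rather than something stated on this page.

```java
final class Binary16Layout {
    /** Decodes a raw binary16 bit pattern (S EEEEE MMMMMMMMMM) into a double. */
    static double decode(short bits) {
        int sign = (bits >> 15) & 0x1;       // 1 sign bit
        int exponent = (bits >> 10) & 0x1F;  // 5 exponent bits, biased by 15
        int fraction = bits & 0x3FF;         // 10 explicit significand bits
        double magnitude;
        if (exponent == 0x1F) {
            // All-ones exponent: infinity if the fraction is zero, NaN otherwise.
            magnitude = (fraction == 0) ? Double.POSITIVE_INFINITY : Double.NaN;
        } else if (exponent == 0) {
            // Zero or subnormal: no implicit leading 1, fixed exponent of -14.
            magnitude = fraction * Math.scalb(1.0, -24);
        } else {
            // Normal value: the implicit leading 1 supplies the 11th bit of precision.
            magnitude = (1.0 + fraction / 1024.0) * Math.scalb(1.0, exponent - 15);
        }
        return sign == 0 ? magnitude : -magnitude;
    }

    public static void main(String[] args) {
        System.out.println(decode((short) 0x3C00)); // bit pattern of 1.0
        System.out.println(decode((short) 0x7BFF)); // bit pattern of the largest finite value
    }
}
```

Every binary16 value is exactly representable as a double, so the decoded result involves no rounding.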
Unless otherwise specified, the methods in this class use a rounding policy (JLS 15.4) of round to nearest.
This is a value-based class; programmers should treat instances that are equal as interchangeable and should not use instances for synchronization, or unpredictable behavior may occur. For example, in a future release, synchronization may fail.
Floating-point Equality, Equivalence, and Comparison
The class java.lang.Double has a discussion of equality, equivalence, and comparison of floating-point values that is equally applicable to Float16 values.
Decimal ↔ Binary Conversion Issues
The discussion of binary to decimal conversion issues in java.lang.Double is also applicable to Float16 values.
- API Note:
- The methods in this class generally have analogous methods in either Float/Double or Math/StrictMath. Unless otherwise specified, the handling of special floating-point values such as NaN values, infinities, and signed zeros of methods in this class is wholly analogous to the handling of equivalent cases by methods in Float, Double, Math, etc.
- Since:
- 24
-
Field Summary
Fields
Modifier and Type | Field | Description
static final int | BYTES | The number of bytes used to represent a Float16 value, 2.
static final int | MAX_EXPONENT | Maximum exponent a finite Float16 variable may have, 15.
static final Float16 | MAX_VALUE | A constant holding the largest positive finite value of type Float16, (2 - 2^-10)·2^15, numerically equal to 65504.0.
static final int | MIN_EXPONENT | Minimum exponent a normalized Float16 variable may have, -14.
static final Float16 | MIN_NORMAL | A constant holding the smallest positive normal value of type Float16, 2^-14.
static final Float16 | MIN_VALUE | A constant holding the smallest positive nonzero value of type Float16, 2^-24.
static final Float16 | NaN | A constant holding a Not-a-Number (NaN) value of type Float16.
static final Float16 | NEGATIVE_INFINITY | A constant holding the negative infinity of type Float16.
static final Float16 | POSITIVE_INFINITY | A constant holding the positive infinity of type Float16.
static final int | PRECISION | The number of bits in the significand of a Float16 value, 11.
static final int | SIZE | The number of bits used to represent a Float16 value, 16.
-
Method Summary
Modifier and Type | Method | Description
static Float16 | abs(Float16 f16) | Returns the absolute value of the argument.
static Float16 | add(Float16 addend, Float16 augend) | Adds two Float16 values together as per the + operator semantics using the round to nearest rounding policy.
byte | byteValue() | Returns the value of this Float16 as a byte after a narrowing primitive conversion.
static int | compare(Float16 f1, Float16 f2) | Compares the two specified Float16 values.
int | compareTo(Float16 anotherFloat16) | Compares two Float16 objects numerically.
static Float16 | copySign(Float16 magnitude, Float16 sign) | Returns the first floating-point argument with the sign of the second floating-point argument.
static Float16 | divide(Float16 dividend, Float16 divisor) | Divides two Float16 values as per the / operator semantics using the round to nearest rounding policy.
double | doubleValue() | Returns the value of this Float16 as a double after a widening primitive conversion.
boolean | equals(Object obj) | Compares this object against the specified object.
static short | float16ToRawShortBits(Float16 f16) | Returns a representation of the specified floating-point value according to the IEEE 754 floating-point binary16 bit layout.
static short | float16ToShortBits(Float16 f16) | Returns a representation of the specified floating-point value according to the IEEE 754 floating-point binary16 bit layout.
float | floatValue() | Returns the value of this Float16 as a float after a widening primitive conversion.
static Float16 | fma(Float16 a, Float16 b, Float16 c) | Returns the fused multiply add of the three arguments; that is, returns the exact product of the first two arguments summed with the third argument and then rounded once to the nearest Float16.
static int | getExponent(Float16 f16) | Returns the unbiased exponent used in the representation of a Float16.
int | hashCode() | Returns a hash code for this Float16 object.
static int | hashCode(Float16 value) | Returns a hash code for a Float16 value; compatible with Float16.hashCode().
int | intValue() | Returns the value of this Float16 as an int after a narrowing primitive conversion.
static boolean | isFinite(Float16 f16) | Returns true if the argument is a finite floating-point value; returns false otherwise (for NaN and infinity arguments).
static boolean | isInfinite(Float16 f16) | Returns true if the specified number is infinitely large in magnitude, false otherwise.
static boolean | isNaN(Float16 f16) | Returns true if the specified number is a Not-a-Number (NaN) value, false otherwise.
long | longValue() | Returns the value of this Float16 as a long after a narrowing primitive conversion.
static Float16 | max(Float16 a, Float16 b) | Returns the larger of two Float16 values.
static Float16 | min(Float16 a, Float16 b) | Returns the smaller of two Float16 values.
static Float16 | multiply(Float16 multiplier, Float16 multiplicand) | Multiplies two Float16 values as per the * operator semantics using the round to nearest rounding policy.
static Float16 | negate(Float16 f16) | Returns the negation of the argument.
static Float16 | nextDown(Float16 v) | Returns the floating-point value adjacent to v in the direction of negative infinity.
static Float16 | nextUp(Float16 v) | Returns the floating-point value adjacent to v in the direction of positive infinity.
static Float16 | scalb(Float16 v, int scaleFactor) | Returns v × 2^scaleFactor rounded as if performed by a single correctly rounded floating-point multiply.
static Float16 | shortBitsToFloat16(short bits) | Returns the Float16 value corresponding to a given bit representation.
short | shortValue() | Returns the value of this Float16 as a short after a narrowing primitive conversion.
static Float16 | signum(Float16 f) | Returns the signum function of the argument; zero if the argument is zero, 1.0 if the argument is greater than zero, -1.0 if the argument is less than zero.
static Float16 | sqrt(Float16 radicand) | Returns the square root of the operand.
static Float16 | subtract(Float16 minuend, Float16 subtrahend) | Subtracts two Float16 values as per the - operator semantics using the round to nearest rounding policy.
static String | toHexString(Float16 f16) | Returns a hexadecimal string representation of the Float16 argument.
String | toString() | Returns a string representation of this Float16.
static String | toString(Float16 f16) | Returns a string representation of the Float16 argument.
static Float16 | ulp(Float16 f16) | Returns the size of an ulp of the argument.
static Float16 | valueOf(double d) | Returns a Float16 value rounded from the double argument using the round to nearest rounding policy.
static Float16 | valueOf(float f) | Returns a Float16 value rounded from the float argument using the round to nearest rounding policy.
static Float16 | valueOf(int value) | Returns the value of an int converted to Float16.
static Float16 | valueOf(long value) | Returns the value of a long converted to Float16.
static Float16 | valueOf(String s) | Returns a Float16 holding the floating-point value represented by the argument string.
static Float16 | valueOf(BigDecimal v) | Returns a Float16 value rounded from the BigDecimal argument using the round to nearest rounding policy.
-
Field Details
-
POSITIVE_INFINITY
A constant holding the positive infinity of type Float16.
-
NEGATIVE_INFINITY
A constant holding the negative infinity of type Float16.
-
NaN
A constant holding a Not-a-Number (NaN) value of type Float16.
-
MAX_VALUE
A constant holding the largest positive finite value of type Float16, (2 - 2^-10)·2^15, numerically equal to 65504.0.
-
MIN_NORMAL
A constant holding the smallest positive normal value of type Float16, 2^-14.
-
MIN_VALUE
A constant holding the smallest positive nonzero value of type Float16, 2^-24.
-
SIZE
public static final int SIZE
The number of bits used to represent a Float16 value, 16.
-
PRECISION
public static final int PRECISION
The number of bits in the significand of a Float16 value, 11. This corresponds to parameter N in section 4.2.3 of The Java Language Specification.
-
MAX_EXPONENT
public static final int MAX_EXPONENT
Maximum exponent a finite Float16 variable may have, 15. It is equal to the value returned by Float16.getExponent(Float16.MAX_VALUE).
-
MIN_EXPONENT
public static final int MIN_EXPONENT
Minimum exponent a normalized Float16 variable may have, -14. It is equal to the value returned by Float16.getExponent(Float16.MIN_NORMAL).
-
BYTES
public static final int BYTES
The number of bytes used to represent a Float16 value, 2.
-
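The numeric constants above follow from PRECISION, MAX_EXPONENT, and MIN_EXPONENT. The sketch below recomputes them in double arithmetic as a sanity check; the class Binary16Constants and its methods are hypothetical helpers, not part of the Float16 API, and the formulas are the ones quoted in the field descriptions.

```java
final class Binary16Constants {
    static final int PRECISION = 11;     // significand bits, including the implicit bit
    static final int MAX_EXPONENT = 15;
    static final int MIN_EXPONENT = -14;

    /** (2 - 2^-10) * 2^15, the largest finite binary16 value. */
    static double maxValue() {
        return (2.0 - Math.scalb(1.0, -(PRECISION - 1))) * Math.scalb(1.0, MAX_EXPONENT);
    }

    /** 2^-14, the smallest positive normal value. */
    static double minNormal() {
        return Math.scalb(1.0, MIN_EXPONENT);
    }

    /** 2^-24, the smallest positive (subnormal) value: one unit in the last of 10 fraction bits at exponent -14. */
    static double minValue() {
        return Math.scalb(1.0, MIN_EXPONENT - (PRECISION - 1));
    }

    public static void main(String[] args) {
        System.out.println(maxValue());
        System.out.println(minNormal());
        System.out.println(minValue());
    }
}
```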
-
Method Details
-
toString
Returns a string representation of the Float16 argument. The behavior of this method is analogous to Float.toString(float) in the handling of special values (signed zeros, infinities, and NaN) and the generation of a decimal string that will convert back to the argument value.
- Parameters:
f16 - the Float16 to be converted.
- Returns:
- a string representation of the argument.
-
toHexString
Returns a hexadecimal string representation of the Float16 argument. The behavior of this method is analogous to Float.toHexString(float) except that an exponent value of "p-14" is used for subnormal Float16 values.
- API Note:
- This method corresponds to the convertToHexCharacter operation defined in IEEE 754.
- Parameters:
f16 - the Float16 to be converted.
- Returns:
- a hex string representation of the argument.
-
valueOf
Returns the value of an int converted to Float16.
- API Note:
- This method corresponds to the convertFromInt operation defined in IEEE 754.
- Parameters:
value - an int value.
- Returns:
- the value of an int converted to Float16
-
valueOf
Returns the value of a long converted to Float16.
- API Note:
- This method corresponds to the convertFromInt operation defined in IEEE 754.
- Parameters:
value - a long value.
- Returns:
- the value of a long converted to Float16
-
valueOf
Returns a Float16 value rounded from the float argument using the round to nearest rounding policy.
- API Note:
- This method corresponds to the convertFormat operation defined in IEEE 754.
- Parameters:
f - a float
- Returns:
- a Float16 value rounded from the float argument using the round to nearest rounding policy
-
valueOf
Returns a Float16 value rounded from the double argument using the round to nearest rounding policy.
- API Note:
- This method corresponds to the convertFormat operation defined in IEEE 754.
- Parameters:
d - a double
- Returns:
- a Float16 value rounded from the double argument using the round to nearest rounding policy
-
valueOf
Returns a Float16 holding the floating-point value represented by the argument string. The grammar of strings accepted by this method is the same as that accepted by Double.valueOf(String). The rounding policy is also analogous to the one used by that method: a valid input is regarded as an exact numerical value that is rounded once to the nearest representable Float16 value.
- API Note:
- This method corresponds to the convertFromDecimalCharacter and convertFromHexCharacter operations defined in IEEE 754.
- Parameters:
s - the string to be parsed.
- Returns:
- the Float16 value represented by the string argument.
- Throws:
NullPointerException - if the string is null
NumberFormatException - if the string does not contain a parsable Float16.
-
valueOf
Returns a Float16 value rounded from the BigDecimal argument using the round to nearest rounding policy.
- Parameters:
v - a BigDecimal
- Returns:
- a Float16 value rounded from the BigDecimal argument using the round to nearest rounding policy
-
isNaN
Returns true if the specified number is a Not-a-Number (NaN) value, false otherwise.
- API Note:
- This method corresponds to the isNaN operation defined in IEEE 754.
- Parameters:
f16 - the value to be tested.
- Returns:
- true if the argument is NaN; false otherwise.
-
isInfinite
Returns true if the specified number is infinitely large in magnitude, false otherwise.
- API Note:
- This method corresponds to the isInfinite operation defined in IEEE 754.
- Parameters:
f16 - the value to be tested.
- Returns:
- true if the argument is positive infinity or negative infinity; false otherwise.
-
isFinite
Returns true if the argument is a finite floating-point value; returns false otherwise (for NaN and infinity arguments).
- API Note:
- This method corresponds to the isFinite operation defined in IEEE 754.
- Parameters:
f16 - the Float16 value to be tested
- Returns:
- true if the argument is a finite floating-point value, false otherwise.
-
byteValue
public byte byteValue()
Returns the value of this Float16 as a byte after a narrowing primitive conversion.
- Overrides:
byteValue in class Number
- Returns:
- the value of this Float16 as a byte after a narrowing primitive conversion
- See Java Language Specification:
- 5.1.3 Narrowing Primitive Conversion
-
toString
Returns a string representation of this Float16.
-
shortValue
public short shortValue()
Returns the value of this Float16 as a short after a narrowing primitive conversion.
- Overrides:
shortValue in class Number
- Returns:
- the value of this Float16 as a short after a narrowing primitive conversion
- See Java Language Specification:
- 5.1.3 Narrowing Primitive Conversion
-
intValue
public int intValue()
Returns the value of this Float16 as an int after a narrowing primitive conversion.
- Specified by:
intValue in class Number
- API Note:
- This method corresponds to the convertToIntegerTowardZero operation defined in IEEE 754.
- Returns:
- the value of this Float16 as an int after a narrowing primitive conversion
- See Java Language Specification:
- 5.1.3 Narrowing Primitive Conversion
-
longValue
public long longValue()
Returns the value of this Float16 as a long after a narrowing primitive conversion.
- Specified by:
longValue in class Number
- API Note:
- This method corresponds to the convertToIntegerTowardZero operation defined in IEEE 754.
- Returns:
- the value of this Float16 as a long after a narrowing primitive conversion
- See Java Language Specification:
- 5.1.3 Narrowing Primitive Conversion
-
floatValue
public float floatValue()
Returns the value of this Float16 as a float after a widening primitive conversion.
- Specified by:
floatValue in class Number
- API Note:
- This method corresponds to the convertFormat operation defined in IEEE 754.
- Returns:
- the value of this Float16 as a float after a widening primitive conversion
- See Java Language Specification:
- 5.1.2 Widening Primitive Conversion
-
doubleValue
public double doubleValue()
Returns the value of this Float16 as a double after a widening primitive conversion.
- Specified by:
doubleValue in class Number
- API Note:
- This method corresponds to the convertFormat operation defined in IEEE 754.
- Returns:
- the value of this Float16 as a double after a widening primitive conversion
- See Java Language Specification:
- 5.1.2 Widening Primitive Conversion
-
hashCode
public int hashCode()
Returns a hash code for this Float16 object. The general contract of Object.hashCode() is satisfied. All NaN values have the same hash code. Additionally, all distinct numerical values have unique hash codes; in particular, negative zero and positive zero have different hash codes from each other.
-
hashCode
Returns a hash code for a Float16 value; compatible with Float16.hashCode().
- Parameters:
value - the value to hash
- Returns:
- a hash code value for a Float16 value.
-
equals
Compares this object against the specified object. The result is true if and only if the argument is not null and is a Float16 object that represents a Float16 that has the same value as the Float16 represented by this object.
- Overrides:
equals in class Object
- Parameters:
obj - the reference object with which to compare.
- Returns:
- true if this object is the same as the obj argument; false otherwise.
- See Java Language Specification:
- 15.21.1 Numerical Equality Operators == and !=
-
float16ToRawShortBits
Returns a representation of the specified floating-point value according to the IEEE 754 floating-point binary16 bit layout.
- Parameters:
f16 - a Float16 floating-point number.
- Returns:
- the bits that represent the floating-point number.
-
float16ToShortBits
Returns a representation of the specified floating-point value according to the IEEE 754 floating-point binary16 bit layout. All NaN values return the same bit pattern as NaN.
- Parameters:
f16 - a Float16 floating-point number.
- Returns:
- the bits that represent the floating-point number.
-
shortBitsToFloat16
Returns the Float16 value corresponding to a given bit representation.
- Parameters:
bits - any short integer.
- Returns:
- the Float16 floating-point value with the same bit pattern.
-
compareTo
Compares two Float16 objects numerically. This method imposes a total order on Float16 objects with two differences compared to the incomplete order defined by the Java language numerical comparison operators (<, <=, ==, >=, >) on float and double values.
- A NaN is unordered with respect to other values and unequal to itself under the comparison operators. This method chooses to define Float16.NaN to be equal to itself and greater than all other Float16 values (including Float16.POSITIVE_INFINITY).
- Positive zero and negative zero compare equal numerically, but are distinct and distinguishable values. This method chooses to define positive zero to be greater than negative zero.
- Specified by:
compareTo in interface Comparable<Float16>
- Parameters:
anotherFloat16 - the Float16 to be compared.
- Returns:
- the value 0 if anotherFloat16 is numerically equal to this Float16; a value less than 0 if this Float16 is numerically less than anotherFloat16; and a value greater than 0 if this Float16 is numerically greater than anotherFloat16.
- See Java Language Specification:
- 15.20.1 Numerical Comparison Operators <, <=, >, and >=
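The two differences from the comparison operators can be seen with float, whose Float.compare method documents the same treatment of NaN and signed zeros. The sketch below uses Float only as a stand-in so that it runs without the incubator module; TotalOrderDemo and its method names are hypothetical.

```java
import java.util.Arrays;

final class TotalOrderDemo {
    /** Sorts the values with Float.compare, a total order that ranks NaN last and -0.0 before 0.0. */
    static String sortedWithCompare(Float... values) {
        Float[] copy = values.clone();
        Arrays.sort(copy, Float::compare);
        return Arrays.toString(copy);
    }

    /** Reports what the == operator says about the same pair of values. */
    static boolean operatorSaysEqual(float a, float b) {
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(sortedWithCompare(Float.NaN, 0.0f, Float.POSITIVE_INFINITY, -0.0f, -1.0f));
        System.out.println(operatorSaysEqual(Float.NaN, Float.NaN)); // NaN is unequal to itself under ==
        System.out.println(operatorSaysEqual(0.0f, -0.0f));          // the zeros are equal under ==
    }
}
```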
-
compare
Compares the two specified Float16 values.
- Parameters:
f1 - the first Float16 to compare
f2 - the second Float16 to compare
- Returns:
- the value 0 if f1 is numerically equal to f2; a value less than 0 if f1 is numerically less than f2; and a value greater than 0 if f1 is numerically greater than f2.
-
max
Returns the larger of two Float16 values. The handling of signed zeros, NaNs, infinities, and other special cases by this method is analogous to the handling of those cases by the Math.max(double, double) method.
- API Note:
- This method corresponds to the maximum operation defined in IEEE 754.
- Parameters:
a - the first operand
b - the second operand
- Returns:
- the greater of a and b
-
min
Returns the smaller of two Float16 values. The handling of signed zeros, NaNs, infinities, and other special cases by this method is analogous to the handling of those cases by the Math.min(double, double) method.
- API Note:
- This method corresponds to the minimum operation defined in IEEE 754.
- Parameters:
a - the first operand
b - the second operand
- Returns:
- the smaller of a and b
-
add
Adds two Float16 values together as per the + operator semantics using the round to nearest rounding policy. The handling of signed zeros, NaNs, infinities, and other special cases by this method is the same as for the handling of those cases by the built-in + operator for floating-point addition (JLS 15.18.2).
- API Note:
- This method corresponds to the addition operation defined in IEEE 754.
- Parameters:
addend - the first operand
augend - the second operand
- Returns:
- the sum of the operands
- See Java Language Specification:
- 15.4 Floating-point Expressions
- 15.18.2 Additive Operators (+ and -) for Numeric Types
-
subtract
Subtracts two Float16 values as per the - operator semantics using the round to nearest rounding policy. The handling of signed zeros, NaNs, infinities, and other special cases by this method is the same as for the handling of those cases by the built-in - operator for floating-point subtraction (JLS 15.18.2).
- API Note:
- This method corresponds to the subtraction operation defined in IEEE 754.
- Parameters:
minuend - the first operand
subtrahend - the second operand
- Returns:
- the difference of the operands
- See Java Language Specification:
- 15.4 Floating-point Expressions
- 15.18.2 Additive Operators (+ and -) for Numeric Types
-
multiply
Multiplies two Float16 values as per the * operator semantics using the round to nearest rounding policy. The handling of signed zeros, NaNs, infinities, and other special cases by this method is the same as for the handling of those cases by the built-in * operator for floating-point multiplication (JLS 15.17.1).
- API Note:
- This method corresponds to the multiplication operation defined in IEEE 754.
- Parameters:
multiplier - the first operand
multiplicand - the second operand
- Returns:
- the product of the operands
- See Java Language Specification:
- 15.4 Floating-point Expressions
- 15.17.1 Multiplication Operator *
-
divide
Divides two Float16 values as per the / operator semantics using the round to nearest rounding policy. The handling of signed zeros, NaNs, infinities, and other special cases by this method is the same as for the handling of those cases by the built-in / operator for floating-point division (JLS 15.17.2).
- API Note:
- This method corresponds to the division operation defined in IEEE 754.
- Parameters:
dividend - the first operand
divisor - the second operand
- Returns:
- the quotient of the operands
- See Java Language Specification:
- 15.4 Floating-point Expressions
- 15.17.2 Division Operator /
-
sqrt
Returns the square root of the operand. The square root is computed using the round to nearest rounding policy. The handling of zeros, NaN, infinities, and negative arguments by this method is analogous to the handling of those cases by Math.sqrt(double).
- API Note:
- This method corresponds to the squareRoot operation defined in IEEE 754.
- Parameters:
radicand - the argument to have its square root taken
- Returns:
- the square root of the operand
-
fma
Returns the fused multiply add of the three arguments; that is, returns the exact product of the first two arguments summed with the third argument and then rounded once to the nearest Float16. The handling of zeros, NaN, infinities, and other special cases by this method is analogous to the handling of those cases by Math.fma(float, float, float).
- API Note:
- This method corresponds to the fusedMultiplyAdd operation defined in IEEE 754.
- Parameters:
a - a value
b - a value
c - a value
- Returns:
- (a × b + c) computed, as if with unlimited range and precision, and rounded once to the nearest Float16 value
-
negate
Returns the negation of the argument. Special cases:
- If the argument is zero, the result is a zero with the opposite sign as the argument.
- If the argument is infinite, the result is an infinity with the opposite sign as the argument.
- If the argument is a NaN, the result is a NaN.
- API Note:
- This method corresponds to the negate operation defined in IEEE 754.
- Parameters:
f16 - the value to be negated
- Returns:
- the negation of the argument
- See Java Language Specification:
- 15.15.4 Unary Minus Operator
-
-
abs
Returns the absolute value of the argument. The handling of zeros, NaN, and infinities by this method is analogous to the handling of those cases by Math.abs(float).
- Parameters:
f16 - the argument whose absolute value is to be determined
- Returns:
- the absolute value of the argument
-
getExponent
Returns the unbiased exponent used in the representation of a Float16.
- If the argument is NaN or infinite, then the result is MAX_EXPONENT + 1.
- If the argument is zero or subnormal, then the result is MIN_EXPONENT - 1.
- API Note:
- This method is analogous to the logB operation defined in IEEE 754, but returns a different value on subnormal arguments.
- Parameters:
f16 - a Float16 value
- Returns:
- the unbiased exponent of the argument
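Both special cases fall out of simply removing the bias from the 5-bit exponent field. The sketch below works on raw binary16 bit patterns; Binary16Exponent is a hypothetical helper, the bias of 15 is the IEEE 754 binary16 value, and nothing here is claimed to be how Float16.getExponent is actually implemented.

```java
final class Binary16Exponent {
    static final int MAX_EXPONENT = 15;
    static final int MIN_EXPONENT = -14;
    static final int EXPONENT_BIAS = 15;

    /** Returns the unbiased exponent stored in a raw binary16 bit pattern. */
    static int getExponent(short bits) {
        int biased = (bits >> 10) & 0x1F; // the 5 exponent bits
        // biased == 31 (NaN or infinity) yields 16 == MAX_EXPONENT + 1;
        // biased == 0 (zero or subnormal) yields -15 == MIN_EXPONENT - 1.
        return biased - EXPONENT_BIAS;
    }

    public static void main(String[] args) {
        System.out.println(getExponent((short) 0x7BFF)); // largest finite value
        System.out.println(getExponent((short) 0x0400)); // smallest normal value
    }
}
```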
-
ulp
Returns the size of an ulp of the argument. An ulp, unit in the last place, of a Float16 value is the positive distance between this floating-point value and the Float16 value next larger in magnitude. Note that for non-NaN x, ulp(-x) == ulp(x).
Special Cases:
- If the argument is NaN, then the result is NaN.
- If the argument is positive or negative infinity, then the result is positive infinity.
- If the argument is positive or negative zero, then the result is Float16.MIN_VALUE.
- If the argument is ±Float16.MAX_VALUE, then the result is equal to 2^5, 32.0.
- Parameters:
f16 - the floating-point value whose ulp is to be returned
- Returns:
- the size of an ulp of the argument
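For a finite value with unbiased exponent e, the ulp is 2^(e - 10), because 10 fraction bits sit below the leading bit; zero and subnormals share the exponent -14. The sketch below applies that rule to raw bit patterns and returns the result as a double. Binary16Ulp is a hypothetical helper written for illustration, not the library implementation.

```java
final class Binary16Ulp {
    /** Returns the ulp of the binary16 value with the given raw bits, as a double. */
    static double ulp(short bits) {
        int biased = (bits >> 10) & 0x1F;
        int fraction = bits & 0x3FF;
        if (biased == 0x1F) {
            // Infinity has an infinite ulp; NaN propagates.
            return fraction == 0 ? Double.POSITIVE_INFINITY : Double.NaN;
        }
        // Zero and subnormals (biased == 0) use the minimum exponent, -14.
        int exponent = Math.max(biased - 15, -14);
        return Math.scalb(1.0, exponent - 10);
    }

    public static void main(String[] args) {
        System.out.println(ulp((short) 0x7BFF)); // ulp of MAX_VALUE
        System.out.println(ulp((short) 0x3C00)); // ulp of 1.0
    }
}
```

The sign bit is never inspected, which is why ulp(-x) equals ulp(x) for non-NaN x.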
-
nextUp
Returns the floating-point value adjacent to v in the direction of positive infinity.
Special Cases:
- If the argument is NaN, the result is NaN.
- If the argument is positive infinity, the result is positive infinity.
- If the argument is zero, the result is MIN_VALUE
- API Note:
- This method corresponds to the nextUp operation defined in IEEE 754.
- Parameters:
v - starting floating-point value
- Returns:
- The adjacent floating-point value closer to positive infinity.
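Because binary16 bit patterns of the same sign are ordered like integers, stepping to the adjacent value can be sketched as an increment or decrement of the raw bits once the special cases are handled. Binary16Neighbors below is a hypothetical helper operating on raw short bit patterns; it illustrates the documented behavior and is not taken from the Float16 implementation.

```java
final class Binary16Neighbors {
    /** Returns the bit pattern of the binary16 value adjacent to the input, toward positive infinity. */
    static short nextUp(short bits) {
        int b = bits & 0xFFFF;      // treat the pattern as unsigned
        int magnitude = b & 0x7FFF; // drop the sign bit
        if (magnitude > 0x7C00) {
            return bits;            // NaN stays NaN
        }
        if (b == 0x7C00) {
            return bits;            // positive infinity stays positive infinity
        }
        if (magnitude == 0) {
            return 0x0001;          // either zero steps to MIN_VALUE
        }
        // Positive values grow by incrementing; negative values move toward zero by decrementing.
        return (short) ((b & 0x8000) == 0 ? b + 1 : b - 1);
    }

    public static void main(String[] args) {
        System.out.printf("%04x%n", nextUp((short) 0x7BFF) & 0xFFFF); // MAX_VALUE steps to +infinity
        System.out.printf("%04x%n", nextUp((short) 0x8000) & 0xFFFF); // -0.0 steps to MIN_VALUE
    }
}
```

A nextDown sketch would mirror this with the roles of the two signs exchanged.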
-
nextDown
Returns the floating-point value adjacent to v in the direction of negative infinity.
Special Cases:
- If the argument is NaN, the result is NaN.
- If the argument is negative infinity, the result is negative infinity.
- If the argument is zero, the result is -MIN_VALUE
- API Note:
- This method corresponds to the nextDown operation defined in IEEE 754.
- Parameters:
v - starting floating-point value
- Returns:
- The adjacent floating-point value closer to negative infinity.
-
scalb
Returns v × 2^scaleFactor rounded as if performed by a single correctly rounded floating-point multiply. If the exponent of the result is between MIN_EXPONENT and MAX_EXPONENT, the answer is calculated exactly. If the exponent of the result would be larger than Float16.MAX_EXPONENT, an infinity is returned. Note that if the result is subnormal, precision may be lost; that is, when scalb(x, n) is subnormal, scalb(scalb(x, n), -n) may not equal x. When the result is non-NaN, the result has the same sign as v.
Special cases:
- If the first argument is NaN, NaN is returned.
- If the first argument is infinite, then an infinity of the same sign is returned.
- If the first argument is zero, then a zero of the same sign is returned.
- API Note:
- This method corresponds to the scaleB operation defined in IEEE 754.
- Parameters:
v - number to be scaled by a power of two.
scaleFactor - power of 2 used to scale v
- Returns:
- v × 2^scaleFactor
-
copySign
Returns the first floating-point argument with the sign of the second floating-point argument. This method does not require NaN sign arguments to be treated as positive values; implementations are permitted to treat some NaN arguments as positive and other NaN arguments as negative to allow greater performance.
- API Note:
- This method corresponds to the copySign operation defined in IEEE 754.
- Parameters:
magnitude - the parameter providing the magnitude of the result
sign - the parameter providing the sign of the result
- Returns:
- a value with the magnitude of magnitude and the sign of sign.
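In the binary16 layout, copySign, negate, and abs each touch only the top bit. The sketch below expresses them on raw bit patterns; Binary16SignBit is a hypothetical helper, and whether the Float16 methods are implemented this way is an assumption rather than something this page states. A NaN sign argument simply contributes whatever sign bit it carries, which the description above permits.

```java
final class Binary16SignBit {
    static final int SIGN_MASK = 0x8000;

    /** Flips the sign bit, leaving exponent and significand untouched. */
    static short negate(short bits) {
        return (short) (bits ^ SIGN_MASK);
    }

    /** Clears the sign bit. */
    static short abs(short bits) {
        return (short) (bits & ~SIGN_MASK);
    }

    /** Combines the low 15 bits of magnitude with the sign bit of sign. */
    static short copySign(short magnitude, short sign) {
        return (short) ((magnitude & ~SIGN_MASK) | (sign & SIGN_MASK));
    }

    public static void main(String[] args) {
        // 0x3C00 is the pattern of 1.0 and 0x8000 the pattern of -0.0.
        System.out.printf("%04x%n", copySign((short) 0x3C00, (short) 0x8000) & 0xFFFF);
    }
}
```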
-
signum
Returns the signum function of the argument; zero if the argument is zero, 1.0 if the argument is greater than zero, -1.0 if the argument is less than zero.
Special Cases:
- If the argument is NaN, then the result is NaN.
- If the argument is positive zero or negative zero, then the result is the same as the argument.
- Parameters:
f - the floating-point value whose signum is to be returned
- Returns:
- the signum function of the argument
-