NumPy - Data Types
NumPy supports a much greater variety of numerical types than Python does. The table below lists the most commonly used scalar data types defined in NumPy.
Data Type | Description |
---|---|
bool_ | Boolean (True or False) stored as a byte. |
short | 16-bit signed integer; Same as C short. |
intc | 32-bit signed integer; Same as C int. |
int_ | Default integer type, typically a 64-bit signed integer; same as C long in NumPy 1.x and as intp in NumPy 2.x. Unlike Python int, it has a fixed width. |
int8 | Byte (-128 to 127). |
int16 | Integer (-32768 to 32767). |
int32 | Integer (-2147483648 to 2147483647). |
int64 | Integer (-9223372036854775808 to 9223372036854775807). |
uint8 | Unsigned integer (0 to 255). |
uint16 | Unsigned integer (0 to 65535). |
uint32 | Unsigned integer (0 to 4294967295). |
uint64 | Unsigned integer (0 to 18446744073709551615). |
intp | Integer used for indexing, typically the same as ssize_t. |
float_ | Same as float64 (this alias was removed in NumPy 2.0; use float64 instead). |
float16 | Half-precision (16-bit) floating-point number: sign bit, 5-bit exponent, 10-bit mantissa. |
float32 | Single-precision (32-bit) floating-point number: sign bit, 8-bit exponent, 23-bit mantissa. |
float64 | Double-precision (64-bit) floating-point number: sign bit, 11-bit exponent, 52-bit mantissa. |
double | Same as float64; matches Python float and C double. |
complex_ | Same as complex128 (this alias was removed in NumPy 2.0; use complex128 instead). |
complex64 | Complex number, represented by two 32-bit floats (real and imaginary components). |
complex128 | Complex number, represented by two 64-bit floats (real and imaginary components). |
NumPy numerical types are instances of dtype (data-type) objects, each having unique characteristics. Once NumPy is imported using:
import numpy as np
the dtypes are available as np.bool_, np.float32, etc.
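The ranges and sizes listed in the table can be checked from Python. The following is a minimal sketch; the variable name info8 is only illustrative, and np.iinfo, np.finfo and the itemsize attribute are used to query the limits and storage size of a few of the types.

```python
import numpy as np

# Integer limits reported by NumPy for fixed-width integer types.
info8 = np.iinfo(np.int8)
print(info8.min, info8.max)       # -128 127
print(np.iinfo(np.uint16).max)    # 65535

# itemsize is the number of bytes used by one element of the type.
print(np.dtype(np.float32).itemsize)     # 4
print(np.dtype(np.complex128).itemsize)  # 16

# nmant is the number of mantissa bits of a floating-point type.
print(np.finfo(np.float16).nmant)  # 10
print(np.finfo(np.float64).nmant)  # 52
```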
Data Type Objects (dtype)
A data type object describes how the bytes in the fixed-size block of memory corresponding to an array item should be interpreted. It describes the following aspects of the data:
- Type of the data (integer, float, Python object, etc.)
- Size of the data
- Byte order of the data (little-endian or big-endian)
- If the data type is a structured data type, the names of the fields, the data type of each field, and the part of the memory block taken by each field.
- If the data type is a subarray, its shape and data type.
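Each of the aspects listed above can be read back from a dtype object through its attributes. The sketch below is illustrative only; the names dt, point and sub are hypothetical, and the byte order shown assumes a little-endian machine.

```python
import numpy as np

# Type, size and byte order of a simple dtype.
dt = np.dtype('>i4')
print(dt.kind)       # i
print(dt.itemsize)   # 4
print(dt.byteorder)  # '>' on a little-endian machine

# Structured dtype: field names and the byte offset of each field.
point = np.dtype([('x', 'f4'), ('y', 'f4')])
print(point.names)            # ('x', 'y')
print(point.fields['y'][1])   # 4 (offset of field 'y' in bytes)

# Subarray dtype: shape and base data type.
sub = np.dtype(('i2', (2, 3)))
print(sub.shape)  # (2, 3)
print(sub.base)   # int16
```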
The byte order is specified by prefixing < or > to the data type. '<' means that the encoding is little-endian (the least significant byte is stored at the smallest address). '>' means that the encoding is big-endian (the most significant byte is stored at the smallest address).
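To see what the prefix changes in memory, the same value can be stored with both byte orders and its raw bytes compared. This is a small sketch; the names little, big and swapped are chosen here for illustration.

```python
import numpy as np

# Store the integer 1 as a 4-byte value in each byte order.
little = np.array([1], dtype='<i4').tobytes()
big = np.array([1], dtype='>i4').tobytes()
print(little)  # b'\x01\x00\x00\x00'
print(big)     # b'\x00\x00\x00\x01'

# Reading little-endian bytes as big-endian yields a different number.
swapped = np.frombuffer(little, dtype='>i4')[0]
print(swapped)  # 16777216
```

The last line shows why the byte order must match the data: the bytes are unchanged, only their interpretation differs.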
A dtype object is constructed using the following syntax:
numpy.dtype(object, align, copy)
Parameters
Parameter | Description |
---|---|
object | Required. The object to be converted to a data type object. |
align | Optional. If True, adds padding to the fields so that the layout matches that of a C struct. |
copy | Optional. If True, makes a new copy of the data type object. If False, the result may just be a reference to a built-in data type object. |
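The effect of align is easiest to see on a structured type that mixes field sizes. The sketch below uses hypothetical field names flag and count, and assumes a common platform where a 32-bit integer is aligned to 4 bytes.

```python
import numpy as np

fields = [('flag', 'i1'), ('count', 'i4')]

# Without alignment the fields are packed back to back: 1 + 4 bytes.
packed = np.dtype(fields)
print(packed.itemsize)             # 5
print(packed.fields['count'][1])   # 1 (offset of 'count')

# With align=True, padding is inserted as a C compiler would do.
aligned = np.dtype(fields, align=True)
print(aligned.itemsize)            # 8 on typical platforms
print(aligned.fields['count'][1])  # 4 on typical platforms
```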
Example:
The example below shows how to create dtype objects from array-scalar types and from their equivalent type strings.
    import numpy as np

    # using array-scalar types
    dt1 = np.dtype(np.int32)
    print("dt1:", dt1)
    dt2 = np.dtype(np.float32)
    print("dt2:", dt2)

    # int8, int16, int32, int64 can be replaced
    # by the equivalent strings 'i1', 'i2', 'i4', 'i8'
    dt3 = np.dtype('i4')
    print("dt3:", dt3)

    # similarly, float16, float32, float64 can be
    # replaced with the strings 'f2', 'f4', 'f8'
    dt4 = np.dtype('f4')
    print("dt4:", dt4)
The output of the above code will be:
    dt1: int32
    dt2: float32
    dt3: int32
    dt4: float32
Example:
The example below shows how to specify the byte order of a data type using endian notation.
    import numpy as np

    # big-endian 32-bit integer
    dt1 = np.dtype('>i4')
    print("dt1:", dt1)

    # little-endian 32-bit integer
    dt2 = np.dtype('<i4')
    print("dt2:", dt2)
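The output of the above code will be as follows on a little-endian machine, where '<i4' is the native byte order and is therefore displayed as int32:
    dt1: >i4
    dt2: int32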
Example:
In the example below, a structured data type vertex is created with integer fields 'x' and 'y'. It is then applied to an ndarray object Arr.
    import numpy as np

    vertex = np.dtype([('x', '>i4'), ('y', '>i4')])
    Arr = np.array([(10, 20), (10, -20),
                    (-10, 20), (-10, -20)],
                   dtype=vertex)

    # printing data type
    print(vertex)

    # printing (x, y) co-ordinates of Arr
    print("\nArr contains:")
    print(Arr)
The output of the above code will be:
    [('x', '>i4'), ('y', '>i4')]

    Arr contains:
    [( 10, 20) ( 10, -20) (-10, 20) (-10, -20)]
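Once an array uses a structured data type, each field can be read or written by name. The following sketch reuses the vertex type from the example above; the variable xs is introduced here only for illustration.

```python
import numpy as np

vertex = np.dtype([('x', '>i4'), ('y', '>i4')])
Arr = np.array([(10, 20), (10, -20), (-10, 20), (-10, -20)], dtype=vertex)

# Indexing with a field name returns all values of that field.
xs = Arr['x']
print(xs.tolist())   # [10, 10, -10, -10]

# A single element can be indexed first, then the field.
print(Arr[1]['y'])   # -20

# Assigning through a field name modifies the array in place.
Arr['y'][0] = 99
print(Arr[0]['y'])   # 99
```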
Example:
In the next example, a structured data type student is created with a string field 'name', an integer field 'age' and a float field 'marks'. It is then applied to an ndarray object Arr.
    import numpy as np

    student = np.dtype([('name', 'S30'),
                        ('age', 'i4'),
                        ('marks', 'f4')])
    Arr = np.array([('John', 25, 63.5),
                    ('Marry', 24, 75),
                    ('Ramesh', 24, 81),
                    ('Kim', 23, 67.5)],
                   dtype=student)

    # printing Arr
    print("Arr contains:")
    print(Arr)
The output of the above code will be:
    Arr contains:
    [(b'John', 25, 63.5) (b'Marry', 24, 75. ) (b'Ramesh', 24, 81. ) (b'Kim', 23, 67.5)]
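Field names also make it straightforward to filter, aggregate and sort such records. The sketch below reuses the student type from the example above; the names top, names, average, by_age and youngest are hypothetical helpers, and the 'S30' field holds bytes, so values are decoded before printing.

```python
import numpy as np

student = np.dtype([('name', 'S30'), ('age', 'i4'), ('marks', 'f4')])
Arr = np.array([('John', 25, 63.5), ('Marry', 24, 75),
                ('Ramesh', 24, 81), ('Kim', 23, 67.5)], dtype=student)

# Select the records whose 'marks' field exceeds 70.
top = Arr[Arr['marks'] > 70]
names = [n.decode() for n in top['name']]
print(names)    # ['Marry', 'Ramesh']

# Aggregate over a single field.
average = float(Arr['marks'].mean())
print(average)  # 71.75

# Sort the records by the 'age' field.
by_age = np.sort(Arr, order='age')
youngest = by_age[0]['name'].decode()
print(youngest)  # Kim
```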
Each built-in data type has a character code that uniquely identifies it. The first character specifies the kind of data and the remaining characters specify the number of bytes per item, except for Unicode, where it is interpreted as the number of characters. The item size must correspond to an existing type, or an error will be raised. The supported kinds are:
- '?' − boolean
- 'b' − (signed) byte
- 'i' − (signed) integer
- 'u' − unsigned integer
- 'f' − floating-point
- 'c' − complex floating-point
- 'm' − timedelta
- 'M' − datetime
- 'O' − (Python) objects
- 'S', 'a' − zero-terminated bytes ('a' is deprecated as of NumPy 2.0; prefer 'S')
- 'U' − Unicode string
- 'V' − raw data (void)
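The character codes above can be passed directly to np.dtype. The following is a short sketch of a few of them, including an item size that does not correspond to an existing type; the flag rejected is a hypothetical name used only to record that the error occurred.

```python
import numpy as np

# Kind character followed by the number of bytes per item.
print(np.dtype('i2'))    # int16
print(np.dtype('u1'))    # uint8
print(np.dtype('c16'))   # complex128

# For 'S' the number is a byte count; for 'U' it is a character count,
# and each Unicode character occupies 4 bytes.
print(np.dtype('S5').itemsize)  # 5
print(np.dtype('U5').itemsize)  # 20

# The kind attribute returns the kind character of a dtype.
print(np.dtype('M8[D]').kind)   # M

# There is no 3-byte integer type, so this item size is rejected.
try:
    np.dtype('i3')
    rejected = False
except TypeError:
    rejected = True
print(rejected)  # True
```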