hipdf.core.dtypes.ListDtype#


Applies to Linux

class hipdf.core.dtypes.ListDtype(element_type: Any)#

Bases: _BaseDtype

Type to represent list data.

Parameters#

element_type : object

A dtype that represents the type of the elements in the list.

Attributes#

element_type leaf_type

Methods#

from_arrow to_arrow

Examples#

>>> import cudf
>>> list_dtype = cudf.ListDtype("int32")
>>> list_dtype
ListDtype(int32)

A nested list dtype can be created by:

>>> nested_list_dtype = cudf.ListDtype(list_dtype)
>>> nested_list_dtype
ListDtype(ListDtype(int32))
__init__(element_type: Any) -> None#

Methods

__init__(element_type)

construct_array_type()

Return the array type associated with this dtype.

construct_from_string(string)

Construct this type from a string.

deserialize(header, frames)

Generate an object from a serialized representation.

device_deserialize(header, frames)

Perform device-side deserialization tasks.

device_serialize()

Serialize data and metadata associated with device memory.

empty(shape)

Construct an ExtensionArray of this dtype with the given shape.

from_arrow(typ)

Creates a ListDtype from pyarrow.ListType.

host_deserialize(header, frames)

Perform host-side deserialization tasks.

host_serialize()

Serialize data and metadata associated with host memory.

is_dtype(dtype)

Check if we match 'dtype'.

serialize()

Generate an equivalent serializable representation of an object.

to_arrow()

Convert to a pyarrow.ListType.

Attributes

element_type

Returns the element type of the ListDtype.

index_class

The Index subclass to return from Index.__new__ when this dtype is encountered.

itemsize

kind

A character code (one of 'biufcmMOSUV'), default 'O'

leaf_type

Returns the type of the leaf values.

na_value

Default NA value to use for this type.

name

names

Ordered list of field names, or None if there are no fields.

type

The scalar type for the array, e.g. int.

name: str = 'list'#
property element_type: Dtype#

Returns the element type of the ListDtype.

Returns#

Dtype

Examples#

>>> import cudf
>>> deep_nested_type = cudf.ListDtype(cudf.ListDtype(cudf.ListDtype("float32")))
>>> deep_nested_type
ListDtype(ListDtype(ListDtype(float32)))
>>> deep_nested_type.element_type
ListDtype(ListDtype(float32))
>>> deep_nested_type.element_type.element_type
ListDtype(float32)
>>> deep_nested_type.element_type.element_type.element_type
'float32'
property leaf_type#

Returns the type of the leaf values.

Examples#

>>> import cudf
>>> deep_nested_type = cudf.ListDtype(cudf.ListDtype(cudf.ListDtype("float32")))
>>> deep_nested_type
ListDtype(ListDtype(ListDtype(float32)))
>>> deep_nested_type.leaf_type
'float32'
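
The relationship between element_type and leaf_type can be illustrated without a GPU. The sketch below uses a hypothetical stand-in class, NestedListType, which is not part of hipDF or cuDF. It only mimics the two properties described above, under the assumption that leaf_type is found by following element_type until a non-list type is reached.

```python
# Hypothetical stand-in for illustration only; not the hipDF/cuDF class.
class NestedListType:
    def __init__(self, element_type):
        # element_type is either another NestedListType or a plain type name.
        self._element_type = element_type

    @property
    def element_type(self):
        # Peel off exactly one level of nesting.
        return self._element_type

    @property
    def leaf_type(self):
        # Follow element_type until the innermost, non-list type is reached.
        current = self._element_type
        while isinstance(current, NestedListType):
            current = current.element_type
        return current

    def __repr__(self):
        return f"NestedListType({self._element_type!r})"


deep = NestedListType(NestedListType(NestedListType("float32")))
print(deep.element_type)  # one level of nesting removed
print(deep.leaf_type)     # innermost type name
```

In this sketch, element_type removes one level per access, while leaf_type skips straight to the innermost type regardless of nesting depth.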
property type#

The scalar type for the array, e.g. int.

ExtensionArray[item] is expected to return an instance of ExtensionDtype.type for a scalar item, assuming that the value is valid (not NA). NA values do not need to be instances of type.

classmethod from_arrow(typ)#

Creates a ListDtype from pyarrow.ListType.

Parameters#

typ : pyarrow.ListType

The pyarrow.ListType to convert to a ListDtype.

Returns#

obj : ListDtype

Examples#

>>> import cudf
>>> import pyarrow as pa
>>> arrow_type = pa.infer_type([[1]])
>>> arrow_type
ListType(list<item: int64>)
>>> list_dtype = cudf.ListDtype.from_arrow(arrow_type)
>>> list_dtype
ListDtype(int64)
to_arrow()#

Convert to a pyarrow.ListType.

Examples#

>>> import cudf
>>> list_dtype = cudf.ListDtype(cudf.ListDtype("float32"))
>>> list_dtype
ListDtype(ListDtype(float32))
>>> list_dtype.to_arrow()
ListType(list<item: list<item: float>>)
property itemsize#
classmethod construct_array_type() -> type_t[ExtensionArray]#

Return the array type associated with this dtype.

Returns#

type

classmethod construct_from_string(string: str) -> Self#

Construct this type from a string.

This is useful mainly for data types that accept parameters. For example, a period dtype accepts a frequency parameter that can be set as period[h] (where h means hourly frequency).

By default, in the abstract class, only the name of the type is expected. Subclasses can override this method to accept parameters.

Parameters#

string : str

The name of the type, for example category.

Returns#

ExtensionDtype

Instance of the dtype.

Raises#

TypeError

If a class cannot be constructed from this ‘string’.

Examples#

For extension dtypes with arguments the following may be an adequate implementation.

>>> import re
>>> @classmethod
... def construct_from_string(cls, string):
...     pattern = re.compile(r"^my_type\[(?P<arg_name>.+)\]$")
...     match = pattern.match(string)
...     if match:
...         return cls(**match.groupdict())
...     else:
...         raise TypeError(
...             f"Cannot construct a '{cls.__name__}' from '{string}'"
...         )
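
To run the pattern above end to end, the classmethod needs a class to live on. The following is a minimal sketch using a hypothetical MyType class (not a hipDF or pandas class) that takes a single parameter named arg_name. The my_type[...] string format is taken from the regular expression in the example above.

```python
import re


class MyType:
    """Hypothetical parameterized dtype used only to exercise the pattern."""

    # Compiled once; the named group must match the __init__ keyword.
    _pattern = re.compile(r"^my_type\[(?P<arg_name>.+)\]$")

    def __init__(self, arg_name):
        self.arg_name = arg_name

    @classmethod
    def construct_from_string(cls, string):
        match = cls._pattern.match(string)
        if match:
            # groupdict() yields {"arg_name": "..."}, passed as keyword args.
            return cls(**match.groupdict())
        raise TypeError(f"Cannot construct a '{cls.__name__}' from '{string}'")


dtype = MyType.construct_from_string("my_type[hourly]")
print(dtype.arg_name)

try:
    MyType.construct_from_string("other_type")
except TypeError as err:
    print(err)
```

A string that does not match the pattern raises TypeError, which is the behavior described in the Raises section.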
empty(shape: Shape) -> ExtensionArray#

Construct an ExtensionArray of this dtype with the given shape.

Analogous to numpy.empty.

Parameters#

shape : int or tuple[int]

Returns#

ExtensionArray

index_class#

The Index subclass to return from Index.__new__ when this dtype is encountered.

classmethod is_dtype(dtype: object) -> bool#

Check if we match ‘dtype’.

Parameters#

dtype : object

The object to check.

Returns#

bool

Notes#

The default implementation returns True if any of the following holds:

  1. cls.construct_from_string(dtype) is an instance of cls.

  2. dtype is an object and is an instance of cls

  3. dtype has a dtype attribute, and any of the above conditions is true for dtype.dtype.
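
The three conditions can be read as a short decision procedure. The sketch below is an approximation written for illustration, using hypothetical ToyDtype and Holder classes. It is not the hipDF or pandas source, and the real implementation may handle more cases.

```python
class ToyDtype:
    """Hypothetical dtype whose string name is 'toy'."""

    name = "toy"

    @classmethod
    def construct_from_string(cls, string):
        if string == cls.name:
            return cls()
        raise TypeError(f"Cannot construct a '{cls.__name__}' from '{string}'")

    @classmethod
    def is_dtype(cls, dtype):
        # Condition 3: objects that carry a dtype are checked via that dtype.
        dtype = getattr(dtype, "dtype", dtype)
        # Condition 2: an instance of the class matches.
        if isinstance(dtype, cls):
            return True
        # Condition 1: a string matches if the class can be built from it.
        if isinstance(dtype, str):
            try:
                return isinstance(cls.construct_from_string(dtype), cls)
            except TypeError:
                return False
        return False


class Holder:
    """Hypothetical container exposing a dtype attribute."""

    dtype = ToyDtype()


print(ToyDtype.is_dtype("toy"), ToyDtype.is_dtype(ToyDtype()))
print(ToyDtype.is_dtype(Holder()), ToyDtype.is_dtype("int64"))
```

The first three calls satisfy conditions 1, 2, and 3 respectively; the last call satisfies none of them.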

property kind: str#

A character code (one of ‘biufcmMOSUV’), default ‘O’

This should match the NumPy dtype used when the array is converted to an ndarray, which is probably ‘O’ for object if the extension type cannot be represented as a built-in NumPy type.

See Also#

numpy.dtype.kind

property na_value: object#

Default NA value to use for this type.

This is used in, for example, ExtensionArray.take. It should be the user-facing "boxed" version of the NA value, not the physical NA value used for storage. For JSONArray, for instance, this is an empty dictionary.

property names: list[str] | None#

Ordered list of field names, or None if there are no fields.

This is for compatibility with NumPy arrays, and may be removed in the future.