hipdf.Index#
- class hipdf.Index(data=None, dtype=None, copy=False, name=<no_default>, tupleize_cols=True, nan_as_null=True, **kwargs)#
Bases: BaseIndex
The basic object storing row labels for all cuDF objects.
Parameters#
- data : array-like (1-dimensional) / DataFrame
If data is a DataFrame, a MultiIndex is returned.
- dtype : NumPy dtype (default: object)
If dtype is None, we find the dtype that best fits the data.
- copy : bool
Make a copy of input data.
- name : object
Name to be stored in the index.
- tupleize_cols : bool (default: True)
When True, attempt to create a MultiIndex if possible. tupleize_cols == False is not yet supported.
- nan_as_null : bool, default True
If None/True, converts np.nan values to null values. If False, leaves np.nan values as is.
Returns#
- Index
cudf Index
Warnings#
This class should not be subclassed. It is designed as a factory for different subclasses of BaseIndex depending on the provided input. If you absolutely must, and if you’re intimately familiar with the internals of cuDF, subclass BaseIndex instead.
Examples#
>>> import cudf
>>> cudf.Index([1, 2, 3], dtype="uint64", name="a")
UInt64Index([1, 2, 3], dtype='uint64', name='a')
- __init__()#
Methods
__init__()
any() : Return whether any element is True in Index.
append(other) : Append a collection of Index objects together.
argsort(*args, **kwargs) : Return the integer indices that would sort the index.
astype(dtype[, copy]) : Create an Index with values cast to dtypes.
copy([deep])
deserialize(header, frames) : Generate an object from a serialized representation.
device_deserialize(header, frames) : Perform device-side deserialization tasks.
device_serialize() : Serialize data and metadata associated with device memory.
difference(other[, sort]) : Return a new Index with elements from the index that are not in other.
drop_duplicates([keep, nulls_are_equal]) : Drop duplicate rows in index.
dropna([how]) : Drop null rows from Index.
duplicated([keep]) : Indicate duplicate index values.
equals(other) : Determine if two Index objects contain the same elements.
factorize([sort, na_sentinel, use_na_sentinel])
fillna(value[, downcast]) : Fill null values with the specified value.
find_label_range(loc) : Translate a label-based slice to an index-based slice.
from_arrow(obj)
from_pandas(index[, nan_as_null]) : Convert from a Pandas Index.
get_level_values(level) : Return an Index of values for requested level.
get_loc(key[, method, tolerance])
get_slice_bound(label, side[, kind]) : Calculate slice bound that corresponds to given label.
host_deserialize(header, frames) : Perform device-side deserialization tasks.
host_serialize() : Serialize data and metadata associated with host memory.
intersection(other[, sort]) : Form the intersection of two Index objects.
is_boolean() : Check if the Index only consists of booleans.
is_categorical() : Check if the Index holds categorical data.
is_floating() : Check if the Index is a floating type.
is_integer() : Check if the Index only consists of integers.
is_interval() : Check if the Index holds Interval objects.
is_numeric() : Check if the Index only consists of numeric data.
is_object() : Check if the Index is of the object dtype.
isin(values) : Return a boolean array where the index values are in values.
isna() : Detect missing values.
join(other[, how, level, return_indexers, sort]) : Compute join_index and indexers to conform data structures to the new index.
max() : The maximum value of the index.
memory_usage([deep]) : Return the memory usage of an object.
min() : The minimum value of the index.
notna() : Detect existing (non-missing) values.
rename(name[, inplace]) : Alter Index name.
repeat(repeats[, axis]) : Repeat elements of an Index.
searchsorted(value[, side, ascending, ...]) : Find index where elements should be inserted to maintain order.
serialize() : Generate an equivalent serializable representation of an object.
set_names(names[, level, inplace]) : Set Index or MultiIndex name.
shift([periods, freq]) : Not yet implemented.
sort_values([return_indexer, ascending, ...]) : Return a sorted copy of the index, and optionally return the indices that sorted the index itself.
take(indices[, axis, allow_fill, fill_value]) : Return a new index containing the rows specified by indices.
to_arrow() : Convert to a suitable Arrow object.
to_cupy() : Convert to a cupy array.
to_dlpack() : Converts a cuDF object into a DLPack tensor.
to_frame([index, name]) : Create a DataFrame with a column containing this Index.
to_list()
to_numpy() : Convert to a numpy array.
to_pandas([nullable]) : Convert to a Pandas Index.
to_series([index, name]) : Create a Series with both index and values equal to the index keys.
tolist()
union(other[, sort]) : Form the union of two Index objects.
unique() : Return unique values in the index.
where(cond[, other, inplace]) : Replace values where the condition is False.
Attributes
hasnans : Return True if there are any NaNs or nulls.
is_monotonic : Return boolean if values in the object are monotonically increasing.
is_monotonic_decreasing : Return boolean if values in the object are monotonically decreasing.
is_monotonic_increasing : Return boolean if values in the object are monotonically increasing.
is_unique : Return if the index has unique values.
name : Returns the name of the Index.
names : Returns a tuple containing the name of the Index.
ndim : Number of dimensions of the underlying data, by definition 1.
nlevels : Number of levels.
shape : Get a tuple representing the dimensionality of the data.
str : Not yet implemented.
- classmethod from_arrow(obj)#
- property is_monotonic_increasing#
Return boolean if values in the object are monotonically increasing.
Returns#
bool
- __getitem__(key)#
- any()#
Return whether any element is True in Index.
- append(other)#
Append a collection of Index objects together.
Parameters#
other : Index or list/tuple of indices
Returns#
appended : Index
Examples#
>>> import cudf
>>> idx = cudf.Index([1, 2, 10, 100])
>>> idx
Int64Index([1, 2, 10, 100], dtype='int64')
>>> other = cudf.Index([200, 400, 50])
>>> other
Int64Index([200, 400, 50], dtype='int64')
>>> idx.append(other)
Int64Index([1, 2, 10, 100, 200, 400, 50], dtype='int64')
append also accepts a list of Index objects:
>>> idx.append([other, other])
Int64Index([1, 2, 10, 100, 200, 400, 50, 200, 400, 50], dtype='int64')
- argsort(*args, **kwargs)#
Return the integer indices that would sort the index.
Parameters vary by subclass.
- astype(dtype, copy: bool = True)#
Create an Index with values cast to dtypes.
The class of a new Index is determined by dtype. When conversion is impossible, a ValueError exception is raised.
Parameters#
- dtype : numpy.dtype
Use a numpy.dtype to cast the entire Index object to.
- copy : bool, default True
By default, astype always returns a newly allocated object. If copy is set to False and internal requirements on dtype are satisfied, the original data is used to create a new Index or the original Index is returned.
Returns#
- Index
Index with values cast to specified dtype.
Examples#
>>> import cudf
>>> index = cudf.Index([1, 2, 3])
>>> index
Int64Index([1, 2, 3], dtype='int64')
>>> index.astype('float64')
Float64Index([1.0, 2.0, 3.0], dtype='float64')
- dtype
- difference(other, sort=None)#
Return a new Index with elements from the index that are not in other.
This is the set difference of two Index objects.
Parameters#
other : Index or array-like
sort : False or None, default None
Whether to sort the resulting index. By default, cudf attempts to sort the values and catches any TypeError raised by incomparable elements.
None : Attempt to sort the result, but catch any TypeErrors from comparing incomparable elements.
False : Do not sort the result.
Returns#
difference : Index
Examples#
>>> import cudf
>>> idx1 = cudf.Index([2, 1, 3, 4])
>>> idx1
Int64Index([2, 1, 3, 4], dtype='int64')
>>> idx2 = cudf.Index([3, 4, 5, 6])
>>> idx2
Int64Index([3, 4, 5, 6], dtype='int64')
>>> idx1.difference(idx2)
Int64Index([1, 2], dtype='int64')
>>> idx1.difference(idx2, sort=False)
Int64Index([2, 1], dtype='int64')
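The effect of the sort parameter can be illustrated without a GPU. The following is a pure-Python sketch of the documented semantics only, not the library's implementation; the helper name index_difference is hypothetical, and dropping repeated labels from the result is an assumption of this sketch.

```python
def index_difference(left, right, sort=None):
    """Hypothetical pure-Python model of the documented difference semantics."""
    exclude = set(right)
    seen = set()
    result = []
    for value in left:
        # Keep the first occurrence of each label that is absent from `right`.
        if value not in exclude and value not in seen:
            seen.add(value)
            result.append(value)
    if sort is None:
        try:
            result = sorted(result)
        except TypeError:
            # Incomparable elements: fall back to encounter order.
            pass
    return result


print(index_difference([2, 1, 3, 4], [3, 4, 5, 6]))              # [1, 2]
print(index_difference([2, 1, 3, 4], [3, 4, 5, 6], sort=False))  # [2, 1]
```

With sort=False the surviving labels keep the order in which they appear in the calling index, which matches the second result in the example above.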
- drop_duplicates(keep='first', nulls_are_equal=True)#
Drop duplicate rows in index.
- keep : {“first”, “last”, False}, default “first”
‘first’ : Drop duplicates except for the first occurrence.
‘last’ : Drop duplicates except for the last occurrence.
False : Drop all duplicates.
- nulls_are_equal : bool, default True
Null elements are considered equal to other null elements.
- dropna(how='any')#
Drop null rows from Index.
- how : {“any”, “all”}, default “any”
Specifies how to decide whether to drop a row. “any” (default) drops rows containing at least one null value. “all” drops only rows containing all null values.
- property dtype#
- duplicated(keep='first')#
Indicate duplicate index values.
Duplicated values are indicated as True values in the resulting array. Either all duplicates, all except the first, or all except the last occurrence of duplicates can be indicated.
Parameters#
- keep : {‘first’, ‘last’, False}, default ‘first’
The value or values in a set of duplicates to mark as missing.
'first' : Mark duplicates as True except for the first occurrence.
'last' : Mark duplicates as True except for the last occurrence.
False : Mark all duplicates as True.
Returns#
cupy.ndarray[bool]
See Also#
Series.duplicated : Equivalent method on cudf.Series.
DataFrame.duplicated : Equivalent method on cudf.DataFrame.
Index.drop_duplicates : Remove duplicate values from Index.
Examples#
By default, for each set of duplicated values, the first occurrence is set to False and all others to True:
>>> import cudf
>>> idx = cudf.Index(['lama', 'cow', 'lama', 'beetle', 'lama'])
>>> idx.duplicated()
array([False, False,  True, False,  True])
which is equivalent to
>>> idx.duplicated(keep='first')
array([False, False,  True, False,  True])
By using ‘last’, the last occurrence of each set of duplicated values is set to False and all others to True:
>>> idx.duplicated(keep='last')
array([ True, False,  True, False, False])
By setting keep to False, all duplicates are True:
>>> idx.duplicated(keep=False)
array([ True, False,  True, False,  True])
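The three keep modes differ only in which occurrence of a repeated label is left unmarked. The sketch below models that rule in plain Python; it is an illustration of the documented behaviour rather than the library's code, and the function name duplicated_mask is hypothetical.

```python
def duplicated_mask(values, keep="first"):
    """Hypothetical model: True marks an entry that is reported as a duplicate."""
    if keep is False:
        # Every member of a repeated group is marked.
        counts = {}
        for value in values:
            counts[value] = counts.get(value, 0) + 1
        return [counts[value] > 1 for value in values]
    if keep == "first":
        order = range(len(values))               # scan left to right
    elif keep == "last":
        order = range(len(values) - 1, -1, -1)   # scan right to left
    else:
        raise ValueError("keep must be 'first', 'last' or False")
    seen = set()
    mask = [False] * len(values)
    for i in order:
        if values[i] in seen:
            mask[i] = True
        else:
            seen.add(values[i])
    return mask


labels = ["lama", "cow", "lama", "beetle", "lama"]
print(duplicated_mask(labels))               # [False, False, True, False, True]
print(duplicated_mask(labels, keep="last"))  # [True, False, True, False, False]
print(duplicated_mask(labels, keep=False))   # [True, False, True, False, True]
```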
- property empty#
- equals(other)#
Determine if two Index objects contain the same elements.
Returns#
- out: bool
True if “other” is an Index and it has the same elements as calling index; False otherwise.
- factorize(sort=False, na_sentinel=None, use_na_sentinel=None)#
- fillna(value, downcast=None)#
Fill null values with the specified value.
Parameters#
- value : scalar
Scalar value to use to fill nulls. This value cannot be list-like.
- downcast : dict, default None
This parameter is currently NON-FUNCTIONAL.
Returns#
filled : Index
Examples#
>>> import cudf
>>> index = cudf.Index([1, 2, None, 4])
>>> index
Int64Index([1, 2, <NA>, 4], dtype='int64')
>>> index.fillna(3)
Int64Index([1, 2, 3, 4], dtype='int64')
- find_label_range(loc: slice) -> slice#
Translate a label-based slice to an index-based slice.
Parameters#
- loc : slice
Label-based slice to search for.
Notes#
As with all label-based searches, the slice is right-closed.
Returns#
New slice translated into integer indices of the index (right-open).
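The translation from a right-closed label slice to a right-open positional slice can be sketched with the standard bisect module. This is a minimal model that assumes an ascending, sorted list of labels and ignores the slice step; the function name label_range is hypothetical and the code is not the library's implementation.

```python
from bisect import bisect_left, bisect_right


def label_range(labels, loc):
    """Hypothetical model: map a closed label slice onto half-open positions.

    `labels` is assumed to be sorted in ascending order; `loc.step` is ignored.
    """
    start = 0 if loc.start is None else bisect_left(labels, loc.start)
    # bisect_right lands one past the last matching label, so the stop label
    # itself stays inside the resulting half-open positional slice.
    stop = len(labels) if loc.stop is None else bisect_right(labels, loc.stop)
    return slice(start, stop)


labels = [10, 20, 30, 40]
positions = label_range(labels, slice(20, 30))
print(positions, labels[positions])  # slice(1, 3, None) [20, 30]
```

The label 30 is included in the result even though position 3 is excluded, which is the closed-versus-open distinction described in the notes above.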
- classmethod from_pandas(index, nan_as_null=<no_default>)#
Convert from a Pandas Index.
Parameters#
- index : Pandas Index object
A Pandas Index object which has to be converted to cuDF Index.
- nan_as_null : bool, default None
If None/True, converts np.nan values to null values. If False, leaves np.nan values as is.
Raises#
TypeError for invalid input type.
Examples#
>>> import cudf
>>> import pandas as pd
>>> import numpy as np
>>> data = [10, 20, 30, np.nan]
>>> pdi = pd.Index(data)
>>> cudf.Index.from_pandas(pdi)
Float64Index([10.0, 20.0, 30.0, <NA>], dtype='float64')
>>> cudf.Index.from_pandas(pdi, nan_as_null=False)
Float64Index([10.0, 20.0, 30.0, nan], dtype='float64')
- get_level_values(level)#
Return an Index of values for requested level.
This is primarily useful to get an individual level of values from a MultiIndex, but is provided on Index as well for compatibility.
Parameters#
- level : int or str
It is either the integer position or the name of the level.
Returns#
- Index
Calling object, as there is only one level in the Index.
See Also#
cudf.MultiIndex.get_level_values : Get values for a level of a MultiIndex.
Notes#
For Index, level should be 0, since there are no multiple levels.
Examples#
>>> import cudf
>>> idx = cudf.Index(["a", "b", "c"])
>>> idx.get_level_values(0)
StringIndex(['a' 'b' 'c'], dtype='object')
- get_loc(key, method=None, tolerance=None)#
- get_slice_bound(label, side: str, kind=None) -> int#
Calculate slice bound that corresponds to given label. Returns leftmost (one-past-the-rightmost if side=='right') position of given label.
Parameters#
label : object
side : {‘left’, ‘right’}
kind : {‘ix’, ‘loc’, ‘getitem’}
Returns#
- int
Index of label.
- property has_duplicates#
- property hasnans#
Return True if there are any NaNs or nulls.
Returns#
- out : bool
True if the Index has at least one NaN or null value, False otherwise.
Examples#
>>> import cudf
>>> import numpy as np
>>> index = cudf.Index([1, 2, np.nan, 3, 4], nan_as_null=False)
>>> index
Float64Index([1.0, 2.0, nan, 3.0, 4.0], dtype='float64')
>>> index.hasnans
True
hasnans returns True for the presence of any NA values:
>>> index = cudf.Index([1, 2, None, 3, 4])
>>> index
Int64Index([1, 2, <NA>, 3, 4], dtype='int64')
>>> index.hasnans
True
- intersection(other, sort=False)#
Form the intersection of two Index objects.
This returns a new Index with elements common to the index and other.
Parameters#
other : Index or array-like
sort : False or None, default False
Whether to sort the resulting index.
False : Do not sort the result.
None : Sort the result, except when self and other are equal or when the values cannot be compared.
Returns#
intersection : Index
Examples#
>>> import cudf
>>> import pandas as pd
>>> idx1 = cudf.Index([1, 2, 3, 4])
>>> idx2 = cudf.Index([3, 4, 5, 6])
>>> idx1.intersection(idx2)
Int64Index([3, 4], dtype='int64')
MultiIndex case
>>> idx1 = cudf.MultiIndex.from_pandas(
...     pd.MultiIndex.from_arrays(
...         [[1, 1, 3, 4], ["Red", "Blue", "Red", "Blue"]]
...     )
... )
>>> idx2 = cudf.MultiIndex.from_pandas(
...     pd.MultiIndex.from_arrays(
...         [[1, 1, 2, 2], ["Red", "Blue", "Red", "Blue"]]
...     )
... )
>>> idx1
MultiIndex([(1,  'Red'),
            (1, 'Blue'),
            (3,  'Red'),
            (4, 'Blue')],
           )
>>> idx2
MultiIndex([(1,  'Red'),
            (1, 'Blue'),
            (2,  'Red'),
            (2, 'Blue')],
           )
>>> idx1.intersection(idx2)
MultiIndex([(1,  'Red'),
            (1, 'Blue')],
           )
>>> idx1.intersection(idx2, sort=False)
MultiIndex([(1,  'Red'),
            (1, 'Blue')],
           )
- is_boolean()#
Check if the Index only consists of booleans.
Deprecated since version 23.04: Use cudf.api.types.is_bool_dtype instead.
Returns#
- bool
Whether or not the Index only consists of booleans.
See Also#
is_integer : Check if the Index only consists of integers.
is_floating : Check if the Index is a floating type.
is_numeric : Check if the Index only consists of numeric data.
is_object : Check if the Index is of the object dtype.
is_categorical : Check if the Index holds categorical data.
is_interval : Check if the Index holds Interval objects.
Examples#
>>> import cudf
>>> idx = cudf.Index([True, False, True])
>>> idx.is_boolean()
True
>>> idx = cudf.Index(["True", "False", "True"])
>>> idx.is_boolean()
False
>>> idx = cudf.Index([1, 2, 3])
>>> idx.is_boolean()
False
- is_categorical()#
Check if the Index holds categorical data.
Deprecated since version 23.04: Use cudf.api.types.is_categorical_dtype instead.
Returns#
- bool
True if the Index is categorical.
See Also#
CategoricalIndex : Index for categorical data.
is_boolean : Check if the Index only consists of booleans.
is_integer : Check if the Index only consists of integers.
is_floating : Check if the Index is a floating type.
is_numeric : Check if the Index only consists of numeric data.
is_object : Check if the Index is of the object dtype.
is_interval : Check if the Index holds Interval objects.
Examples#
>>> import cudf
>>> idx = cudf.Index(["Watermelon", "Orange", "Apple",
...                   "Watermelon"]).astype("category")
>>> idx.is_categorical()
True
>>> idx = cudf.Index([1, 3, 5, 7])
>>> idx.is_categorical()
False
>>> s = cudf.Series(["Peter", "Victor", "Elisabeth", "Mar"])
>>> s
0        Peter
1       Victor
2    Elisabeth
3          Mar
dtype: object
>>> s.index.is_categorical()
False
- is_floating()#
Check if the Index is a floating type.
The Index may consist of only floats, NaNs, or a mix of floats, integers, or NaNs.
Deprecated since version 23.04: Use cudf.api.types.is_float_dtype instead.
Returns#
- bool
Whether or not the Index consists only of floats, NaNs, or a mix of floats, integers, or NaNs.
See Also#
is_boolean : Check if the Index only consists of booleans.
is_integer : Check if the Index only consists of integers.
is_numeric : Check if the Index only consists of numeric data.
is_object : Check if the Index is of the object dtype.
is_categorical : Check if the Index holds categorical data.
is_interval : Check if the Index holds Interval objects.
Examples#
>>> import cudf
>>> import numpy as np
>>> idx = cudf.Index([1.0, 2.0, 3.0, 4.0])
>>> idx.is_floating()
True
>>> idx = cudf.Index([1.0, 2.0, np.nan, 4.0])
>>> idx.is_floating()
True
>>> idx = cudf.Index([1, 2, 3, 4, np.nan], nan_as_null=False)
>>> idx.is_floating()
True
>>> idx = cudf.Index([1, 2, 3, 4])
>>> idx.is_floating()
False
- is_integer()#
Check if the Index only consists of integers.
Deprecated since version 23.04: Use cudf.api.types.is_integer_dtype instead.
Returns#
- bool
Whether or not the Index only consists of integers.
See Also#
is_boolean : Check if the Index only consists of booleans.
is_floating : Check if the Index is a floating type.
is_numeric : Check if the Index only consists of numeric data.
is_object : Check if the Index is of the object dtype.
is_categorical : Check if the Index holds categorical data.
is_interval : Check if the Index holds Interval objects.
Examples#
>>> import cudf
>>> idx = cudf.Index([1, 2, 3, 4])
>>> idx.is_integer()
True
>>> idx = cudf.Index([1.0, 2.0, 3.0, 4.0])
>>> idx.is_integer()
False
>>> idx = cudf.Index(["Apple", "Mango", "Watermelon"])
>>> idx.is_integer()
False
- is_interval()#
Check if the Index holds Interval objects.
Deprecated since version 23.04: Use cudf.api.types.is_interval_dtype instead.
Returns#
- bool
Whether or not the Index holds Interval objects.
See Also#
IntervalIndex : Index for Interval objects.
is_boolean : Check if the Index only consists of booleans.
is_integer : Check if the Index only consists of integers.
is_floating : Check if the Index is a floating type.
is_numeric : Check if the Index only consists of numeric data.
is_object : Check if the Index is of the object dtype.
is_categorical : Check if the Index holds categorical data.
Examples#
>>> import cudf
>>> import pandas as pd
>>> idx = cudf.from_pandas(
...     pd.Index([pd.Interval(left=0, right=5),
...               pd.Interval(left=5, right=10)])
... )
>>> idx.is_interval()
True
>>> idx = cudf.Index([1, 3, 5, 7])
>>> idx.is_interval()
False
- property is_monotonic#
Return boolean if values in the object are monotonically increasing.
This property is an alias for is_monotonic_increasing.
Returns#
bool
- property is_monotonic_decreasing#
Return boolean if values in the object are monotonically decreasing.
Returns#
bool
- is_numeric()#
Check if the Index only consists of numeric data.
Deprecated since version 23.04: Use cudf.api.types.is_any_real_numeric_dtype instead.
Returns#
- bool
Whether or not the Index only consists of numeric data.
See Also#
is_boolean : Check if the Index only consists of booleans.
is_integer : Check if the Index only consists of integers.
is_floating : Check if the Index is a floating type.
is_object : Check if the Index is of the object dtype.
is_categorical : Check if the Index holds categorical data.
is_interval : Check if the Index holds Interval objects.
Examples#
>>> import cudf
>>> import numpy as np
>>> idx = cudf.Index([1.0, 2.0, 3.0, 4.0])
>>> idx.is_numeric()
True
>>> idx = cudf.Index([1, 2, 3, 4.0])
>>> idx.is_numeric()
True
>>> idx = cudf.Index([1, 2, 3, 4])
>>> idx.is_numeric()
True
>>> idx = cudf.Index([1, 2, 3, 4.0, np.nan])
>>> idx.is_numeric()
True
>>> idx = cudf.Index(["Apple", "cold"])
>>> idx.is_numeric()
False
- is_object()#
Check if the Index is of the object dtype.
Deprecated since version 23.04: Use cudf.api.types.is_object_dtype instead.
Returns#
- bool
Whether or not the Index is of the object dtype.
See Also#
is_boolean : Check if the Index only consists of booleans.
is_integer : Check if the Index only consists of integers.
is_floating : Check if the Index is a floating type.
is_numeric : Check if the Index only consists of numeric data.
is_categorical : Check if the Index holds categorical data.
is_interval : Check if the Index holds Interval objects.
Examples#
>>> import cudf
>>> idx = cudf.Index(["Apple", "Mango", "Watermelon"])
>>> idx.is_object()
True
>>> idx = cudf.Index(["Watermelon", "Orange", "Apple",
...                   "Watermelon"]).astype("category")
>>> idx.is_object()
False
>>> idx = cudf.Index([1.0, 2.0, 3.0, 4.0])
>>> idx.is_object()
False
- property is_unique#
Return if the index has unique values.
- isin(values)#
Return a boolean array where the index values are in values.
Compute boolean array of whether each index value is found in the passed set of values. The length of the returned boolean array matches the length of the index.
Parameters#
- values : set, list-like, Index
Sought values.
Returns#
- is_contained : cupy array
CuPy array of boolean values.
Examples#
>>> import cudf
>>> idx = cudf.Index([1, 2, 3])
>>> idx
Int64Index([1, 2, 3], dtype='int64')
Check whether each index value is in a list of values.
>>> idx.isin([1, 4])
array([ True, False, False])
- isna()#
Detect missing values.
Return a boolean same-sized object indicating if the values are NA. NA values, such as None, numpy.NAN or cudf.NA, get mapped to True values. Everything else gets mapped to False values.
Returns#
- numpy.ndarray[bool]
A boolean array to indicate which entries are NA.
- join(other, how='left', level=None, return_indexers=False, sort=False)#
Compute join_index and indexers to conform data structures to the new index.
Parameters#
other : Index
how : {‘left’, ‘right’, ‘inner’, ‘outer’}
return_indexers : bool, default False
sort : bool, default False
Sort the join keys lexicographically in the result Index. If False, the order of the join keys depends on the join type (how keyword).
Returns: index
Examples#
>>> import cudf
>>> lhs = cudf.DataFrame({
...     "a": [2, 3, 1],
...     "b": [3, 4, 2],
... }).set_index(['a', 'b']).index
>>> lhs
MultiIndex([(2, 3),
            (3, 4),
            (1, 2)],
           names=['a', 'b'])
>>> rhs = cudf.DataFrame({"a": [1, 4, 3]}).set_index('a').index
>>> rhs
Int64Index([1, 4, 3], dtype='int64', name='a')
>>> lhs.join(rhs, how='inner')
MultiIndex([(3, 4),
            (1, 2)],
           names=['a', 'b'])
- max()#
The maximum value of the index.
- memory_usage(deep=False)#
Return the memory usage of an object.
Parameters#
- deep : bool
The deep parameter is ignored and is only included for pandas compatibility.
Returns#
The total bytes used.
- min()#
The minimum value of the index.
- property name#
Returns the name of the Index.
- property names#
Returns a tuple containing the name of the Index.
- property ndim#
Number of dimensions of the underlying data, by definition 1.
- property nlevels#
Number of levels.
- notna()#
Detect existing (non-missing) values.
Return a boolean same-sized object indicating if the values are not NA. Non-missing values get mapped to True. NA values, such as None or numpy.NAN, get mapped to False values.
Returns#
- numpy.ndarray[bool]
A boolean array to indicate which entries are not NA.
- rename(name, inplace=False)#
Alter Index name.
Defaults to returning new index.
Parameters#
- name : label
Name(s) to set.
Returns#
Index
Examples#
>>> import cudf
>>> index = cudf.Index([1, 2, 3], name='one')
>>> index
Int64Index([1, 2, 3], dtype='int64', name='one')
>>> index.name
'one'
>>> renamed_index = index.rename('two')
>>> renamed_index
Int64Index([1, 2, 3], dtype='int64', name='two')
>>> renamed_index.name
'two'
- repeat(repeats, axis=None)#
Repeat elements of an Index.
Returns a new Index where each element of the current Index is repeated consecutively a given number of times.
Parameters#
- repeats : int, or array of ints
The number of repetitions for each element. This should be a non-negative integer. Repeating 0 times will return an empty object.
Returns#
- Index
A newly created object of same type as caller with repeated elements.
Examples#
>>> import cudf
>>> index = cudf.Index([10, 22, 33, 55])
>>> index
Int64Index([10, 22, 33, 55], dtype='int64')
>>> index.repeat(5)
Int64Index([10, 10, 10, 10, 10, 22, 22, 22, 22, 22, 33,
            33, 33, 33, 33, 55, 55, 55, 55, 55],
           dtype='int64')
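The repeats parameter accepts either a single count or one count per element. The sketch below models both forms on plain Python lists; it is an illustration under the assumption that an array of counts must match the index length, and the helper name repeat_elements is hypothetical.

```python
def repeat_elements(values, repeats):
    """Hypothetical model of consecutive repetition with int or per-element counts."""
    if isinstance(repeats, int):
        counts = [repeats] * len(values)
    else:
        counts = list(repeats)
        if len(counts) != len(values):
            # Assumption of this sketch: one count per element is required.
            raise ValueError("repeats must have one entry per element")
    if any(count < 0 for count in counts):
        raise ValueError("repeats must be non-negative")
    # Each element is emitted `count` times before moving to the next one.
    return [value for value, count in zip(values, counts) for _ in range(count)]


print(repeat_elements([10, 22], 2))             # [10, 10, 22, 22]
print(repeat_elements([10, 22, 33], [1, 0, 2])) # [10, 33, 33]
print(repeat_elements([10, 22], 0))             # []
```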
- searchsorted(value, side: str = 'left', ascending: bool = True, na_position: str = 'last')#
Find index where elements should be inserted to maintain order.
Parameters#
- value :
Value to be hypothetically inserted into self.
- side : str {‘left’, ‘right’}, optional, default ‘left’
If ‘left’, the index of the first suitable location found is given. If ‘right’, return the last such index.
- ascending : bool, optional, default True
Index is in ascending order (otherwise descending).
- na_position : str {‘last’, ‘first’}, optional, default ‘last’
Position of null values in sorted order.
Returns#
Insertion point.
Notes#
As a precondition the index must be sorted in the same order as requested by the ascending flag.
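How side and ascending interact can be modelled with the standard bisect module. The sketch below is an illustration on plain Python lists, not the library's implementation; the function name insertion_point is hypothetical and null handling (na_position) is left out.

```python
from bisect import bisect_left, bisect_right


def insertion_point(values, value, side="left", ascending=True):
    """Hypothetical model; `values` must already be sorted as `ascending` states."""
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    if ascending:
        find = bisect_left if side == "left" else bisect_right
        return find(values, value)
    # Descending input: search the mirrored (ascending) list and mirror the
    # answer back, which also swaps the roles of left and right.
    mirrored = values[::-1]
    find = bisect_right if side == "left" else bisect_left
    return len(values) - find(mirrored, value)


print(insertion_point([10, 20, 30, 40], 30))                                 # 2
print(insertion_point([10, 20, 30, 40], 30, side="right"))                   # 3
print(insertion_point([40, 30, 20, 10], 30, ascending=False))                # 1
print(insertion_point([40, 30, 20, 10], 30, side="right", ascending=False))  # 2
```

If the input is not sorted in the order the ascending flag claims, the returned position is meaningless, which is why the precondition above matters.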
- set_names(names, level=None, inplace=False)#
Set Index or MultiIndex name. Able to set new names partially and by level.
Parameters#
- names : label or list of label
Name(s) to set.
- level : int, label or list of int or label, optional
If the index is a MultiIndex, level(s) to set (None for all levels). Otherwise level must be None.
- inplace : bool, default False
Modifies the object directly, instead of creating a new Index or MultiIndex.
Returns#
- Index
The same type as the caller or None if inplace is True.
See Also#
cudf.Index.rename : Able to set new names without level.
Examples#
>>> import cudf
>>> idx = cudf.Index([1, 2, 3, 4])
>>> idx
Int64Index([1, 2, 3, 4], dtype='int64')
>>> idx.set_names('quarter')
Int64Index([1, 2, 3, 4], dtype='int64', name='quarter')
>>> idx = cudf.MultiIndex.from_product([['python', 'cobra'],
...                                     [2018, 2019]])
>>> idx
MultiIndex([('python', 2018),
            ('python', 2019),
            ( 'cobra', 2018),
            ( 'cobra', 2019)],
           )
>>> idx.names
FrozenList([None, None])
>>> idx.set_names(['kind', 'year'], inplace=True)
>>> idx.names
FrozenList(['kind', 'year'])
>>> idx.set_names('species', level=0, inplace=True)
>>> idx.names
FrozenList(['species', 'year'])
- property shape#
Get a tuple representing the dimensionality of the data.
- shift(periods=1, freq=None)#
Not yet implemented
- property size#
- sort_values(return_indexer=False, ascending=True, na_position='last', key=None)#
Return a sorted copy of the index, and optionally return the indices that sorted the index itself.
Parameters#
- return_indexer : bool, default False
Should the indices that would sort the index be returned.
- ascending : bool, default True
Should the index values be sorted in an ascending order.
- na_position : {‘first’ or ‘last’}, default ‘last’
Argument ‘first’ puts NaNs at the beginning, ‘last’ puts NaNs at the end.
- key : None, optional
This parameter is NON-FUNCTIONAL.
Returns#
- sorted_index : Index
Sorted copy of the index.
- indexer : cupy.ndarray, optional
The indices that the index itself was sorted by.
See Also#
cudf.Series.sort_values : Sort values of a Series.
cudf.DataFrame.sort_values : Sort values in a DataFrame.
Examples#
>>> import cudf
>>> idx = cudf.Index([10, 100, 1, 1000])
>>> idx
Int64Index([10, 100, 1, 1000], dtype='int64')
Sort values in ascending order (default behavior).
>>> idx.sort_values()
Int64Index([1, 10, 100, 1000], dtype='int64')
Sort values in descending order, and also get the indices idx was sorted by.
>>> idx.sort_values(ascending=False, return_indexer=True)
(Int64Index([1000, 100, 10, 1], dtype='int64'), array([3, 1, 0, 2], dtype=int32))
Sorting values in a MultiIndex:
>>> midx = cudf.MultiIndex(
...     levels=[[1, 3, 4, -10], [1, 11, 5]],
...     codes=[[0, 0, 1, 2, 3], [0, 2, 1, 1, 0]],
...     names=["x", "y"],
... )
>>> midx
MultiIndex([(  1,  1),
            (  1,  5),
            (  3, 11),
            (  4, 11),
            (-10,  1)],
           names=['x', 'y'])
>>> midx.sort_values()
MultiIndex([(-10,  1),
            (  1,  1),
            (  1,  5),
            (  3, 11),
            (  4, 11)],
           names=['x', 'y'])
>>> midx.sort_values(ascending=False)
MultiIndex([(  4, 11),
            (  3, 11),
            (  1,  5),
            (  1,  1),
            (-10,  1)],
           names=['x', 'y'])
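The relationship between the sorted copy and the returned indexer can be shown on a plain list: the indexer holds the original positions in sorted order, so taking those positions reproduces the sorted values. This is a sketch of that relationship only; the function name sort_with_indexer is hypothetical and null placement (na_position) is not modelled.

```python
def sort_with_indexer(values, return_indexer=False, ascending=True):
    """Hypothetical model: return sorted values and, optionally, source positions."""
    # Sort the positions by the value found at each position.
    indexer = sorted(
        range(len(values)), key=values.__getitem__, reverse=not ascending
    )
    result = [values[i] for i in indexer]
    return (result, indexer) if return_indexer else result


values = [10, 100, 1, 1000]
print(sort_with_indexer(values))  # [1, 10, 100, 1000]
print(sort_with_indexer(values, return_indexer=True, ascending=False))
# ([1000, 100, 10, 1], [3, 1, 0, 2])
```

The indexer [3, 1, 0, 2] agrees with the array returned in the descending example above.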
- property str#
Not yet implemented.
- take(indices, axis=0, allow_fill=True, fill_value=None)#
Return a new index containing the rows specified by indices
Parameters#
- indices : array-like
Array of ints indicating which positions to take.
- axis : int
The axis over which to select values, always 0.
- allow_fill : Unsupported
- fill_value : Unsupported
Returns#
- out : Index
New object with desired subset of rows.
Examples#
>>> import cudf
>>> idx = cudf.Index(['a', 'b', 'c', 'd', 'e'])
>>> idx.take([2, 0, 4, 3])
StringIndex(['c' 'a' 'e' 'd'], dtype='object')
- to_arrow()#
Convert to a suitable Arrow object.
- to_cupy()#
Convert to a cupy array.
- to_dlpack()#
Converts a cuDF object into a DLPack tensor.
DLPack is an open-source memory tensor structure: dmlc/dlpack.
This function takes a cuDF object and converts it to a PyCapsule object which contains a pointer to a DLPack tensor. This function deep copies the data into the DLPack tensor from the cuDF object.
Parameters#
cudf_obj : DataFrame, Series, Index, or Column
Returns#
- pycapsule_obj : PyCapsule
Output DLPack tensor pointer which is encapsulated in a PyCapsule object.
- to_frame(index=True, name=<no_default>)#
Create a DataFrame with a column containing this Index.
Parameters#
- index : boolean, default True
Set the index of the returned DataFrame as the original Index.
- name : object, defaults to index.name
The passed name should substitute for the index name (if it has one).
Returns#
- DataFrame
DataFrame containing the original Index data.
See Also#
Index.to_series : Convert an Index to a Series.
Series.to_frame : Convert Series to DataFrame.
Examples#
>>> import cudf
>>> idx = cudf.Index(['Ant', 'Bear', 'Cow'], name='animal')
>>> idx.to_frame()
       animal
animal
Ant       Ant
Bear     Bear
Cow       Cow
By default, the original Index is reused. To enforce a new Index:
>>> idx.to_frame(index=False)
  animal
0    Ant
1   Bear
2    Cow
To override the name of the resulting column, specify name:
>>> idx.to_frame(index=False, name='zoo')
    zoo
0   Ant
1  Bear
2   Cow
- to_list()#
- to_numpy()#
Convert to a numpy array.
- to_pandas(nullable=False)#
Convert to a Pandas Index.
Parameters#
- nullable : bool, default False
If nullable is True, the resulting index will have a corresponding nullable Pandas dtype. If there is no corresponding nullable Pandas dtype present, the resulting dtype will be a regular pandas dtype. If nullable is False, the resulting index will either convert null values to np.nan or None depending on the dtype.
Examples#
>>> import cudf
>>> idx = cudf.Index([-3, 10, 15, 20])
>>> idx
Int64Index([-3, 10, 15, 20], dtype='int64')
>>> idx.to_pandas()
Int64Index([-3, 10, 15, 20], dtype='int64')
>>> type(idx.to_pandas())
<class 'pandas.core.indexes.numeric.Int64Index'>
>>> type(idx)
<class 'cudf.core.index.Int64Index'>
- to_series(index=None, name=None)#
Create a Series with both index and values equal to the index keys. Useful with map for returning an indexer based on an index.
Parameters#
- index : Index, optional
Index of resulting Series. If None, defaults to original index.
- name : str, optional
Name of resulting Series. If None, defaults to name of original index.
Returns#
- Series
The dtype will be based on the type of the Index values.
- tolist()#
- union(other, sort=None)#
Form the union of two Index objects.
Parameters#
other : Index or array-like
sort : bool or None, default None
Whether to sort the resulting Index.
None : Sort the result, except when self and other are equal, or when self or other has length 0.
False : Do not sort the result.
Returns#
union : Index
Examples#
Union of an Index
>>> import cudf
>>> import pandas as pd
>>> idx1 = cudf.Index([1, 2, 3, 4])
>>> idx2 = cudf.Index([3, 4, 5, 6])
>>> idx1.union(idx2)
Int64Index([1, 2, 3, 4, 5, 6], dtype='int64')
MultiIndex case
>>> idx1 = cudf.MultiIndex.from_pandas(
...     pd.MultiIndex.from_arrays(
...         [[1, 1, 2, 2], ["Red", "Blue", "Red", "Blue"]]
...     )
... )
>>> idx1
MultiIndex([(1,  'Red'),
            (1, 'Blue'),
            (2,  'Red'),
            (2, 'Blue')],
           )
>>> idx2 = cudf.MultiIndex.from_pandas(
...     pd.MultiIndex.from_arrays(
...         [[3, 3, 2, 2], ["Red", "Green", "Red", "Green"]]
...     )
... )
>>> idx2
MultiIndex([(3,   'Red'),
            (3, 'Green'),
            (2,   'Red'),
            (2, 'Green')],
           )
>>> idx1.union(idx2)
MultiIndex([(1,  'Blue'),
            (1,   'Red'),
            (2,  'Blue'),
            (2, 'Green'),
            (2,   'Red'),
            (3, 'Green'),
            (3,   'Red')],
           )
>>> idx1.union(idx2, sort=False)
MultiIndex([(1,   'Red'),
            (1,  'Blue'),
            (2,   'Red'),
            (2,  'Blue'),
            (3,   'Red'),
            (3, 'Green'),
            (2, 'Green')],
           )
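The ordering rules for sort=None and sort=False can be reproduced on plain Python lists, with tuples standing in for MultiIndex rows. This is a sketch of the documented semantics, not the library's implementation; the helper name index_union is hypothetical, and always dropping repeated labels is a simplifying assumption.

```python
def index_union(left, right, sort=None):
    """Hypothetical model of the documented union ordering rules."""
    left, right = list(left), list(right)
    seen = set()
    result = []
    # Labels from `left` come first, then unseen labels from `right`.
    for value in left + right:
        if value not in seen:
            seen.add(value)
            result.append(value)
    # sort=None sorts unless the inputs are equal or one of them is empty.
    if sort is None and left and right and left != right:
        result = sorted(result)
    return result


idx1 = [(1, "Red"), (1, "Blue"), (2, "Red"), (2, "Blue")]
idx2 = [(3, "Red"), (3, "Green"), (2, "Red"), (2, "Green")]
print(index_union([1, 2, 3, 4], [3, 4, 5, 6]))  # [1, 2, 3, 4, 5, 6]
print(index_union(idx1, idx2, sort=False)[-3:])
# [(3, 'Red'), (3, 'Green'), (2, 'Green')]
```

With sort=False the rows of the second index that are new are appended in their original order, which is why (2, 'Green') ends up last in the MultiIndex example above.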
- property values#
- where(cond, other=None, inplace=False)#
Replace values where the condition is False.
The replacement is taken from other.
Parameters#
- cond : bool array-like with the same length as self
Condition to select the values on.
- other : scalar, or array-like, default None
Replacement if the condition is False.
Returns#
- cudf.Index
A copy of self with values replaced from other where the condition is False.
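The replacement rule can be stated compactly on plain lists: positions where cond is True keep their value, and every other position takes its value from other. The sketch below is a model of that rule, not the library's implementation; the function name where_model is hypothetical, and using None for a missing replacement stands in for a null value.

```python
def where_model(values, cond, other=None):
    """Hypothetical model: keep values where cond is True, else take from other."""
    if len(cond) != len(values):
        raise ValueError("cond must have the same length as values")
    if isinstance(other, (list, tuple)):
        if len(other) != len(values):
            raise ValueError("array-like other must match the length of values")
        replacements = other
    else:
        # A scalar (or None) is broadcast to every position.
        replacements = [other] * len(values)
    return [
        value if keep else replacement
        for value, keep, replacement in zip(values, cond, replacements)
    ]


mask = [True, False, True, False]
print(where_model([1, 2, 3, 4], mask, 0))                 # [1, 0, 3, 0]
print(where_model([1, 2, 3, 4], mask, [10, 20, 30, 40]))  # [1, 20, 3, 40]
print(where_model([1, 2, 3, 4], mask))                    # [1, None, 3, None]
```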