hipdf.DataFrame.drop


Applies to Linux

DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace: bool = False, errors: Literal['ignore', 'raise'] = 'raise') → Self | None

Drop specified labels from rows or columns.

Remove rows or columns either by specifying label names together with the corresponding axis, or by directly specifying index or column names. When using a MultiIndex, labels on different levels can be removed by specifying the level.

Parameters

labels (single label or list-like): Index or column labels to drop.
axis ({0 or ‘index’, 1 or ‘columns’}, default 0): Whether to drop labels from the index (0 or ‘index’) or columns (1 or ‘columns’).
index (single label or list-like): Alternative to specifying axis (labels, axis=0 is equivalent to index=labels).
columns (single label or list-like): Alternative to specifying axis (labels, axis=1 is equivalent to columns=labels).
level (int or level name, optional): For a MultiIndex, the level from which the labels will be removed.
inplace (bool, default False): If False, return a copy. Otherwise, perform the operation in place and return None.
errors ({‘ignore’, ‘raise’}, default ‘raise’): If ‘ignore’, suppress the error and drop only the labels that exist.
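
The inplace and errors parameters described above can be sketched as follows. This is a minimal illustration rather than an authoritative reference: it assumes the package is importable as cudf, as in the examples on this page, and falls back to pandas (which exposes the same drop signature) only so the snippet can run on a machine without a GPU. The alias xdf and the sample column names are hypothetical.

```python
# Assumption: the library is importable as `cudf`, as in this page's examples.
# The pandas fallback exists only so the sketch runs without a GPU.
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf

df = xdf.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})

# errors="ignore": "Z" is not a column, so only "B" is dropped
# and no KeyError is raised.
trimmed = df.drop(columns=["B", "Z"], errors="ignore")
print(list(trimmed.columns))  # ['A', 'C']

# inplace=False (the default) returns a copy; the original is unchanged.
print(list(df.columns))  # ['A', 'B', 'C']

# inplace=True modifies df itself and returns None.
result = df.drop(columns=["C"], inplace=True)
print(result, list(df.columns))  # None ['A', 'B']
```

Because inplace=True returns None, chaining further method calls onto such a call will fail; assigning the returned copy is usually the safer pattern.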

Returns

DataFrame, Series, or None: The DataFrame or Series without the removed index or column labels, or None if inplace=True.

Raises

KeyError: If any of the labels is not found in the selected axis.
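
As a hedged sketch of the KeyError behaviour, the snippet below first triggers the error with the default errors='raise', then shows one possible way to avoid it by filtering the requested labels against the existing columns. It assumes the package is importable as cudf, with a pandas fallback purely so the code runs without a GPU; the alias xdf, the variable names, and the column name "missing" are illustrative only.

```python
# Assumption: the library is importable as `cudf`, as in this page's examples.
# The pandas fallback exists only so the sketch runs without a GPU.
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf

df = xdf.DataFrame({"A": [1, 2], "B": [3, 4]})

# With the default errors="raise", a label that is not found raises KeyError.
try:
    df.drop(columns=["missing"])
    raised = False
except KeyError:
    raised = True
print(raised)  # True

# One way to avoid the exception: keep only the labels that actually exist.
requested = ["B", "missing"]
to_drop = [name for name in requested if name in df.columns]
safe = df.drop(columns=to_drop)
print(list(safe.columns))  # ['A']
```

Passing errors='ignore' achieves the same result in a single call; explicit filtering is mainly useful when the skipped labels need to be logged or reported.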

Examples

Series

>>> import cudf
>>> s = cudf.Series([1, 2, 3], index=['x', 'y', 'z'])
>>> s
x    1
y    2
z    3
dtype: int64

Drop labels x and z

>>> s.drop(labels=['x', 'z'])
y    2
dtype: int64

Drop a label from the second level in MultiIndex Series.

>>> midx = cudf.MultiIndex.from_product([[0, 1, 2], ['x', 'y']])
>>> s = cudf.Series(range(6), index=midx)
>>> s
0  x    0
   y    1
1  x    2
   y    3
2  x    4
   y    5
dtype: int64
>>> s.drop(labels='y', level=1)
0  x    0
1  x    2
2  x    4
Name: 2, dtype: int64

DataFrame

>>> import cudf
>>> df = cudf.DataFrame({"A": [1, 2, 3, 4],
...                      "B": [5, 6, 7, 8],
...                      "C": [10, 11, 12, 13],
...                      "D": [20, 30, 40, 50]})
>>> df
   A  B   C   D
0  1  5  10  20
1  2  6  11  30
2  3  7  12  40
3  4  8  13  50

Drop columns

>>> df.drop(['B', 'C'], axis=1)
   A   D
0  1  20
1  2  30
2  3  40
3  4  50
>>> df.drop(columns=['B', 'C'])
   A   D
0  1  20
1  2  30
2  3  40
3  4  50

Drop rows by index

>>> df.drop([0, 1])
   A  B   C   D
2  3  7  12  40
3  4  8  13  50

Drop columns and/or rows of MultiIndex DataFrame

>>> midx = cudf.MultiIndex(levels=[['lama', 'cow', 'falcon'],
...                              ['speed', 'weight', 'length']],
...                      codes=[[0, 0, 0, 1, 1, 1, 2, 2, 2],
...                             [0, 1, 2, 0, 1, 2, 0, 1, 2]])
>>> df = cudf.DataFrame(index=midx, columns=['big', 'small'],
...                   data=[[45, 30], [200, 100], [1.5, 1], [30, 20],
...                         [250, 150], [1.5, 0.8], [320, 250],
...                         [1, 0.8], [0.3, 0.2]])
>>> df
                 big  small
lama   speed    45.0   30.0
       weight  200.0  100.0
       length    1.5    1.0
cow    speed    30.0   20.0
       weight  250.0  150.0
       length    1.5    0.8
falcon speed   320.0  250.0
       weight    1.0    0.8
       length    0.3    0.2
>>> df.drop(index='cow', columns='small')
                 big
lama   speed    45.0
       weight  200.0
       length    1.5
falcon speed   320.0
       weight    1.0
       length    0.3
>>> df.drop(index='length', level=1)
                 big  small
lama   speed    45.0   30.0
       weight  200.0  100.0
cow    speed    30.0   20.0
       weight  250.0  150.0
falcon speed   320.0  250.0
       weight    1.0    0.8