hipdf.DataFrame.drop#
- DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')#
Drop specified labels from rows or columns.
Remove rows or columns by specifying label names and the corresponding axis, or by directly specifying index or column names. When using a multi-index, labels on different levels can be removed by specifying the level.
Parameters#
- labels : single label or list-like
Index or column labels to drop.
- axis : {0 or ‘index’, 1 or ‘columns’}, default 0
Whether to drop labels from the index (0 or ‘index’) or columns (1 or ‘columns’).
- index : single label or list-like
Alternative to specifying axis (labels, axis=0 is equivalent to index=labels).
- columns : single label or list-like
Alternative to specifying axis (labels, axis=1 is equivalent to columns=labels).
- level : int or level name, optional
For MultiIndex, level from which the labels will be removed.
- inplace : bool, default False
If False, return a copy. Otherwise, perform the operation in place and return None.
- errors : {‘ignore’, ‘raise’}, default ‘raise’
If ‘ignore’, suppress the error and drop only the labels that exist.
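The relationship between axis, columns, and inplace described above can be sketched as follows. This is a minimal illustration, not part of the official reference: the alias xdf is hypothetical, and the pandas fallback is an assumption that pandas behaves the same way for these particular calls when no GPU library is installed.

```python
try:
    import cudf as xdf    # hipDF is used through the cudf namespace, as in this page's examples
except ImportError:
    import pandas as xdf  # assumption: pandas as a CPU stand-in with the same drop() semantics

df = xdf.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})

# Passing labels with axis=1 and passing columns= describe the same operation.
by_axis = df.drop(["B"], axis=1)
by_keyword = df.drop(columns=["B"])
same_result = by_axis.equals(by_keyword)

# With the default inplace=False, the original frame keeps all of its columns.
original_columns = list(df.columns)

# With inplace=True, the frame itself is modified and the call returns None.
returned = df.drop(columns=["C"], inplace=True)
remaining_columns = list(df.columns)

print(same_result, original_columns, returned, remaining_columns)
```

Because inplace=True returns None, assigning its result back to the variable (df = df.drop(..., inplace=True)) would discard the frame.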
Returns#
- DataFrame or Series
DataFrame or Series without the removed index or column labels.
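As a small sketch of the return value: the result has the same type as the caller, and the caller is left unchanged when inplace is False. The alias xdf is hypothetical, and the pandas fallback is an assumption made only so the snippet can run without a GPU library.

```python
try:
    import cudf as xdf    # hipDF is used through the cudf namespace
except ImportError:
    import pandas as xdf  # assumption: pandas as a CPU stand-in

df = xdf.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
s = xdf.Series([1, 2, 3], index=["x", "y", "z"])

# drop() on a DataFrame returns a DataFrame; drop() on a Series returns a Series.
df_out = df.drop(columns=["B"])
s_out = s.drop(labels=["x"])

df_type_kept = isinstance(df_out, xdf.DataFrame)
s_type_kept = isinstance(s_out, xdf.Series)

# The inputs still hold every original label.
print(df_type_kept, s_type_kept, list(df.columns), len(s), len(s_out))
```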
Raises#
- KeyError
If any of the labels is not found in the selected axis.
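The difference between errors='raise' and errors='ignore' can be sketched as follows. The column name "Z" is a deliberately missing, hypothetical label; the alias xdf and the pandas fallback are assumptions made so the snippet can run without a GPU library.

```python
try:
    import cudf as xdf    # hipDF is used through the cudf namespace
except ImportError:
    import pandas as xdf  # assumption: pandas as a CPU stand-in

df = xdf.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# errors='raise' (the default): a label that is not on the axis raises KeyError.
try:
    df.drop(columns=["B", "Z"])
    raised = False
except KeyError:
    raised = True

# errors='ignore': the missing label "Z" is skipped and "B" is still dropped.
lenient = df.drop(columns=["B", "Z"], errors="ignore")
lenient_columns = list(lenient.columns)

print(raised, lenient_columns)
```

Using errors='ignore' is convenient when the set of columns varies between inputs, but it can also hide a misspelled label, so the default is usually the safer choice.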
See Also#
- DataFrame.loc
Label-location based indexer for selection by label.
- DataFrame.dropna
Return DataFrame with labels on given axis omitted where (all or any) data are missing.
- DataFrame.drop_duplicates
Return DataFrame with duplicate rows removed, optionally only considering certain columns.
- Series.reindex
Return only specified index labels of Series.
- Series.dropna
Return series without null values.
- Series.drop_duplicates
Return series with duplicate values removed.
Examples#
Series
>>> import cudf
>>> s = cudf.Series([1, 2, 3], index=['x', 'y', 'z'])
>>> s
x    1
y    2
z    3
dtype: int64
Drop labels x and z
>>> s.drop(labels=['x', 'z'])
y    2
dtype: int64
Drop a label from the second level in MultiIndex Series.
>>> midx = cudf.MultiIndex.from_product([[0, 1, 2], ['x', 'y']])
>>> s = cudf.Series(range(6), index=midx)
>>> s
0  x    0
   y    1
1  x    2
   y    3
2  x    4
   y    5
dtype: int64
>>> s.drop(labels='y', level=1)
0  x    0
1  x    2
2  x    4
Name: 2, dtype: int64
DataFrame
>>> import cudf
>>> df = cudf.DataFrame({"A": [1, 2, 3, 4],
...                      "B": [5, 6, 7, 8],
...                      "C": [10, 11, 12, 13],
...                      "D": [20, 30, 40, 50]})
>>> df
   A  B   C   D
0  1  5  10  20
1  2  6  11  30
2  3  7  12  40
3  4  8  13  50
Drop columns
>>> df.drop(['B', 'C'], axis=1)
   A   D
0  1  20
1  2  30
2  3  40
3  4  50
>>> df.drop(columns=['B', 'C'])
   A   D
0  1  20
1  2  30
2  3  40
3  4  50
Drop rows by index
>>> df.drop([0, 1])
   A  B   C   D
2  3  7  12  40
3  4  8  13  50
Drop columns and/or rows of MultiIndex DataFrame
>>> midx = cudf.MultiIndex(levels=[['lama', 'cow', 'falcon'],
...                                ['speed', 'weight', 'length']],
...                        codes=[[0, 0, 0, 1, 1, 1, 2, 2, 2],
...                               [0, 1, 2, 0, 1, 2, 0, 1, 2]])
>>> df = cudf.DataFrame(index=midx, columns=['big', 'small'],
...                     data=[[45, 30], [200, 100], [1.5, 1], [30, 20],
...                           [250, 150], [1.5, 0.8], [320, 250],
...                           [1, 0.8], [0.3, 0.2]])
>>> df
                 big  small
lama   speed    45.0   30.0
       weight  200.0  100.0
       length    1.5    1.0
cow    speed    30.0   20.0
       weight  250.0  150.0
       length    1.5    0.8
falcon speed   320.0  250.0
       weight    1.0    0.8
       length    0.3    0.2
>>> df.drop(index='cow', columns='small')
                 big
lama   speed    45.0
       weight  200.0
       length    1.5
falcon speed   320.0
       weight    1.0
       length    0.3
>>> df.drop(index='length', level=1)
                 big  small
lama   speed    45.0   30.0
       weight  200.0  100.0
cow    speed    30.0   20.0
       weight  250.0  150.0
falcon speed   320.0  250.0
       weight    1.0    0.8