hipdf.DataFrame.reindex

Applies to Linux

DataFrame.reindex(labels=None, index=None, columns=None, axis=None, method=None, copy=True, level=None, fill_value=<NA>, limit=None, tolerance=None)

Conform DataFrame to new index. Places NA/NaN in locations having no value in the previous index. A new object is produced unless the new index is equivalent to the current one and copy=False.

Parameters

labels : Index, Series-convertible, optional, default None

New labels / index to conform the axis specified by axis to.

index : Index, Series-convertible, optional, default None

The index labels specifying the index to conform to.

columns : array-like, optional, default None

The column labels specifying the columns to conform to.

axis : Axis to target.

Can be either the axis name (index, columns) or number (0, 1).

method : Not supported

copy : boolean, default True

Return a new object, even if the passed indexes are the same.

level : Not supported

fill_value : Value to use for missing values.

Defaults to NA, but can be any “compatible” value.

limit : Not supported

tolerance : Not supported

Returns

DataFrame with changed index.

Examples

DataFrame.reindex supports two calling conventions:

* (index=index_labels, columns=column_labels, ...)
* (labels, axis={'index', 'columns'}, ...)

We _highly_ recommend using keyword arguments to clarify your intent.
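As a rough illustration of how the two conventions relate, the sketch below maps either style onto a single (index, columns) pair. The helper `resolve_reindex_args` is hypothetical and is not part of the library; rejecting a mix of the two styles and treating a missing axis as the index are assumptions of this sketch, not documented behavior.

```python
def resolve_reindex_args(labels=None, index=None, columns=None, axis=None):
    """Hypothetical helper: map either calling convention to (index, columns)."""
    if labels is not None:
        if index is not None or columns is not None:
            # Assumption of this sketch only: refuse to mix the two styles.
            raise TypeError("pass either labels/axis or index/columns, not both")
        if axis in (None, 0, "index"):
            # Assumption: a missing axis targets the index.
            return labels, None
        if axis in (1, "columns"):
            return None, labels
        raise ValueError(f"unknown axis: {axis!r}")
    # Keyword style: index and columns are already explicit.
    return index, columns


rows = ["Safari", "IE10"]
cols = ["http_status", "user_agent"]

# Both spellings resolve to the same target for the row axis.
print(resolve_reindex_args(index=rows) == resolve_reindex_args(rows, axis="index"))
# The axis-style call can target the columns by name or by number.
print(resolve_reindex_args(cols, axis="columns") == resolve_reindex_args(cols, axis=1))
```

The keyword style leaves no ambiguity about which axis a list of labels refers to, which is the reason for the recommendation above.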

Create a dataframe with some fictional data.

>>> index = ['Firefox', 'Chrome', 'Safari', 'IE10', 'Konqueror']
>>> df = cudf.DataFrame({'http_status': [200, 200, 404, 404, 301],
...                      'response_time': [0.04, 0.02, 0.07, 0.08, 1.0]},
...                     index=index)
>>> df
           http_status  response_time
Firefox            200           0.04
Chrome             200           0.02
Safari             404           0.07
IE10               404           0.08
Konqueror          301           1.00
>>> new_index = ['Safari', 'Iceweasel', 'Comodo Dragon', 'IE10',
...              'Chrome']
>>> df.reindex(new_index)
              http_status response_time
Safari                404          0.07
Iceweasel            <NA>          <NA>
Comodo Dragon        <NA>          <NA>
IE10                  404          0.08
Chrome                200          0.02

We can fill in the missing values by passing a value to the keyword fill_value.

>>> df.reindex(new_index, fill_value=0)
               http_status  response_time
Safari                 404           0.07
Iceweasel                0           0.00
Comodo Dragon            0           0.00
IE10                   404           0.08
Chrome                 200           0.02
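
The row-conforming behavior shown above can be modeled in plain Python. This is only a conceptual sketch using dictionaries; it is not how the library implements reindexing, and `reindex_rows` and the `NA` stand-in are hypothetical names introduced for illustration.

```python
NA = None  # stand-in for the <NA> marker in this sketch


def reindex_rows(rows, new_index, fill_value=NA):
    """Conceptual model: rows maps a label to a dict of column values."""
    # Collect column names from the existing rows, in first-seen order.
    columns = []
    for values in rows.values():
        for name in values:
            if name not in columns:
                columns.append(name)

    result = {}
    for label in new_index:
        if label in rows:
            # Label exists in the old index: carry its values over.
            result[label] = dict(rows[label])
        else:
            # Label is new: every column gets the fill value.
            result[label] = {name: fill_value for name in columns}
    return result


rows = {
    "Firefox": {"http_status": 200, "response_time": 0.04},
    "Chrome": {"http_status": 200, "response_time": 0.02},
    "Safari": {"http_status": 404, "response_time": 0.07},
}
new_index = ["Safari", "Iceweasel", "Chrome"]

filled = reindex_rows(rows, new_index, fill_value=0)
for label, values in filled.items():
    print(label, values)
```

Labels absent from the new index ("Firefox" here) are dropped, labels absent from the old index receive the fill value in every column, and the result follows the order of the new index.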

We can also reindex the columns.

>>> df.reindex(columns=['http_status', 'user_agent'])
           http_status user_agent
Firefox            200       <NA>
Chrome             200       <NA>
Safari             404       <NA>
IE10               404       <NA>
Konqueror          301       <NA>

Or we can use “axis-style” keyword arguments.

>>> df.reindex(['http_status', 'user_agent'], axis='columns')
           http_status user_agent
Firefox            200       <NA>
Chrome             200       <NA>
Safari             404       <NA>
IE10               404       <NA>
Konqueror          301       <NA>