datascience.tables.Table.group

Table.group(column_or_label, collect=None)

Group rows by unique values in a column; count or aggregate others.

Args:

column_or_label: values to group (column label or index, or array)

collect: a function applied, once per group, to the values in each of the other columns

Returns:

If collect is not provided, a Table with one row per unique value in column_or_label. The first column contains the unique values, and the second, labeled count, contains the number of rows for each value.

If collect is provided, a Table with all of the original columns. Rows are first grouped according to column_or_label, then collect is applied to each set of grouped values in every other column. As the examples below show, the name of collect is appended to the label of each of those columns.

Note:

The grouped column will appear first in the result table. If collect cannot be applied to the values in a column (for example, sum on a column of strings), that column will be empty in the resulting table.

>>> marbles = Table().with_columns(
...    "Color", make_array("Red", "Green", "Blue", "Red", "Green", "Green"),
...    "Shape", make_array("Round", "Rectangular", "Rectangular", "Round", "Rectangular", "Round"),
...    "Amount", make_array(4, 6, 12, 7, 9, 2),
...    "Price", make_array(1.30, 1.30, 2.00, 1.75, 1.40, 1.00))
>>> marbles
Color | Shape       | Amount | Price
Red   | Round       | 4      | 1.3
Green | Rectangular | 6      | 1.3
Blue  | Rectangular | 12     | 2
Red   | Round       | 7      | 1.75
Green | Rectangular | 9      | 1.4
Green | Round       | 2      | 1
>>> marbles.group("Color") # just gives counts
Color | count
Blue  | 1
Green | 3
Red   | 2
>>> marbles.group("Color", max) # takes the max of each grouping, in each column
Color | Shape max   | Amount max | Price max
Blue  | Rectangular | 12         | 2
Green | Round       | 9          | 1.4
Red   | Round       | 7          | 1.75
>>> marbles.group("Shape", sum) # sum doesn't make sense for strings
Shape       | Color sum | Amount sum | Price sum
Rectangular |           | 27         | 4.7
Round       |           | 13         | 4.05
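
The grouping behavior described above can be sketched in plain Python, without the datascience library. This is only an illustration of the semantics, not the library's implementation. The name group_sketch and its rows/labels arguments are hypothetical, and two details are assumptions made to match the example output: groups are returned in sorted order, and a column is left as an empty string when collect raises a TypeError.

```python
from collections import defaultdict


def group_sketch(rows, labels, key, collect=None):
    """Illustrative stand-in for Table.group (hypothetical helper).

    rows:    list of tuples, one per table row
    labels:  list of column labels, in column order
    key:     label of the column to group by
    collect: optional function applied to each group's values, per column
    """
    k = labels.index(key)

    # Bucket the rows by the value found in the grouping column.
    groups = defaultdict(list)
    for row in rows:
        groups[row[k]].append(row)

    result = []
    # Assumption: unique values come out sorted, as in the examples above.
    for value in sorted(groups):
        members = groups[value]
        if collect is None:
            # No collect function: report how many rows fell in the group.
            result.append((value, len(members)))
            continue
        out = [value]  # the grouped column always comes first
        for i in range(len(labels)):
            if i == k:
                continue
            column = [r[i] for r in members]
            try:
                out.append(collect(column))
            except TypeError:
                # Assumption: an incompatible column is left empty.
                out.append("")
        result.append(tuple(out))
    return result


labels = ["Color", "Shape", "Amount", "Price"]
rows = [
    ("Red", "Round", 4, 1.30),
    ("Green", "Rectangular", 6, 1.30),
    ("Blue", "Rectangular", 12, 2.00),
    ("Red", "Round", 7, 1.75),
    ("Green", "Rectangular", 9, 1.40),
    ("Green", "Round", 2, 1.00),
]

counts = group_sketch(rows, labels, "Color")        # counts per color
maxima = group_sketch(rows, labels, "Color", max)   # max of each other column
totals = group_sketch(rows, labels, "Shape", sum)   # sum fails on the strings
print(counts)
```

Here counts holds one (value, count) pair per color, while totals holds an empty string in the Color position because sum cannot add strings.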