datascience.tables.Table.pivot_bin

Table.pivot_bin(pivot_columns, value_column, bins=None, **vargs)

Form a table with one column for each unique tuple in pivot_columns. Each column contains the counts, per bin, of the value_column values associated with that tuple.

By default, bins are chosen to contain all values in the value_column. The following named arguments from numpy.histogram can be applied to specialize bin widths:

Args:

bins (int or sequence of scalars): If bins is an int, it defines the number of equal-width bins in the given range (10, by default). If bins is a sequence, it defines the bin edges, including the rightmost edge, allowing for non-uniform bin widths.
range ((float, float)): The lower and upper range of the bins. If not provided, range contains all values in the table. Values outside the range are ignored.
normed (bool): If False, the result will contain the number of samples in each bin. If True, the result is normalized such that the integral over the range is 1.
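
The bin arguments above can be illustrated with a small pure-Python sketch. The helper names (equal_width_edges, count_per_bin, normalize) are hypothetical and are not part of datascience or NumPy. The sketch assumes the numpy.histogram convention that every bin is half-open except the last, which also includes its right edge.

```python
# Illustrative only: these helpers are hypothetical and are not part of
# the datascience or NumPy APIs.

def equal_width_edges(low, high, n_bins):
    """Return n_bins + 1 edges that split [low, high] into equal-width bins."""
    return [low + (high - low) * i / n_bins for i in range(n_bins + 1)]


def count_per_bin(values, edges):
    """Count values per bin. Bins are half-open [left, right) except the
    last, which also includes its right edge (the numpy.histogram
    convention). Values outside the edges are ignored."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # outside the range: ignored
        if v == edges[-1]:
            counts[-1] += 1  # the rightmost edge belongs to the last bin
            continue
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    return counts


def normalize(counts, edges):
    """Scale counts so that sum(density * bin_width) equals 1."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / (total * (edges[i + 1] - edges[i]))
            for i, c in enumerate(counts)]


values = [86, 51, 32]

# bins as a sequence: explicit, non-uniform edges
edges = [20, 45, 100]
counts = count_per_bin(values, edges)
print(counts)  # [1, 2]

# bins as an int plus a range: five equal-width bins over [30, 60]
edges5 = equal_width_edges(30, 60, 5)
print(edges5)  # [30.0, 36.0, 42.0, 48.0, 54.0, 60.0]
print(count_per_bin(values, edges5))  # [1, 0, 0, 1, 0]; 86 is out of range
```

With the normalization applied, each density is the count divided by the total count times the bin width, so wider bins receive proportionally smaller densities.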

Returns:

New pivot table with one column for each unique value (or tuple of values) in the specified pivot_columns, populated with the counts of value_column values that fall into each of the specified bins within the range.
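
To make the shape of the returned table concrete, here is a minimal pure-Python sketch of the pivot-and-bin idea. pivot_bin_sketch is a hypothetical helper, not the library's implementation. It assumes a single pivot column, explicit bin edges, and (judging from the example output on this page) one output row per bin edge, with the row for the rightmost edge left at zero.

```python
# Illustrative only: pivot_bin_sketch is a hypothetical helper, not the
# datascience implementation.

def pivot_bin_sketch(records, pivot_column, value_column, edges):
    """Return (labels, rows): one row per bin edge, one count per label."""
    # Unique pivot values, in sorted order, become the output columns.
    labels = sorted({r[pivot_column] for r in records})
    last = len(edges) - 2  # index of the last bin
    # One row per edge: [edge, count for label 0, count for label 1, ...].
    # The row for the rightmost edge stays all zeros (an assumption drawn
    # from the example output on this page).
    rows = [[edge] + [0] * len(labels) for edge in edges]
    for r in records:
        v = r[value_column]
        if v < edges[0] or v > edges[-1]:
            continue  # values outside the edges are ignored
        if v == edges[-1]:
            i = last  # the rightmost edge belongs to the last bin
        else:
            i = next(k for k in range(last + 1)
                     if edges[k] <= v < edges[k + 1])
        rows[i][1 + labels.index(r[pivot_column])] += 1
    return labels, rows


records = [
    {'column1': 'data1', 'column2': 86},
    {'column1': 'data2', 'column2': 51},
    {'column1': 'data3', 'column2': 32},
]

labels, rows = pivot_bin_sketch(records, 'column1', 'column2', [20, 45, 100])
print(labels)  # ['data1', 'data2', 'data3']
for row in rows:
    print(row)
# [20, 0, 0, 1]
# [45, 1, 1, 0]
# [100, 0, 0, 0]
```

Each row is labeled by the left edge of its bin, which is why the counts for 86 and 51 both appear in the row labeled 45.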

Examples:

>>> t = Table.from_records([
...   {
...    'column1':'data1',
...    'column2':86,
...    'column3':'b',
...    'column4':5,
...   },
...   {
...    'column1':'data2',
...    'column2':51,
...    'column3':'c',
...    'column4':3,
...   },
...   {
...    'column1':'data3',
...    'column2':32,
...    'column3':'a',
...    'column4':6,
...   }
... ])

>>> t
column1 | column2 | column3 | column4
data1   | 86      | b       | 5
data2   | 51      | c       | 3
data3   | 32      | a       | 6

>>> t.pivot_bin(pivot_columns='column1',value_column='column2')
bin  | data1 | data2 | data3
32   | 0     | 0     | 1
37.4 | 0     | 0     | 0
42.8 | 0     | 0     | 0
48.2 | 0     | 1     | 0
53.6 | 0     | 0     | 0
59   | 0     | 0     | 0
64.4 | 0     | 0     | 0
69.8 | 0     | 0     | 0
75.2 | 0     | 0     | 0
80.6 | 1     | 0     | 0
... (1 rows omitted)

>>> t.pivot_bin(pivot_columns=['column1','column2'],value_column='column4')
bin  | data1-86 | data2-51 | data3-32
3    | 0        | 1        | 0
3.3  | 0        | 0        | 0
3.6  | 0        | 0        | 0
3.9  | 0        | 0        | 0
4.2  | 0        | 0        | 0
4.5  | 0        | 0        | 0
4.8  | 1        | 0        | 0
5.1  | 0        | 0        | 0
5.4  | 0        | 0        | 0
5.7  | 0        | 0        | 1
... (1 rows omitted)

>>> t.pivot_bin(pivot_columns='column1',value_column='column2',bins=[20,45,100])
bin  | data1 | data2 | data3
20   | 0     | 0     | 1
45   | 1     | 1     | 0
100  | 0     | 0     | 0

>>> t.pivot_bin(pivot_columns='column1',value_column='column2',bins=5,range=[30,60])
bin  | data1 | data2 | data3
30   | 0     | 0     | 1
36   | 0     | 0     | 0
42   | 0     | 0     | 0
48   | 0     | 1     | 0
54   | 0     | 0     | 0
60   | 0     | 0     | 0
