datascience.tables.Table.pivot_bin
- Table.pivot_bin(pivot_columns, value_column, bins=None, **vargs)
Form a table with one column per unique tuple in pivot_columns, each containing the per-bin counts of the value_column values associated with that tuple.
By default, bins are chosen to contain all values in the value_column. The following named arguments from numpy.histogram can be applied to specialize bin widths:
- Args:
bins (int or sequence of scalars): If bins is an int, it defines the number of equal-width bins in the given range (10, by default). If bins is a sequence, it defines the bin edges, including the rightmost edge, allowing for non-uniform bin widths.
range ((float, float)): The lower and upper range of the bins. If not provided, range contains all values in the table. Values outside the range are ignored.
normed (bool): If False, the result will contain the number of samples in each bin. If True, the result is normalized such that the integral over the range is 1.
- Returns:
New pivot table with a column for each unique row of the specified pivot_columns, populated with per-bin counts of the values from value_column, distributed into the specified bins and range.
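The bins and range arguments described above follow numpy.histogram. As a rough illustration only (this calls numpy.histogram directly rather than pivot_bin, and the values list is simply column2 from the example table further down), the three ways of choosing bins behave like this:

```python
import numpy as np

# Assumption: these are the column2 values from the example table (86, 51, 32).
values = [86, 51, 32]

# bins as an int: 10 equal-width bins spanning the minimum to the maximum value.
counts, edges = np.histogram(values, bins=10)
# edges holds 11 bin edges; the last bin includes its right edge (86).

# bins as a sequence: explicit, possibly non-uniform bin edges.
counts_seq, edges_seq = np.histogram(values, bins=[20, 45, 100])

# bins with a range: values outside (30, 60) are ignored, so 86 is dropped.
counts_rng, edges_rng = np.histogram(values, bins=5, range=(30, 60))

print(counts.tolist())      # counts for the 10 equal-width bins
print(counts_seq.tolist())  # counts for [20, 45) and [45, 100]
print(counts_rng.tolist())  # counts for the 5 bins between 30 and 60
```

Note that numpy.histogram returns one more edge than it returns counts, which is worth keeping in mind when reading the bin column of the tables in the examples.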
Examples:
>>> t = Table.from_records([
...   {
...    'column1':'data1',
...    'column2':86,
...    'column3':'b',
...    'column4':5,
...   },
...   {
...    'column1':'data2',
...    'column2':51,
...    'column3':'c',
...    'column4':3,
...   },
...   {
...    'column1':'data3',
...    'column2':32,
...    'column3':'a',
...    'column4':6,
...   }
... ])
>>> t
column1 | column2 | column3 | column4
data1   | 86      | b       | 5
data2   | 51      | c       | 3
data3   | 32      | a       | 6
>>> t.pivot_bin(pivot_columns='column1',value_column='column2')
bin  | data1 | data2 | data3
32   | 0     | 0     | 1
37.4 | 0     | 0     | 0
42.8 | 0     | 0     | 0
48.2 | 0     | 1     | 0
53.6 | 0     | 0     | 0
59   | 0     | 0     | 0
64.4 | 0     | 0     | 0
69.8 | 0     | 0     | 0
75.2 | 0     | 0     | 0
80.6 | 1     | 0     | 0
... (1 rows omitted)
>>> t.pivot_bin(pivot_columns=['column1','column2'],value_column='column4')
bin  | data1-86 | data2-51 | data3-32
3    | 0        | 1        | 0
3.3  | 0        | 0        | 0
3.6  | 0        | 0        | 0
3.9  | 0        | 0        | 0
4.2  | 0        | 0        | 0
4.5  | 0        | 0        | 0
4.8  | 1        | 0        | 0
5.1  | 0        | 0        | 0
5.4  | 0        | 0        | 0
5.7  | 0        | 0        | 1
... (1 rows omitted)
>>> t.pivot_bin(pivot_columns='column1',value_column='column2',bins=[20,45,100])
bin  | data1 | data2 | data3
20   | 0     | 0     | 1
45   | 1     | 1     | 0
100  | 0     | 0     | 0
>>> t.pivot_bin(pivot_columns='column1',value_column='column2',bins=5,range=[30,60])
bin  | data1 | data2 | data3
30   | 0     | 0     | 1
36   | 0     | 0     | 0
42   | 0     | 0     | 0
48   | 0     | 1     | 0
54   | 0     | 0     | 0
60   | 0     | 0     | 0
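The layout of these tables (one row per bin edge, one column per pivot value, and a final all-zero row for the rightmost edge) can be approximated with numpy alone. The sketch below is not the datascience implementation: pivot_bin_sketch is a hypothetical helper written for illustration, and it assumes that the bin edges are computed once from all values and that the last row is padded with 0, as the printed examples suggest.

```python
import numpy as np

def pivot_bin_sketch(keys, values, bins=10, **hist_kwargs):
    """Hypothetical helper (not part of datascience): per-key counts per bin."""
    keys = np.asarray(keys)
    values = np.asarray(values, dtype=float)
    # Compute shared bin edges once from all values so every column uses the same bins.
    _, edges = np.histogram(values, bins=bins, **hist_kwargs)
    table = {'bin': edges.tolist()}
    for key in sorted(set(keys.tolist())):
        # Count only the values belonging to this pivot key.
        counts, _ = np.histogram(values[keys == key], bins=edges)
        # Assumption: pad with a trailing 0 so there is one row per edge,
        # matching the rightmost-edge row seen in the printed tables.
        table[str(key)] = counts.tolist() + [0]
    return table

# Same data as column1 and column2 in the example table.
names = ['data1', 'data2', 'data3']
scores = [86, 51, 32]

explicit = pivot_bin_sketch(names, scores, bins=[20, 45, 100])
ranged = pivot_bin_sketch(names, scores, bins=5, range=(30, 60))
print(explicit)
print(ranged)
```

With range=(30, 60), the value 86 falls outside the range and is ignored, which is why the data1 column contains only zeros in the last example.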