datascience.tables.Table.sample_from_distribution

Table.sample_from_distribution(distribution, k, proportions=False)

Return a new table with the same number of rows and one new column. The values in the distribution column define a multinomial distribution. The new column holds the sample counts from k draws, or the sample proportions if proportions is True.

>>> sizes = Table(['size', 'count']).with_rows([
...     ['small', 50],
...     ['medium', 100],
...     ['big', 50],
... ])
>>> sizes.sample_from_distribution('count', 1000) 
size   | count | count sample
small  | 50    | 239
medium | 100   | 496
big    | 50    | 265
>>> sizes.sample_from_distribution('count', 1000, True) 
size   | count | count sample
small  | 50    | 0.24
medium | 100   | 0.51
big    | 50    | 0.25
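
The sampling step can be sketched with NumPy alone. This is a minimal illustration, assuming the method is equivalent to normalizing the distribution column into probabilities and drawing k times from the resulting multinomial; it is not necessarily how the library implements it. The helper name `sample_from_counts` and its `seed` parameter are hypothetical and are not part of the datascience API.

```python
import numpy as np

def sample_from_counts(counts, k, proportions=False, seed=None):
    """Hypothetical helper: draw k samples from the multinomial defined by counts."""
    counts = np.asarray(counts, dtype=float)
    # The column must be usable as a distribution: no negatives, positive total.
    if np.any(counts < 0) or counts.sum() <= 0:
        raise ValueError("counts must be non-negative with a positive total")
    # Normalize counts (or unnormalized weights) into probabilities.
    probabilities = counts / counts.sum()
    rng = np.random.default_rng(seed)
    # One multinomial draw of size k gives a count per category.
    sample = rng.multinomial(k, probabilities)
    # Optionally convert the counts to proportions of the k draws.
    return sample / k if proportions else sample

# Same distribution as the 'count' column in the example above.
counts_sample = sample_from_counts([50, 100, 50], 1000, seed=0)
proportions_sample = sample_from_counts([50, 100, 50], 1000, proportions=True, seed=0)
```

Because the draw is random, the counts differ from run to run, which is why the sample column in the output above will not be reproduced exactly. The counts always sum to k, and the proportions sum to 1.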