datascience.tables.Table.split

Table.split(k)

Return a tuple of two tables: the first contains k randomly sampled rows, and the second contains the remaining rows.

Args:
k (int): The number of rows randomly sampled into the first table. k must be between 1 and num_rows - 1.

Raises:

ValueError: k is not between 1 and num_rows - 1.

Returns:

A tuple containing two instances of Table.

>>> jobs = Table().with_columns(
...     'job',  make_array('a', 'b', 'c', 'd'),
...     'wage', make_array(10, 20, 15, 8))
>>> jobs
job  | wage
a    | 10
b    | 20
c    | 15
d    | 8
>>> sample, rest = jobs.split(3)
>>> sample 
job  | wage
c    | 15
a    | 10
b    | 20
>>> rest 
job  | wage
d    | 8
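
Because the rows are sampled at random, the contents of sample and rest in the session above will generally differ from run to run. The documented contract (k rows in the first result, the remaining rows in the second, and a ValueError when k is outside 1 to num_rows - 1) can be illustrated with a minimal pure-Python sketch. This is not the library's implementation: split_rows and its seed parameter are hypothetical names introduced here, and rows are modelled as plain tuples rather than a Table.

```python
import random


def split_rows(rows, k, seed=None):
    """Hypothetical helper (not part of datascience): split a list of
    rows into (sample, rest), mirroring the documented contract."""
    num_rows = len(rows)
    # Same bounds as documented: k must be between 1 and num_rows - 1.
    if not 1 <= k <= num_rows - 1:
        raise ValueError("k must be between 1 and num_rows - 1")
    # Assumption: a seeded generator is used only to make the sketch repeatable.
    rng = random.Random(seed)
    order = list(range(num_rows))
    rng.shuffle(order)  # random permutation of the row indices
    sample = [rows[i] for i in order[:k]]  # first k shuffled rows
    rest = [rows[i] for i in order[k:]]    # every row not in the sample
    return sample, rest


jobs = [('a', 10), ('b', 20), ('c', 15), ('d', 8)]
sample, rest = split_rows(jobs, 3, seed=0)
print(len(sample), len(rest))
```

Every row lands in exactly one of the two results, so concatenating them recovers the original rows in some order.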