datascience.tables.Table.scatter
- Table.scatter(column_for_x, select=None, overlay=True, fit_line=False, group=None, labels=None, sizes=None, width=None, height=None, s=20, **vargs)
Creates scatterplots, optionally adding a line of best fit. Redirects to Table#iscatter if interactive plots are enabled with Table#interactive_plots.
- args:
column_for_x (str): the column to use for the x-axis values and label of the scatter plots.
- kwargs:
overlay (bool): if true, creates a chart with one color per data column; if false, each plot will be displayed separately.
fit_line (bool): draw a line of best fit for each set of points.
vargs: additional arguments that get passed into plt.scatter. See http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.scatter for the arguments that can be passed into vargs; these include marker and norm, to name a couple.
group: a column of categories used to color the dots by category.
labels: a column of text labels to annotate dots.
sizes: a column of values to set the relative areas of dots.
s: size of dots. If sizes is also provided, then dots will be in the range 0 to 2 * s.
colors: (deprecated) a synonym for group. Retained temporarily for backwards compatibility. This argument will be removed in future releases.
show (bool): whether to show the figure if using interactive plots; if false, the figure is returned instead.
- Raises:
ValueError – Every column, column_for_x or select, must be numerical.
- Returns:
Scatter plot of values of column_for_x plotted against values for all other columns in self. Each plot uses the values in column_for_x for horizontal positions. One plot is produced for all other columns in self as y (or for the columns designated by select).
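The column-selection rule described under Returns, combined with the overlay flag, can be modeled in a few lines of plain Python. This is only a sketch of the documented behavior, not the library's code: the helper names y_columns and plan_charts are hypothetical, and treating select as either a single label or a list of labels is an assumption.

```python
def y_columns(labels, column_for_x, select=None):
    """Return the labels plotted on the y-axis (hypothetical helper)."""
    if select is None:
        # Default: every column other than the x column.
        return [label for label in labels if label != column_for_x]
    # Assumption: select may be one label or a list of labels.
    if isinstance(select, str):
        return [select]
    return list(select)


def plan_charts(labels, column_for_x, select=None, overlay=True):
    """Group the y columns into charts: one shared chart if overlay, else one per column."""
    ys = y_columns(labels, column_for_x, select)
    return [ys] if overlay else [[y] for y in ys]


labels = ['x', 'y', 'z']
print(plan_charts(labels, 'x'))                 # [['y', 'z']]
print(plan_charts(labels, 'x', overlay=False))  # [['y'], ['z']]
print(plan_charts(labels, 'x', select='z'))     # [['z']]
```

Each inner list stands for one chart; with overlay=True all y columns share a single chart and are told apart by color.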
>>> table = Table().with_columns(
...     'x', make_array(9, 3, 3, 1),
...     'y', make_array(1, 2, 2, 10),
...     'z', make_array(3, 4, 5, 6))
>>> table
x    | y    | z
9    | 1    | 3
3    | 2    | 4
3    | 2    | 5
1    | 10   | 6
>>> table.scatter('x')
<scatterplot of values in y and z on x>
>>> table.scatter('x', overlay=False)
<scatterplot of values in y on x>
<scatterplot of values in z on x>
>>> table.scatter('x', fit_line=True)
<scatterplot of values in y and z on x with lines of best fit>
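To give a sense of what fit_line=True draws for the example table, the sketch below computes a straight-line fit for y on x and for z on x by hand. It assumes that "line of best fit" means an ordinary least-squares line; the source does not say how the library computes it, and least_squares_line is a hypothetical helper, not part of datascience.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Same data as the example table above.
x = [9, 3, 3, 1]
y = [1, 2, 2, 10]
z = [3, 4, 5, 6]

for name, values in (('y', y), ('z', z)):
    slope, intercept = least_squares_line(x, values)
    print(f"{name} on x: slope={slope:.3f}, intercept={intercept:.3f}")
# y on x: slope=-0.806, intercept=6.972
# z on x: slope=-0.333, intercept=5.833
```

Under that assumption, both fitted lines slope downward, since y and z tend to decrease as x increases in this table.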