Creating Custom Plots

To create a custom plot, click the Add Graph button in the upper-right corner and select one of the available plot types. After selecting a plot, configure the available settings for that plot type and click Save. The custom plot appears on the Visualization page once it has been created.

The following example creates a custom histogram plot for the CreditCard-Train dataset:

[Figure: Create Custom Plot]

The following is a complete list of available graph types.

Bar Chart

This plot presents categorical data with rectangular bars that are proportional to the values they represent. The type of marker used to represent bars determines the bar chart type. The most common marker is the bar marker, which ranges from a lower value (usually zero) to an upper value. Also available are the Cleveland dot plot (replaces the bar with a dot located at the upper value) and the area chart (covers the bars with a solid area marker). Bars are always plotted against the categories of a categorical variable. They may represent counts (if no y variable is specified) or the average value of the y variable per category (if the y variable is specified).

When creating a bar chart, specify the following options:

  • x variable name: Specify the name of the x variable

  • y variable name: Specify the name of the y variable

  • Transpose: Specify whether to switch the X-axis and Y-axis

  • Sort: Specify whether to sort bars alphabetically by x values

  • Mark: Specify a marker type. Select point to create a Cleveland dot plot
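
The two meanings of bar height described above (counts when no y variable is given, per-category averages otherwise) can be sketched in plain Python. This is only an illustration of the aggregation, not the product's implementation; the function name `bar_heights` and the sample rows are hypothetical.

```python
from collections import defaultdict

def bar_heights(rows, x, y=None, sort=False):
    """Return {category: height} for a bar chart.

    Without y, each height is the row count for the category;
    with y, it is the mean of y within the category.
    """
    groups = defaultdict(list)
    for row in rows:
        # Store 1 per row when counting, otherwise the y value.
        groups[row[x]].append(1 if y is None else row[y])
    heights = {
        cat: (len(vals) if y is None else sum(vals) / len(vals))
        for cat, vals in groups.items()
    }
    if sort:
        # Mirrors the Sort option: order bars alphabetically by x value.
        heights = dict(sorted(heights.items()))
    return heights

# Hypothetical sample rows, not taken from any real dataset.
rows = [
    {"education": "university", "limit": 200.0},
    {"education": "graduate", "limit": 500.0},
    {"education": "university", "limit": 100.0},
]
counts = bar_heights(rows, "education", sort=True)
means = bar_heights(rows, "education", y="limit", sort=True)
```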

Boxplot

This plot presents the fractiles of a distribution. The center of the box represents the median, the edges of the box represent the lower and upper quartiles, and the ends of the “whiskers” represent the range of values. When outliers occur, the adjacent whisker is shortened to the next lower or upper value. For variables having only a few values, the boxes can be compressed.

When creating a boxplot, specify the following options:

  • Variable name: Specify the variable that you want the box to represent

  • Transpose: Specify whether to switch the X-axis and Y-axis
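
The quantities a boxplot draws can be computed as in the following sketch. The outlier rule is an assumption: this document does not say how outliers are detected, so the example uses Tukey's convention of 1.5 times the interquartile range. The name `box_summary` is hypothetical.

```python
import statistics

def box_summary(values, k=1.5):
    """Return the statistics a boxplot draws.

    Assumption: a value is an outlier when it lies more than
    k * IQR beyond a quartile (Tukey's convention).
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers stop at the most extreme values that are not outliers.
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

summary = box_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

In this sample the value 100 is flagged as an outlier, so the upper whisker ends at 9, the next lower value.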

Dotplot

This plot represents individual data values with dots. When more than one value falls within a small neighborhood, the dots are stacked.

When creating a dotplot, specify the following options:

  • Variable name: Specify the name of the variable on which dots are calculated

  • Mark: Specify a marker type
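
Dot stacking can be illustrated with a small sketch. It assumes that a "neighborhood" is a fixed-width bin, which this document does not specify; the function `stack_dots` and its `width` parameter are hypothetical.

```python
def stack_dots(values, width):
    """Assign each value an (x, stack_level) position.

    Assumption: a "neighborhood" is a fixed-width bin; values
    in the same bin are stacked one above the other.
    """
    levels = {}
    positions = []
    for v in sorted(values):
        bin_index = int(v // width)
        level = levels.get(bin_index, 0)
        levels[bin_index] = level + 1
        # Draw the dot at the bin center, at the next free level.
        positions.append(((bin_index + 0.5) * width, level))
    return positions

dots = stack_dots([1.0, 1.2, 1.4, 3.1, 5.0], width=1.0)
```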

Grouped Boxplot

This plot is a boxplot where categories are organized into groups and subgroups.

When creating a grouped boxplot, specify the following options:

  • Variable name: Specify the variable that you want the box to represent

  • Group variable name: Specify the name of the grouping variable

  • Transpose: Specify whether to switch the X-axis and Y-axis

Heatmap

See data heatmap. When creating a heatmap, specify the following options:

  • Variable names: Specify one or more variables to use. If none are specified, all the variables in the dataset are used

  • Permute: Specify whether to reorder variables using singular value decomposition (SVD)

  • Transpose: Specify whether to switch the X-axis and Y-axis

  • Matrix type: Specify a matrix type. Choose from rectangular and symmetric

Histogram

This plot is a graphical display of data that uses bars of differing height. Each bar groups numbers into ranges by its width, and taller bars show that more data falls within a specific range. This plot is often used to display the shape and spread of a continuous variable.

When creating a histogram, specify the following options:

  • Variable name: Specify the variable name

  • Transformation: Specify whether to use a transformation. Choose from log and square root

  • Number of bars: Specify the number of bars to use

  • Mark: Specify a marker type. Use area to create a density polygon
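
The effect of the Transformation and Number of bars options can be sketched as follows. The example assumes equal-width bins spanning the transformed data, which may differ from how the product chooses bin edges; the function name `histogram` and its parameters are hypothetical.

```python
import math

def histogram(values, bars, transform=None):
    """Count values in `bars` equal-width bins.

    transform may be None, "log", or "sqrt" and is applied to each
    value before binning. "log" requires strictly positive values
    and "sqrt" requires non-negative values.
    """
    funcs = {None: lambda v: v, "log": math.log, "sqrt": math.sqrt}
    data = [funcs[transform](v) for v in values]
    lo, hi = min(data), max(data)
    width = (hi - lo) / bars or 1.0  # avoid zero width for constant data
    counts = [0] * bars
    for v in data:
        # Clamp the index so the maximum lands in the last bin.
        index = min(int((v - lo) / width), bars - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(bars + 1)]
    return edges, counts

squares = [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
edges, counts = histogram(squares, bars=3, transform="sqrt")
```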

Linear Regression

This plot predicts a set of values on a variable y from values on a variable x by fitting a linear function (\(ax + b\)) so that for any value on the x variable, this function yields the most probable value on the y variable. The effectiveness of this prediction in a sample of values is represented by the discrepancies between the y values and their corresponding predicted values.

When creating a linear regression plot, specify the following options:

  • x variable name: Specify the name of the x variable

  • y variable name: Specify the name of the y variable

  • Mark: Specify a marker type. Choose from point and square
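
The fit and the discrepancies mentioned above can be sketched as follows, assuming ordinary least squares is used to choose \(a\) and \(b\) (this document does not name the fitting method). The function `fit_line` and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx           # slope
    b = mean_y - a * mean_x  # intercept
    return a, b

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
a, b = fit_line(xs, ys)
# Residuals: discrepancies between observed and predicted y values.
residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
```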

LOESS Regression

This plot predicts a set of values on a variable y from values on a variable x by fitting a locally linear function (\(ax + b\)) that determines the most probable y variable values based on the available x variable values. The effectiveness of this prediction in a sample of values is represented by the discrepancies between the y values and their corresponding predicted values.

When creating a LOESS regression plot, specify the following options:

  • x variable name: Specify the name of the x variable

  • y variable name: Specify the name of the y variable

  • Mark: Specify a marker type. Choose from point and square

  • Bandwidth: Specify the proportion of cases to include in each smoothing window. This is set to 0.5 by default
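
A minimal sketch of locally linear smoothing shows how the bandwidth controls the window. The tricube weighting and the nearest-neighbor window are assumptions borrowed from the standard LOESS formulation, since this document does not describe the kernel; the function name `loess` is hypothetical.

```python
import math

def loess(xs, ys, bandwidth=0.5):
    """Locally linear smoothing evaluated at each x.

    Assumptions: tricube weights and a window whose radius is the
    distance to the ceil(bandwidth * n)-th nearest point.
    """
    n = len(xs)
    k = max(2, math.ceil(bandwidth * n))
    fitted = []
    for x0 in xs:
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0  # window radius
        # Tricube weights: 1 at x0, falling to 0 at distance h.
        w = [max(0.0, 1 - (abs(x - x0) / h) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        a = sxy / sxx if sxx else 0.0
        # Weighted local line evaluated at x0.
        fitted.append(my + a * (x0 - mx))
    return fitted

xs = [float(i) for i in range(10)]
ys = [2 * x + 1 for x in xs]
fitted = loess(xs, ys, bandwidth=0.5)
```

A larger bandwidth draws more cases into each local fit and produces a smoother curve; a smaller one follows the data more closely.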

Parallel Coordinates Plot

This plot is used for comparing multiple variables. Each variable has its own vertical axis in the plot, and each profile connects the values on the axes for a single observation. If the data contains clusters, these profiles are color-coded by their cluster number.

When creating a parallel coordinates plot, specify the following options:

  • Variable names: Specify one or more variables to use. If none are specified, all the variables in the dataset are used

  • Permute: Specify whether to reorder variables using singular value decomposition (SVD)

  • Transpose: Specify whether to switch the X-axis and Y-axis

  • Cluster: Specify whether to include k-Means cluster variables. Unique colors are assigned for each cluster ID
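
The profiles in a parallel coordinates plot can be sketched as one polyline per observation. Scaling each variable to the range 0 to 1 so that every axis has the same height is an assumption made for illustration; the function `profiles` and the sample rows are hypothetical.

```python
def profiles(rows, variables):
    """Return one polyline per row, each variable scaled to [0, 1]."""
    ranges = {}
    for var in variables:
        col = [row[var] for row in rows]
        ranges[var] = (min(col), max(col))
    result = []
    for row in rows:
        line = []
        for var in variables:
            lo, hi = ranges[var]
            # Constant variables are drawn at mid-axis.
            line.append((row[var] - lo) / (hi - lo) if hi > lo else 0.5)
        result.append(line)
    return result

rows = [{"a": 0, "b": 10}, {"a": 5, "b": 30}, {"a": 10, "b": 20}]
lines = profiles(rows, ["a", "b"])
```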

Probability Plot

This plot evaluates the skewness of a distribution by plotting two cumulative distribution functions against each other.

When creating a probability plot, specify the following options:

  • x variable name: Specify the name of the x variable

  • Distribution: Specify a distribution type. Choose from normal and uniform

  • Mark: Specify a marker type. Choose from point and square

  • Transpose: Specify whether to switch the X-axis and Y-axis
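
One way to read "two cumulative distribution functions" is an empirical CDF plotted against a theoretical one. The sketch below takes that reading; the plotting position and the way distribution parameters are estimated from the data are assumptions, and the function name `pp_points` is hypothetical.

```python
from statistics import NormalDist, mean, stdev

def pp_points(values, distribution="normal"):
    """Pair empirical and theoretical CDF values for each observation."""
    data = sorted(values)
    n = len(data)
    if distribution == "normal":
        # Assumption: parameters are estimated from the sample.
        cdf = NormalDist(mean(data), stdev(data)).cdf
    else:
        # Uniform on [min, max]; requires at least two distinct values.
        lo, hi = data[0], data[-1]
        cdf = lambda v: (v - lo) / (hi - lo)
    # Assumption: empirical CDF uses (rank - 0.5) / n with 1-based ranks.
    return [((i + 0.5) / n, cdf(v)) for i, v in enumerate(data)]

sample = [2.0, 4.0, 6.0, 8.0, 10.0]
uniform_points = pp_points(sample, distribution="uniform")
normal_points = pp_points(sample, distribution="normal")
```

Points that fall close to the diagonal suggest that the data follow the chosen distribution; systematic bowing away from it suggests skewness.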

Quantile Plot

This plot compares two probability distributions by plotting their quantiles against each other.

When creating a quantile plot, specify the following options:

  • x variable name: Specify the name of the x variable

  • y variable name: Specify the name of the y variable

  • Distribution: Specify a distribution type. Choose from normal and uniform

  • Mark: Specify a marker type. Choose from point and square

  • Transpose: Specify whether to switch the X-axis and Y-axis
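
Pairing the quantiles of two samples can be sketched as follows. The example compares the x and y variables directly and ignores the Distribution option, whose exact role is not described here; the function `qq_points` and the choice of deciles are hypothetical.

```python
from statistics import quantiles

def qq_points(xs, ys, n=10):
    """Pair matching quantiles of two samples (n - 1 cut points)."""
    qx = quantiles(xs, n=n, method="inclusive")
    qy = quantiles(ys, n=n, method="inclusive")
    return list(zip(qx, qy))

xs = list(range(1, 22))
ys = [2 * x for x in xs]
pairs = qq_points(xs, ys)
```

When the two distributions have the same shape, the pairs fall on a straight line, as they do in this sample where y is exactly twice x.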

Scatterplot

This plot represents the values of two variables (y and x) in a frame that contains one point for each row of the input sample data. Scatterplots are useful for analyzing the joint distribution of two variables.

When creating a scatterplot, specify the following options:

  • x variable name: Specify the name of the x variable

  • y variable name: Specify the name of the y variable

  • Mark: Specify a marker type. Choose from point and square