waffles_plot

A command-line tool for plotting and visualizing datasets. Here's the usage information:

Full Usage Information

[Square brackets] are used to indicate required arguments.
<Angled brackets> are used to indicate optional arguments.

waffles_plot [command]
  Visualize data, plot functions, make charts, etc.

  bar [dataset] <options>
    Make a bar chart using one row of data from the specified dataset. Prints the chart as an SVG file to stdout.
    [dataset]
      The filename of a dataset for the bar chart. The dataset should contain only continuous attributes. It only needs to have one row since the other rows are ignored.
    <options>
      -range [min] [max]
        Specify the min and max values to show on the chart. (The default is to compute the range automatically.)
      -row [n]
        Specify which row in the dataset to use. (The default is 0.)
      -pad [d]
        Specify how much range to show beyond the min and max values when the range is determined automatically. This value is ignored if the range is specified explicitly.
      -thickness [t]
        Specify the thickness of the bars. (Note that the whole chart will be stretched to fit the width, so adjusting the width may also affect bar thickness.)
      -spacing [s]
        Specify how much space to place between the bars.
      -textsize [d]
        Specify the size of the font to use for the text labels.
      -noserifs
        Use a font with no serifs. (Generally, this makes the chart look a little cleaner.)
      -marks [n]
        Specify the maximum number of horizontal lines to use to mark positions on the vertical axis. (Set to 0 if you do not want any markings.)
      -size [width] [height]
        Specify the size of the chart. (The default is 960 540.)
      -labels [l0] [l1] [l2] [etc]
        Specify label strings to use instead of the attribute names. The number of labels specified should match the number of columns in the data.

  equation <options> [equations]
    Plot an equation (or multiple equations) in 2D. Output is printed to stdout as an SVG file.
    <options>
      -size [width] [height]
        Specify the size of the chart. (The default is 960 540.)
      -margin [size]
        Specify the size of the margin for the axis labels. (The default is 100.)
      -horizmarks [n]
        Specify the maximum number of vertical lines to draw to mark position along the horizontal axis.
      -vertmarks [n]
        Specify the maximum number of horizontal lines to draw to mark position along the vertical axis.
      -range [xmin] [ymin] [xmax] [ymax]
        Set the range. (The default is: -10 -5 10 5.)
      -nohmarks
        Do not draw any vertical lines to mark position on the horizontal axis.
      -novmarks
        Do not draw any horizontal lines to mark position on the vertical axis.
      -nogrid
        Do not draw any horizontal or vertical grid lines.
      -noserifs
        Use a font with no serifs. (This generally makes charts look a little cleaner.)
      -aspect
        Adjust the range to preserve the aspect ratio. In other words, make sure that both axes visually have the same scale.
      -thickness [size]
        Specify the thickness of the lines.
    [equations]
      A set of equations separated by semicolons. Since '^' is a special character for many shells, it's usually a good idea to put your equations inside quotation marks. Here are some examples:
        "f1(x)=3*x+2"
        "f1(x)=(g(x)+1)/g(x); g(x)=sqrt(x)+pi"
        "h(bob)=bob^2;f1(x)=3+bar(x,5)*h(x)-(x/foo);bar(a,b)=a*b-b;foo=3.2"
      Only functions whose names are 'f' followed by a number are plotted. Plotting starts with 'f1' and continues in ascending order (f2, f3, and so on) until the next number is not defined. You may define any number of helper functions or constants with any name you like.
      Built-in constants include e and pi. Built-in functions include +, -, *, /, %, ^, abs, acos, acosh, asin, asinh, atan, atanh, ceil, cos, cosh, erf, floor, gamma, lgamma, log, max, min, normal, sin, sinh, sqrt, tan, and tanh. These generally have the same meaning as in C, except that '^' means exponent, "gamma" is the gamma function, "normal" is the standard normal pdf, and max and min accept any number (>=1) of parameters.
      (Some of these functions may not be available on Windows, but most of them are.) You can override any built-in constant or function with your own variable or function, so you don't need to worry too much about name collisions. Variables must begin with an alphabetic character or an underscore. Multiplication is never implicit, so you must use a '*' character to multiply. Whitespace is ignored.

  histogram [dataset] <options>
    Make a histogram. Print the plot to stdout in SVG format.
    [dataset]
      The filename of a dataset for the histogram.
    <options>
      -size [width] [height]
        Specify the size of the chart. (The default is 1024 1024.)
      -attr [index]
        Specify which attribute is charted. (The default is 0.)
      -range [xmin] [xmax] [ymax]
        Specify the range of the histogram plot. (Note that ymin is always 0.)

  printdecisiontree [model-file] <dataset> <data_opts>
    Print a textual representation of a decision tree to stdout.
    [model-file]
      The filename of a trained decision tree model. (You can make one with the command "waffles_learn train [dataset] decisiontree > [filename]".)
    <dataset>
      An optional filename of the arff file that was used to train the decision tree. The data in this file is ignored, but the meta-data will be used to make the printed model richer.
    <data_opts>
      -labels [attr_list]
        Specify which attributes to use as labels. (If not specified, the default is to use the last attribute for the label.) [attr_list] is a comma-separated list of zero-indexed columns. A hyphen may be used to specify a range of columns. A '*' preceding a value means to index from the right instead of the left. For example, "0,2-5" refers to columns 0, 2, 3, 4, and 5. "*0" refers to the last column. "0-*1" refers to all but the last column.
      -ignore [attr_list]
        Specify attributes to ignore. [attr_list] is a comma-separated list of zero-indexed columns. A hyphen may be used to specify a range of columns. A '*' preceding a value means to index from the right instead of the left.
        For example, "0,2-5" refers to columns 0, 2, 3, 4, and 5. "*0" refers to the last column. "0-*1" refers to all but the last column.

  printrandomforest [model-file] <dataset> <data_opts>
    Print a textual representation of a random forest to stdout.
    [model-file]
      The filename of a trained random forest model. (You can make one with the command "waffles_learn train [dataset] randomforest [trees] > [filename]".)
    <dataset>
      An optional filename of the arff file that was used to train the random forest. The data in this file is ignored, but the meta-data will be used to make the printed model richer.
    <data_opts>
      -labels [attr_list]
        Specify which attributes to use as labels. (If not specified, the default is to use the last attribute for the label.) [attr_list] is a comma-separated list of zero-indexed columns. A hyphen may be used to specify a range of columns. A '*' preceding a value means to index from the right instead of the left. For example, "0,2-5" refers to columns 0, 2, 3, 4, and 5. "*0" refers to the last column. "0-*1" refers to all but the last column.
      -ignore [attr_list]
        Specify attributes to ignore. [attr_list] is a comma-separated list of zero-indexed columns. A hyphen may be used to specify a range of columns. A '*' preceding a value means to index from the right instead of the left. For example, "0,2-5" refers to columns 0, 2, 3, 4, and 5. "*0" refers to the last column. "0-*1" refers to all but the last column.

  scatter [dataset] <globalopts> <color-x-y>
    Make a scatter plot or line graph. Print the resulting SVG file to stdout.
    [dataset]
      The filename of a dataset to be plotted. The first attribute specifies the values on the horizontal axis. All other attributes specify the values on the vertical axis for a certain color.
    <globalopts>
      -size [width] [height]
        Specify the size of the chart. (The default is 960 540.)
      -margin [size]
        Specify the size of the margin for the axis labels. (The default is 100.)
      -horizmarks [n]
        Specify the maximum number of vertical lines to draw to mark position along the horizontal axis.
      -vertmarks [n]
        Specify the maximum number of horizontal lines to draw to mark position along the vertical axis.
      -pad [n]
        Specify the ratio of extra space to include in the range of the chart beyond the most-extreme points. (This value is only used if the range is auto-determined. It is ignored if the range is specified explicitly.)
      -range [xmin] [ymin] [xmax] [ymax]
        Set the range for the chart. (The default is to determine the range automatically.)
      -logx
        Show the horizontal axis on a logarithmic scale.
      -logy
        Show the vertical axis on a logarithmic scale.
      -nohmarks
        Do not draw any vertical lines to mark position on the horizontal axis.
      -novmarks
        Do not draw any horizontal lines to mark position on the vertical axis.
      -nogrid
        Do not draw any horizontal or vertical grid lines.
      -noserifs
        Use a font with no serifs. (This generally makes charts look a little cleaner.)
      -hlabel [string]
        Specify a label for the horizontal axis. (The default is to determine it from the data.)
      -vlabel [string]
        Specify a label for the vertical axis. (The default is to determine it from the data.)
      -aspect
        Adjust the range to preserve the aspect ratio. In other words, make sure that both axes visually have the same scale.
      -horizattr [n]
        Make a grid of charts, instead of just a single chart, and specify the attribute that differs along the horizontal axis of the grid of charts. An equal number of samples must exist for every value in this attribute.
      -vertattr [n]
        Make a grid of charts, instead of just a single chart, and specify the attribute that differs along the vertical axis of the grid of charts. An equal number of samples must exist for every value in this attribute.
    <color-x-y>
      [color] [attr-x] [attr-y] <options>
        [color]
          Specify the color to use for this pair of attributes.
          row
            Use a spectrum color according to the row-index in the data (starting with red, ending with purple).
          #800000
            Red.
          red
            The same as #800000.
          pink
            The same as #ffc0c0.
          peach
            The same as #ffc080.
          orange
            The same as #ff8000.
          brown
            The same as #a06000.
          yellow
            The same as #d0d000.
          green
            The same as #008000.
          cyan
            The same as #008080.
          blue
            The same as #000080.
          purple
            The same as #8000ff.
          magenta
            The same as #800080.
          black
            The same as #000000.
          gray
            The same as #808080.
          0
            Use the value in attribute 0 to determine the color.
          1
            Use the value in attribute 1 to determine the color.
          2
            Use the value in attribute 2 to determine the color.
          3
            Use the value in attribute 3 to determine the color. (And so forth.)
        [attr-x]
          The zero-based index of the attribute to use to specify position on the horizontal axis. (Alternatively, the special value "row" may be used to use the row-index instead of an attribute for the horizontal axis.)
        [attr-y]
          The zero-based index of the attribute to use to specify position on the vertical axis. (Alternatively, the special value "row" may be used to use the row-index instead of an attribute for the vertical axis.)
        <options>
          -radius
            Specify the radius (in window units) to use for each point.
          -thickness
            Specify the thickness (in window units) of the lines to use to connect the points. (Use 0 if you want a scatter plot with no connecting lines.)

  percentsame [dataset1] [dataset2]
    Given two data files, count the number of identical values in the same place in each dataset. The result is printed as a percent for each column. The data files must have the same number and type of attributes as well as the same number of rows.

  semanticmap [model-file] [dataset] <options>
    Write an SVG file representing a semantic map for the given self-organizing map processing the given dataset. For each node n, a semantic map plots, at n's location in the map, one attribute (usually a class label) of the entry of the input data to which n responds most strongly.
    [model-file]
      The self-organizing map output from "waffles_transform som".
    [dataset]
      Data for the semantic map in .arff format. Any attributes beyond the number needed for input to the self-organizing map are ignored when determining SOM node responses.
    <options>
      -out [filename]
        Write the SVG file to filename. The default is "semantic_map.svg".
      -labels [column]
        Use the attribute column for labeling. Column is a zero-based index into the attributes. The default is to use the last column.
      -variance
        Label the nodes with the variance of the label column values for their winning dataset entries. If the label column is a variable being predicted, then its variance is related to the predictive power of that node. Higher variance means lower predictive power.

  stats [dataset] <options>
    Prints some basic stats about the dataset to stdout.
    [dataset]
      The filename of a dataset.
    <options>
      -all
        Print stats for all attributes, even if there are a lot of them.

  usage
    Print usage information.
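
The [attr_list] syntax accepted by the -labels and -ignore options of printdecisiontree and printrandomforest can be sketched in a few lines of Python. This is only an illustration of the rules stated above (comma-separated zero-based columns, hyphen ranges, and '*' to index from the right); it is not the tool's own parser. The function name parse_attr_list, the column_count parameter, and the handling of empty or out-of-range entries are assumptions made for this example.

```python
def parse_attr_list(spec: str, column_count: int) -> list:
    """Expand an [attr_list] string such as "0,2-5" into zero-based column indexes.

    Hypothetical helper: mirrors the documented rules, not the waffles source.
    """

    def resolve(token: str) -> int:
        token = token.strip()
        # A leading '*' means "index from the right": *0 is the last column.
        if token.startswith("*"):
            index = column_count - 1 - int(token[1:])
        else:
            index = int(token)
        # Assumption: out-of-range columns are treated as an error.
        if not 0 <= index < column_count:
            raise ValueError(f"column {token!r} is out of range")
        return index

    columns = []
    for part in spec.split(","):
        if not part.strip():
            continue  # Assumption: empty entries are skipped.
        # A hyphen separates the two (inclusive) ends of a range.
        first, dash, last = part.partition("-")
        start = resolve(first)
        end = resolve(last) if dash else start
        if end < start:
            raise ValueError(f"range {part.strip()!r} runs backwards")
        columns.extend(range(start, end + 1))
    return columns


if __name__ == "__main__":
    # The three examples from the usage text, for a dataset with 8 columns.
    print(parse_attr_list("0,2-5", 8))
    print(parse_attr_list("*0", 8))
    print(parse_attr_list("0-*1", 8))
```

With an 8-column dataset, "0,2-5" expands to columns 0, 2, 3, 4, and 5; "*0" to column 7; and "0-*1" to columns 0 through 6, matching the three examples in the usage text.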