What Is ML Studio?
[This topic is pre-release documentation and is subject to change in future releases. Blank topics are included as placeholders.]

The ML Studio interactive workspace

To develop a predictive analysis model, you typically use data from one or more sources, transform and analyze that data through various data manipulation and statistical functions, and generate a set of results. Developing a model like this is an iterative process: as you modify the functions and their parameters, your results converge until you are satisfied that you have a trained, effective model.

ML Studio gives you an interactive, visual workspace in which to build, test, and iterate on a predictive analysis model. You drag and drop datasets and analysis modules onto an interactive canvas and connect them to form an experiment, which you run in ML Studio. To iterate on your model design, you edit the experiment, save a copy if desired, and run it again. When you're ready, you can publish your experiment as a web service so that others can access your model.

No programming is required. You construct your predictive analysis model by visually connecting datasets and modules.

Getting started with ML Studio

When you first enter ML Studio, you see the following tabs on the left:

  • Studio Home - A set of links to documentation and other resources.

  • EXPERIMENTS - Experiments that have been created, run, and saved as drafts.

  • WEB SERVICES - A list of experiments that you have published.

  • SETTINGS - A collection of settings that you can use to configure your account and resources.

Note

When you are constructing an experiment, a working list of available datasets and modules is displayed to the left of the canvas. That is the list of components you use to build your model.

Components of an experiment

An experiment consists of datasets that provide data to analytical modules, which you connect together to construct a predictive analysis model. Specifically, a valid experiment has these characteristics:

  • It has at least one dataset and one module.

  • Datasets may be connected only to modules.

  • Modules may be connected either to datasets or to other modules.

  • All input ports for modules must have some connection to the data flow.

  • All required parameters for a module must be set.
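ML Studio checks these characteristics for you on the canvas, so no code is needed to use it. Purely as an illustration, the rules above can be sketched as a small graph check in Python. The `Module`, `Experiment`, and `validate` names, the port and parameter names, and the in-memory layout are all hypothetical and are not part of ML Studio. The sketch also assumes that "some connection to the data flow" means a direct incoming connection on each input port.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Module:
    """Hypothetical stand-in for an analysis module on the canvas."""
    name: str
    input_ports: List[str] = field(default_factory=list)
    required_params: List[str] = field(default_factory=list)
    params: Dict[str, object] = field(default_factory=dict)


@dataclass
class Experiment:
    """Hypothetical in-memory model of an experiment (not an ML Studio API)."""
    datasets: Set[str] = field(default_factory=set)
    modules: Dict[str, Module] = field(default_factory=dict)
    # Each connection is (source name, target module name, target input port).
    connections: List[Tuple[str, str, str]] = field(default_factory=list)


def validate(exp: Experiment) -> List[str]:
    """Return the rule violations found; an empty list means the experiment is valid."""
    errors = []

    # Rule 1: at least one dataset and one module.
    if not exp.datasets:
        errors.append("no dataset")
    if not exp.modules:
        errors.append("no module")

    # Rules 2 and 3: every connection ends at a module, and its source
    # is either a dataset or another module.
    connected_ports = set()
    for source, target, port in exp.connections:
        if target not in exp.modules:
            errors.append(f"{source} -> {target}: target is not a module")
        elif source not in exp.datasets and source not in exp.modules:
            errors.append(f"{source} -> {target}: unknown source")
        else:
            connected_ports.add((target, port))

    for module in exp.modules.values():
        # Rule 4: every input port has an incoming connection (assumption:
        # a direct connection counts as being part of the data flow).
        for port in module.input_ports:
            if (module.name, port) not in connected_ports:
                errors.append(f"{module.name}: input port '{port}' is not connected")
        # Rule 5: every required parameter is set.
        for param in module.required_params:
            if param not in module.params:
                errors.append(f"{module.name}: required parameter '{param}' is not set")

    return errors


# Usage: one dataset feeding two chained modules; the second module is
# missing a required parameter, so exactly one violation is reported.
exp = Experiment(
    datasets={"mpg_data"},
    modules={
        "split": Module("split", input_ports=["data"],
                        required_params=["fraction"], params={"fraction": 0.7}),
        "train": Module("train", input_ports=["training_data"],
                        required_params=["label_column"]),
    },
    connections=[("mpg_data", "split", "data"),
                 ("split", "train", "training_data")],
)
print(validate(exp))  # ["train: required parameter 'label_column' is not set"]
```

Returning a list of violations rather than stopping at the first one mirrors how you would fix an experiment on the canvas: you see everything that is missing and correct it before running again.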

For an example of creating a simple experiment, see Creating a Simple Experiment. For more complete information on experiments, see Creating and Running Experiments.

Datasets

A dataset is data that has been uploaded to ML Studio so that it can be used in the modeling process. A number of sample datasets are included with ML Studio for you to experiment with, and you can upload more datasets as you need them. Here are some examples of included datasets:

  • MPG data for various automobiles - MPG values for automobiles identified by number of cylinders, horsepower, etc.

  • Breast cancer data - Breast cancer diagnosis data.

  • Forest fires data - Forest fire sizes in northeast Portugal.

As you build an experiment, the working list of datasets is available to the left of the canvas.

For more about datasets, see Getting Data.

Modules

A module is an algorithm that you can run on your data. ML Studio has a number of modules ranging from data ingress functions to training, scoring, and validation processes. Here are some examples of included modules:

As you build an experiment, the working list of modules is available to the left of the canvas.

A module may have a set of parameters that you can use to configure the module's internal algorithms. When you select a module on the canvas, the module's parameters are displayed in the pane to the right of the canvas. You can modify the parameters in that pane to tune your model.

For more about using modules in experiments, see Creating and Editing an Experiment.