What Is ML Studio?
To develop a predictive analysis model, you typically use data from one or more sources, transform and analyze that data through various data manipulation and statistical functions, and generate a set of results. Developing a model like this is an iterative process: as you modify the various functions and their parameters, your results converge until you are satisfied that you have a trained, effective model.
ML Studio gives you an interactive, visual workspace to easily build, test, and iterate on a predictive analysis model. You drag and drop datasets and analysis modules onto an interactive canvas, connecting them together to form an experiment, which you run in ML Studio. To iterate on your model design, you edit the experiment, save a copy if desired, and run it again. When you're ready, you can publish your experiment as a web service so that your model can be accessed by others.
No programming is required. You construct your predictive analysis model by visually connecting datasets and modules.
When you first enter ML Studio, you see the following tabs on the left:
Studio Home - A set of links to documentation and other resources.
EXPERIMENTS - Experiments that have been created, run, and saved as drafts.
WEB SERVICES - A list of experiments that you have published.
SETTINGS - A collection of settings that you can use to configure your account and resources.
When you are constructing an experiment, a working list of available datasets and modules is displayed to the left of the canvas. That is the list of components you use to build your model.
An experiment consists of datasets that provide data to analytical modules, which you connect together to construct a predictive analysis model. Specifically, a valid experiment has these characteristics:
It has at least one dataset and one module.
Datasets may be connected only to modules.
Modules may connect either to datasets or to other modules.
All input ports for modules must have some connection to the data flow.
All required parameters for a module must be set.
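ML Studio checks these rules for you on the canvas, so you never need to write code for them. Purely as an illustration, the characteristics above can be sketched as a small Python check. The Node class, the validate function, and the tuple format for connections are hypothetical names invented for this sketch; they are not part of ML Studio. The sketch also assumes that data always flows into a module's input port.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A dataset or module on the canvas (hypothetical model, not an ML Studio API)."""
    name: str
    kind: str                      # "dataset" or "module"
    input_ports: int = 0           # number of input ports (modules only)
    required_params: tuple = ()    # names of parameters that must be set
    params: dict = field(default_factory=dict)


def validate(nodes, edges):
    """Return a list of rule violations; an empty list means the experiment is valid.

    nodes: list of Node
    edges: list of (source_name, target_name, target_port) tuples
    """
    by_name = {n.name: n for n in nodes}
    problems = []

    # Rule: at least one dataset and one module.
    kinds = {n.kind for n in nodes}
    if not {"dataset", "module"} <= kinds:
        problems.append("experiment needs at least one dataset and one module")

    # Rule: datasets connect only to modules, so every connection must end at a module.
    fed_ports = set()
    for source, target, port in edges:
        if by_name[target].kind != "module":
            problems.append(f"{source} -> {target}: data can only flow into a module")
        fed_ports.add((target, port))

    for node in nodes:
        if node.kind != "module":
            continue
        # Rule: every module input port is connected to the data flow.
        for port in range(node.input_ports):
            if (node.name, port) not in fed_ports:
                problems.append(f"{node.name}: input port {port} is not connected")
        # Rule: every required parameter is set.
        for param in node.required_params:
            if param not in node.params:
                problems.append(f"{node.name}: required parameter '{param}' is not set")

    return problems


# Usage: one dataset feeding one module whose only required parameter is set.
# The parameter name "example_setting" is made up for this sketch.
dataset = Node("MPG data", "dataset")
module = Node("Example module", "module", input_ports=1,
              required_params=("example_setting",),
              params={"example_setting": 1})
print(validate([dataset, module], [("MPG data", "Example module", 0)]))  # prints []
```

Returning a list of problems, rather than stopping at the first one, mirrors how a visual editor can flag every incomplete module at once.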
A dataset is data that has been uploaded to ML Studio so that it can be used in the modeling process. A number of sample datasets are included with ML Studio for you to experiment with, and you can upload more datasets as you need them. Here are some examples of included datasets:
MPG data for various automobiles - MPG values for automobiles identified by number of cylinders, horsepower, etc.
Breast cancer data - breast cancer diagnosis data
Forest fires data - forest fire sizes in northeast Portugal
As you build an experiment, the working list of datasets is available to the left of the canvas.
For more about datasets, see Getting Data.
A module is an algorithm that you can run on your data. ML Studio has a number of modules ranging from data ingress functions to training, scoring, and validation processes. Here are some examples of included modules:
Convert to ARFF - converts a .NET serialized dataset to ARFF format
Elementary Statistics - calculates elementary statistics such as mean, standard deviation, etc.
Linear Regression - creates an online gradient descent-based linear regression model
Score Model - scores a trained classification or regression model
As you build an experiment, the working list of modules is available to the left of the canvas.
A module may have a set of parameters that you can use to configure the module's internal algorithms. When you select a module on the canvas, the module's parameters are displayed in the pane to the right of the canvas. You can modify the parameters in that pane to tune your model.
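To make the idea of module parameters concrete, here is a minimal sketch of what a gradient descent-based linear regression followed by a scoring step might look like in plain Python. It is not ML Studio's implementation. The function names, the single-feature model, and the learning_rate and epochs parameters are assumptions chosen for illustration; they stand in for the kind of settings you would adjust in the parameters pane.

```python
def fit_linear_regression(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = weight * x + bias by online (per-example) gradient descent.

    learning_rate and epochs are illustrative tuning parameters, not
    the names of actual ML Studio module parameters.
    """
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            # Update after every example, using that example's prediction error.
            error = (weight * x + bias) - y
            weight -= learning_rate * error * x
            bias -= learning_rate * error
    return weight, bias


def score(model, xs):
    """Apply a trained (weight, bias) model to new inputs."""
    weight, bias = model
    return [weight * x + bias for x in xs]


# Toy data that lies exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

model = fit_linear_regression(xs, ys)
predictions = score(model, [5.0])
print(model, predictions)
```

Changing learning_rate or epochs and rerunning is the code equivalent of editing a module's parameters and running the experiment again: a rate that is too large can cause the fit to diverge, and one that is too small needs more passes over the data.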
For more about using modules in experiments, see Creating and Editing an Experiment.