Multi-class Classification
Introduction
Each app is dedicated to a single prediction task and is trained on one or more datasets. An app gives you a space to define the task, then train and compare several models to achieve that goal. The "classification" app, as its name suggests, performs classification.
Using the SDK to create multi-class classification applications
This section shows how to build a multi-class classification application on the AI & Analytics Engine.
You can download the dataset used in this example here: penguin-classification.csv.
from aiaengine import Org, FileSource, Column, DataType, ClassificationConfig, ClassificationSubType
# create a new demo project in the org
org = Org(id='b6240512-cd17-43a0-8297-84c51c1bc5a0') # replace with your org ID
project = org.create_project(name="Demo project using Python SDK", description="Your demo project")
# or you can get an existing project that you want to work on
# project = Project(id='ID_of_your_project') # replace with your own project ID
# import the `Penguin Species` classification dataset
data_file = 'examples/datasets/penguins_classification.csv'
# You can use the `print_schema` utility function to print the auto-inferred schema
# print_schema(pd.read_csv(data_file, header=0))
dataset = project.create_dataset(
    name="Penguin Species",
    data_source=FileSource(
        file_urls=[data_file],
        schema=[
            Column('Culmen Length (mm)', DataType.Numeric),
            Column('Culmen Depth (mm)', DataType.Numeric),
            Column('Species', DataType.Text)
        ]
    )
)
# set the ID of the input dataset used for creating the application
dataset_id = dataset.id
# use the ClassificationConfig class with the MULTI_CLASS sub_type
app = project.create_app(
    name="Predict Penguin Species - Multi-class Classification",
    dataset_id=dataset_id,
    config=ClassificationConfig(
        sub_type=ClassificationSubType.MULTI_CLASS,
        target_column="Species",
    )
)
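A multi-class target normally holds more than two distinct labels, so it can be worth checking the target column locally before creating the app. The following is a minimal sketch that uses only the Python standard library, not the SDK. The helper name `class_counts` and the sample rows are hypothetical and made up for illustration; only the column names come from the schema above.

```python
import csv
from collections import Counter
from typing import Iterable


def class_counts(lines: Iterable[str], target_column: str) -> Counter:
    """Count how often each label appears in the target column of a CSV.

    `lines` can be an open text file or any iterable of CSV lines.
    This is a hypothetical helper, not part of the aiaengine SDK.
    """
    reader = csv.DictReader(lines)
    if reader.fieldnames is None or target_column not in reader.fieldnames:
        raise KeyError(f"column {target_column!r} not found in CSV header")
    return Counter(row[target_column] for row in reader)


# Made-up sample rows that reuse the column names from the schema above.
sample = [
    "Culmen Length (mm),Culmen Depth (mm),Species",
    "39.1,18.7,Adelie",
    "46.5,17.9,Chinstrap",
    "50.0,15.2,Gentoo",
    "38.6,17.2,Adelie",
]

counts = class_counts(sample, "Species")

# More than two distinct labels suggests a multi-class problem.
is_multi_class = len(counts) > 2
print(counts, is_multi_class)
```

To run the same check on the real file, pass an open file handle instead of the sample list, for example `with open(data_file, newline="") as f: counts = class_counts(f, "Species")`.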