Binary Classification
Introduction
Each app is a specialized intelligence that performs a single prediction task and is trained from one or more datasets. An app gives you a space to define the task, then train and compare several models to achieve the goal. The "classification" app, as its name suggests, performs classification tasks. This page covers the binary case, where the target column has exactly two classes.
Using the SDK to create binary classification applications
This section shows how to build a binary classification application on the AI & Analytics Engine.
You can download the dataset used in this example here: german-credit.csv.
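Before uploading the file, it can help to check which columns hold numbers and which hold text, since the dataset schema in the example declares a data type for every column. The snippet below is a minimal sketch using only the Python standard library. It is not the SDK's `print_schema` utility; the helper names `is_number` and `guess_column_types`, the "Numeric"/"Text" strings, and the three-row sample are all made up for illustration.

```python
import csv
import io

# Hypothetical excerpt with a few of the dataset's columns.
# The rows are invented for illustration and are not taken from the real file.
SAMPLE_CSV = """duration,credit_amount,purpose,class
6,1169,radio/tv,good
48,5951,radio/tv,bad
12,2096,education,good
"""


def is_number(text):
    """Return True if the string parses as a float."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def guess_column_types(rows):
    """Label a column 'Numeric' if every value parses as a number, else 'Text'."""
    if not rows:
        return {}
    return {
        name: "Numeric" if all(is_number(row[name]) for row in rows) else "Text"
        for name in rows[0]
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
column_types = guess_column_types(rows)
for name, kind in column_types.items():
    print(f"{name}: {kind}")
```

To run the same check on the downloaded file, replace `io.StringIO(SAMPLE_CSV)` with an open file handle. Treat the result as a starting point only: a column of numeric codes may still be better declared as text.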
from aiaengine import Org, FileSource, Column, DataType, ClassificationConfig, ClassificationSubType
# create a new demo project in the org
org_id = 'b6240512-cd17-43a0-8297-84c51c1bc5a0' # replace with your org ID
org = Org(org_id)
project = org.create_project(name="Demo project using Python SDK", description="Your demo project")
# or you can get an existing project that you want to work on
# project = Project(id='ID_of_your_project') # replace with your own project ID
# import the `German Credit Data` dataset
data_file = 'examples/datasets/german-credit.csv'
# You can use the `print_schema` utility function to print the auto-inferred schema
# print_schema(pd.read_csv(data_file, header=0))
dataset = project.create_dataset(
    name="German Credit Data",
    data_source=FileSource(
        file_urls=[data_file],
        schema=[
            Column('checking_status', DataType.Text),
            Column('duration', DataType.Numeric),
            Column('credit_history', DataType.Text),
            Column('purpose', DataType.Text),
            Column('credit_amount', DataType.Numeric),
            Column('savings_status', DataType.Text),
            Column('employment', DataType.Text),
            Column('installment_commitment', DataType.Numeric),
            Column('personal_status', DataType.Text),
            Column('other_parties', DataType.Text),
            Column('residence_since', DataType.Numeric),
            Column('property_magnitude', DataType.Text),
            Column('age', DataType.Numeric),
            Column('other_payment_plans', DataType.Text),
            Column('housing', DataType.Text),
            Column('existing_credits', DataType.Numeric),
            Column('job', DataType.Text),
            Column('num_dependents', DataType.Numeric),
            Column('own_telephone', DataType.Text),
            Column('foreign_worker', DataType.Text),
            Column('class', DataType.Text)
        ]
    )
)
# ID of the input dataset used to create the application
dataset_id = dataset.id
app = project.create_app(
    name="Predict Customer Credit - Binary Classification",
    dataset_id=dataset_id,
    config=ClassificationConfig(
        sub_type=ClassificationSubType.BINARY,
        target_column="class",
        positive_class_label="good",
        negative_class_label="bad"
    )
)
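The configuration above names "good" as the positive class and "bad" as the negative class, so the `class` column should contain exactly those two labels. The following is a minimal sketch of a local sanity check you could run before creating the app. It uses only the Python standard library and does not call the SDK; `count_labels`, `check_binary_target`, and the sample rows are hypothetical names and data introduced for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt; the rows are invented for illustration only.
SAMPLE_CSV = """duration,credit_amount,class
6,1169,good
48,5951,bad
12,2096,good
"""


def count_labels(csv_text, target_column):
    """Count how often each label appears in the target column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[target_column] for row in reader)


def check_binary_target(label_counts, positive, negative):
    """Raise ValueError unless the labels are exactly {positive, negative}."""
    expected = {positive, negative}
    found = set(label_counts)
    unexpected = found - expected
    if unexpected:
        raise ValueError(f"unexpected labels in target: {sorted(unexpected)}")
    missing = expected - found
    if missing:
        raise ValueError(f"labels never seen in target: {sorted(missing)}")


label_counts = count_labels(SAMPLE_CSV, "class")
check_binary_target(label_counts, positive="good", negative="bad")
print(dict(label_counts))
```

The counts also give a quick view of class balance, which is worth knowing when you later compare models on this app.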