Data Types and Schema
The AI & Analytics Engine (the Engine) uses a simplified data type system. Its column types are deliberately simpler than the types that exist in programming languages.
DataFrames, Columns, and Types
Each dataset on the Engine is represented by a DataFrame. A DataFrame consists of a collection of Columns, where each Column is a vector and all Columns have the same length. Each Column also belongs to one of these types:
|Column Type|Description|Corresponding Types|
|---|---|---|
|Numeric Columns|Integers and continuous real numbers| |
|Boolean Columns|True or False| |
|Categorical Columns|Discrete categories| |
|Text Columns|Free form text| |
|DateTime Columns|Date time| |
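The structure described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Engine's implementation: the names `ColumnType`, `Column`, and `DataFrame` below are hypothetical classes written for this example, and the sample values are made up.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, List


class ColumnType(Enum):
    """The five column types from the table above (hypothetical names)."""
    NUMERIC = "numeric"
    BOOLEAN = "boolean"
    CATEGORICAL = "categorical"
    TEXT = "text"
    DATETIME = "datetime"


@dataclass
class Column:
    name: str
    type: ColumnType
    values: List[Any]


class DataFrame:
    """A collection of Columns that must all have the same length."""

    def __init__(self, columns: List[Column]):
        lengths = {len(c.values) for c in columns}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same length")
        self.columns = columns

    @property
    def num_rows(self) -> int:
        return len(self.columns[0].values) if self.columns else 0


# Illustrative data only.
df = DataFrame([
    Column("age", ColumnType.NUMERIC, [34, 51, 27]),
    Column("is_member", ColumnType.BOOLEAN, [True, False, True]),
    Column("plan", ColumnType.CATEGORICAL, ["basic", "pro", "basic"]),
])
```

Checking the lengths in the constructor means a DataFrame with ragged columns can never be built in this sketch, which mirrors the "same length" requirement.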
A schema is attached to a dataset and contains the column names and the type of each column. A dataset must have a schema before actions can be applied to it. Therefore, every dataset on the AI & Analytics Engine platform has an associated schema, unless it has just been uploaded.
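As a rough illustration of this rule, the sketch below models a schema as a mapping from column names to type names and refuses to apply an action to a dataset that has no schema. The `Dataset` class, `make_schema`, `apply_action`, and the action name are all hypothetical and are not part of the Engine's API.

```python
from typing import Dict, Optional

# Hypothetical type names mirroring the five column types listed above.
COLUMN_TYPES = {"numeric", "boolean", "categorical", "text", "datetime"}


def make_schema(columns: Dict[str, str]) -> Dict[str, str]:
    """Validate a column-name -> column-type mapping and return a copy."""
    for name, col_type in columns.items():
        if col_type not in COLUMN_TYPES:
            raise ValueError(f"unknown column type for {name!r}: {col_type!r}")
    return dict(columns)


class Dataset:
    def __init__(self, name: str, schema: Optional[Dict[str, str]] = None):
        self.name = name
        self.schema = schema  # None models a dataset that was just uploaded

    def apply_action(self, action: str) -> str:
        # Actions require a schema, as described in the text.
        if self.schema is None:
            raise RuntimeError(f"dataset {self.name!r} has no schema yet")
        return f"applied {action!r} to {self.name!r}"


schema = make_schema({"age": "numeric", "plan": "categorical"})
ready = Dataset("customers", schema)
fresh = Dataset("raw_upload")  # just uploaded, no schema attached
```

Here `ready.apply_action(...)` succeeds, while calling the same method on `fresh` raises an error until a schema is attached.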
If the user uploads a dataset without a schema (e.g. a CSV or Parquet file), the platform infers a schema for the dataset and recommends the appropriate casting actions to convert the columns into the right types.
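To make the idea of schema inference concrete, here is a minimal sketch that guesses a column type from the raw string values of a CSV file and then lists the casts it would suggest. The heuristics (the order of checks, the ISO date format, the `max_categories` threshold) are assumptions chosen for this example; the Engine's actual inference rules are not described here and may differ. The sample data is made up.

```python
import csv
import io
from datetime import datetime


def _is_number(s):
    try:
        float(s)
        return True
    except ValueError:
        return False


def _is_datetime(s):
    try:
        datetime.fromisoformat(s)  # assumes ISO 8601 style dates
        return True
    except ValueError:
        return False


def infer_type(values, max_categories=10):
    """Guess a column type from raw string values (illustrative heuristics)."""
    if not values:
        return "text"
    if all(v.lower() in {"true", "false"} for v in values):
        return "boolean"
    if all(_is_number(v) for v in values):
        return "numeric"
    if all(_is_datetime(v) for v in values):
        return "datetime"
    # Few distinct values relative to the row count suggests categories.
    if len(set(values)) <= min(max_categories, len(values) // 2):
        return "categorical"
    return "text"


def infer_schema(csv_text):
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    return {name: infer_type([row[name] for row in rows]) for name in rows[0]}


def recommend_casts(schema):
    # CSV values load as text, so every non-text column needs a cast.
    return [f"cast '{name}' to {t}" for name, t in schema.items() if t != "text"]


SAMPLE_CSV = """age,is_member,plan,signup_date,comment
34,true,basic,2021-01-05,great service
51,false,pro,2021-02-11,slow delivery
27,true,basic,2021-03-02,ok
45,false,pro,2021-03-19,would buy again
"""

schema = infer_schema(SAMPLE_CSV)
casts = recommend_casts(schema)
```

On this sample, `schema` maps `age` to numeric, `is_member` to boolean, `plan` to categorical, `signup_date` to datetime, and `comment` to text, so `casts` contains four suggestions and leaves `comment` alone.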