Task: Use Polyture’s AutoML to Make Avocado Price Predictions
Data From: Kaggle
Polyture Version: 0.13.13
Contact: team@polyture.com
The first step is to import the historical Avocado pricing data CSV file into the Data Warehouse.
We then grab a CSV data source node from the Import panel, and select the file imported to the warehouse.
In the “Quick Insights” panel, we have to correct the data type of the date column and change it from float to date/time.
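Polyture handles this conversion in the panel. For readers who want to see what the equivalent step looks like in code, here is a minimal sketch in plain Python. The column names, the sample rows, and the YYYY-MM-DD date format are assumptions made for illustration, not details taken from the dataset.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample rows; column names and values are assumptions.
RAW_CSV = """Date,AveragePrice,region
2015-12-27,1.33,Albany
2015-12-20,1.35,Albany
"""


def load_rows(text):
    """Read CSV text and give the Date and AveragePrice columns proper types."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        # Parse the date explicitly instead of trusting an inferred type.
        record["Date"] = datetime.strptime(record["Date"], "%Y-%m-%d").date()
        record["AveragePrice"] = float(record["AveragePrice"])
        rows.append(record)
    return rows


rows = load_rows(RAW_CSV)
print(rows[0]["Date"], type(rows[0]["Date"]).__name__)
```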
Next we use a “Column Filter” node and attach it to the CSV data source.
In the options panel on the bottom right, we select the columns we wish to include in the dataset we will run AutoML on.
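As a rough code analogue of the “Column Filter” node, the sketch below keeps only a chosen set of columns from each row. The function name filter_columns and the sample column names are hypothetical and only serve the illustration.

```python
def filter_columns(rows, keep):
    """Return new rows that contain only the columns listed in keep."""
    return [{name: row[name] for name in keep} for row in rows]


# Hypothetical sample data; the column names are assumptions.
sample = [
    {"Date": "2015-12-27", "AveragePrice": 1.33, "Total Volume": 64236.62, "region": "Albany"},
    {"Date": "2015-12-20", "AveragePrice": 1.35, "Total Volume": 54876.98, "region": "Albany"},
]

filtered = filter_columns(sample, ["Date", "AveragePrice", "Total Volume"])
print(filtered[0])
```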
To check how well the model predicts data it has not seen, we are going to split the data into train and test tables.
We achieve this by connecting a “Train Test Split” node.
In the Settings panel, we choose whether to randomize the data and set the split percentage.
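The two settings map onto a simple procedure: optionally shuffle the rows, then cut the table at the chosen percentage. The sketch below illustrates that idea in plain Python. It is not Polyture's implementation, and the parameter names test_fraction, shuffle, and seed are assumptions.

```python
import random


def train_test_split(rows, test_fraction=0.2, shuffle=True, seed=42):
    """Split rows into (train, test) lists according to test_fraction."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    rows = list(rows)
    if shuffle:
        # A fixed seed keeps the split reproducible between runs.
        random.Random(seed).shuffle(rows)
    n_test = max(1, round(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]


train, test = train_test_split(range(10), test_fraction=0.3)
print(len(train), len(test))
```

With randomization turned off, the same function simply cuts the table in its original order.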
We now take the training data from the split node and attach it to three separate “Send to AutoML” nodes.
We then give each node a name within the settings panel.
In the main navigation panel on the left-hand side, go to “AutoML” and select “New Experiment”.
The system will prompt us to give the experiment a name.
We then click “Settings” and select the data source, as well as the column to predict. Polyture’s AutoML will automatically detect the AutoML type (classification or regression).
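How Polyture decides between the two types is not described here. One common heuristic, shown below purely as an assumption, is to treat a numeric target with many distinct values as regression and everything else as classification. The function name infer_task_type and the max_classes threshold are hypothetical.

```python
def infer_task_type(target_values, max_classes=10):
    """Guess the task type from the values of the target column."""
    values = list(target_values)
    all_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    # Many distinct numbers suggest a continuous target such as a price.
    if all_numeric and len(set(values)) > max_classes:
        return "regression"
    return "classification"


prices = [1.0 + i * 0.01 for i in range(50)]    # hypothetical price column
kinds = ["conventional", "organic"] * 5         # hypothetical category column
print(infer_task_type(prices), infer_task_type(kinds))
```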
Hit the green “Start” button. Polyture will instantly begin training and testing different models. You can use the AutoML dashboard panels to evaluate the performance of the models.
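Conceptually, this kind of search fits several candidate models and ranks them by their error on held-out data. The sketch below shows that loop with two deliberately simple candidates, a mean baseline and a least-squares line, scored by mean absolute error. The models, the metric, and the sample numbers are all assumptions chosen for illustration, not the models Polyture actually trains.

```python
from statistics import fmean


def fit_mean(xs, ys):
    """Baseline: always predict the average training value."""
    mean_y = fmean(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Ordinary least-squares line through the training points."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x if var_x else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def mean_absolute_error(model, xs, ys):
    return fmean(abs(model(x) - y) for x, y in zip(xs, ys))


def pick_best(candidates, train, valid):
    """Fit every candidate on train and return the one with the lowest validation error."""
    scores = {name: mean_absolute_error(fit(*train), *valid)
              for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores


# Hypothetical data: week index versus average price.
train = ([0, 1, 2, 3, 4, 5], [1.00, 1.05, 1.10, 1.15, 1.20, 1.25])
valid = ([6, 7], [1.30, 1.35])
best_name, scores = pick_best({"mean": fit_mean, "line": fit_line}, train, valid)
print(best_name, scores)
```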
Drag and connect a “Deploy AutoML Model” node to the Test data from the “Train Test Split” node.
In an example like this, we use the test data in place of production data as a means to quickly validate the performance of the trained model.
Within the settings panel of the “Deploy AutoML Model” node, select the AutoML model that was trained earlier.
Drag in and connect a “Column Filter” node, and select the date column.
Then, drag in and connect a “Column Append” node, and connect the Avocado price prediction data (from the output of the deployed AutoML model) with the Date/Time column from the test data.
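In code terms, appending columns means joining two tables row by row, which only works if both have the same number of rows in the same order. A minimal sketch of that idea follows; the function name append_columns, the column names, and the values are hypothetical.

```python
def append_columns(left_rows, right_rows):
    """Combine two equally long tables row by row into one wider table."""
    if len(left_rows) != len(right_rows):
        raise ValueError("both tables must have the same number of rows")
    return [{**left, **right} for left, right in zip(left_rows, right_rows)]


# Hypothetical inputs: dates from the test data, prices from the model output.
dates = [{"Date": "2018-03-11"}, {"Date": "2018-03-18"}]
predictions = [{"PredictedPrice": 1.41}, {"PredictedPrice": 1.38}]

combined = append_columns(dates, predictions)
print(combined)
```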
Drag in and connect a “Dashboard Graph” node.
Click on the Graph Editor button to enter the editor.
For this project, we will select Time Series with the X axis being Date/Time and the Y axis being predicted Avocado Prices.
For the best view of the created graphs, we open up the “Dashboard” view.
In the Dashboard view, we can resize the graphs to achieve the best possible visual presentation.
Now that we have generated the graphs, we want to export a CSV file of the data.
To do so, we drag in an “Export Data” node and connect it to the “Column Append” node. From the export settings panel, we select CSV and give the file a name.
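For comparison, the same export can be sketched with Python's csv module. The function name write_csv, the column names, and the values below are assumptions. The example writes to an in-memory buffer, and passing a file opened with open(path, "w", newline="") would write to disk instead.

```python
import csv
import io


def write_csv(rows, fileobj):
    """Write a list of dictionaries to fileobj as CSV with a header row."""
    if not rows:
        return
    writer = csv.DictWriter(fileobj, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical combined table of dates and predicted prices.
table = [
    {"Date": "2018-03-11", "PredictedPrice": 1.41},
    {"Date": "2018-03-18", "PredictedPrice": 1.38},
]

buffer = io.StringIO()
write_csv(table, buffer)
print(buffer.getvalue())
```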