User Guide¶
About¶
Aliro is a platform to help researchers leverage supervised machine learning techniques to analyze data without needing an extensive data science background, and it can also assist more experienced users with tasks such as choosing appropriate models for data. Users interact with Aliro via a web interface that allows them to execute machine learning experiments and explore generated models. An AI recommendation engine automatically chooses appropriate models and parameters. Dataset profiles are generated and added to a knowledgebase as experiments are run, and the recommendation engine learns from this knowledgebase to give more informed recommendations as it is used. This allows the AI recommender to become tailored to specific data environments. Aliro comes with an initial knowledgebase generated from the PMLB benchmark suite.
Aliro¶
Requirements¶
Docker
Most recent stable release, minimum version is 17.06.0
Runtime Memory: (Mac and Windows only) We recommend configuring the Docker VM with at least 6 GB of runtime memory (Mac configuration, Windows configuration). By default, the Docker VM on Windows or Mac starts with 2 GB of runtime memory.
File Sharing: (Windows only) Share the drive that will contain the Aliro directory with Docker by opening Docker Desktop, navigating to Resources->File Sharing and sharing the drive (see Docker Desktop File Sharing).
Docker-Compose (Version 1.22.0 or greater, Linux only) - A separate installation is only needed on Linux; docker-compose is bundled with the Windows and Mac Docker installations.
Installation¶
Download the production zip Aliro-*.zip from the asset section of the latest release. Note that this is different from the source code zip file.
Unzip the archive
AliroEd¶
Aliro on the Raspberry Pi 400¶
We have built a custom Raspberry Pi OS image containing Aliro, configured to be up and running as soon as you boot the operating system.
AliroEd Requirements¶
A computer running Windows 10 or higher
A Card Reader
If your computer does not have an integrated card reader, you will need a USB card reader.
MicroSD Card
Minimum capacity: 32GB
Note: There are different speed classes for MicroSD Cards: Application Performance Class 1 (A1) and Application Performance Class 2 (A2). A2 cards are highly recommended because they are much faster than A1 cards.
A copy of the aliro-imager.exe
AliroEd Installation¶
Insert the MicroSD Card in your card reader.
Double-click the aliro-imager.exe on your computer. If prompted to allow the application to run, select Yes. You may need to enter your computer’s Administrator password to continue.
Follow the prompts to proceed with the installation.
Once installed, you can run the AliroEd Imager from the Start Menu. When the program starts up you will see this screen:
Click the CHOOSE STORAGE button and select your MicroSD Card from the popup menu.
Click the WRITE button to begin writing the Operating System to your MicroSD Card.
All existing data on the MicroSD Card will be erased. Click Yes at the prompt to proceed.
on the AliroEd Imager.
Insert the MicroSD Card into your Raspberry Pi 400 and start it up.
When the Operating System has finished starting up, double-click the AliroEd icon on the Desktop or launch the Web Browser.
Now that Aliro is up and running, you are ready to run experiments. AliroEd comes preloaded with some datasets for you to experiment with. Other datasets can be downloaded from the Penn Machine Learning Benchmarks.
You may also load your own datasets. See the sections below for instructions.
Using Aliro¶
Starting and Stopping¶
To start Aliro, navigate to the Aliro directory from the command line and run the command docker-compose up. It may take a few minutes to build the first time Aliro is run. To stop Aliro, kill the process with Ctrl+C and wait for the server to shut down.
To reset the datasets and experiments in the server, start Aliro with the command docker-compose up --force-recreate or run the command docker-compose down after the server has stopped.
User Interface¶
Once the web server is up, connect to http://localhost:5080/ to access the website. You should see the Datasets page. If it is your first time starting Aliro, there should be a message instructing you to add new datasets.
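Because the first start can take a few minutes, it can be convenient to check from a script whether the web server is answering before opening the browser. The following is a minimal sketch and not part of Aliro: the function name aliro_is_up and the timeout value are illustrative assumptions, and only the URL comes from this guide.

```python
import urllib.request


def aliro_is_up(url="http://localhost:5080/", timeout=5.0):
    """Return True if an HTTP server answers at `url` with a non-error status.

    Hypothetical helper for illustration; Aliro does not ship this function.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # URLError, HTTPError, refused connections and timeouts are all OSError subclasses.
        return False
```

Calling aliro_is_up() with no arguments checks the default address given above. A True result only shows that something is serving HTTP on that port, not that Aliro has finished loading its datasets.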
Adding Datasets¶
You can add new datasets using a UI form within the website or manually add new datasets to the data directory. Datasets have the following restrictions:
Datasets must have the extension .csv or .tsv
Datasets cannot have any null or empty values
Dataset values must be numeric. All categorical features must be encoded to a numeric value prior to upload.
Files must be smaller than 8 MB
Some example datasets can be found in the classification section of the Penn Machine Learning Benchmarks github repository.
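The restrictions above can be checked locally before uploading. The sketch below is an illustrative pre-check, not Aliro's own validation code: the function name check_dataset is hypothetical, the 8 MB limit is assumed to mean 8 * 1024 * 1024 bytes, and a row with a different number of fields than the header is treated as a missing value.

```python
import csv
import os

MAX_BYTES = 8 * 1024 * 1024  # assumption: the 8 MB limit is read as 8 MiB


def check_dataset(path):
    """Return a list of problems found; an empty list means none were detected."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in (".csv", ".tsv"):
        return [f"unsupported extension: {ext or '(none)'}"]
    problems = []
    if os.path.getsize(path) >= MAX_BYTES:
        problems.append("file is not smaller than 8 MB")
    delimiter = "\t" if ext == ".tsv" else ","
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, None)
        if header is None:
            return problems + ["file is empty"]
        for line_no, row in enumerate(reader, start=2):
            if len(row) != len(header):
                problems.append(f"line {line_no}: expected {len(header)} fields")
                continue
            for name, value in zip(header, row):
                if value.strip() == "":
                    problems.append(f"line {line_no}: empty value in '{name}'")
                    continue
                try:
                    float(value)  # every value must parse as a number
                except ValueError:
                    problems.append(f"line {line_no}: non-numeric value in '{name}'")
    return problems
```

An empty result does not guarantee that Aliro will accept the file. It only means that none of the four listed restrictions was violated under the assumptions stated above.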
Uploading Using the Website¶
To upload new datasets from the website, click the “Add new Datasets” button on the Datasets page to navigate to the upload form. Select a file using the form’s file browser and enter the corresponding information about the dataset: the name of the dependent column, a JSON of key/value pairs of ordinal features, for example {"ord" : ["first", "second", "third"]}, and a comma separated list of categorical column names without quotes, such as cat1, cat2. Once uploaded, the dataset should be available to use within the system.
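If the column information already exists in a script or notebook, the two text values for the form can be generated instead of typed by hand. This is a small sketch under that assumption: the function name upload_form_values and the dictionary keys are illustrative labels, not names used by the Aliro form.

```python
import json


def upload_form_values(target, ordinal=None, categorical=None):
    """Build the three text values the upload form asks for (labels are hypothetical)."""
    return {
        "dependent_column": target,
        # Ordinal features: a JSON object mapping column name to its ordered values.
        "ordinal_features": json.dumps(ordinal or {}),
        # Categorical features: comma separated column names, without quotes.
        "categorical_features": ", ".join(categorical or []),
    }


values = upload_form_values(
    "class",
    ordinal={"ord": ["first", "second", "third"]},
    categorical=["cat1", "cat2"],
)
for field, text in values.items():
    print(f"{field}: {text}")
```

The printed strings can be pasted into the matching form fields. Generating the JSON with json.dumps avoids quoting mistakes that are easy to make when writing it manually.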
Adding Initial Datasets to the Data Directory¶
Labeled datasets can also be loaded when Aliro starts by adding them to the data/datasets/user directory. Data can be placed in subfolders in this directory. Aliro must be restarted if new datasets are added while it is running. If errors are encountered when validating a dataset, they will appear in the log file target/logs/loadInitialDatasets.log and that dataset will not be uploaded.
An optional JSON configuration file can be provided with each dataset to specify the column that contains the label, the prediction type (classification or regression), and any categorical or ordinal features. By default, the label column is assumed to be ‘class’, the prediction type for the dataset is assumed to be classification, and all fields are assumed to be numeric.
The corresponding configuration file must be in the same directory as the dataset. If the file is named myDatafile.*sv, the configuration file must be named myDatafile_metadata.json.
Example configuration file:
{
"target_column": "my_custom_target_column_name",
"prediction_type": "classification",
"categorical_features": ["cat1", "cat2"],
"ordinal_features": { "ord": ["first", "second", "third"] }
}
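A configuration file like the one above can also be generated next to its dataset so that the file name always follows the required pattern. The following is a sketch, not an Aliro utility: the function name write_metadata is hypothetical, and leaving out the categorical and ordinal keys when they are not needed is an assumption based on the defaults described above.

```python
import json
from pathlib import Path


def write_metadata(dataset_path, target_column="class",
                   prediction_type="classification",
                   categorical_features=None, ordinal_features=None):
    """Write <dataset name>_metadata.json beside the dataset and return its path."""
    dataset_path = Path(dataset_path)
    config = {"target_column": target_column, "prediction_type": prediction_type}
    # Assumption: optional keys are omitted when unused, relying on Aliro's defaults.
    if categorical_features:
        config["categorical_features"] = list(categorical_features)
    if ordinal_features:
        config["ordinal_features"] = dict(ordinal_features)
    # myDatafile.csv -> myDatafile_metadata.json, in the same directory
    metadata_path = dataset_path.with_name(dataset_path.stem + "_metadata.json")
    metadata_path.write_text(json.dumps(config, indent=2))
    return metadata_path
```

For example, write_metadata("data/datasets/user/myDatafile.csv", target_column="my_custom_target_column_name") would create data/datasets/user/myDatafile_metadata.json, provided that directory exists.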
Analyzing Data¶
To run a classification machine learning experiment, click ‘Build New Experiment’, then on the Build New Experiment page choose the desired algorithm and experiment parameters and click ‘Launch Experiment’. To start the AI, click the AI toggle on the Datasets page. The AI will start issuing experiments according to the parameters in config/ai.config. This file can be modified to change the recommendation engine being used and how many recommendations the AI will give. By default, the AI will make 10 recommendations.
From the Datasets page, click ‘completed experiments’ to navigate to the Experiments page for that dataset, filtered to show the completed experiments. If an experiment completed successfully, use the ‘Actions’ dropdown to download the fitted model for that experiment and a Python script that can be used to run the model on other datasets. Click elsewhere on the row to navigate to the experiment Results page.
Downloading and Using Models¶
A pickled version of the fitted model and an example script for using that model can be downloaded for any completed experiment from the Experiments page.
Please see the Jupyter Notebook script demo for instructions on using the scripts and model exported from Aliro to reproduce the findings on the Results page and classify new datasets.
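For orientation, loading a pickled model in Python generally looks like the sketch below. It is not the script that Aliro exports. It assumes that the downloaded file can be read with the standard pickle module, that the model exposes a scikit-learn style predict method, and that the libraries used to fit the model are installed. The function names are illustrative.

```python
import pickle


def load_model(path):
    """Unpickle a model file. Only load pickle files from sources you trust."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def predict(model, rows):
    """Call the model's predict method on rows of numeric feature values.

    Assumes a scikit-learn style interface; the columns in each row must be
    in the same order as the features used to train the model.
    """
    return list(model.predict(rows))
```

A call such as predict(load_model("model.pkl"), [[0.1, 2.0, 3.5]]) would then return one prediction per row. The file name here is a placeholder. The script downloaded with the model is still the authoritative reference for how a given export should be used.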
Working with the ChatGPT component¶
This document explains how to interact with the ChatGPT component. The guide covers the following topics:
Where to get your OpenAI API key
How to connect to OpenAI using your API key
How to use ChatGPT (Expert)
1. Where to Get Your OpenAI API Key¶
Before you can start using the ChatGPT component, you’ll need an API key from OpenAI. To get one:
Visit the OpenAI website
Create an account or log in if you already have one
Click on your account icon in the top-right corner, then select “View API keys.”
Generate your API key by clicking on “+ Create new secret key.”
Please copy your API key and store it in a secure location.
2. How to Connect to OpenAI Using Your API Key¶
After obtaining your API key, you can effortlessly connect to OpenAI. Once your machine learning experiment is complete, navigate to the results page to view performance metrics. A popup window will appear, prompting you to enter your OpenAI API key. Upon successful connection, a new popup window will display the message, “Connection to OpenAI is established,” and the “Expert” button will turn blue.
3. How to Use ChatGPT (Expert)¶
Once you click on the blue “Expert” button, a chat interface will appear on the right side of the page.
Settings
At the top of the chat interface, you’ll find the “Settings” tab and a “+ New Chat” button. Within “Settings,” you can adjust the model, temperature, and define prompt engineering.
Renaming a ChatBox
To rename the first ChatBox to “EDA”:
Double-click the ChatBox to enable editing.
Type in the new name, for example, “EDA.”
Mouse over the pencil emoji, which will then change to a checkmark.
Click the checkmark to save the new name.
Creating a New ChatBox
To create a new ChatBox:
Click the “+ New Chat” button.
Name the new ChatBox following the process outlined in the “Renaming a ChatBox” section.
Removing Chat Tabs
To remove a ChatBox:
Click on the trash bin emoji associated with that tab.
Code Generation and Editing
To request Python code from ChatGPT, you can ask, for example: “Please write Python code to generate 10 random numbers and print them.”
To edit the generated code:
Double-click the code box.
Make your changes.
To save, press the ESC key or click the “Run” button.
Installing New Packages
If you need to install new packages to run the given code, an “Install and Run” button will appear. Clicking this will install the necessary packages and execute the code.
Code Error Handling
When encountering an error while running code, a button labeled “Submit Error” will appear. Clicking this button will prompt ChatGPT to provide an alternative code snippet to resolve the error.