Adding a model

Three pieces of information have to be provided about the model:
1. The machine learning problem type: the type of problem the machine learning model is dealing with. The options are binary classification, multiclass classification, and regression. This choice has far-reaching implications: it determines what kind of model output and target data NannyML expects and which metrics NannyML can calculate. It cannot be changed later.
2. The main performance metric: the available metrics depend on the problem type you select. Metrics can always be changed later, and you can monitor more than one metric at a time. Currently, we support the following metrics:
Binary classification:
- ROC-AUC
- F1
- Precision
- Recall
- Specificity
- Accuracy
- Business value
- Confusion matrix elements

Multi-class classification:
- ROC-AUC
- F1
- Precision
- Recall
- Specificity
- Accuracy
- Business value

Regression:
- MAE
- MAPE
- MSE
- RMSE
- RMSLE
- MSLE
3. How the data has to be chunked: the time interval over which the metrics are aggregated, i.e., the granularity of the monitoring analysis. The options are daily, monthly, quarterly, and yearly. This aggregation can always be changed later in the model settings.
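For intuition, here is a minimal pandas sketch of what chunked aggregation of a performance metric means. This is an illustration, not NannyML's internal implementation, and the column names are hypothetical:

```python
# Illustrative sketch: monthly chunking of an accuracy metric.
import pandas as pd

df = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2024-01-05", "2024-01-20", "2024-02-03", "2024-02-25"]
    ),
    "prediction": [1, 0, 1, 1],
    "target":     [1, 0, 0, 1],
})

# Monthly chunking: aggregate accuracy over each calendar month.
df["correct"] = df["prediction"] == df["target"]
monthly_accuracy = (
    df.set_index("timestamp")["correct"]
      .resample("MS")  # one chunk per month
      .mean()
)
print(monthly_accuracy)
```

A coarser chunking (quarterly, yearly) simply widens the resampling window, trading temporal resolution for more stable metric estimates per chunk.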

The reference dataset is the dataset NannyML will use as a baseline for monitoring your model. This dataset ideally represents a time when the model worked as expected. The ideal candidate for this is the test set. You need to point NannyML to where this dataset is located and provide some basic information about the dataset schema.

Pick one of the following upload options:
Public URL
Azure blob storage
AWS S3
Local file
Database connection
Provide a public URL
If the dataset is accessible via a public URL, you can provide that link here:

To try out NannyML, use one of our public datasets on GitHub. Here is a link to the synthetic car price prediction - reference dataset:
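Before registering a hosted dataset, you can sanity-check it locally: `pd.read_csv` accepts a public URL string directly. In the sketch below a small in-memory CSV stands in for the hosted file, and the column names are hypothetical; with a real dataset you would pass the public URL instead:

```python
# Sketch: inspect a publicly hosted dataset before pointing NannyML Cloud at it.
import io
import pandas as pd

# Stand-in for a hosted CSV; replace with pd.read_csv("https://…/reference.csv")
csv_text = "timestamp,y_pred,y_true\n2024-01-01,21500,22000\n"
reference_df = pd.read_csv(io.StringIO(csv_text))
print(reference_df.columns.tolist())
```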
Provide Azure blob storage location
There are six fields on the configuration page:

The first three fields are mandatory and related to the location of the dataset:
1. Azure Account Name
2. Blob storage container
3. File path
The easiest way to obtain the right values for the respective fields is by going to the Azure storage browser via the Azure portal:

The values for the first three fields can be derived as follows:

The last three fields provide ways of authenticating to the blob storage. Only one of them has to be provided:
- If "Is public*" is enabled, NannyML will try to connect without credentials (only possible if the storage account is configured to allow public access).
- The Account key is a secret key that gives access to all the files in the storage account. It can be found in the Azure portal. Link to the Microsoft docs.
- The SAS token is a temporary token that allows NannyML to access the storage on the user's behalf. It has to be created specifically for this onboarding. Link to the Microsoft docs.
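The three location fields compose into a single blob URL, which is what each authentication option ultimately unlocks. The sketch below shows that composition with hypothetical field values; the commented lines indicate how the three auth options would map onto the `azure-storage-blob` SDK's `BlobClient.from_blob_url`:

```python
# Sketch: how the three Azure location fields map to a blob URL.
def blob_url(account_name: str, container: str, file_path: str) -> str:
    """Compose the blob URL from the Account Name, container, and file path fields."""
    return f"https://{account_name}.blob.core.windows.net/{container}/{file_path}"

# Hypothetical example values for the three fields:
url = blob_url("mystorageaccount", "datasets", "car_prices/reference.parquet")

# With the azure-storage-blob package, the auth options roughly correspond to:
#   BlobClient.from_blob_url(url)                  # "Is public": no credential
#   BlobClient.from_blob_url(url, credential=key)  # account key
#   BlobClient.from_blob_url(url, credential=sas)  # SAS token
print(url)
```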
Provide AWS S3 storage location

Upload via local file system
If your dataset is downloaded on your computer and it is smaller than 100 MB, you can upload it directly to NannyML Cloud.
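A quick local check that a file fits under the 100 MB browser-upload limit can save a failed upload. A minimal sketch (the demo file is a throwaway stand-in for your dataset):

```python
# Sketch: verify a local dataset is under the 100 MB upload limit.
import os
import tempfile

LIMIT_BYTES = 100 * 1024 * 1024  # 100 MB browser-upload limit

def fits_upload_limit(path: str) -> bool:
    """True if the file can be uploaded directly through the web UI."""
    return os.path.getsize(path) <= LIMIT_BYTES

# Demo with a throwaway file standing in for your dataset:
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as f:
    f.write(b"timestamp,y_pred,y_true\n")
    demo_path = f.name

small_enough = fits_upload_limit(demo_path)
print(small_enough)
```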

The database connection option is not implemented in the product yet.
NannyML needs some schema information about the reference dataset. It will derive most of the columns automatically, but it is good practice to double-check them. The most important columns to define are listed on the left. Which columns you need to define depends on the type of machine-learning problem you picked at the start of this workflow. All other columns are automatically considered features, and NannyML infers their data types.
Regression

The following columns have to be specified:
- Timestamp: provides NannyML with the date and time that the prediction was made.
- Prediction: the output the model predicted for its target outcome.
- Target: the ground truth or actual outcome of what the model is predicting.

The mapping of the columns can be changed by scrolling horizontally. It is possible to ignore specific columns or flag columns that should be used for joining predictions and targets later.

Binary classification

The following columns have to be specified:
- Timestamp: provides NannyML with the date and time that the prediction was made.
- Prediction: the class label the model predicts for its target outcome.
- Prediction score: the score or probability the model outputs for its target outcome.
- Target: the ground truth or actual outcome of what the model is predicting.

The mapping of the columns can be changed by scrolling horizontally. It is possible to ignore specific columns or flag columns that should be used for joining predictions and targets later.

Multi-class classification

The following columns have to be specified:
- Timestamp: provides NannyML with the date and time that the prediction was made.
- Prediction: the class label the model predicts for its target outcome.
- Target: the ground truth or actual outcome of what the model is predicting.

The mapping of the columns can be changed by scrolling horizontally. It is possible to ignore specific columns or flag columns that should be used for joining predictions and targets later.
Since the problem is multiclass, we also need to flag the prediction score column of each class as a prediction score.
Afterwards, for each column flagged as "prediction score", you need to map the class that its scores belong to:
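For intuition, here is a sketch of what a multiclass reference schema with per-class score columns looks like, together with the score-column-to-class mapping the UI asks for. All column and class names are hypothetical:

```python
# Sketch: multiclass schema with one prediction-score column per class.
import pandas as pd

reference_df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-01-01", "2024-01-02"]),
    "y_pred": ["upmarket", "prepaid"],
    "y_pred_proba_prepaid":  [0.2, 0.7],
    "y_pred_proba_upmarket": [0.7, 0.2],
    "y_pred_proba_highend":  [0.1, 0.1],
    "y_true": ["upmarket", "prepaid"],
})

# The "prediction score -> class" mapping configured in the UI:
score_to_class = {
    "y_pred_proba_prepaid":  "prepaid",
    "y_pred_proba_upmarket": "upmarket",
    "y_pred_proba_highend":  "highend",
}

# Sanity check: per-class scores should sum to ~1 on every row.
score_cols = list(score_to_class)
assert (reference_df[score_cols].sum(axis=1).round(6) == 1.0).all()
```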

The analysis dataset is what NannyML uses to analyze the performance of the monitored model. Typically, it will consist of the latest production data up to a desired point in the past, which should be after the reference dataset ends. For that analysis, NannyML leverages the information present in the reference dataset.
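The key constraint is temporal: the analysis period starts where the reference period ends. A minimal sketch of splitting one production log on a cutoff timestamp (all names and dates are hypothetical):

```python
# Sketch: split production data into reference and analysis periods.
import pandas as pd

log = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2023-11-01", "2023-12-15", "2024-01-10", "2024-02-20"]
    ),
    "y_pred": [1, 0, 1, 0],
})

reference_end = pd.Timestamp("2024-01-01")  # hypothetical cutoff
reference_df = log[log["timestamp"] < reference_end]
analysis_df = log[log["timestamp"] >= reference_end]
```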

Pick one of the following options: (1) upload via a public link, (2) upload via Azure Blob Storage, (3) upload via AWS S3, or (4) upload via the local file system. (5) Upload via database connection is not fully implemented yet.
Public URL
Azure Blob Storage
AWS S3
Local file
Database connection
Provide a public URL
If the dataset is accessible via a public URL, you can provide that link here:

To try out NannyML, you can use one of our public datasets on GitHub. Here is a link to the synthetic car price prediction - analysis dataset:
Provide Azure blob storage location
There are six fields on the configuration page. If you have also used Azure blob storage for the reference dataset, the relevant fields will already be filled in, and only the file path has to be provided, assuming the analysis dataset is stored in the same Blob storage container:

The first three fields are mandatory and related to the location of the dataset:
1. Azure Account Name
2. Blob storage container
3. File path
The easiest way to obtain the right values for the respective fields is by going to the Azure storage browser via the Azure portal:

The values for the first three fields can be derived as follows:

The last three fields provide ways of authenticating to the blob storage. Only one of them has to be provided:
- If "Is public*" is enabled, NannyML will try to connect without credentials (only possible if the storage account is configured to allow public access).
- The Account key is a secret key that gives access to all the files in the storage account. It can be found in the Azure portal. Link to the Microsoft docs.
- The SAS token is a temporary token that allows NannyML to access the storage on the user's behalf. It has to be created specifically for this onboarding. Link to the Microsoft docs.
Provide AWS S3 storage location

Upload via local file system
If your dataset is downloaded on your computer and it is smaller than 100 MB, you can upload it directly to NannyML Cloud.

The database connection option is not implemented in the product yet.
NannyML assumes that the schema of the analysis dataset is the same as the reference dataset. However, production targets might not be immediately available/accessible. NannyML allows the targets to come in separately, i.e., they can be uploaded later in a separate target dataset, and NannyML will join them accordingly.
In this case, joining means ensuring every prediction is connected to its corresponding outcome. To do that successfully, the analysis and target datasets should contain a "Join with" column, typically some sort of ID.
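Conceptually this is a left join of predictions onto their outcomes, keyed on the identifier. A pandas sketch of the idea (column names are hypothetical, and this is not NannyML's internal implementation):

```python
# Sketch: join late-arriving targets onto predictions via a shared identifier.
import pandas as pd

analysis_df = pd.DataFrame({
    "identifier": [101, 102, 103],
    "timestamp": pd.to_datetime(["2024-03-01", "2024-03-02", "2024-03-03"]),
    "y_pred": [1, 0, 1],
})

target_df = pd.DataFrame({
    "identifier": [101, 103],  # target for 102 has not been realized yet
    "y_true": [1, 0],
})

# Left join keeps every prediction; unmatched ones get a missing target.
joined = analysis_df.merge(target_df, on="identifier", how="left")
print(joined)
```

Predictions whose targets have not arrived yet simply carry a missing value until a later target upload fills them in.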
There are three ways to configure NannyML, depending on when and if the production targets come in:
There is no extra configuration necessary. The reference dataset configuration is carried over to the analysis dataset, and no joining is required because the targets are already part of the analysis data. This is typically the case when targets are instant or fully realized by the time the monitoring analysis is due.
Define which column of the analysis dataset should be used to complete the join:

In the configuration screen, the "Join with" field has to be set to not available:

When late ground truth will not be available, click "Continue without identifier":

This step is only necessary when targets are not part of the analysis dataset, and only once they become available.

Pick one of the following options: (1) upload via a public link, (2) upload via Azure Blob Storage, (3) upload via AWS S3, or (4) upload via the local file system. (5) Upload via database connection is not fully implemented yet.
Public URL
Azure Blob Storage
AWS S3
Local file
Database connection
Provide a public URL
If the dataset is accessible via a public URL, you can provide that link here:

To try out NannyML, you can use one of our public datasets on GitHub. Here is a link to the synthetic car price prediction - analysis target dataset:
Provide Azure blob storage location
There are six fields on the configuration page. If you have also used Azure blob storage before as part of reference or analysis configuration, the relevant fields will already be filled in, and only the file path has to be provided, assuming the target dataset is stored in the same Blob storage container:

The first three fields are mandatory and related to the location of the dataset:
1. Azure Account Name
2. Blob storage container
3. File path
The easiest way to obtain the right values for the respective fields is by going to the Azure storage browser via the Azure portal:

The values for the first three fields can be derived as follows:

The last three fields provide ways of authenticating to the blob storage. Only one of them has to be provided:
- If "Is public*" is enabled, NannyML will try to connect without credentials (only possible if the storage account is configured to allow public access).
- The Account key is a secret key that gives access to all the files in the storage account. It can be found in the Azure portal. Link to the Microsoft docs.
- The SAS token is a temporary token that allows NannyML to access the storage on the user's behalf. It has to be created specifically for this onboarding. Link to the Microsoft docs.
Provide AWS S3 storage location

Upload via local file system
If your dataset is downloaded on your computer and it is smaller than 100 MB, you can upload it directly to NannyML Cloud.

The database connection option is not implemented in the product yet.
If the "Join with" column is correctly identified during the configuration of the analysis dataset, there is no extra configuration necessary for the target dataset.

Review the model settings and start monitoring:
