4. Log production predictions

Once you have a registered model and a specific version that you want to monitor you should start to log production serving data to monitor. Superwise provides multiple flexible ways to integrate and accommodate the needs of both stream use cases and batch use cases.

Regardless of the way you log your data, any time you log data into the Superwise platform a new transaction is created and assigned a unique transaction id.

To troubleshoot any issue while logging data into Superwise, visit the Superwise transaction page and search for the specific transaction id you need to troubleshoot.

Stream logging

Your ML serving pipelines use the Superwise SDK to log prediction records.

from superwise import Superwise

sw = Superwise()

inputs['prediction'] = my_pipeline.predict(inputs)
inputs = inputs.to_dict(orient='records')

transaction_id = sw.transaction.log_records(
  model_id=diamond_model.id,
  version_id=new_version.id,
  records=inputs
)

πŸ“˜

Best practices while logging records

For efficiency, log records in batches when possible.
The maximum number of records that can be sent together is 1,000.

Once you've logged your records, you'll get the transaction id. Use this id to track the status of your data logging to ensure it passed successfully into the Superwise platform.

transaction = sw.transaction.get(transaction_id=transaction_id['transaction_id'])
print(transaction.get_properties()['status'])

Batch collector

To work with Superwise in batch mode, you should pass flat files that contain the model predictions.
The Superwise SDK supports out-of-the-box to collect files from AWS S3 and Google GCS.
Superwise supports both "CSV" and "parquet" file formats.
Here's an example of what a flat file should look like.

Feature 1Feature 2Feature 3
"ABC""Red"123

Whenever your files containing all required batch predictions are ready, use the SDK to submit a new batch request to collect the file and create a new transaction in the Superwise platform.

🚧

File size limitation

The data file should be up to 100MB. if your file is more extensive, make sure you split it

drawing AWS S3

Once your production serving data is available in S3, use the following code snippet, and the SDK will collect the data and send it to the platform.

transaction_id = sw.transaction.log_from_s3(
  file_path="s3://PATH",
  aws_access_key_id="ACCESS-KEY",
  aws_secret_access_key="SECRET-KEY",
  model_id=diamond_model.id,
  version_id=new_version.id
)

To simplify this deployment mode we've prepared a ready to deploy lambda function and end-to-end guide that will take care of all the required automation when files are saved to S3.

drawing Google Cloud Storage (GCS)

For GCS, use the following code snippet for the SDK to collect the data from GCS and send it to the platform.

transaction_id = sw.transaction.log_from_gcs(
  file_path="gs://PATH",
  service_account={"REPLACE-WITH-YOUR-SERVICE-ACCOUNT"},
  model_id=diamond_model.id,
  version_id=new_version.id
)

Log from local file

Superwise also allows you to log data from a local file.

🚧

File settings

The file name must be unique

transaction_id = sw.transaction.log_from_local_file(
  file_path="REPLACE-WITH-PATH-TO-LOCAL-FILE",
  model_id=diamond_model.id,
  version_id=new_version.id
)