Comet.ml Python API¶
This page is available as an executable or viewable Jupyter Notebook:
Comet.ml has an extensive interface to all of your data using a REST API through Comet.ml endpoints. Now, you can access this information easily through the Comet.ml Python SDK. Requires comet_ml version 3.0.0 or greater.
Setup¶
To run the following experiments, you'll need to set your COMET_API_KEY. The easiest way to do this is to set the value in a cell like this:
import comet_ml
comet_ml.config.save(api_key="...")
where you replace the "..." with your key.
You can get your COMET_API_KEY under your quickstart link (replace YOUR_USERNAME with your Comet.ml username):
Quick Overview¶
To access the Python API through the Comet.ml SDK, you will need to make an API() instance.
Note: this is a new interface.
What's new?¶
The new API:
- is faster, having more items cached
- allows setting and logging more items
- APIExperiment now works similarly to Experiment, ExistingExperiment, and OfflineExperiment
- has a consistent interface
Let's try it out!
First, we import the API class and other libraries we will need. Note that this is a new interface and comes from comet_ml.api:
from comet_ml import API
import comet_ml
import matplotlib.pyplot as plt
%matplotlib inline
and create the API instance:
comet_api = API()
Using the comet_api instance, you can get the names of your workspaces:
comet_api.get()
If you reference your workspace by name using comet_api.get(WORKSPACE_NAME), you'll see your projects:
print(comet_api.get("cometpublic"))
Or, get the projects from another user or shared workspace:
comet_api.get("testuser")
Using the same method, you can refer to a project by name and get all of the experiments in a project:
comet_api.get("cometpublic", "fasttext")
Or, using the slash delimiter:
comet_api.get("cometpublic/fasttext")
And one more level, get an APIExperiment object using the Experiment's ID:
comet_api.get("cometpublic", "fasttext", 'e64c5915920f481bab8f4cb4dbd615be')
Or, again using the slash shorthand:
comet_api.get("cometpublic/fasttext/e64c5915920f481bab8f4cb4dbd615be")
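The slash shorthand carries the same workspace, project, and experiment parts as the separate arguments. As a rough illustration of that correspondence, here is a minimal sketch; split_path is a hypothetical helper written for this page, not part of comet_ml, and it makes no claim about how API.get() parses its arguments internally:

```python
def split_path(path):
    """Split a 'workspace/project/experiment' string into three parts.

    Hypothetical helper for illustration only; not part of comet_ml.
    Missing trailing parts are returned as None.
    """
    parts = [p for p in path.split("/") if p]
    if not 1 <= len(parts) <= 3:
        raise ValueError("expected 1 to 3 slash-separated parts: %r" % path)
    # Pad with None so the result always has exactly three slots.
    parts += [None] * (3 - len(parts))
    return tuple(parts)


print(split_path("cometpublic/fasttext"))
```

The resulting tuple lines up with the positional form, so comet_api.get("cometpublic", "fasttext") and comet_api.get("cometpublic/fasttext") name the same project.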
Let's get an experiment and save it to a variable named exp:
exp = comet_api.get("cometpublic/fasttext/e64c5915920f481bab8f4cb4dbd615be")
exp
There are a number of items you get and set from the APIExperiment instance. For a complete reference, see: https://www.comet.ml/docs/python-sdk/API/
For example, we can explore the other property, which shows items saved with Experiment.log_other(NAME, VALUE):
exp.get_others_summary()
In this example, we see that the experiment has the Name "last". We can also use Name to look up experiments:
exp = comet_api.get("cometpublic/fasttext/last")
exp.id, exp.name
Perhaps one of the most useful abilities of the Python API is to access your experiment's data in order to create a variation of a plot. To access the raw metric data, use the .get_metrics() method of the APIExperiment:
len(exp.get_metrics())
We see that over 2800 metrics were logged during the training of this experiment. We can get the first one by indexing with an integer:
exp.get_metrics()[0]
That shows that the "acc" (accuracy) metric had a value of about 0.5 at step 1 of the experiment.
We can also filter on a single metric name, like so:
acc_metrics = exp.get_metrics("acc")
len(acc_metrics)
acc_metrics[0]
Therefore, exp.get_metrics("acc") gives us the dictionaries for all "acc" items. We can then easily use Python's built-in zip and matplotlib to plot these values:
steps_acc = [(m["step"], float(m["metricValue"])) for m in acc_metrics]
This breaks up the data into (step, value) pairs:
steps_acc[0]
A little Python trick to separate the steps from the accuracies so we can easily use matplotlib:
steps, acc = zip(*steps_acc[:100]) # just the first 100 for now
plt.plot(steps, acc);
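The conversion above can be wrapped in a small reusable function. The following is a minimal sketch, assuming each metric entry is a dictionary with "step" and "metricValue" keys as in the entries shown above; metric_series is a hypothetical helper name, and the sample entries are made up so that the block runs without a Comet.ml account:

```python
def metric_series(metrics, limit=None):
    """Return (steps, values) lists from a list of metric dictionaries.

    Assumes each entry has "step" and "metricValue" keys; entries
    missing either value are skipped.
    """
    pairs = []
    for m in metrics:
        step, value = m.get("step"), m.get("metricValue")
        if step is None or value is None:
            continue
        pairs.append((step, float(value)))
    pairs.sort()  # order by step so a line plot runs left to right
    if limit is not None:
        pairs = pairs[:limit]
    steps = [s for s, _ in pairs]
    values = [v for _, v in pairs]
    return steps, values


# Made-up entries standing in for exp.get_metrics("acc"):
sample = [
    {"metricValue": "0.75", "step": 2},
    {"metricValue": "0.5", "step": 1},
    {"metricValue": None, "step": 3},
]
steps, acc = metric_series(sample)
print(steps, acc)
```

The two returned lists can be passed directly to plt.plot(steps, acc).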
That's it for a quick overview. Now let's look in detail at each component.
Workspaces¶
comet_api.get() reports your workspace names:
comet_api.get()
You can also iterate over those names:
for workspace in comet_api.get():
    print(workspace)
As we saw above, you can also access other public workspaces:
comet_api.get("testuser")
Projects¶
Under get(WORKSPACE_NAME), you'll find the projects:
comet_api.get("cometpublic")
project = comet_api.get("cometpublic", "comet-notebooks")
## OR:
#project = comet_api.get("cometpublic/comet-notebooks")
If you just print out, or iterate over a project, you get access to the experiments:
project
project[0].id, project[0].name
project[1].id, project[1].name
Experiments¶
Continuing with the dictionary-like access, you can see and iterate over the experiment ids:
comet_api.get("cometpublic", "comet-notebooks")
exp = comet_api.get("cometpublic", "comet-notebooks", 'd21f94a1c71841d2961da1e6ddb5ab20')
## OR
# exp = comet_api.get("cometpublic/comet-notebooks/d21f94a1c71841d2961da1e6ddb5ab20")
exp
exp = comet_api.get("cometpublic", "comet-notebooks", 'example 001')
## OR
## exp = comet_api.get("cometpublic/comet-notebooks/example 001")
exp
Regular Expression Experiment Name Matching¶
You can also use regular expressions as the name for the experiment:
comet_api.get_experiments("cometpublic", "comet-notebooks", "example.*")
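To get a feel for which names a pattern such as "example.*" would select, you can try it locally with Python's re module. This is only a sketch: match_names is a hypothetical helper, the names are made up, and whether Comet.ml anchors the pattern at the start of the name (as re.match does here) is an assumption rather than documented behavior:

```python
import re


def match_names(names, pattern):
    """Return the names that match `pattern` from the start of the string.

    Uses re.match, which anchors at the beginning only; whether the
    Comet.ml backend anchors the same way is an assumption.
    """
    regex = re.compile(pattern)
    return [name for name in names if regex.match(name)]


# Made-up experiment names for illustration:
names = ["example 001", "example 002", "baseline", "my example"]
print(match_names(names, "example.*"))
```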
Query API¶
The Python API provides programmatic access to the same query system in the web UI. There are 5 types of items that you can query:
- Metric: items logged with log_metric()
- Metadata: items logged by the system
- Other: items logged with log_other()
- Parameter: items logged with log_parameter()
- Tag: items logged with add_tags()
To use these, you first need to import them:
from comet_ml.api import Metric, Metadata, Other, Parameter, Tag
You can then use these to build a query expression, like so:
comet_api.query("cometpublic", "general", Parameter("max_iter") == 100)
The API.query() method takes the following args:
- workspace: String, the name of the workspace
- project_name: String, the name of the project
- query: a query expression (see below)
- archived: (optional boolean), query the archived experiments if True
A query is of the form:
((QUERY-VARIABLE OPERATOR VALUE) & ...)
# or:
(QUERY-VARIABLE.METHOD(VALUE) & ...)
where:
- QUERY-VARIABLE is Metric(NAME), Parameter(NAME), Other(NAME), Metadata(NAME), or Tag(VALUE).
- OPERATOR is any of the standard mathematical operators ==, <=, >=, !=, <, >.
- METHOD is between(), contains(), startswith(), or endswith().
You may also place the bitwise ~ (not) operator in front of an expression to invert it. Use & to combine additional criteria. Currently, | (bitwise or) is not supported.
VALUE can be any query type, including string, boolean, double, datetime, or timenumber (number of seconds). None and "" are special values that mean NULL and EMPTY, respectively. Use API.get_query_variables(WORKSPACE, PROJECT_NAME) to see query variables and types for a project.
When using datetime, be aware that the backend uses UTC datetimes. If you do not receive the correct experiments via a datetime query, please check with the web UI query builder to verify the timezone of the server.
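If your cutoff time is expressed in a local timezone, one way to avoid an off-by-some-hours query is to convert it to UTC before building the query. The sketch below uses only the standard library; to_naive_utc is a hypothetical helper, and the assumption that the query expects a naive datetime holding UTC values is ours, not something this page states:

```python
from datetime import datetime, timedelta, timezone


def to_naive_utc(local_dt):
    """Convert a timezone-aware datetime to a naive datetime in UTC."""
    if local_dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return local_dt.astimezone(timezone.utc).replace(tzinfo=None)


# 5:00am on Sept 24, 2019 in a zone four hours behind UTC:
four_behind = timezone(timedelta(hours=-4))
cutoff = to_naive_utc(datetime(2019, 9, 24, 5, tzinfo=four_behind))
print(cutoff)
```

The resulting cutoff could then be compared against Metadata("start_server_timestamp") in a query.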
query() returns a list of matching APIExperiments().
Examples:
# Find all experiments that have an acc metric value > .98:
>>> api.query("workspace", "project", Metric("acc") > .98)
[APIExperiment(), ...]
# Find all experiments that have a loss metric < .1 and
# a learning_rate parameter value >= 0.3:
>>> loss = Metric("loss")
>>> lr = Parameter("learning_rate")
>>> query = ((loss < .1) & (lr >= 0.3))
>>> api.query("workspace", "project", query)
[APIExperiment(), ...]
# Find all of the experiments tagged "My simple tag":
>>> tagged = Tag("My simple tag")
>>> api.query("workspace", "project", tagged)
[APIExperiment(), ...]
# Find all experiments started before Sept 24, 2019 at 5:00am
# (assumes: from datetime import datetime):
>>> q = Metadata("start_server_timestamp") < datetime(2019, 9, 24, 5)
>>> api.query("workspace", "project", q)
[APIExperiment(), ...]
# Find all experiments lasting more than 2 minutes (in seconds):
>>> q = Metadata("duration") > (2 * 60)
>>> api.query("workspace", "project", q)
[APIExperiment(), ...]
Notes:
- Use ~ for not on any expression
- Use ~QUERY-VARIABLE.between(2,3) for values not between 2 and 3
- Use (QUERY-VARIABLE == True) for truth
- Use (QUERY-VARIABLE == False) for not true
- Use (QUERY-VARIABLE == None) for testing null
- Use (QUERY-VARIABLE != None) or ~(QUERY-VARIABLE == None) for testing not null
- Use (QUERY-VARIABLE == "") for testing empty
- Use (QUERY-VARIABLE != "") or ~(QUERY-VARIABLE == "") for testing not empty
- Use Python's datetime(YEAR, MONTH, DAY, HOUR, MINUTE, SECONDS) for comparing datetimes, like Metadata("start_server_timestamp") or Metadata("end_server_timestamp")
- Use seconds for comparing timenumbers, like Metadata("duration")
- Use API.get_query_variables(WORKSPACE, PROJECT_NAME) to see query variables and types.
Do not use 'and', 'or', 'not', 'is', or 'in'. These are logical operators and you must use mathematical operators for queries. For example, always use '==' where you might usually use 'is'.
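The reason for this rule is a Python language detail: classes can overload &, ~, and the comparison operators, but not the keywords and, or, and not. The toy classes below (Var and Expr are hypothetical and are not the comet_ml implementation) sketch how an expression builder records a query, and what goes wrong when and is used instead of &:

```python
class Var:
    """Toy query variable (hypothetical; not the comet_ml implementation)."""

    def __init__(self, name):
        self.name = name

    def __gt__(self, value):
        return Expr("(%s > %r)" % (self.name, value))

    def __eq__(self, value):
        return Expr("(%s == %r)" % (self.name, value))


class Expr:
    """Toy query expression that records how it was built."""

    def __init__(self, text):
        self.text = text

    def __and__(self, other):
        return Expr("(%s & %s)" % (self.text, other.text))

    def __invert__(self):
        return Expr("~%s" % self.text)


# Using & and ~ keeps both criteria in the expression:
good = (Var("acc") > 0.9) & ~(Var("optimizer") == "sgd")
# Using "and" evaluates truthiness and simply returns the second operand:
bad = (Var("acc") > 0.9) and (Var("optimizer") == "sgd")

print(good.text)
print(bad.text)
```

In the second case the first criterion is silently dropped, because Python's and returns one of its operands instead of calling a method that could combine them.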
How can you know what query variables are available for a project? Use API.get_query_variables(WORKSPACE, PROJECT_NAME):
vars = comet_api.get_query_variables("cometpublic", "general")
vars
Experiment Properties¶
In this brief dictionary representation, you will see that get_others_summary(), get_metrics_summary(), and get_parameters_summary() give summary data for each item:
exp.get_parameters_summary()
exp.get_others_summary()[0]["name"], exp.get_others_summary()[0]["valueCurrent"]
exp.get_metrics_summary("train_loss")
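Because each summary is a list of dictionaries, it is often convenient to turn it into a plain name-to-value mapping. Here is a minimal sketch, assuming each item has the "name" and "valueCurrent" keys used above; summary_to_dict is a hypothetical helper and the sample data is made up:

```python
def summary_to_dict(summary):
    """Map each summary item's "name" to its "valueCurrent"."""
    return {item["name"]: item["valueCurrent"] for item in summary}


# Made-up data standing in for exp.get_parameters_summary():
sample_summary = [
    {"name": "hidden_size", "valueCurrent": "128"},
    {"name": "learning_rate", "valueCurrent": "0.001"},
]
params = summary_to_dict(sample_summary)
print(params["hidden_size"])
```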
You can see more information on the methods and properties of an APIExperiment instance here: https://www.comet.ml/docs/python-sdk/APIExperiment/
Just like when creating and logging data, you can also use the .display() method to show the Comet.ml page for that experiment right in the notebook:
exp.display()
Logging data¶
With this version of the API, you can now easily log, set, and add data to an experiment, either one that you just created, or one that existed previously.
To create a new APIExperiment, assuming that your API key, workspace, and project name have been set in a Comet configuration file:
from comet_ml.api import APIExperiment
api_experiment = APIExperiment()
If you have not set your API key, workspace, and project name, you can pass any and all to the constructor:
from comet_ml.api import APIExperiment
api_experiment = APIExperiment(api_key="MY-KEY",
                               workspace="MY-WORKSPACE",
                               project_name="MY-PROJECT")
To create an APIExperiment from a previously-made experiment, assuming that your keys have been set:
from comet_ml.api import APIExperiment
api_experiment = APIExperiment(previous_experiment="7364746746743746") # previous experiment key or name
Once you have an APIExperiment, you can log, set, and add items to the experiment. You can use any of the following methods:
- api_experiment.log_output(lines, context=None, stderr=False)
- api_experiment.log_other(key, value)
- api_experiment.log_parameter(parameter, value, step=None)
- api_experiment.log_metric(metric, value, step=None)
- api_experiment.log_html(html, overwrite=False)
- api_experiment.log_asset(filename, step=None, overwrite=None, context=None)
- api_experiment.log_image(filename, image_name=None, step=None, overwrite=None, context=None)
- api_experiment.add_tags(tags)
- api_experiment.set_code(code)
- api_experiment.set_model_graph(graph)
- api_experiment.set_os_packages(os_packages)
- api_experiment.set_start_time(start_server_timestamp)
- api_experiment.set_end_time(end_server_timestamp)
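As one way to use these methods, the sketch below logs a whole dictionary of metrics through log_metric(metric, value, step=None), the signature listed above. To keep the block runnable without a Comet.ml account, it uses RecordingExperiment, a hypothetical stand-in that only records the calls; log_metrics is likewise a helper invented for this example:

```python
class RecordingExperiment:
    """Stand-in with the same log_metric signature as listed above.

    Hypothetical class so the example runs without a Comet.ml account.
    """

    def __init__(self):
        self.logged = []

    def log_metric(self, metric, value, step=None):
        self.logged.append((metric, value, step))


def log_metrics(experiment, metrics, step=None):
    """Log every name/value pair in `metrics` at the same step."""
    for name in sorted(metrics):
        experiment.log_metric(name, metrics[name], step=step)


recorder = RecordingExperiment()
log_metrics(recorder, {"loss": 0.25, "acc": 0.9}, step=10)
print(recorder.logged)
```

With a real APIExperiment in place of the stand-in, the same helper would send each value to Comet.ml.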
Existing Experiments¶
You can log many aspects of an experiment with the Python API. However, to get the full power of the streaming Experiment interface, you can create an ExistingExperiment.
First, we can look up an experiment using any of the methods outlined here:
api_experiment = comet_api.get("cometpublic", "comet-notebooks", 'example 001')
Then, we can use the APIExperiment.id property to make an ExistingExperiment:
existing_experiment = comet_ml.ExistingExperiment(previous_experiment=api_experiment.id)
You can make changes to the saved data using the existing experiment:
existing_experiment.end()
Examples¶
As seen above, you can use the Query API that allows highly efficient queries of your data. However, you can also write your own query of sorts if you need to go beyond what the Query API provides.
Here is some code that prints out the names of experiments that have associated HTML (this can take a long time if you have many experiments):
%%time
workspace = "dsblank"
found = False
for project in comet_api.get(workspace):
    if found:
        break
    print("  processing project", project, "...")
    print("    processing experiments", end="")
    for exp in comet_api.get(workspace, project):
        # Print one dot per experiment checked
        print(".", end="")
        if exp.get_html() is not None:
            print("\nFound html in %s!" % exp.url)
            found = True
            break
    print()
Here is a function that will find the first experiment that has associated images:
def find_image():
    for workspace in comet_api.get():
        for project in comet_api.get(workspace):
            for exp in comet_api.get(workspace, project):
                if exp.get_asset_list(asset_type="image") != []:
                    return exp
find_image()
Now, we get the experiment API and explore the APIExperiment.get_asset_list() method:
comet_api.get('cometpublic/ludwig/02a0ed902ce2481fb6e2fc9009ee593c').get_asset_list(asset_type="image")
We can get a URL for the image, and display it in the notebook:
asset_list = comet_api.get('cometpublic/ludwig/02a0ed902ce2481fb6e2fc9009ee593c').get_asset_list(asset_type="image")
url = asset_list[0]["link"]
url
from IPython.display import Image
Image(url=url)
Now, let's write a short program that will find the run with the best accuracy given a workspace/project string:
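As a minimal sketch of the selection step, the function below picks the run with the highest metric value from raw metric dictionaries. It assumes entries carry a "metricValue" key, as in the get_metrics() output shown earlier; best_run is a hypothetical helper and the sample data is made up so the block runs offline:

```python
def best_run(metrics_by_experiment):
    """Return (experiment_id, best_value) for the highest metric value.

    `metrics_by_experiment` maps an experiment id to a list of metric
    dictionaries with a "metricValue" key. Returns None if there are
    no values at all.
    """
    best = None
    for exp_id, metrics in metrics_by_experiment.items():
        values = [float(m["metricValue"]) for m in metrics
                  if m.get("metricValue") is not None]
        if not values:
            continue
        top = max(values)
        if best is None or top > best[1]:
            best = (exp_id, top)
    return best


# Made-up data; ids and values are for illustration only:
sample_runs = {
    "run-a": [{"metricValue": "0.81"}, {"metricValue": "0.93"}],
    "run-b": [{"metricValue": "0.90"}],
    "run-c": [],
}
print(best_run(sample_runs))
```

With a live API instance, the input could be built with a dictionary comprehension that maps exp.id to exp.get_metrics("acc") for each exp in comet_api.get("WORKSPACE/PROJECT").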
Can we get all of the hidden_size parameter values for the experiments in dsblank/pytorch?
[[p["valueCurrent"] for p in exp.get_parameters_summary()] for exp in comet_api.get("dsblank/pytorch")]
experiments = [[(exp, "hidden_size", int(param["valueCurrent"]))
                for param in exp.get_parameters_summary()
                if param["name"] == "hidden_size"]
               for exp in comet_api.get("dsblank/pytorch")]
experiments = [e[0] for e in experiments if len(e) > 0]
experiments[0]
Assets¶
To get an asset, you need its asset_id. You can see all of the assets related to an experiment using the APIExperiment.get_asset_list() method:
def find_asset(workspaces):
    for ws in workspaces or comet_api.get():
        for pj in comet_api.get(ws):
            for exp in comet_api.get(ws, pj):
                if exp.get_asset_list() != []:
                    return (exp, exp.get_asset_list())
exp, elist = find_asset(["cometpublic"])
exp
len(elist)
From there, you can use the APIExperiment.get_asset(asset_id) method to get the asset.
description = exp.get_asset("0b256dc8858f4fbeb09228bc96074341", return_type="json")
print(description)
We hope that this gives you some ideas of how you can use the Comet Python API!