OpenTV ENTera & OpenTV Platform Documentation

API monitoring

Overview

OpenTV Video Platform uses Prometheus to collect API usage and performance data from both Platform and SSP. You can then query Prometheus to get information such as:

  • Number of responses from a particular endpoint

  • Total response time

  • Whether a particular probe was successful or not

Prometheus allows you to construct complex, sophisticated queries. It is beyond the scope of this page to cover all of its functionality.

For full details, see the Prometheus API documentation.

You can perform calculations on the data that is returned to compute additional metrics.

Metric categories

For OpenTV Video Platform and SSP, there are two categories of metrics that NAGRA exposes through Prometheus:

Probe metrics

The probe_success metric indicates whether execution was successful for each probe. A value of 1 means that the probe succeeded, and 0 means that it failed.

Nginx metrics

There are a number of metrics that are gathered by monitoring nginx. These include response time, which is the interval between the arrival of a request and the response, that is, how long it takes to service each request. This is a key indicator of an application's performance.

An increase in response time can mean that there is an issue with an end-user application or with an upstream service, which may be caused by a recent change or upgrade.

The following metrics are available:

  • sni_http_response_count_total – the total number of processed HTTP responses

  • sni_http_response_time_seconds – a summary vector of the total response times (in seconds)

  • sni_http_response_time_seconds_sum – a sum of the total response times in seconds

  • sni_http_response_time_seconds_count – the total number of processed HTTP responses

You can perform calculations on the data that is returned to compute additional metrics, such as:

  • Average response time per API

  • Requests per second per API

  • Response time for the top n% of calls per API
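As a rough illustration of the first two calculations, the following sketch builds PromQL expressions that could be passed in the query parameter. It assumes that the request_uri label (visible in the example responses on this page) identifies the API, and the 5m window is an arbitrary choice. The third calculation is left out because it depends on the quantile labels that the summary exposes, which are not shown here.

```python
def average_response_time_query(window: str = "5m") -> str:
    """PromQL for mean seconds per response over `window`, grouped by request_uri."""
    return (
        f"sum by (request_uri) (rate(sni_http_response_time_seconds_sum[{window}]))"
        f" / sum by (request_uri) (rate(sni_http_response_time_seconds_count[{window}]))"
    )


def requests_per_second_query(window: str = "5m") -> str:
    """PromQL for responses per second over `window`, grouped by request_uri."""
    return f"sum by (request_uri) (rate(sni_http_response_count_total[{window}]))"


# Print the expressions so they can be pasted into a request.
print(average_response_time_query())
print(requests_per_second_query("1h"))
```

Because both sides of the division use the same window and grouping, the result is one average per request_uri value.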

Available metrics

The following nginx metrics are available:

Module                                | Metric name          | REST methods
--------------------------------------|----------------------|-----------------------
Account and Device Manager (ADM)      | adm_accounts_actions | DELETE
                                      | adm_bundled_accounts | GET, POST
                                      | adm_devices          | GET, POST, DELETE
                                      | adm_update           | GET, POST
                                      | adm_user_accounts    | GET
API Gateway (AGW)                     | agw_create           | POST
Cast, Crew, and Persona Service (CCP) | ccp                  | GET
Content Builder                       | rail                 | GET
CRM Gateway (CRM-GW)                  | crm_gateway          | GET
IAM (Keycloak)                        | iam                  | GET
Identity Authentication Service (IAS) | ias_content_token    | GET, POST
                                      | ias_token            | POST
Image Handler Service (IHS)           | ihs                  | GET
Keycloak                              | keycloak_nagra       | POST
                                      | keycloak_opcon       | GET, POST
                                      | keycloak_resources   | GET
Metadata Server (MDS)                 | mds_events           | GET, PUT, POST, DELETE
                                      | btv_programmes       | GET
                                      | btv_services         | GET
                                      | epg                  | GET
                                      | solr_search          | GET
                                      | vod_editorials       | GET
                                      | vod_nodes            | GET
                                      | vod_products         | GET
Operator Console (OpCon)              | opui                 | GET
                                      | opconsole_adm        | GET, POST
                                      | opconsole_bcm        | GET
                                      | opconsole_core       | GET, PUT, POST
Rights Manager (RMG)                  | rmg                  | GET, POST
User Activity Vault (UAV)             | uav                  | GET, PUT, POST
User Recordings                       | cdvr                 | GET, POST, DELETE

Authentication

Access to the Prometheus APIs is controlled by Keycloak. See Accessing operator APIs using Keycloak for more information.

Output formatting

If you are using Postman to make these requests, it automatically pretty-prints the JSON output.

If you are using curl, you can pipe its output to the jq JSON formatting tool to make the output more readable.

For example:

curl -s --location -g --request GET 'https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=probe_success' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--header 'Authorization: Bearer <keycloak_token>' | jq
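If you prefer to script the requests, the following minimal sketch prepares the same kind of request with the Python standard library. The host name and token are hypothetical placeholders, and the function name build_query_request is introduced only for illustration. The sketch builds the request without sending it.

```python
import urllib.parse
import urllib.request


def build_query_request(base_url: str, promql: str, token: str) -> urllib.request.Request:
    """Prepare a GET request for the query endpoint with a URL-encoded PromQL expression."""
    # Percent-encode the expression so that spaces and braces are safe in the URL.
    query_string = urllib.parse.urlencode({"query": promql}, quote_via=urllib.parse.quote)
    url = f"{base_url}/prometheus-ext-server/api/v1/query?{query_string}"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


# Hypothetical host and token, for illustration only.
request = build_query_request(
    "https://operator.example.invalid",
    "count without (job,api) (probe_success)",
    "hypothetical-token",
)
print(request.full_url)
```

To send the request, pass it to urllib.request.urlopen and decode the body with json.load.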

Examples

Get all monitored endpoints

Request

To query Prometheus for all the endpoints it is monitoring, send a GET request to:

https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=probe_success

Headers

  • Authorization: Bearer <keycloak_token>

  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example

To save space, the following example includes the output for one module only (in this case, ADM).

{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "probe_success",
                    "api": "Account and Device Manager",
                    "instance": "http://http-router/adm/v1/accounts?limit=0",
                    "job": "adm-api"
                },
                "value": [
                    1673878896.183,
                    "1"
                ]
            },
            ...
        ]
    }
}
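A response of this shape can be reduced to a simple map of endpoint to probe state. The following is a minimal sketch that assumes the payload has already been decoded from JSON; the sample holds only the ADM entry shown above, and the function name probe_status is illustrative.

```python
# Decoded response containing only the ADM entry from the example above.
SAMPLE = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "probe_success",
                    "api": "Account and Device Manager",
                    "instance": "http://http-router/adm/v1/accounts?limit=0",
                    "job": "adm-api",
                },
                "value": [1673878896.183, "1"],
            }
        ],
    },
}


def probe_status(payload: dict) -> dict:
    """Map each probed instance to True (probe succeeded) or False (probe failed)."""
    return {
        series["metric"]["instance"]: series["value"][1] == "1"
        for series in payload["data"]["result"]
    }


for instance, ok in probe_status(SAMPLE).items():
    print("UP  " if ok else "DOWN", instance)
```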

Get a count of monitored endpoints

Request

To query Prometheus for a count of the monitored endpoints, send a GET request to:

https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=count(probe_success)

Headers

  • Authorization: Bearer <keycloak_token>

  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example

{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {},
                "value": [
                    1674043081.6,
                    "31"
                ]
            }
        ]
    }
}

This shows that 31 endpoints are being monitored. (The other value in the same block is the Unix epoch timestamp.)
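The two-element value can be unpacked in a script as follows. This is a small sketch using the numbers from the example above; the function name decode_sample is illustrative.

```python
from datetime import datetime, timezone


def decode_sample(value: list) -> tuple:
    """Split a [timestamp, "number"] pair into a UTC datetime and a float."""
    timestamp, raw = value
    return datetime.fromtimestamp(timestamp, tz=timezone.utc), float(raw)


# Value pair taken from the example response above.
when, count = decode_sample([1674043081.6, "31"])
print(when.isoformat(), int(count))
```

The sample value is returned as a string, so it must be converted before it is used in arithmetic.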

Get a list of monitored endpoints showing only the most relevant fields

Request

To query Prometheus for a list of monitored endpoints, send a GET request to:

https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=count without (job,api)(probe_success)

Headers

  • Authorization: Bearer <keycloak_token>

  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example

To save space, the following example includes the output for one endpoint only (in this case, ADM accounts).

{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "instance": "https://operator.sitq3ga.otv-staging.com/adm/v1/accounts?limit=0"
                },
                "value": [
                    1674045670.591,
                    "1"
                ]
            },
            ...
        ]
    }
}

If you are using curl and jq, you can pass a filter expression to jq, together with the -r (raw output) option, to show just the list of endpoints.

For example:

curl -s --location -g --request GET 'https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=count%20without(job,api)(probe_success)' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--header 'Authorization: Bearer <keycloak_token>' | jq -r '.data.result[].metric.instance'

Get a list of inactive endpoints

Request

To query Prometheus for just the endpoints that are inactive, send a GET request to:

https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=probe_success==0

Headers

  • Authorization: Bearer <keycloak_token>

  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example

To save space, the following example includes the output for one module only (in this case, MDS).

{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "probe_success",
                    "api": "External",
                    "instance": "https://admin.sitq3ga.otv-staging.com/metadata/delivery/GLOBAL/vod/nodes?limit=0",
                    "job": "mds-api"
                },
                "value": [
                    1674048210.825,
                    "0"
                ]
            },
            ...
        ]
    }
}

Get usage counts for all metrics and statuses

Request

To query Prometheus for the total response count per HTTP status for each metric, send a GET request to:

https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=sni_http_response_count_total

The value that is returned for a particular metric and status is the cumulative number of responses since the service started. To get the number of responses over a particular time period, use a time offset to get the count at a specific point in the past and compare it with the current value.

Note that multiple blocks are returned for certain modules.

For example, for ADM, there are separate blocks for adm_devices, adm_update, adm_bundled_accounts, and adm_user_accounts.

Headers

  • Authorization: Bearer <keycloak_token>

  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example

To save space, the following example includes the output for one metric and one HTTP status only (in this case, status 201 for RMG).

{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "sni_http_response_count_total",
                    "environment": "sitq3ga",
                    "host": "sni_router01",
                    "http_code": "201",
                    "instance": "sni_router01",
                    "job": "sni_router-log-exporter",
                    "method": "POST",
                    "request_uri": "rmg",
                    "status": "201"
                },
                "value": [
                    1673955157.403,
                    "27"
                ]
            },
            ...
        ]
    }
}
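Because the counter is cumulative, the number of responses in a period is the difference between the current value and the value at an earlier point. The following sketch shows one way to do this: it builds a query that uses the PromQL offset modifier and then subtracts the two values on the client side. The past value of "20" is hypothetical, and the reset handling assumes that a restart sets the counter back to zero.

```python
def offset_query(metric: str, offset: str) -> str:
    """PromQL that returns the metric's value as it was `offset` ago (for example, "1h")."""
    return f"{metric} offset {offset}"


def responses_in_period(current: str, past: str) -> float:
    """Difference between two cumulative counts, tolerating a counter reset."""
    now, then = float(current), float(past)
    # A current value below the past value suggests a restart, so count from zero.
    return now - then if now >= then else now


print(offset_query("sni_http_response_count_total", "1h"))
# "27" is the value in the example above; "20" is a hypothetical value from one hour earlier.
print(responses_in_period("27", "20"))
```

As an alternative, the PromQL increase() function computes the growth of a counter over a range on the server side.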

Get count for a specific metric and status

Request

To query Prometheus for the total response count for a specific HTTP status for a specific metric, send a GET request to:

https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=sni_http_response_count_total{http_code="200",request_uri="adm_devices"}

Headers

  • Authorization: Bearer <keycloak_token>

  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example

This shows the response that is returned when you request the response count for HTTP status 200 for the adm_devices metric.

{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "sni_http_response_count_total",
                    "environment": "sitq3ga",
                    "host": "sni_router01",
                    "http_code": "200",
                    "instance": "sni_router01",
                    "job": "sni_router-log-exporter",
                    "method": "DELETE",
                    "request_uri": "adm_devices",
                    "status": "200"
                },
                "value": [
                    1673955157.403,
                    "12"
                ]
            },
            ...
        ]
    }
}
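The label filter in the request and the per-method breakdown in the response can both be handled in a script. The following is a minimal sketch with illustrative helper names; it does not escape quotes inside label values, and the sample holds only the DELETE entry shown above.

```python
def selector(metric: str, **labels: str) -> str:
    """Build a PromQL selector such as metric{name="value",...}."""
    matchers = ",".join(f'{name}="{value}"' for name, value in labels.items())
    return f"{metric}{{{matchers}}}"


def counts_by_method(payload: dict) -> dict:
    """Total the returned counts for each HTTP method."""
    totals = {}
    for series in payload["data"]["result"]:
        method = series["metric"]["method"]
        totals[method] = totals.get(method, 0) + int(float(series["value"][1]))
    return totals


# Decoded response containing only the DELETE entry from the example above.
SAMPLE = {
    "data": {
        "result": [
            {
                "metric": {"method": "DELETE", "request_uri": "adm_devices", "http_code": "200"},
                "value": [1673955157.403, "12"],
            }
        ]
    }
}

print(selector("sni_http_response_count_total", http_code="200", request_uri="adm_devices"))
print(counts_by_method(SAMPLE))
```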

Get the total response time for all metrics and statuses

Request

To query Prometheus for the total response time for all available metrics and statuses, send a GET request to:

https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=sni_http_response_time_seconds_sum

You can use the total response time together with the usage counts to calculate the average response time for each metric.

Headers

  • Authorization: Bearer <keycloak_token>

  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

If there were no requests for the endpoints that are covered by a particular metric for the data collection period, the value returned will be NaN.

Example

This shows the response that is returned when you request the total response time.

To save space, the following example includes the output for one metric and one HTTP status only (in this case, status 200 for MDS events).

{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "sni_http_response_time_seconds_sum",
                    "environment": "sitq3ga",
                    "host": "sni_router01",
                    "http_code": "200",
                    "instance": "sni_router01",
                    "job": "sni_router-log-exporter",
                    "method": "DELETE",
                    "request_uri": "mds_events",
                    "status": "200"
                },
                "value": [
                    1674482200.427,
                    "79.46000000000002"
                ]
            },
            ...
        ]
    }
}
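The average response time mentioned earlier in this section can be derived by dividing the total response time by the matching response count. The following sketch assumes that both values come from series with the same labels; the count of "40" is hypothetical, and the function name is illustrative. It returns None when no average can be computed, for example when Prometheus returns NaN.

```python
import math


def mean_seconds_per_response(total_seconds: str, response_count: str):
    """Average response time in seconds, or None if the inputs do not allow one."""
    total, count = float(total_seconds), float(response_count)
    # NaN means no requests were seen; a zero count would cause a division by zero.
    if math.isnan(total) or math.isnan(count) or count == 0:
        return None
    return total / count


# The total is taken from the example above; the count of "40" is hypothetical.
print(mean_seconds_per_response("79.46000000000002", "40"))
print(mean_seconds_per_response("NaN", "40"))
```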