BigQuery integration – mabl help

While mabl provides many in-app dashboards for reviewing test output and activity, we recognize that your application may benefit from bespoke reports that are unique to your business. Using the mabl integration for BigQuery, you can extract, transform, and view such reports in real time in tools like BigQuery and Looker Studio.

With the mabl BigQuery integration, you can enable real-time custom reporting and make deeper analyses of testing patterns. This article provides instructions for setting up the mabl BigQuery integration and an overview of tables in the mabl BigQuery dataset.

Setup

In order to use the mabl BigQuery integration, you must use a paid GCP BigQuery account.

The BigQuery Sandbox feature that is available to unpaid trial Google Cloud accounts cannot be used with the mabl BigQuery feed export because streaming inserts are not supported by the BigQuery Sandbox.

Create a service account for mabl

Add the service account mabl-feeds@mabl-prod.iam.gserviceaccount.com as a BigQuery Data Editor to the destination Google Cloud Platform project. Learn more.

Alternative IAM permission setup

If you want to constrain the mabl mabl-feeds@mabl-prod.iam.gserviceaccount.com user to a greater extent, create a custom IAM role with the following permissions:

bigquery.datasets.create
bigquery.datasets.get
bigquery.tables.create
bigquery.tables.get
bigquery.tables.update
bigquery.tables.updateData

Assign the custom IAM role to the mabl service account on the target GCP project.

Configure the BigQuery integration

In the mabl app, visit Settings > Integrations and click on the + Setup button for the BigQuery integration.

On the BigQuery integration page, add the destination project ID, dataset name, and a prefix for the export table. Optionally, you may toggle on React personal information fields. Click save to confirm your settings.

Redact personal information fields

If you toggle on React personal information fields, mabl excludes the following fields from the BigQuery Export Feed:

trigger_user_email
actor_email

mabl replaces the entries for these fields with the value PII_REDACTED on export. Other free text fields, such as plan_name or labels, are still included. mabl does not inspect the content of these fields.

BigQuery table layout

The BigQuery integration provides a streaming export of mabl data. Upon running the first mabl test through the BigQuery Export Feed, the mabl_export dataset and following tables are created:

Plan runs: <table_prefix>_plan_run
Test runs: <table_prefix>_journey_run
Failure reasons: <table_prefix>_run_categorization
Performance test runs: <table_prefix>_performance_test_run
Activity feed: <table_prefix>_activity_feed

Tests are referred to as "journeys" in the mabl BigQuery Export Feed.

For example, if you used the default names when setting up the BigQuery integration in mabl, the following BigQuery tables are created:

<your-project>:mabl_export.mabl_plan_run
<your-project>:mabl_export.mabl_journey_run
<your-project>:mabl_export.mabl_run_categorization
<your-project>:mabl_export.mabl_performance_test_run
<your-project>:mabl_export.mabl_activity_feed

Partitions

The tables are partitioned by day, based on the columns:

<table_prefix>_plan_run on column start_time
<table_prefix>_journey_run on column start_time
<table_prefix>_run_categorization on column created_time
<table_prefix>_performance-test_run on column start_time
<table_prefix>_activity_feed on column created_time

If you want to discard test data after a set amount of time, configure a partition expiration in your BigQuery data warehouse.

Table schema

Plan run table

The *_plan_run table contains every plan run on the integrated workspace, written upon plan run completion.

Column Name	Type	Description
id	String	plan run id
plan_id	String	plan id
plan_name	String	plan name
plan_url	String	plan UI URL
status	String	plan run status
success	Boolean	overall plan run success
start_time	Timestamp	plan run start time UTC
stop_time	Timestamp	plan run stop time UTC
runtime_millis	Integer	total plan run time milliseconds
tags	String, Repeated	plan tags
labels	String, Repeated	plan labels
application_id	String	application under test id
application_name	String	application under test name
application_url	String	application under test UI URL
starting_url	String	application under test starting url
trigger_type	String	execution trigger type
trigger_user_id	String	execution triggering user (optional)
trigger_user_email	String	execution triggering user email (optional)
deployment_id	String	execution trigger deployment id (optional)
workspace_id	String	workspace name
workspace_name	String	workspace name

Test run table

The *_journey_run table contains every test run in the integrated workspace, written upon test run completion. If the test is part of a plan run, the test run may be written to the test run table before the plan run is complete.

Column Name	Type	Description
id	String	test run id
journey_id	String	test id
journey_name	String	test name
journey_url	String	test UI URL
view_output_url	String	test run output UI URL
status	String	test run status
success	Boolean	overall test run success
start_time	Timestamp	test run start time UTC
stop_time	Timestamp	test run stop time UTC
runtime_millis	Integer	total test run time milliseconds
tags	String, Repeated	test tags
labels	String, Repeated	test labels
mabl_branch	String	mabl branch test ran against
environment_id	String	test run environment
environment_name	String	test run environment name
environment_url	String	test run environment UI URL
browser_type	String	browser under test type
browser_version	String	browser under test version
plan_id	String	plan id
plan_name	String	plan name
plan_url	String	plan UI URL
plan_run_id	String	plan run id
scenario_id	String	datatable scenario id
scenario_name	String	datatable scenario name
workspace_id	String	workspace id
workspace_name	String	workspace name

Failure reason table

The *_run_categorization table contains every test run failure reason for the integrated workspace, written when a user sets a failure reason on a given test run output.

Column name	Type	Description
id	String	unique id
category	String	failure reason (NULL on removed)
journey_id	String	test id
journey_name	String	test name
journey_run_id	String	test run id
created_time	Timestamp	grading time UTC
start_time	Timestamp	test run start time UTC
stop_time	Timestamp	test run stop time UTC
grader_user_id	String	grading user id
grader_user_email	String	grading user email (optional)
workspace_id	String	workspace id
workspace_name	String	workspace name
tombstone	Boolean	Indicator that failure reason has been removed from test run. Will be `true` if test run no longer has a failure reason and `category` will be NULL

Because BigQuery exports are done in a "data warehouse" fashion, tables are never updated or deleted. Instead, tables are only appended. As a result, a new row is written for each change to a test run's failure reason, which is stored in the category dimension.

Group by journey_run_id and take the latest create_time value to find the most recent category given to a test run. If a user clears the test run of a failure reason, the category is NULL.

You can use the following sample query to determine the latest failure reason. Replace my-project.mabl_export.mabl_run_categorization with the table name in your BigQuery system.

SELECT 
  failure_reason.* 

FROM ( 
  -- Get latest update time 
  SELECT 
    journey_run_id, 
    MAX(created_time) AS last_updated_time 
  FROM 
    `my-project.mabl_export.mabl_run_categorization` 
  GROUP BY 
    journey_run_id ) AS latest_categories 

JOIN `my-project.mabl_export.mabl_run_categorization` AS failure_reason 
ON failure_reason.created_time = latest_categories.last_updated_time 

-- Ignore removed failure reasons 
WHERE failure_reason.category IS NOT NULL

Performance test run table

The *_performance_test_run table contains metrics for every performance test run in the integrated workspace, written upon test completion. If the performance test is part of a plan run, the performance test run may be written to the performance test run table before the plan run is complete.

You may join the *_performance_test_run table with the *_journey_run table on id.

Column Name	Type	Description
id	String	test run id
journey_id	String	test id
journey_name	String	test name
journey_url	String	test UI URL
journey_run_id	String	test run output UI URL
start_time	Timestamp	test run start time UTC
stop_time	Timestamp	test run stop time UTC
api_virtual_user_hours_consumed	Float	virtual user hours consumed of performance test run
api_concurrent_user_count	Integer	concurrent users of performance test run
api_average_response_time	Integer	average response time in milliseconds of performance test run
api_requests_per_second	Integer	average requests per second of performance test run
api_error_rate_percent	Float	average error rate as a percent [0-100] of performance test run
workspace_id	String	workspace id
workspace_name	String	workspace name

Activity feed table

The *_activity_feed table contains every activity feed entry in the integrated workspace, including activity data for all types supported by the activity feed in the mabl UI.

Column Name	Type	Description
id	String	activity feed entry id
entity_id	String	id of the entity changed
entity_name	String	name of the entity changed
entity_type	String	type of the entity changed
entity_canonical_id	String	test run output UI URL
entity_version_number	Integer	version number for versioned entities
action_type	String	type of action taken
action_timestamp	Timestamp	time the action was taken
created_time	Timestamp	activity feed entry creation time
actor_id	String	id of the person or API key that took the action
actor_email	String	email of the actor, if applicable
entity_previous_id	String	id of the entity before this action
entity_previous_name	String	name of the entity before this action
entity_previous_version_number	Integer	version of the entity before this action
workspace_id	String	workspace id
workspace_name	String	workspace name

Next steps

After setting up the BigQuery integration, you can start querying the data or creating custom dashboards and charts using Looker Studio, Google Sheets, or another tool.