Much of automated UI testing relies on finding elements through text-based information in the DOM or page source, such as selectors and class names. However, this information isn’t always enough to interact reliably and accurately with certain types of elements, including canvas elements, PDFs, and elements that lack unique attributes.
With visual find, mabl uses AI to target elements based on their visual characteristics instead of text-based attributes in the DOM or page source.
Visual find is currently supported only for tap steps in mobile tests.
This article explains how to use visual find in your mobile tests.
Train a tap step with visual find
After launching your mobile application in the mabl Trainer, navigate the application to the state you want to test and add a new tap step: + (Add step) > Tap.
Select the target area
The step configuration menu selects “Visual find” by default. Click and drag to draw a box over the area you want to target.
mabl automatically sends a screenshot of the target area to an AI model, which returns a generated description of the selected area. mabl then performs the find using this GenAI description to confirm that the description finds a match.
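The article does not say how mabl decides that a find "matches". As a rough mental model only, the check can be pictured as comparing the box you drew with the box the find returns. The sketch below uses intersection over union for that comparison; the Box type, the function names, and the 0.5 threshold are all hypothetical and are not part of mabl.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in screenshot pixels (hypothetical type)."""
    left: float
    top: float
    width: float
    height: float


def intersection_over_union(a: Box, b: Box) -> float:
    """Return the overlap area divided by the combined area (0.0 to 1.0)."""
    overlap_w = min(a.left + a.width, b.left + b.width) - max(a.left, b.left)
    overlap_h = min(a.top + a.height, b.top + b.height) - max(a.top, b.top)
    if overlap_w <= 0 or overlap_h <= 0:
        return 0.0  # the boxes do not overlap at all
    intersection = overlap_w * overlap_h
    union = a.width * a.height + b.width * b.height - intersection
    return intersection / union


def description_matches(selected: Box, found: Box, threshold: float = 0.5) -> bool:
    """Assumed rule: the find matches if the boxes overlap enough."""
    return intersection_over_union(selected, found) >= threshold
```

Under this assumed rule, a description that leads the find to a different tile would produce little or no overlap with the drawn box, which is the situation where reselecting the area or editing the description helps.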
The image preview shows where mabl will perform the tap step in the selected area on execution. If you need to target a precise location within the selected area, you can adjust the x-y coordinates.
Review the description
If the description fails to highlight the correct element, reselect the area or manually edit the description and test the find again.
Even if the description highlights the correct element, check that it also expresses your intent. The model might generate a description that finds the right element during training but describes it in a way that doesn’t reflect what you ultimately want the step to target.
To illustrate, consider an app that displays many tiles in a grid.
For the selected tile, the model generated the description “A turquoise rounded square tile with the text ‘Car 10’ in the center”, which highlights the correct element. However, if the intent is to target a tile at a specific position on the page, you should manually update the description to reflect this intent (for example, “The second square from the left, third row from the top”) and test it out.
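The difference between the two descriptions only shows up once the screen changes. The following sketch is a plain simulation, not mabl code: it models the grid as a list of rows with made-up labels and a hypothetical four-column layout, then compares a content-based lookup with a position-based lookup before and after the tiles are reordered.

```python
def build_grid(labels, columns):
    """Split a flat list of tile labels into rows of the given width."""
    return [labels[i:i + columns] for i in range(0, len(labels), columns)]


def find_by_label(grid, label):
    """Content-based intent: return the 1-based (row, column) of the label."""
    for row_number, row in enumerate(grid, start=1):
        for column_number, cell in enumerate(row, start=1):
            if cell == label:
                return (row_number, column_number)
    return None


def find_by_position(grid, row, column):
    """Position-based intent: return whatever tile sits at (row, column)."""
    return grid[row - 1][column - 1]


# Hypothetical data: twelve tiles laid out four per row.
labels = [f"Car {n}" for n in range(1, 13)]
grid = build_grid(labels, columns=4)

# During training both intents point at the same tile.
trained_position = find_by_label(grid, "Car 10")    # third row, second column
trained_label = find_by_position(grid, 3, 2)        # the "Car 10" tile

# After the app reorders its tiles, the two intents diverge.
reordered = build_grid(list(reversed(labels)), columns=4)
label_now_at = find_by_label(reordered, "Car 10")   # the tile has moved
tile_now_there = find_by_position(reordered, 3, 2)  # a different tile
```

In the reordered grid, the content-based lookup follows the “Car 10” tile to its new location, while the position-based lookup stays on the third row, second column. Which behavior is correct depends entirely on what the test is meant to verify.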
When you are satisfied with the step, click Save to add it to your test.
Run tests with visual find
On execution, mabl sends the GenAI description and a screenshot captured at run time to the model, which returns the bounding boxes. To perform the tap step, mabl taps at the x-y coordinates that were specified during training, applied within the returned bounding box.
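One way to picture how trained coordinates carry over to a bounding box found at run time is to store the tap point as a fraction of the box's width and height. This is a minimal sketch under that assumption; the article does not state how mabl stores the coordinates, and the BoundingBox type and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Rectangle in screenshot pixels (hypothetical type)."""
    left: float
    top: float
    width: float
    height: float


def offset_from_training(selection: BoundingBox, tap_x: float, tap_y: float) -> tuple[float, float]:
    """Convert an absolute tap point into fractions of the selected area."""
    if selection.width <= 0 or selection.height <= 0:
        raise ValueError("selection must have a positive width and height")
    x_fraction = (tap_x - selection.left) / selection.width
    y_fraction = (tap_y - selection.top) / selection.height
    if not (0.0 <= x_fraction <= 1.0 and 0.0 <= y_fraction <= 1.0):
        raise ValueError("tap point must lie inside the selection")
    return x_fraction, y_fraction


def resolve_tap_point(box: BoundingBox, offset: tuple[float, float]) -> tuple[int, int]:
    """Apply the trained fractions to the box returned at run time."""
    x_fraction, y_fraction = offset
    return (
        round(box.left + x_fraction * box.width),
        round(box.top + y_fraction * box.height),
    )


# Training: an 80x40 area was selected and the tap point set a quarter of the way in.
trained_offset = offset_from_training(BoundingBox(100, 200, 80, 40), tap_x=120, tap_y=210)

# Run time: the element is found elsewhere and at twice the size.
tap_point = resolve_tap_point(BoundingBox(50, 300, 160, 80), trained_offset)
```

Storing fractions rather than absolute pixels keeps the tap in the same relative spot when the element moves or is rendered at a different size. A fixed pixel offset would be an equally plausible design.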