Skip to content
This repository was archived by the owner on Oct 25, 2023. It is now read-only.

add -image locator strategy and related logic #235

Merged
merged 34 commits into from
Jul 30, 2018
Merged

Conversation

jlipps
Copy link
Member

@jlipps jlipps commented Jul 15, 2018

First stab at putting the find by image logic into base driver.

I know there's no tests and docstrings, will add those. Just wanted to get the ideas here for any initial comments.

I haven't yet done extensive testing, but the happy path works for me locally.

@@ -1,8 +1,17 @@
import log from '../logger';
import Jimp from 'jimp';
import Promise from 'bluebird';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually call it B

import Jimp from 'jimp';
import Promise from 'bluebird';
import { errors } from '../../..';
import { Buffer } from 'buffer';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be imported explicitly?

// findElement(s)FromElement), this doesn't make sense, so use of the
// -image locator strategy won't pass the strategy validation filter as is
// desired
return await this.findByImage(selector);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say there's still a difference, because in case of multiple elements we need to return an empty array or throw an exception if only one element is expected

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in case of multiple elements (or anything other than strict findElement) it will throw an error saying there is no such strategy (https://github.com/appium/appium-base-driver/blob/master/lib/basedriver/commands/find.js#L19).

@@ -47,6 +65,120 @@ commands.findElementsFromElement = async function (strategy, selector, elementId
return await this.findElOrElsWithProcessing(strategy, selector, true, elementId);
};

helpers.findByImage = async function (b64Template, {checkingStaleness = false} = {}) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> shouldCheckStaleness

helpers.findByImage = async function (b64Template, {checkingStaleness = false} = {}) {
let threshold = DEFAULT_MATCH_THRESHOLD;
if (this.settings) {
threshold = this.settings.getSettings().imageMatchThreshold;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we only assign it if imageMatchThreshold is defined?

// someone might have sent in a template that's larger than the screen
// dimensions. If so let's check and cut it down to size since the algorithm
// will not work unless we do
b64Template = await this.ensureTemplateSize(b64Template, screenWidth, screenHeight);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather skip this check and let the client to be responsible for template size. It might be too expensive to call getWindowSize each time the lookup is executed.

// are two potential types of mismatch: aspect ratio mismatch and scale
// mismatch. we need to detect and fix both

const screenAR = screenWidth / screenHeight;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

screen aspect ration is not going to be changed during the session, so it might be safe to cache it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it could be changed during session if the device orientation is changed

@@ -276,6 +280,11 @@ class BaseDriver extends Protocol {
// have an unexpected shutdown event, for example, we can cancel it from
// outside, rejecting the current command immediately
this.curCommandCancellable = new B((r) => { r(); }).then(() => { // eslint-disable-line promise/prefer-await-to-then
// if one of the args is an image element, handle it separately
const imgElId = getImgElFromArgs(args);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks a bit hacky

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, it was the only way I could think of to pass control to ImageElement in the case of an image element existing in the arguments. I agree it's not super pretty. Any other ideas?

throw new errors.StaleElementReferenceError();
}

if (!this.equals(newImgEl)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure if click should mutate the element

this.rect = _.clone(newImgEl.rect);
}

// now set up the W3C touch action to click on the image by position
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this going to work for all the drivers? I'd rather add a fallback to MJSONWP

const imgEl = driver._imgElCache.get(imgElId);

switch (cmd) {
case "click":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

single quotes

}
}

class ImageElementCache {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we use LRU cache implementation from a lib instead, for example https://www.npmjs.com/package/lru-cache?

@@ -436,6 +437,12 @@ function driverShouldDoJwpProxy (driver, req, command) {
}
}

// if it looks like we have an image element in the body, never proxy
const stringBody = JSON.stringify(req.body);
if (stringBody && stringBody.indexOf(IMAGE_ELEMENT_PREFIX) !== -1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

includes?

return b64Screenshot;
}

// otherwise, if they don't match, it could spell problems for the accuracy
Copy link
Contributor

@mykola-mokhnach mykola-mokhnach Jul 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather make this configurable from settings to make it as performant as possible by default and only do these performance-hitting transformations if explicitly asked by the client

@jlipps
Copy link
Member Author

jlipps commented Jul 19, 2018

(note that this PR now depends on appium/appium-support#77)

@@ -111,7 +110,8 @@ helpers.findByImage = async function (b64Template, {checkingStaleness = false} =
};

helpers.ensureTemplateSize = async function (b64Template, screenWidth, screenHeight) {
let {width: tplWidth, height: tplHeight, imgObj} = await getJimpData(b64Template);
let imgObj = await imageUtils.getJimpData(b64Template);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the function called getJimpImage or getJimpData ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, fixed.

@jlipps jlipps force-pushed the jlipps-image-element branch from 6fd3908 to 44b1a06 Compare July 19, 2018 06:19
}
/**
* @typedef {Object} FindByImageOptions
* @property {boolean} shouldCheckStaleness - whether this call to find an
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we also mention the default values for these properties or if any of them is required to set?

helpers.findByImage = async function (b64Template, {
shouldCheckStaleness = false,
multiple = false,
} = {}) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume = {} is redundant

shouldCheckStaleness = false,
multiple = false,
} = {}) {
const threshold = this.settings.getSettings().imageMatchThreshold;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does getSettings return a promise?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const {
  imageMatchThreshold: threshold,
  fixImageTemplateSize
} = this.settings.getSettings();

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does getSettings return a promise?

no

// handle the element not found response below. In the case where get some
// _other_ kind of error, it means something blew up totally apart from the
// implicit wait timeout. We should not mask that error and instead throw
// it straightaway
if (!err.match(/Condition unmet/)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should me match err.message instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes!


// which method to use for tapping by coordinate for image elements. the
// options are 'w3c' or 'mjsonwp'
imageElementTapStrategy: 'w3c',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be detected automatically?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how can it be detected automatically? we don't know which functionality the downstream driver supports. i.e., this doesn't (and shouldn't) correspond to the protocol used. it corresponds to the functionality available in the driver.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we know in which mode the downstream driver is running (w3c/jsonwp), don't we?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that we do. And anyway the mode doesn't matter: all we're doing is calling driver.performActions or driver.performTouch. There's no protocol involved. I'm just giving access to these methods protocol-related names because that's how they're known. But maybe strings like w3cAction and touchAction would be better.

But your comment makes me realize that this will not be proxied so in the case of WDA, we'll have to ensure the xcuitest driver has performActions on it, and manually proxy the command forward to WDA. That's not ideal.

@@ -439,7 +439,7 @@ function driverShouldDoJwpProxy (driver, req, command) {

// if it looks like we have an image element in the body, never proxy
const stringBody = JSON.stringify(req.body);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not quite sure about that. What if any other valid request includes accidentally includes '-image' substring in its body?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would have to include appium-image-element-. but you are right it is a risk. for a similar reason the webdriver spec does the whole element-6066-11e4-a52e-4f735466cecf thing.

one alternative is actually walking through the body recursively and seeing if we have an object with the exact form of an element. that seemed like it might be overkill. @imurchie what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having such suffix might be useful IMHO

actions: [
{type: 'pointerMove', x, y},
{type: 'pointerDown'},
{type: 'pause', duration: 500},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

500 is the duration of a long tap. single tap duration is about 100-125 ms

}

if (!rect) {
throw new errors.NoSuchElementError(); // eslint-disable-line no-unsafe-finally
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume this eslint directive is leftover from this line being in a finally block.

@@ -276,6 +280,11 @@ class BaseDriver extends Protocol {
// have an unexpected shutdown event, for example, we can cancel it from
// outside, rejecting the current command immediately
this.curCommandCancellable = new B((r) => { r(); }).then(() => { // eslint-disable-line promise/prefer-await-to-then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is existing, but can we change new B((r) -> { r(); } to B.resolve() (http://bluebirdjs.com/docs/api/promise.resolve.html)?

@jlipps jlipps force-pushed the jlipps-image-element branch from 981fcd4 to 353c789 Compare July 25, 2018 04:56
@jlipps
Copy link
Member Author

jlipps commented Jul 25, 2018

ok i've fixed all the issues reported so far and written all the unit tests that need writing. i will move on to writing functional tests in the xcui and uiauto2 drivers, so let's wait to merge this until i've at least run them locally. meanwhile i added a few significant commits since the last review so might want to have another look at it.

const IMAGE_ELEMENT_PREFIX = 'appium-image-element-';

const IMG_EL_BODY_RE = new RegExp(
`"(${W3C_ELEMENT_KEY}|${MJSONWP_ELEMENT_KEY})": ?` +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps ? -> \s*

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea, will update

@@ -408,6 +423,31 @@ class BaseDriver extends Protocol {
return false;
}

proxyRouteIsAvoided (sessionId, method, url) {
for (let avoidSchema of this.getProxyAvoidList(sessionId)) {
if (!_.isArray(avoidSchema) || avoidSchema.length !== 2) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will throw inside avoidSchema.length if avoidSchema is null

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

existing code (moved from protocol) but good point, i'll fix

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, this is OK. if avoidSchema is not an array, the condition will short-circuit since the LHS is true, and avoidSchema.length will never be evaluated

if (!_.includes(['GET', 'POST', 'DELETE'], avoidMethod)) {
throw new Error(`Unrecognized proxy avoidance method '${avoidMethod}'`);
}
if (!(avoidPathRegex instanceof RegExp)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps _.isRegExp ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, existing code but will fix

return false;
}

async proxyCommand (/*url, method, body*/) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to implement it in each driver?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

each driver does proxying differently (uiauto2 calls this.uiautomator2.jwproxy for example, which isn't the case with xcui)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cannot we just call click or performActions? Why the base driver should care about proxying?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if a driver doesn't implement performActions itself, but merely proxies requests for it to a downstream driver? for example, if uiauto2 server implemented the /actions endpoint, but uiauto2 driver did not. this is a totally valid case. we have many endpoints which are not specified explicitly in a driver and instead we rely on proxying to access the endpoint on a downstream driver.

does that make sense? maybe i'm confused but i think it's a real issue (at least in principle)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't try to outsmart drivers. If some endpoint is not implemented there and it exists in the downstream driver then it would be better and more practical to implement the missing stuff directly there. Adding new requirements to base driver creates unnecessary complexity for the whole infrastructure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. i think you have a good point. the way i am doing it now we will also require drivers to implement the proxyCommand method. so it's not less onerous than requiring them to simply implement performActions and performTouch. i'll do the same thing as i have done elsewhere and simply throw a meaningful error if these methods do not exist on the driver.

if the driver typically proxies the action/touch commands, then it will need to basically implement a one-line function to support image elements, like:

commands.performActions = function (actions) {
  this.proxyCommand('POST', '/actions', actions);
};

In most cases this will be bypassed, but in the case of ImageElement actions, it will be called.

@jlipps
Copy link
Member Author

jlipps commented Jul 25, 2018

ok, tests have been written for xcuitest driver and uiauto2 driver (those PRs are up but won't pass until this is merged). from my perspective this is solid (at least for a beta, presumably we'll find more bugs or important behavior suggestions when it's in the wild).

@jlipps
Copy link
Member Author

jlipps commented Jul 26, 2018

will also work on documentation, btw.

@jlipps
Copy link
Member Author

jlipps commented Jul 27, 2018

the one build failure is a travis timeout, so I think this is good to go. Any other thoughts @imurchie @mykola-mokhnach? Otherwise I'm ready to merge.

if (avoidMethod === req.method && avoidPathRegex.test(normalizedUrl)) {
return false;
}
if (driver.proxyRouteIsAvoided(req.params.sessionId, req.method, req.originalUrl)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be useful to add a docstring to this new method

@jlipps
Copy link
Member Author

jlipps commented Jul 30, 2018

@mykola-mokhnach ok docstring added. let me know if you're happy with things as they stand now.

@jlipps
Copy link
Member Author

jlipps commented Jul 30, 2018

@imurchie how about you?

@jlipps jlipps merged commit ec72948 into master Jul 30, 2018
@jlipps jlipps deleted the jlipps-image-element branch July 30, 2018 20:17
@jlipps
Copy link
Member Author

jlipps commented Jul 30, 2018

published in [email protected]

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants