Skip to content

Fixes Root cause of sharding race condition and allows driver to run on a custom port #2339

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

avinash-bharti
Copy link

This PR has two parts

  1. Parallel session execution
    1. Fixes The root cause of
      1. Unable to run parallel/sharded tests on Android #1853
      2. Fix Android Parallel/Shards Race Condition #1867
    2. Add support for
      1. [Feature Request] Run maestro on multiple devices at the same time on a single computer #1485
      2. Sharding/parallel execution improvements #1818
  2. Adds --driver-host-port support for Driver server similar to --port for dadb server.

Part 1: Parallel session execution

Steps to reproduce

  1. Run parallel sessions on different devices using multiple instance of maestro cli.
    1. Test will run in one instance but will fail in another due to grpc io exception.
Screenshot 2025-02-28 at 9 23 57 AM 4. This same issue was faced for sharding flow in this issue #1853 and was fixed by a hack in this PR #1867

Root cause of the issue

  1. Maestro test execution logs of the errored session shows that Issue happens when maestro makes first call to the driver server on mobile device.
09:23:13.850 [ INFO] maestro.Maestro.invoke: Getting device info
09:23:23.295 [ERROR] maestro.cli.runner.TestRunner.runCatching: Failed to run flow
io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
	at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:275)
  1. On checking logcat log of the device, found that for this errored session no driver app was installed and no instrumentation command was triggered and the device driver didn't receive the device info request.
  2. This is happening because of connectToExistingSession feature which reconnects to existing session in case of some interruption or retries of session execution occurs.
    1. How connectToExistingSession feature works?
      1. This feature was implemented before introducing sharding feature [New Feature] Sharding / Parallel Execution #1732
      2. On every new session creation it checks in keystore (~/.maestro/sessions) if any session other than current one exists that were active in last 21 second. If yes, then skip setting up the driver on the device (Assuming driver was set already in last session) Tweak: allow multiple parallel Maestro sessions #575
    2. Why this feature doesn't goes well with sharding and create issues?
      1. Sharding runs parallel sessions on different devices
      2. Due to connectToExistingSession it skips setting up driver on devices of session which were created after the first session initialisation of the shard (race condition).
        1. Heartbeat creates entry of session_platform key and timestamp as value on session creation and keeps updating it every 5 sec during whole session lifecycle.
      3. Since the device of the second session doesn't have the android server setup, grpc call couldn't find any server on that particular port and gives io exception.
    3. Why PR Fix Android Parallel/Shards Race Condition #1867 (Adding a delay of 1 sec while recording first heartbeat) solves the issue?
      1. In sharding multiple session creation happens parallely at the same time.
      2. Session creation gets done in the very first sec and doesn't finds any existing session in key value storage and doesn't skip any driver setup on any device.

Part 2: Add custom port support for driver

  1. In case of sharding, a random free port(7001..7128) gets allocated for driver server while in case of one shard/no shard a fix port 7001 gets allocated for driver server.
  2. When running parallel session using multiple cli instances, same 7001 port gets allocated for different device and creates issue.

Proposed changes

Parallel session

  1. For connectToExistingSession, Instead of relying on presence of any existing session for that particular platform, rely on if any session for that particular deviceId and platform is present in the keystore.
  2. Why to use deviceId?
    1. A session gets always mapped to a single device id and in case of session reconnect/retry, we would want to reconnect to existing device that was used initially in the session and not any random device.

Custom port for driver

  1. Allow --driver-host-port setting for device driver to start server on.
    1. Previously value of --port was getting assigned to device driver server but was reverted back due to adb port clash here fix: update maestro test to respect --host and --port arguments #2235 (comment)
    2. This custom port will allow user's to let maestro start driver server on any given port.
      copilot:summary

Testing

  1. Ran multiple instances of maestro cli on a real android device and on a android simulator with custom driver ports
1. Maestro/maestro-cli/build/install/maestro/bin/maestro  --driver-host-port 8001 --device=Y9HE5L4L8P9T4P99  test run-test.yml

3. Maestro/maestro-cli/build/install/maestro/bin/maestro  --device=emulator-5554 --driver-host-port 8003 test run-test.yml
Screenshot 2025-02-28 at 10 03 33 AM
  1. Ran sharding tests on two android devices
Screenshot 2025-02-28 at 10 15 20 AM 3. New session device id keystore format Screenshot 2025-02-28 at 11 42 01 AM

Issues fixed

#1853
#1867
#1485
#2235 (comment)

@ice-cap0
Copy link

lfg

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants