POST /boxes/{boxId}/actions/click
JavaScript
import GboxSDK from "gbox-sdk";

const gboxSDK = new GboxSDK({
  apiKey: process.env["GBOX_API_KEY"] // This is the default and can be omitted
});

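async function main() {
  // Create an Android box to run actions against
  const box = await gboxSDK.create({ type: "android" });

  // Click at explicit screen coordinates
  await box.action.click({
    x: 100,
    y: 100
  });

  // Natural language click
  await box.action.click({
    target: "login button"
  });
}

// Report failures instead of leaving the promise rejection unhandled
main().catch((err) => {
  console.error(err);
  process.exitCode = 1;
});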
{
  "message": "Action executed successfully",
  "actionId": "c9bdc193-b54b-4ddb-a035-5ac0c598d32d",
  "actual": {
    "x": 350,
    "y": 250,
    "button": "left",
    "double": false,
    "modifierKeys": [
      "control"
    ]
  },
  "screenshot": {
    "trace": {
      "uri": "..."
    },
    "before": {
      "uri": "..."
    },
    "after": {
      "uri": "..."
    }
  }
}

Authorizations

Authorization
string
header
required

Enter your API key in the format: Bearer <API_KEY>. Get a key from https://gbox.ai

Path Parameters

boxId
string
required

Box ID

Example:

"c9bdc193-b54b-4ddb-a035-5ac0c598d32d"
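
For callers who are not using the SDK, the request can be assembled by hand from the authorization header and path parameter described above. The sketch below only builds the URL, headers, and body; it does not send anything. This page does not state the API host or version prefix, so `baseUrl` is a placeholder the caller must supply, and `buildClickRequest` is a hypothetical helper name rather than part of any GBOX library.

```javascript
// Hypothetical helper: builds (but does not send) a click request.
// Assumption: `baseUrl` is supplied by the caller; this page does not document it.
function buildClickRequest(baseUrl, boxId, apiKey, body) {
  if (!boxId) throw new Error("boxId is required");
  if (!apiKey) throw new Error("apiKey is required");

  // Drop trailing slashes so the path is joined with exactly one "/"
  const root = baseUrl.replace(/\/+$/, "");
  const url = `${root}/boxes/${encodeURIComponent(boxId)}/actions/click`;

  return {
    url,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}
```

The returned pair can be passed straight to `fetch(url, init)` in any runtime that provides `fetch`.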

Body

application/json
  • Click Action
  • Click Action with Natural Language
  • Click Action by Element

Mouse click action configuration

x
number
required

X coordinate of the click

Example:

350

y
number
required

Y coordinate of the click

Example:

250

button
enum<string>
default:left

Mouse button to click

Available options:
left,
right,
middle
Example:

"left"

double
boolean
default:false

Whether to perform a double click

Example:

false

modifierKeys
enum<string>[]

Modifier keys to hold while performing the click (e.g., control, shift, alt). Supports the same key values as the pressKey action.

Available options:
a,
b,
c,
d,
e,
f,
g,
h,
i,
j,
k,
l,
m,
n,
o,
p,
q,
r,
s,
t,
u,
v,
w,
x,
y,
z,
0,
1,
2,
3,
4,
5,
6,
7,
8,
9,
f1,
f2,
f3,
f4,
f5,
f6,
f7,
f8,
f9,
f10,
f11,
f12,
control,
alt,
shift,
meta,
win,
cmd,
option,
arrowUp,
arrowDown,
arrowLeft,
arrowRight,
home,
end,
pageUp,
pageDown,
enter,
space,
tab,
escape,
backspace,
delete,
insert,
capsLock,
numLock,
scrollLock,
pause,
printScreen,
;,
=,
,,
-,
.,
/,
`,
[,
\,
],
',
numpad0,
numpad1,
numpad2,
numpad3,
numpad4,
numpad5,
numpad6,
numpad7,
numpad8,
numpad9,
numpadAdd,
numpadSubtract,
numpadMultiply,
numpadDivide,
numpadDecimal,
numpadEnter,
numpadEqual,
volumeUp,
volumeDown,
volumeMute,
mediaPlayPause,
mediaStop,
mediaNextTrack,
mediaPreviousTrack
Example:
["control", "shift"]
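
The coordinate-based fields above (x, y, button, double, modifierKeys) can be checked on the client before a request is sent. The following is a minimal sketch, not the service's own validation: `buildClickBody` is a hypothetical helper, it covers only the coordinate "Click Action" variant, and it checks that modifierKeys is an array of strings without checking each value against the full key list.

```javascript
// Hypothetical client-side builder for the coordinate-based click body.
// Defaults mirror the documented ones: button "left", double false.
const CLICK_BUTTONS = ["left", "right", "middle"];

function buildClickBody({ x, y, button = "left", double = false, modifierKeys } = {}) {
  if (!Number.isFinite(x) || !Number.isFinite(y)) {
    throw new TypeError("x and y are required and must be numbers");
  }
  if (!CLICK_BUTTONS.includes(button)) {
    throw new RangeError(`button must be one of: ${CLICK_BUTTONS.join(", ")}`);
  }
  if (typeof double !== "boolean") {
    throw new TypeError("double must be a boolean");
  }

  const body = { x, y, button, double };

  // modifierKeys is optional; only its shape is checked here
  if (modifierKeys !== undefined) {
    if (!Array.isArray(modifierKeys) || !modifierKeys.every((k) => typeof k === "string")) {
      throw new TypeError("modifierKeys must be an array of strings");
    }
    body.modifierKeys = [...modifierKeys];
  }
  return body;
}
```

For example, `buildClickBody({ x: 350, y: 250, modifierKeys: ["control", "shift"] })` yields a body with the default button and double values filled in.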
options
Action Common Options · object

Action options. When options.screenshot is provided, ALL deprecated screenshot fields (outputFormat, presignedExpiresIn, screenshotDelay, screenshotRange, includeScreenshot) will be completely ignored.

Example:
{
  "screenshot": {
    "outputFormat": "base64",
    "presignedExpiresIn": "30m",
    "delay": "500ms",
    "phases": ["before", "after"]
  }
}
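
Because the deprecated screenshot fields are ignored whenever options.screenshot is present, it can help to flag them before sending. The sketch below lists which deprecated fields would be ignored. It assumes the deprecated fields sit on the same object as `screenshot`, which this page does not spell out, and `ignoredScreenshotFields` is a hypothetical helper name.

```javascript
// Deprecated field names as listed in the options description.
const DEPRECATED_SCREENSHOT_FIELDS = [
  "outputFormat",
  "presignedExpiresIn",
  "screenshotDelay",
  "screenshotRange",
  "includeScreenshot",
];

// Assumption: deprecated fields are siblings of `screenshot` on the options object.
// Returns the deprecated fields that would be ignored for the given options.
function ignoredScreenshotFields(options = {}) {
  if (options.screenshot === undefined) return [];
  return DEPRECATED_SCREENSHOT_FIELDS.filter((name) => name in options);
}
```

An empty result means either that options.screenshot is not set or that no deprecated fields are mixed in with it.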

Response

200 - application/json

Click action executed successfully. The response includes the actual coordinates where the click was performed, which is especially useful when using natural language targeting.

The result of the click action, including the parameters that were actually used. The actual field reports the exact coordinates, button, and modifier keys applied, which shows precisely what was clicked when the target was given in natural language or as an element.

message
string
required

Status message describing the result of the action.

Example:

"Action executed successfully"

actionId
string
required

Unique identifier for each action. Use this ID to locate the action and report issues.

Example:

"c9bdc193-b54b-4ddb-a035-5ac0c598d32d"

actual
Click Action Actual Parameters · object
required

Actual parameters used when executing the click action, including coordinates (x, y), button type (left/right/middle), modifier keys (control/shift/alt), and whether it was a double click. Field names match the input parameters.

Example:
{
  "x": 350,
  "y": 250,
  "button": "left",
  "double": false,
  "modifierKeys": ["control"]
}
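
Since actual mirrors the input field names, it is easy to log what a click resolved to, particularly after a natural language request where no coordinates were supplied. Below is a small sketch of that idea; `summarizeClick` is a hypothetical helper that assumes a response object shaped like the example on this page.

```javascript
// Hypothetical helper: one-line summary of a click response for logs.
// Assumes `response.actual` and `response.actionId` as shown in the example response.
function summarizeClick(response) {
  const { x, y, button, double, modifierKeys = [] } = response.actual;
  const kind = double ? "double-click" : "click";
  const mods = modifierKeys.length ? ` with ${modifierKeys.join("+")}` : "";
  return `${button} ${kind} at (${x}, ${y})${mods} [action ${response.actionId}]`;
}
```

Including the actionId in the log line keeps the identifier at hand when an issue needs to be reported.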
screenshot
Action Result Screenshot · object

Optional screenshot data. Present only when screenshots are requested, either via options.screenshot.phases or via the deprecated screenshot fields.

Example:
{
  "trace": {
    "uri": "..."
  },
  "before": {
    "uri": "..."
  },
  "after": {
    "uri": "..."
  }
}
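
Because screenshot is optional and may carry only the phases that were requested, client code should not assume every entry exists. The sketch below collects whichever URIs are present; `screenshotUris` is a hypothetical helper, and the phase names (trace, before, after) are taken from the example above.

```javascript
// Hypothetical helper: returns a { phase: uri } map of the screenshots present.
// Tolerates a missing `screenshot` object or missing individual phases.
function screenshotUris(response) {
  const shots = response.screenshot ?? {};
  const result = {};
  for (const phase of ["trace", "before", "after"]) {
    const uri = shots[phase]?.uri;
    if (typeof uri === "string") result[phase] = uri;
  }
  return result;
}
```

A response without screenshot data yields an empty object, so callers can iterate over the result without extra guards.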