POST /model
JavaScript
import GboxSDK from "gbox-sdk";

const gboxSDK = new GboxSDK({
  apiKey: process.env["GBOX_API_KEY"] // This is the default and can be omitted
});

async function main() {
  const result = await gboxSDK.model.call({
    model: "gbox-handy-1",
    screenshot: "https://gru-activate2-public-assets.s3.us-west-2.amazonaws.com/jessica/screenshot-1759332945616-pu0ovj.png",
    action: {
      type: "click",
      target: "the VSCode app icon on the bottom dock"
    }
  });

  console.log("Result:");
  console.info(JSON.stringify(result, null, 2));
}

main();
{
  "response": {
    "type": "click",
    "coordinates": {
      "x": 520,
      "y": 340
    }
  },
  "id": "123e4567-e89b-12d3-a456-426614174000"
}
Generate precise UI element coordinates using the gbox-handy-1 model. This specialized model analyzes screenshots and instructions to identify exact coordinates for UI operations.

Supported Actions

The model supports three core actions that cover nearly all coordinate-based UI interactions:
  • Click: Identify precise tap/click coordinates for buttons, links, and interactive elements
  • Drag: Calculate start and end coordinates for drag operations (e.g., swipe, scroll bars)
  • Scroll: Determine optimal scroll coordinates and directions
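As a rough sketch of how a client might guard against unsupported action types before calling the endpoint, the helper below checks the `type` field against the three actions listed above. `assertSupportedAction` and `SUPPORTED_ACTION_TYPES` are hypothetical names, not part of `gbox-sdk`, and only the click action's fields (`type`, `target`) are taken from this page; the fields of drag and scroll actions are not assumed here.

```javascript
// Action types listed in this reference.
const SUPPORTED_ACTION_TYPES = ["click", "drag", "scroll"];

// Hypothetical helper (not part of gbox-sdk): rejects action objects whose
// `type` is not one of the documented action types, otherwise returns the
// action unchanged.
function assertSupportedAction(action) {
  if (!action || typeof action !== "object") {
    throw new TypeError("action must be an object");
  }
  if (!SUPPORTED_ACTION_TYPES.includes(action.type)) {
    throw new RangeError(`Unsupported action type: ${String(action.type)}`);
  }
  return action;
}

// The click action from the example request passes the check.
const clickAction = assertSupportedAction({
  type: "click",
  target: "the VSCode app icon on the bottom dock",
});
console.log(clickAction.type);
```

Checking locally only catches typos in `type`; the service remains the authority on what a valid action looks like.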

Authorizations

Authorization
string
header
required

Send your API key in the Authorization header in the format: Bearer <token>. Get a key from https://gbox.ai
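For callers not using the SDK, the header format can be illustrated with a small helper that builds the options object for `fetch()`. This is a sketch under assumptions: `buildModelRequestInit` is a hypothetical name, and the full endpoint URL is deliberately left to the caller because this page only gives the path `/model`.

```javascript
// Hypothetical helper: builds the `init` argument for fetch().
// The caller supplies the full endpoint URL separately.
function buildModelRequestInit(apiKey, body) {
  if (!apiKey) {
    throw new Error("API key is required");
  }
  return {
    method: "POST",
    headers: {
      // Documented format: Bearer <token>
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  };
}

const init = buildModelRequestInit("example-token", {
  model: "gbox-handy-1",
  screenshot: "https://gru-activate2-public-assets.s3.us-west-2.amazonaws.com/jessica/screenshot-1759332945616-pu0ovj.png",
  action: { type: "click", target: "the VSCode app icon on the bottom dock" },
});
console.log(init.headers.Authorization); // prints "Bearer example-token"
```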

Body

application/json

Model request

screenshot
string
required

HTTP(S) URL to screenshot image

Example:

"https://gru-activate2-public-assets.s3.us-west-2.amazonaws.com/jessica/screenshot-1759332945616-pu0ovj.png"
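Since the field expects an HTTP(S) URL, a client may want to check the value before sending it. The following is a minimal sketch using the standard `URL` class; `isHttpUrl` is a hypothetical helper, and the check only covers the URL scheme, not whether the image is reachable.

```javascript
// Hypothetical helper: true only for syntactically valid http: or https: URLs.
function isHttpUrl(value) {
  try {
    const { protocol } = new URL(value);
    return protocol === "http:" || protocol === "https:";
  } catch {
    // new URL() throws on strings that are not absolute URLs.
    return false;
  }
}

console.log(isHttpUrl("https://gru-activate2-public-assets.s3.us-west-2.amazonaws.com/jessica/screenshot-1759332945616-pu0ovj.png")); // prints true
console.log(isHttpUrl("file:///tmp/screenshot.png")); // prints false
```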

action
object
required

Structured action object (click, drag, or scroll). The structure depends on the action type:

  • Click Action
  • Drag Action
  • Scroll Action
Example:
{
"type": "click",
"target": "the VSCode app icon on the bottom dock"
}
model
enum<string>
default:gbox-handy-1

Model to use

Available options:
gbox-handy-1
Example:

"gbox-handy-1"

Response

200 - application/json

Model response data structure

response
object
required

Model response data. The structure depends on the action type:

  • Model Click Response Data
  • Model Drag Response Data
  • Model Scroll Response Data
id
string
required

Unique ID of this request. It can be used for issue reporting and feedback.

Example:

"123e4567-e89b-12d3-a456-426614174000"
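To show how the response might be consumed, the sketch below pulls the coordinates out of a click response shaped like the example on this page and includes the request `id` in any error message so it can be quoted when reporting an issue. `getClickPoint` is a hypothetical helper; drag and scroll responses are not handled because their fields are not shown here.

```javascript
// Hypothetical helper: returns { x, y } from a click response,
// or throws an error that mentions the request id.
function getClickPoint(result) {
  const response = result && result.response;
  const id = result && result.id;
  if (!response || response.type !== "click") {
    throw new Error(`Expected a click response (request id: ${id})`);
  }
  const { x, y } = response.coordinates || {};
  if (typeof x !== "number" || typeof y !== "number") {
    throw new Error(`Click response has no numeric coordinates (request id: ${id})`);
  }
  return { x, y };
}

// Sample response copied from the example on this page.
const sample = {
  response: { type: "click", coordinates: { x: 520, y: 340 } },
  id: "123e4567-e89b-12d3-a456-426614174000",
};
const point = getClickPoint(sample);
console.log(`${point.x},${point.y}`); // prints "520,340"
```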