POST /model
JavaScript
import GboxSDK from "gbox-sdk";

const gboxSDK = new GboxSDK({
  apiKey: process.env["GBOX_API_KEY"] // This is the default and can be omitted
});

async function main() {
  const result = await gboxSDK.model.call({
    model: "gbox-handy-1",
    screenshot: "https://gru-activate2-public-assets.s3.us-west-2.amazonaws.com/jessica/screenshot-1759332945616-pu0ovj.png",
    action: {
      type: "click",
      target: "the VSCode app icon on the bottom dock"
    }
  });

  console.log("Result:");
  console.info(JSON.stringify(result, null, 2));
}

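// Report failures instead of leaving the promise rejection unhandled
main().catch((error) => {
  console.error("Model call failed:", error);
  process.exitCode = 1;
});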
{
  "response": {
    "type": "click",
    "coordinates": {
      "x": 520,
      "y": 340
    }
  },
  "id": "123e4567-e89b-12d3-a456-426614174000"
}
Generate precise UI element coordinates using the gbox-handy-1 model. This specialized model takes a screenshot and an action instruction and returns the coordinates at which to perform that UI operation.

Supported Actions

The model supports three core actions that cover nearly all coordinate-based UI interactions:
  • Click: Identify precise tap/click coordinates for buttons, links, and interactive elements
  • Drag: Calculate start and end coordinates for drag operations (e.g., swipe, scroll bars)
  • Scroll: Determine optimal scroll coordinates and directions

Authorizations

Authorization
string
header
required

Enter your API key in the format: Bearer <API_KEY>. Get a key from https://gbox.ai
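
For raw HTTP calls made without the SDK, the header can be assembled as in the sketch below. It assumes the key is sent as a standard Bearer token, as the format above suggests. `buildAuthHeaders` is a hypothetical helper and is not part of gbox-sdk.

```javascript
// Hypothetical helper (not part of gbox-sdk): builds headers for a raw HTTP request.
function buildAuthHeaders(apiKey) {
  if (typeof apiKey !== "string" || apiKey.trim() === "") {
    throw new Error("API key is missing; set GBOX_API_KEY");
  }
  return {
    // Assumption: the key is passed as a Bearer token in the Authorization header
    Authorization: `Bearer ${apiKey.trim()}`,
    "Content-Type": "application/json",
  };
}
```

A typical call would be `buildAuthHeaders(process.env["GBOX_API_KEY"])`.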

Body

application/json

Model request

screenshot
string
required

Screenshot image, given either as an HTTP(S) URL pointing to an image file or as a base64-encoded data URI of the form 'data:image/png;base64,[data]' or 'data:image/jpeg;base64,[data]'. Data URIs support PNG and JPEG only.
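
When the screenshot exists only as local bytes, it has to be wrapped in a data URI first. The following is a minimal sketch for Node.js; `toDataUri` is a hypothetical helper name, and the caller is assumed to know the image's MIME type.

```javascript
// Hypothetical helper: wraps raw image bytes in a base64 data URI.
function toDataUri(bytes, mimeType = "image/png") {
  // Only PNG and JPEG are documented as supported for base64 input
  const supported = ["image/png", "image/jpeg"];
  if (!supported.includes(mimeType)) {
    throw new Error(`Unsupported image type: ${mimeType}`);
  }
  const base64 = Buffer.from(bytes).toString("base64");
  return `data:${mimeType};base64,${base64}`;
}
```

For a file on disk, something like `toDataUri(fs.readFileSync("screen.png"))` would produce a value suitable for the `screenshot` field.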

action
Click Action · object
required

Structured action object (click, drag, or scroll)

  • Click Action
  • Drag Action
  • Scroll Action
model
enum<string>
default:gbox-handy-1

Model to use

Available options:
gbox-handy-1
Example:

"gbox-handy-1"
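
Putting the body fields together, a click request can be assembled and checked before it is sent. This is a sketch only: `buildClickRequest` is a hypothetical helper, the `type` and `target` fields are taken from the click example at the top of this page, and the drag and scroll action shapes are not covered because their fields are not listed here.

```javascript
const DEFAULT_MODEL = "gbox-handy-1";

// Hypothetical helper: builds a request body for a click action.
function buildClickRequest(screenshot, target, model = DEFAULT_MODEL) {
  // The screenshot must be an HTTP(S) URL or a PNG/JPEG base64 data URI
  const isUrl = /^https?:\/\//i.test(screenshot);
  const isDataUri = /^data:image\/(png|jpeg);base64,/i.test(screenshot);
  if (!isUrl && !isDataUri) {
    throw new Error("screenshot must be an HTTP(S) URL or a PNG/JPEG data URI");
  }
  if (typeof target !== "string" || target.trim() === "") {
    throw new Error("target must describe the element to click");
  }
  return { model, screenshot, action: { type: "click", target } };
}
```

The returned object has the same shape as the argument passed to `gboxSDK.model.call` in the example above.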

Response

200 - application/json

Model response data structure

response
Model Click Response Data · object
required

Model response data

  • Model Click Response Data
  • Model Drag Response Data
  • Model Scroll Response Data
id
string
required

Unique ID of this request. Include it when reporting issues or giving feedback.

Example:

"123e4567-e89b-12d3-a456-426614174000"
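
Because `response` can take one of several shapes, it helps to check `type` before reading coordinates. The sketch below handles only the click shape shown in the example response at the top of this page; `getClickPoint` is a hypothetical helper, and the drag and scroll shapes are not handled because their fields are not listed here.

```javascript
// Hypothetical helper: extracts click coordinates from a model result.
function getClickPoint(result) {
  const response = result && result.response;
  if (!response || response.type !== "click") {
    throw new Error("Expected a click response");
  }
  const { x, y } = response.coordinates ?? {};
  if (!Number.isFinite(x) || !Number.isFinite(y)) {
    throw new Error("Click response is missing numeric coordinates");
  }
  // Keep the request ID so it can be quoted when reporting issues
  return { x, y, requestId: result.id };
}
```

The returned `requestId` is the `id` field of the result, which is useful to log alongside any failure.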