Overview

How It Works

Common Use Cases

Getting Started

Basic Usage

Understanding the Parameters

Best Practices

What Happens Next

Quickstart

AI Action

Gbox docs

Welcome to gbox.ai, an Android runtime for AI agents. When your agent needs to use an Android phone, Gbox helps you set up that environment quickly.

Introduction

This guide shows you how to create your first box.

API Key

Basic

Live View

Install app

Real Device

Playwright with Gbox

Execute code in multiple languages directly on Gbox.

Explore how to connect Gbox with your favorite platforms and services.

Develop AI agents that can browse the web autonomously using Gbox and AgentKit (by Inngest).

AgentKit

Mastra

Create Android box

Create Linux box

List box

Terminate a running box. This action stops the box and releases its resources.

Terminate box

Get box

Start box

Stop box

This endpoint generates a pre-signed URL for accessing the live view of a running box. The URL is valid for a limited time and can be used to view the box's live stream.

Live view url

This endpoint generates a pre-signed URL for accessing the web terminal of a running box. The URL is valid for a limited time and can be used to access the box's terminal interface.

Web terminal url

Retrieve the current display properties for a running box. This endpoint provides details about the box's screen resolution, orientation, and other visual properties.

Get box display

Use natural language instructions to perform UI operations on the box. You can describe what you want to do in plain language (e.g., 'click the login button', 'scroll down to find settings', 'input my email address'), and the AI will automatically convert your instruction into the appropriate UI action and execute it on the box.

AI action

Click

Touch

Drag

Scroll

Performs a swipe in the specified direction.

Swipe

Simulates pressing a specific key by triggering the complete keyboard key event chain (keydown, keypress, keyup). Use this to activate keyboard key event listeners such as shortcuts or form submissions.

Press key

Presses a button on the device, such as the power, volume up, or volume down button.

Press button

Directly inputs text content without triggering physical key events (keydown, etc.), ideal for quickly filling large amounts of text when intermediate input events aren't needed.

Type text
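The difference between Press key and Type text can be sketched with a small local model. The `FakeInput` class and its method names are hypothetical and exist only to show the event behavior described above; they are not part of the Gbox API.

```python
from dataclasses import dataclass, field


@dataclass
class FakeInput:
    """Hypothetical stand-in for a focused text field on the box."""

    text: str = ""
    events: list = field(default_factory=list)

    def press_key(self, key: str) -> None:
        # A key press fires the full chain, so listeners can react to it.
        for phase in ("keydown", "keypress", "keyup"):
            self.events.append((phase, key))
        if len(key) == 1:  # printable single character
            self.text += key

    def type_text(self, content: str) -> None:
        # Direct input: the text appears, but no key events are fired.
        self.text += content


pressed = FakeInput()
for char in "hi":
    pressed.press_key(char)

typed = FakeInput()
typed.type_text("hi")
```

Both fields end up holding the same text, but only `pressed` records key events (three per character). That is why key presses suit shortcuts and form submissions, while direct text input suits bulk filling.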

Move to position

Rotate screen

Take screenshot

Extract data from the UI using a JSON schema.

Extract data
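As a rough illustration of what schema-driven extraction involves, the sketch below defines a small JSON schema and checks a candidate result against it. The field names (`title`, `price`) and the `conforms` helper are assumptions made for this example; the request and response format of the endpoint is not shown here.

```python
# Hypothetical schema describing the data wanted from a product screen.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["title", "price"],
}

# Only the primitive types used above are supported by this minimal checker.
_TYPES = {"string": str, "number": (int, float), "boolean": bool}


def conforms(data: object, schema: dict) -> bool:
    """Check required keys and primitive types for a flat object schema."""
    if schema.get("type") != "object" or not isinstance(data, dict):
        return False
    for name in schema.get("required", []):
        if name not in data:
            return False
    for name, rule in schema.get("properties", {}).items():
        if name not in data:
            continue
        value = data[name]
        # bool is a subclass of int in Python, so reject it for numbers.
        if rule["type"] == "number" and isinstance(value, bool):
            return False
        if not isinstance(value, _TYPES[rule["type"]]):
            return False
    return True
```

A client could run a check like this on the returned data before passing it on, so that a missing or mistyped field is caught early.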

Run code on the box

Execute a command on a running box. This endpoint sends a command to the box and returns its output.

Exec command

List box files

Get file/dir

Read box file

Creates or overwrites a file. Creates necessary directories in the path if they don't exist. If the target path already exists, the write fails.

Write box file

Deletes a file or directory. If the target path does not exist, the delete fails.

Delete box file/dir

Renames a file or directory. If the target newPath already exists, the rename fails.

Rename box file/dir
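The failure rules for delete and rename, together with the automatic creation of parent directories on write, can be mirrored on a local file system. This is only an analogy under stated assumptions: the function names are hypothetical, the code runs against a temporary local directory rather than a box, and it does not model the write conflict rule.

```python
import shutil
import tempfile
from pathlib import Path


def write_file(root: Path, rel_path: str, content: str) -> Path:
    """Write content, creating missing parent directories first."""
    target = root / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content)
    return target


def delete_path(root: Path, rel_path: str) -> None:
    """Delete a file or directory; fail if the path does not exist."""
    target = root / rel_path
    if not target.exists():
        raise FileNotFoundError(rel_path)
    if target.is_dir():
        shutil.rmtree(target)
    else:
        target.unlink()


def rename_path(root: Path, old_path: str, new_path: str) -> None:
    """Rename a file or directory; fail if the new path already exists."""
    destination = root / new_path
    if destination.exists():
        raise FileExistsError(new_path)
    (root / old_path).rename(destination)


def demo() -> list:
    """Exercise both failure cases and report which ones were refused."""
    outcomes = []
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        write_file(root, "notes/a.txt", "hello")
        write_file(root, "notes/b.txt", "world")
        try:
            rename_path(root, "notes/a.txt", "notes/b.txt")
        except FileExistsError:
            outcomes.append("rename refused")
        delete_path(root, "notes/a.txt")
        try:
            delete_path(root, "notes/a.txt")
        except FileNotFoundError:
            outcomes.append("delete refused")
    return outcomes
```

A caller that wants these operations to be safe to repeat can use the "Check if file/dir exists" endpoint first, then decide whether to delete or rename.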

Check if file/dir exists

This endpoint generates a pre-signed URL for accessing the Chrome DevTools Protocol (CDP) of a running box. The URL is valid for a limited time and can be used to interact with the box's browser environment.

Generate CDP url

List apps

Get app

Get pkg

Uninstall app

Open app

Close app

Close all apps

Get pkg activities

Restart app

Retrieve detailed information for all installed packages. This endpoint provides comprehensive pkg details.

List pkg

A faster endpoint for retrieving basic pkg information. It performs better in scenarios where you only need essential pkg details.

Get Started

UI Action

Android

Browser

Run Code
