# GBOX docs

## Docs

- [Basic](https://docs.gbox.ai/android/basic.md)
- [Install app](https://docs.gbox.ai/android/install-app.md)
- [Live View](https://docs.gbox.ai/android/live-view.md)
- [Physical Device](https://docs.gbox.ai/android/real-device.md)
- [API Key](https://docs.gbox.ai/api-key.md)
- [Close all apps](https://docs.gbox.ai/api-reference/android/close-all-apps.md): Terminates all running Android applications inside the box
- [Close app](https://docs.gbox.ai/api-reference/android/close-app.md): Forces the specified Android application to close inside the box
- [Appium Connection](https://docs.gbox.ai/api-reference/android/generate-appium-connection-url.md): Generate a pre-signed proxy URL for Appium server of a running Android box.
- [Get app](https://docs.gbox.ai/api-reference/android/get-app.md): Get installed app info by package name
- [Get pkg](https://docs.gbox.ai/api-reference/android/get-pkg.md)
- [Get pkg activities](https://docs.gbox.ai/api-reference/android/get-pkg-activities.md): Retrieves the list of activities defined in a specific Android package
- [Install app](https://docs.gbox.ai/api-reference/android/install-app.md): Install an Android app on the box
- [List apps](https://docs.gbox.ai/api-reference/android/list-apps.md): List all installed apps on the launcher
- [List pkg](https://docs.gbox.ai/api-reference/android/list-pkg.md): Retrieves detailed information for all installed pkgs. This endpoint provides comprehensive pkg details.
- [List pkg simple](https://docs.gbox.ai/api-reference/android/list-pkg-simple.md): A faster endpoint to quickly retrieve basic pkg information. This API provides better performance for scenarios where you need to get essential pkg details quickly.
- [Open app](https://docs.gbox.ai/api-reference/android/open-app.md): Launches a specific Android application within the box
- [Restart app](https://docs.gbox.ai/api-reference/android/restart-app.md): Closes and immediately reopens the specified Android application inside the box
- [Uninstall app](https://docs.gbox.ai/api-reference/android/uninstall-app.md): Uninstalls an Android app from the box
- [Clear proxy](https://docs.gbox.ai/api-reference/box/clear-proxy.md): Clears the HTTP proxy for the box
- [Create android box](https://docs.gbox.ai/api-reference/box/create-android-box.md): Provisions a new Android box that you can operate through the GBOX SDK. Use this endpoint when you want to create a fresh Android environment for testing, automation, or agent execution.
- [Create linux box](https://docs.gbox.ai/api-reference/box/create-linux-box.md): Provisions a new Linux box that you can operate through the GBOX SDK. Use this endpoint when you want to create a fresh Linux environment for testing, automation, or agent execution.
- [Create presigned url](https://docs.gbox.ai/api-reference/box/create-presigned-url.md): Create a presigned url for a storage key. This endpoint provides a presigned url for a storage key, which can be used to download the file from the storage.
- [Get box](https://docs.gbox.ai/api-reference/box/get-box.md): This endpoint retrieves information about a box
- [Get box display](https://docs.gbox.ai/api-reference/box/get-box-display.md): Retrieve the current display properties for a running box. This endpoint provides details about the box's screen resolution, orientation, and other visual properties.
- [Get proxy](https://docs.gbox.ai/api-reference/box/get-proxy.md): Retrieves the HTTP proxy settings for a specific box. Use this endpoint to route traffic through the box's network.
- [List box](https://docs.gbox.ai/api-reference/box/list-box.md): Returns a paginated list of box instances. Use this endpoint to monitor environments, filter by status or type, or retrieve boxes by labels or device type.
- [Live view url](https://docs.gbox.ai/api-reference/box/live-view-url.md): This endpoint allows you to generate a pre-signed URL for accessing the live view of a running box. The URL is valid for a limited time and can be used to view the box's live stream.
- [Set proxy](https://docs.gbox.ai/api-reference/box/set-proxy.md): Configures the HTTP proxy settings for a specific box. Use this endpoint when you need the box's outbound network traffic to pass through a proxy server.
- [Set screen resolution](https://docs.gbox.ai/api-reference/box/set-screen-resolution.md): Set the screen resolution
- [Terminate box](https://docs.gbox.ai/api-reference/box/terminate-box.md): Terminate a running box. This action will stop the box and release its resources.
- [Web terminal url](https://docs.gbox.ai/api-reference/box/web-terminal-url.md): This endpoint allows you to generate a pre-signed URL for accessing the web terminal of a running box. The URL is valid for a limited time and can be used to access the box's terminal interface.
- [Close a browser tab](https://docs.gbox.ai/api-reference/browser/close-a-browser-tab.md): Close a specific browser tab identified by its id. This endpoint will permanently close the tab and free up the associated resources. After closing a tab, the ids of subsequent tabs may change.
- [Close browser](https://docs.gbox.ai/api-reference/browser/close-browser.md): Close the browser in the specified box
- [Generate CDP url](https://docs.gbox.ai/api-reference/browser/generate-cdp-url.md): This endpoint allows you to generate a pre-signed URL for accessing the Chrome DevTools Protocol (CDP) of a running box. The URL is valid for a limited time and can be used to interact with the box's browser environment
- [List all browser tabs](https://docs.gbox.ai/api-reference/browser/list-all-browser-tabs.md): Retrieve a comprehensive list of all currently open browser tabs in the specified box. This endpoint returns detailed information about each tab including its id, title, current URL, and favicon. The returned id can be used for subsequent operations like navigation, closing, or updating tabs. This i…
- [Open a new browser tab](https://docs.gbox.ai/api-reference/browser/open-a-new-browser-tab.md): Create and open a new browser tab with the specified URL. This endpoint will navigate to the provided URL and return the new tab's information including its assigned id, loaded title, final URL (after any redirects), and favicon. The returned tab id can be used for future operations on this specific…
- [Open browser](https://docs.gbox.ai/api-reference/browser/open-browser.md): Open the browser in the specified box. If the browser is already open, repeated calls will not open a new browser.
- [Switch to browser tab](https://docs.gbox.ai/api-reference/browser/switch-to-browser-tab.md): Switch to a specific browser tab by bringing it to the foreground (making it the active/frontmost tab). This operation sets the specified tab as the currently active tab without changing its URL or content. The tab will receive focus and become visible to the user. This is useful for managing multip…
- [Update browser tab URL](https://docs.gbox.ai/api-reference/browser/update-browser-tab-url.md): Navigate an existing browser tab to a new URL. This endpoint updates the specified tab by navigating it to the provided URL and returns the updated tab information. The browser will wait for the DOM content to be loaded before returning the response. If the navigation fails due to an invalid URL or…
- [Exec command](https://docs.gbox.ai/api-reference/command/exec-command.md): Execute a command on a running box. This endpoint allows you to send commands to the box and receive the output
- [Check if file/dir exists](https://docs.gbox.ai/api-reference/file-system/check-if-filedir-exists.md)
- [Delete box file/dir](https://docs.gbox.ai/api-reference/file-system/delete-box-filedir.md): Deletes a file or a directory. If target path doesn't exist, the delete will fail.
- [Get file/dir](https://docs.gbox.ai/api-reference/file-system/get-filedir.md): Retrieves metadata for a specific file or directory inside a box
- [List box files](https://docs.gbox.ai/api-reference/file-system/list-box-files.md): Lists files and directories in a box. You can specify the directory path and depth, and optionally a working directory. The response includes metadata such as type, size, permissions, and last modified time.
- [Read box file](https://docs.gbox.ai/api-reference/file-system/read-box-file.md): Reads the contents of a file inside the box and returns it as a string. Supports absolute or relative paths, with `workingDir` as the base for relative paths.
- [Rename box/dir](https://docs.gbox.ai/api-reference/file-system/rename-boxdir.md): Renames a file or a directory. If the target newPath already exists, the rename will fail.
- [Write box file](https://docs.gbox.ai/api-reference/file-system/write-box-file.md): Creates or overwrites a file. Creates necessary directories in the path if they don't exist. If the target path already exists, the write will fail.
- [Create album](https://docs.gbox.ai/api-reference/media/create-album.md): Create a new album with media files
- [Delete album](https://docs.gbox.ai/api-reference/media/delete-album.md): Delete an album and all its media files
- [Delete media from album](https://docs.gbox.ai/api-reference/media/delete-media-from-album.md): Delete a specific media file from an album
- [Download media](https://docs.gbox.ai/api-reference/media/download-media.md): Download a specific media file from an album
- [Get album detail](https://docs.gbox.ai/api-reference/media/get-album-detail.md): Get detailed information about a specific album including its media files
- [Get media detail](https://docs.gbox.ai/api-reference/media/get-media-detail.md): Get detailed information about a specific media file
- [Get media support extensions](https://docs.gbox.ai/api-reference/media/get-media-support-extensions.md): Get supported media file extensions for photos and videos
- [List albums](https://docs.gbox.ai/api-reference/media/list-albums.md): Get a list of albums in the box
- [List media in album](https://docs.gbox.ai/api-reference/media/list-media-in-album.md): Get a list of media files in a specific album
- [Update album](https://docs.gbox.ai/api-reference/media/update-album.md): Add media files to an existing album
- [Generate Coordinates](https://docs.gbox.ai/api-reference/model/generate-coordinates-for-a-model.md)
- [Run code on the box](https://docs.gbox.ai/api-reference/run-code/run-code-on-the-box.md): Executes code inside the specified box. Supports multiple languages (bash, Python, TypeScript) and allows you to configure environment variables, arguments, working directory, and timeouts.
- [Click](https://docs.gbox.ai/api-reference/ui-action/click.md): Simulates a click action on the box.
- [Detect UI elements](https://docs.gbox.ai/api-reference/ui-action/detect-ui-elements.md): Detect and identify interactive UI elements in the current screen. Note: This feature currently only supports element detection within a running browser. If the browser is not running, the Elements array will be empty.
- [Disable rewind recording](https://docs.gbox.ai/api-reference/ui-action/disable-rewind-recording.md): Disable the device's background screen rewind recording.
- [Drag](https://docs.gbox.ai/api-reference/ui-action/drag.md): Simulates a drag gesture, moving from a start point to an end point over a set duration. Supports simple start/end coordinates, multi-point drag paths, and natural-language targets.
- [Enable rewind recording](https://docs.gbox.ai/api-reference/ui-action/enable-rewind-recording.md): Enable the device's background screen rewind recording.
- [Extract rewind recording](https://docs.gbox.ai/api-reference/ui-action/extract-rewind-recording.md): Rewind and capture the device's background screen recording from a specified time period.
- [Get clipboard](https://docs.gbox.ai/api-reference/ui-action/get-clipboard.md): Get the clipboard content
- [Get screen layout](https://docs.gbox.ai/api-reference/ui-action/get-screen-layout.md): Get the current structured screen layout information. This endpoint returns detailed structural information about the UI elements currently displayed on the screen, which can be used for UI automation, element analysis, and accessibility purposes. The format varies by box type: Android boxes return…
- [Get settings](https://docs.gbox.ai/api-reference/ui-action/get-settings.md): Get the action settings for the box
- [Long press](https://docs.gbox.ai/api-reference/ui-action/long-press.md): Perform a long press action at specified coordinates for a specified duration. Useful for triggering context menus, drag operations, or other long-press interactions.
- [Move to position](https://docs.gbox.ai/api-reference/ui-action/move-to-position.md): Moves the focus to a specific coordinate on the box without performing a click or tap. Use this endpoint to position the cursor, hover over elements, or prepare for chained actions such as drag or swipe.
- [Press button](https://docs.gbox.ai/api-reference/ui-action/press-button.md): Press device buttons like power, volume, home, back, etc.
- [Press key](https://docs.gbox.ai/api-reference/ui-action/press-key.md): Simulates pressing a specific key by triggering the complete keyboard key event chain (keydown, keypress, keyup). Use this to activate keyboard key event listeners such as shortcuts or form submissions.
- [Reset settings](https://docs.gbox.ai/api-reference/ui-action/reset-settings.md): Resets the box settings to default
- [Rotate screen](https://docs.gbox.ai/api-reference/ui-action/rotate-screen.md): Rotates the screen orientation. Note that even after rotating the screen, applications or system layouts may not automatically adapt to the gravity sensor changes, so visual changes may not always occur.
- [Scroll](https://docs.gbox.ai/api-reference/ui-action/scroll.md): Performs a scroll action. Supports both advanced scroll with coordinates and simple scroll with direction.
- [Set clipboard](https://docs.gbox.ai/api-reference/ui-action/set-clipboard.md): Set the clipboard content
- [Start recording](https://docs.gbox.ai/api-reference/ui-action/start-recording.md): Start recording the box screen. Only one recording can be active at a time. If a recording is already in progress, starting a new recording will stop the previous one and keep only the latest recording.
- [Stop recording](https://docs.gbox.ai/api-reference/ui-action/stop-recording.md): Stop recording the box screen
- [Swipe](https://docs.gbox.ai/api-reference/ui-action/swipe.md): Performs a swipe in the specified direction
- [Take screenshot](https://docs.gbox.ai/api-reference/ui-action/take-screenshot.md): Captures a screenshot of the current box screen
- [Tap](https://docs.gbox.ai/api-reference/ui-action/tap.md): Tap action for Android devices using ADB input tap command
- [Touch](https://docs.gbox.ai/api-reference/ui-action/touch.md): Performs more advanced touch gestures. Use this endpoint to simulate realistic behaviors.
- [Type text](https://docs.gbox.ai/api-reference/ui-action/type-text.md): Directly inputs text content without triggering physical key events (keydown, etc.), ideal for quickly filling large amounts of text when intermediate input events aren't needed.
- [Update settings](https://docs.gbox.ai/api-reference/ui-action/update-settings.md): Update the action settings for the box
- [Playwright with GBOX](https://docs.gbox.ai/browser/playwright.md)
- [Changelog](https://docs.gbox.ai/changelog.md)
- [GBOX CLI](https://docs.gbox.ai/cli/gbox-cli.md)
- [Register Local Device](https://docs.gbox.ai/cli/register-local-device.md)
- [Concepts](https://docs.gbox.ai/concepts.md): This conceptual guide introduces the two core abstractions in GBOX: **Devices** and **Boxes**
- [Android MCP Server](https://docs.gbox.ai/docs-mcp/android-mcp-server.md)
- [Docs MCP Server](https://docs.gbox.ai/docs-mcp/docs-mcp-server.md)
- [Introduction](https://docs.gbox.ai/index.md): GBOX provides environments for AI Agents to operate computer and mobile devices.
- [Claude Code](https://docs.gbox.ai/integrations/ide/claude-code.md): Integrate GBOX with Claude Code for seamless Android app testing and development workflow.
- [Cursor](https://docs.gbox.ai/integrations/ide/cursor.md): Integrate GBOX with Cursor IDE for seamless Android app testing and development workflow.
- [VSCode](https://docs.gbox.ai/integrations/ide/vscode.md): Integrate GBOX with VSCode for seamless Android app testing and development workflow.
- [OSWorld](https://docs.gbox.ai/integrations/leader-board/os-world.md): Learn how to use GBOX as a provider in OSWorld to build and run agents.
- [AgentKit](https://docs.gbox.ai/integrations/platform/agentkit.md): Develop AI Agents that can browse the web autonomously using GBOX and AgentKit (by inngest).
- [Introduction](https://docs.gbox.ai/integrations/platform/intro.md): Explore how to connect GBOX with your favorite platforms and services.
- [Android Use Agent](https://docs.gbox.ai/integrations/platform/langgraph/android-use.md): Build Android-Use Agents with GBOX and LangChain.
- [Browser Use Agent](https://docs.gbox.ai/integrations/platform/langgraph/browser-use.md): Build Browser-Use Agents with GBOX and LangChain.
- [Mastra](https://docs.gbox.ai/integrations/platform/mastra.md): Build AI agents with GBOX and Mastra.
- [Pricing](https://docs.gbox.ai/pricing.md): Usage-based pricing - Perfect for growing AI operations
- [Quickstart](https://docs.gbox.ai/quickstart.md): This guide will show you how to create your first box.
- [Function Calling](https://docs.gbox.ai/run-code/anthropic/function-calling.md): Use function calling to execute code in GBOX
- [Simple](https://docs.gbox.ai/run-code/anthropic/simple.md): Generate code with Anthropic Claude and execute it in GBOX
- [Basic](https://docs.gbox.ai/run-code/index.md): Execute code in multiple languages directly on GBOX
- [Function Calling](https://docs.gbox.ai/run-code/openai/function-calling.md): Use function calling to execute code in GBOX
- [Simple](https://docs.gbox.ai/run-code/openai/simple.md): Generate code with OpenAI and execute it in GBOX
- [SDK/CLI Releases](https://docs.gbox.ai/sdk-cli-releases.md)
- [Custom Base URL](https://docs.gbox.ai/sdk/base-url.md): Configure custom API endpoints for different environments
- [Quick Start](https://docs.gbox.ai/sdk/index.md): Get started with the GBOX SDK to create and manage virtual devices
- [Python SDK](https://docs.gbox.ai/sdk/python.md)
- [Retry](https://docs.gbox.ai/sdk/retries.md): Configure automatic retry behavior for failed requests
- [Timeout](https://docs.gbox.ai/sdk/timeout.md): Configure request timeout settings for optimal performance
- [Typescript SDK](https://docs.gbox.ai/sdk/typescript.md)
- [Basic](https://docs.gbox.ai/ui-action/basic.md)

## OpenAPI Specs

- [openapi.documented](https://docs.gbox.ai/openapi.documented.yml)
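
Every entry in this index has the shape `- [title](url)` with an optional `: description`, so the index can be read by a script as well as by a person. The following is a minimal Python sketch of one way to do that, not part of GBOX itself: `Entry` and `parse_index` are hypothetical names chosen for illustration, and the parser assumes that titles contain no `]` and that URLs contain no spaces or `)`.

```python
import re
from typing import NamedTuple


class Entry(NamedTuple):
    title: str
    url: str
    description: str  # empty string when the entry has no description


# Matches one "- [title](url)" list item (hypothetical helper pattern).
LINK = re.compile(r"- \[([^\]]+)\]\(([^)\s]+)\)")
# Matches the start of a markdown heading that follows a description.
HEADING = re.compile(r"\s#{1,6} ")


def parse_index(text: str) -> list[Entry]:
    """Split the index into entries, whether or not items sit on separate lines."""
    matches = list(LINK.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        # The description runs from the end of this link to the next link.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        tail = text[m.end():end]
        heading = HEADING.search(tail)
        if heading:
            tail = tail[:heading.start()]  # do not swallow a following heading
        description = " ".join(tail.removeprefix(":").split())
        entries.append(Entry(m.group(1), m.group(2), description))
    return entries
```

Calling `parse_index` on the text of this file yields one `Entry` per link, which can then be filtered by URL prefix (for example, everything under `/api-reference/ui-action/`). The sketch requires Python 3.9 or later because it uses `str.removeprefix`.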