Under the Hood

This page maps the public UI API to the Chromium DevTools Protocol calls it actually makes.

It is intentionally strict:

  • only commands that really talk to Chromium are shown with CDP
  • thin aliases are grouped instead of repeated
  • constructor helpers like locator() or getByRole() are called out when they do not send CDP by themselves

How To Read This

  • Page owns navigation, capture, and page-wide helpers.
  • Frame uses the same command sequences, but scoped to a frame execution context.
  • Locator resolves fresh on every call and then delegates to Frame.

Page Command Map

| Command | CDP command(s) | What it does under the hood |
| --- | --- | --- |
| page.goto(url) | Page.navigate | Loads the URL and waits for the requested lifecycle event. |
| page.query(selector) | Runtime.evaluate | Returns the first matching element handle. |
| page.queryAll(selector) | Runtime.evaluate + Runtime.getProperties | Returns all matching element handles. |
| page.queryXPath(selector) | Runtime.evaluate | Returns the first XPath match. |
| page.queryAllXPath(selector) | Runtime.evaluate + Runtime.getProperties | Returns all XPath matches. |
| page.click(selector) | Runtime.evaluate + Input.dispatchMouseEvent | Resolves, scrolls, and clicks the element. |
| page.dblclick(selector) | Runtime.evaluate + Input.dispatchMouseEvent | Resolves, scrolls, and double-clicks the element. |
| page.fill(selector, value) | Runtime.evaluate | Sets the value directly and dispatches input and change. |
| page.type(selector, text) | Runtime.evaluate + Input.insertText | Focuses the element and inserts text as input events. |
| page.clear(selector) | Runtime.evaluate | Same path as fill(selector, ""). |
| page.hover(selector) | Runtime.evaluate + Input.dispatchMouseEvent | Moves the pointer to the element center. |
| page.press(selector, key) | Runtime.evaluate + Input.dispatchKeyEvent | Focuses the element and sends a key down/up pair. |
| page.focus(selector) | Runtime.evaluate | Focuses the element in the page context. |
| page.blur(selector) | Runtime.evaluate | Blurs the element in the page context. |
| page.selectText(selector) | Runtime.evaluate | Selects text in inputs, textareas, or editable content. |
| page.scrollIntoViewIfNeeded(selector) | Runtime.evaluate | Scrolls the element into view. |
| page.check(selector) | Runtime.evaluate | Sets a checkbox or toggle to checked and dispatches form events. |
| page.uncheck(selector) | Runtime.evaluate | Sets a checkbox or toggle to unchecked and dispatches form events. |
| page.setChecked(selector, checked) | Runtime.evaluate | Same check/uncheck path, with the target state explicit. |
| page.setInputFiles(selector, files) | Runtime.evaluate | Creates File objects and assigns them to <input type="file">. |
| page.selectOption(selector, value) | Runtime.evaluate | Sets the selected value on a <select>. |
| page.evaluate(fnOrString, ...) | Runtime.evaluate | Executes arbitrary page-side script and awaits promises. |
| page.text(selector) | Runtime.evaluate | Reads text content from the page. |
| page.value(selector) | Runtime.evaluate | Reads the value from a form field. |
| page.attribute(selector, name) | Runtime.evaluate | Reads an attribute value. |
| page.isEnabled(selector) | Runtime.evaluate | Checks disabled and aria-disabled state. |
| page.isChecked(selector) | Runtime.evaluate | Checks checkbox or aria-checked state. |
| page.count(selector) | Runtime.evaluate | Counts matching nodes. |
| page.classes(selector) | Runtime.evaluate | Returns the element class list. |
| page.css(selector, property) | Runtime.evaluate | Reads a computed CSS property. |
| page.hasFocus(selector) | Runtime.evaluate | Checks whether the element is the active element. |
| page.isInViewport(selector) | Runtime.evaluate | Checks viewport intersection. |
| page.isEditable(selector) | Runtime.evaluate | Checks disabled, readonly, and aria-disabled state. |
| page.findLocators() | Runtime.evaluate | Scans the DOM and optionally writes JSON / HTML artifacts. |
| page.content() | Runtime.evaluate | Serializes the document HTML after waiting for load. |
| page.screenshot() | Page.captureScreenshot | Captures a page image. |
| page.screenshotBase64() | Page.captureScreenshot | Same capture path, returned as base64. |
| page.pdf() | Emulation.setEmulatedMedia + Page.printToPDF | Prints the page to PDF. |

Frame Behavior

Frame uses the same CDP sequences as Page for every interaction and inspection method.

The only meaningful difference is scope:

  • Page acts on the main frame
  • Frame acts on the selected frame

Frame adds no protocol commands of its own for interaction methods. It reuses the sequences above and runs them in the frame's execution context.
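
As a rough sketch of what frame scoping could look like at the protocol level, the builder below attaches an optional contextId to a Runtime.evaluate call. Runtime.evaluate and its contextId parameter are part of CDP; the evaluateCommand helper and the CdpCommand type are hypothetical names used only for illustration, and the library's real plumbing may differ.

```typescript
// Hypothetical shape for a CDP call; not this library's real types.
type CdpCommand = { method: string; params: Record<string, unknown> };

// Builds a Runtime.evaluate call. Omitting contextId lets Chromium use the
// inspected page's default context; passing one scopes the script to a frame.
function evaluateCommand(expression: string, contextId?: number): CdpCommand {
  const params: Record<string, unknown> = { expression, awaitPromise: true };
  if (contextId !== undefined) {
    params["contextId"] = contextId;
  }
  return { method: "Runtime.evaluate", params };
}

// Same expression, two scopes. The id 7 is a made-up execution context id.
const onPage = evaluateCommand("document.title");
const onFrame = evaluateCommand("document.title", 7);
```

The point of the sketch is that the command and the script stay identical; only the target context changes.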

Locator Command Map

Locator is a fresh-resolution wrapper. It does not hold a stale element handle.

| Command | Underlying Frame method | CDP command(s) | Notes |
| --- | --- | --- | --- |
| locator.click() | frame.clickLocator() | Runtime.evaluate + Input.dispatchMouseEvent | Resolves fresh, then clicks. |
| locator.dblclick() | frame.dblclickLocator() | Runtime.evaluate + Input.dispatchMouseEvent | Double click sequence. |
| locator.fill(value) | frame.fillLocator() | Runtime.evaluate | Direct value set and form events. |
| locator.clear() | frame.clearLocator() | Runtime.evaluate | Fills an empty string. |
| locator.type(text) | frame.typeLocator() | Runtime.evaluate + Input.insertText | Text insertion after focus. |
| locator.focus() | frame.focusLocator() | Runtime.evaluate | Focuses the resolved element. |
| locator.blur() | frame.blurLocator() | Runtime.evaluate | Blurs the resolved element. |
| locator.hover() | frame.hoverLocator() | Runtime.evaluate + Input.dispatchMouseEvent | Pointer move over the resolved element. |
| locator.press(key) | frame.pressLocator() | Runtime.evaluate + Input.dispatchKeyEvent | Focus, then key down/up. |
| locator.selectText() | frame.selectTextLocator() | Runtime.evaluate | Selects text in a field or editable node. |
| locator.scrollIntoViewIfNeeded() | frame.scrollIntoViewIfNeededLocator() | Runtime.evaluate | Scrolls the resolved node into view. |
| locator.check() | frame.checkLocator() | Runtime.evaluate | Toggles checked state to true. |
| locator.uncheck() | frame.uncheckLocator() | Runtime.evaluate | Toggles checked state to false. |
| locator.setChecked(checked) | frame.setCheckedLocator() | Runtime.evaluate | Explicit check state change. |
| locator.setInputFiles(files) | frame.setInputFilesLocator() | Runtime.evaluate | File input attachment. |
| locator.exists() | frame.existsLocator() | Runtime.evaluate | Checks whether a matching node exists. |
| locator.isVisible() | frame.isVisibleLocator() | Runtime.evaluate | Checks rendered visibility. |
| locator.isEnabled() | frame.isEnabledLocator() | Runtime.evaluate | Checks enabled state. |
| locator.isChecked() | frame.isCheckedLocator() | Runtime.evaluate | Checks checked or aria-checked state. |
| locator.text() | frame.textLocator() | Runtime.evaluate | Reads text content. |
| locator.value() | frame.valueLocator() | Runtime.evaluate | Reads the current value. |
| locator.attribute(name) | frame.attributeLocator() | Runtime.evaluate | Reads an attribute. |
| locator.classes() | frame.classesLocator() | Runtime.evaluate | Reads class names. |
| locator.css(property) | frame.cssLocator() | Runtime.evaluate | Reads a computed style property. |
| locator.hasFocus() | frame.hasFocusLocator() | Runtime.evaluate | Checks active element state. |
| locator.isInViewport() | frame.isInViewportLocator() | Runtime.evaluate | Checks viewport visibility. |
| locator.isEditable() | frame.isEditableLocator() | Runtime.evaluate | Checks editable state. |
| locator.count() | frame.countLocator() | Runtime.evaluate | Counts matches for the locator query. |
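
The fresh-resolution idea can be sketched with a stand-in frame that records what it would send. FakeFrame and LocatorSketch are hypothetical classes written for this example (the real Frame and Locator are asynchronous and more involved); only the method name clickLocator and the CDP command names come from the table above.

```typescript
// Hypothetical stand-ins; the real Frame and Locator classes may differ.
type Sent = { method: string; query: string };

class FakeFrame {
  sent: Sent[] = [];

  // Mirrors the documented click path: resolve via Runtime.evaluate,
  // then move, press, and release the mouse.
  clickLocator(query: string): void {
    this.sent.push({ method: "Runtime.evaluate", query }); // resolve now
    this.sent.push({ method: "Input.dispatchMouseEvent", query }); // mouseMoved
    this.sent.push({ method: "Input.dispatchMouseEvent", query }); // mousePressed
    this.sent.push({ method: "Input.dispatchMouseEvent", query }); // mouseReleased
  }
}

class LocatorSketch {
  private frame: FakeFrame;
  private query: string;

  constructor(frame: FakeFrame, query: string) {
    this.frame = frame;
    this.query = query; // only the query is stored, never an element handle
  }

  click(): void {
    this.frame.clickLocator(this.query);
  }
}

const fakeFrame = new FakeFrame();
const saveButton = new LocatorSketch(fakeFrame, "#save");
const sentAfterCreate = fakeFrame.sent.length; // 0: creating a locator sends nothing
saveButton.click();
saveButton.click();
const resolutions = fakeFrame.sent.filter((c) => c.method === "Runtime.evaluate").length; // 2
```

Because each click() triggers its own resolution, a re-rendered element is picked up on the next call instead of failing on a stale handle.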

Exact Sequences

page.click() and page.dblclick()

These actions follow the same pattern:

ts
// 1. Resolve and scroll the element into view
Runtime.evaluate(...)

// 2. Move pointer to the element center
Input.dispatchMouseEvent({ type: "mouseMoved", ... })

// 3. Press and release the mouse button
Input.dispatchMouseEvent({ type: "mousePressed", button: "left", clickCount: 1, ... })
Input.dispatchMouseEvent({ type: "mouseReleased", button: "left", clickCount: 1, ... })

For dblclick(), the mouse press/release pair repeats with clickCount: 2.
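
A minimal sketch of the pointer events described above, written as a pure builder so the order is easy to inspect. The clickSequence helper and its types are hypothetical; the sketch also assumes that a double click sends one press/release pair with clickCount: 1 followed by a second pair with clickCount: 2, which is one reading of the sentence above.

```typescript
// Hypothetical helper types; only the CDP method and field names are real.
type MouseParams = {
  type: "mouseMoved" | "mousePressed" | "mouseReleased";
  x: number;
  y: number;
  button?: "left";
  clickCount?: number;
};
type MouseCommand = { method: "Input.dispatchMouseEvent"; params: MouseParams };

function mouse(params: MouseParams): MouseCommand {
  return { method: "Input.dispatchMouseEvent", params };
}

// Builds the events for a click (clicks = 1) or double click (clicks = 2)
// at the given point. Assumes press/release pairs count up: 1, then 2.
function clickSequence(x: number, y: number, clicks: 1 | 2): MouseCommand[] {
  const commands: MouseCommand[] = [mouse({ type: "mouseMoved", x, y })];
  for (let clickCount = 1; clickCount <= clicks; clickCount++) {
    commands.push(mouse({ type: "mousePressed", x, y, button: "left", clickCount }));
    commands.push(mouse({ type: "mouseReleased", x, y, button: "left", clickCount }));
  }
  return commands;
}

const singleClick = clickSequence(40, 12, 1); // 3 events
const doubleClick = clickSequence(40, 12, 2); // 5 events
```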

page.fill() and page.clear()

fill() and clear() both use a single page-side evaluation:

ts
Runtime.evaluate(...)

The evaluated script:

  • finds the target element
  • sets the value directly
  • dispatches input
  • dispatches change

clear() is just fill(selector, "").
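
The script steps above could look roughly like this when built as a Runtime.evaluate expression. This is a sketch, not the library's actual script: fillExpression and clearExpression are hypothetical names, and resolving the element with document.querySelector is an assumption (the real resolution logic is not shown on this page).

```typescript
// Builds the page-side script as a string, the form Runtime.evaluate expects.
// JSON.stringify quotes the selector and value so they embed as string literals.
function fillExpression(selector: string, value: string): string {
  return `(() => {
    const el = document.querySelector(${JSON.stringify(selector)});
    if (!el) return false;
    el.value = ${JSON.stringify(value)};
    el.dispatchEvent(new Event("input", { bubbles: true }));
    el.dispatchEvent(new Event("change", { bubbles: true }));
    return true;
  })()`;
}

// clear() as described above: fill with an empty string.
function clearExpression(selector: string): string {
  return fillExpression(selector, "");
}

// The single CDP call that would carry the script.
const fillCall = {
  method: "Runtime.evaluate",
  params: { expression: fillExpression("#email", "a@b.co"), returnByValue: true },
};
```

Note that no key events are involved: the page sees input and change, but never keydown or keyup.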

page.type()

type() uses a focus step plus native text insertion:

ts
Runtime.evaluate(...) // focus element
Input.insertText({ text })

Use this when you want the page to observe real text entry rather than a direct value set.
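
For comparison with fill(), here is a sketch of the two calls type() would produce. The typeSequence helper and the querySelector-based focus script are assumptions made for illustration; Runtime.evaluate and Input.insertText with a text parameter are real CDP commands.

```typescript
// Hypothetical shape for a CDP call; not this library's real types.
type CdpCommand = { method: string; params: Record<string, unknown> };

// Focus the element in the page, then let Chromium insert the text natively.
function typeSequence(selector: string, text: string): CdpCommand[] {
  const focusScript = `document.querySelector(${JSON.stringify(selector)})?.focus()`;
  return [
    { method: "Runtime.evaluate", params: { expression: focusScript } },
    { method: "Input.insertText", params: { text } },
  ];
}

const typing = typeSequence("#search", "hello");
```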

page.press()

press() focuses first, then sends a key down and key up:

ts
Runtime.evaluate(...) // focus element
Input.dispatchKeyEvent({ type: "keyDown", ... })
Input.dispatchKeyEvent({ type: "keyUp", ... })

This is the path for Enter, Tab, Escape, arrows, and shortcuts.
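
A minimal sketch of the key down/up pair, assuming a small lookup table for named keys. KEYS and keyPair are hypothetical; the fields key, code, and windowsVirtualKeyCode are real Input.dispatchKeyEvent parameters, and the virtual key codes shown are the standard ones. The library's actual key map is likely larger and may set additional fields.

```typescript
type KeyDef = { key: string; code: string; windowsVirtualKeyCode: number };

// A tiny, illustrative key table; not the library's real map.
const KEYS: Record<string, KeyDef> = {
  Enter: { key: "Enter", code: "Enter", windowsVirtualKeyCode: 13 },
  Tab: { key: "Tab", code: "Tab", windowsVirtualKeyCode: 9 },
  Escape: { key: "Escape", code: "Escape", windowsVirtualKeyCode: 27 },
  ArrowDown: { key: "ArrowDown", code: "ArrowDown", windowsVirtualKeyCode: 40 },
};

type KeyCommand = {
  method: "Input.dispatchKeyEvent";
  params: KeyDef & { type: "keyDown" | "keyUp" };
};

// Builds the keyDown/keyUp pair sent after the element has been focused.
function keyPair(name: string): KeyCommand[] {
  if (!Object.prototype.hasOwnProperty.call(KEYS, name)) {
    throw new Error(`Unknown key: ${name}`);
  }
  const def = KEYS[name] as KeyDef;
  return [
    { method: "Input.dispatchKeyEvent", params: { type: "keyDown", ...def } },
    { method: "Input.dispatchKeyEvent", params: { type: "keyUp", ...def } },
  ];
}

const enterPair = keyPair("Enter");
```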

page.hover()

hover() resolves the box and moves the pointer to the element center:

ts
Runtime.evaluate(...)
Input.dispatchMouseEvent({ type: "mouseMoved", ... })

page.check(), page.uncheck(), and page.setChecked()

These use page-side evaluation to update the element state and fire form events:

ts
Runtime.evaluate(...)

The script:

  • confirms the element is an <input>
  • compares the current checked state
  • updates checked
  • dispatches input
  • dispatches change
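
The steps listed above can be sketched as a small function over a structural element type, which makes the logic testable without a browser. setCheckedSketch and CheckableInput are hypothetical; skipping the events when the element is already in the target state is an assumption drawn from the "compares the current checked state" step.

```typescript
// Minimal structural type so the logic can run outside a browser.
type CheckableInput = {
  tagName: string;
  checked: boolean;
  dispatchEvent(event: { type: string }): boolean;
};

// Page-side logic sketch: returns true when the state actually changed.
// In a real page, makeEvent would be (t) => new Event(t, { bubbles: true }).
function setCheckedSketch(
  el: CheckableInput,
  checked: boolean,
  makeEvent: (type: string) => { type: string },
): boolean {
  if (el.tagName !== "INPUT") throw new Error("Target is not an <input>");
  if (el.checked === checked) return false; // already in the target state
  el.checked = checked;
  el.dispatchEvent(makeEvent("input"));
  el.dispatchEvent(makeEvent("change"));
  return true;
}

const firedOnBox: string[] = [];
const box: CheckableInput = {
  tagName: "INPUT",
  checked: false,
  dispatchEvent: (e) => { firedOnBox.push(e.type); return true; },
};
const changedFirst = setCheckedSketch(box, true, (type) => ({ type })); // true
const changedAgain = setCheckedSketch(box, true, (type) => ({ type })); // false, no events
```

Under this reading, check() and uncheck() are setChecked() with the target state fixed to true or false.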

page.setInputFiles()

File upload is also done in the page context:

ts
Runtime.evaluate(...)

The script:

  • decodes the provided file data
  • creates File objects
  • populates a DataTransfer
  • assigns el.files
  • dispatches input
  • dispatches change
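
A sketch of how that script might be assembled. The FilePayload shape and the setInputFilesExpression helper are hypothetical, and base64 is an assumed encoding (this page only says the file data is "decoded"). File, DataTransfer, atob, and Event are standard browser APIs used inside the page-side script.

```typescript
// Hypothetical payload shape; base64 encoding is an assumption.
type FilePayload = { name: string; mimeType: string; base64: string };

// Builds the page-side script for Runtime.evaluate as a string.
function setInputFilesExpression(selector: string, files: FilePayload[]): string {
  return `(() => {
    const el = document.querySelector(${JSON.stringify(selector)});
    if (!el || el.type !== "file") return false;
    const transfer = new DataTransfer();
    for (const f of ${JSON.stringify(files)}) {
      // Decode base64 into raw bytes, then wrap them in a File.
      const bytes = Uint8Array.from(atob(f.base64), (ch) => ch.charCodeAt(0));
      transfer.items.add(new File([bytes], f.name, { type: f.mimeType }));
    }
    el.files = transfer.files;
    el.dispatchEvent(new Event("input", { bubbles: true }));
    el.dispatchEvent(new Event("change", { bubbles: true }));
    return true;
  })()`;
}

// "aGk=" is the base64 encoding of the two bytes "hi".
const uploadExpression = setInputFilesExpression("#avatar", [
  { name: "a.txt", mimeType: "text/plain", base64: "aGk=" },
]);
```

Embedding file contents in the script keeps everything in one evaluation, at the cost of sending the whole payload through the protocol as text.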

page.screenshot(), page.screenshotBase64(), and page.pdf()

Both screenshot methods use a single capture call:

ts
Page.captureScreenshot(...)

For PDF:

ts
Emulation.setEmulatedMedia({ media: "screen" })
Page.printToPDF(...)
Emulation.setEmulatedMedia(...) // restore the previous media setting

pdf() temporarily switches media to screen, prints, then restores the media setting.
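
One way to make the "switch, print, restore" order robust is a try/finally around the print call. The sketch below is synchronous for brevity (a real CDP session is asynchronous), and printWithScreenMedia, Send, and the previousMedia argument are hypothetical. In CDP, passing an empty string as media disables the override.

```typescript
// Hypothetical synchronous stand-in for a CDP session's send function.
type Send = (method: string, params: Record<string, unknown>) => unknown;

function printWithScreenMedia(send: Send, previousMedia: string): unknown {
  send("Emulation.setEmulatedMedia", { media: "screen" });
  try {
    return send("Page.printToPDF", { printBackground: true });
  } finally {
    // Runs even if printing throws, so the override never leaks.
    send("Emulation.setEmulatedMedia", { media: previousMedia });
  }
}

// Recording fake: the PDF payload is the base64 of "%PDF-", just a placeholder.
const pdfLog: string[] = [];
const pdfResult = printWithScreenMedia((method, params) => {
  pdfLog.push(method + ":" + String(params["media"]));
  return method === "Page.printToPDF" ? { data: "JVBERi0=" } : {};
}, "");
```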

page.evaluate(), page.content(), and query helpers

These are the "read / inspect / script" paths:

  • page.evaluate() uses Runtime.evaluate
  • page.content() uses Runtime.evaluate to serialize the document HTML
  • query() and queryXPath() use Runtime.evaluate with returnByValue: false
  • queryAll() and queryAllXPath() use Runtime.evaluate plus Runtime.getProperties
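
To illustrate why the "all" variants need a second command: with returnByValue: false, Runtime.evaluate returns a single remote handle to the array, and Runtime.getProperties then lists its entries. A sketch of picking element handles out of that listing follows. The types are a trimmed subset of CDP's RemoteObject and PropertyDescriptor shapes, and elementHandleIds is a hypothetical helper.

```typescript
// Trimmed subsets of CDP's Runtime.RemoteObject and PropertyDescriptor.
type RemoteObject = { type: string; subtype?: string; objectId?: string };
type RemoteProperty = { name: string; value?: RemoteObject };

// Keeps only array-index entries that point at DOM nodes, in the order
// they were returned. Entries like "length" are skipped.
function elementHandleIds(properties: RemoteProperty[]): string[] {
  const ids: string[] = [];
  for (const prop of properties) {
    const value = prop.value;
    if (/^\d+$/.test(prop.name) && value && value.subtype === "node" && value.objectId) {
      ids.push(value.objectId);
    }
  }
  return ids;
}

// Example listing for a two-element array; the objectIds are made up.
const listing: RemoteProperty[] = [
  { name: "0", value: { type: "object", subtype: "node", objectId: "obj-1" } },
  { name: "1", value: { type: "object", subtype: "node", objectId: "obj-2" } },
  { name: "length", value: { type: "number" } },
];
const handleIds = elementHandleIds(listing); // ["obj-1", "obj-2"]
```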

Aliases And Wrappers

These do not introduce CDP commands of their own:

  • fillInput() is the same as fill()
  • clear() is fill() with an empty string
  • typeSecure() is the same as type(), with sensitive logging
  • textSecure() is the same as text(), with sensitive logging
  • valueSecure() is the same as value(), with sensitive logging
  • setFileInput() is a small wrapper around setInputFiles()
  • page.locator(), page.getByText(), and page.getByRole() create locator queries only; they do not send CDP until an action runs
  • Locator methods are fresh-resolution wrappers around Frame

If You Want More

This page is intentionally "full but readable".

If you want an even deeper layer later, the next step would be a protocol inspector that shows:

  • the exact session.send(...) payload shape
  • the element resolution script used for each action
  • the frame-scoped differences for shadow DOM and XPath

Chromium-only automation built on CDP.