Write test intent once. Get Playwright code, runs, and artifacts.
This is an enterprise-grade, web-based test automation framework built around a strict FlowStep DSL. You submit a spec; the system parses it into a validated plan, generates Playwright tests (TS or JS), and can optionally execute them as a job while streaming logs and collecting artifacts (report, traces, screenshots, video).
What you get (end-to-end)
Example input
CONFIG
base_url | https://example.com
headless | true
trace | on-first-retry
video | retain-on-failure
screenshot | only-on-failure
ENDCONFIG
DATA
email | user@example.com
password | $SECRET:PASSWORD
ENDDATA
TEST | Login Smoke
S001 | NAVIGATE | /
S002 | FILL | role=textbox name="Email" | ${email}
S003 | FILL | role=textbox name="Password" | ${password}
S004 | CLICK | role=button name="Sign in"
S005 | EXPECT_VISIBLE | text="Welcome"
ENDTEST
Secrets are resolved from environment variables at runtime (not stored in the spec).
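As a rough illustration of how `${name}` variables and `$SECRET:NAME` references could be resolved, here is a minimal sketch. The function and type names (`resolveValue`, `resolveSecret`, `DataMap`, `EnvMap`) are hypothetical and not taken from the tool's source; the environment is passed in as a plain object so the example stays self-contained.

```typescript
// Hypothetical helpers; names and error messages are illustrative only.
type DataMap = Record<string, string>;
type EnvMap = Record<string, string | undefined>;

// Resolve a $SECRET:NAME reference from the environment,
// or return the raw value unchanged if it is not a secret reference.
function resolveSecret(raw: string, env: EnvMap): string {
  const match = /^\$SECRET:([A-Za-z_][A-Za-z0-9_]*)$/.exec(raw);
  if (!match) return raw;
  const value = env[match[1]];
  if (value === undefined) {
    throw new Error(`Missing environment variable for secret: ${match[1]}`);
  }
  return value;
}

// Substitute ${name} placeholders from the DATA block,
// resolving secret references found in the substituted values.
function resolveValue(raw: string, data: DataMap, env: EnvMap): string {
  return raw.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_whole, key: string) => {
    if (!Object.prototype.hasOwnProperty.call(data, key)) {
      throw new Error(`Unknown variable: ${key}`);
    }
    return resolveSecret(data[key], env);
  });
}
```

In a Node.js process, `process.env` could be passed as the `env` argument, so the secret value itself never has to appear in the spec text.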
What it is
A monorepo application (web + API + shared contracts) that turns test specs into deterministic plans, then into Playwright tests, and optionally into executed runs with durable artifacts.
This tool generates Playwright code and can execute it as a job. If you want “push to repo / auto-commit” behavior, that would be an additional feature on top of what’s here. What you do get today is code output you can review and commit using your normal Git workflow.
How it works (request → artifacts)
The important part is the pipeline discipline. The system keeps the intermediate artifacts, so failures are explainable, not mysterious.
1) Submit steps text, choose TS/JS, and decide whether to run immediately.
2) If enabled, NL lines can be converted into FlowStep DSL using a configured LLM provider.
3) CONFIG/DATA/TEST blocks are parsed into a structured representation.
4) Steps become a normalized plan with resolved variables, validated command shapes, and consistent semantics.
5) For ambiguous intent, enrichment can produce a deterministic selection trace for complex target choices.
6) Emit runnable Playwright spec(s) and supporting files in a job workspace.
7) Run Playwright. Stream logs via Server-Sent Events. Capture outputs based on configured retention.
8) Fetch logs, parsed plan, artifacts, and the Playwright HTML report for review and debugging.
A lot of “rapid automation” tools hide the intermediate decisions. This one makes the pipeline explicit: input, parse, normalize, generate, run, artifacts. That’s what keeps speed from turning into chaos.
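To make the parse stage concrete, the sketch below turns CONFIG/DATA/TEST text into a structured object. It is an assumption-heavy illustration: the type names (`ParsedSpec`, `ParsedTest`, `ParsedStep`) and the `parseSpec` function are hypothetical, and it only handles the strict pipe-delimited form, not the concise or natural-language forms.

```typescript
// Illustrative types; the tool's real structured representation is not shown in this document.
interface ParsedStep { id: string; command: string; args: string[]; }
interface ParsedTest { name: string; steps: ParsedStep[]; }
interface ParsedSpec {
  config: Record<string, string>;
  data: Record<string, string>;
  tests: ParsedTest[];
}

// Split a pipe-delimited line into trimmed cells.
function splitCells(line: string): string[] {
  return line.split("|").map((cell) => cell.trim());
}

// Parse strict FlowStep text into config, data, and tests.
function parseSpec(text: string): ParsedSpec {
  const spec: ParsedSpec = { config: {}, data: {}, tests: [] };
  let section: "none" | "config" | "data" | "test" = "none";
  let current: ParsedTest | null = null;

  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue;
    if (line === "CONFIG") { section = "config"; continue; }
    if (line === "DATA") { section = "data"; continue; }
    if (line === "ENDCONFIG" || line === "ENDDATA") { section = "none"; continue; }
    if (line === "ENDTEST") {
      if (current) spec.tests.push(current);
      current = null;
      section = "none";
      continue;
    }

    const cells = splitCells(line);
    if (cells[0] === "TEST") {
      current = { name: cells.length > 1 ? cells[1] : "", steps: [] };
      section = "test";
      continue;
    }

    if (section === "config" || section === "data") {
      if (cells.length !== 2) throw new Error(`Expected "key | value": ${line}`);
      const target = section === "config" ? spec.config : spec.data;
      target[cells[0]] = cells[1];
    } else if (section === "test" && current) {
      if (cells.length < 2) throw new Error(`Expected "id | COMMAND | ...": ${line}`);
      current.steps.push({ id: cells[0], command: cells[1], args: cells.slice(2) });
    } else {
      throw new Error(`Line outside any block: ${line}`);
    }
  }
  return spec;
}
```

Keeping the parsed object around as a job artifact is one way the intermediate state could stay inspectable when a later stage fails.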
FlowStep DSL (strict, but practical)
FlowStep supports different levels of explicitness, plus a locator syntax designed for Playwright selectors (roles, text, CSS, XPath).
Syntax comparison (same intent, different strictness)
// Concise (inference)
S001 FILL | "Email" | user@example.com
S002 CLICK | "Login"
// More explicit
S001 FILL | "Email" textbox | user@example.com
S002 CLICK | "Login" button
// Strict (full selector control)
S001 | FILL | role=textbox name="Email" | user@example.com
S002 | CLICK | role=button name="Login"
Variables can be referenced as ${name}. Secrets are $SECRET:NAME (resolved from env).
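One plausible way strict steps could map onto Playwright calls is sketched below. The mapping functions (`locatorToPlaywright`, `emitStep`) are hypothetical and only cover the commands and locator forms that appear in this document; `page.getByRole`, `page.getByText`, `page.locator`, and `expect(...).toBeVisible()` are standard Playwright APIs, but the generator's actual output may differ.

```typescript
// Hypothetical mapping from a strict FlowStep locator to a Playwright locator expression.
function locatorToPlaywright(locator: string): string {
  const role = /^role=(\w+)(?:\s+name="([^"]*)")?$/.exec(locator);
  if (role) {
    const roleName = role[1];
    const accessibleName = role[2];
    return accessibleName === undefined
      ? `page.getByRole(${JSON.stringify(roleName)})`
      : `page.getByRole(${JSON.stringify(roleName)}, { name: ${JSON.stringify(accessibleName)} })`;
  }
  const text = /^text="([^"]*)"$/.exec(locator);
  if (text) {
    return `page.getByText(${JSON.stringify(text[1])})`;
  }
  // Anything else (for example CSS or XPath) is passed through to page.locator().
  return `page.locator(${JSON.stringify(locator)})`;
}

// Emit one line of Playwright code for a step.
// Assumes the step has already been validated, so args has the right length.
function emitStep(command: string, args: string[]): string {
  switch (command) {
    case "NAVIGATE":
      return `await page.goto(${JSON.stringify(args[0])});`;
    case "FILL":
      return `await ${locatorToPlaywright(args[0])}.fill(${JSON.stringify(args[1])});`;
    case "CLICK":
      return `await ${locatorToPlaywright(args[0])}.click();`;
    case "EXPECT_VISIBLE":
      return `await expect(${locatorToPlaywright(args[0])}).toBeVisible();`;
    default:
      throw new Error(`Unsupported command: ${command}`);
  }
}
```

For example, `emitStep("CLICK", ['role=button name="Login"'])` returns `await page.getByRole("button", { name: "Login" }).click();`. Using `JSON.stringify` for string literals keeps quotes and backslashes in user-supplied values from breaking the generated code.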
What the DSL covers
- CONFIG defaults (browser/headless/timeouts/workers/retries/trace/video/screenshot, etc.)
- DATA values and secret references
- TEST blocks with step commands and validation
- Locator syntax suitable for Playwright selector strategies
- Optional decision trace for complex or ambiguous selection intent
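Step validation of the kind listed above might look something like the following sketch. The arity table, the `S` plus three digits ID pattern, and the names `COMMAND_ARITY`, `StepIssue`, and `validateStep` are all assumptions inferred from the examples in this document, not the tool's actual rules.

```typescript
// Assumed argument counts, covering only commands that appear in this document.
const COMMAND_ARITY: Record<string, number> = {
  NAVIGATE: 1,
  FILL: 2,
  CLICK: 1,
  EXPECT_VISIBLE: 1,
};

interface StepIssue {
  stepId: string;
  message: string;
}

// Return every problem found in one step; an empty array means the shape is valid.
function validateStep(stepId: string, command: string, args: string[]): StepIssue[] {
  const issues: StepIssue[] = [];
  // Assumption: step IDs look like S001, as in the examples.
  if (!/^S\d{3}$/.test(stepId)) {
    issues.push({ stepId, message: `Malformed step ID: ${stepId}` });
  }
  if (!Object.prototype.hasOwnProperty.call(COMMAND_ARITY, command)) {
    issues.push({ stepId, message: `Unknown command: ${command}` });
    return issues;
  }
  const expected = COMMAND_ARITY[command];
  if (args.length !== expected) {
    issues.push({
      stepId,
      message: `${command} expects ${expected} argument(s), got ${args.length}`,
    });
  }
  if (args.some((arg) => arg.trim() === "")) {
    issues.push({ stepId, message: `${command} has an empty argument` });
  }
  return issues;
}
```

Collecting all issues instead of throwing on the first one lets a UI report every broken step in a spec at once.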
Natural language steps (accepted patterns)
Natural language lines (without pipe syntax) can be interpreted into FlowStep DSL when an LLM provider is enabled. The intent must still be clear and test-oriented. These are examples of supported patterns.
Accepted natural language examples
Navigate to the login page
Fill in the Email field with user@example.com
Type the password into the Password textbox
Click the Sign In button
Check the Remember me checkbox
Wait for the dashboard to load
Verify that "Welcome" is visible
Expect the error message "Invalid credentials"
Select "Canada" from the Country dropdown
Each line must express a single, clear test action or assertion.
How they are converted internally
// Natural language
Click the Sign In button
// Converted FlowStep
S004 | CLICK | role=button name="Sign In"
// Natural language
Verify that "Welcome" is visible
// Converted FlowStep
S005 | EXPECT_VISIBLE | text="Welcome"
// Natural language
Fill in the Email field with user@example.com
// Converted FlowStep
S002 | FILL | role=textbox name="Email" | user@example.com
The conversion produces strict DSL, which then follows the same deterministic pipeline as manually written FlowStep.
Natural language is interpreted into structured FlowStep before validation and generation. The system does not execute raw English. It executes validated DSL. That separation keeps AI-assisted input from turning into unpredictable automation.
LLM support (optional, controlled)
LLMs are used in two distinct places: converting natural language into FlowStep DSL, and enriching locators/decisions. You can run fully local (Ollama) or remote (OpenAI-compatible), and restrict allowed remote base URLs.
Use LLMs where they save time, but keep the system grounded in deterministic outputs: DSL, validated plan, and standard Playwright code.
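Restricting remote base URLs could be done with an origin allowlist, as in this minimal sketch. The function names (`parseUrl`, `isAllowedBaseUrl`) and the exact matching rule (scheme, host, and port must all match) are assumptions; the tool's real check is not described in this document.

```typescript
// Parse a URL, returning null instead of throwing on malformed input.
function parseUrl(value: string): URL | null {
  try {
    return new URL(value);
  } catch (e) {
    return null;
  }
}

// Hypothetical check: a remote base URL is allowed only if it is http(s) and
// its origin (scheme + host + port) exactly matches an allowlisted origin.
function isAllowedBaseUrl(candidate: string, allowlist: string[]): boolean {
  const url = parseUrl(candidate);
  if (url === null) return false;
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  const candidateOrigin = url.origin;
  return allowlist.some((entry) => {
    const allowed = parseUrl(entry);
    return allowed !== null && allowed.origin === candidateOrigin;
  });
}
```

Comparing parsed origins instead of raw string prefixes avoids accepting look-alike hosts such as `https://llm.example.com.evil.test` when `https://llm.example.com` is allowlisted.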
Architecture (monorepo)
The project is split into a backend service, a frontend service, and shared contracts/schemas. Jobs and artifacts live under a workspace directory.
apps/api (backend)
Job creation, parsing/validation, optional LLM services, code generation, execution, SSE log streaming, and artifact serving.
apps/web (frontend)
UI for creating jobs, selecting LLM provider modes, viewing plans, streaming logs, and accessing reports/artifacts.
packages/shared
Shared types/contracts and schemas to keep API ↔ UI aligned and reduce drift.
Runtime notes
- Workspace dir holds job inputs, generated code, logs, and artifacts.
- Jobs are durable units: they keep parsed plan, run outputs, and downloadable reports.
- Suites can group multiple specs and run them as jobs.
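For the SSE log streaming handled by the API, each log line has to be framed in the Server-Sent Events wire format before it is written to the response. The sketch below shows that framing only; the `LogEvent` shape and the `log` event name are hypothetical, while the `id:`, `event:`, and `data:` fields and the blank-line terminator come from the SSE format itself.

```typescript
// Hypothetical shape of one streamed log entry.
interface LogEvent {
  id: number;
  event: string;
  data: string;
}

// Frame one event in the Server-Sent Events wire format.
// Multi-line payloads become one "data:" line per line of text,
// and a blank line terminates the event.
function formatSseEvent(entry: LogEvent): string {
  const dataLines = entry.data.split(/\r?\n/).map((line) => `data: ${line}`);
  const lines = [`id: ${entry.id}`, `event: ${entry.event}`].concat(dataLines, ["", ""]);
  return lines.join("\n");
}
```

Including an `id` on each event lets a reconnecting browser client send `Last-Event-ID`, which a server could use to resume the log stream from where it left off.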
Deploy a self-contained Playwright repo (TS or JS)
In this app, “deployment” means more than generating a test file. You can choose an output format (Playwright + TypeScript or Playwright + JavaScript), then deploy the generated framework into a selected external repository root. The result is a self-contained Playwright environment where the generated scripts can be run immediately.
What gets deployed
- package.json with Playwright dependencies and scripts (install / test / report)
- Playwright config (playwright.config.ts for TS or playwright.config.js for JS)
- Test folder layout (tests/ or similar) containing generated specs
- Optional helpers / fixtures / shared utilities needed by the generated output
- Artifacts/report output directories (trace/video/screenshots + HTML report)
- Optional README stub in the deployed repo with “run” instructions
How deployment works (conceptually)
1) Input spec (FlowStep + optional NL)
2) Parse + validate → TestPlan
3) Generate Playwright tests (TS or JS)
4) Assemble a complete repo scaffold for the selected format
5) Write files into the selected external repository root
6) Result: `npm install` + `npx playwright install` + `npm test`
This is “deployment” as an output deliverable: a runnable repo, not just generated code.
It removes the usual setup tax. Instead of “here’s a test file, now wire Playwright,” you get a complete runnable repository skeleton that can be committed, reviewed, and executed in CI.
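The scaffold assembly step can be pictured as building a map from relative file paths to file contents, which is then written into the target repository root. This is a simplified sketch: `buildScaffold`, `GeneratedSpec`, the package name, the `latest` version placeholder, and the script names are all hypothetical. The `defineConfig` helper and the `playwright test`, `playwright show-report`, and `playwright install` commands are standard Playwright.

```typescript
type OutputFormat = "ts" | "js";

// Hypothetical shape of one generated test file.
interface GeneratedSpec {
  name: string;
  source: string;
}

// Build a path -> contents map for a minimal, self-contained Playwright repo.
function buildScaffold(format: OutputFormat, specs: GeneratedSpec[]): Record<string, string> {
  const files: Record<string, string> = {};

  files["package.json"] = JSON.stringify(
    {
      name: "generated-playwright-tests", // placeholder name
      private: true,
      scripts: {
        "install:browsers": "playwright install",
        test: "playwright test",
        report: "playwright show-report",
      },
      devDependencies: { "@playwright/test": "latest" }, // placeholder version
    },
    null,
    2
  );

  // TS output uses ES module syntax; JS output uses CommonJS.
  const header =
    format === "ts"
      ? "import { defineConfig } from '@playwright/test';"
      : "const { defineConfig } = require('@playwright/test');";
  const exportPrefix = format === "ts" ? "export default" : "module.exports =";
  files[`playwright.config.${format}`] = [
    header,
    "",
    `${exportPrefix} defineConfig({`,
    "  testDir: './tests',",
    "  reporter: 'html',",
    "});",
    "",
  ].join("\n");

  for (const spec of specs) {
    files[`tests/${spec.name}.spec.${format}`] = spec.source;
  }
  return files;
}
```

Returning a plain map keeps the assembly step free of file-system side effects, so the same output could be previewed in the UI, written to a job workspace, or written into an external repository root.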
Disclaimer: This app is entirely owned by Emil Zimmermann. Any use of this app must be authorized by the owner.