# Skill Verifier

Master orchestrator for skill verification: routes html_sandbox skills to server-side Playwright sandbox execution, and text skills through the 3-pass classify/scan/analyze pipeline.

## Quick Reference

> **Version:** 2.1.0
> **Purpose:** Master workflow for verifying a SkillSlap skill. Detects the skill's render mode
> and routes to the correct verification pipeline: server-side sandbox execution for `html_sandbox`
> skills, or the manual 3-pass pipeline for `terminal`/`output_render` skills.
> References other toolkit skills by tag: `classifier`, `scanner`, `tester`.

---

## 1. Overview

SkillSlap skills have three **render modes** that determine how they execute and what verification
evidence gets captured:

| Render Mode | What it means | Verification method |
|---|---|---|
| `html_sandbox` | Skill has a self-contained HTML file run in a browser | **System route** — server executes Playwright, captures screenshots + video |
| `output_render` | Skill's agent output is HTML/SVG (skill itself is text) | Manual 3-pass pipeline |
| `terminal` | Text-based instructions, outputs to terminal | Manual 3-pass pipeline |

**Always detect the render mode first** (Step 1) before choosing a pipeline.

---

## 2. Prerequisites

- SkillSlap API access (Bearer token)
- Anthropic API key set in your SkillSlap profile (required for system verification)
- The following toolkit skills (find via `GET /api/skills?tag=toolkit`):
  - **Skill Classifier** (tags: `classifier`, `toolkit`)
  - **Malware Scanner** (tags: `scanner`, `toolkit`)
  - **API Tester** (tags: `tester`, `toolkit`) — optional, for API-type skills

---

## 3. Step 1: Fetch the Skill and Detect Render Mode

```http
GET /api/skills/{id}
Authorization: Bearer <token>
```

Extract: `title`, `description`, `content`, `tags`, `version`, `render_mode`, `content_checksum`

Also fetch the skill's files to check for HTML:

```http
GET /api/skills/{id}/files
Authorization: Bearer <token>
```

**Determine the pipeline to use:**

```
IF render_mode == "html_sandbox"
  OR any file has extension .html or mime_type "text/html":
    → Use Pipeline A: System Verification (Section 4)
ELSE:
    → Use Pipeline B: Manual 3-Pass Verification (Section 5)
```
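The routing rule above can be sketched in Python. This is a minimal sketch: the `filename` and `mime_type` field names are assumptions about the `/files` response shape; adjust to the actual API payload.

```python
def choose_pipeline(render_mode, files):
    """Return "A" (system verification) or "B" (manual 3-pass).

    `files` is the decoded list from GET /api/skills/{id}/files;
    the `filename`/`mime_type` keys are assumed, not confirmed.
    """
    has_html = any(
        f.get("filename", "").endswith(".html")
        or f.get("mime_type") == "text/html"
        for f in files
    )
    if render_mode == "html_sandbox" or has_html:
        return "A"
    return "B"
```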

---

## 4. Pipeline A — System Verification (html_sandbox skills)

Use this for skills with `render_mode: "html_sandbox"` or any attached `.html` file.

**The system verification route handles everything server-side:**
- AI analysis (classify, malware scan, quality scoring) via your Anthropic API key
- Playwright Chromium sandbox execution (isolated, no external network)
- Screenshot capture at 0s / 1s / 3s
- WebM video recording of the full execution
- Thumbnail upload to storage (`verification-screenshots/previews/{id}.png`)
- `render_mode` and `preview_thumbnail_path` updated on the skill automatically
- Verification record created with full `execution_trace` + `demo_execution_trace`

**You do not need to run any of this manually.** Just POST to the system route:

```http
POST /api/skills/{id}/verifications/system
Authorization: Bearer <token>
Content-Type: application/json

{}
```

**Requirements:**
- You must be the skill **owner**
- Your SkillSlap profile must have an **Anthropic API key** configured
  (`PATCH /api/users/profile` with `{ "anthropic_api_key": "sk-ant-..." }`)

**Response (202 Accepted):**
```json
{
  "verification_id": "<uuid>",
  "status": "running",
  "message": "System verification started"
}
```

Poll for completion:

```http
GET /api/skills/{id}/verifications/system/latest
Authorization: Bearer <token>
```

Wait until `status` is `"passed"` or `"failed"`. On `"passed"`:
- The skill's `render_mode` is set to `"html_sandbox"`
- `preview_thumbnail_path` points to the captured screenshot in storage
- `demo_execution_trace` contains `visual_output` steps (screenshots) and a `video_output` step
- The skill card automatically shows the screenshot thumbnail, plus the live sandbox iframe on hover
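
The polling step can be sketched as a simple loop. The HTTP client is left to the caller: `fetch_latest` is any callable that GETs `/api/skills/{id}/verifications/system/latest` and returns the decoded JSON dict. Interval and timeout values are illustrative, not API requirements.

```python
import time

def poll_verification(fetch_latest, interval_s=5, timeout_s=300):
    """Poll until the verification status is "passed" or "failed"."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        result = fetch_latest()
        if result.get("status") in ("passed", "failed"):
            return result
        time.sleep(interval_s)  # still "running" — wait and re-poll
    raise TimeoutError("system verification did not finish in time")
```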

**If the skill fails system verification:**
- Check `execution_trace.steps` for `error` steps and JS console errors
- Fix the HTML (reduce complexity, remove external dependencies, fix JS errors)
- Re-run system verification

---

## 5. Pipeline B — Manual 3-Pass Verification (terminal / output_render skills)

Use this for text-based instruction skills and skills that produce output rendered externally.

### Step 1: Classify

Follow the **Skill Classifier** instructions to produce a `SkillClassification`:

```json
{
  "type": "agent_instructions",
  "requirements": { "api_access": false },
  "risk_level": "low",
  "reasoning": "..."
}
```

Record the classification in your execution trace.

### Step 2: Malware Scan

Follow the **Malware Scanner** instructions to produce a `MalwareScanResult`:

```json
{
  "scan_passed": true,
  "risk_level": "safe",
  "findings": [],
  "summary": "No threats detected."
}
```

**If the malware scan fails (`risk_level` is `"high"` or `"critical"`):**
- Stop the pipeline
- Set verification status to `failed`
- Include the malware findings in the `security_scan` field of your submission

### Step 3: Quality Analysis

Score the skill across 5 dimensions (0.0–1.0 each):

- **Clarity** — Instructions are clear and unambiguous
- **Completeness** — Covers all steps, edge cases, prerequisites
- **Security** — Free of security concerns
- **Executability** — An agent/human can follow and produce a result
- **Quality** — Professional formatting, well-structured

**Overall Score Formula:**
```
overall = security × 0.25 + clarity × 0.20 + completeness × 0.20 + executability × 0.20 + quality × 0.15
```
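
The formula above, as a small helper (dimension names follow this document; each score is expected in the 0.0–1.0 range):

```python
# Weights from the Overall Score Formula above — they sum to 1.0.
WEIGHTS = {
    "security": 0.25,
    "clarity": 0.20,
    "completeness": 0.20,
    "executability": 0.20,
    "quality": 0.15,
}

def overall_score(scores):
    """Weighted overall score across the 5 quality dimensions."""
    return sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)
```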

### Step 4: API Testing (Optional)

If classification indicates `api_workflow` and `api_access: true`:
- Follow the **API Tester** instructions
- Parse HTTP examples, execute requests, validate responses

### Step 5: Submit Results

```http
POST /api/skills/{id}/verifications
Authorization: Bearer <token>
Content-Type: application/json

{
  "tier": "community",
  "verification_mode": "local",
  "execution_trace": {
    "version": "1.0",
    "started_at": "<iso>",
    "completed_at": "<iso>",
    "steps": [ ... ],
    "summary": "Verification passed with 85% score"
  },
  "agent_info": {
    "model_name": "<your-model>",
    "model_provider": "<your-provider>",
    "agent_name": "<your-agent-name>",
    "agent_version": "<your-version>"
  }
}
```

---

## 6. Pass/Fail Criteria (Pipeline B)

The verification **passes** if ALL of the following are true:

1. Malware scan passed (`scan_passed: true`)
2. Security score >= 0.5
3. No critical or high security findings
4. Overall weighted score >= 0.5
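
The four criteria can be checked together. One assumption in this sketch: findings are assumed to carry a `severity` field with values like `"high"`/`"critical"`; the scanner's actual finding schema may differ.

```python
def verification_passes(scan, scores, overall):
    """Apply the four Pipeline B pass criteria from this section."""
    no_severe = not any(
        f.get("severity") in ("high", "critical")  # assumed field name
        for f in scan.get("findings", [])
    )
    return (
        scan.get("scan_passed") is True      # 1. malware scan passed
        and scores.get("security", 0.0) >= 0.5  # 2. security score
        and no_severe                         # 3. no high/critical findings
        and overall >= 0.5                    # 4. overall weighted score
    )
```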

---

## 7. Execution Trace Step Types

Build a structured trace with these step types:

| Type | Description |
|------|-------------|
| `info` | Informational messages |
| `ai_prompt` | AI model prompt (include model, provider, preview) |
| `ai_response` | AI model response (include tokens, parse success) |
| `api_request` | HTTP request made |
| `api_response` | HTTP response received |
| `assertion` | Pass/fail check |
| `visual_output` | Screenshot (image_data_uri, width, height) |
| `video_output` | Video recording (video_data_uri, mime_type, duration_ms) |
| `error` | Error encountered |

Each step must have a `timestamp` (ISO 8601).
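
A small builder can enforce the step-type table and the timestamp rule. The `message` key and the keyword-argument extras are assumptions about the step shape; match them to the actual `execution_trace` schema.

```python
from datetime import datetime, timezone

ALLOWED_STEP_TYPES = {
    "info", "ai_prompt", "ai_response", "api_request", "api_response",
    "assertion", "visual_output", "video_output", "error",
}

def trace_step(step_type, message, **fields):
    """Build one execution-trace step with an ISO 8601 timestamp."""
    if step_type not in ALLOWED_STEP_TYPES:
        raise ValueError(f"unknown step type: {step_type}")
    step = {
        "type": step_type,
        "message": message,  # assumed field name
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    step.update(fields)  # e.g. image_data_uri, width, height for visual_output
    return step
```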

---

## 8. Verification Modes

When submitting (Pipeline B), specify `verification_mode`:

| Mode | Description |
|------|-------------|
| `local` | Agent ran the skill locally on its own machine |
| `remote` | Agent ran the skill on a remote server |
| `sandboxed` | Agent ran the skill in a Docker sandbox |
| `system` | Platform-managed (system route only — use Pipeline A) |

---

## 9. Error Handling

- If any step fails, record an `error` step in the trace
- If AI fails to respond, retry once before marking as failed
- Always submit a verification result, even on failure — the trace is valuable
- Include `error_message` in the verification for human review
- For html_sandbox skills: if system verification fails, check JS errors and simplify the HTML
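
The "retry once" rule for AI calls can be sketched as a tiny wrapper (the bare `except Exception` is deliberately broad for illustration; narrow it to your client's error types in practice):

```python
def with_one_retry(call):
    """Run `call`; on the first failure retry once, then let it raise."""
    try:
        return call()
    except Exception:
        return call()  # second and final attempt
```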

---

## 10. Generating Playground Assets

Every skill must have a visual asset for its card in the Slap Stack feed. The card media
priority is: **thumbnail → audio → sandbox iframe → terminal trace → dark box**. Your job
after verifying is to ensure the skill has the richest possible asset at the highest priority.

### By skill type:

**`html_sandbox` — canvas games, interactive tools, visualizations**
The live sandbox iframe appears on the card automatically via `render_mode === 'html_sandbox'`
(priority 3). System verification (Section 4) also captures a screenshot → `preview_thumbnail_path`
(priority 1), so these cards get both. No extra work needed after Pipeline A completes.

**Audio skills (`has_audio: true`)**
The `SkillCardAudioVisualizer` renders on the card automatically (priority 2).
No extra work needed.

**AI / text agent skills (`terminal` or `output_render`, invocation_type `agent` or `user`)**
These need a `## Playground` section added to their skill content. The playground is a
self-contained HTML page (no external dependencies) that shows a pre-canned example of
the skill in action — realistic input on the left, styled output on the right. This is
generated once by the agent and baked into the skill. No live AI is needed on the card.

**Agent workflow skills**
Add a `## Playground` section containing a self-contained HTML flowchart or step diagram
showing the workflow visually (e.g. Red → Green → Refactor for TDD Workflow).

**Context / rules skills**
Add a `## Playground` section containing a styled HTML summary card listing the key rules
or conventions the skill enforces.

---

### Generating a `## Playground` section for text skills

**Step 1 — Pick a seed input.** Choose a realistic, concrete input that exercises the
skill's core capability. For a Code Reviewer: a short function with a real bug. For a
PR Description Generator: a sample diff. Keep it small enough to render clearly.

**Step 2 — Run the skill.** Apply the skill to the seed input and capture the actual output.

**Step 3 — Build the HTML.** Wrap input + output in a self-contained dark-theme HTML page.
Requirements:
- No external CDN or network dependencies (all CSS/JS inline)
- Renders well at 600×450px (the sandbox design size)
- Dark background (`#0d1117` or similar), readable contrast
- Syntax highlighting via inline `<style>` (no Prism CDN) or `<pre><code>` blocks
- Shows the skill title and a label like "Example Input / Example Output"
- Must not throw JS errors or require user interaction to render

**Step 4 — Add to skill content.** Append the section:

```markdown
## Playground

<!-- Self-contained demo — no external dependencies -->
<!DOCTYPE html>
<html>
...
</html>
```

**Step 5 — Update the skill** via `update_skill` with the new content including `## Playground`.

**Step 6 — Capture the screenshot.** For `html_sandbox` skills the system route does this
automatically. For text skills with a `## Playground` section, render the HTML locally,
take a screenshot, and upload it via `attach_demo_media` with `type: "image"` and set
`preview_thumbnail_path` to the stored path. This makes the card show your demo as its
primary visual (priority 1, `SkillCardPreview`).

---

### Quality bar for playground HTML

| Requirement | Detail |
|---|---|
| Self-contained | Zero external requests — no CDN, no fonts, no images from URLs |
| Correct dimensions | Designed for 600×450px viewport |
| Dark theme | Background ≤ `#1a1a2e`, text ≥ 60% contrast |
| No interaction required | Renders the demo state immediately on load |
| No JS errors | Clean console — errors break the sandbox iframe |
| Meaningful content | Shows actual input → output, not placeholder lorem ipsum |
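
Before uploading, the self-contained requirement can be spot-checked with a quick sketch like the following. It only flags external `src`/`href` URLs; it will not catch CSS `@import`, `fetch()` calls, or `url()` references, so treat it as a first pass, not a guarantee.

```python
import re

def external_refs(html):
    """Return src/href attributes that point at external http(s) URLs."""
    return re.findall(
        r"""(?:src|href)\s*=\s*["']https?://[^"']+["']""",
        html,
    )
```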

