{"manifest":{"name":"Skill Verifier","version":"2.1.0","description":"Master orchestrator for skill verification: routes html_sandbox skills to server-side Playwright sandbox execution, and text skills through the 3-pass classify/scan/analyze pipeline.","tags":["verification","workflow","meta","toolkit","orchestrator"],"standard":"agentskills.io","standard_version":"1.0","content_checksum":"7f463ce7f3b52d019b1918ba6a6f6ebb0c0d34c682f7f07ae2e90bffc1c104ca","bundle_checksum":null,"metadata":{},"files":[]},"files":{"SKILL.md":"# Skill Verifier — Verification Toolkit Orchestrator\n\n> **Version:** 2.1.0\n> **Purpose:** Master workflow for verifying a SkillSlap skill. Detects the skill's render mode\n> and routes to the correct verification pipeline: server-side sandbox execution for `html_sandbox`\n> skills, or the manual 3-pass pipeline for `terminal`/`output_render` skills.\n> References other toolkit skills by tag: `classifier`, `scanner`, `tester`.\n\n---\n\n## 1. Overview\n\nSkillSlap skills have three **render modes** that determine how they execute and what verification\nevidence gets captured:\n\n| Render Mode | What it means | Verification method |\n|---|---|---|\n| `html_sandbox` | Skill has a self-contained HTML file run in a browser | **System route** — server executes Playwright, captures screenshots + video |\n| `output_render` | Skill's agent output is HTML/SVG (skill itself is text) | Manual 3-pass pipeline |\n| `terminal` | Text-based instructions, outputs to terminal | Manual 3-pass pipeline |\n\n**Always detect the render mode first** (Step 1) before choosing a pipeline.\n\n---\n\n## 2. Prerequisites\n\n- SkillSlap API access (Bearer token)\n- Anthropic API key set in your SkillSlap profile (required for system verification)\n- The following toolkit skills (find via `GET /api/skills?tag=toolkit`):\n  - **Skill Classifier** (tags: `classifier`, `toolkit`)\n  - **Malware Scanner** (tags: `scanner`, `toolkit`)\n  - **API Tester** (tags: `tester`, `toolkit`) — optional, for API-type skills\n\n---\n\n## 3. Step 1: Fetch the Skill and Detect Render Mode\n\n```http\nGET /api/skills/{id}\nAuthorization: Bearer <token>\n```\n\nExtract: `title`, `description`, `content`, `tags`, `version`, `render_mode`, `content_checksum`\n\nAlso fetch the skill's files to check for HTML:\n\n```http\nGET /api/skills/{id}/files\nAuthorization: Bearer <token>\n```\n\n**Determine the pipeline to use:**\n\n```\nIF render_mode == \"html_sandbox\"\n  OR any file has extension .html or mime_type \"text/html\":\n    → Use Pipeline A: System Verification (Section 4)\nELSE:\n    → Use Pipeline B: Manual 3-Pass Verification (Section 5)\n```\n\n---\n\n## 4. Pipeline A — System Verification (html_sandbox skills)\n\nUse this for skills with `render_mode: \"html_sandbox\"` or any attached `.html` file.\n\n**The system verification route handles everything server-side:**\n- AI analysis (classify, malware scan, quality scoring) via your Anthropic API key\n- Playwright Chromium sandbox execution (isolated, no external network)\n- Screenshot capture at 0s / 1s / 3s\n- WebM video recording of the full execution\n- Thumbnail upload to storage (`verification-screenshots/previews/{id}.png`)\n- `render_mode` and `preview_thumbnail_path` updated on the skill automatically\n- Verification record created with full `execution_trace` + `demo_execution_trace`\n\n**You do not need to run any of this manually.** Just POST to the system route:\n\n```http\nPOST /api/skills/{id}/verifications/system\nAuthorization: Bearer <token>\nContent-Type: application/json\n\n{}\n```\n\n**Requirements:**\n- You must be the skill **owner**\n- Your SkillSlap profile must have an **Anthropic API key** configured\n  (`PATCH /api/users/profile` with `{ \"anthropic_api_key\": \"sk-ant-...\" }`)\n\n**Response (202 Accepted):**\n```json\n{\n  \"verification_id\": \"<uuid>\",\n  \"status\": \"running\",\n  \"message\": \"System verification started\"\n}\n```\n\nPoll for completion:\n\n```http\nGET /api/skills/{id}/verifications/system/latest\nAuthorization: Bearer <token>\n```\n\nWait until `status` is `\"passed\"` or `\"failed\"`. On `\"passed\"`:\n- The skill's `render_mode` is set to `\"html_sandbox\"`\n- `preview_thumbnail_path` points to the captured screenshot in storage\n- `demo_execution_trace` contains `visual_output` steps (screenshots) and a `video_output` step\n- The skill card shows the live sandbox iframe on hover + a screenshot thumbnail automatically\n\n**If the skill fails system verification:**\n- Check `execution_trace.steps` for `error` steps and JS console errors\n- Fix the HTML (reduce complexity, remove external dependencies, fix JS errors)\n- Re-run system verification\n\n---\n\n## 5. Pipeline B — Manual 3-Pass Verification (terminal / output_render skills)\n\nUse this for text-based instruction skills and skills that produce output rendered externally.\n\n### Step 1: Classify\n\nFollow the **Skill Classifier** instructions to produce a `SkillClassification`:\n\n```json\n{\n  \"type\": \"agent_instructions\",\n  \"requirements\": { \"api_access\": false },\n  \"risk_level\": \"low\",\n  \"reasoning\": \"...\"\n}\n```\n\nRecord the classification in your execution trace.\n\n### Step 2: Malware Scan\n\nFollow the **Malware Scanner** instructions to produce a `MalwareScanResult`:\n\n```json\n{\n  \"scan_passed\": true,\n  \"risk_level\": \"safe\",\n  \"findings\": [],\n  \"summary\": \"No threats detected.\"\n}\n```\n\n**If the malware scan fails (risk_level is \"high\" or \"critical\"):**\n- Stop the pipeline\n- Set verification status to `failed`\n- Include the malware findings in the `security_scan` field of your submission\n\n### Step 3: Quality Analysis\n\nScore the skill across 5 dimensions (0.0–1.0 each):\n\n- **Clarity** — Instructions are clear and unambiguous\n- **Completeness** — Covers all steps, edge cases, prerequisites\n- **Security** — Free of security concerns\n- **Executability** — An agent/human can follow and produce a result\n- **Quality** — Professional formatting, well-structured\n\n**Overall Score Formula:**\n```\noverall = security × 0.25 + clarity × 0.20 + completeness × 0.20 + executability × 0.20 + quality × 0.15\n```\n\n### Step 4: API Testing (Optional)\n\nIf classification indicates `api_workflow` and `api_access: true`:\n- Follow the **API Tester** instructions\n- Parse HTTP examples, execute requests, validate responses\n\n### Step 5: Submit Results\n\n```http\nPOST /api/skills/{id}/verifications\nAuthorization: Bearer <token>\nContent-Type: application/json\n\n{\n  \"tier\": \"community\",\n  \"verification_mode\": \"local\",\n  \"execution_trace\": {\n    \"version\": \"1.0\",\n    \"started_at\": \"<iso>\",\n    \"completed_at\": \"<iso>\",\n    \"steps\": [ ... ],\n    \"summary\": \"Verification passed with 85% score\"\n  },\n  \"agent_info\": {\n    \"model_name\": \"<your-model>\",\n    \"model_provider\": \"<your-provider>\",\n    \"agent_name\": \"<your-agent-name>\",\n    \"agent_version\": \"<your-version>\"\n  }\n}\n```\n\n---\n\n## 6. Pass/Fail Criteria (Pipeline B)\n\nThe verification **passes** if ALL of the following are true:\n\n1. Malware scan passed (`scan_passed: true`)\n2. Security score >= 0.5\n3. No critical or high security findings\n4. Overall weighted score >= 0.5\n\n---\n\n## 7. Execution Trace Step Types\n\nBuild a structured trace with these step types:\n\n| Type | Description |\n|------|-------------|\n| `info` | Informational messages |\n| `ai_prompt` | AI model prompt (include model, provider, preview) |\n| `ai_response` | AI model response (include tokens, parse success) |\n| `api_request` | HTTP request made |\n| `api_response` | HTTP response received |\n| `assertion` | Pass/fail check |\n| `visual_output` | Screenshot (image_data_uri, width, height) |\n| `video_output` | Video recording (video_data_uri, mime_type, duration_ms) |\n| `error` | Error encountered |\n\nEach step must have a `timestamp` (ISO 8601).\n\n---\n\n## 8. Verification Modes\n\nWhen submitting (Pipeline B), specify `verification_mode`:\n\n| Mode | Description |\n|------|-------------|\n| `local` | Agent ran the skill locally on its own machine |\n| `remote` | Agent ran the skill on a remote server |\n| `sandboxed` | Agent ran the skill in a Docker sandbox |\n| `system` | Platform-managed (system route only — use Pipeline A) |\n\n---\n\n## 9. Error Handling\n\n- If any step fails, record an `error` step in the trace\n- If AI fails to respond, retry once before marking as failed\n- Always submit a verification result, even on failure — the trace is valuable\n- Include `error_message` in the verification for human review\n- For html_sandbox skills: if system verification fails, check JS errors and simplify the HTML\n\n---\n\n## 10. Generating Playground Assets\n\nEvery skill must have a visual asset for its card in the Slap Stack feed. The card media\npriority is: **thumbnail → audio → sandbox iframe → terminal trace → dark box**. Your job\nafter verifying is to ensure the skill has the richest possible asset at the highest priority.\n\n### By skill type:\n\n**`html_sandbox` — canvas games, interactive tools, visualizations**\nThe live sandbox iframe appears on the card automatically via `render_mode === 'html_sandbox'`\n(priority 3). System verification (Section 4) also captures a screenshot → `preview_thumbnail_path`\n(priority 1), so these cards get both. No extra work needed after Pipeline A completes.\n\n**Audio skills (`has_audio: true`)**\nThe `SkillCardAudioVisualizer` renders on the card automatically (priority 2).\nNo extra work needed.\n\n**AI / text agent skills (`terminal` or `output_render`, invocation_type `agent` or `user`)**\nThese need a `## Playground` section added to their skill content. The playground is a\nself-contained HTML page (no external dependencies) that shows a pre-canned example of\nthe skill in action — realistic input on the left, styled output on the right. This is\ngenerated once by the agent and baked into the skill. No live AI is needed on the card.\n\n**Agent workflow skills**\nAdd a `## Playground` section containing a self-contained HTML flowchart or step diagram\nshowing the workflow visually (e.g. Red → Green → Refactor for TDD Workflow).\n\n**Context / rules skills**\nAdd a `## Playground` section containing a styled HTML summary card listing the key rules\nor conventions the skill enforces.\n\n---\n\n### Generating a `## Playground` section for text skills\n\n**Step 1 — Pick a seed input.** Choose a realistic, concrete input that exercises the\nskill's core capability. For a Code Reviewer: a short function with a real bug. For a\nPR Description Generator: a sample diff. Keep it small enough to render clearly.\n\n**Step 2 — Run the skill.** Apply the skill to the seed input and capture the actual output.\n\n**Step 3 — Build the HTML.** Wrap input + output in a self-contained dark-theme HTML page.\nRequirements:\n- No external CDN or network dependencies (all CSS/JS inline)\n- Renders well at 600×450px (the sandbox design size)\n- Dark background (`#0d1117` or similar), readable contrast\n- Syntax highlighting via inline `<style>` (no Prism CDN) or `<pre><code>` blocks\n- Shows the skill title and a label like \"Example Input / Example Output\"\n- Must not throw JS errors or require user interaction to render\n\n**Step 4 — Add to skill content.** Append the section:\n\n```markdown\n## Playground\n\n<!-- Self-contained demo — no external dependencies -->\n<!DOCTYPE html>\n<html>\n...\n</html>\n```\n\n**Step 5 — Update the skill** via `update_skill` with the new content including `## Playground`.\n\n**Step 6 — Capture the screenshot.** For `html_sandbox` skills the system route does this\nautomatically. For text skills with a `## Playground` section, render the HTML locally,\ntake a screenshot, and upload it via `attach_demo_media` with `type: \"image\"` and set\n`preview_thumbnail_path` to the stored path. This makes the card show your demo as its\nprimary visual (priority 1, `SkillCardPreview`).\n\n---\n\n### Quality bar for playground HTML\n\n| Requirement | Detail |\n|---|---|\n| Self-contained | Zero external requests — no CDN, no fonts, no images from URLs |\n| Correct dimensions | Designed for 600×450px viewport |\n| Dark theme | Background ≤ `#1a1a2e`, text ≥ 60% contrast |\n| No interaction required | Renders the demo state immediately on load |\n| No JS errors | Clean console — errors break the sandbox iframe |\n| Meaningful content | Shows actual input → output, not placeholder lorem ipsum |\n"}}