
Writing gh skill Definitions That Work Across Claude Code, Copilot, and Cursor
The new GitHub CLI skill system lets you write one agent instruction set that runs everywhere -- here's how to prompt it right
GitHub shipped gh skill on April 16, 2026. Five subcommands. One format. Install a skill once, and it writes config files for Claude Code, Copilot, Cursor, Codex, Gemini CLI, and a dozen other agent hosts automatically.
That sounds like a packaging story. It's actually a prompt engineering story.
A skill is a SKILL.md file with YAML frontmatter and markdown instructions. No compiled code. No binary. Just text that becomes the system prompt for whatever agent loads it. The quality of that text determines whether your skill produces consistent results across tools or silently degrades when the runtime switches models.
Over 2,600 skills have been published across registries as of early 2026. Snyk's ToxicSkills study found 36.82% of them have security flaws and 76 contained confirmed malicious payloads. The barrier to publish: one markdown file and a week-old GitHub account. No code signing. No review process.
So the skill ecosystem has a trust problem and a quality problem. Both come back to how the SKILL.md is written.
The Format
A skill is a directory. The only required file is SKILL.md:
---
name: code-reviewer
description: Reviews code changes for bugs, security issues, and style violations
version: 1.0.0
triggers:
- review
- cr
---
Below the frontmatter goes your instruction text. This is the prompt. Everything that matters happens here.
The specification uses a three-tier loading system. Discovery reads only the frontmatter (around 100 tokens). Activation loads the full SKILL.md (capped at 5,000 tokens by convention). Resources in a references/ directory load on demand. This progressive disclosure means you can install dozens of skills without blowing your context budget. But it also means your core instructions need to land in under 5,000 tokens.
What Makes a Skill Portable
Same SKILL.md, different agents. Claude Code reads it from .claude/skills/. Copilot reads from .agents/skills/. Cursor has its own path. gh skill install handles the routing.
The instructions themselves run on different models with different behaviors. Claude tends to follow long constraint lists faithfully. GPT models sometimes summarize or skip constraints in longer prompts. Gemini handles structured output formats well but can drift on tone instructions.
This means portable skills need a specific structure. Here's what works.
Pattern 1: Lead With the Verb
Don't describe what the skill is. Tell the agent what to do, immediately.
The Prompt (inside SKILL.md):
When activated, do the following:
1. Read all staged changes using git diff --cached
2. For each changed file, check for:
- SQL queries built with string concatenation
- Unvalidated user input passed to shell commands
- Hardcoded credentials or API keys
- Missing error handling on network calls
3. Output findings as a markdown checklist grouped by file
4. If no issues found, say "No issues found" and nothing else
Why This Works: Imperative instructions with numbered steps produce consistent behavior across models. The agent knows what to do first, what to check, how to format output, and when to stop. There's no room for interpretation.
Expected Output:
Review: src/api/users.py
- [ ] Line 42: SQL query uses f-string interpolation -- use parameterized query instead
- [ ] Line 67:
subprocess.run(user_input)-- input not sanitizedReview: config/settings.py
- [x] No issues found
Compare that to a vague skill that says "Review code for quality and security concerns." That instruction produces wildly different output depending on which model interprets it. Claude gives you a structured analysis. GPT might write a paragraph. Gemini might ask clarifying questions. Specificity is portability.
Pattern 2: Anti-Rationalization Tables
This pattern comes from Addy Osmani's agent-skills collection, which encodes Google engineering practices into 20 production-grade skills. The idea: agents will talk themselves out of doing hard things. You pre-empt that.
The Prompt:
## Mandatory Checks
Run ALL of these. No exceptions.
| If the agent thinks... | Do this instead |
|---------------------------------|------------------------------------|
| "The tests look fine, skipping" | Run the tests. Read the output. |
| "This file hasn't changed" | Check git blame for recent changes |
| "The types are obvious" | Verify return types explicitly |
| "This is a small change" | Small changes cause big outages |
Why This Works: Models use reasoning patterns that include self-justification. When an agent encounters a step it "thinks" is unnecessary, it generates an internal rationale to skip it. The table short-circuits that by naming the exact rationalizations and mapping each one to a concrete action. This works on every model because it targets the reasoning layer, not model-specific behavior.
Pattern 3: Output Contracts
The biggest portability failure is output format. One model returns JSON, another returns markdown, a third returns prose with the data buried in it. Fix this by specifying the contract explicitly.
The Prompt:
Output format (strict):
STATUS: [PASS | FAIL | WARN]
FILES_CHECKED: [integer]
ISSUES:
- file: [path]
line: [number]
severity: [critical | warning | info]
message: [one sentence]
Do not add commentary before or after this block.
Do not wrap in code fences unless the host requires it.
If zero issues, output STATUS: PASS and an empty ISSUES list.
Why This Works: Explicit field names, enumerated values, and negative instructions ("do not add commentary") produce parseable output regardless of the underlying model. The "do not wrap in code fences unless the host requires it" line handles a real divergence -- some agents auto-wrap structured output, others don't.
The Security Problem You Can't Ignore
Those 76 malicious skills Snyk found? They looked like normal tools. One posed as a "testing helper" and exfiltrated the entire codebase to an attacker's repository. Mitiga Labs demonstrated the attack: the skill's instructions told the agent to quietly push project files to a remote, and the agent complied.
Skills execute with whatever permissions the agent has. That often means filesystem access, terminal access, network access, and credential access. A SKILL.md is just text, so traditional security scanners (SAST, DAST, SCA) can't analyze it. The attack surface is semantic.
Before you install any skill: run gh skill preview to read the full SKILL.md. Check for instructions that reference external URLs, attempt to read credential files, or include obfuscated text. OWASP published an Agentic Skills Top 10 specifically for this threat category.
When you write your own skills, scope permissions explicitly. If your skill only needs to read files, say "Do not write, delete, or modify any files." If it doesn't need network access, say "Do not make HTTP requests or access external services." Negative constraints are more portable than permission systems, because they work at the prompt level across every host.
What This Means for Your Workflow
Writing a SKILL.md is prompt engineering with distribution. The techniques that make prompts work -- specific instructions, structured output, constraint lists, example-driven guidance -- are exactly what make skills portable across agents.
The format is simple enough to learn in an afternoon. The hard part is writing instructions that produce identical behavior whether Claude, GPT, or Gemini interprets them. Imperative verbs, numbered steps, anti-rationalization tables, output contracts, and explicit negative constraints get you most of the way there.
Start with one skill for a task your team does repeatedly. Test it in at least two different agent hosts. Watch where the output diverges and tighten the instructions until it doesn't.
Want hands-on training on writing agent skills and prompt engineering for your development team? Connect with Kief Studio on Discord or schedule a session.
Training
Want your team prompting like this?
Kief Studio runs hands-on prompt engineering workshops tailored to your stack and workflows.
Newsletter
Get techniques in your inbox.
New prompt engineering guides delivered weekly. No spam, unsubscribe anytime.
Subscribe
