Using GitHub Webhooks to Automate Your Documentation Pipeline
2026-05-26
Every documentation automation tool, including Pushpen, is built on the same primitive: the GitHub webhook. Understanding how webhooks work gives you a mental model for what happens between "you push code" and "a documentation PR appears."
This post is a technical deep-dive. We are going to cover the webhook payload structure, how to verify webhook signatures, how to extract meaningful changes from a push event, and how a complete documentation pipeline processes all of this.
If you are implementing something similar yourself, you can use this as a reference. If you are using Pushpen, this explains what is happening under the hood.
What Are GitHub Webhooks?
A webhook is an HTTP callback. You register a URL with GitHub, and GitHub calls that URL with an HTTP POST request whenever a specified event occurs in your repository.
GitHub supports dozens of event types. For documentation automation, the most relevant are:
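- push: fires when commits are pushed to a branch or tag
- pull_request: fires on PR activity such as opened, closed, reopened, or synchronized (a merge arrives as a closed action with merged set to true)
- issues: fires when an issue is opened, edited, or closed
- check_run: fires on check run activity, for example when a CI check is created or completes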
For documentation purposes, the push event is the primary trigger. You push code, GitHub fires the webhook, and the documentation system runs.
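GitHub names the event type in the X-GitHub-Event request header, so a receiver can decide what to do before it looks at the body. The sketch below is one way to route on that header; the `RouteDecision` labels and the `routeEvent` function are hypothetical names for illustration, not part of any GitHub or Pushpen API.

```typescript
// Hypothetical labels for what the receiver does next; a real system
// would dispatch to handlers instead of returning a string.
type RouteDecision = "generate-docs" | "summarize-pr" | "ignore";

// eventHeader is the value of the X-GitHub-Event request header.
function routeEvent(eventHeader: string | null): RouteDecision {
  switch (eventHeader) {
    case "push":
      // The primary trigger for documentation generation.
      return "generate-docs";
    case "pull_request":
      return "summarize-pr";
    default:
      // Covers a missing header, events we did not subscribe to, and
      // "ping", which GitHub sends once when a webhook is created.
      return "ignore";
  }
}
```

Routing on the header first keeps unrelated events cheap: they can be acknowledged and dropped without parsing the payload.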
The Push Event Payload Explained
The push event payload contains everything you need to understand what changed and why. Here is a trimmed example showing the fields that matter for documentation (the commit SHAs are shortened for readability; real payloads carry full hashes):
{
  "ref": "refs/heads/main",
  "before": "abc123",
  "after": "def456",
  "repository": {
    "id": 123456789,
    "full_name": "acme-corp/api-service",
    "name": "api-service",
    "default_branch": "main",
    "private": true,
    "owner": {
      "login": "acme-corp"
    }
  },
  "pusher": {
    "name": "alexchen",
    "email": "alex@acme.com"
  },
  "commits": [
    {
      "id": "def456abc789",
      "message": "feat: add rate limiting to auth endpoints",
      "timestamp": "2026-05-26T14:23:00Z",
      "author": {
        "name": "Alex Chen",
        "email": "alex@acme.com"
      },
      "added": [
        "src/middleware/rateLimiter.ts",
        "src/config/limits.ts"
      ],
      "modified": [
        "src/routes/auth.ts",
        "src/app.ts"
      ],
      "removed": []
    }
  ]
}
From this payload, a documentation system can extract:
- Which branch was pushed to (and ignore non-default branches)
- Which files were added, modified, or removed
- What the commit messages say
- Who pushed (useful for ignoring bot commits)
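The four items above can be pulled out of the payload in a few lines. This is a minimal sketch, assuming only the payload fields shown in the example; `PushPayload`, `PushFacts`, and `extractPushFacts` are names made up for this illustration.

```typescript
// Only the payload fields this sketch reads; the real payload has many more.
interface PushPayload {
  ref: string;
  repository: { default_branch: string };
  pusher: { name: string };
  commits: {
    message: string;
    added: string[];
    modified: string[];
    removed: string[];
  }[];
}

interface PushFacts {
  branch: string | null; // null when the ref is not a branch (e.g. a tag)
  isDefaultBranch: boolean;
  changedFiles: string[]; // de-duplicated across all commits, sorted
  subjects: string[]; // first line of each commit message
  pusher: string;
}

function extractPushFacts(payload: PushPayload): PushFacts {
  const prefix = "refs/heads/";
  const branch = payload.ref.startsWith(prefix)
    ? payload.ref.slice(prefix.length)
    : null;

  // A file touched by several commits should appear once.
  const changed = new Set<string>();
  for (const commit of payload.commits) {
    for (const file of commit.added) changed.add(file);
    for (const file of commit.modified) changed.add(file);
    for (const file of commit.removed) changed.add(file);
  }

  return {
    branch,
    isDefaultBranch: branch === payload.repository.default_branch,
    changedFiles: Array.from(changed).sort(),
    subjects: payload.commits.map((c) => c.message.split("\n")[0]),
    pusher: payload.pusher.name,
  };
}
```

Comparing against `repository.default_branch` rather than a hard-coded "main" keeps the filter correct for repositories that use a different default branch.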
How to Set Up a Webhook Manually
If you are building your own documentation automation, you register a webhook through the GitHub API or the repository settings UI. Through the API:
curl -X POST \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  -H "Content-Type: application/json" \
  https://api.github.com/repos/OWNER/REPO/hooks \
  -d '{
    "config": {
      "url": "https://yourdomain.com/api/webhook/github",
      "content_type": "json",
      "secret": "your-webhook-secret"
    },
    "events": ["push", "pull_request", "issues", "check_run"],
    "active": true
  }'
The secret is important — it is used to verify that incoming requests actually came from GitHub and not from someone who discovered your webhook URL.
Pushpen handles this automatically when you connect a repository. The webhook is installed via the GitHub API using the Pushpen GitHub token, and the secret is configured and stored securely.
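If you are registering the webhook yourself, the secret should be a random, high-entropy string rather than something typed by hand. A minimal sketch using Node's built-in `crypto` module follows; the function name and the 32-byte default are assumptions for illustration, not a GitHub requirement.

```typescript
import { randomBytes } from "node:crypto";

// Returns a hex-encoded random secret. 32 bytes (64 hex characters) is an
// assumed default for this sketch; pick whatever length your policy requires.
function generateWebhookSecret(byteLength: number = 32): string {
  return randomBytes(byteLength).toString("hex");
}
```

Generate the value once, pass it as `config.secret` when creating the hook, and store the same value wherever your endpoint reads its configuration (the handler later in this post reads `GITHUB_WEBHOOK_SECRET`).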
Verifying Webhook Signatures
Every incoming webhook request from GitHub includes an X-Hub-Signature-256 header containing an HMAC-SHA256 signature of the request body, signed with your webhook secret.
You must verify this signature before processing any webhook payload. Without verification, anyone who discovers your webhook URL can send fake events.
Here is the verification logic in TypeScript. It uses the Web Crypto API (crypto.subtle), so the same code works in Node.js and in edge runtimes:
import { NextRequest, NextResponse } from "next/server";

async function verifyGitHubSignature(
  secret: string,
  body: string,
  signatureHeader: string
): Promise<boolean> {
  const encoder = new TextEncoder();

  // Import the secret as a cryptographic key
  const key = await crypto.subtle.importKey(
    "raw",
    encoder.encode(secret),
    { name: "HMAC", hash: "SHA-256" },
    false,
    ["sign"]
  );

  // Sign the body with the key
  const mac = await crypto.subtle.sign(
    "HMAC",
    key,
    encoder.encode(body)
  );

  // Format as hex string with sha256= prefix
  const expected = "sha256=" + Array.from(new Uint8Array(mac))
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");

  // Constant-time comparison to prevent timing attacks.
  // Compare encoded byte lengths so the loop covers both arrays fully.
  const a = encoder.encode(expected);
  const b = encoder.encode(signatureHeader);
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a[i] ^ b[i];
  }
  return diff === 0;
}

// Usage in a Next.js API route:
export async function POST(req: NextRequest) {
  const rawBody = await req.text();
  const signature = req.headers.get("x-hub-signature-256") ?? "";
  const secret = process.env.GITHUB_WEBHOOK_SECRET;

  // Fail closed: never verify against a missing or empty secret
  if (!secret) {
    return NextResponse.json(
      { error: "Webhook secret not configured" },
      { status: 500 }
    );
  }

  const valid = await verifyGitHubSignature(secret, rawBody, signature);
  if (!valid) {
    return NextResponse.json(
      { error: "Invalid signature" },
      { status: 401 }
    );
  }

  // Safe to process the payload
  const payload = JSON.parse(rawBody);
  // ...
}
Two things worth noting about this implementation: first, we read the raw body as text before parsing it as JSON, because HMAC verification requires the exact byte sequence that was signed. Second, we use constant-time comparison to prevent timing attacks — the loop runs for the full length of the expected string regardless of where a mismatch occurs.
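If your endpoint only ever runs on Node.js, the same check can be written with the built-in `node:crypto` module, which provides `createHmac` and a ready-made constant-time comparison in `timingSafeEqual`. This is an alternative sketch, not the implementation above; the function name `verifySignatureNode` is made up for the example.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Node-only variant of the signature check.
function verifySignatureNode(
  secret: string,
  rawBody: string,
  signatureHeader: string
): boolean {
  // Fail closed if the secret is missing.
  if (secret.length === 0) return false;

  const expected = Buffer.from(
    "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex"),
    "utf8"
  );
  const received = Buffer.from(signatureHeader, "utf8");

  // timingSafeEqual throws when the inputs differ in length, so check first.
  if (expected.length !== received.length) return false;
  return timingSafeEqual(expected, received);
}
```

The length check leaks only the length of a valid signature, which is public knowledge, so it does not weaken the comparison.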
Extracting Meaningful Changes from a Push Payload
The raw push payload tells you which files changed and what the commit messages say. The documentation system needs to extract a useful summary from this.
Here is how Pushpen builds a diff summary from the push payload:
interface PushCommit {
  message: string;
  added: string[];
  modified: string[];
  removed: string[];
}

function buildDiffSummary(commits: PushCommit[]): string {
  const added = new Set<string>();
  const modified = new Set<string>();
  const removed = new Set<string>();
  const messages: string[] = [];

  for (const commit of commits) {
    // Take the first line of each commit message
    messages.push(`- ${commit.message.split("\n")[0]}`);
    for (const f of commit.added ?? []) added.add(f);
    for (const f of commit.modified ?? []) modified.add(f);
    for (const f of commit.removed ?? []) removed.add(f);
  }

  const parts: string[] = [
    `Commits:\n${messages.join("\n")}`
  ];

  if (added.size > 0) {
    parts.push(`Added files:\n${[...added].map(f => ` ${f}`).join("\n")}`);
  }
  if (modified.size > 0) {
    parts.push(`Modified files:\n${[...modified].map(f => ` ${f}`).join("\n")}`);
  }
  if (removed.size > 0) {
    parts.push(`Removed files:\n${[...removed].map(f => ` ${f}`).join("\n")}`);
  }

  const summary = parts.join("\n\n");

  // Truncate if too long for the AI context window
  return summary.length > 4000 ? summary.slice(0, 3997) + "..." : summary;
}
This diff summary, combined with the full repository context fetched from the GitHub API, becomes the input to the AI model that generates the documentation.
The Complete Pipeline: Webhook to PR
Here is the full sequence of events that happens between a push and a documentation pull request:
1. Developer pushes commits to main branch
        ↓
2. GitHub fires POST to https://pushpen.dev/api/webhook/github
   - Payload: full push event with commits and file lists
   - Header: X-Hub-Signature-256 for verification
        ↓
3. Pushpen verifies the webhook signature
   - Reject if invalid (return 401)
   - Accept if valid (return 200 immediately)
        ↓
4. Pushpen checks whether the repo is connected and which features are enabled
   - Look up connected_repos table by repo_full_name
   - Check auto_docs_enabled, pr_summaries_enabled, etc.
        ↓
5. Pushpen filters out bot pushes
   - Ignore pushes from github-actions[bot]
   - Ignore Pushpen's own commits (to prevent infinite loops)
        ↓
6. Pushpen fetches full repository context from GitHub API
   - Reads up to 100 relevant source files
   - Includes package.json, route files, existing docs
        ↓
7. AI model generates updated documentation
   - Input: diff summary + repository context + existing docs
   - Output: updated README, changelog entry, API docs, onboarding guide
        ↓
8. Pushpen creates a PR with the updated docs
   - Branch: docs/auto-update-[timestamp]
   - PR title: "docs: auto-update via Pushpen"
   - Files: whichever doc types are enabled
        ↓
9. Developer reviews and merges the PR
   - 2-minute review vs. 30-minute manual writing
The return 200 in step 3 is important. GitHub expects webhook endpoints to respond within 10 seconds. If your endpoint takes longer, GitHub terminates the connection and records the delivery as failed, and it does not retry failed deliveries automatically, so a slow endpoint can silently drop events. Pushpen returns 200 as soon as signature verification passes, then processes the payload asynchronously.
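The acknowledge-then-process pattern can be sketched with a small queue: the request handler only verifies and enqueues, and a separate worker does the slow part. Everything here (`DeliveryQueue`, `acknowledge`, the in-memory array) is a hypothetical illustration; a production system would more likely use a durable queue or background job runner.

```typescript
interface Job {
  deliveryId: string;
  payload: unknown;
}

// In-memory stand-in for a real job queue (assumption for this sketch).
class DeliveryQueue {
  private jobs: Job[] = [];

  get size(): number {
    return this.jobs.length;
  }

  enqueue(job: Job): void {
    this.jobs.push(job);
  }

  // Runs outside the request path; returns how many jobs were processed.
  async drain(worker: (job: Job) => Promise<void>): Promise<number> {
    let processed = 0;
    while (this.jobs.length > 0) {
      const job = this.jobs.shift() as Job;
      await worker(job);
      processed++;
    }
    return processed;
  }
}

// Request path: decide the HTTP status without doing any slow work.
function acknowledge(
  queue: DeliveryQueue,
  signatureValid: boolean,
  deliveryId: string,
  payload: unknown
): number {
  if (!signatureValid) return 401;
  queue.enqueue({ deliveryId, payload });
  return 200;
}
```

Because `acknowledge` does nothing but a push onto the queue, the response time stays well inside the timeout no matter how long documentation generation takes.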
Why Webhooks Beat Polling for Real-Time Docs
An alternative approach is polling — periodically checking for new commits and generating documentation if anything changed. This is simpler to implement but has significant downsides.
| Aspect | Webhooks | Polling |
|---|---|---|
| Latency | Near-instant (seconds) | Whatever your polling interval |
| Resource usage | Proportional to actual pushes | Constant regardless of activity |
| Missed events | Possible if your endpoint is down (failed deliveries are not retried automatically, but can be redelivered) | Possible if timing is wrong |
| Implementation complexity | Medium (signature verification) | Low |
| Scale | Efficient at any volume | Expensive at high volume |
For documentation automation, latency matters. You want the documentation PR to appear before the code review is finished — ideally within two minutes of the push, so the reviewer sees the docs update alongside the code change. Polling at five-minute intervals would mean documentation updates arrive after the code review is already underway.
Security Considerations for Webhook Endpoints
A few things to get right when building webhook-receiving endpoints:
Always verify signatures. This is not optional. Without signature verification, anyone who knows your webhook URL can trigger documentation generation for any payload they construct.
Read the body as raw text first. If you parse the JSON before verifying the signature, you may lose the exact byte sequence needed for verification. Read the raw body as a string, verify, then parse.
Return 200 quickly. GitHub has a 10-second timeout for webhook delivery. If your processing is slow, return 200 immediately and process asynchronously.
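Handle redeliveries gracefully. GitHub does not retry failed deliveries on its own, but a delivery can be redelivered manually from the webhook settings or through the REST API, so the same event can reach your endpoint more than once. Make your processing idempotent: if the same push event arrives twice, it should not generate two documentation PRs.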
Ignore bot commits. A documentation automation system that generates a PR on a push, which then triggers another push when the PR is merged, which triggers another documentation update — that is an infinite loop. Filter out pushes from known bots and from your own tooling.
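The last two points, idempotency and bot filtering, can share one gate in front of the pipeline. This is a rough sketch under stated assumptions: it keys duplicates on the X-GitHub-Delivery header (assuming a redelivery carries the same GUID as the original), treats any pusher name ending in "[bot]" as a bot, and keeps seen IDs in process memory. `DeliveryGate` and the bot name in the usage example are hypothetical.

```typescript
interface IncomingPush {
  deliveryId: string; // value of the X-GitHub-Delivery header
  pusherName: string; // payload.pusher.name
}

class DeliveryGate {
  // In-memory only (assumption); a real deployment would need shared,
  // durable storage so duplicates are caught across processes and restarts.
  private seen = new Set<string>();
  private ignoredPushers: string[];

  constructor(ignoredPushers: string[]) {
    this.ignoredPushers = ignoredPushers;
  }

  shouldProcess(push: IncomingPush): boolean {
    // Duplicate delivery: the work was already done (or is in progress).
    if (this.seen.has(push.deliveryId)) return false;
    this.seen.add(push.deliveryId);

    // Bot accounts such as github-actions[bot].
    if (push.pusherName.endsWith("[bot]")) return false;

    // Your own tooling, to avoid the docs-trigger-docs loop.
    return !this.ignoredPushers.includes(push.pusherName);
  }
}
```

A usage example: `new DeliveryGate(["my-docs-bot"]).shouldProcess({ deliveryId, pusherName })`, called after signature verification and before any expensive work.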
The GitHub documentation automation guide covers setup from the user perspective. The changelog automation deep-dive covers the content generation side. This post covers the plumbing that connects the two.
Build Your Documentation Pipeline on Solid Foundations
Or use Pushpen and get the plumbing already built and battle-tested.