The N8N SEO Pipeline: From Google Search Console to WordPress

Full architecture of an n8n SEO pipeline: GSC data pulls, competitor gap analysis, scoring, human review gates, and AI-written drafts landing in WordPress.

March 25, 2026
8 min read
Tags: n8n, SEO, WordPress, automation, Google Search Console, content automation

SEO teams used to need researchers to find keywords, writers to produce content, and designers to create images. One person now runs the whole pipeline at one of my clients’ sites. He approves ideas, reviews drafts before they are published, and nothing else.

That reduction came from building the right automation, instead of just throwing AI at the problem and hoping the output is good enough to publish.

And this is how it was built.

Why Two Workflows, Not One

The instinct when building an SEO pipeline is to make it linear: data in, post out. The problem with that approach is that keyword research and content generation run on completely different cadences and have different failure modes.

The GSC analysis runs on a schedule (monthly or weekly, depending on how quickly the site's rankings move) and populates a backlog of ideas. Content generation runs against that backlog when someone approves an item. Coupling the two in a single workflow means content generation either runs too often, burning API costs on keywords that haven't been vetted, or too rarely, so a bottleneck kills the research value before any content gets generated.

So: two workflows. The first one operates at a higher, almost strategic level, deciding what is worth writing. The second one does the writing, with a tight focus on SEO.

Workflow One: From GSC Data to a Prioritized Backlog

The analysis workflow starts with the Google Search Console API, which gets called in two passes: once site-wide to find seed topics, then once per candidate keyword to evaluate it.

The first GSC pull fetches the site's actual search queries over a configurable lookback window (90 days by default). Brand terms are filtered out. What's left is the real organic landscape: the queries real visitors used to find the site. These become seed topics fed into the research layer.
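
A minimal sketch of that seed-topic step, assuming rows shaped like the GSC Search Analytics response; the brand list and field names here are illustrative, not the production configuration:

```javascript
// Filter brand queries out of GSC rows and keep the organic landscape,
// highest-impression queries first. (Hypothetical shapes for illustration.)
const BRAND_TERMS = ['acme', 'acme blog']; // per-site config in the real workflow

function seedTopics(rows, brandTerms = BRAND_TERMS) {
  return rows
    .filter(r => !brandTerms.some(b => r.query.toLowerCase().includes(b)))
    .sort((a, b) => b.impressions - a.impressions)
    .map(r => r.query);
}

const sample = [
  { query: 'acme pricing', impressions: 900 },            // brand term: dropped
  { query: 'how to automate seo reports', impressions: 300 },
  { query: 'gsc api lookback window', impressions: 120 },
];
console.log(seedTopics(sample));
// → ['how to automate seo reports', 'gsc api lookback window']
```

In n8n this would live in a Code node between the GSC pull and the research layer.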

That research layer runs on Perplexity, more specifically the sonar-pro model with a low temperature setting. The reason for Perplexity over a standard chat model here is web grounding: Perplexity has live search access, which means the keyword suggestions it returns reflect what's currently ranking and being searched, not what was in training data six months ago. Given a configurable site context (niche, audience, content focus areas, keywords to avoid), it returns a batch of long-tail keyword opportunities with estimated intent and difficulty.
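
Perplexity's chat completions API is OpenAI-compatible, so the research call is an ordinary HTTP request. A sketch of how the payload might be assembled (the site-context fields and prompt wording are my assumptions):

```javascript
// Build the research request for Perplexity's sonar-pro model.
// Low temperature favors grounded, repeatable keyword suggestions.
function buildResearchRequest(ctx) {
  return {
    model: 'sonar-pro',
    temperature: 0.2,
    messages: [
      {
        role: 'system',
        content: 'You are an SEO researcher. Return long-tail keyword ' +
                 'opportunities with estimated intent and difficulty as JSON.',
      },
      {
        role: 'user',
        content: `Niche: ${ctx.niche}\nAudience: ${ctx.audience}\n` +
                 `Focus areas: ${ctx.focusAreas.join(', ')}\n` +
                 `Avoid: ${ctx.avoid.join(', ')}\n` +
                 `Seed topics: ${ctx.seeds.join(', ')}`,
      },
    ],
  };
}
```

In n8n this payload goes out through an HTTP Request node to Perplexity's chat completions endpoint.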

Then comes the part most pipelines skip.

For each candidate keyword from Perplexity, the workflow makes a second GSC call to check whether the site is already ranking for it. This is where the scoring logic runs. A keyword with zero impressions and no ranking data is a creation opportunity. A keyword already getting impressions above a threshold gets skipped (the site has it covered). A keyword that shows up in GSC data but is stuck in position 8+ or has a low CTR for its impression volume gets flagged for an update rather than a new post.

The scoring weights difficulty, priority, intent, and action type. Low-difficulty informational keywords with no current ranking get the highest scores. High-difficulty keywords that the site is already performing well on get skipped.
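
The classification and scoring can be sketched with the article's own default thresholds (skip at 50+ impressions, update at position ≥ 8 or CTR ≤ 2%). The precedence of "update" over "skip" and the score weights are my assumptions, not the production values:

```javascript
const THRESHOLDS = { skipImpressions: 50, updatePosition: 8, updateCtr: 0.02 };

function classify(gsc, t = THRESHOLDS) {
  if (!gsc || gsc.impressions === 0) return 'create';      // no footprint: new post
  if (gsc.position >= t.updatePosition || gsc.ctr <= t.updateCtr) return 'update';
  if (gsc.impressions >= t.skipImpressions) return 'skip';  // already covered
  return 'create';
}

// Toy scoring: low difficulty, informational intent, and 'create' actions
// rank highest; 'skip' is pushed below any threshold.
function score(kw) {
  const intentWeight = kw.intent === 'informational' ? 2 : 1;
  const actionWeight = { create: 10, update: 5, skip: -100 }[kw.action];
  return (10 - kw.difficulty) * intentWeight + actionWeight;
}
```

A low-difficulty informational keyword with no ranking scores well above a high-difficulty keyword flagged for an update, which matches the prioritization described above.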

Keywords that clear the score threshold go into an OpenAI step that generates a title, a recommended content format (guide, listicle, comparison, quick answer), a target word count, and a list of must-have sections. All of that gets written to a Google Sheet with a status of WAITING.
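
The article names the fields each backlog row carries; the exact keys and the validation rule below are illustrative:

```javascript
// One backlog row as it might land in the Google Sheet.
const exampleBrief = {
  keyword: 'n8n seo pipeline',
  title: 'Building an SEO Pipeline in n8n: GSC to WordPress',
  format: 'guide', // guide | listicle | comparison | quick answer
  targetWordCount: 1800,
  mustHaveSections: ['Why two workflows', 'Scoring', 'Human review'],
  rationale: 'Low difficulty, informational intent, no current ranking.',
  status: 'WAITING',
};

// Sanity check before appending the row to the sheet.
function validBrief(b) {
  const formats = ['guide', 'listicle', 'comparison', 'quick answer'];
  return Boolean(
    b.keyword && b.title && formats.includes(b.format) &&
    b.targetWordCount > 0 && b.mustHaveSections.length > 0
  );
}
```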

Now the client has around 30 to 50 ideas to approve.

The First Human Gate

The Google Sheet is the point where a human decides what actually gets built.

Each row contains: the keyword, the recommended title, the content format, the target word count, a rationale for why the keyword scored well, and the must-have sections the content strategy suggests. Changing the status from WAITING to APPROVED is the only action required to proceed. Deleting or marking a row as REPROVED means the content never gets written.

This gate exists because automated keyword scoring is good at finding opportunities, but doesn't know what the business actually wants to write about this month. It doesn't know there's a new product launching next week that changes the internal linking strategy. It doesn't know the last three posts on that topic already performed poorly, and you want to try a different angle.

This is the most important step: a human in the loop reviews and approves the ideas, and updates descriptions where needed.

Workflow Two: From Approved Row to WordPress Draft

When a row flips to APPROVED, the second workflow picks it up.

As noted above, the sheet row carries everything the content generation step needs: keyword, title suggestion, content format, word count, and must-have sections. Perplexity writes the article directly as HTML, structured for WordPress publishing (proper heading hierarchy, citations linked inline, no document wrappers). The content format and sections from the planning step shape the output directly.

After the article is written, a smaller OpenAI model generates a set of DALL-E prompts based on the content, one for the featured image, and others for placement next to major sections. The prompts are designed for editorial-safe, brand-appropriate imagery with no text in the images. DALL-E 3 generates them at 1024×1024, and they get uploaded directly to the WordPress media library.
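
The media upload is a standard WordPress REST API call. A sketch, with placeholder site URL and credentials (in n8n this is an HTTP Request node sending the PNG as binary data):

```javascript
// Headers for POST /wp-json/wp/v2/media; WordPress reads the filename
// from Content-Disposition. Application-password auth is one common setup.
function mediaHeaders(filename, user, appPassword) {
  return {
    'Content-Disposition': `attachment; filename="${filename}"`,
    'Content-Type': 'image/png',
    Authorization: 'Basic ' +
      Buffer.from(`${user}:${appPassword}`).toString('base64'),
  };
}

async function uploadImage(site, buffer, filename, user, appPassword) {
  const res = await fetch(`${site}/wp-json/wp/v2/media`, {
    method: 'POST',
    headers: mediaHeaders(filename, user, appPassword),
    body: buffer,
  });
  return res.json(); // response includes the media ID, usable as featured_media
}
```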

The draft lands in WordPress with the title from the sheet, the article HTML, and the featured image attached. The post status at this point is configurable; it defaults to draft.

Then a separate step runs. An OpenAI model reads the published draft and produces the SEO metadata: a title under 60 characters, a meta description under 160, and a focus keyword. Running this after the content is written rather than before means the metadata is generated from the actual article text, not from an intent assumption. The meta description reflects what's actually in the post.
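
Because the limits are hard character counts, they are cheap to enforce before writing the metadata back. A guardrail sketch (field names are assumptions):

```javascript
// Verify the limits the pipeline targets: title under 60 characters,
// meta description under 160, and a focus keyword present.
function metaErrors(meta) {
  const errors = [];
  if (meta.title.length > 60) {
    errors.push(`title: ${meta.title.length}/60 chars`);
  }
  if (meta.description.length > 160) {
    errors.push(`description: ${meta.description.length}/160 chars`);
  }
  if (!meta.focusKeyword) errors.push('missing focus keyword');
  return errors; // empty array means the metadata is safe to write back
}
```

A non-empty result could route the item back through the metadata step, or simply flag the sheet row for the reviewer.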

The sheet gets updated: post ID, URL, published date, meta title, meta description, status WRITTEN.

The Second Human Gate

The post is now just a draft. Nothing is live until a human looks at it. And this is a product decision.

Automated content that publishes itself without review is a liability. One post with a factual error, a citation to a bad source, or a misaligned tone costs more than the time saved across ten posts. The review step is the thing that makes the automation trustworthy enough to run at scale.

In practice, the review is light. The article structure came from a vetted brief. The keyword targeting came from real GSC data. The tone guidelines are baked into the system prompts. What the human is looking for is correctness and fit, not wholesale rewriting. Most of the time, the reviewer just reads it and checks that internal and external links are correct.

The Two-Model Split

Perplexity handles keyword research and article writing. OpenAI handles title strategy, image prompt generation, and SEO metadata.

Each of these is doing what it's actually good at. Perplexity's web grounding makes it better for research and for writing content that references current information: the citations it adds to articles are live links, not fabricated. OpenAI's GPT-4o-mini is sufficient for the well-bounded task of generating image prompts and cheap enough to run across a batch without cost pressure. It also produces clean, length-constrained structured output reliably, which is exactly what you need when generating metadata that has to fit strict character limits.

What Breaks

The most common failure point is the Perplexity content step under heavy load — it times out or returns truncated HTML. The workflow has retry logic with backoff, which handles most cases, but a truncated article that makes it through to WordPress is a problem. The review gate catches it, but it's worth monitoring.
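
One cheap guard worth adding before the WordPress step is a truncation heuristic: a cut-off response tends to leave block-level tags unclosed. This check and the backoff wrapper are a sketch of the idea, not the workflow's exact logic:

```javascript
// Heuristic: count opening vs closing tags for common block elements.
// An imbalance suggests the HTML was cut off mid-generation.
function looksTruncated(html) {
  const opens = tag => (html.match(new RegExp(`<${tag}[ >]`, 'g')) || []).length;
  const closes = tag => (html.match(new RegExp(`</${tag}>`, 'g')) || []).length;
  return ['p', 'h2', 'h3', 'ul'].some(t => opens(t) !== closes(t));
}

// Retry with exponential backoff; a truncated result counts as a failure.
async function withRetry(fn, attempts = 3, baseMs = 1000) {
  for (let i = 0; i < attempts; i++) {
    try {
      const out = await fn();
      if (looksTruncated(out)) throw new Error('truncated HTML');
      return out;
    } catch (err) {
      if (i === attempts - 1) throw err;
      await new Promise(r => setTimeout(r, baseMs * 2 ** i));
    }
  }
}
```

n8n nodes have built-in retry settings; the extra value here is treating truncation as a retryable failure rather than a success.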

The scoring logic also needs calibration per site. The default thresholds (skip at 50+ impressions, update if position ≥ 8 or CTR ≤ 2%) are reasonable starting points, not universal truths. A site getting 500 impressions per keyword needs different thresholds than one averaging 20. Running a few weeks of the analysis workflow without the content step and auditing what it would have created is worth doing before going fully automated.

The image pipeline occasionally produces images that are technically correct but tonally wrong for the content. A cooking blog that gets an abstract geometric featured image because the prompt wasn't specific enough is a minor issue fixed in review. A B2B SaaS site that gets the same image is a harder conversation.

What This Pipeline Is Not

It's not a fire-and-forget system. The two human gates are the core of the architecture. Removing them to save time is how you end up with 200 posts published in a month that nobody trusts and nobody reads, and that risks a Google penalty.

It's also not a replacement for a content strategy. The pipeline is good at identifying what keywords have gap opportunities based on current GSC data. It doesn't know your positioning, your funnel stage distribution, or what your audience actually cares about this quarter. Those decisions still belong to a human. The pipeline executes on them efficiently.

At the clients where I've deployed this, one person handles the entire content production workflow: reviewing the backlog, approving items, reviewing drafts before publish. The volume one person can sustain is what makes the difference. The n8n automation patterns behind this scale to more complex pipelines too, but the SEO use case is where the ROI is most immediate.


I'm packaging both workflows as a template for teams that want to run this themselves. If you're building something similar or want to see the full setup working before buying a template, how I work starts with a scoping call.


© 2026 Paulo H. Alkmin. All rights reserved.