Your Website / Webhook
Connecting to a Webhook account is straightforward. There are 2 options to use Webhook accounts:
- Use code to take in POST notifications and generate new articles on your site. OR
- Set up a Zapier account and use a Zap to publish to your website, using the Webhook account only as an internal link source.
Coding gives you more customization and has no limitations, while Zapier is convenient for people who don't know how to code (or don't want to).
For coding, this page will show you:
- How to connect Wraith Scribe to your Webhook.
- How the webhook works and how to configure it.
- Sample code / example to integrate the webhook, so you'll be able to automate blog posts to any site.
For Zapier, this page will show you:
- How to use Webhook with Zapier.
Connecting Wraith Scribe With Your Website (Required)
Simple Text Instructions
1. Go to integrations.
2. Click on "Webhook".
3. Copy and paste your website URL. It must start with https://.
4. Copy the DNS record.
   - Note: You must copy your DNS record after step 3, as the DNS record changes every time you type something, for maximum security.
5. Go to your domain provider (Cloudflare, Namecheap, GoDaddy, Netlify) and add a TXT record to the root domain:
   - Host: @
   - Value: the value copied in step 4.
6. Set the TTL to 1 minute instead of "auto" -- this ensures the TXT data propagates in a timely manner.
7. Wait 1 minute.
8. Press verify.
If everything goes well, you'll see a green box saying everything's connected.
Video Instructions
The video instructions here cover steps 2 through 8, using Cloudflare as the example DNS service.
Connecting Your Wraith Scribe's Webhook To Google
Once you've connected Wraith Scribe with the webhook, you can optionally connect your website to Google, and we can ask Google to prioritize indexing any newly published articles for you.
This is even easier. Just follow these steps.
How the Webhook Works / Configuration
Regardless of whether you're writing 1 article or 100, every time a job finishes, Wraith Scribe will notify your website via a POST request.
In order for the Webhook to work properly, you need to configure your Webhook settings. To do this:
- Go to integrations.
- Click on "Webhook"
- Select the appropriate website from the dropdown.
- Click the gear icon to edit the webhook:
- The Webhook Path is the path where Wraith Scribe will notify your website via a POST request whenever an article's finished. The default value is yourwebsite.com/wraith-webhook, but you can change it to whatever path you want. Just make sure you set up a route for it.
- The Sitemap Path is the path where your sitemap resides. This can be a high-level sitemap pointing to other sitemaps. It is required for Wraith Scribe to crawl and index your website, so we can generate relevant internal links.
- The Blog Path is the path where your blog posts reside. It:
   - Defaults to yourwebsite.com/blog, and
   - Assumes that new blog posts will be posted at yourwebsite.com/blog/<slug> (our POST notification provides the slug).
- Finally, at the top you will see Your Webhook Secret -- ensure this secret is never made public, and do not commit it to a code repository. Save it in a .env file as an environment variable instead. This secret is required to authenticate our POST request to your Webhook Path.
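For instance, in Python you could read the secret from the environment at request time. This is only a sketch; the variable name WRAITH_WEBHOOK_SECRET and the helper is_authorized are conventions chosen here, not anything Wraith Scribe requires:

```python
import os

def is_authorized(auth_header):
    """Check the Authorization header sent by Wraith Scribe against the
    secret stored in the environment (e.g. loaded from your .env file)."""
    secret = os.environ.get("WRAITH_WEBHOOK_SECRET")
    return secret is not None and auth_header == f"Bearer {secret}"
```

Reading the secret at call time (rather than at import time) also makes the check easy to exercise in tests.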
CODING OPTION: Integrating the Webhook to your website
Once you've connected everything, how do you consume the POST notification to automate blog posts to your site?
In this example, I'll walk through automating blog posts on Wraith Scribe itself. Even though the setup here is Django + Wagtail for CMS, you can easily customize the POST notifications to your own setup.
Triggering the webhook
Before we dive into code, let's talk very quickly about how to trigger the webhook notification. The prerequisite is that you've connected a Webhook site (see above).
Single Articles
- Hover over the Wordpress icon.
- Press the Code icon.
When the button's pressed, it'll trigger a webhook for your site to receive.
Batch Articles
Just select a webhook account under the Select Account To Publish To dropdown, and pick "NO" for "Publish To Your Email / Zap?".
Every time a batch article is ready to publish, it'll trigger the webhook for your site to receive. Below, let's see how to receive the webhook notification.
Schema
If you choose to publish to your webhook, Wraith Scribe will publish the below data to your Webhook Path that you chose above:
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {secret}",
}
Your Webhook Secret will be attached to the header as a Bearer token. The following will come as the body of the POST request:
{
"title": str,
"content": str,
"meta_description": str,
"categories": List[str],
"slug": str,
"feature_img_url": str,
"feature_img_alt": str,
"table_of_contents": str,
}
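To exercise your endpoint before connecting anything, you can simulate Wraith Scribe's notification yourself. The sketch below mirrors the headers and JSON body described above; simulate_webhook is a hypothetical helper name and the payload values are made up:

```python
import json
from urllib.request import Request, urlopen

def simulate_webhook(url, secret, payload):
    """POST a Wraith-style notification to `url`, using the same
    Content-Type and Bearer-token headers Wraith Scribe sends."""
    req = Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {secret}",
        },
        method="POST",
    )
    with urlopen(req) as resp:
        return resp.status, resp.read().decode()
```

Point it at your local dev server's webhook path to check your route, auth check, and JSON parsing end to end.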
title is the title of the blog post.
content is an HTML string of the blog content. It does not include the title.
meta_description is a short description of what the blog post is about.
categories is a list of strings, where each item is a category associated with this blog post.
slug is the URL path assumed for this blog post. That is, we will request Google to prioritize indexing yourwebsite.com/your/blog/path/<slug> -- thus, it is important that the Blog Path above is the exact path where you will publish the article. Otherwise, Google won't be able to prioritize indexing your blog posts properly.
feature_img_url is a URL of where the feature image is. If you want to host it on your own domain, you may download it (see example below).
feature_img_alt is a string of the alt text that can be attached to the image, if you want.
table_of_contents is an HTML string of the table of contents (see the Styling Table Of Contents section below).
Feel free to use all (or none) of these fields in your implementation. In the Python/Django implementation below, I will use most of them.
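As a framework-agnostic starting point, you could validate an incoming payload against this schema before handing it to your CMS. The names parse_payload and REQUIRED_FIELDS are illustrative; the field list comes from the schema above:

```python
import json

# Fields the webhook payload is documented to contain (see schema above).
REQUIRED_FIELDS = {
    "title", "content", "meta_description", "categories",
    "slug", "feature_img_url", "feature_img_alt", "table_of_contents",
}

def parse_payload(raw_body):
    """Decode the POST body and check that every documented field is present."""
    data = json.loads(raw_body)
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"payload missing fields: {sorted(missing)}")
    return data
```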
Example Python Implementation
Routes
from django.urls import path
from . import views
app_name = "content"
urlpatterns = [
path("wraith-webhook/", views.wraith_webhook, name="wraith_webhook"),
]
I accept the POST notification at the default Webhook path, which is just wraith-webhook. The Wagtail models are housed in apps/content, so in this example the webhook handler lives in the same app.
View / Main Implementation
import json

from django.conf import settings
from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt
from django.views.decorators.http import require_POST

@csrf_exempt
@require_POST
def wraith_webhook(request):
    # Verify the webhook secret
    secret = request.headers.get("Authorization")
    if not secret or secret != f"Bearer {settings.WRAITH_WEBHOOK_SECRET}":
        return HttpResponse("Unauthorized", status=401)
    # Parse the JSON data
    try:
        data = json.loads(request.body)
    except json.JSONDecodeError:
        return HttpResponse("Invalid JSON", status=400)
    # Create the blog post
    try:
        blog_post = create_blog_post(data)
        return HttpResponse(f"Blog post created: {blog_post.full_url}", status=200)
    except Exception as e:
        return HttpResponse(f"Error creating blog post: {str(e)}", status=500)
This takes the secret from the header and compares it against the Webhook Secret that you can grab from your integrations. If it doesn't match, then it's an unauthorized attempt. Otherwise, we load the schema as described above. Then, all we need to do is feed the data and create a blog post.
That's it for the high level implementation!
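One optional hardening tweak (my suggestion, not part of the original snippet): compare the secret with hmac.compare_digest so the check runs in constant time and leaks nothing through timing differences:

```python
import hmac

def secret_matches(auth_header, webhook_secret):
    """Constant-time comparison of the Authorization header
    against the expected 'Bearer <secret>' value."""
    expected = f"Bearer {webhook_secret}"
    return hmac.compare_digest(auth_header or "", expected)
```

You'd drop this in place of the `!=` comparison in the view above.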
Diving deeper, we can look at how this works for wagtail specifically:
Wagtail Implementation
def create_blog_post(data):
    # Get the BlogIndexPage
    blog_index = BlogIndexPage.objects.live().first()
    if not blog_index:
        raise Exception("Blog index page not found")
    # Convert HTML content to StreamField format
    stream_data = html_to_stream_field(data["content"])
    # Create the BlogPage
    blog_page = BlogPage(
        title=data["title"],
        slug=data["slug"],
        intro=data["meta_description"],
        body=stream_data,
        date=timezone.now().date(),
    )
    # Add the blog page as a child of the blog index
    blog_index.add_child(instance=blog_page)
    # Set categories
    if "categories" in data:
        blog_page.tags.add(*data["categories"])
    # Add feature image if provided
    if "feature_img_url" in data and "feature_img_alt" in data:
        image = create_or_get_image(data["feature_img_url"])
        if image:
            blog_page.gallery_images.create(
                image=image, caption=data["feature_img_alt"]
            )
    blog_page.save()
    # Create a revision
    revision = blog_page.save_revision()
    # Publish the page
    revision.publish()
    return blog_page
- In Wagtail, all blog posts sit under Blog Index pages, so we need to ensure that a Blog Index page exists. A Blog Index in Wagtail is just a page that lists out all the blog posts.
- If a Blog Index doesn't exist, we need to set up our Wagtail site properly first. This is beyond the scope of this tutorial, but it can be fixed simply by going to yourwebsite.com/cms (or wherever you put your Wagtail admin) and adding a new Blog Index page.
- Wagtail's BlogPage model requires a JSON-based StreamField object for its body, so we convert the HTML content to a StreamField. More on this later.
- Then, we just create a new instance of a blog page and add it under the blog index.
- After that, we unroll the categories and add them to the blog page's tags.
- Finally, we download the image from the feature image URL, add a caption to it, then save and publish the blog page.
Before we go on to the helper functions, let's look at the modified Blog Page model which has a Stream Field:
BlogPage Model
class BlogPage(Page):
    """
    A single blog post
    """

    date = models.DateField("Post date")
    intro = models.CharField(max_length=250)
    tags = ClusterTaggableManager(through=BlogPageTag, blank=True)
    body = StreamField(
        [
            ("paragraph", blocks.RichTextBlock()),
            ("table", TableBlock()),
        ],
        use_json_field=True,
        blank=True,
    )
    search_fields = Page.search_fields + [
        index.SearchField("intro"),
        index.SearchField("body"),
    ]
    content_panels = Page.content_panels + [
        FieldPanel("date"),
        FieldPanel("intro"),
        FieldPanel("body", classname="full"),
        InlinePanel("gallery_images", label="Gallery images"),
        FieldPanel("tags"),
    ]

    @property
    def main_image(self):
        gallery_item = self.gallery_images.first()
        if gallery_item:
            return gallery_item.image
        else:
            return None

    @property
    def title_with_last_modified_date(self):
        last_modified = timezone.localtime(self.latest_revision_created_at)
        return f"{self.title} (last updated {last_modified.strftime('%Y-%m-%d')})"

    class Meta:
        ordering = ["-date"]
Of note here is the body attribute. Because Wagtail doesn't really handle tables well, I separated Wagtail's regular Rich Text block and tables into their own block types. With this modified body, the content panels just attach the new body via FieldPanel("body", classname="full") -- everything else is pretty standard Wagtail.
Helper Functions
def html_to_stream_field(html_content):
    soup = BeautifulSoup(html_content, "html.parser")
    stream_data = []
    current_content = []

    def add_paragraph():
        if current_content:
            paragraph_html = "".join(str(item) for item in current_content)
            stream_data.append(("paragraph", paragraph_html))
            current_content.clear()

    for element in soup.children:
        if element.name == "table":
            add_paragraph()
            # Convert the table to the format expected by TableBlock
            rows = []
            for tr in element.find_all("tr"):
                row = [td.get_text(strip=True) for td in tr.find_all(["td", "th"])]
                rows.append(row)
            table_data = {
                "data": rows,
                "first_row_is_table_header": True,
                "first_col_is_header": False,
            }
            stream_data.append(("table", table_data))
        else:
            current_content.append(element)
    # Add any remaining content as a final paragraph
    add_paragraph()
    return stream_data
Because tables are a separate block, the above simply:
- Runs through the HTML in order with Beautiful Soup.
- When a table's encountered, appends the table's rows to the stream data as table data.
- Tables that Wraith Scribe provides have the first row as a header, so we set "first_row_is_table_header": True to have Wagtail style the first row properly.
current_content collects non-table content. Any time we run into a table, we dump what's collected so far into a paragraph block (add_paragraph) -- and once again when the whole HTML is finished (i.e. the block after the final table).
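If you'd rather avoid the bs4 dependency, the same paragraph/table split can be sketched with the stdlib's html.parser. Treat this as illustrative rather than a drop-in replacement (tag attributes are dropped for brevity):

```python
from html.parser import HTMLParser

class TableSplitter(HTMLParser):
    """Split top-level HTML into ("paragraph", html) and ("table", html)
    chunks, mirroring the bs4-based html_to_stream_field logic above."""

    def __init__(self):
        super().__init__()
        self.chunks = []   # finished (block_type, html) tuples
        self._para = []    # accumulates non-table markup
        self._table = []   # accumulates markup while inside a <table>
        self._depth = 0    # nesting depth of <table> tags

    def flush_paragraph(self):
        html = "".join(self._para).strip()
        if html:
            self.chunks.append(("paragraph", html))
        self._para = []

    def _emit(self, text):
        (self._table if self._depth else self._para).append(text)

    def handle_starttag(self, tag, attrs):
        if tag == "table" and self._depth == 0:
            self.flush_paragraph()  # dump collected content before the table
        if tag == "table":
            self._depth += 1
        self._emit(f"<{tag}>")  # note: attributes are omitted in this sketch

    def handle_endtag(self, tag):
        self._emit(f"</{tag}>")
        if tag == "table":
            self._depth -= 1
            if self._depth == 0:
                self.chunks.append(("table", "".join(self._table)))
                self._table = []

    def handle_data(self, data):
        self._emit(data)

def split_blocks(html):
    parser = TableSplitter()
    parser.feed(html)
    parser.flush_paragraph()  # final paragraph after the last table
    return parser.chunks
```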
def create_or_get_image(url):
    # Check if an image with this URL already exists
    existing_image = Image.objects.filter(title=url).first()
    if existing_image:
        return existing_image
    # If not, download and create the image
    response = requests.get(url)
    if response.status_code != 200:
        return None
    # Create a file-like object from the image data
    image_file = BytesIO(response.content)
    # Create the image in Wagtail
    image = Image(title=url, file=ImageFile(image_file, name=url.split("/")[-1]))
    image.save()
    return image
- (Optional) We check whether Wagtail's Image model already has this URL. If so, we just reuse it.
- We download the image and use BytesIO to create an image file.
- We save it to Wagtail's Image model.
- We return the image so we can attach it to the blog_page row referenced in create_blog_post above.
Frontend
Since the body is now split into {tables, paragraphs} blocks instead of a single Rich Text block, rendering it may be tricky. But in Django it is actually somewhat straightforward:
<div id="your-blog-post-container">
{% for block in page.body %}
{% if block.block_type == 'table' %}
{% include_block block %}
{% else %}
{{ block.value }}
{% endif %}
{% endfor %}
</div>
We just loop through the blocks inside BlogPage.body: if it's a table, we include the whole block, and if it's a regular block, we output its value.
For best SEO results, you should also update your meta title to match the blog post's title.
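In a Wagtail/Django template, that can be as simple as overriding the title block, assuming your base template defines one (the block names here are conventions, not requirements):

```html
{% block title %}{{ page.title }}{% endblock %}
{% block meta_description %}
  <meta name="description" content="{{ page.intro }}">
{% endblock %}
```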
Styling Table Of Contents
The table of contents is given to you with an H2 heading: <div class="toc"><h2>Table of Contents</h2>. The rest of the HTML is just list items with classes:
- toc-level-1 for H1 headings
- toc-level-2 for H2 headings
- ...
- toc-level-6 for H6 headings
You are free to style these appropriately. It can be as simple as adding a margin-left of 20px * heading-level to get a hierarchical TOC.
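As a sketch, those per-level indentation rules can be generated instead of hand-written (the 20px-per-level value is an arbitrary choice, and toc_css is a hypothetical helper):

```python
def toc_css(indent_px=20):
    """Generate one margin-left rule per TOC level (toc-level-1 .. toc-level-6),
    indenting each level proportionally to its heading depth."""
    rules = []
    for level in range(1, 7):
        rules.append(f".toc-level-{level} {{ margin-left: {indent_px * level}px; }}")
    return "\n".join(rules)
```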
Uploading Images To Your Server
All images provided to the webhook sit on our S3 bucket. You may want to copy them to your own S3 bucket or, as in the example below, your own domain, to avoid too many external links.
You can upload it to your server like this:
import os
import requests
from bs4 import BeautifulSoup
from urllib.parse import urlparse
from io import BytesIO
from PIL import Image
import hashlib

def process_and_upload_images(content, upload_dir='uploads'):
    # Create the upload directory if it doesn't exist
    os.makedirs(upload_dir, exist_ok=True)
    # Parse the HTML content
    soup = BeautifulSoup(content, 'html.parser')
    # Find all img tags
    images = soup.find_all('img')
    for img in images:
        src = img.get('src')
        if not src:
            continue
        try:
            # Download the image
            response = requests.get(src)
            if response.status_code == 200:
                # Generate a unique filename
                file_extension = os.path.splitext(urlparse(src).path)[1]
                if not file_extension:
                    file_extension = '.jpg'  # Default to .jpg if no extension is found
                filename = hashlib.md5(src.encode()).hexdigest() + file_extension
                # Save the image
                img_path = os.path.join(upload_dir, filename)
                with Image.open(BytesIO(response.content)) as img_file:
                    img_file.save(img_path)
                # Update the src attribute with the new path
                img['src'] = os.path.join(upload_dir, filename)
        except Exception as e:
            print(f"Error processing image {src}: {str(e)}")
    # Return the updated HTML content
    return str(soup)
This just takes the HTML content, finds the img tags, and substitutes each src with the local file you downloaded. Keep in mind this is just a simple example that downloads images into your local uploads/ directory. In practice, you probably want to upload them to an S3 bucket or something similarly scalable.
Since the feature image is provided to you as a URL, you can skip parsing for image tags and go straight to:
- Downloading the image.
- Optionally uploading it to your favorite cloud.
- Attaching the new feature image location + alt text to your blog post.
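A stdlib-only sketch of those steps, with local disk standing in for "your favorite cloud" (the helper names are mine; swap the file write in save_feature_image for an S3 upload in practice):

```python
import hashlib
import os
from urllib.parse import urlparse
from urllib.request import urlopen

def local_name_for(url):
    """Derive a stable, collision-resistant filename from the image URL,
    using the same md5-of-URL scheme as process_and_upload_images above."""
    ext = os.path.splitext(urlparse(url).path)[1] or ".jpg"
    return hashlib.md5(url.encode()).hexdigest() + ext

def save_feature_image(url, upload_dir="uploads"):
    """Download the feature image and store it locally; returns the saved path."""
    os.makedirs(upload_dir, exist_ok=True)
    path = os.path.join(upload_dir, local_name_for(url))
    with urlopen(url) as resp, open(path, "wb") as out:
        out.write(resp.read())
    return path
```

The returned path plus the feature_img_alt string is all you need to attach the image to your blog post.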
Just use an LLM.
The Django + Wagtail setup is a very specific example. But the main principles boil down to:
- Set up your routes to receive the POST notifications.
- Authenticate with secrets in the header.
- Once authenticated, process the schema sent to your server, similar to wraith_webhook above.
If you don't have a cumbersome UI plugin for your CMS like Wagtail, your implementation is probably easier, because you can just spin up a new page: the webhook provides everything in HTML. You'd just need to style the page.
While I am unable to assist with every possible web server combination, you can just use ChatGPT or Claude. After all, the complicated example above is 95% Claude's work, with a few bugs that I fixed. It took a one-time cost of 1-2 hours, and now the website can permanently generate automated blog posts (as can any other website I build in the future with Wagtail).
ZAPIER OPTION: Using Your Account With Zapier
Once you've connected your website with Wraith Scribe, you're free not to do any coding.
For any website connected to Wraith Scribe, there are 2 parts:
- Using the website for internal linking
- Publishing to the website
Connecting the website to Wraith Scribe takes care of the 1st part. For the 2nd part, you can follow the coding tutorial above, OR you can set up a Zapier integration to handle the publishing. Up to you.
Strategy
The Zap template I give you has 4 steps:
- Take in an email of a specific format.
- Use regex to parse it.
- Feed the data to an Airtable to categorize each part of the article neatly.
- Use this organized data to feed Ghost.io.
The first 2 steps are required; you are free to do whatever you want once you've extracted the main data from the article. You can skip posting to Ghost.io or any Zapier-compatible CMS and only keep the data in an Airtable, or keep it in a Google Sheet instead. You can post to a CMS directly and skip storing the data in a table. The possibilities are endless!
Single Articles
Once you've followed the instructions on setting up the Zapier account and connected your website above, just go to any article you've generated and:
- Hover over the Wordpress icon.
- Press the Envelope icon.
The email you use for Wraith Scribe will then get 2 emails:
- A normal, backup email of the article.
- A specially-formatted version of the article that'll be parsed and auto-published by your Zap.
Batch Articles
Like single articles, you'll need to 1) connect your website with Wraith Scribe and 2) set up your Zapier account appropriately.
In your batch articles section, simply pick "YES" for "Publish To Your Email / Zap?". When you select YES:
- If you pick a Webhook account under Select Account To Publish To:
   - Wraith will no longer send a POST notification to your Webhook account and will no longer do automatic Google indexing (since we won't know the URL your Zaps publish to, or whether the Zap publishes an article at all).
   - Wraith will use your Webhook website as an internal link source, provided your sitemap is configured properly.
   - When the article publishes, it'll publish to your email, which will trigger the Zap you set up.
   - It works like this because the assumption is that you don't want to write code to implement the webhook, so you just use your account as an internal link source -- and even if you did implement the webhook, it's unlikely you'd want to publish identical articles twice verbatim, since search engines don't like that.
- If you pick a Wordpress account under Select Account To Publish To:
   - Since Wraith has a direct integration with Wordpress, it'll continue to 1) publish to your Wordpress account, and 2) automatically request Google indexing upon publishing.
   - Wraith will publish to your email, which will trigger your Zap.
   - If your Zap is going to be used to publish to a website, it is NOT RECOMMENDED to pick "YES" for "Publish To Your Email / Zap?" Having multiple places with identical articles is bad for SEO. Instead, just pick "NO" and let the Wordpress integration you've already set up do all the work.
   - If your Zap is just going to be used to store data / for tracking (i.e. to an Airtable), then picking "YES" for "Publish To Your Email / Zap?" is perfectly fine.
While Webhook Account + Zap has some limitations compared to the direct Wordpress integration, you only lose out on automatic Google indexing (which you can always do manually -- and even if you don't, Google will eventually index your articles). The upside of this small limitation is that:
- You don't have to code at all.
- You unlock 100s of no-code Zapier integrations.