Your Website / Webhook
Connecting to a Webhook account is straightforward. There are 2 options to use Webhook accounts:
- Use code to take in POST notifications and generate new articles on your site. OR
- Set up a Zapier account and use a Zap to publish to your website, using the Webhook account only as an internal link source.
Coding gives you more customization and has no limitations, while Zapier is convenient for people who don't know how to code (or don't want to).
For coding, this page will show you:
- How to connect Wraith Scribe to your Webhook.
- How the webhook works and how to configure it.
- Sample code / example to integrate the webhook, so you'll be able to automate blog posts to any site.
For Zapier, this page will show you:
- How to use Webhook with Zapier.
Connecting Wraith Scribe With Your Website (Required)
Simple Text Instructions
1. Go to integrations.
2. Click on "Webhook".
3. Copy and paste your website URL. It must start with https://.
4. Copy the DNS record.
   - Note: You must copy your DNS record after step 3, as the DNS record changes every time you type something, for maximum security.
5. Go to your domain provider (Cloudflare, Namecheap, GoDaddy, Netlify) and add a TXT record to the root domain:
   - Host: @
   - Value: the value copied in step 4.
6. Set the TTL to 1 minute instead of "auto" -- this ensures the TXT data propagates in a timely manner.
7. Wait 1 minute.
8. Press verify.
If everything goes well, you'll see a green box saying everything's connected.
Video Instructions
The video instructions here cover steps 2 through 8, using Cloudflare as the example DNS service.
Connecting Your Wraith Scribe's Webhook To Google
Once you've connected Wraith Scribe with the webhook, you can optionally connect your website to Google, and we can ask Google to prioritize indexing any newly published articles for you.
This is even easier. Just follow these steps.
How the Webhook Works / Configuration
Regardless of whether you're writing 1 article or 100, every time a job finishes, Wraith Scribe will notify your website via a POST request.
In order for the Webhook to work properly, you need to configure your Webhook settings. To do this:
- Go to integrations.
- Click on "Webhook"
- Select the appropriate website from the dropdown.
- Click the gear icon to edit the webhook:
- The Webhook Path is the path where Wraith Scribe will notify your website via a POST request whenever an article's finished. The default value is yourwebsite.com/wraith-webhook, but you can change it to whatever path you want. Just make sure you set up a route for it.
- The Sitemap Path is the path where your sitemap resides. This can be a high-level sitemap pointing to other sitemaps. It is required for Wraith Scribe to crawl and index your website, so we can generate relevant internal links.
- The Blog Path is the path where your blog posts reside. It:
   - Defaults to yourwebsite.com/blog, and
   - Assumes that new blog posts will be posted at yourwebsite.com/blog/<slug> (our POST notification provides the slug).
- Finally, at the top you will see Your Webhook Secret -- ensure this secret is never made public, and do not commit it to a code repository. Save it in a .env file as an environment variable instead. This secret is required to authenticate our POST request to your Webhook Path.
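For instance, in Python you could read the secret from the environment at request time. This is only a sketch; the variable name WRAITH_WEBHOOK_SECRET and the helper is_authorized are conventions chosen here, not anything Wraith Scribe requires:

```python
import os

def is_authorized(auth_header):
    """Check the Authorization header sent by Wraith Scribe against the
    secret stored in the environment (e.g. loaded from your .env file)."""
    secret = os.environ.get("WRAITH_WEBHOOK_SECRET")
    return secret is not None and auth_header == f"Bearer {secret}"
```

Reading the secret at call time (rather than at import time) also makes the check easy to exercise in tests.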
CODING OPTION: Integrating the Webhook to your website
Once you've connected everything, how do you consume the POST notification to automate blog posts to your site?
In this example, I'll walk through automating blog posts on Wraith Scribe itself. Even though the setup here is Django + Wagtail for CMS, you can easily customize the POST notifications to your own setup.
Triggering the webhook
Before we dive into code, let's talk very quickly about how to trigger the webhook notification. The prerequisite is that you've connected a Webhook site (see above).
Single Articles
- Hover over the Wordpress icon.
- Press the Code icon.
When the button's pressed, it'll trigger a webhook for your site to receive.
Batch Articles
Just select a webhook account under the Select Account To Publish To dropdown, and pick "NO" for "Publish To Your Email / Zap?".
Every time a batch article is ready to publish, it'll trigger the webhook for your site to receive. Below, let's see how to receive the webhook notification.
Schema
If you choose to publish to your webhook, Wraith Scribe will publish the below data to your Webhook Path that you chose above:
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {secret}",
}
Your Webhook Secret will be attached to the header as a Bearer token. The following will come as the body of the POST request:
{
"title": str,
"content": str,
"meta_description": str,
"categories": List[str],
"slug": str,
"feature_img_url": str,
"feature_img_alt": str,
"table_of_contents": str,
}
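To exercise your endpoint before connecting anything, you can simulate Wraith Scribe's notification yourself. The sketch below mirrors the headers and JSON body described above; simulate_webhook is a hypothetical helper name and the payload values are made up:

```python
import json
from urllib.request import Request, urlopen

def simulate_webhook(url, secret, payload):
    """POST a Wraith-style notification to `url`, using the same
    Content-Type and Bearer-token headers Wraith Scribe sends."""
    req = Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {secret}",
        },
        method="POST",
    )
    with urlopen(req) as resp:
        return resp.status, resp.read().decode()
```

Point it at your local dev server's webhook path to check your route, auth check, and JSON parsing end to end.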
title is the title of the blog post.
content is an HTML string of the blog content. It does not include the title.
meta_description is a short description of what the blog post is about.
categories is a list of strings, where each item is a category associated with this blog post.
slug is the URL path assumed for this blog post. That is, we will request Google to prioritize indexing yourwebsite.com/your/blog/path/<slug> -- thus, it is important that the Blog Path above is the exact path where you will publish the article. Otherwise, Google won't be able to prioritize indexing your blog posts properly.
feature_img_url is a URL of where the feature image is. If you want to host it on your own domain, you may download it (see example below).
feature_img_alt is a string of the alt text that can be attached to the image, if you want.
table_of_contents is an HTML string of the table of contents (see the Styling Table Of Contents section below).
Feel free to use all (or none) of these fields in your implementation. In the Python/Django implementation below, I will use most of them.
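As a framework-agnostic starting point, you could validate an incoming payload against this schema before handing it to your CMS. The names parse_payload and REQUIRED_FIELDS are illustrative; the field list comes from the schema above:

```python
import json

# Fields the webhook payload is documented to contain (see schema above).
REQUIRED_FIELDS = {
    "title", "content", "meta_description", "categories",
    "slug", "feature_img_url", "feature_img_alt", "table_of_contents",
}

def parse_payload(raw_body):
    """Decode the POST body and check that every documented field is present."""
    data = json.loads(raw_body)
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"payload missing fields: {sorted(missing)}")
    return data
```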
Example Python Implementation
Routes
from django.urls import path
from . import views
app_name = "content"
urlpatterns = [
path("wraith-webhook/", views.wraith_webhook, name="wraith_webhook"),
]
I accept the POST notification at the default Webhook path, which is just wraith-webhook. The Wagtail models are housed in apps/content, so in this example the webhook handler lives in the same app.
View / Main Implementation
import json

from django.conf import settings
from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt
from django.views.decorators.http import require_POST

@csrf_exempt
@require_POST
def wraith_webhook(request):
    # Verify the webhook secret
    secret = request.headers.get("Authorization")
    if not secret or secret != f"Bearer {settings.WRAITH_WEBHOOK_SECRET}":
        return HttpResponse("Unauthorized", status=401)
    # Parse the JSON data
    try:
        data = json.loads(request.body)
    except json.JSONDecodeError:
        return HttpResponse("Invalid JSON", status=400)
    # Create the blog post
    try:
        blog_post = create_blog_post(data)
        return HttpResponse(f"Blog post created: {blog_post.full_url}", status=200)
    except Exception as e:
        return HttpResponse(f"Error creating blog post: {str(e)}", status=500)
This takes the secret from the header and compares it against the Webhook Secret that you can grab from your integrations. If it doesn't match, then it's an unauthorized attempt. Otherwise, we load the schema as described above. Then, all we need to do is feed the data and create a blog post.
That's it for the high level implementation!
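One optional hardening tweak (my suggestion, not part of the original snippet): compare the secret with hmac.compare_digest so the check runs in constant time and leaks nothing through timing differences:

```python
import hmac

def secret_matches(auth_header, webhook_secret):
    """Constant-time comparison of the Authorization header
    against the expected 'Bearer <secret>' value."""
    expected = f"Bearer {webhook_secret}"
    return hmac.compare_digest(auth_header or "", expected)
```

You'd drop this in place of the `!=` comparison in the view above.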
Diving deeper, we can look at how this works for wagtail specifically:
Wagtail Implementation
def create_blog_post(data):
    # Get the BlogIndexPage
    blog_index = BlogIndexPage.objects.live().first()
    if not blog_index:
        raise Exception("Blog index page not found")
    # Convert HTML content to StreamField format
    stream_data = html_to_stream_field(data["content"])
    # Create the BlogPage
    blog_page = BlogPage(
        title=data["title"],
        slug=data["slug"],
        intro=data["meta_description"],
        body=stream_data,
        date=timezone.now().date(),
    )
    # Add the blog page as a child of the blog index
    blog_index.add_child(instance=blog_page)
    # Set categories
    if "categories" in data:
        blog_page.tags.add(*data["categories"])
    # Add feature image if provided
    if "feature_img_url" in data and "feature_img_alt" in data:
        image = create_or_get_image(data["feature_img_url"])
        if image:
            blog_page.gallery_images.create(
                image=image, caption=data["feature_img_alt"]
            )
    blog_page.save()
    # Create a revision
    revision = blog_page.save_revision()
    # Publish the page
    revision.publish()
    return blog_page
- In Wagtail, all blog posts sit under Blog Index pages, so we need to ensure that a Blog Index page exists. A Blog Index in Wagtail is just a page that lists out all the blog posts.
- If a Blog Index doesn't exist, we need to set up our Wagtail site properly first. This is beyond the scope of this tutorial, but it can be fixed simply by going to yourwebsite.com/cms (or wherever you put your Wagtail admin) and adding a new Blog Index page.
- Wagtail's BlogPage model requires a JSON-based StreamField object for its body, so we convert the HTML content to a StreamField. More on this later.
- Then, we just create a new instance of a blog page and add it under the blog index.
- After that, we unroll the categories and add them to the blog page's tags.
- Finally, we download the image from the feature image URL, add a caption to it, then save and publish the blog page.
Before we go on to the helper functions, let's look at the modified Blog Page model which has a Stream Field:
BlogPage Model
class BlogPage(Page):
    """
    A single blog post
    """

    date = models.DateField("Post date")
    intro = models.CharField(max_length=250)
    tags = ClusterTaggableManager(through=BlogPageTag, blank=True)
    body = StreamField(
        [
            ("paragraph", blocks.RichTextBlock()),
            ("table", TableBlock()),
        ],
        use_json_field=True,
        blank=True,
    )
    search_fields = Page.search_fields + [
        index.SearchField("intro"),
        index.SearchField("body"),
    ]
    content_panels = Page.content_panels + [
        FieldPanel("date"),
        FieldPanel("intro"),
        FieldPanel("body", classname="full"),
        InlinePanel("gallery_images", label="Gallery images"),
        FieldPanel("tags"),
    ]

    @property
    def main_image(self):
        gallery_item = self.gallery_images.first()
        if gallery_item:
            return gallery_item.image
        else:
            return None

    @property
    def title_with_last_modified_date(self):
        last_modified = timezone.localtime(self.latest_revision_created_at)
        return f"{self.title} (last updated {last_modified.strftime('%Y-%m-%d')})"

    class Meta:
        ordering = ["-date"]
Of note here is the body attribute. Because Wagtail doesn't really handle tables well, I separated Wagtail's regular Rich Text block and tables into their own block types. With this modified body, the content panels just attach the new body via FieldPanel("body", classname="full") -- everything else is pretty standard Wagtail.
Helper Functions
def html_to_stream_field(html_content):
    soup = BeautifulSoup(html_content, "html.parser")
    stream_data = []
    current_content = []

    def add_paragraph():
        if current_content:
            paragraph_html = "".join(str(item) for item in current_content)
            stream_data.append(("paragraph", paragraph_html))
            current_content.clear()

    for element in soup.children:
        if element.name == "table":
            add_paragraph()
            # Convert the table to the format expected by TableBlock
            rows = []
            for tr in element.find_all("tr"):
                row = [td.get_text(strip=True) for td in tr.find_all(["td", "th"])]
                rows.append(row)
            table_data = {
                "data": rows,
                "first_row_is_table_header": True,
                "first_col_is_header": False,
            }
            stream_data.append(("table", table_data))
        else:
            current_content.append(element)
    # Add any remaining content as a final paragraph
    add_paragraph()
    return stream_data
Because tables are a separate block, the above simply:
- Runs through the HTML in order with Beautiful Soup.
- When a table's encountered, appends the table's rows to the stream data as table data.
- Tables that Wraith Scribe provides have the first row as a header, so we set "first_row_is_table_header": True to have Wagtail style the first row properly.
current_content collects non-table content. Any time we run into a table, we dump what's collected so far into a paragraph block (add_paragraph) -- and once again when the whole HTML is finished (i.e. the block after the final table).
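If you'd rather avoid the bs4 dependency, the same paragraph/table split can be sketched with the stdlib's html.parser. Treat this as illustrative rather than a drop-in replacement (tag attributes are dropped for brevity):

```python
from html.parser import HTMLParser

class TableSplitter(HTMLParser):
    """Split top-level HTML into ("paragraph", html) and ("table", html)
    chunks, mirroring the bs4-based html_to_stream_field logic above."""

    def __init__(self):
        super().__init__()
        self.chunks = []   # finished (block_type, html) tuples
        self._para = []    # accumulates non-table markup
        self._table = []   # accumulates markup while inside a <table>
        self._depth = 0    # nesting depth of <table> tags

    def flush_paragraph(self):
        html = "".join(self._para).strip()
        if html:
            self.chunks.append(("paragraph", html))
        self._para = []

    def _emit(self, text):
        (self._table if self._depth else self._para).append(text)

    def handle_starttag(self, tag, attrs):
        if tag == "table" and self._depth == 0:
            self.flush_paragraph()  # dump collected content before the table
        if tag == "table":
            self._depth += 1
        self._emit(f"<{tag}>")  # note: attributes are omitted in this sketch

    def handle_endtag(self, tag):
        self._emit(f"</{tag}>")
        if tag == "table":
            self._depth -= 1
            if self._depth == 0:
                self.chunks.append(("table", "".join(self._table)))
                self._table = []

    def handle_data(self, data):
        self._emit(data)

def split_blocks(html):
    parser = TableSplitter()
    parser.feed(html)
    parser.flush_paragraph()  # final paragraph after the last table
    return parser.chunks
```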
def create_or_get_image(url):
    # Check if an image with this URL already exists
    existing_image = Image.objects.filter(title=url).first()
    if existing_image:
        return existing_image
    # If not, download and create the image
    response = requests.get(url)
    if response.status_code != 200:
        return None
    # Create a file-like object from the image data
    image_file = BytesIO(response.content)
    # Create the image in Wagtail
    image = Image(title=url, file=ImageFile(image_file, name=url.split("/")[-1]))
    image.save()
    return image
- (Optional) We check whether Wagtail's Image model already has this URL. If so, we just reuse it.
- We download the image and use BytesIO to create an image file.
- We save it to Wagtail's Image model.
- We return the image so we can attach it to the blog_page row referenced in create_blog_post above.
Frontend
Since the body is now split into {tables, paragraphs} blocks instead of a single Rich Text block, rendering it may be tricky. But in Django it is actually somewhat straightforward:
<div id="your-blog-post-container">
{% for block in page.body %}
{% if block.block_type == 'table' %}
{% include_block block %}
{% else %}
{{ block.value }}
{% endif %}
{% endfor %}
</div>
We just loop through the blocks inside BlogPage.body: if it's a table, we include the whole block, and if it's a regular block, we output its value.
For best SEO results, you should also update your meta title to match the blog post's title.
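In a Wagtail/Django template, that can be as simple as overriding the title block, assuming your base template defines one (the block names here are conventions, not requirements):

```html
{% block title %}{{ page.title }}{% endblock %}
{% block meta_description %}
  <meta name="description" content="{{ page.intro }}">
{% endblock %}
```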
Styling Table Of Contents
The table of contents is given to you with an H2 heading: <div class="toc"><h2>Table of Contents</h2>. The rest of the HTML is just list items with classes:
- toc-level-1 for H1 headings
- toc-level-2 for H2 headings
- ...
- toc-level-6 for H6 headings
You are free to style these appropriately. It can be as simple as adding a margin-left of 20px * heading-level to get a hierarchical TOC.
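As a sketch, those per-level indentation rules can be generated instead of hand-written (the 20px-per-level value is an arbitrary choice, and toc_css is a hypothetical helper):

```python
def toc_css(indent_px=20):
    """Generate one margin-left rule per TOC level (toc-level-1 .. toc-level-6),
    indenting each level proportionally to its heading depth."""
    rules = []
    for level in range(1, 7):
        rules.append(f".toc-level-{level} {{ margin-left: {indent_px * level}px; }}")
    return "\n".join(rules)
```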
Uploading Images To Your Server
All images provided to the webhook sit on our S3 bucket. You may want to copy them to your own S3 bucket or, as in the example below, your own domain, to avoid too many external links.
You can upload it to your server like this:
import os
import requests
from bs4 import BeautifulSoup
from urllib.parse import urlparse
from io import BytesIO
from PIL import Image
import hashlib

def process_and_upload_images(content, upload_dir='uploads'):
    # Create the upload directory if it doesn't exist
    os.makedirs(upload_dir, exist_ok=True)
    # Parse the HTML content
    soup = BeautifulSoup(content, 'html.parser')
    # Find all img tags
    images = soup.find_all('img')
    for img in images:
        src = img.get('src')
        if not src:
            continue
        try:
            # Download the image
            response = requests.get(src)
            if response.status_code == 200:
                # Generate a unique filename
                file_extension = os.path.splitext(urlparse(src).path)[1]
                if not file_extension:
                    file_extension = '.jpg'  # Default to .jpg if no extension is found
                filename = hashlib.md5(src.encode()).hexdigest() + file_extension
                # Save the image
                img_path = os.path.join(upload_dir, filename)
                with Image.open(BytesIO(response.content)) as img_file:
                    img_file.save(img_path)
                # Update the src attribute with the new path
                img['src'] = os.path.join(upload_dir, filename)
        except Exception as e:
            print(f"Error processing image {src}: {str(e)}")
    # Return the updated HTML content
    return str(soup)
This just takes the HTML content, finds the img tags, and substitutes each src with the local file you downloaded. Keep in mind this is just a simple example that downloads images into your local uploads/ directory. In practice, you probably want to upload them to an S3 bucket or something similarly scalable.
Since the feature image is provided to you as a URL, you can skip parsing for image tags and go straight to:
- Downloading the image.
- Optionally uploading it to your favorite cloud.
- Attaching the new feature image location + alt text to your blog post.
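A stdlib-only sketch of those steps, with local disk standing in for "your favorite cloud" (the helper names are mine; swap the file write in save_feature_image for an S3 upload in practice):

```python
import hashlib
import os
from urllib.parse import urlparse
from urllib.request import urlopen

def local_name_for(url):
    """Derive a stable, collision-resistant filename from the image URL,
    using the same md5-of-URL scheme as process_and_upload_images above."""
    ext = os.path.splitext(urlparse(url).path)[1] or ".jpg"
    return hashlib.md5(url.encode()).hexdigest() + ext

def save_feature_image(url, upload_dir="uploads"):
    """Download the feature image and store it locally; returns the saved path."""
    os.makedirs(upload_dir, exist_ok=True)
    path = os.path.join(upload_dir, local_name_for(url))
    with urlopen(url) as resp, open(path, "wb") as out:
        out.write(resp.read())
    return path
```

The returned path plus the feature_img_alt string is all you need to attach the image to your blog post.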
Just use an LLM.
The Django + Wagtail setup is a very specific example. But the main principles boil down to:
- Set up your routes to receive the POST notifications.
- Authenticate with secrets in the header.
- Once authenticated, process the schema sent to your server, similar to wraith_webhook above.
If you don't have a cumbersome UI plugin for your CMS like Wagtail, your implementation is probably easier, because you can just spin up a new page: the webhook provides everything in HTML. You'd just need to style the page.
While I am unable to assist with every possible web server combination, you can just use ChatGPT or Claude. After all, the complicated example above is 95% Claude's work, with a few bugs that I fixed. It took a one-time cost of 1-2 hours, and now the website can permanently generate automated blog posts (as can any other website I build in the future with Wagtail).
ZAPIER OPTION: Using Your Account With Zapier
Once you've connected your website with Wraith Scribe, you're free not to do any coding.
For any website connected to Wraith Scribe, there are 2 parts:
- Using the website for internal linking
- Publishing to the website
Connecting the website to Wraith Scribe takes care of the 1st part. For the 2nd part, you can follow the coding tutorial above, OR you can set up a Zapier integration to handle the publishing. Up to you.
Strategy
The Zap template I give you has 4 steps:
- Take in an email of a specific format.
- Use regex to parse it.
- Feed the data to an Airtable to categorize each part of the article neatly.
- Use this organized data to feed Ghost.io.
The first 2 steps are required; you are free to do whatever you want once you've extracted the main data from the article. You can skip posting to Ghost.io or any Zapier-compatible CMS and only keep the data in an Airtable, or keep it in a Google Sheet instead. You can post to a CMS directly and skip storing the data in a table. The possibilities are endless!
Single Articles
Once you've followed the instructions on setting up the Zapier account and connected your website above, just go to any article you've generated and:
- Hover over the Wordpress icon.
- Press the Envelope icon.
The email you use for Wraith Scribe will then get 2 emails:
- A normal, backup email of the article.
- A specially-formatted version of the article that'll be parsed and auto-published by your Zap.
Batch Articles
Like single articles, you'll need to 1) connect your website with Wraith Scribe and 2) set up your Zapier account appropriately.
In your batch articles section, simply pick "YES" for "Publish To Your Email / Zap?". When you select YES:
- If you pick a Webhook account under Select Account To Publish To:
   - Wraith will no longer send a POST notification to your Webhook account and will no longer do automatic Google indexing (since we won't know the URL your Zaps publish to, or whether the Zap publishes an article at all).
   - Wraith will use your Webhook website as an internal link source, provided your sitemap is configured properly.
   - When the article publishes, it'll publish to your email, which will trigger the Zap you set up.
   - It works like this because the assumption is that you don't want to write code to implement the webhook, so you just use your account as an internal link source -- and even if you did implement the webhook, it's unlikely you'd want to publish identical articles twice verbatim, since search engines don't like that.
- If you pick a Wordpress account under Select Account To Publish To:
   - Since Wraith has a direct integration with Wordpress, it'll continue to 1) publish to your Wordpress account, and 2) automatically request Google indexing upon publishing.
   - Wraith will publish to your email, which will trigger your Zap.
   - If your Zap is going to be used to publish to a website, it is NOT RECOMMENDED to pick "YES" for "Publish To Your Email / Zap?" Having multiple places with identical articles is bad for SEO. Instead, just pick "NO" and let the Wordpress integration you've already set up do all the work.
   - If your Zap is just going to be used to store data / for tracking (i.e. to an Airtable), then picking "YES" for "Publish To Your Email / Zap?" is perfectly fine.
While Webhook Account + Zap has some limitations compared to the direct Wordpress integration, you only lose out on automatic Google indexing (which you can always do manually -- and even if you don't, Google will eventually index your articles). The upside of this small limitation is that:
- You don't have to code at all.
- You unlock 100s of no-code Zapier integrations.