The LLM Advantage in Blogging

I’ve used large language model (LLM) powered chatbots (ChatGPT & Claude) to help with some of my writing. They’ve been especially helpful with blog posts whose functionality depends on JavaScript code.

The Automation Dilemma

Using these LLM chatbots is pretty straightforward, but it gets annoying when you want to provide them with writing samples. You can pick a couple of representative posts and share those, but that’s too scattershot for me. Ideally, I’d like my whole corpus of blog posts to be used as samples for the chatbots to draw from. I had written some Python scripts that loop over my posts and create a concatenated file. This worked fine for creating a file, but it was annoying to manually kick off the process every time I made a new post. So, I started thinking about how to automate the process.

There are many ways to approach it, but I wanted to keep it simple. The most straightforward route was to build off my existing automation infrastructure: the GitHub Pages build process.

GitHub Actions: My Automation Hero

The GitHub Pages build process automatically converts the documents I use to write my blog (markdown files) into the web pages you see (HTML). GitHub provides this service as a tool for developers to quickly spin up webpages using the GitHub Actions framework. GitHub Actions are fantastic as they enable continuous integration and continuous delivery/deployment (CI/CD).

    graph TB

    %% Primary Path
    A[Push new blog .md post to GitHub] --> BA
    BB --> CA
    CB --> D[Commit & push changes]

    %% GitHub Pages Build Process
    subgraph B[GitHub Pages Build Process]
        BA[Build eotles.com webpages] --> BB[Trigger: gh-pages branch]
    end

    %% Concatenate .md Files Action
    subgraph C[Concatenate .md Files Action]
        CA[Create file] --> CB[Loop over all posts and concat to file]
    end

    %% .md Files
    A -.-> P[.md files]
    P -.-> B
    P -.-> C

The above diagram provides a visual overview of the automation process I’ve set up using GitHub Actions.

Connecting the Dots with Jekyll, GitHub Pages, and Minimal Mistakes Theme

We’ve primarily centered our discussion of automation around GitHub Actions; however, it’s essential to recognize the broader ecosystem that supports my blogging. I use the Jekyll blogging platform, a simple, blog-aware, static site generator. It’s a fantastic tool that allows me to write in Markdown (.md), keeping things straightforward and focused on content. And Jekyll seamlessly integrates with GitHub Pages! The aesthetic and design of my blog are courtesy of the Minimal Mistakes theme. It’s a relatively flexible theme for Jekyll that’s ideal for building personal portfolio sites.

For those of you who are on the Jekyll-GitHub Pages-Minimal Mistakes trio, the automation process I’ve described using GitHub Actions can be a game-changer. It’s not just about streamlining; it’s about harnessing the full potential of these interconnected tools to actually speed up your work.

Diving into CI/CD

CI/CD is essential if you regularly ship production code. For example, it enables you to automatically kick off testing code as a part of your code deployment process. This is really important when you are working on a large codebase as a part of a team. Fortunately/unfortunately, I’m in the research business, so I’m usually just coding stuff up by my lonesome. CI/CD isn’t a regular part of my development process (although maybe it should be 🤔). Despite not using it before, I decided to see if I could get it to work for my purposes.

My First Foray into GitHub Actions

Since this was my first time with GitHub Actions, I turned to an expert, ChatGPT. I had initially asked it to make a bash script that I was going to run manually, but then I wondered:

so I have a website I host on GitHub. Is there a way to use the GitHub actions to automatically concantenate all the .md files in the /_posts directory?

It described the process, which consisted of two steps:

  1. Create a GitHub Action Workflow: you tell GitHub about an action by creating a YAML file in a special subdirectory (.github/workflows) of the project
  2. Define the Workflow: in the YAML file, specify what you want to happen. ChatGPT suggested some code to put in this file.

I committed and pushed the changes. A couple of minutes later, I got an email that my GitHub Action(s) had errored out. The action that I created conflicted with the existing website creation actions. With assistance from ChatGPT, I solved this by having my new concatenation action wait for the website creation action to finish before running. We achieved this by using the gh-pages branch as a trigger, ensuring our action ran after the webpages were built and deployed.
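
As a rough sketch, a trigger keyed to the gh-pages branch could look like the fragment below. This is an assumption about the shape of the configuration, not necessarily the exact one used for this site; the listing in the next section uses a paths filter on _posts instead.

```yaml
# Sketch only: start the workflow whenever something is pushed to gh-pages.
# The branch name comes from the text above; the rest is illustrative.
on:
  push:
    branches:
      - gh-pages
```

The branches filter limits the push event to the named branch, so pushes to other branches do not start the job.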

The Code Behind the Magic

The code for this GitHub Action is as follows:

name: Concatenate MD Files with Metadata

on:
  push:
    paths:
      - '_posts/*.md'

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout repository
      uses: actions/checkout@v2

    - name: Concatenate .md files with metadata
      run: |
        mkdir -p workflows_output
        > workflows_output/concatenated_posts.md
        cd _posts
        for file in *.md; do
            echo "File: $file" >> ../workflows_output/concatenated_posts.md
            # Date of the most recent commit that touched this file
            echo "Creation Date: $(git log --format=%aD -n 1 -- "$file")" >> ../workflows_output/concatenated_posts.md
            cat "$file" >> ../workflows_output/concatenated_posts.md
            echo "------------------------" >> ../workflows_output/concatenated_posts.md
        done

    - name: Commit and push if there are changes
      run: |
        git config --local user.email "action@github.com"
        git config --local user.name "GitHub Action"
        git add -A
        git diff --quiet && git diff --staged --quiet || git commit -m "Concatenated .md files with metadata"
        git push
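
One caveat about the Creation Date line: actions/checkout fetches only the most recent commit by default, so git log has very little history to work with. A hedged variant, assuming you want the date of the commit that first added each post, is to fetch the full history and filter for the commit that added the file. The step names and the use of checkout@v4 are illustrative choices, not part of the workflow above.

```yaml
steps:
  - name: Checkout repository with full history
    uses: actions/checkout@v4
    with:
      fetch-depth: 0   # 0 fetches all history; the default is a single commit

  - name: Print the first-commit date of each post
    run: |
      for file in _posts/*.md; do
        # --diff-filter=A keeps only commits that added the file;
        # tail picks the oldest one, since git log lists newest first
        first=$(git log --diff-filter=A --follow --format=%aD -- "$file" | tail -n 1)
        echo "$file: $first"
      done
```

Fetching the full history makes the checkout slower on large repositories, which is the trade-off for getting accurate dates.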

Conclusion: Automation Can Be a Warm Hug

The final result was an automation process that runs in the background every time a new post is added. Overall, I was impressed with the power and flexibility of GitHub Actions. This experience demonstrated that CI/CD isn’t just for large software projects but can be a valuable tool for individual researchers and developers!

Update!

This automation didn’t end up working well. I ended up switching the automation trigger to be time-based. You can read about the updated setup here.
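
In GitHub Actions, a time-based trigger is expressed with the schedule event and a cron expression. The fragment below is a sketch of what such a trigger might look like; the specific schedule (daily at 06:00 UTC) is a placeholder, not the one actually used.

```yaml
on:
  schedule:
    # Placeholder schedule: every day at 06:00 UTC (cron times are in UTC)
    - cron: '0 6 * * *'
  # Optional: also allow manual runs from the Actions tab
  workflow_dispatch:
```

Scheduled runs use the workflow file on the repository’s default branch, so the trigger no longer depends on which branch received a push.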

Cheers,
Erkin
Go ÖN Home

PS

The mermaid diagram (the flow diagram) was embedded thanks to a post from Ed Griebel.

PPS

The embedding code didn’t seem to like subgraphs, so I’m now using the HTML provided by Mermaid.