The self-updating website

X @urre

I have this website called Chockpress, a silly project that scrapes and picks out shocking headlines from Swedish evening news. The scraper is built with Node.js; the website itself is built with React and hosted on Now.

I already had a pretty good workflow for deploying the site, but wanted to automate more things.

🤖 Automate all the things

Desired result:

  • Get fresh new content
  • Build the website
  • Deploy the website
  • Add data to a Google Sheet
  • Schedule this to run every day

CircleCI

CircleCI is my platform of choice for continuous integration and delivery.

Configuring our build and deployment

First we create .circleci/config.yml and start adding the configuration. We want to use a prebuilt Docker image for Node.js. Read more about Configuring CircleCI

version: 2
jobs:
  build:
    docker:
      - image: circleci/node:10

Install dependencies

We use dependency caching to make jobs faster on CircleCI by reusing data from previous jobs. Using restore_cache and save_cache, we can specify a key that finds the cache matching a specific package.json checksum. If no exact match exists, the shorter dependency-cache- key falls back to any earlier cache with that prefix. More about caching

working_directory: ~/repo

steps:
    - checkout

    - restore_cache:
        keys:
        - dependency-cache-{{ checksum "package.json" }}
        - dependency-cache-
    - run:
        name: Install dependencies
        command: npm install

    - save_cache:
        key: dependency-cache-{{ checksum "package.json" }}
        paths:
        - node_modules
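
To make the key lookup concrete, here is a minimal sketch of the idea in Node.js. It is not CircleCI's implementation: cacheKey and findCache are hypothetical helpers, the SHA-256 hex digest is an assumption about how a checksum could be computed, and the lookup simply takes the first stored key that starts with a requested key.

```javascript
const crypto = require('crypto')

// Hypothetical helper: derive a cache key from a file's contents, similar in
// spirit to the {{ checksum "package.json" }} template in the config.
function cacheKey(prefix, fileContents) {
  const digest = crypto.createHash('sha256').update(fileContents).digest('hex')
  return `${prefix}${digest}`
}

// Hypothetical lookup: try each requested key in order and return the first
// stored key that starts with it. storedKeys is assumed to be newest first.
function findCache(storedKeys, requestedKeys) {
  for (const requested of requestedKeys) {
    const hit = storedKeys.find(stored => stored.startsWith(requested))
    if (hit) return hit
  }
  return null
}

// A changed package.json yields a new key, so the exact lookup misses and
// the bare prefix falls back to the older cache.
const oldKey = cacheKey('dependency-cache-', '{"dependencies":{"react":"16.8.0"}}')
const newKey = cacheKey('dependency-cache-', '{"dependencies":{"react":"16.9.0"}}')
console.log(findCache([oldKey], [newKey, 'dependency-cache-']) === oldKey)
```

The fallback is why the second, shorter key is worth listing: a slightly stale node_modules is usually still faster to update than a fresh install.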

Get new data

A couple of npm scripts handle the data scraping. See the full CircleCI config for more info.
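
The scraping scripts themselves are not shown in this post. As a rough illustration of the "pick out headlines" step only, here is a sketch that assumes the scraper produces objects with title and url fields and that a headline is selected when it contains a keyword. Both pickHeadlines and the keyword list are hypothetical.

```javascript
// Hypothetical keyword list; the real project's selection rules are not
// shown in this post.
const KEYWORDS = ['chock']

// Keep only the headlines whose title contains one of the keywords,
// ignoring case.
function pickHeadlines(headlines, keywords = KEYWORDS) {
  return headlines.filter(({ title }) =>
    keywords.some(keyword => title.toLowerCase().includes(keyword))
  )
}

// Example input with made-up headlines and URLs.
const picked = pickHeadlines([
  { title: 'CHOCKEN: Allt om ...', url: 'https://example.com/a' },
  { title: 'Vädret i dag', url: 'https://example.com/b' }
])
console.log(picked.map(headline => headline.url))
```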

Deploy

First, set up NOW_TOKEN as an environment variable in CircleCI. Then we use npx to run the now CLI. The -t option lets us pass our token.

now.json

We can also define options for the now client by keeping a now.json file in our project root. Among other things, we can scale our deployment to a region, in my case bru (Brussels, Belgium). We can also define which files should be deployed.

{
	"name": "chockpress",
	"alias": "chockpress",
	"version": 2,
	"regions": ["bru"],
	"files": ["build"]
}

In our CircleCI config:

- run:
    name: Deploy on Zeit now
    command: npx now -t ${NOW_TOKEN} ./build -A ../now.json
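
If the deploy ever moves from the config into a Node script, a small wrapper can fail early when the token is missing. This is only a sketch: deployArgs is a hypothetical helper, and the arguments simply mirror the command in the config step above.

```javascript
// Hypothetical helper: build the argument list for `npx`, mirroring the
// deploy command from the CircleCI config. Throws if NOW_TOKEN is missing
// so the job fails with a clear message.
function deployArgs(env) {
  if (!env.NOW_TOKEN) {
    throw new Error('NOW_TOKEN is not set')
  }
  return ['now', '-t', env.NOW_TOKEN, './build', '-A', '../now.json']
}

// Print the command with a placeholder token instead of running it.
console.log(['npx', ...deployArgs({ NOW_TOKEN: '<token>' })].join(' '))
```

The resulting array could be passed to child_process.spawnSync('npx', deployArgs(process.env), { stdio: 'inherit' }) to run the deploy.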

Alias our deployment

Every time we run now, the app is deployed to a unique URL. Now has a concept of immutability with many advantages, including testing multiple releases at the same time, letting multiple developers work on different parts of an app, rollbacks, and more. These URLs are ideal for development and staging, but not for end users.

Now aliases have two purposes: giving your deployment a friendly and memorable name, and updating deployments with zero downtime.

- run:
    name: Alias domain chockpress.now.sh
    command: npx now -t ${NOW_TOKEN} alias chockpress

To remove old deployments we can use now rm chockpress --safe --yes. To list all instances of our app use: now ls chockpress. Read more about using Now

Save data to a Google Sheet

I keep all the scraped headlines in a Google Sheet using the Google Drive API. For authentication, we can either download credentials.json or use environment variables like this:

GOOGLE_SHEET_ID="XXX"
GOOGLE_PRIVATE_KEY="XXX"
GOOGLE_CLIENT_EMAIL="XXX"

We must also share our Sheet with the service account's email address (the GOOGLE_CLIENT_EMAIL value). Click Share, add the email and grant access.

We can now start populating our sheet. A simplified example:

import GoogleSpreadsheet from 'google-spreadsheet'
const doc = new GoogleSpreadsheet(process.env.GOOGLE_SHEET_ID)

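// Service account credentials, read from the environment variables above
const creds_json = {
    client_email: process.env.GOOGLE_CLIENT_EMAIL,
    private_key: process.env.GOOGLE_PRIVATE_KEY
}

// Authenticate; step is the callback that continues once auth completes
doc.useServiceAccountAuth(creds_json, step)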
...

/* Loop our data and add new rows using the doc.addRow method */

doc.addRow( 1,
    {
        heading: data[key].title.trim(),
        source: source,
        url: data[key]['url']
    },

    ...

Success!
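
One way to keep the row-building logic testable without credentials is to separate it from the API call. The sketch below assumes data is an object keyed by some id, with title and url fields, as the simplified snippet suggests. toRows is a hypothetical helper, not part of google-spreadsheet.

```javascript
// Hypothetical helper: turn the scraped data into plain row objects with the
// same columns the snippet passes to doc.addRow (heading, source, url).
function toRows(data, source) {
  return Object.keys(data).map(key => ({
    heading: data[key].title.trim(),
    source: source,
    url: data[key].url
  }))
}

// Example with made-up data; each resulting object is one row to add.
const rows = toRows(
  { first: { title: '  Chock i natt  ', url: 'https://example.com/1' } },
  'example-paper'
)
console.log(rows)
```

Each element of rows can then be handed to doc.addRow in the loop.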

Set this up as an npm script and add it to our job:

- run:
    name: Save data to Google Sheet
    command: npm run save

Scheduling builds using workflows

A workflow is a set of rules for defining a collection of jobs and their run order. You can do all sorts of things; in my case I wanted two strategies: a regular commit workflow for pushing changes, and a daily workflow that updates the website every day at 14:00. CircleCI evaluates cron schedules in UTC.

workflows:
  version: 2
  commit-workflow:
    jobs:
      - build
  daily-workflow:
    triggers:
      - schedule:
          cron: '0 14 * * *'
          filters:
            branches:
              only:
                - dev
    jobs:
      - build
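
To check when a schedule like this fires, it can help to compute the next run by hand. The sketch below handles only the simple daily form "0 <hour> * * *" and assumes the schedule is evaluated in UTC. nextDailyRun is a hypothetical helper, not a CircleCI API.

```javascript
// Hypothetical helper: return the next time a daily "0 <hour> * * *"
// schedule fires strictly after `now`, computed in UTC.
function nextDailyRun(now, hourUtc) {
  const next = new Date(Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate(),
    hourUtc, 0, 0
  ))
  // If today's slot has already passed (or is exactly now), use tomorrow's.
  if (next <= now) {
    next.setUTCDate(next.getUTCDate() + 1)
  }
  return next
}

console.log(nextDailyRun(new Date(), 14).toISOString())
```

For Swedish local time, 14:00 UTC corresponds to 15:00 in winter and 16:00 in summer.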

📄 See my complete CircleCI config here

Kick things off

After checking in changes, the CircleCI build should kick off nicely.

Slack notice

Using the CircleCI build notifications in Slack helps us stay up to date with the latest build status.


Summary

🚀 Success! We have a self-updating website with a fully automated process.