BookStack Backup: Local HTML + Markdown to Codeberg


1. Prerequisites

  • A running BookStack instance
  • PHP installed on the host machine
  • Git installed on the host machine
  • A Codeberg account

2. Get your BookStack API credentials

In BookStack, click your avatar (top right) > Edit Profile > scroll down to API Tokens > Create Token.

Give it a name, save, and note down the Token ID and Token Secret right away, because the secret is displayed only once. These values go into $clientId and $clientSecret in the scripts below.

Your $apiUrl is the local address of your BookStack instance, for example http://192.168.1.10:6875.
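
Before putting these values into the scripts, it can help to confirm that the token is accepted. The sketch below is an illustration, not part of the original setup: the function names bookstack_auth_header and check_bookstack_api are hypothetical, and it assumes curl is installed on the host. It builds the same "Authorization: Token ID:SECRET" header the PHP scripts send and requests a single book from the API.

```shell
# Build the Authorization header in the "Token <id>:<secret>" form used by the scripts
bookstack_auth_header() {
    printf 'Authorization: Token %s:%s' "$1" "$2"
}

# Print the HTTP status code returned for a one-book listing request.
# Arguments: API URL, token ID, token secret (hypothetical helper, assumes curl).
check_bookstack_api() {
    curl -s -o /dev/null -w '%{http_code}\n' \
        -H "$(bookstack_auth_header "$2" "$3")" \
        "${1%/}/api/books?count=1"
}

# Example call, with placeholders to replace by your own values:
# check_bookstack_api 'http://192.168.1.10:6875' 'YOUR_TOKEN_ID' 'YOUR_TOKEN_SECRET'
```

A printed 200 suggests that the URL and the token are usable; any other code is worth investigating before running the export.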


3. Get your Codeberg token

On Codeberg, click your avatar > Settings > Applications > Generate Token.

Give it a name, grant read and write access to repositories, then generate and copy the token. It is shown only once.


4. Create the export folders

mkdir -p /home/bookstackexportbypagehtml
mkdir -p /home/bookstackexportbypagemd
mkdir -p /home/scripts/bookstack

5. HTML export script

File: /home/scripts/bookstack/export-html.php

#!/usr/bin/env php
<?php
// Your BookStack instance URL, e.g. http://192.168.1.10:6875
$apiUrl = 'YOUR_BOOKSTACK_URL';
// Token ID from your BookStack profile > API Tokens
$clientId = 'YOUR_TOKEN_ID';
// Token Secret from your BookStack profile > API Tokens
$clientSecret = 'YOUR_TOKEN_SECRET';
$exportLocation = '/home/bookstackexportbypagehtml';

$shelves = getAllShelves();
$shelfBookIds = [];

foreach ($shelves as $shelf) {
    $shelfDetail = apiGetJson("api/shelves/{$shelf['id']}");
    $books = $shelfDetail['books'] ?? [];
    foreach ($books as $book) {
        $shelfBookIds[$book['id']] = sanitize($shelf['slug']);
        exportBook($book, $exportLocation . '/' . sanitize($shelf['slug']));
    }
}

// Handle books not attached to any shelf
$allBooks = getAllBooks();
foreach ($allBooks as $book) {
    if (isset($shelfBookIds[$book['id']])) continue;
    exportBook($book, $exportLocation . '/no-shelf');
}

function exportBook(array $book, string $baseDir): void {
    $bookDir = $baseDir . '/' . sanitize($book['slug']);
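    // Create the book folder only if it is missing, so a re-run does not raise a warning
    if (!is_dir($bookDir)) mkdir($bookDir, 0755, true);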

    $pages = getPagesForBook($book['id']);
    foreach ($pages as $page) {
        usleep(500000); // 0.5s delay to avoid hitting API rate limit
        $content = apiGet("api/pages/{$page['id']}/export/html");
        if (empty($content)) {
            echo "SKIP (empty): {$book['slug']} / {$page['slug']}\n";
            continue;
        }

        $filename = $bookDir . '/' . sanitize($page['slug']) . '.html';
        file_put_contents($filename, $content);
        echo "OK: {$book['slug']} / {$page['slug']}\n";
    }
}

function getAllShelves(): array {
    $all = [];
    $offset = 0;
    $total = 0;
    do {
        usleep(300000);
        $resp = apiGetJson('api/shelves?' . http_build_query(['count' => 100, 'offset' => $offset]));
        if ($resp === null) break;
        if ($offset === 0) $total = $resp['total'] ?? 0;
        $new = $resp['data'] ?? [];
        array_push($all, ...$new);
        $offset += 100;
    } while ($offset < $total);
    return $all;
}

function getAllBooks(): array {
    $all = [];
    $offset = 0;
    $total = 0;
    do {
        usleep(300000);
        $resp = apiGetJson('api/books?' . http_build_query(['count' => 100, 'offset' => $offset]));
        if ($resp === null) break;
        if ($offset === 0) $total = $resp['total'] ?? 0;
        $new = $resp['data'] ?? [];
        array_push($all, ...$new);
        $offset += 100;
    } while ($offset < $total);
    return $all;
}

function getPagesForBook(int $bookId): array {
    $all = [];
    $offset = 0;
    $total = 0;
    do {
        usleep(300000);
        $resp = apiGetJson('api/pages?' . http_build_query([
            'count' => 100,
            'offset' => $offset,
            'filter[book_id]' => $bookId,
        ]));
        if ($resp === null) break;
        if ($offset === 0) $total = $resp['total'] ?? 0;
        $new = $resp['data'] ?? [];
        array_push($all, ...$new);
        $offset += 100;
    } while ($offset < $total);
    return $all;
}

function sanitize(string $name): string {
    return preg_replace('/[^a-zA-Z0-9_\-]/', '_', $name);
}

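function apiGet(string $endpoint): string {
    global $apiUrl, $clientId, $clientSecret;
    $url = rtrim($apiUrl, '/') . '/' . ltrim($endpoint, '/');
    $opts = ['http' => [
        'header' => "Authorization: Token {$clientId}:{$clientSecret}",
        // Still read the body on HTTP errors; the status is checked below
        'ignore_errors' => true,
    ]];
    $result = file_get_contents($url, false, stream_context_create($opts));
    if ($result === false) return '';
    // Keep the last status line (the final response after any redirects)
    $status = '';
    foreach ($http_response_header ?? [] as $header) {
        if (strpos($header, 'HTTP/') === 0) $status = $header;
    }
    // Treat non-2xx responses as failures so an error body is never saved as page content
    return preg_match('#^HTTP/\S+ 2\d\d#', $status) ? $result : '';
}

function apiGetJson(string $endpoint): ?array {
    $data = apiGet($endpoint);
    if ($data === '') return null;
    // json_decode can return null or a scalar; only an array is a usable response
    $decoded = json_decode($data, true);
    return is_array($decoded) ? $decoded : null;
}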
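
The three listing functions above share one pagination pattern: request 100 items, read the total from the first response, and keep advancing the offset until it reaches that total. As a small illustration of which requests this produces, the shell sketch below prints the offsets for a given total. The function name page_offsets is hypothetical and no API call is made.

```shell
# Print the offsets the do/while loop would request.
# Arguments: total number of items, page size (defaults to 100).
page_offsets() {
    total=$1
    count=${2:-100}
    offset=0
    while :; do
        # The loop body runs at least once, like the PHP do/while
        printf '%s\n' "$offset"
        offset=$((offset + count))
        [ "$offset" -lt "$total" ] || break
    done
}

# A book with 250 pages needs three requests
page_offsets 250
```

With a total of 250 the loop requests offsets 0, 100 and 200; with 100 items or fewer it makes a single request.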

6. Markdown export script

File: /home/scripts/bookstack/export-md.php

#!/usr/bin/env php
<?php
// Your BookStack instance URL, e.g. http://192.168.1.10:6875
$apiUrl = 'YOUR_BOOKSTACK_URL';
// Token ID from your BookStack profile > API Tokens
$clientId = 'YOUR_TOKEN_ID';
// Token Secret from your BookStack profile > API Tokens
$clientSecret = 'YOUR_TOKEN_SECRET';
$exportLocation = '/home/bookstackexportbypagemd';

$shelves = getAllShelves();
$shelfBookIds = [];

foreach ($shelves as $shelf) {
    $shelfDetail = apiGetJson("api/shelves/{$shelf['id']}");
    $books = $shelfDetail['books'] ?? [];
    foreach ($books as $book) {
        $shelfBookIds[$book['id']] = sanitize($shelf['slug']);
        exportBook($book, $exportLocation . '/' . sanitize($shelf['slug']));
    }
}

// Handle books not attached to any shelf
$allBooks = getAllBooks();
foreach ($allBooks as $book) {
    if (isset($shelfBookIds[$book['id']])) continue;
    exportBook($book, $exportLocation . '/no-shelf');
}

function exportBook(array $book, string $baseDir): void {
    $bookDir = $baseDir . '/' . sanitize($book['slug']);
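    // Create the book folder only if it is missing, so a re-run does not raise a warning
    if (!is_dir($bookDir)) mkdir($bookDir, 0755, true);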

    $pages = getPagesForBook($book['id']);
    foreach ($pages as $page) {
        usleep(500000); // 0.5s delay to avoid hitting API rate limit
        $content = apiGet("api/pages/{$page['id']}/export/markdown");
        if (empty($content)) {
            echo "SKIP (empty): {$book['slug']} / {$page['slug']}\n";
            continue;
        }

        $filename = $bookDir . '/' . sanitize($page['slug']) . '.md';
        file_put_contents($filename, $content);
        echo "OK: {$book['slug']} / {$page['slug']}\n";
    }
}

function getAllShelves(): array {
    $all = [];
    $offset = 0;
    $total = 0;
    do {
        usleep(300000);
        $resp = apiGetJson('api/shelves?' . http_build_query(['count' => 100, 'offset' => $offset]));
        if ($resp === null) break;
        if ($offset === 0) $total = $resp['total'] ?? 0;
        $new = $resp['data'] ?? [];
        array_push($all, ...$new);
        $offset += 100;
    } while ($offset < $total);
    return $all;
}

function getAllBooks(): array {
    $all = [];
    $offset = 0;
    $total = 0;
    do {
        usleep(300000);
        $resp = apiGetJson('api/books?' . http_build_query(['count' => 100, 'offset' => $offset]));
        if ($resp === null) break;
        if ($offset === 0) $total = $resp['total'] ?? 0;
        $new = $resp['data'] ?? [];
        array_push($all, ...$new);
        $offset += 100;
    } while ($offset < $total);
    return $all;
}

function getPagesForBook(int $bookId): array {
    $all = [];
    $offset = 0;
    $total = 0;
    do {
        usleep(300000);
        $resp = apiGetJson('api/pages?' . http_build_query([
            'count' => 100,
            'offset' => $offset,
            'filter[book_id]' => $bookId,
        ]));
        if ($resp === null) break;
        if ($offset === 0) $total = $resp['total'] ?? 0;
        $new = $resp['data'] ?? [];
        array_push($all, ...$new);
        $offset += 100;
    } while ($offset < $total);
    return $all;
}

function sanitize(string $name): string {
    return preg_replace('/[^a-zA-Z0-9_\-]/', '_', $name);
}

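function apiGet(string $endpoint): string {
    global $apiUrl, $clientId, $clientSecret;
    $url = rtrim($apiUrl, '/') . '/' . ltrim($endpoint, '/');
    $opts = ['http' => [
        'header' => "Authorization: Token {$clientId}:{$clientSecret}",
        // Still read the body on HTTP errors; the status is checked below
        'ignore_errors' => true,
    ]];
    $result = file_get_contents($url, false, stream_context_create($opts));
    if ($result === false) return '';
    // Keep the last status line (the final response after any redirects)
    $status = '';
    foreach ($http_response_header ?? [] as $header) {
        if (strpos($header, 'HTTP/') === 0) $status = $header;
    }
    // Treat non-2xx responses as failures so an error body is never saved as page content
    return preg_match('#^HTTP/\S+ 2\d\d#', $status) ? $result : '';
}

function apiGetJson(string $endpoint): ?array {
    $data = apiGet($endpoint);
    if ($data === '') return null;
    // json_decode can return null or a scalar; only an array is a usable response
    $decoded = json_decode($data, true);
    return is_array($decoded) ? $decoded : null;
}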
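
The two export scripts differ only in the export endpoint (html or markdown), the file extension and the target folder. The shell sketch below illustrates how one page maps to its API URL and its file on disk. All four function names are hypothetical, and sanitize mirrors the PHP helper of the same name under the assumption that names are plain ASCII.

```shell
# Replace every character outside a-z, A-Z, 0-9, _ and - with an underscore
sanitize() {
    printf '%s' "$1" | sed 's/[^a-zA-Z0-9_-]/_/g'
}

# Map an export format to the file extension used by the scripts
export_ext() {
    case "$1" in
        html) echo html ;;
        markdown) echo md ;;
        *) return 1 ;;
    esac
}

# Arguments: API URL, page ID, format
export_url() {
    printf '%s/api/pages/%s/export/%s' "${1%/}" "$2" "$3"
}

# Arguments: export folder, shelf slug, book slug, page slug, format
export_path() {
    printf '%s/%s/%s/%s.%s' "$1" "$(sanitize "$2")" "$(sanitize "$3")" \
        "$(sanitize "$4")" "$(export_ext "$5")"
}

export_url 'http://192.168.1.10:6875' 42 markdown; echo
export_path /home/bookstackexportbypagemd my-shelf my-book my-page markdown; echo
```

The page ID 42 is only an example value.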

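7. Initialize the git repository

First create an empty repository named bookstack-backup on Codeberg (no README, no license), so the first push has a target. Then set up the local repository. The symbolic-ref line points the new, still empty repository at a branch named main, whatever default branch name your Git version uses.

cd /home/bookstackexportbypagemd
git init
git symbolic-ref HEAD refs/heads/main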
git remote add origin https://YOUR_CODEBERG_USERNAME:YOUR_CODEBERG_TOKEN@codeberg.org/YOUR_CODEBERG_USERNAME/bookstack-backup.git
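
To check the result of this step without touching the real export folder, the same setup can be tried in a throwaway directory. This is a sketch under assumptions: the temporary directory stands in for /home/bookstackexportbypagemd, the remote URL is a placeholder without a token, and git symbolic-ref is used here as one way to select the main branch before the first commit.

```shell
# Assumption: a temporary directory stands in for the real export folder
repo=$(mktemp -d)

git -C "$repo" init -q
# Point the still empty repository at a branch named main
git -C "$repo" symbolic-ref HEAD refs/heads/main
git -C "$repo" remote add origin "https://codeberg.org/YOUR_CODEBERG_USERNAME/bookstack-backup.git"

# Read back what was configured
branch=$(git -C "$repo" symbolic-ref --short HEAD)
remote=$(git -C "$repo" remote get-url origin)
echo "branch=$branch"
echo "remote=$remote"
```

In the real folder, the same two read-back commands show whether the branch is main and whether the remote points at your Codeberg repository.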

8. Global backup script

File: /home/scripts/bookstack/backup.sh

#!/bin/bash

# Wipe export folders before each run to avoid stale folders from renamed books.
# The * glob skips dot-entries, so the .git folder in the Markdown export survives.
rm -rf /home/bookstackexportbypagehtml/*
rm -rf /home/bookstackexportbypagemd/*

# Export all pages as HTML (local backup)
php /home/scripts/bookstack/export-html.php

# Export all pages as Markdown
php /home/scripts/bookstack/export-md.php

# Push Markdown to Codeberg
# Stop here if the folder is missing, so git never runs in the wrong directory
cd /home/bookstackexportbypagemd || exit 1
git add -A

# Only commit if there are actual changes
if git diff --cached --quiet; then
    echo "Nothing new to commit"
    exit 0
fi

git commit -m "backup $(date '+%Y-%m-%d %H:%M')"
git push origin main
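
After saving the file, make it executable and strip any Windows line endings (only needed if the file was edited on Windows). Run these two commands once in the shell; they are not part of backup.sh:

chmod +x /home/scripts/bookstack/backup.sh
sed -i 's/\r$//' /home/scripts/bookstack/backup.sh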
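
The commit guard in backup.sh relies on the exit status of git diff --cached --quiet: it succeeds when nothing is staged and fails when the index differs from the last commit. The sketch below shows that behavior in a throwaway repository. The directory, the file name and the staged_state helper are hypothetical and exist only for the demonstration.

```shell
# Assumption: a temporary repository, so the real backup folder is untouched
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.name "backup"
git -C "$repo" config user.email "backup@example.invalid"

# Report whether the index holds changes that are not committed yet
staged_state() {
    if git -C "$repo" diff --cached --quiet; then
        echo "nothing new"
    else
        echo "changes staged"
    fi
}

# First run: a new page is staged, so there is something to commit
echo "# page" > "$repo/my-page.md"
git -C "$repo" add -A
first=$(staged_state)
git -C "$repo" commit -q -m "backup 1"

# Second run with identical content: nothing is staged
git -C "$repo" add -A
second=$(staged_state)

echo "$first / $second"
```

The first check should report staged changes and the second should report nothing new, which is the case where backup.sh exits without committing or pushing.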

9. First run

bash /home/scripts/bookstack/backup.sh

10. Schedule with cron

crontab -e

Add the following line to run every day at 2am:

0 2 * * * /bin/bash /home/scripts/bookstack/backup.sh >> /var/log/bookstack-backup.log 2>&1
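
For readers less familiar with crontab syntax, the sketch below splits that entry into its five schedule fields and the command. It only illustrates how the line is read; the variable names are hypothetical and the log redirection is left out for brevity.

```shell
cron_line='0 2 * * * /bin/bash /home/scripts/bookstack/backup.sh'

# read assigns the first five words to the schedule fields
# and the remainder of the line to the command
read -r minute hour day_of_month month day_of_week command <<EOF
$cron_line
EOF

echo "minute=$minute hour=$hour day-of-month=$day_of_month month=$month day-of-week=$day_of_week"
echo "command=$command"
```

Minute 0 and hour 2 with an asterisk in the remaining fields means 02:00 on every day of every month.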

Expected output structure

/home/bookstackexportbypagehtml/
  my-shelf/
    my-book/
      my-page.html
  no-shelf/
    book-without-shelf/
      my-page.html

/home/bookstackexportbypagemd/
  my-shelf/
    my-book/
      my-page.md
  no-shelf/
    book-without-shelf/
      my-page.md
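
A quick way to sanity-check a run against this structure is to count the exported files. The sketch below builds a small demo tree matching the layout above and counts its Markdown files. The count_exports function and the demo directory are hypothetical.

```shell
# Count files with the given extension below a folder.
# Arguments: folder, extension without the dot.
count_exports() {
    find "$1" -type f -name "*.$2" | wc -l | tr -d ' '
}

# Demo tree that mirrors the expected layout
demo=$(mktemp -d)
mkdir -p "$demo/my-shelf/my-book" "$demo/no-shelf/book-without-shelf"
: > "$demo/my-shelf/my-book/my-page.md"
: > "$demo/no-shelf/book-without-shelf/my-page.md"

count_exports "$demo" md
```

On the real folders, count_exports /home/bookstackexportbypagehtml html and count_exports /home/bookstackexportbypagemd md should give comparable numbers, as both scripts walk the same books and pages.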