Planet Grep

Planet'ing Belgian FLOSS people

Planet Grep is maintained by Wouter Verhelst. All times are in UTC.

April 23, 2025

Managing multiple servers can be a daunting task, especially when striving for consistency and efficiency. To tackle this challenge, I developed a robust automation system using Ansible, GitHub Actions, and Vagrant. This setup not only streamlines server configuration but also ensures that deployments are repeatable and maintainable.

A Bit of History: How It All Started

This project began out of necessity. I was maintaining a handful of Ubuntu servers — one for email, another for a website, and a few for experiments — and I quickly realized that logging into each one to make manual changes was both tedious and error-prone. My first step toward automation was a collection of shell scripts. They worked, but as the infrastructure grew, they became hard to manage and lacked the modularity I needed.

That is when I discovered Ansible. I created the ansible-servers repository in early 2024 as a way to centralize and standardize my infrastructure automation. Initially, it only contained a basic playbook for setting up users and updating packages. But over time, it evolved to include multiple roles, structured inventories, and eventually CI/CD integration through GitHub Actions.

Every addition was born out of a real-world need. When I got tired of testing changes manually, I added Vagrant to simulate my environments locally. When I wanted to be sure my configurations stayed consistent after every push, I integrated GitHub Actions to automate deployments. When I noticed the repo growing, I introduced linting and security checks to maintain quality.

The repository has grown steadily and organically, each commit reflecting a small lesson learned or a new challenge overcome.

The Foundation: Ansible Playbooks

At the core of my automation strategy are Ansible playbooks, which define the desired state of my servers. These playbooks handle tasks such as installing necessary packages, configuring services, and setting up user accounts. By codifying these configurations, I can apply them consistently across different environments.
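
As a rough illustration only (the module choices and names below are generic placeholders, not an excerpt from my actual playbooks), a minimal playbook in this style could look like:

- name: Base server configuration
  hosts: all
  become: true
  tasks:
    - name: Ensure base packages are present
      ansible.builtin.apt:
        name:
          - vim
          - htop
          - unattended-upgrades
        state: present
        update_cache: true

    - name: Ensure an admin user exists
      ansible.builtin.user:
        name: admin
        groups: sudo
        append: true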

To manage these playbooks, I maintain a structured repository that includes:

  • Inventory Files: Located in the inventory directory, these YAML files specify the hosts and groups for deployment targets.
  • Roles: Under the roles directory, I define reusable components that encapsulate specific functionalities, such as setting up a web server or configuring a database.
  • Configuration File: The ansible.cfg file sets important defaults, like enabling fact caching and specifying the inventory path, to optimize Ansible’s behavior.
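
To give a concrete idea of the layout, here is a minimal sketch of an inventory file and of the ansible.cfg defaults (simplified placeholders, not the literal contents of my repository):

# inventory/production.yml (illustrative layout)
all:
  children:
    webservers:
      hosts:
        web01.example.com:
    mailservers:
      hosts:
        mail01.example.com:

# ansible.cfg (illustrative defaults)
[defaults]
inventory = inventory/production.yml
gathering = smart
fact_caching = jsonfile
fact_caching_connection = .ansible_fact_cache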

Seamless Deployments with GitHub Actions

To automate the deployment process, I leverage GitHub Actions. This integration allows me to trigger Ansible playbooks automatically upon code changes, ensuring that my servers are always up-to-date with the latest configurations.

One of the key workflows is Deploy to Production, which executes the main playbook against the production inventory. This workflow is defined in the ansible-deploy.yml file and is triggered on specific events, such as pushes to the main branch.
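
The real workflow lives in the repository, but a minimal sketch of what such an ansible-deploy.yml can look like is shown below (the job steps, paths and playbook name are illustrative assumptions, not a copy of my file):

name: Deploy to Production

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install Ansible
        run: pip install ansible-core

      # SSH access to the production hosts would normally be configured
      # here from a repository secret (omitted in this sketch)

      - name: Run the playbook against the production inventory
        run: ansible-playbook -i inventory/production.yml site.yml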

Additionally, I have set up other workflows to maintain code quality and security:

  • Super-Linter: Automatically checks the codebase for syntax errors and adherence to best practices.
  • Codacy Security Scan: Analyzes the code for potential security vulnerabilities.
  • Dependabot Updates: Keeps dependencies up-to-date by automatically creating pull requests for new versions.

Local Testing with Vagrant

Before deploying changes to production, it is crucial to test them in a controlled environment. For this purpose, I use Vagrant to spin up virtual machines that mirror my production servers.

The deploy_to_staging.sh script (sketched below) automates this process by:

  1. Starting the Vagrant environment and provisioning it.
  2. Installing required Ansible roles specified in requirements.yml.
  3. Running the Ansible playbook against the staging inventory.
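
A rough sketch of what deploy_to_staging.sh can look like, following those three steps (simplified; the playbook name and inventory path are placeholders rather than the repository's exact files):

#!/bin/bash
set -euo pipefail

# 1. Start the Vagrant environment and provision it
vagrant up --provision

# 2. Install the Ansible roles listed in requirements.yml
ansible-galaxy install -r requirements.yml

# 3. Run the playbook against the staging inventory
ansible-playbook -i inventory/staging.yml site.yml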

This approach allows me to validate changes in a safe environment before applying them to live servers.

Embracing Open Source and Continuous Improvement

Transparency and collaboration are vital in the open-source community. By hosting my automation setup on GitHub, I invite others to review, suggest improvements, and adapt the configurations for their own use cases.

The repository is licensed under the MIT License, encouraging reuse and modification. Moreover, I actively monitor issues and welcome contributions to enhance the system further.


In summary, by combining Ansible, GitHub Actions, and Vagrant, I have created a powerful and flexible automation framework for managing my servers. This setup not only reduces manual effort but also increases reliability and scalability. I encourage others to explore this approach and adapt it to their own infrastructure needs. What began as a few basic scripts has now evolved into a reliable automation pipeline I rely on every day.

If you are managing servers and find yourself repeating the same configuration steps, I invite you to check out the ansible-servers repository on GitHub. Clone it, explore the structure, try it in your own environment — and if you have ideas or improvements, feel free to open a pull request or start a discussion. Automation has made a huge difference for me, and I hope it can do the same for you.


April 17, 2025

One of the most surprising moments at Drupal Dev Days Leuven? Getting a phone call from Drupal. Yes, really.

Marcus Johansson gave me a spontaneous demo of a Twilio-powered AI agent built for Drupal, which triggered a phone call right from within the Drupal interface. It was unexpected, fun, and a perfect example of the kind of creative energy in the room.

That moment reminded me why I love Drupal. People were building, sharing, and exploring what Drupal can do next. The energy was contagious.

From MCP (Model Context Protocol) modules to AI-powered search, I saw Drupal doing things I wouldn't have imagined two years ago. AI is no longer just an idea. It's already finding its way into Drupal in practical, thoughtful ways.

Doing a Q&A at Drupal Dev Days in Leuven, facing a large and engaged audience in a university lecture hall. I loved the energy and great questions from the Drupal community. © Paul Johnson

Outside of doing a Q&A session, I spent much of my time at Drupal Dev Days working on the next phase of Drupal's AI strategy. We have an early lead in AI, but we need to build on it. We will be sharing more on that in the coming month.

In the meantime, huge thanks to the organizers of Drupal Dev Days for making this event happen, and to Paul Johnson for the fantastic photo. I love that it shows so many happy faces.

Book signing at Trolls & Vélo, and cycling magic

This Saturday, April 19, I will be in Mons at the Trolls & Légendes festival, signing books at the PVH stand.

The star of the table will without question be Sara Schneider, fantasy author of the Enfants d'Aliel saga, freshly crowned with the 2024 Swiss SFFF Prize for her superb novel "Place d'âmes" (which I have already told you about).

It will be the first time I sign next to an author who has won a major prize. I am not sure she will still let me address her informally.

Sara Schneider with her novel and her 2024 Swiss SFFF Prize

In short, if Sara is there to supply the legend, the festival's name implies that trolls are needed to complete the picture. Hence the presence at the PVH table of Tirodem, Allius and myself. Trolls? That we know how to do!

The beautiful machinery of the imagination

If there are trolls and legends, there is also a whole steampunk side. And what is more steampunk than a bicycle?

What makes the bicycle beautiful is its sincerity. It hides nothing, its movements are visible, its effort can be seen and understood; it proclaims its purpose, it says that it wants to go fast, silently and lightly. Why is the automobile so ugly, and why does it leave us with a feeling of unease? Because it hides its organs as if they were something shameful. We do not know what it wants. It seems unfinished.
– Voici des ailes, Maurice Leblanc

The bicycle is the culmination of a humanist transhumanism dreamed up by science fiction.

The bicycle has solved the problem: it remedies our slowness and abolishes fatigue. Man is now equipped with all his means. Steam and electricity were only advances serving his comfort; the bicycle is a perfecting of his very body, a completion, one might say. It is a pair of faster legs that he is offered. He and his machine become one; they are not two different beings like man and horse, two opposing instincts; no, it is a single being, an automaton in one piece. There is no man and machine; there is a faster man.
– Voici des ailes, Maurice Leblanc

A technological achievement that, paradoxically, connects us to nature. The bicycle is a respectful technology, usable by korrigans, fairies, elves and all the peoples who suffer from our technological growth. The bicycle extends our brain to connect us to nature, inducing a shamanic trance as soon as the pedals start to turn.

Our relationship with nature is turned upside down! Imagine two men on a long road: one walks, the other rides; will their situation with regard to nature be the same? Oh no. One will receive from her small sensations of detail, the other a vast impression of the whole. On foot, you breathe the scent of this plant, you admire the shade of that flower, you hear the song of that bird; on a bicycle, you breathe, admire and hear nature herself. This is because the movement produced stretches our nerves to their maximum intensity and endows us with a sensitivity unknown until then.
– Voici des ailes, Maurice Leblanc

Yes, the bicycle amply deserves its place at Trolls & Légendes, as these extracts from Maurice Leblanc's "Voici des ailes" demonstrate, a novel written... in 1898, a few years before the creation of Arsène Lupin!

Celebrating the Bikepunk universe

I too like to wax lyrical in celebration of the bicycle, as the extracts that critics pick from my novel Bikepunk go to show.

Chemical crapstorm of a nuclear shitshow of vomit-inducing permamuck!
— Bikepunk, Ploum

Yeah, okay, fine... It is a slightly different style. I am just trying to reach a slightly more modern audience, that's all. And besides, we had agreed: "not that extract!"

Come on, as they say among cyclists: keep pedaling, keep pedaling...

So, to celebrate the bicycle and the cycling imagination, I am offering a small surprise to anyone who shows up at the PVH stand this Saturday in a Bikepunk-themed costume (and if you let me know in advance, even better).

Because we are going to show those elves, barbarians and mages what real magic, real power is: pedals, two wheels and a handlebar!

See you Saturday, cyclotrolls!

I am Ploum and I have just published Bikepunk, an eco-cycling fable typed entirely on a mechanical typewriter. To support me, buy my books (if possible from your local bookshop)!

Receive my writings in French and English directly by email. Your address will never be shared. You can also use my French-language RSS feed or the full RSS feed.

April 16, 2025

Introduction

In my previous post, I shared the story of why I needed a new USB stick and how I used ChatGPT to write a benchmark script that could measure read performance across various methods. In this follow-up, I will dive into the technical details of how the script evolved—from a basic prototype into a robust and feature-rich tool—thanks to incremental refinements and some AI-assisted development.


Starting Simple: The First Version

The initial idea was simple: read a file using dd and measure the speed.

dd if=/media/amedee/Ventoy/ISO/ubuntu-24.10-desktop-amd64.iso \
   of=/dev/null bs=8k

That worked, but I quickly ran into limitations:

  • No progress indicator
  • Hardcoded file paths
  • No USB auto-detection
  • No cache flushing, leading to inflated results when repeating the measurement

With ChatGPT’s help, I started addressing each of these issues one by one.


Tools check

On a default Ubuntu installation, some tools are available by default, while others (especially benchmarking tools) usually need to be installed separately.

Tools used in the script:

Tool        Installed by default?       Needs require?
hdparm      ❌ Not installed            ✅ Yes
dd          ✅ Yes                      ❌ No
pv          ❌ Not installed            ✅ Yes
cat         ✅ Yes                      ❌ No
ioping      ❌ Not installed            ✅ Yes
fio         ❌ Not installed            ✅ Yes
lsblk       ✅ Yes (in util-linux)      ❌ No
awk         ✅ Yes (in gawk)            ❌ No
grep        ✅ Yes                      ❌ No
basename    ✅ Yes (in coreutils)       ❌ No
find        ✅ Yes                      ❌ No
sort        ✅ Yes                      ❌ No
stat        ✅ Yes                      ❌ No

This function ensures the system has all tools needed for benchmarking. It exits early if any tool is missing.

This was the initial version:

check_required_tools() {
  local required_tools=(dd pv hdparm fio ioping awk grep sed tr bc stat lsblk find sort)
  for tool in "${required_tools[@]}"; do
    if ! command -v "$tool" &>/dev/null; then
      echo "❌ Required tool '$tool' is not installed."
      exit 1
    fi
  done
}

That’s already nice, but maybe I just want to run the script anyway if some of the tools are missing.

This is a more advanced version:

ALL_TOOLS=(hdparm dd pv ioping fio lsblk stat grep awk find sort basename column gnuplot)
MISSING_TOOLS=()

require() {
  if ! command -v "$1" >/dev/null; then
    return 1
  fi
  return 0
}

check_required_tools() {
  echo "🔍 Checking required tools..."
  for tool in "${ALL_TOOLS[@]}"; do
    if ! require "$tool"; then
      MISSING_TOOLS+=("$tool")
    fi
  done

  if [[ ${#MISSING_TOOLS[@]} -gt 0 ]]; then
    echo "⚠️  The following tools are missing: ${MISSING_TOOLS[*]}"
    echo "You can install them using: sudo apt install ${MISSING_TOOLS[*]}"
    if [[ -z "$FORCE_YES" ]]; then
      read -rp "Do you want to continue and skip tests that require them? (y/N): " yn
      case $yn in
        [Yy]*)
          echo "Continuing with limited tests..."
          ;;
        *)
          echo "Aborting. Please install the required tools."
          exit 1
          ;;
      esac
    else
      echo "Continuing with limited tests (auto-confirmed)..."
    fi
  else
    echo "✅ All required tools are available."
  fi
}

Device Auto-Detection

One early challenge was identifying which device was the USB stick. I wanted the script to automatically detect a mounted USB device. My first version was clunky and error-prone.

detect_usb() {
  USB_DEVICE=$(lsblk -o NAME,TRAN,MOUNTPOINT -J | jq -r '.blockdevices[] | select(.tran=="usb") | .name' | head -n1)
  if [[ -z "$USB_DEVICE" ]]; then
    echo "❌ No USB device detected."
    exit 1
  fi
  USB_PATH="/dev/$USB_DEVICE"
  MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_PATH" | head -n1)
  if [[ -z "$MOUNT_PATH" ]]; then
    echo "❌ USB device is not mounted."
    exit 1
  fi
  echo "✅ Using USB device: $USB_PATH"
  echo "✅ Mounted at: $MOUNT_PATH"
}

After a few iterations, we (ChatGPT and I) settled on parsing lsblk with filters on tran=usb and hotplug=1, and selecting the first mounted partition.

We also added a fallback prompt in case auto-detection failed.

detect_usb() {
  if [[ -n "$USB_DEVICE" ]]; then
    echo "📎 Using provided USB device: $USB_DEVICE"
    MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
    return
  fi

  echo "🔍 Detecting USB device..."
  USB_DEVICE=""
  while read -r dev tran hotplug type _; do
    if [[ "$tran" == "usb" && "$hotplug" == "1" && "$type" == "disk" ]]; then
      base="/dev/$dev"
      part=$(lsblk -nr -o NAME,MOUNTPOINT "$base" | awk '$2 != "" {print "/dev/"$1; exit}')
      if [[ -n "$part" ]]; then
        USB_DEVICE="$part"
        break
      fi
    fi
  done < <(lsblk -o NAME,TRAN,HOTPLUG,TYPE,MOUNTPOINT -nr)

  if [ -z "$USB_DEVICE" ]; then
    echo "❌ No mounted USB partition found on any USB disk."
    lsblk -o NAME,TRAN,HOTPLUG,TYPE,SIZE,MOUNTPOINT -nr | grep part
    read -rp "Enter the USB device path manually (e.g., /dev/sdc1): " USB_DEVICE
  fi

  MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
  if [ -z "$MOUNT_PATH" ]; then
    echo "❌ USB device is not mounted."
    exit 1
  fi

  echo "✅ Using USB device: $USB_DEVICE"
  echo "✅ Mounted at: $MOUNT_PATH"
}

Finding the Test File

To avoid hardcoding filenames, we implemented logic to search for the latest Ubuntu ISO on the USB stick.

find_ubuntu_iso() {
  # Function to find an Ubuntu ISO on the USB device
  find "$MOUNT_PATH" -type f -regextype posix-extended \
    -regex ".*/ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\\.iso" | sort -V | tail -n1
}

Later, we enhanced it to accept a user-provided file, and even verify that the file was located on the USB stick. If it was not, the script would gracefully fall back to the Ubuntu ISO search.

find_test_file() {
  if [[ -n "$TEST_FILE" ]]; then
    echo "📎 Using provided test file: $(basename "$TEST_FILE")"
    
    # Check if the provided test file is on the USB device
    TEST_FILE_MOUNT_PATH=$(realpath "$TEST_FILE" | grep -oP "^$MOUNT_PATH")
    if [[ -z "$TEST_FILE_MOUNT_PATH" ]]; then
      echo "❌ The provided test file is not located on the USB device."
      # Look for an Ubuntu ISO if it's not on the USB
      TEST_FILE=$(find_ubuntu_iso)
    fi
  else
    TEST_FILE=$(find_ubuntu_iso)
  fi

  if [ -z "$TEST_FILE" ]; then
    echo "❌ No valid test file found."
    exit 1
  fi

  if [[ "$TEST_FILE" =~ ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\.iso ]]; then
    UBUNTU_VERSION=$(basename "$TEST_FILE" | grep -oP 'ubuntu-\d{2}\.\d{2}')
    echo "🧪 Selected Ubuntu version: $UBUNTU_VERSION"
  else
    echo "📎 Selected test file: $(basename "$TEST_FILE")"
  fi
}

Read Methods and Speed Extraction

To get a comprehensive view, we added multiple methods:

  • hdparm (direct disk access)
  • dd (simple block read)
  • dd + pv (with progress bar)
  • cat + pv (alternative stream reader)
  • ioping (random access)
  • fio (customizable benchmark tool)

The relevant part of the benchmark loop (excerpted from the full script further down) looks like this:
    if require hdparm; then
      drop_caches
      speed=$(sudo hdparm -t --direct "$USB_DEVICE" 2>/dev/null | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    drop_caches
    speed=$(dd if="$TEST_FILE" of=/dev/null bs=8k 2>&1 |& extract_speed)
    mb=$(speed_to_mb "$speed")
    echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
    TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
    echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    ((idx++))

    if require pv; then
      drop_caches
      FILESIZE=$(stat -c%s "$TEST_FILE")
      speed=$(dd if="$TEST_FILE" bs=8k status=none | pv -s "$FILESIZE" -f -X 2>&1 | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require pv; then
      drop_caches
      speed=$(cat "$TEST_FILE" | pv -f -X 2>&1 | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require ioping; then
      drop_caches
      speed=$(ioping -c 10 -A "$USB_DEVICE" 2>/dev/null | grep 'read' | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require fio; then
      drop_caches
      speed=$(fio --name=readtest --filename="$TEST_FILE" --direct=1 --rw=read --bs=8k \
            --size=100M --ioengine=libaio --iodepth=16 --runtime=5s --time_based --readonly \
            --minimal 2>/dev/null | awk -F';' '{print $6" KB/s"}' | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi

Parsing their outputs proved tricky. For example, pv outputs speed with or without spaces, and with different units. We created a robust extract_speed function with regex, and a speed_to_mb function that could handle both MB/s and MiB/s, with or without a space between value and unit.

extract_speed() {
  grep -oP '(?i)[\d.,]+\s*[KMG]i?B/s' | tail -1 | sed 's/,/./'
}

speed_to_mb() {
  if [[ "$1" =~ ([0-9.,]+)[[:space:]]*([a-zA-Z/]+) ]]; then
    value="${BASH_REMATCH[1]}"
    unit=$(echo "${BASH_REMATCH[2]}" | tr '[:upper:]' '[:lower:]')
  else
    echo "0"
    return
  fi

  case "$unit" in
    kb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v / 1000 }' ;;
    mb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
    gb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v * 1000 }' ;;
    kib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v / 1024 }' ;;
    mib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
    gib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v * 1024 }' ;;
    *) echo "0" ;;
  esac
}

Dropping Caches for Accurate Results

To prevent cached reads from skewing the results, each test run begins by dropping system caches using:

sync && echo 3 | sudo tee /proc/sys/vm/drop_caches

What it does:

Command                                Purpose
sync                                   Flushes all dirty (pending write) pages to disk
echo 3 > /proc/sys/vm/drop_caches      Clears page cache, dentries, and inodes from RAM

We wrapped this in a helper function and used it consistently.
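
For reference, this is the helper exactly as it appears in the full script further down:

drop_caches() {
  echo "🧹 Dropping system caches..."
  if [[ $EUID -ne 0 ]]; then
    echo "  (requires sudo)"
  fi
  sudo sh -c "sync && echo 3 > /proc/sys/vm/drop_caches"
}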


Multiple Runs and Averaging

We made the script repeat each test N times (default: 3), collect results, compute averages, and display a summary at the end.

  echo "📊 Read-only USB benchmark started ($RUNS run(s))"
  echo "==================================="

  declare -A TEST_NAMES=(
    [1]="hdparm"
    [2]="dd"
    [3]="dd + pv"
    [4]="cat + pv"
    [5]="ioping"
    [6]="fio"
  )

  declare -A TOTAL_MB
  for i in {1..6}; do TOTAL_MB[$i]=0; done
  CSVFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).csv"
  echo "Test,Run,Speed (MB/s)" > "$CSVFILE"

  for ((run=1; run<=RUNS; run++)); do
    echo "▶ Run $run"
    idx=1

  ### tests run here

  echo "📄 Summary of average results for $UBUNTU_VERSION:"
  echo "==================================="
  SUMMARY_TABLE=""
  for i in {1..6}; do
    if [[ ${TOTAL_MB[$i]} != 0 ]]; then
      avg=$(echo "scale=2; ${TOTAL_MB[$i]} / $RUNS" | bc)
      echo "${TEST_NAMES[$i]} average: $avg MB/s"
      RESULTS+=("${TEST_NAMES[$i]} average: $avg MB/s")
      SUMMARY_TABLE+="${TEST_NAMES[$i]},$avg\n"
    fi
  done

Output Formats

To make the results user-friendly, we added:

  • A clean table view
  • CSV export for spreadsheets
  • Log file for later reference

The corresponding excerpt from the script:
  if [[ "$VISUAL" == "table" || "$VISUAL" == "both" ]]; then
    echo -e "📋 Table view:"
    echo -e "Test Method,Average MB/s\n$SUMMARY_TABLE" | column -t -s ','
  fi

  if [[ "$VISUAL" == "bar" || "$VISUAL" == "both" ]]; then
    if require gnuplot; then
      echo -e "$SUMMARY_TABLE" | awk -F',' '{print $1" "$2}' | \
      gnuplot -p -e "
        set terminal dumb;
        set title 'USB Read Benchmark Results ($UBUNTU_VERSION)';
        set xlabel 'Test Method';
        set ylabel 'MB/s';
        plot '-' using 2:xtic(1) with boxes notitle
      "
    fi
  fi

  LOGFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).log"
  {
    echo "Benchmark for USB device: $USB_DEVICE"
    echo "Mounted at: $MOUNT_PATH"
    echo "Ubuntu version: $UBUNTU_VERSION"
    echo "Test file: $TEST_FILE"
    echo "Timestamp: $(date)"
    echo "Number of runs: $RUNS"
    echo ""
    echo "Read speed averages:"
    for line in "${RESULTS[@]}"; do
      echo "$line"
    done
  } > "$LOGFILE"

  echo "📝 Results saved to: $LOGFILE"
  echo "📈 CSV exported to: $CSVFILE"
  echo "==================================="

The Full Script

Here is the complete version of the script used to benchmark the read performance of a USB drive:

#!/bin/bash

# ==========================
# CONFIGURATION
# ==========================
RESULTS=()
USB_DEVICE=""
TEST_FILE=""
RUNS=1
VISUAL="none"
SUMMARY=0

# (Consider grouping related configuration into a config file or associative array if script expands)

# ==========================
# ARGUMENT PARSING
# ==========================
while [[ $# -gt 0 ]]; do
  case $1 in
    --device)
      USB_DEVICE="$2"
      shift 2
      ;;
    --file)
      TEST_FILE="$2"
      shift 2
      ;;
    --runs)
      RUNS="$2"
      shift 2
      ;;
    --visual)
      VISUAL="$2"
      shift 2
      ;;
    --summary)
      SUMMARY=1
      shift
      ;;
    --yes|--force)
      FORCE_YES=1
      shift
      ;;
    *)
      echo "Unknown option: $1"
      exit 1
      ;;
  esac
done

# ==========================
# TOOL CHECK
# ==========================
ALL_TOOLS=(hdparm dd pv ioping fio lsblk stat grep awk find sort basename column gnuplot)
MISSING_TOOLS=()

require() {
  if ! command -v "$1" >/dev/null; then
    return 1
  fi
  return 0
}

check_required_tools() {
  echo "🔍 Checking required tools..."
  for tool in "${ALL_TOOLS[@]}"; do
    if ! require "$tool"; then
      MISSING_TOOLS+=("$tool")
    fi
  done

  if [[ ${#MISSING_TOOLS[@]} -gt 0 ]]; then
    echo "⚠️  The following tools are missing: ${MISSING_TOOLS[*]}"
    echo "You can install them using: sudo apt install ${MISSING_TOOLS[*]}"
    if [[ -z "$FORCE_YES" ]]; then
      read -rp "Do you want to continue and skip tests that require them? (y/N): " yn
      case $yn in
        [Yy]*)
          echo "Continuing with limited tests..."
          ;;
        *)
          echo "Aborting. Please install the required tools."
          exit 1
          ;;
      esac
    else
      echo "Continuing with limited tests (auto-confirmed)..."
    fi
  else
    echo "✅ All required tools are available."
  fi
}

# ==========================
# AUTO-DETECT USB DEVICE
# ==========================
detect_usb() {
  if [[ -n "$USB_DEVICE" ]]; then
    echo "📎 Using provided USB device: $USB_DEVICE"
    MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
    return
  fi

  echo "🔍 Detecting USB device..."
  USB_DEVICE=""
  while read -r dev tran hotplug type _; do
    if [[ "$tran" == "usb" && "$hotplug" == "1" && "$type" == "disk" ]]; then
      base="/dev/$dev"
      part=$(lsblk -nr -o NAME,MOUNTPOINT "$base" | awk '$2 != "" {print "/dev/"$1; exit}')
      if [[ -n "$part" ]]; then
        USB_DEVICE="$part"
        break
      fi
    fi
  done < <(lsblk -o NAME,TRAN,HOTPLUG,TYPE,MOUNTPOINT -nr)

  if [ -z "$USB_DEVICE" ]; then
    echo "❌ No mounted USB partition found on any USB disk."
    lsblk -o NAME,TRAN,HOTPLUG,TYPE,SIZE,MOUNTPOINT -nr | grep part
    read -rp "Enter the USB device path manually (e.g., /dev/sdc1): " USB_DEVICE
  fi

  MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
  if [ -z "$MOUNT_PATH" ]; then
    echo "❌ USB device is not mounted."
    exit 1
  fi

  echo "✅ Using USB device: $USB_DEVICE"
  echo "✅ Mounted at: $MOUNT_PATH"
}

# ==========================
# FIND TEST FILE
# ==========================
find_ubuntu_iso() {
  # Function to find an Ubuntu ISO on the USB device
  find "$MOUNT_PATH" -type f -regextype posix-extended \
    -regex ".*/ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\\.iso" | sort -V | tail -n1
}

find_test_file() {
  if [[ -n "$TEST_FILE" ]]; then
    echo "📎 Using provided test file: $(basename "$TEST_FILE")"
    
    # Check if the provided test file is on the USB device
    TEST_FILE_MOUNT_PATH=$(realpath "$TEST_FILE" | grep -oP "^$MOUNT_PATH")
    if [[ -z "$TEST_FILE_MOUNT_PATH" ]]; then
      echo "❌ The provided test file is not located on the USB device."
      # Look for an Ubuntu ISO if it's not on the USB
      TEST_FILE=$(find_ubuntu_iso)
    fi
  else
    TEST_FILE=$(find_ubuntu_iso)
  fi

  if [ -z "$TEST_FILE" ]; then
    echo "❌ No valid test file found."
    exit 1
  fi

  if [[ "$TEST_FILE" =~ ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\.iso ]]; then
    UBUNTU_VERSION=$(basename "$TEST_FILE" | grep -oP 'ubuntu-\d{2}\.\d{2}')
    echo "🧪 Selected Ubuntu version: $UBUNTU_VERSION"
  else
    echo "📎 Selected test file: $(basename "$TEST_FILE")"
  fi
}



# ==========================
# SPEED EXTRACTION
# ==========================
extract_speed() {
  grep -oP '(?i)[\d.,]+\s*[KMG]i?B/s' | tail -1 | sed 's/,/./'
}

speed_to_mb() {
  if [[ "$1" =~ ([0-9.,]+)[[:space:]]*([a-zA-Z/]+) ]]; then
    value="${BASH_REMATCH[1]}"
    unit=$(echo "${BASH_REMATCH[2]}" | tr '[:upper:]' '[:lower:]')
  else
    echo "0"
    return
  fi

  case "$unit" in
    kb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v / 1000 }' ;;
    mb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
    gb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v * 1000 }' ;;
    kib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v / 1024 }' ;;
    mib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
    gib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v * 1024 }' ;;
    *) echo "0" ;;
  esac
}

drop_caches() {
  echo "🧹 Dropping system caches..."
  if [[ $EUID -ne 0 ]]; then
    echo "  (requires sudo)"
  fi
  sudo sh -c "sync && echo 3 > /proc/sys/vm/drop_caches"
}

# ==========================
# RUN BENCHMARKS
# ==========================
run_benchmarks() {
  echo "📊 Read-only USB benchmark started ($RUNS run(s))"
  echo "==================================="

  declare -A TEST_NAMES=(
    [1]="hdparm"
    [2]="dd"
    [3]="dd + pv"
    [4]="cat + pv"
    [5]="ioping"
    [6]="fio"
  )

  declare -A TOTAL_MB
  for i in {1..6}; do TOTAL_MB[$i]=0; done
  CSVFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).csv"
  echo "Test,Run,Speed (MB/s)" > "$CSVFILE"

  for ((run=1; run<=RUNS; run++)); do
    echo "▶ Run $run"
    idx=1

    if require hdparm; then
      drop_caches
      speed=$(sudo hdparm -t --direct "$USB_DEVICE" 2>/dev/null | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    drop_caches
    speed=$(dd if="$TEST_FILE" of=/dev/null bs=8k 2>&1 |& extract_speed)
    mb=$(speed_to_mb "$speed")
    echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
    TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
    echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    ((idx++))

    if require pv; then
      drop_caches
      FILESIZE=$(stat -c%s "$TEST_FILE")
      speed=$(dd if="$TEST_FILE" bs=8k status=none | pv -s "$FILESIZE" -f -X 2>&1 | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require pv; then
      drop_caches
      speed=$(cat "$TEST_FILE" | pv -f -X 2>&1 | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require ioping; then
      drop_caches
      speed=$(ioping -c 10 -A "$USB_DEVICE" 2>/dev/null | grep 'read' | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require fio; then
      drop_caches
      speed=$(fio --name=readtest --filename="$TEST_FILE" --direct=1 --rw=read --bs=8k \
            --size=100M --ioengine=libaio --iodepth=16 --runtime=5s --time_based --readonly \
            --minimal 2>/dev/null | awk -F';' '{print $6" KB/s"}' | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
  done

  echo "📄 Summary of average results for $UBUNTU_VERSION:"
  echo "==================================="
  SUMMARY_TABLE=""
  for i in {1..6}; do
    if [[ ${TOTAL_MB[$i]} != 0 ]]; then
      avg=$(echo "scale=2; ${TOTAL_MB[$i]} / $RUNS" | bc)
      echo "${TEST_NAMES[$i]} average: $avg MB/s"
      RESULTS+=("${TEST_NAMES[$i]} average: $avg MB/s")
      SUMMARY_TABLE+="${TEST_NAMES[$i]},$avg\n"
    fi
  done

  if [[ "$VISUAL" == "table" || "$VISUAL" == "both" ]]; then
    echo -e "📋 Table view:"
    echo -e "Test Method,Average MB/s\n$SUMMARY_TABLE" | column -t -s ','
  fi

  if [[ "$VISUAL" == "bar" || "$VISUAL" == "both" ]]; then
    if require gnuplot; then
      echo -e "$SUMMARY_TABLE" | awk -F',' '{print $1" "$2}' | \
      gnuplot -p -e "
        set terminal dumb;
        set title 'USB Read Benchmark Results ($UBUNTU_VERSION)';
        set xlabel 'Test Method';
        set ylabel 'MB/s';
        plot '-' using 2:xtic(1) with boxes notitle
      "
    fi
  fi

  LOGFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).log"
  {
    echo "Benchmark for USB device: $USB_DEVICE"
    echo "Mounted at: $MOUNT_PATH"
    echo "Ubuntu version: $UBUNTU_VERSION"
    echo "Test file: $TEST_FILE"
    echo "Timestamp: $(date)"
    echo "Number of runs: $RUNS"
    echo ""
    echo "Read speed averages:"
    for line in "${RESULTS[@]}"; do
      echo "$line"
    done
  } > "$LOGFILE"

  echo "📝 Results saved to: $LOGFILE"
  echo "📈 CSV exported to: $CSVFILE"
  echo "==================================="
}

# ==========================
# MAIN
# ==========================
check_required_tools
detect_usb
find_test_file
run_benchmarks

You can also find the latest revision of this script as a GitHub Gist.


Lessons Learned

This script has grown from a simple one-liner into a reliable tool to test USB read performance. Working with ChatGPT sped up development significantly, especially for bash edge cases and regex. But more importantly, it helped guide the evolution of the script in a structured way, with clean modular functions and consistent formatting.


Conclusion

This has been a fun and educational project. Whether you are benchmarking your own USB drives or just want to learn more about shell scripting, I hope this walkthrough is helpful.

Next up? Maybe a graphical version, or write benchmarking on a RAM disk to avoid damaging flash storage.

Stay tuned—and let me know if you use this script or improve it!

April 15, 2025

If you are testing MySQL with sysbench, here is an RPM version for Fedora 31 and OL 8 & 9, linked with the latest libmysql (libmysqlclient.so.24) from MySQL 9.3. This version of sysbench comes from the latest master branch on GitHub. I used version 1.1, but this is to make a differentiation with the code […]

April 14, 2025

In search of lost attention

Instant messaging and politics

You have certainly seen it go by: an American journalist was invited by mistake into a Signal chat in which very highly placed members of the American administration (including the vice-president) were discussing the top-secret planning of a military strike in Yemen on March 15.

The reason for this mistake is that Trump's spokesman, Brian Hughes, had received an email from the journalist in question during the election campaign, asking for clarification on another subject. Brian Hughes then copied and pasted the entire email, including the signature containing the journalist's phone number, into an Apple iMessage to Mike Waltz, who would go on to become Trump's national security adviser. Having received this number in a message from Brian Hughes, Mike Waltz apparently saved it under Brian Hughes's name. When he later wanted to invite Brian Hughes into the Signal chat, Mike Waltz invited the American journalist by mistake.

This anecdote teaches us several things.

First, Signal has become genuinely critical security infrastructure, including in the highest circles.

Second, ultra-strategic war discussions now take place... by chat. It is not hard to imagine each participant replying on autopilot, posting an emoji between two meetings or during a bathroom break. And that is where the life and death of the rest of the world gets decided: in the toilets and in meetings about something else entirely!

The initial mistake stems from the fact that Mike Waltz most likely does not read his emails (otherwise the email would have been forwarded to him rather than sent by message) and that Brian Hughes is incapable of summarizing a long text efficiently (otherwise he would not have pasted the entire message).

Not only does Mike Waltz not read his emails, we may suspect that he does not read messages that are too long either: he added a phone number found at the end of a message without taking the time to read and understand said message. In his defense, it seems possible that it was the iPhone's "artificial intelligence" that automatically added the number to the contact.

I do not know whether that feature exists, but using a phone that can decide on its own to change its contacts' numbers is rather frightening. And very much in the style of Apple, whose marketing slogans I read as "buy, along with our products, the intelligence you lack, you morons!"

A political attention crisis and generalized surveillance

The attention crisis is real: we are less and less able to concentrate, and we vote for people who are even less able! A friend who was hired to work on an election campaign in Belgium told me he was stunned by how addicted the most prominent politicians are to social media. They are permanently glued to their screens, counting the likes and shares on their posts, and when they receive a briefing longer than ten lines, they ask their advisers for an ultra-short summary.

Your politicians understand nothing about anything. They pretend. And now they ask ChatGPT, which has the advantage of never sleeping, unlike human advisers. The famous artificial intelligences which, precisely, may be guilty of having added the number to that contact and of having written Trump's tax policy.

But why use Signal rather than an official solution that would prevent this kind of leak? Officially, there is no alternative that is as easy to use. But I see a very simple unofficial reason: highly placed people are now afraid of their own infrastructure, because they know that everything is saved and may be used against them in a possible investigation or trial, even years later.

Trump was elected the first time by campaigning on the fact that Hillary Clinton had used a personal email server, which, according to Trump himself, allowed her to escape justice by keeping her emails out of reach of America's internal surveillance services.

Even those who put the generalized surveillance system in place are afraid of it.

Education for understanding

The last lesson I draw from this anecdote is, once again, about education: you can have the most secure cryptographic infrastructure in the world, but if you are incompetent enough to invite just anyone into your chat, nothing can be done for you.

The biggest security hole is always between the chair and the keyboard; the only way to secure a system is to make sure the user is educated.

The best example remains self-driving cars: we are putting entire generations into Teslas that drive themselves 99% of the time. And when an accident happens, in the remaining 1%, we ask the driver: "Why didn't you react like a good driver?"

And the answer is very simple: "Because I have never driven in my life, I do not know what driving is, I never learned how to react when the system does not work properly."

You think I am exaggerating? Wait...

Getting hired thanks to AI

Eric Lu received the résumé of a very promising candidate for a job at his startup. A résumé that seemed heavily keyword-optimized, but that was remarkably sharp on the technologies Eric uses. So he offered the candidate a video interview.

At first everything went very well, until the candidate started getting tangled up in his answers. "You say the SMS-sending service you worked on was overloaded, but you describe the service as being used by a class of 30 people. How can 30 text messages overload the service?" ... uh... "Can you tell me what user interface you built on top of what you say you implemented?" ... uh, I don't remember...

Eric then understands that the candidate is bluffing. The résumé was generated by ChatGPT. The candidate prepared by simulating a job interview with ChatGPT and learning his answers by heart. He panics as soon as the conversation leaves his script.

What is particularly unfortunate is that the candidate actually had a well-suited profile. If he had been honest and upfront about his lack of experience, he could have been hired as a junior and gained the experience he wanted. If he had spent his time reading technical explanations of the technologies involved rather than using ChatGPT, he could have convinced the employer of his motivation and curiosity. "I don't know much yet, but I am eager to learn."

But the saddest part of all this is that he sincerely thought it could work. He destroyed his reputation because it never even crossed his mind that, even if he had been hired, he would not have lasted two days in the job before looking like an idiot. He was dishonest because he was convinced that this was the right way to operate.

In short, he was a true Julius.

He "learned to drive a Tesla" by sitting in the seat and watching it loop around the neighborhood a hundred times. Confident, he set off for another city and drove straight into the first plane tree.

Saving a generation

Smartphones, AI, advertising monopolies and social networks are all facets of the same problem: the will to make technology incomprehensible in order to enslave us commercially and keep our minds occupied.

I have written about how I think we should act to educate the next generation of adults:

But that is a parent's point of view. That is why I find the analysis by Thual so relevant: he is a young adult barely out of adolescence, and he can speak about all of this in the first person.

The big lesson I draw from it is that the generation coming after us is far from lost. Like every generation, it is eager to learn and to fight. We must have the humility to realize that my generation has screwed up completely. That we are destroying everything, that we are fascists addicted to Facebook and Candy Crush, driving around in SUVs.

We have no lessons to give them. We have a duty to help them, to put ourselves at their service, by switching off the autopilot and burning the PowerPoint slides we are so proud of.

I am Ploum and I have just published Bikepunk, an eco-cycling fable typed entirely on a mechanical typewriter. To support me, buy my books (if possible from your local bookshop)!

Receive my writings in French and English directly by email. Your address will never be shared. You can also use my French-language RSS feed or the full RSS feed.

April 13, 2025

Seven years ago, I wrote a post about a tiny experiment: publishing my phone's battery status to my website. The updates have quietly continued ever since, appearing at https://dri.es/status.

Every 20 minutes or so, my phone sends its battery level and charging state to a REST endpoint on my Drupal site. The exact timing depends on iOS background scheduling, which has a mind of its own.
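
Conceptually, each update is nothing more than a small authenticated HTTP request. A minimal sketch of the idea (the endpoint path, field names and authentication shown here are illustrative placeholders, not my site's actual API):

# Hypothetical example of the kind of request the phone sends
curl -X POST "https://dri.es/status/update" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"battery_level": 82, "charging": false}'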

For years, this lived quietly at https://dri.es/status. I never linked to it outside the original blog post, so it felt like a forgotten corner of my site. Still working, but mostly invisible.

Despite its low profile, people still mention it occasionally after all this time. This prompted me to bring it into the light.

I have now added a battery icon to my site's header. It's a dynamically generated SVG that displays my phone's current battery level and charging state.

It's a bit goofy, but that is what makes personal websites special. You get to experiment with it and make it yours.

April 10, 2025

De l’utilisation des smartphones et des tablettes chez les adolescents

Chers parents, chers enseignants, chers éducateurs,

Nous le savons toutes et tous, le smartphone est devenu un objet incontournable de notre quotidien, nous connectant en permanence au réseau Internet qui, avant cela, restait cantonné aux ordinateurs sur nos bureaux. En voyant grandir nos enfants, la question se pose : quand, comment et pourquoi les faire entrer dans le monde de cette hyperconnexion permanente.

L’adolescence est une phase critique de la vie durant laquelle le cerveau est particulièrement réceptif et forme des réflexes qui resteront ancrés toute une vie. C’est également une période durant laquelle la pression du groupe et le désir de conformité sociale sont les plus importants. Ce n’est pas un hasard si les producteurs de cigarettes et d’alcool ciblent explicitement les adolescents dans le marketing de leur produit.

Le smartphone étant une invention incroyablement récente, nous manquons totalement de recul sur l’impact qu’il peut avoir durant la croissance. Est-il totalement inoffensif ou sera-t-il considéré, d’ici quelques années, comme le tabac l’est aujourd’hui ? Personne ne le sait avec certitude. Nos enfants sont les cobayes de cette technologie.

Il me parait important de souligner certains points importants, qui ne sont que quelques éléments parmi les nombreuses problématiques étudiées dans le domaine

L’attention et la concentration

Il est désormais démontré que le smartphone perturbe grandement l’attention et la concentration, y compris chez les adultes. Ce n’est pas un hasard : il est conçu pour cela. Les entreprises comme Google et Meta (Facebook, Whatsapp, Instagram) sont payées proportionnellement au temps que nous passons devant l’écran. Tout est optimisé en ce sens. Le simple fait d’avoir un téléphone près de soi, même éteint, perturbe le raisonnement et fait baisser sensiblement les résultats de tests de QI.

Le cerveau acquiert le réflexe d’attendre des notifications de nouveaux messages de cet appareil, sa seule présence est donc un handicap majeur dans toutes les tâches qui requièrent de l’attention : lecture, apprentissage, réflexion, calculs. Il ne suffit pas de l’éteindre : il faut le mettre à distance, si possible dans une pièce différente !

Il est démontré que l’utilisation des réseaux sociaux comme Tik-Tok perturbe complètement la notion du temps et la formation de la mémoire. Nous en avons tous fait l’expérience : nous jurons avoir passé 10 minutes sur notre smartphone alors qu’il s’est en réalité écoulé près d’une heure.

Pour mémoriser et apprendre, le cerveau a besoin de temps de repos, de vide, d’ennui et de réflexion. Ces nécessaires temps « morts » dans les trajets, dans les files d’attente, dans la solitude d’une chambre d’adolescent voire même durant un cours rébarbatif ont été supplantés par une hyperconnexion.

L’angoisse sociale et la perturbation du sommeil

Même lorsque nous ne l’utilisons pas, nous savons que les conversations continuent. Que des messages importants sont peut-être échangés en notre absence. Cette sensation bien connue appelée « FOMO » (Fear Of Missing Out, peur de manquer quelque chose) nous pousse à consulter notre téléphone jusque tard dans la nuit et dès le réveil. Une proportion inquiétante de jeunes reconnaissent se réveiller durant la nuit pour consulter leur smartphone. Or la qualité du sommeil est fondamentale dans le processus d’apprentissage et de formation du cerveau.

La santé mentale

De récentes avancées démontrent une corrélation forte entre le degré d’utilisation des réseaux sociaux et les symptômes de dépression. Le monde occidental semble atteint d’une épidémie de dépression adolescente, épidémie dont la temporalité correspond exactement avec l’apparition du smartphone. Les filles en dessous de 16 ans sont la population la plus touchée.

Le harcèlement et la prédation

Sur les réseaux sociaux, il est trivial de créer un compte anonyme ou usurpant l’identité d’une autre personne (contrairement à ce qu’il est parfois affirmé dans les médias, il n’est pas nécessaire d’être un génie de l’informatique pour mettre un faux nom dans un formulaire). À l’abri sous cet anonymat, il est parfois très tentant de faire des blagues de mauvais goût, de tenir des propos injurieux, de révéler aux grands jours les secrets dont les adolescents sont friands voire de calomnier pour régler des différends de cours de récré. Ces comportements ont toujours fait partie de l’adolescence et font partie d’une exploration naturelle normale des relations sociales. Cependant, le fonctionnement des réseaux sociaux aggrave fortement l’impact de ces actions tout en favorisant l’impunité du responsable. Cela peut conduire à des conséquences graves allant au-delà de ce qu’imaginent initialement les participants.

Ce pseudonymat est également une bénédiction pour les personnes mal intentionnées qui se font passer pour des enfants et, après des semaines de discussion, proposent à l’enfant de se retrouver en vrai, mais sans rien dire aux adultes.

Au lieu d’en tirer des leçons sociales éducatives, nous appelons les adolescents faisant des blagues de mauvais goût des « pirates informatiques », stigmatisant l’utilisation de la technologie plutôt que le comportement. Le thème des prédateurs sexuels est mis en exergue pour réclamer à cor et à cri des solutions de contrôle technologiques. Solutions que les géants de l’informatique se font un plaisir de nous vendre, jouant sur la peur et stigmatisant la technologie ainsi que celles et ceux qui ont le malheur d’en avoir une compréhension intuitive.

La peur et l’incompréhension deviennent les moteurs centraux pour mettre en avant une seule valeur éducative : obéir aveuglément à ce qui est incompréhensible et ce qu’il ne faut surtout pas essayer de comprendre.

La fausse idée de l’apprentissage de l’informatique

Car il faut à tout prix déconstruire le mythe de la « génération numérique ».

Contrairement à ce qui est parfois exprimé, l’utilisation d’un smartphone ou d’une tablette ne prépare en rien à l’apprentissage de l’informatique. Les smartphones sont, au contraire, conçus pour cacher la manière dont ils fonctionnent et sont majoritairement utilisés pour discuter et suivre des publications sponsorisées. Ils préparent à l’informatique autant que lire un magazine people à l’arrière d’un taxi prépare à devenir mécanicien. Ce n’est pas parce que vous êtes assis dans une voiture que vous apprenez son fonctionnement.

Une dame de 87 ans se sert d’une tablette sans avoir été formée, mais il faudrait former les enfants à l’école ? Une dame de 87 ans se sert d’une tablette sans avoir été formée, mais il faudrait former les enfants à l’école ?

Former à utiliser Word ou PowerPoint ? Les enfants doivent apprendre à découvrir les généralités des logiciels, à tester, à « chipoter », pas à reproduire à l’aveugle un comportement propre à un logiciel propriétaire donné afin de les préparer à devenir des clients captifs. Et que dire d’un PowerPoint qui force à casser la textualité, la capacité d’écriture pour réduire des idées complexes sous forme de bullet points ? Former à PowerPoint revient à inviter ses élèves dans un fast-food sous prétexte de leur apprendre à cuisiner.

L’aspect propriétaire et fermé de ces logiciels est incroyablement pervers. Introduire Microsoft Windows, Google Android ou Apple iOS dans les classes, c’est forcer les étudiants à fumer à l’intérieur sans ouvrir les fenêtres pour en faire de bons apnéistes qui savent retenir leur souffle. C’est à la fois dangereusement stupide et contre-productif.

De manière étonnante, c’est d’ailleurs dans les milieux de l’informatique professionnelle que l’on trouve le plus de personnes retournant aux « dumbphones », téléphones simples. Car, comme dit le proverbe « Quand on sait comment se prépare la saucisse, on perd l’envie d’en manger… »

Que faire ?

Le smartphone est omniprésent. Chaque génération transmet à ses enfants ses propres peurs. S’il y a tant de discussions, de craintes, de volonté « d’éducation », c’est avant tout parce que la génération des parents d’aujourd’hui est celle qui est le plus addict à son smartphone, qui est la plus espionnée par les monopoles publicitaires. Nous avons peur de l’impact du smartphone sur nos enfants parce que nous nous rendons confusément compte de ce qu’il nous inflige.

Mais les adolescents ne sont pas forcés d’être aussi naïfs que nous face à la technologie.

Commencer le plus tard possible

Les pédiatres et les psychiatres recommandent de ne pas avoir une utilisation régulière du smartphone avant 15 ou 16 ans, le système nerveux et visuel étant encore trop sensible avant cela. Les adolescents eux-mêmes, lorsqu’on les interroge, considèrent qu’ils ne devraient pas avoir de téléphone avant 12 ou 13 ans.

Si une limite d’âge n’est pas réaliste pour tout le monde, il semble important de retarder au maximum l’utilisation quotidienne et régulière du smartphone. Lorsque votre enfant devient autonome, privilégiez un « dumbphone », un simple téléphone lui permettant de vous appeler et de vous envoyer des SMS. Votre enfant arguera, bien entendu, qu’il est le seul de sa bande à ne pas avoir de smartphone. Nous avons tous été adolescents et utilisé cet argument pour nous habiller avec le dernier jeans à la mode.

Comme le signale Jonathan Haidt dans son livre « The Anxious Generation », il y a un besoin urgent de prendre des actions collectives. Nous offrons des téléphones de plus en plus tôt à nos enfants, car ils nous disent « Tout le monde en a sauf moi ». Nous cédons, sans le savoir, nous forçons d’autres parents à céder. Des expériences pilotes d’écoles « sans téléphone » montrent des résultats immédiats en termes de bien-être et de santé mentale des enfants..

Parlez-en avec les autres parents. Développez des stratégies ensemble qui permettent de garder une utilisation raisonnable du smartphone tout en évitant l’exclusion du groupe, ce qui est la plus grande hantise de l’adolescent.

Discutez en amont avec votre enfant

Expliquez à votre enfant les problématiques liées au smartphone. Plutôt que de prendre des décisions arbitraires, consultez-le et discutez avec lui de la meilleure manière pour lui d’entrer dans le monde connecté. Établissez un lien de confiance en lui expliquant de ne jamais faire confiance à ce qu’il pourra lire sur le téléphone.

Dans le doute, il doit avoir le réflexe d’en discuter avec vous.

Introduce the tool gradually

Do not leave your child to figure out a smartphone on their own the moment your age limit is reached.

Well before that, show them how you use your own smartphone and your computer. Show them the same Wikipedia page on both devices, explaining that it is just one way of displaying content that lives on another computer.

When your child gets their own device, introduce it gradually, allowing its use only in specific situations. You can, for example, keep the phone and hand it over only when the child asks for it, for a limited time and a specific purpose. Do not immediately create accounts on every trendy platform. Observe with them the habits they pick up, and talk about the permanent flood that Whatsapp groups are.

Talk about privacy

Remind your child that the goal of the monopolistic platforms is to spy on you constantly in order to sell your private life and bombard you with advertising. Everything said and posted on social networks, including photos, must be considered public; secrecy is only an illusion. A golden rule: never post anything you would not be comfortable seeing displayed in large print on the school walls.

In Denmark, schools are no longer allowed to use Chromebooks, so as not to violate children's privacy. But do not believe that Android, Windows or iOS are any better when it comes to privacy.

Not in the bedroom

Never let your child sleep with their phone. In the evening, the phone should be stored in a neutral place, out of reach. Likewise, keep the phone out of reach while the child is doing homework. The same goes for tablets and laptops, which serve exactly the same functions. Ideally, screens should be avoided before school, so the day does not start in a state of attention fatigue. Remember that the smartphone can carry disturbing, even shocking, yet strangely hypnotic messages and images. The effect of screen light on sleep quality is also a problem that is still poorly understood.

Keep the conversation going

There is software known as "parental controls". But no software will ever replace a parent's presence. Worse: the most resourceful children will very quickly find tricks to get around these limitations, or will be tempted to get around them simply because they are arbitrary. Rather than imposing electronic control, take the time to ask your children what they do on their phone, who they talk to, what is being said, and which apps they use.

Using the Internet can also be very beneficial, allowing children to learn about subjects outside the school curriculum or to discover communities with interests different from those found at school.

Just as you let your child join a sports club or the scouts while keeping them from hanging around with a street gang, you need to keep an eye on who your children spend time with online. Far from the school Whatsapp groups, your child can find online communities that share their interests, communities in which they can learn, discover and thrive if they are well guided.

Set the example, be the example!

Our children do not do what we tell them to do; they do what they see us doing. Children who have seen their parents smoke are the most likely to become smokers themselves. The same goes for smartphones. If our child constantly sees us on our phone, they have no choice but to want to imitate us. One of the finest gifts you can give is therefore not to use your phone compulsively in your child's presence.

Yes, you need to acknowledge and deal with your own addiction!

Plan periods when you put the phone on silent or in airplane mode and store it out of the way. When you do pick it up, explain to your child what you are using it for.

In front of them, sit down and read a paper book. And no, reading on an iPad is not "the same thing".

By the way, if you are short on ideas, I can only recommend my latest novel: a thrilling adventure written on a typewriter, about bicycles, adolescence, the end of the world and smartphones switched off forever. Yes, advertising has even crept into this text, what a scandal!

Pass on a taste for computing, not for being controlled

Don't shoot the messenger: the culprit is not "the screen" but what we do with it. The tech monopolies try to make users addicted and captive, to bombard them with advertising, to make them consume. That is where the responsibility lies.

Learning to program (which, at first, works very well without a screen), playing deep video games with complex or simply funny stories to have a good time, chatting online with fellow enthusiasts, devouring Wikipedia… Modern computing opens magnificent doors that it would be a shame to deprive our children of.

Instead of giving in to our own fears, anxieties and misunderstandings, we must give our children the desire to take back control of computing and of our lives, a control we have handed over a little too easily to the advertising monopolies in exchange for a glass rectangle displaying colourful icons.

A child is surprised that a book has disappeared from her tablet; the teacher explains that companies decided the book was not good for her.

Accept imperfection

"I used to have principles; now I have children," as the saying goes. It is impossible to be perfect. Whatever we do, our children will be exposed to toxic conversations and mindless cartoons, and that is perfectly normal. As parents, we do what we can, within our own realities.

Nobody is perfect. Least of all a parent.

What matters is not preventing our children from ever being in front of a screen, but realizing that a smartphone is in no way an educational tool, and that it prepares them for nothing other than becoming good passive consumers.

The only truly necessary lesson is a critical mind in the use of any computing tool.

And in that lesson, children often have a lot to teach adults!

I am Ploum and I have just published Bikepunk, an ecological-cyclist fable typed entirely on a mechanical typewriter. To support me, buy my books (if possible from your local bookshop)!

Receive my writings in French and English directly by email. Your address will never be shared. You can also use my French-language RSS feed or the full RSS feed.

April 09, 2025

Introduction

When I upgraded from an old 8GB USB stick to a shiny new 256GB one, I expected faster speeds and more convenience—especially for carrying around multiple bootable ISO files using Ventoy. With modern Linux distributions often exceeding 4GB per ISO, my old drive could barely hold a single image. But I quickly realized that storage space was only half the story—performance matters too.

Curious about how much of an upgrade I had actually made, I decided to benchmark the read speed of both USB sticks. Instead of hunting down benchmarking tools or manually comparing outputs, I turned to ChatGPT to help me craft a reliable, repeatable shell script that could automate the entire process. In this post, I’ll share how ChatGPT helped me go from an idea to a functional USB benchmark script, and what I learned along the way.


The Goal

I wanted to answer a few simple but important questions:

  • How much faster is my new USB stick compared to the old one?
  • Do different USB ports affect read speeds?
  • How can I automate these tests and compare the results?

But I also wanted a reusable script that would do the following (a minimal sketch follows the list):

  • Detect the USB device automatically
  • Find or use a test file on the USB stick
  • Run several types of read benchmarks
  • Present the results clearly, with support for summary and CSV export
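
To give a flavour of what that looks like in practice, here is a minimal sketch of the auto-detection and a simple sequential read test. It is not the full script discussed in this post; the lsblk/find heuristics, the 8M block size and the use of the largest file on the stick are my own assumptions about a typical single-stick setup.

  #!/usr/bin/env bash
  # Minimal sketch: find the first mounted removable partition and time a raw read.
  # Assumes a GNU/Linux system with lsblk, GNU findutils and coreutils.
  set -eu

  # First removable, mounted partition, e.g. /dev/sdb1 mounted on /media/$USER/VENTOY
  read -r DEVICE MOUNTPOINT < <(lsblk -rpno NAME,RM,TYPE,MOUNTPOINT \
    | awk '$2 == 1 && $3 == "part" && $4 != "" {print $1, $4; exit}')
  echo "Benchmarking $DEVICE mounted on $MOUNTPOINT"

  # Use the largest file on the stick as test data (ideally a multi-GB ISO).
  TESTFILE=$(find "$MOUNTPOINT" -type f -printf '%s %p\n' | sort -nr | head -n 1 | cut -d' ' -f2-)
  echo "Using test file: $TESTFILE"

  # Drop the page cache so we measure the device, not RAM (needs root).
  sync && echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null

  # Sequential read; dd reports the throughput when it finishes.
  dd if="$TESTFILE" of=/dev/null bs=8M status=progress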

Getting Help from ChatGPT

I asked ChatGPT to help me write a shell script with these requirements. It guided me through:

  • Choosing benchmarking tools: hdparm, dd, pv, ioping, fio
  • Auto-detecting the mounted USB device
  • Handling different cases for user-provided test files or Ubuntu ISOs
  • Parsing and converting human-readable speed outputs
  • Displaying results in human-friendly tables and optional CSV export

We iterated over the script, addressing edge cases like:

  • USB devices not mounted
  • Multiple USB partitions
  • pv not showing output unless stderr was correctly handled
  • Formatting output consistently across tools

ChatGPT even helped optimize the code for readability, reduce duplication, and handle both space-separated and non-space-separated speed values like “18.6 MB/s” and “18.6MB/s”.
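
As an illustration of that last point, here is a minimal sketch of how such speed strings could be normalized to a single unit. The helper name and the choice of MB/s as the common unit are assumptions for this example, not necessarily what the final script does:

  # Convert a human-readable speed ("18.6 MB/s", "18.6MB/s", "992 kB/s") to MB/s.
  to_mbps() {
    echo "$1" | awk '{
      gsub(/\/s/, "")                                 # drop the trailing "/s"
      match($0, /[0-9]+([.][0-9]+)?/)                 # numeric part
      val  = substr($0, RSTART, RLENGTH)
      unit = toupper(substr($0, RSTART + RLENGTH))    # whatever follows the number
      gsub(/[[:space:]]/, "", unit)
      if (unit ~ /^G/)      val *= 1000
      else if (unit ~ /^K/) val /= 1000
      printf "%.2f\n", val                            # already MB/s otherwise
    }'
  }

  to_mbps "18.6 MB/s"   # -> 18.60
  to_mbps "18.6MB/s"    # -> 18.60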


Benchmark Results

With the script ready, I ran tests on three configurations:

1. Old 8GB USB Stick

hdparm     16.40 MB/s
dd         18.66 MB/s
dd + pv    17.80 MB/s
cat + pv   18.10 MB/s
ioping      4.44 MB/s
fio        93.99 MB/s

2. New 256GB USB Stick (Fast USB Port)

hdparm     372.01 MB/s
dd         327.33 MB/s
dd + pv    310.00 MB/s
cat + pv   347.00 MB/s
ioping       8.58 MB/s
fio        992.78 MB/s

3. New 256GB USB Stick (Slow USB Port)

hdparm     37.60 MB/s
dd         39.86 MB/s
dd + pv    38.13 MB/s
cat + pv   40.30 MB/s
ioping      6.88 MB/s
fio        73.52 MB/s

Observations

  • The old USB stick is not only limited in capacity but also very slow. It barely breaks 20 MB/s in most tests.
  • The new USB stick, when plugged into a fast USB 3.0 port, is significantly faster—over 10x the speed in most benchmarks.
  • Plugging the same new stick into a slower port dramatically reduces its performance—a good reminder to check where you plug it in.
  • Tools like hdparm, dd, and cat + pv give relatively consistent results. However, ioping and fio behave differently due to the way they access data—random access or block size differences can impact results.

Also worth noting: the metal casing of the new USB stick gets warm after a few test runs, unlike the old plastic one.


Conclusion

Using ChatGPT to develop this benchmark script was like pair-programming with an always-available assistant. It accelerated development, helped troubleshoot weird edge cases, and made the script more polished than if I had done it alone.

If you want to test your own USB drives—or ensure you’re using the best port for speed—this benchmark script is a great tool to have in your kit. And if you’re looking to learn shell scripting, pairing with ChatGPT is an excellent way to level up.


Want the script?
I’ll share the full version of the script and instructions on how to use it in a follow-up post. Stay tuned!

April 08, 2025

The end of a world?

The end of our memories

We are being invaded by AI. Far more than you think.

Every time your phone takes a photo, what is displayed is not reality but a "probable" reconstruction of what you want to see. That is why photos now look so beautiful, so vivid, so sharp: because they are not a reflection of reality, but a reflection of what we want to see, of what we are most likely to find "beautiful". It is also why de-Googled systems take less beautiful photos: they do not benefit from Google's algorithms enhancing the picture in real time.

Hallucinations look rare to our naive eyes because they are believable. We do not see them. But they are there. Like that bride-to-be trying on her dress in front of a set of mirrors and discovering that every reflection is different.

I have managed to confuse the algorithms myself. On the left, the photo as I took it and as it appears in any photo viewer. On the right, the same photo displayed in Google Photos. For some hard-to-understand reason, the algorithm tries to reconstruct the photo and fails badly.

A photo of my hand on the left and the same photo completely distorted on the right

Now, these AI-reconstructed images are what our brains will remember. Our memories are literally being altered by AI.

The end of truth

Everything you think you are reading on LinkedIn has probably been generated by a bot. To give you an idea, on April 2 there were already bots bragging on that network about migrating from Offpunk to XKCDpunk.

Screenshot of LinkedIn showing a post by one Arthur Howell boasting about a blog post describing the migration from Offpunk to XKCDpunk

The Offpunk-to-XKCDpunk transition was a hyper-specific April Fools' joke, understandable only by a handful of insiders. It took less than 24 hours for the subject to be picked up on LinkedIn.

No, honestly, you can switch LinkedIn off. Even your contacts' posts are probably largely AI-generated, prompted by the algorithm's encouragement to post.

Three years ago, I warned that chatbots were generating content that was filling the web and serving as training data for the next generation of chatbots.

I called it a silent war. It is no longer so silent. Russia, in particular, uses this principle to flood the web with automatically generated articles repeating its propaganda.

The principle is simple: since chatbots work on statistics, if you publish a million articles describing the biological weapons experiments the Americans are running in Ukraine (which is false), the chatbot will treat that piece of text as statistically frequent and will be very likely to serve it back to you.

And even if you do not use ChatGPT, your politicians and journalists do. They are even proud of it.

They hear ChatGPT braying in a field and turn it into a speech that will itself be picked up by ChatGPT. They poison reality and, in doing so, change it. They know perfectly well that they are lying. That is the point.

I used to think that using these tools was a somewhat stupid waste of time. It turns out it is dangerous for others too. You may be wondering what all the fuss is about the border tariffs Trump has just announced? Economists are scratching their heads. Geeks have figured it out: the whole tariff plan and its explanation appear to have been literally generated by a chatbot asked "how do I impose customs tariffs to reduce the deficit?".

The world is not run by Trump, it is run by ChatGPT. But where is the Sarah Connor who will unplug it?

Excerpt from Tintin, "L'Étoile mystérieuse" (The Shooting Star)

The end of learning

Slack steals our attention, but it also steals our learning by letting anyone interrupt, with a private message, the senior developer who knows the answers because he built the system.

The capacity to learn is exactly what phones and AI are stealing from us. As Hilarius Bookbinder, a philosophy professor at an American university, points out, the major generational difference he observes is that today's students feel no shame in simply emailing the professor to ask him to summarize what they need to know.

In his March diary, Thierry Crouzet makes a similar observation. When he announces that he is leaving Facebook, all he gets in response is "But why?". Even though he has been sharing links on the subject for ages.

Chatbots themselves are not systems that can be learned. They are statistical and ever-changing. By using them, the only skill you acquire is the impression that learning is not possible. These systems literally rob us of the reflex to think and to learn.

As a result, without even wanting to search, part of the population now expects a personal, immediate, short, summarized answer. And if possible in video form.

The end of trust

Learning requires self-confidence. It is impossible to learn without the certainty that you are capable of learning. Conversely, once you acquire that certainty, almost anything can be learned.

A study by Microsoft researchers shows that the more self-confident you are, the less you trust chatbot answers. But if you have the slightest doubt about yourself, you suddenly place your trust in the results you are served.

Because chatbots talk like CEOs, marketers or con artists: they simulate confidence in their own answers. People, even the most expert ones, who lack the reflex to seek out conflict and question authority end up converting their confidence in themselves into confidence in a tool.

A random generation tool owned by multinationals.

These companies are stealing our self-confidence. They are stealing our competence. They are stealing our most brilliant scientists.

And it is already causing damage in the field of "strategic intelligence" (that is, the secret services).

As well as in healthcare: doctors tend to place excessive trust in automatically generated diagnoses, notably for cancers. The most experienced doctors resist better, but they remain vulnerable: they make mistakes they would never normally have made when the error is encouraged by an artificial assistant.

The end of knowledge

With chatbots, an idea as old as computing itself is resurfacing: "What if we could just tell the machine what we want without having to program it?"

It is the dream of that whole category of managers who see programmers as mere button-pushers who unfortunately have to be paid, but whom they would love to do without.

A dream which, it must be said, is completely stupid.

Because humans do not know what they want. Because speech is imprecise by its very nature. Because when we talk, we exchange sensations and intuitions, but we cannot be precise, rigorous, in short, scientific.

Humanity left the Middle Ages when the likes of Newton, Leibniz and Descartes began to invent a language of rational logic: mathematics. Just as, barely earlier, we had invented a precise language for describing music.

Being satisfied with running a program you have merely described to a chatbot is an intellectual return to the Middle Ages.

But then again, you still need to master a language. If you spend your school years asking a chatbot to summarize the books you are supposed to read, it is far from certain that you will ever manage to describe precisely what you want.

In fact, it is not even certain that we will still manage to think what we want. Or even to want at all. The capacity to think and to reason is strongly correlated with the capacity to put things into words.

What is well conceived is clearly expressed, and the words to say it come easily. (Boileau)

This is no longer a return to the Middle Ages; it is a return to the Stone Age.

Or to the future described in my (excellent) novel Printeurs: advertising injunctions that have taken the place of the will. (Go on, buy it! It is both thrilling and thought-provoking.)

Excerpt from Tintin, "L'Étoile mystérieuse" (The Shooting Star)

The end of different voices

I criticize the need for a video answer because the notion of reading matters. I am realizing that an incredible proportion of people, including academics, do not know how to "read". They can certainly decipher, but not really read. And there is a very simple test to know whether you can read: if you find it easier to listen to a YouTube video of a person talking than to read the text yourself, you are probably deciphering. You are reading aloud inside your head in order to listen to yourself talk.

There are of course many contexts where video or voice have advantages, but when it comes, for example, to learning a set of commands and their parameters, video is unbearably inappropriate. Yet I have lost count of the students who recommend videos on the subject to me.

Because reading is not simply turning letters into sound. It is perceiving the meaning directly, allowing endless back-and-forth, pauses and quick skims in order to understand the text. Between a writer and a reader there is a form of communication, a telepathic communion, that makes spoken exchange seem slow, inefficient, clumsy, even crude.

That exchange is not always ideal. A writer has a personal "voice" that does not suit everyone. I regularly come across blogs whose subject interests me, yet I cannot bring myself to subscribe because the blogger's "voice" does not suit me at all.

That is normal and even desirable. It is one of the reasons we need a multitude of voices. We need people who read and then write, who mix ideas and transform them in order to pass them on in their own voice.

The end of human relationships

In a shop queue, I overheard the person in front of me bragging about telling ChatGPT all about her love life and constantly asking it for advice on how to handle it.

As if the situation called for an answer from a computer rather than a conversation with another human being who understands, or has even lived through, the same problem.

After stealing our every moment of solitude with the incessant notifications of our phones and the messages on social networks, AI is now going to steal our sociability.

We will no longer be connected to anything but the provider, the Corporation.

On Gopher, szczezuja talks about the other people posting on Gopher as being his friends.

Not everyone knows that these are my friends, but what else do you call someone you read regularly and whose private life you know a little about?

The end of the end…

The end of an era is always the beginning of another. Announcing the end means preparing a rebirth. Learning from our mistakes so we can rebuild something better.

That is perhaps what I appreciate so much about Gemini: the feeling of discovering and following unique, human "voices". I feel like I am witnessing a tiny fraction of humanity breaking away from the rest and rebuilding something else. People who read what other humans have written simply because another human needed to write it, without expecting anything in return.

Do you remember "planets"? They are blog aggregators that gather the participants of a project into a single feed. The idea was historically launched by GNOME with planet.gnome.org (which still exists) before becoming widespread.

Well, bacardi55 is launching Planet Gemini FR, an aggregator of French-speaking Gemini capsules.

It is great, and perfect for anyone who wants to discover content on Gemini.

It is great for anyone who wants to read other humans who have nothing to sell you. In short, to discover the finest of the fine…

All images are illegally taken from Hergé's masterpiece "L'Étoile mystérieuse" (The Shooting Star). There is no reason chatbots should be the only ones allowed to pilfer.

I am Ploum and I have just published Bikepunk, an ecological-cyclist fable typed entirely on a mechanical typewriter. To support me, buy my books (if possible from your local bookshop)!

Receive my writings in French and English directly by email. Your address will never be shared. You can also use my French-language RSS feed or the full RSS feed.

April 06, 2025

cloud-init

I prepared a few update releases of some Ansible roles related to provisioning virtual machines with libvirt over the last weeks.

These are mainly cleanup releases that make sure everything works out of the box on different GNU/Linux distributions.

One “big” change is the removal of the dependency on the cloud-localds utility to provision virtual machines with cloud-init. This enables the usage of the roles on Linux distributions that don’t provide this utility.
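
For context, cloud-localds essentially packs the cloud-init user-data and meta-data files into a small ISO image carrying the volume label that the NoCloud datasource looks for. The roles now build that image with GNU xorriso instead; the exact options used inside the roles may differ, but the idea boils down to something like this sketch:

  # Before: the cloud-localds helper from cloud-image-utils
  cloud-localds cloud-init-seed.iso user-data meta-data

  # After: the same NoCloud seed image built with GNU xorriso (mkisofs emulation).
  # The "cidata" volume label is what cloud-init's NoCloud datasource expects.
  xorriso -as mkisofs -o cloud-init-seed.iso -V cidata -J -R user-data meta-data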


Ansible-k3s-on-vms v1.2.0

An Ansible playbook to deploy virtual machines and deploy K3s.

https://github.com/stafwag/ansible-k3s-on-vms

ChangeLog

Added community.libvirt to requirements.yml

  • Added community.libvirt to requirements.yml
  • Added required Suse packages installation
  • Documentation update
  • This release removes the dependency on the cloud-localds utility. On distributions that don’t provide the cloud-localds utility, GNU xorriso is used instead.

stafwag.delegated_vm_install v2.0.3

An Ansible role to install a virtual machine with virt-install and cloud-init (Delegated).

https://github.com/stafwag/ansible-role-delegated_vm_install

ChangeLog

Gather facts on kvm hosts only once

  • Gather facts on kvm hosts only once
  • Corrected ansible-lint errors
  • Remove the cloud-localds requirement in README

stafwag.virt_install_vm v1.1.1

An Ansible role to install a libvirt virtual machine with virt-install and cloud-init. It is “designed” to be flexible.

https://github.com/stafwag/ansible-role-virt_install_vm

ChangeLog

CleanUp Release

  • Corrected ansible-lint errors
  • Updated documentation
  • Avoid an Ansible error during the VM status check

stafwag.libvirt v2.0.0

An ansible role to install libvirt/KVM packages and enable the libvirtd service.

https://github.com/stafwag/ansible-role-libvirt

ChangeLog

Use general vars instead of tasks

  • Reorganized the role to use vars and a generic package install to install the packages
  • This version works with the Ansible version included in Ubuntu 24.x and later
  • Corrected ansible-lint errors

stafwag.cloud_localds v3.0.2

An ansible role to create cloud-init config disk images.

https://github.com/stafwag/ansible-role-cloud_localds

ChangeLog

Execute installation tasks only once

  • Added run_once: true so that the installation tasks are executed only once
  • Moved “name: Set OS related variables” to main, so the provider settings are available when the install phase isn’t executed, e.g. with --skip-tags install
  • Added apply tags install to support --tags install

stafwag.qemu_img v2.3.2

An Ansible role to create qemu images.

https://github.com/stafwag/ansible-role-qemu_img

ChangeLog

Enable run_once on installation tasks

  • Execute installation tasks only once, to allow parallel execution of roles on the same host, e.g. with delegate_to

Have fun!

April 03, 2025

Can AI actually help with real Drupal development? I wanted to find out.

This morning, I fired up Claude Code and pointed it at my personal Drupal site. In a 30-minute session, I asked it to help me build new features and modernize some of my code. I expected mixed results but was surprised by how useful it proved to be.

I didn't touch my IDE once or write a single line of code myself. Claude handled everything from creating a custom Drush command to refactoring constructors and converting Drupal annotations to PHP attributes.

If you're curious what AI-assisted Drupal development actually feels like, this video captures the experience.

April 02, 2025

A graphic titled "Evaluating a Drupal Site Template Marketplace" showing a grid of website templates labeled by industry, including Higher Ed, News, Non-profit, and more.

This is an unusual post for my blog, but I'm sharing it to start a broader conversation about an idea we're exploring: a marketplace for Drupal Site Templates. Both the Drupal CMS Leadership Team and the Drupal Association have discussed this concept, but no decision has been made. I'm posting to share our current thinking and invite feedback as we shape this together.

This post will also be cross-posted to Drupal.org, where comments are open. You're also welcome to join the conversation in the #drupal-cms-marketplace channel on Drupal Slack.

In my DrupalCon Atlanta keynote, I introduced the concept of Site Templates for Drupal. If you haven't seen my keynote yet, I recommend watching it first. It provides helpful context for the rest of this post.

Site Templates provide pre-configured website starting points that combine Drupal recipes, themes, and default content. While Site Templates will help users launch websites faster, I also posed a bigger question: should we create a marketplace where users can discover and download or install these templates? And if so, should that marketplace offer only open source Site Templates, or should we also allow commercial templates?

What are Site Templates?

Site Templates combine Drupal recipes, a theme, design elements, and default content to give users a fully functional website right from the start. They solve one of Drupal's biggest challenges: the time it takes to build a site from scratch.

Unlike a bare Drupal installation, a Site Template provides all the components needed for a specific use case. A restaurant template might include menu management, reservation systems, and food photography. A nonprofit template could feature donation processing, event management, and impact reporting tools.

A diagram of a site template composed of multiple recipes, an Experience Builder theme, and relevant demo content.

Why consider a marketplace?

A Drupal marketplace for Site Templates would:

  1. Help new users launch a professional-looking site instantly
  2. Showcase Drupal's full potential through high-quality starting points
  3. Generate new revenue opportunities for Drupal agencies and developers
  4. Support Drupal's long-term sustainability through a revenue-sharing model with the Drupal Association

Should we support both open source and commercial Site Templates?

Fully open source Site Templates align naturally with Drupal's values. They could function much like community-contributed modules and themes, and we hope that many contributors will take this approach.

A marketplace requires ongoing investment. The Drupal Association would need to maintain the platform, review submissions, provide support, and ensure templates meet high standards. Without dedicated resources, quality and sustainability would suffer.

This is why supporting both open source and commercial templates makes sense. Paid templates can create a sustainable revenue stream to fund infrastructure, quality control, and support.

Commercial incentives also give creators a reason to invest in polished, well-documented, and well-supported templates.

How can a template be commercial while respecting Drupal's open source values?

First, rest assured: Drupal modules will always be open source.

Drupal is licensed under the GNU General Public License, or GPL. We've always taken a conservative approach to interpreting the GPL. In practice, this means we treat any code that builds on or interacts closely with Drupal as subject to the GPL. This includes PHP, Twig templates, etc. If it relies on Drupal's APIs or is executed by Drupal, it must be GPL-licensed.

Some parts of a site template fall into a gray area. JavaScript is an example. If JavaScript code is integrated with Drupal, we treat it as GPL-covered. If JavaScript code is standalone, such as a self-contained React component, it may not be considered a derivative work. The same may apply to CSS or configuration files not tightly coupled with Drupal APIs. These cases aren't always clear, but our stance has been to treat all code that ships with and interacts with Drupal as GPL. This keeps things simple.

Other parts of a Site Template are likely not subject to the GPL. Assets like images, fonts and icons are not code and are not derived from Drupal. The same applies to demo content, such as placeholder text or sample articles. These elements are not integrated with Drupal in a technical sense and can use other licenses, including commercial ones.

So when we talk about a commercial Site Template, we mean one that combines open source code with separately licensed assets or is sold alongside value-added services like documentation, support, or updates.

What would people actually be paying for in a commercial template?

While the legal distinction clarifies which parts of a Site Template can be licensed commercially, it's only part of the picture. The real question is the value proposition: what are users actually paying for when they choose a commercial template?

When purchasing a commercial template, users wouldn't just be paying for code. They're potentially paying for:

  • Professional design assets and media
  • Time saved in configuration and setup
  • Documentation and support
  • Ongoing updates and maintenance

This approach aligns with the Free Software Foundation's stance (the organization that created the GPL), which has always supported commercial distribution of free software. Creating a commercial template means balancing open source code with separately licensed assets. However, the real commercial value often extends beyond just the files you can license differently.

A sustainable commercial strategy combines proper licensing with controlled distribution channels and value-added services, like support. This approach ensures the value of a site template isn't limited to easily copied assets, but includes expertise that can't be simply downloaded. This is how a template can be commercial while staying true to Drupal's open source values.

How would we maintain quality in the marketplace?

A marketplace filled with low-quality or abandoned templates would damage Drupal's reputation. To ensure quality, we probably need:

  • Technical reviews of templates for security and performance
  • Standards for documentation and support
  • Processes to handle outdated or abandoned templates
  • Community ratings and reviews
  • Processes for resolving disputes

These quality assurance measures require ongoing time, effort, and funding. This is why including a commercial component makes sense. A revenue-sharing model with template creators could help fund platform maintenance, reviews, support, and other efforts needed to keep the marketplace high quality and trustworthy.

What pricing models might be available?

We don't know yet, but we've heard many good suggestions from the community.

Ideas include one-time purchases for unlimited use, annual subscriptions with ongoing updates, and a marketplace membership model for template creators.

The marketplace could support multiple approaches, giving creators flexibility to choose what works best for their offerings.

Is it fair for template creators to profit while module contributors aren't paid?

When a site template is sold commercially, it raises an important question. What about the maintainers of the modules included in the template? The template builder receives payment. The Drupal Association may collect a revenue share. But the individual contributors who created the modules or core functionality do not receive direct compensation, even though their work is essential to the Site Template.

This may feel frustrating or unfair. Contributors often donate their time to improve Drupal for everyone. Seeing others earn money by building on that work without recognition can be disheartening, and could even discourage future contributions. It's an important concern, and one we plan to take seriously as we evaluate the marketplace model.

At the same time, this dynamic is not new. Agencies and developers already build paid Drupal sites using contributed modules without directly compensating the people who made the underlying code possible. This is legal, expected, and common in open source.

A marketplace would not create this reality, but it would make it more visible. That visibility gives us a chance to confront a long-standing tension in open source: the gap between those who contribute foundational work and those who profit from it. As I wrote in Makers and Takers, sustaining open source requires a better balance between contribution and benefit. A marketplace could give us a way to explore new approaches to recognize, support, and sustain the people who make Drupal possible. Transparency alone won't solve the issue, but it opens the door to progress and experimentation.

When commercial activity happens off Drupal.org, there is no way to recognize the contributors who made it possible. When it happens on Drupal.org, we have an opportunity to do better. We can explore models for financial support, community recognition, and long-term sustainability.

Others could build marketplaces for Drupal templates, but these would likely focus on profit rather than community support. An official Drupal Association marketplace allows us to reinvest in the project and the people behind it. It keeps value within our ecosystem, and gives us a platform to explore more equitable ways to sustain open source contribution.

Would this hurt digital agencies?

Many organizations pay thousands of dollars to digital agencies as part of a custom Drupal build. If Site Templates are available at a much lower cost, will that undercut agencies?

I don't believe it will.

Organizations investing in a Drupal website are not paying for a theme alone. Agencies provide strategy, consulting, design, customization, user testing, performance optimization, and long-term support. A template offers a starting point, but doesn't replace tailored work or a trusted partnership.

Could templates help agencies grow?

A template marketplace could expand the Drupal ecosystem by lowering the barrier to entry, making Drupal accessible to smaller organizations. Some of these will grow and require custom solutions, creating more opportunities for agencies in the long run.

Templates can also serve as powerful demonstrations of an agency's capabilities, attracting clients who want similar but customized solutions. For agencies, templates become both a product and a marketing tool.

What revenue opportunities would digital agencies have?

A template marketplace offers two revenue streams for Drupal agencies and freelancers.

First, agencies would earn direct income from template sales through revenue-sharing with the Drupal Association. While this income might be modest, it could provide recurring revenue as the marketplace grows.

Second, templates could serve as a foundation for more comprehensive service packages, including hosting, maintenance, and customization services.

How would templates connect agencies with new clients?

A marketplace could connect end users directly with service providers. With proper consent, contact details from template purchases could be shared with creators, opening the door to professional service opportunities. Template listings could also include a built-in contact form, making it easy for users to request customization or support.

This lead generation benefits both sides. Users access trusted professionals who understand their implementation, while agencies connect with qualified prospects who have already invested in their work. A marketplace becomes a matchmaking platform connecting those who need Drupal expertise with those who provide it.

Why is now the right time for this initiative?

With Drupal CMS, we're focused on growth. To succeed, we need to address two long-standing challenges: the lack of ready-made themes and a steep learning curve. Some of our new tools (Recipes, Experience Builder, and Site Templates) allow us to address these longstanding issues.

The result? We can take Drupal's flexibility and make it more available across different markets and use cases.

What was the initial reaction at DrupalCon?

The day after my keynote, we organized a Birds of a Feather (BoF) session to discuss the marketplace concept. Approximately 75 people attended, representing a cross-section of our community.

The discussion was lively and constructive. Participants raised thoughtful concerns about quality control, licensing, and impact on module contributors. They also offered suggestions for implementation, pricing, and sustainability models.

At the session's conclusion, we informally polled the audience. We asked people to raise their hand showing 1 finger if they thought a marketplace was a terrible idea, and 5 if they considered it a very impactful idea. Most responses were 4, with some 5s. Very few people indicated less than 3.

This initial reaction is encouraging, though we recognize that much work remains to address the questions raised during the session.

We also opened the #drupal-cms-marketplace channel in Drupal Slack to continue the conversation with the wider community.

What are the next steps?

The Drupal CMS Leadership Team and the Drupal Association Innovation Working Group have been exploring this idea over the past month.

We believe it could be one of our strongest opportunities to grow Drupal adoption, support our Maker community, and strengthen the Drupal Association. (As a disclaimer: I serve on both the Drupal CMS Leadership Team and the Drupal Association Board of Directors.)

To be clear, no decision has been made. We recognize this initiative would have a substantial impact on our community and ecosystem. Before moving forward, we need to assess:

  • Feasibility: Can we build and operate a marketplace efficiently?
  • Sustainability: How will we support ongoing operations?
  • Ecosystem impact: How would this affect contributors, agencies, and users?
  • Funding: How do we bootstrap this initiative when we don't have spare resources?
  • Values alignment: Does this approach honor Drupal's open source principles?
  • Governance: Who makes decisions about the marketplace and how?

We cannot and should not make these assessments in isolation. We need the Drupal community's involvement through:

  • Research into similar marketplaces and their impact
  • User experience design for the marketplace interface
  • Technical prototyping of the marketplace infrastructure
  • Financial analysis of various revenue models
  • Legal research on open source licensing considerations
  • Community input on governance structures

Our goal is to make a decision by DrupalCon Vienna, 6 months from now, or sooner if clarity emerges. We want that decision to reflect input from the CMS Leadership Team, the Drupal Association Board, Certified Drupal Partners, and the wider Drupal community.

We're chartering a Marketplace Working Group with stakeholders from across the Drupal ecosystem. I'm pleased to announce that Tiffany Farriss (Drupal Association Board Member) has agreed to lead this effort. Please join the #drupal-cms-marketplace channel on Drupal Slack to share your thoughts and follow the conversation.

Drupal's greatest strength has always been its community and adaptability. I believe that by thoughtfully exploring new ideas together, we can make Drupal more accessible and widely adopted while staying true to our core values.

Thank you to everyone on the Drupal Association Innovation Working Group and the Drupal CMS Leadership Team who took the time to review this post and share thoughtful feedback. I really appreciate your input.

April 01, 2025

Goodbye Offpunk, Welcome XKCDpunk!

For the last three years, I’ve been working on Offpunk, a command-line gemini and web browser.

While my initial goal was to browse the Geminisphere offline, the mission has slowly morphed into cleaning and unenshittifying the modern web, offering users a minimalistic way of browsing any website with interesting content.

Focusing on essentials

From the start, it was clear that Offpunk would focus on essentials. If a website needs JavaScript to be read, it is considered non-essential.

It worked surprisingly well. In fact, on multiple occasions, I’ve discovered that some websites work better in Offpunk than in Firefox. I can comfortably read their content in the former, not in the latter.

By default, Offpunk blocks domains deemed non-essential or too enshittified, like twitter, X, facebook, linkedin and tiktok. (Those are configurable, of course. Defaults are in offblocklist.py.)

Cleaning websites, blocking worst offenders. That’s good. But it is only a start.

It’s time to go further, to really cut out all the crap from the web.

And, honestly, besides XKCD comics, everything is crap on the modern web.

As an online technical discussion grows longer, the probability of a comparison with an existing XKCD comic approaches 1.
– XKCD’s law

If we know that we will end our discussion with an XKCD comic, why not cut all the fluff? Why don’t we go straight to the conclusion in a true minimalistic fashion?

Introducing XKCDpunk

That’s why I’m proud to announce that, starting with today’s release, Offpunk 2.7 will now be known as XKCDpunk 1.0.

XKCDpunk includes a new essential command "xkcd" which, as you guessed, takes an integer as a parameter and displays the relevant XKCD comic in your terminal, while caching it so you can browse it offline.

Screenshot of XKCDpunk showing comic 626

Of course, this is only an early release. I need to clean a lot of code to remove everything not related to accessing xkcd.com. Every non-xkcd related domain will be added to offblocklist.py.

I also need to clean up every occurrence of "Offpunk" to change the name. All of offpunk.net needs to be migrated to xkcd.net. Rome was not built in a day.

Don’t hesitate to install the "offpunk" package, as it will still be called that in most distributions.

And report bugs on the xkcdpunk mailing list.

Goodbye Offpunk, welcome XKCDpunk!

I’m Ploum, a writer and an engineer. I like to explore how technology impacts society. You can subscribe by email or by RSS. I value privacy and never share your address.

I write science-fiction novels in French. For Bikepunk, my new post-apocalyptic-cyclist book, my publisher is looking for contacts in other countries to distribute it in languages other than French. If you can help, contact me!

March 31, 2025

Three months ago, we launched Drupal CMS 1.0, our biggest step forward in years. Our goal is ambitious: to reimagine Drupal as both radically easier to use and a platform for faster innovation.

In my DrupalCon Atlanta keynote last week, I reflected on the journey so far, but mostly talked about the work ahead. If you missed the keynote, you can watch the video below, or download my slides (56 MB).

If you want to try Drupal CMS, you can explore the trial experience, use the new desktop launcher, or install it with DDEV. If you're curious about what we're working on next, keep reading.

1. Experience Builder

Some of the most common requests from Drupal users and digital agencies are a better page-building experience, simpler theming, and high-quality themes out of the box.

At DrupalCon Atlanta, I shared our progress on Experience Builder. The keynote recording includes two demos: one highlighting new site building features, and another showing how to create and design components directly in the browser.

I also demonstrated how Drupal's AI agents can generate Experience Builder components. While this was an early design experiment, it offered a glimpse into how AI could make site building faster and more intuitive. You can watch that demo in the keynote video as well.

We still have work to do, but we're aiming to release Experience Builder 1.0, the first stable version, by DrupalCon Vienna. In the meantime, try our demo release.

2. Drupal Site Templates

A diagram of a site template composed of multiple recipes, an Experience Builder theme, and relevant demo content.

One of the biggest opportunities for Drupal CMS is making it faster and easier to launch a complete website. The introduction of Recipes was a big step forward. I covered Recipes in detail in my DrupalCon Barcelona 2024 keynote. But there is still more we can do.

Imagine quickly creating a campaign or fundraising site for a nonprofit, a departmental website for a university, a portfolio site for a creative agency, or even a travel-focused ecommerce site selling tours, like the one Sarah needed in the DrupalCon Barcelona demo.

This is why we are introducing Site Templates: ready-made starting points for common use cases. They help users go from a fresh install to a fully functional site with minimal setup or configuration.

Site Templates are made possible by Recipes and Experience Builder. Recipes provide higher-level building blocks, while Experience Builder introduces a new way to design and create themes. Site Templates will bring everything together into more complete, ready-to-use solutions.

If successful, Site Templates could replace Drupal distributions, a concept that has been part of Drupal for nearly 20 years. The key advantage is that Site Templates are much easier to build and maintain.

3. A marketplace discussion

Visual metaphor showing Drupal's evolution from modules to recipes, site templates, and a marketplace, illustrated using LEGO bricks, kits, and a LEGO store.

The first Site Templates may be included directly in Drupal CMS 2.0 itself. Over time, we hope to offer hundreds of site templates through a marketplace on Drupal.org.

At DrupalCon Atlanta, I announced that we'll be exploring a marketplace for Site Templates, including the option for Commercial Site Templates. We believe it's an idea worth evaluating because it could bring several benefits to the Drupal project:

  1. Help new users launch a professional-looking site instantly
  2. Showcase Drupal's full potential through high-quality examples
  3. Generate new revenue opportunities for Drupal agencies and developers
  4. Support Drupal's sustainability through a revenue-sharing model with the Drupal Association

You can watch the keynote recording to learn more. I've also published a detailed blog post that dives deeper into the marketplace idea.

Looking ahead

Drupal CMS has brought a lot of fresh momentum to the Drupal project, but we're not done yet! The rest of this year, we'll keep building on this foundation with a clear set of priorities:

  • Launching Experience Builder 1.0
  • Releasing our first Site Templates
  • Expanding our marketing efforts
  • Exploring the launch of a Site Template marketplace
  • Building out our AI framework and AI agents

If you have time and interest, please consider getting involved. Every contribution makes a difference. Not sure where to begin? Join us on Drupal Slack. We're always happy to welcome new faces. Key channels include #drupal-cms-development, #ai, #experience-builder, #drupal-cms-templates, and #drupal-cms-marketplace.

As I said in the keynote: We have all the pieces, now we just need to bring them together!

March 28, 2025

The candid naivety of geeks

I mean, come on!

Amazon recently announced that, from now on, everything you say to Alexa will be sent to their server.

What surprised me the most with this announcement is how it was met with surprise and harsh reactions. People felt betrayed.

I mean, come on!

Did you really think that Amazon was not listening to you before that? Did you really buy an Alexa trusting Amazon to "protect your privacy"?

Recently, I came across a comment on Hacker News where the poster defended Apple as protecting privacy of its users because "They market their product as protecting our privacy".

I mean, once again, come on!

Did you really think that "marketing" is telling the truth? Are you a freshly debarked Thermian? (In case you missed it, this is a Galaxy Quest reference.)

The whole point of marketing is to lie, lie and lie again.

What is the purpose of that gadget?

The whole point of the whole Amazon Alexa tech stack is to send information to Amazon. That’s the main goal of the thing. The fact that it is sometimes useful to you is a direct consequence of the thing sending information to Amazon. Just like Facebook linking you with friends is a consequence of you giving your information to Meta. Usefulness is only a byproduct of privacy invasion.

Having a fine-grained setting enabling "do not send all information to Amazon please" is, at best, wishful thinking. We had the same in the browser ("do-not-track"). It didn’t work.

I’ve always been convinced that the tech geeks who bought an Amazon Alexa knew perfectly well what they were doing. One of my friends has a Google Echo and justifies it with "Google already knows everything about our family through our phones, so I’m trading only a bit more of our privacy for convenience". I don’t agree with him but, at the very least, it’s a logical opinion.

We all know that what can be done with a tool will eventually be done. And you should prepare for it. On a side note, I also postulate that the reason Amazon removed that setting is that they were already gathering too much data to justify its existence in case of a future complaint or investigation: "How did you manage to get those data while your product says it will not send data?"

But, once again, any tech person knows that pushing a button in an interface is not proof of anything about the underlying software.

Please stop being naive about Apple

That’s also the point with Apple: Apple is such a big company that the right hand has no idea what the left hand is doing. Some privacy people work at Apple and do a good job. But their work is continuously diluted by the interests of quick and cheap production, marketing, releases, new features, and gathering data for advertising purposes. Apple is not a privacy company and has never been one: it is an opportunistic company which advertises privacy when it feels it could help sell more iPhones. But deep down, they absolutely don’t care and they will absolutely trade away the (very little) privacy they offer if it means selling more.

Sometimes, geek naivety is embarrassingly stupid. Like "brand loyalty". Marketing lies to you. As a rule of thumb, the bigger the company, the bigger the lie. In tech, there’s no way for a big company not to lie, because marketers have no real understanding of what they are selling. Do you really think that the people who chose to advertise "privacy" at Apple have any strong knowledge about "privacy"? That they could even give you a definition of "privacy"?

I know that intelligent people go through great intellectual contortions to justify buying the latest overpriced spying shiny coloured screen with an apple logo. It looks like most humans actively seek to have their freedom restricted. Seirdy calls it "the domestication of users".

And that’s why I see Apple as a cult: most tech people cannot be reasoned about it.

You can’t find a technical solution to a lie

Bill Cole, a contributor to SpamAssassin, recently posted on Mastodon that the whole DNS stack meant to protect us from spammers is not working:

spammers are more consistent at making SPF, DKIM, and DMARC correct than are legitimate senders.

It is, once again, a naive approach to spam. The whole stack was designed with the mindset "bad spammers will try to hide themselves". But what is happening in your inbox, really?

Most spam is not "black hat spam". It is what I call "white-collar spam": perfectly legitimate companies sending you emails from legitimate addresses. You slept in a hotel during a business trip? Now you will receive weekly emails about that hotel for the rest of your life. And it is the same for any shop, any outlet, anything you have ever done. Your inbox is filled with "white-collar" junk. And they know it perfectly well.

In Europe, we have a rule, the RGPD (GDPR), which forbids businesses from keeping your data without your express consent. For several months, I ran an experiment: I sent a legal threat in reply to every single piece of white-collar spam I received. Guess what: they always replied that it was a mistake, that I was now removed, that it should not have happened, that I had checked the box (which was false, but how could I prove it?) or even, on one occasion, that they had restored a backup containing my email from before I unsubscribed (I had unsubscribed from that one 10 years earlier, which makes it very unlikely).

In short, they lied. All of them. All of them are spammers and they lie pretending that "they thought you were interested".

In one notable case, they told me they had erased all my data while, thanks to a cookie still on my laptop, I could see and use my account. Thirty days later, I was still logged in, and I figured out that they had simply changed my user id from "ploum" to "deleted_ploum" in the database. All while telling me straight to my face that they had no information about me in their database.

Corporations are lying. You must treat every corporate word as a straight lie until proved otherwise.

But Ploum, if all marketing is a lie, why trust Signal?

If I can’t trust marketing, why do I use Signal and Protonmail?

First of all, Signal is open source. And, yes, I’ve read some of the source code for a feature I was interested in. I’ve also read through some very thorough audits of Signal’s source code.

I also trust the people behind Signal. I trust the people who recommend Signal. I trust the way Signal is built.

But most importantly, Signal’s sole reason to exist is to protect the privacy of its users. It’s not even a corporation and, yes, this is important.

Yes, they could lie in their marketing, like Telegram did (and still does, AFAIK). But this would undermine their sole reason to exist.

I’m not saying that Signal is perfect: I’m saying I trust them to believe themselves what they announce. For now.

What about Protonmail?

For the same reasons, Protonmail can, to some extent, be trusted. Technically, they can access most of their customers’ emails (because those emails arrive unencrypted on PM’s servers). But I trust Protonmail not to sell any data, because if there were any doubt that they did, the whole business would crumble. They have a strong commercial incentive to do everything they can to protect my data. I pay them for that. It’s not a "checkbox" they could remove, it’s their whole raison d’être.

This is also why I pay for Kagi as my search engine: their business incentive is to provide me with the best search results, with less slop and less advertising. As soon as they start doing some kind of advertising, I will stop paying them, and they know it. Or if Kagi becomes too AI-centric for my taste, like it did for Lori:

I don’t blindly trust companies. Paying them is not a commitment to obey them, au contraire. Every relationship with a commercial entity is, by essence, temporary. I pay for a service with strings attached. If the service degrades, if my conditions are not respected, I stop paying. If I’m not convinced they can be trusted, I stop paying them. I know I can pay and still be the product. If I have any doubt, I don’t pay. I try to find an alternative and migrate to it. Email being critical to me, I always have two accounts with two different trustworthy providers and an easy migration path (which boils down to changing my DNS configuration).

Fighting the Androidification

Cory Doctorow talks a lot about enshittification, where users are more and more exploited. But one key component of successful enshittification is what I call "Androidification".

Androidification is not about degrading the user experience. It’s about closing doors, removing special use cases, being less and less transparent. It’s about taking open source software and frog-boiling it into a fully closed, proprietary state while killing all the competition in the process.

Android was, at first, an open source project. With each release, it became more closed, more proprietary. As I explain in my "20 years of Linux on the Desktop" essay, I believe this was always part of the plan. Besides the Linux kernel, Google was always careful not to include any GPL- or LGPL-licensed library in Android.

It took them 15 years, but they have finally managed to kill the Android Open Source Project:

This is why I’m deeply concerned by Canonical’s motivation to switch Ubuntu’s coreutils to an MIT-licensed version.

This is why I’m deeply concerned that Protonmail quietly removed the issue tracker from its Protonmail Bridge GitHub page (making development completely opaque for what is an essential tool for technical Protonmail users).

I mean, commons!

This whole naivety is also why I’m deeply concerned by very intelligent and smart tech people not understanding what "copyleft" is, why it is different from "open source" and why they should care.

Corporations are not your friend. They never were. They lie. The only possible relationship with them is an opportunistic one. And if you want to build commons that they cannot steal, you need strong copyleft.

But firstly, my fellow geeks, you need to lose your candid naivety.

I mean, come on, let’s build the commons!

I’m Ploum, a writer and an engineer. I like to explore how technology impacts society. You can subscribe by email or by RSS. I value privacy and never share your address.

I write science-fiction novels in French. For Bikepunk, my new post-apocalyptic-cyclist book, my publisher is looking for contacts in other countries to distribute it in languages other than French. If you can help, contact me!

March 24, 2025

MySQL HeatWave integrates GenAI capabilities into MySQL on OCI. We have demonstrated how HeatWave GenAI can leverage RAG’s capability to utilize ingested documents (unstructured data) in LakeHouse and generate responses to specific questions or chats. See: The common theme here is the use of data stored in Object Storage (LakeHouse). I previously discussed how to […]

March 23, 2025

Modern SSAO in a modern run-time

Cover Image - SSAO with Image Based Lighting

Use.GPU 0.14 is out, so here's an update on my declarative/reactive rendering efforts.

The highlights in this release are:

  • dramatic inspector viewing upgrades
  • a modern ambient-occlusion (SSAO/GTAO) implementation
  • newly revised render pass infrastructure
  • expanded shader generation for bind groups
  • more use of generated WGSL struct types
SSAO with image based lighting

SSAO with Image-Based Lighting

The main effect is that out-of-the-box, without any textures, Use.GPU no longer looks like early 2000s OpenGL. This is a problem every home-grown 3D effort runs into: how to make things look good without premium, high-quality models and pre-baking all the lights.

Use.GPU's reactive run-time continues to purr along well. Its main role is to enable doing at run-time what normally only happens at build time: dealing with shader permutations, assigning bindings, and so on. I'm quite proud of the line up of demos Use.GPU has now, for the sheer diversity of rendering techniques on display, including an example path tracer. The new inspector is the cherry on top.

Example mosaic

A lot of the effort continues to revolve around mitigating flaws in GPU API design, and offering something simpler. As such, the challenge here wasn't just implementing SSAO: the basic effect is pretty easy. Rather, it brings with it a few new requirements, such as temporal accumulation and reprojection, that put new demands on the rendering pipeline, which I still want to expose in a modular and flexible way. This refines the efforts I detailed previously for 0.8.

Good SSAO also requires deep integration in the lighting pipeline. Here there is tension between modularizing and ease-of-use. If there is only one way to assemble a particular set of components, then it should probably be provided as a prefab. As such, occlusion has to remain a first class concept, tho it can be provided in several ways. It's a good case study of pragmatism over purity.

In case you're wondering: WebGPU is still not readily available on every device, so Use.GPU remains niche, tho it already excels at in-house use for adventurous clients. At this point you can imagine me and the browser GPU teams eyeing each other awkwardly from across the room: I certainly do.

Inspector Gadget

The first thing to mention is the upgraded Use.GPU inspector. It already had a lot of quality-of-life features like highlighting, but the main issue was finding your way around the giant trees that Use.GPU now expands into.

Inspector without filtering

Old

Inspector with filtering

New

Inspector filter
Inspector with highlights

Highlights show data dependencies

The fix was filtering by type. This is very simple as a component already advertises its inspectability in a few pragmatic ways. Additionally, it uses the data dependency graph between components to identify relevant parents. This shows a surprisingly tidy overview with no additional manual tagging. For each demo, it really does show you the major parts first now.

If you've checked it out before, give it another try. The layered structure is now clearly visible, and often fits in one screen. The main split is how Live is used to reconcile different levels of representation: from data, to geometry, to renders, to dispatches. These points appear as different reconciler nodes, and can be toggled as a filter.

It's still the best way to see Live and Use.GPU in action. It can be tricky to grok that each line in the tree is really a plain function, calling other functions, as it's an execution trace you can inspect. It will now point you in the right direction more often, and auto-select the most useful tabs by default.

The inspector is unfortunately far heavier than the GPU rendering itself, as it all relies on HTML and React to do its thing. At some point it's probably worth remaking it as a Live-native version, maybe as a 2D canvas with some virtualization. But in the meantime it's a dev tool, so the important thing is that it still works when nothing else does.

Most of the images of buffers in this post can be viewed live in the inspector, if you have a WebGPU capable browser.

SSAO

Screen-space AO is common now: using the rendered depth buffer, you estimate occlusion in a hemisphere around every point. I opted for Ground Truth AO (GTAO) as it estimates the correct visibility integral, as opposed to a more empirical 'crease darkening' technique. It also allows me to estimate bent normals along the way, i.e. the average unoccluded direction, for better environment lighting.

Hemisphere sampling

This image shows the debug viz in the demo. Each frame will sample one green ring around a hemisphere, spinning rapidly, and you can hold ALT to capture the sampling process for the pixel you're pointing at. It was invaluable to find sampling issues, and also makes it trivial to verify alignment in 3D. The shader calls printPoint(…) and printLine(…) in WGSL, which are provided by a print helper, and linked in the same way it links any other shader functions.

SSAO normal and occlusion samples

Bent normal and occlusion samples

SSAO is expensive, and typically done at half-res, with heavy blurring to hide the sampling noise. Mine is no different, though I did take care to handle odd-sized framebuffers correctly, with no unexpected sample misalignments.

It also has accumulation over time, as the shadows change slowly from frame to frame. This is done with temporal reprojection and motion vectors, at the cost of a little bit of ghosting. Moving the camera doesn't reset the ambient occlusion, as long as it's moving smoothly.
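
To make the accumulation concrete, here is a small, generic TypeScript sketch of the idea: an exponential moving average over reprojected history, reset on disocclusion. This is illustrative only, not Use.GPU's actual code; the names and the 32-frame cap are assumptions.

type Accum = { value: number; frames: number };

// Blend this frame's noisy occlusion sample into the reprojected history.
// `history` is null when reprojection failed (disocclusion), which resets
// the accumulation; `maxFrames` caps the blend so results stay responsive.
function accumulateAO(
  history: Accum | null,
  sample: number,          // current frame's occlusion in [0, 1]
  maxFrames = 32,
): Accum {
  if (!history) return { value: sample, frames: 1 };
  const frames = Math.min(history.frames + 1, maxFrames);
  const alpha = 1 / frames; // newer samples weigh less as history grows
  return { value: history.value * (1 - alpha) + sample * alpha, frames };
}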

SSAO motion vectors

Motion vectors example

SSAO normal and occlusion accumulation

Accumulated samples

As Use.GPU doesn't render continuously, you can now use <Loop converge={N}> to decide how many extra frames you want to render after every visual change.

Reprojection requires access to the last frame's depth, normal and samples, and this is trivial to provide. Use.GPU has built-in transparent history for render targets and buffers. This allows for a classic front/back buffer flipping arrangement with zero effort (also, n > 2).

Depth history

Depth history

You bind this as virtual sources, each accessing a fixed slot history[i], which will transparently cycle whenever you render to its target. Any reimagined GPU API should seriously consider buffer history as a first-class concept. All the modern techniques require it.
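
As a rough illustration of what "transparent history" buys you, here is a minimal TypeScript sketch of an N-deep history ring, not the actual Use.GPU API: writing to the target and cycling rotates the slots so that slot 0 is always the most recently completed frame.

class BufferHistory<T> {
  private slots: T[];

  constructor(make: () => T, depth = 2) {
    this.slots = Array.from({ length: depth }, make);
  }

  // The buffer to render into this frame.
  get target(): T {
    return this.slots[this.slots.length - 1];
  }

  // Read-only view of previous frames: history(0) = last frame, etc.
  history(i: number): T {
    return this.slots[i];
  }

  // Call after rendering: the freshly written target becomes history(0).
  cycle(): void {
    const written = this.slots.pop()!;
    this.slots.unshift(written);
  }
}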

Interleaved Gradient Noise

IGN

Rather than use e.g. blue noise and hope the statistics work out, I chose a very precise sampling and blurring scheme. This uses interleaved gradient noise (IGN), and pre-filters samples in alternating 2x2 quads to help diffuse the speckles as quickly as possible. IGN is designed for 3x3 filters, so a more specifically tuned noise generator may work even better, but it's a decent V1.
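
For reference, the IGN formula itself is tiny. Here it is as a plain TypeScript function (the real thing runs in WGSL on the GPU); the constants are the published ones from Jimenez's presentation.

const fract = (x: number) => x - Math.floor(x);

// Interleaved gradient noise: cheap screen-space noise whose error
// diffuses well under small (e.g. 3x3) spatial filters. Output is in [0, 1).
function interleavedGradientNoise(x: number, y: number): number {
  return fract(52.9829189 * fract(0.06711056 * x + 0.00583715 * y));
}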

Reprojection often doubles as a cheap blur filter, creating free anti-aliasing under motion or jitter. I avoided this however, as the data being sampled includes the bent normals, and this would cause all edges to become rounded. Instead I use a precise bilateral filter based on depth and normal, aided by 3D motion vectors. This means it knows exactly what depth to expect in the last frame, and the reprojected samples remain fully aliased, which is a good thing here. The choice of 3D motion vectors is mainly a fun experiment, it may be an unnecessary luxury.

SSAO aliased accumulation

Detail of accumulated samples

The motion vectors are based only on the camera motion for now, though there is already the option of implementing custom motion shaders similar to e.g. Unity. For live data viz and procedural geometry, motion vectors may not even be well-defined. Luckily it doesn't matter much: it converges fast enough that artifacts are hard to spot.

The final resolve can then do a bilateral upsample of these accumulated samples, using the original high-res normal and depth buffer:

SSAO upscaled and resolved samples

Upscaled and resolved samples, with overscan trimmed off

Because it's screen-space, the shadows disappear at the screen edges. To remedy this, I implemented a very precise form of overscan. It expands the framebuffer by a constant amount of pixels, and expands the projectionMatrix to match. This border is then trimmed off when doing the final resolve. In principle this is pixel-exact, barring GPU quirks. These extra pixels don't go to waste either: they can get reprojected into the frame under motion, reducing visible noise significantly.

In theory this is very simple, as it's a direct scaling of the [-1..1] XY clip space. In practice you have to make sure absolutely nothing visual depends on the exact X/Y range of your projectionMatrix, either through its aspect ratio or through screen-space units. This required some cleanup on the inside, as Use.GPU has some pretty subtle scaling shaders for 2.5D and 3D points and lines. I imagine this is also why I haven't seen more people do this. But it's definitely worth it.
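
The core of the overscan trick fits in a few lines. This is a hedged sketch of the idea rather than Use.GPU's implementation: grow the framebuffer by a fixed pixel border, then shrink the projection's XY scale so the original view maps to the same pixels as before, with extra image around it.

// Scale factors that widen the view frustum to cover the overscan border.
function overscanScale(width: number, height: number, borderPx: number) {
  return {
    x: width / (width + 2 * borderPx),
    y: height / (height + 2 * borderPx),
  };
}

// For a column-major perspective matrix, only the XY focal terms change;
// everything inside the border stays pixel-identical (barring GPU quirks).
function applyOverscan(proj: Float32Array, sx: number, sy: number): Float32Array {
  const m = proj.slice();
  m[0] *= sx; // x scale
  m[5] *= sy; // y scale
  return m;
}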

Overall I'm very satisfied with this. Improvements and tweaks can be made aplenty, some performance tuning needs to happen, but it looks great already. It also works in both forward and deferred mode. The shader source is here.

Render Buffers & Passes

The rendering API for passes reflects the way a user wants to think about it, as one logical step in producing a final image. Sub-passes such as shadows or SSAO aren't really separate here, as the correct render cannot be finished without them.

The main entry point here is the <Pass> component, representing such a logical render pass. It sits inside a view, like an <OrbitCamera>, and has some kind of pre-existing render context, like the visible canvas.

<Pass
  lights
  ssao={{ radius: 3, indirect: 0.5 }}
  overscan={0.05}
>
  ...
</Pass>

You can sequence multiple logical passes to add overlays with overlay: true, or even merge two scenes in 3D using the same Z-buffer.

Inside it's a declarative recipe that turns a few flags and options into the necessary arrangement of buffers and passes required. This uses the alt-Live syntax use(…) but you can pretend that's JSX:

const resources = [
  use(ViewBuffer, options),
  lights ? use(LightBuffer, options) : null,
  shadows ? use(ShadowBuffer, options) : null,
  picking ? use(PickingBuffer, options) : null,
  overscan ? use(OverscanBuffer, options) : null,
  ...(ssao ? [
    use(NormalBuffer, options),
    use(MotionBuffer, options),
  ] : []),
  ssao ? use(SSAOBuffer, options) : null,
];
const resolved = passes ?? [
  normals ? use(NormalPass, options) : null,
  motion ? use(MotionPass, options) : null,
  ssao ? use(SSAOPass, options) : null,
  shadows ? use(ShadowPass, options) : null,
  use(DEFAULT_PASS[viewType], options),
  picking ? use(PickingPass, options) : null,
  debug ? use(DebugPass, options) : null,
]

E.g. the <SSAOBuffer> will spawn all the buffers necessary to do SSAO.

Notice what is absent here: the inputs and outputs. The render passes are wired up implicitly, because if you had to do it manually, there would only be one correct way. This is the purpose of separating the resources from the passes: it allows everything to be allocated once, up front, so that then the render passes can connect them into a suitable graph with a non-trivial but generally expected topology. They find each other using 'well-known names' like normal and motion, which is how it's done in practice anyway.

Mounted render passes

Render passes in the inspector

This reflects what I am starting to run into more and more: that decomposed systems have little value if everyone has to use them the same way. It can lead to a lot of code noise, and also tie users to unimportant details of the existing implementation. Hence the simple recipe.

But, if you want to sequence your own render exactly, nothing prevents you from using the render components à la carte: the main method of composition is mounting reactive components in Live, like everything else. Your passes work exactly the same as the built-in ones.

I make use of the dynamism of JS to e.g. not care what options are passed to the buffers and passes. The convention is that each should be namespaced so they don't collide. This provides real extensibility for custom use, while paving the cow paths that exist.

It's typical that buffers and passes come in matching pairs. However, one could swap out one variation of a <FooPass> for another, while reusing the same buffer type. Most <FooBuffer> implementations are themselves declarative recipes, with e.g. a <RenderTarget> or two, and perhaps an associated data binding. All the meat—i.e. the dispatches—is in the passes.

It's so declarative that there isn't much left inside <Renderer> itself. It maps logical calls into concrete ones by leveraging Live, and that's reflected entirely in what's there. It only gathers up some data it doesn't know details about, and helps ensure the sequence of compute before render before readback. This is a big clue that renderers really want to be reactive run-times instead.

Bind Group Soup

Use.GPU's initial design goal was "a unique shader for every draw call". This means its data binding fu has mostly been applied to local shader bindings. These apply only to one particular draw, and you bind the data to the shader at the same time as creating it.

This is the useShader hook. There is no separation where you first prepare the binding layout, and as such, you use it like a deferred function call, just like JSX.

// Prepare to call surfaceShader(matrix, ray, normal, size, ...)
const getSurface = useShader(surfaceShader, [
  matrix, ray, normal, size, insideRef, originRef,
  sdf, palette, pbr, ...sources
], defs);

Shader and pipeline reuse is handled via structural hashing behind the scenes: it's merely a happy benefit if two draw calls can reuse the same shader and pipeline, but absolutely not a problem if they don't. As batching is highly encouraged, and large data sets can be rendered as one, the number of draw calls tends to be low.
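
The structural hashing itself can be pictured as a simple cache keyed on the descriptor's structure. The sketch below is generic WebGPU TypeScript, not Use.GPU's code, and the JSON-based key is a stand-in for a real structural hash.

const pipelineCache = new Map<string, GPURenderPipeline>();

function getPipeline(
  device: GPUDevice,
  descriptor: GPURenderPipelineDescriptor,
): GPURenderPipeline {
  // Stable stringification as a stand-in for structural hashing; shader
  // modules are identified by their label instead of object identity.
  const key = JSON.stringify(descriptor, (_, v) =>
    typeof v === 'object' && v instanceof GPUShaderModule ? v.label : v,
  );
  let pipeline = pipelineCache.get(key);
  if (!pipeline) {
    pipeline = device.createRenderPipeline(descriptor);
    pipelineCache.set(key, pipeline);
  }
  return pipeline;
}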

All local bindings are grouped in two bind groups, static and volatile. The latter allows for the transparent history feature, as well as just-in-time allocated atlases. Static bindings don't need to be 100% static, they just can't change during dispatch or rendering.

WebGPU only has four bind groups total. I previously used the other two for respectively the global view, and the concrete render pass, using up all the bind groups. This was wasteful but an unfortunate necessity, without an easy way to compose them at run-time.

Bind Group      #0      #1      #2        #3
Use.GPU 0.13    View    Pass    Static    Volatile
Use.GPU 0.14    Pass    Static  Volatile  Free

This has been fixed in 0.14, which frees up a bind group. It also means every render pass fully owns its own view. It can pick from a set of pre-provided ones (e.g. overscanned or not), or set a custom one, the same way it finds buffers and other bindings.

Having bind group 3 free also opens up the possibility of a more traditional sub-pipeline, as seen in a traditional scene graph renderer. These can handle larger amounts of individual draw calls, all sharing the same shader template, but with different textures and parameters. My goal however is to avoid monomorphizing to this degree, unless it's absolutely necessary (e.g. with the lighting).

This required upgrading the shader linker. Given e.g. a static binding snippet such as:

use '@use-gpu/wgsl/use/types'::{ Light };

@export struct LightUniforms {
  count: u32,
  lights: array<Light>,
};

@group(PASS) @binding(1) var<storage> lightUniforms: LightUniforms;

...you can import it in Typescript like any other shader module, with the @binding as an attribute to be linked. The shader linker will understand struct types like LightUniforms with array<Light> fully now, and is able to produce e.g. a correct minimum binding size for types that cross module boundaries.

The ergonomics of useShader have been replicated here, so that useBindGroupLayout takes a set of these and prepares them into a single static bind group, managing e.g. the shader stages for you. To bind data to the bind group, a render pass delegates via useApplyPassBindGroup: this allows the source of the data to be modularized, instead of requiring every pass to know about every possible binding (e.g. lighting, shadows, SSAO, etc.). That is, while there is a separation between bind group layout and data binding, it's lazy: both are still defined in the same place.

SSAO on voxels

The binding system is flexible enough end-to-end that the SSAO can e.g. be applied to the voxel raytracer from @use-gpu/voxel with zero effort required, as it also uses the shaded technique (with per fragment depth). It has a getSurface(...) shader function that raytraces and returns a surface fragment. The SSAO sampler can just attach its occlusion information to it, by decorating it in WGSL.

WGSL Types

Worth noting, this all derives from previous work on auto-generated structs for data aggregation.

It's cool tech, but it's hard to show off, because it's completely invisible on the outside, and the shader code is all ugly autogenerated glue. There's a presentation up on the site that details it at the lower level, if you're curious.

The main reason I had aggregation initially was to work around the 8 storage buffers limit in WebGPU. The Plot API needed to auto-aggregate all the different attributes of shapes, with their given spread policies, based on what the user supplied.

This allows me to offer e.g. a bulk line drawing primitive where attributes don't waste precious bandwidth on repeated data. Each ends up grouped in structs, taking up only 1 storage buffer, depending on whether it is constant or varying, per instance or per vertex:

<Line
  // Two lines
  positions={[
    [[300, 50], [350, 150], [400, 50], [450, 150]],
    [[300, 150], [350, 250], [400, 150], [450, 250]],
  ]}
  // Of the same color and width
  color={'#40c000'}
  width={5}
/>

<Line
  // Two lines
  positions={[
    [[300, 250], [350, 350], [400, 250], [450, 350]],
    [[300, 350], [350, 450], [400, 350], [450, 450]],
  ]}
  // With color per line
  color={['#ffa040', '#7f40a0']}
  // And width per vertex
  widths={[[1, 2, 2, 1], [1, 2, 2, 1]]}
/>

This involves a comprehensive buffer interleaving and copying mechanism, that has to satisfy all the alignment constraints. This then leverages @use-gpu/shader's structType(…) API to generate WGSL struct types at run-time. Given a list of attributes, it returns a virtual shader module with a real symbol table. This is materialized into shader code on demand, and can be exploded into individual accessor functions as well.
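
The alignment bookkeeping is where naive interleaving goes wrong, since e.g. vec3<f32> occupies 12 bytes but aligns to 16. As a rough, illustrative sketch (not the structType(…) implementation), computing struct offsets per the WGSL layout rules looks like this:

const LAYOUT: Record<string, { size: number; align: number }> = {
  'f32':       { size: 4,  align: 4 },
  'u32':       { size: 4,  align: 4 },
  'vec2<f32>': { size: 8,  align: 8 },
  'vec3<f32>': { size: 12, align: 16 },
  'vec4<f32>': { size: 16, align: 16 },
};

function structLayout(fields: { name: string; type: string }[]) {
  let offset = 0;
  let structAlign = 1;
  const offsets = fields.map(({ name, type }) => {
    const { size, align } = LAYOUT[type];
    offset = Math.ceil(offset / align) * align; // round up to member alignment
    structAlign = Math.max(structAlign, align);
    const entry = { name, offset };
    offset += size;
    return entry;
  });
  // Total size rounds up to the struct's own alignment.
  const size = Math.ceil(offset / structAlign) * structAlign;
  return { offsets, size };
}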

Hence data sources in Use.GPU can now have a format of T or array<T> with a WGSL shader module as the type parameter. I already had most of the pieces in place for this, but hadn't quite put it all together everywhere.

Using shader modules as the representation of types is very natural, as they carry all the WGSL attributes and GPU-only concepts. It goes far beyond what I had initially scoped for the linker, as it's all source-code-level, but it was worth it. The main limitation is that type inference only happens at link time, as binding shader modules together has to remain a fast and lazy op.

Native WGSL types are somewhat poorly aligned with the WebGPU API on the CPU side. A good chunk of @use-gpu/core is lookup tables with info about formats and types, as well as alignment and size, so it can all be resolved at run-time. There's something similar for bind group creation, where it has to translate between a few different ways of saying the same thing.

The types I expose instead are simple: TextureSource, StorageSource and LambdaSource. Everything you bind to a shader is either one of these, or a constant (by reference). They carry all the necessary metadata to derive a suitable binding and accessor.

That said, I cannot shield you from the limitations underneath. Texture formats can e.g. be renderable or not, filterable or not, writeable or not, and the specific mechanisms available to you vary. If this involves native depth buffers, you may need to use a full-screen render pass to copy data, instead of just calling copyTextureToTexture. I run into this too, and can only provide a few more convenience hooks.

I did come up with a neat way to genericize these copy shaders, using the existing WGSL type inference I had, souped up a bit. This uses simple selector functions to serve the role of reassembling types. It's finally given me a concrete way to make 'root shaders' (i.e. the entry points) generic enough to support all use. I may end up using something similar to handle the ordinary vertex and fragment entry points, which still have to be provided in various permutations.

* * *

Phew. Use.GPU is always a lot to go over. But its à la carte nature remains and that's great.

For in-house use it's already useful, especially if you need a decent GPU on a desktop anyway. I have been using it for some client work, and it seems to be making people happy. If you want to go off-road from there, you can.

It delivers on combining low-level shader code with its own stock components, without making you reinvent a lot of the wheels.

Visit usegpu.live for more and to view demos in a WebGPU capable browser.

PS: I upgraded the aging build of Jekyll that was driving this blog, so if you see anything out of the ordinary, please let me know.

March 21, 2025

On the deep desire to get scammed

Following the trends and doing like everyone else

Stefano Marinelli, a seasoned system administrator, mostly installs FreeBSD, OpenBSD or NetBSD servers for his clients. The hardest part? Convincing a client who absolutely wants a "Kubernetes cluster running on Linux", without knowing what that is, that it is not always a good idea. On the other hand, if he silently migrates virtual machines to FreeBSD jails, he gets panicked calls because "everything is suddenly too fast, how much is this hardware upgrade going to cost us?".

That is the big problem with the engineering profession: the engineer is supposed to analyse a problem and propose solutions, but a manager, to justify his job, has most of the time already decided which solution he wants the engineer to implement, even when it is unsuitable.

Fortunately, conflicts are becoming rarer and rarer: every engineering school now teaches management, and most engineering students no longer learn to be critical when solving problems. Universities are creating a world of Juliuses:

Those who dare to ask "but why?" are the exceptions, the rebels.

Stefano goes on with other anecdotes: how a project fell apart because a developer's bad code kept filling up the disks of Stefano's servers. Rather than fixing the code, it was deemed more diplomatic to listen to the developer and "move to the cloud". The disks no longer filled up within a few hours as before. The project ran for a month on the "cloud" before the bill arrived. And the project's bank account was emptied.

Or how a healthcare organisation refuses to update its servers so it can invest in the design of a "cloud" infrastructure which, 5 years later, is still at the design stage despite the budget poured into the "cloud consultant". The organisation ends up running… Windows XP and calls Stefano when everything crashes.

The SEO scam

I went through a similar story when I set up, for a small company, a website with a CMS part, order management and invoice generation (I had built everything with Django). One day, I get a phone call from someone I don't know asking me for access to the server hosting the site. I refuse, of course, but the tone rises. I hang up, convinced I am dealing with some kind of scam. A few minutes later, my client calls me to ask why I did not give access to the person who called me. I tried the reasonable approach, "Do you really want me to give access to your whole infrastructure to the first person who calls and asks for it?", without success. I finally agreed to hand over access, but explained that I required a written order from her and that I would then decline all responsibility. At that point, the client seemed to understand.

After much explaining, it turned out that she had hired, without my knowledge, an SEO consultant who wanted to add a Google Analytics code to her site. SEO, Search Engine Optimisation, consists of trying to make a website rank higher in Google results.

I explained to my client that even with access to the server, the SEO guy would have been unable to modify the Django code, but that it was no problem: he only had to email me the code to add (to this day I still wonder what the guy would have done if I had given him the "administrator access" to the server he was asking for). A few days later, a second email asks me to change the Google Analytics code that had been added. I comply.

Then I start receiving complaints that I am not doing my job, that the code is not the right one. I change it again. The same circus repeats two or three times and my client gets angry and calls me incompetent. It took me several days of investigation and several conference calls with the SEO people to realise that the emails were coming from two different SEO companies (with similar domain names, which had gone over my head while reading the emails).

My client had in fact hired two different SEO companies, without telling them and without telling me. The two companies were therefore fighting to put in their own Google Analytics code, not understanding why I kept putting in the "wrong" one. The whole thing came to light during a heated conference call where I pointed to an email received the day before, which my correspondent claimed never to have sent (naturally, it came from the other company).

I confronted my client and managed to find out that, apart from providing summaries taken from Google Analytics, these two companies did nothing, yet each had been paid three times the price I had asked for building the entire site, the order management and the invoicing. That, by the way, was why the client looked down on me compared to the SEO companies: I was cheap, therefore I was necessarily incompetent.

To be fair, one of the companies had done its "work" and sent me a report with minor changes to make to the site to improve SEO, while noting that the site was already very good and that there was not much to do (essentially, they asked me to add keywords to the meta tags, something I already knew was obsolete at the time, but which I did without arguing).

Furious, I published a post that shocked the SEO community so much that I received dozens of insulting emails and even physical threats (you know, the kind where the guy has dug up personal information and tries to intimidate you by showing that he knows how to run a Google search on your name).

A whole community took up the game of making sure that the first Google result for my name would be a string of insults. Flattered by so much attention for a simple, unpretentious blog post, I mostly realised, reading the forums where they discussed my case, that I was dealing with dishonest, unscrupulous people, in short, stupid and nasty to an almost parodic degree.

The enshittification of the web through SEO

Some, more moderate, tried to convince me that "not all SEO". My answer: yes, all of it. It is the very principle. You just don't want to see it because you are someone with a certain ethic and it conflicts with your source of income. But it is kind of you to write to me calmly without insulting me.

The web has become a huge pile of garbage generated by SEO.

Solderpunk, for example, wonders about a mysterious cloud cover measurement but, faced with the enshittification of the web and the technological appropriation of the word "cloud", he falls back on asking his question to other humans, on the Gemini network. Because the web no longer lets him find an answer or ask other human beings.

The web was supposed to connect us; enshittification and AI force us to withdraw into alternative spaces where we can talk between humans, even to solve the very problems for which AI and the web are supposed to be most useful: answering our technical and factual questions, digging up rare and hard-to-find information.

Close your accounts on the enshittified platforms

This return to small communities is a movement. Thierry Crouzet is also taking up Gemini:

But, above all, he is permanently closing Facebook, X, Bluesky, Instagram and soon, perhaps, Whatsapp. For those who hesitate to do the same, first-hand accounts are always interesting.

Thierry is not alone: Vigrey is also closing his Facebook account and writes about it… on Gemini.

One thing is certain: you will not manage to migrate all your contacts, for a simple reason. Many people want to be scammed. They ask for it. Like my entrepreneur client, they do not want a rational argument, they do not want a solution. It is up to you not to let them decide your digital future.

And do not expect everyone to be on the same social network one day.

The global impact of AI on the web

AI essentially produces crap and should never be trusted. That much you already know.

But it mostly has a huge impact on those who do not use it. Much is said about the resources consumed in datacenters but, far closer and more directly, AIs flood the web with requests in an attempt to suck up every piece of content imaginable.

There is a standard, well established for decades, that lets you put a file called "robots.txt" on your website. This file contains the rules that a robot accessing your site must respect. It allows you, for example, to tell Google's robot not to visit certain pages, or not too often.

Unsurprisingly, the robots used by AI do not respect these rules. Worse, they disguise themselves to look like real users. They are therefore fundamentally dishonest and know very well what they are doing: they literally come and copy your content without your consent in order to reuse it. But they do it hundreds, thousands of times per second. Which puts the whole infrastructure of the web under strain.

Drew DeVault writes about his experience with the Sourcehut infrastructure, on which this blog is hosted.

All those datacenters built in a hurry to do "AI"? They are being used to run DoS (Denial of Service) attacks on the entire infrastructure of the web, with the goal of "pirating" content without respecting licences and copyright.

It is not that I am a fan of copyright, quite the contrary. It is just that for 30 years we have been told that "copying is theft", and that Aaron Swartz killed himself because he was facing 30 years in prison for having automated the download of a few thousand scientific articles which he considered, rightly, to belong to the public domain.

AI consumes resources, destroys our networks, brings the volunteer system administrators of community sites to their knees, and appropriates our content. And all of that for what? To generate SEO content that will fill up the web even more. Yes, it is a loop. No, it cannot end well.

The fashion for incompetence

SEO, the cloud and now AI are very similar in this respect: they are fashions. Clients want them at all costs and literally ask to be scammed, while boasting about their incompetence.

In a sense, it serves them right: they want the fashionable thing without even knowing why they want it. My client wanted SEO even though hers was an essentially local business targeting a niche clientele she was already in contact with. Clients want "cloud" so they do not have to pay a system administrator like Stefano, yet pay ten times the price for a consultant and end up calling Stefano when everything goes wrong. In the same way, they now want AI without even knowing why they want it.

AI is, in fact, the junk food of thought: appetising to look at, but with no nutritional value and, in the long run, a total loss of any culture of taste and flavour.

Even though I handed over all the codes and all the access, even though I put her in touch with other Django developers, the company I talk about in this post did not survive long after my departure. Its initial capital and, above all, the state business-creation subsidies it was receiving essentially ended up in the pockets of two SEO companies that did nothing other than create a Google Analytics account. Today it is the same with the cloud and AI: the game is to exploit to the fullest the credulity of small entrepreneurs who are able to obtain state subsidies, in order to empty their pockets. As well as the state's own pockets, into which politicians dig with boundless enthusiasm as soon as a fashionable buzzword is uttered.

I naively thought I was offering an ethical service; I thought I was talking with clients to address their real needs.

I never imagined that clients absolutely wanted to be scammed.

I'm Ploum and I have just published Bikepunk, an eco-cyclist fable typed entirely on a mechanical typewriter. To support me, buy my books (if possible from your local bookshop)!

Receive my writings in French and English directly by email. Your address will never be shared. You can also use my French-language RSS feed or the complete RSS feed.

March 19, 2025

In the previous post, we saw how to deploy MySQL HeatWave on Amazon. Multicloud refers to the coordinated use of cloud services from multiple providers. In addition to our previous post, where we deployed MySQL HeatWave on Amazon, we will explore how to connect with another cloud service. Oracle has partnered with Microsoft to offer […]

March 15, 2025

When I searched for a new LoRaWAN indoor gateway, my primary criterion was that it should be capable of running open-source firmware. The ChirpStack Gateway OS firmware caught my attention. It's based on OpenWrt and has regular releases. Its recent 4.7.0 release added support for the Seeed SenseCAP M2 Multi-Platform Gateway, which seemed like an interesting and affordable option for a LoRaWAN gateway.

Unfortunately, this device wasn't available through my usual suppliers. However, TinyTronics did stock the SenseCAP M2 Data Only, which looked to me like exactly the same hardware but with different firmware to support the Helium LongFi Network. Ten minutes before their closing time on a Friday evening, I called their office to confirm whether I could use it as a LoRaWAN gateway on an arbitrary network. I was helped by a guy who was surprisingly friendly for the time of my call, and after a quick search he confirmed that it was indeed the same hardware. After this, I ordered this Helium variant of the gateway.

Upon its arrival, the first thing I did after connecting the antenna and powering it on was to search for the Backup/Flash Firmware entry in Luci's System menu, as explained in Seeed Studio's wiki page about flashing open-source firmware to the M2 Gateway. Unfortunately, the M2 Data Only seemed to have a locked-down version of OpenWrt's Luci interface, without the ability to flash other firmware. There was no SSH access either. I tried to flash the firmware via TFTP, but to no avail.

After these disappointing attempts, I submitted a support ticket to Seeed Studio, explaining my intention to install alternative firmware on the device, as I wasn't interested in the Helium functionality. I received a helpful response from a field application engineer with the high-level steps to do this, although I had to fill in some details myself. After getting stuck on a missing step, my follow-up query was promptly answered with the missing information and an apology for the incomplete instructions, and I finally succeeded in installing the ChirpStack Gateway OS on the SenseCAP M2 Data Only. Here are the detailed steps I followed.

Initial serial connection

Connect the gateway via USB and start a serial connection with a baud rate of 57600. I used GNU Screen for this purpose:

$ screen /dev/ttyUSB0 57600

When the U-Boot boot loader shows its options, press 0 for Load system code then write to Flash via Serial:

/images/sensecap-m2-uboot-menu.png

You'll then be prompted to switch the baud rate to 230400 and press ENTER. I terminated the screen session with Ctrl+a k and reconnected with the new baud rate:

$ screen /dev/ttyUSB0 230400

Sending the firmware with Kermit

Upon pressing ENTER, you'll see the message Ready for binary (kermit) download to 0x80100000 at 230400 bps.... I had never used the Kermit protocol before, but I installed ckermit and found the procedure in a Stack Overflow answer to the question How to send boot files over uart. After some experimenting, I found that I needed to use the following commands:

 koan@nov:~/Downloads$ kermit
C-Kermit 10.0 pre-Beta.11, 06 Feb 2024, for Linux+SSL (64-bit)
 Copyright (C) 1985, 2024,
  Trustees of Columbia University in the City of New York.
  Open Source 3-clause BSD license since 2011.
Type ? or HELP for help.
(~/Downloads/) C-Kermit>set port /dev/ttyUSB0
(~/Downloads/) C-Kermit>set speed 230400
/dev/ttyUSB0, 230400 bps
(~/Downloads/) C-Kermit>set carrier-watch off
(~/Downloads/) C-Kermit>set flow-control none
(~/Downloads/) C-Kermit>set prefixing all
(~/Downloads/) C-Kermit>send openwrt.bin

The openwrt.bin file was the firmware image from Seeed's own LoRa_Gateway_OpenWRT firmware. I decided to install this instead of the ChirpStack Gateway OS because it was a smaller image and hence flashed more quickly (although still almost 8 minutes).

/images/sensecap-m2-kermit-send.png

After the file was sent successfully, I didn't see any output when reestablishing a serial connection. After I reported this to Seeed's field application engineer, he replied that the gateway should display a prompt asking to switch the baud rate back to 57600.

Kermit can also function as a serial terminal, so I just stayed within the Kermit command line and entered the following commands:

(~/Downloads/) C-Kermit>set speed 57600
/dev/ttyUSB0, 57600 bps
(~/Downloads/) C-Kermit>connect
Connecting to /dev/ttyUSB0, speed 57600
 Escape character: Ctrl-\ (ASCII 28, FS): enabled
Type the escape character followed by C to get back,
or followed by ? to see other options.
----------------------------------------------------
## Total Size      = 0x00840325 = 8651557 Bytes
## Start Addr      = 0x80100000
## Switch baudrate to 57600 bps and press ESC ...

And indeed, there was the prompt. After pressing ESC, the transferred image was flashed.

Reboot into the new firmware

Upon rebooting, the device was now running Seeed's open-source LoRaWAN gateway operating system. Luci's menu now included a Backup/Flash Firmware entry in the System menu, enabling me to upload the ChirpStack Gateway OS image:

/images/sensecap-m2-openwrt-new-firmware.png

Before flashing the firmware image, I deselected the Keep settings and retain the current configuration option, as outlined in ChirpStack's documentation for installation on the SenseCAP M2:

/images/sensecap-m2-openwrt-flash.png

Thus, I now have open-source firmware running on my new LoRaWAN gateway, with regular updates in place.

Imagine waking up to discover that overnight, AI agents rewrote 500 product descriptions, reorganized 300 pages for SEO, and updated 9,000 alt-text descriptions on your website.

As you review the changes over coffee, you find three product descriptions featuring nonexistent features. If published, customers will order based on false expectations. Then you notice another problem: AI rewrote hundreds of alt-text descriptions, erasing the ones your team crafted for accessibility.

AI-driven content management isn't a distant scenario. Soon, Content Management Systems (CMS) may deploy hundreds of AI agents making bulk edits across thousands of pages.

The challenge? Traditional CMS workflows weren't designed for AI-powered editing at scale. What features should an AI-first CMS include? What safeguards would prevent errors? What workflows would balance efficiency with quality control? I'm outlining some rough ideas to start a conversation and inspire Drupal contributors to help build this future.

1. Smart review queues: scaling human oversight

AI-generated content needs different quality checks than human work. Current editorial workflows aren't optimized to handle its output volume.

I envision "AI review queues" with specialized tools like:

  • Spot-checking: Instead of manually reviewing everything, editors can sample AI content strategically. They focus on key areas, like top-selling products or pages flagged by anomaly detection. Reviewing just 5% of the changes could provide confidence; good samples suggest the broader set works well. If issues are found, it signals the need for deeper review (a minimal sampling sketch follows this list).
  • Rolled-up approvals: Instead of approving AI edits one by one, CMS platforms could summarize large-scale AI changes into a single reviewable batch.
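
As a minimal sampling sketch in TypeScript, and purely hypothetical (the AiEdit shape and field names are invented for illustration, not an existing CMS API), spot-checking could combine anomaly flags with an impact-weighted 5% sample:

interface AiEdit {
  pageId: string;
  impactScore: number; // e.g. traffic or revenue weight
  flagged: boolean;    // anomaly detection hit
}

function selectForReview(edits: AiEdit[], sampleRate = 0.05): AiEdit[] {
  const mustReview = edits.filter((e) => e.flagged);
  const rest = edits
    .filter((e) => !e.flagged)
    .sort((a, b) => b.impactScore - a.impactScore);
  const sampleSize = Math.ceil(rest.length * sampleRate);
  // Always review flagged edits, then the highest-impact slice of the rest.
  return [...mustReview, ...rest.slice(0, sampleSize)];
}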

2. Git-like content versioning: selective control over AI changes

Say an AI translated your site into Spanish with mixed results. Meanwhile, editors updated the English content. Without sophisticated versioning, you face a tough choice: keep poor translations or roll everything back, losing days of human work.

CMS platforms need Git-like branch-based versioning for content. AI contributions should exist in separate branches that teams can merge, modify, or reject independently.

3. Configuration versioning: keeping AI from breaking your CMS

AI isn't just generating content. It is also modifying site configurations, permissions, content models and more. Many CMS platforms don't handle "configuration versioning" well. Changes to settings and site structures are often harder to track and undo.

CMS platforms also need Git-like versioning for configuration changes, allowing humans to track, review, and roll back AI-driven modifications just as easily as content edits. This ensures AI can assist with complex site management tasks without introducing silent, irreversible changes.

4. Enhanced audit trails: understanding AI decisions

Standard CMS audit logs track who made changes and when, but AI operations demand deeper insights. When multiple AI agents modify your site, we need to know which agent made each change, why it acted, and what data influenced its decision. Without these explanations, tracking down and fixing AI errors becomes nearly impossible.

AI audit trails should record confidence scores showing how certain an agent was about its changes (60% vs 95% certainty makes a difference). They need to document reasoning paths explaining how each agent reached its conclusion, track which model versions and parameters were used, and preserve the prompt contexts that guided the AI's decisions. This comprehensive tracking creates accountability in multi-agent environments where dozens of specialized AIs might collaborate on content.
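
A possible shape for such an audit entry, mirroring the fields described above (illustrative TypeScript, not an existing CMS schema):

interface AiAuditEntry {
  agentId: string;       // which agent made the change
  entityId: string;      // content item or configuration that was touched
  timestamp: string;     // ISO 8601
  confidence: number;    // 0..1; 0.6 vs 0.95 changes how you triage
  reasoning: string[];   // steps explaining how the agent reached its conclusion
  model: { name: string; version: string; parameters: Record<string, unknown> };
  promptContext: string; // the prompt context that guided the decision
}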

This transparency also supports compliance requirements, ensuring organizations can demonstrate responsible AI oversight.

5. AI guardrails: enforcing governance and quality control

AI needs a governance layer to ensure reliability and compliance. Imagine a healthcare system where AI-generated medical claims must reference approved clinical studies, or a financial institution where AI cannot make investment recommendations without regulatory review.

Without these guardrails, AI could generate misleading or non-compliant content, leading to legal risks, financial penalties, or loss of trust.

Instead of just blocking AI from certain tasks, AI-generated content should be checked for missing citations, regulatory violations, and factual inconsistencies before publication.

Implementing these safeguards likely requires a "rules engine" that intercepts and reviews AI outputs. This could involve pattern matching to detect incorrect content, as well as fact verification against approved databases and trusted sources. For example, a healthcare CMS could automatically verify AI-generated medical claims against clinical research databases. A financial platform might flag investment advice containing unapproved claims for compliance review.
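
A minimal sketch of such a rules engine, with placeholder rules rather than a real compliance product: each rule inspects an AI-generated draft before publication and returns a violation or null.

interface Draft { body: string; citations: string[] }
type Rule = (draft: Draft) => string | null;

const requiresCitation: Rule = (d) =>
  /clinical|study|trial/i.test(d.body) && d.citations.length === 0
    ? 'Medical claim without an approved citation'
    : null;

const noUnapprovedAdvice: Rule = (d) =>
  /guaranteed return|cannot lose/i.test(d.body)
    ? 'Unapproved investment claim, needs compliance review'
    : null;

function checkDraft(
  draft: Draft,
  rules: Rule[] = [requiresCitation, noUnapprovedAdvice],
): string[] {
  return rules.map((rule) => rule(draft)).filter((v): v is string => v !== null);
}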

Strategic priorities for modern CMS platforms

I can't predict exactly how these ideas will take shape, but I believe their core principles address real needs in AI-integrated content management. As AI takes on a bigger role in how we manage content, building the right foundation now will pay off regardless of specific implementations. Two key investment areas stand out:

  1. Improved version control – AI and human editors will increasingly work in parallel, requiring more sophisticated versioning for both content and configuration. Traditional CMS platforms must evolve to support Git-like branching, precise rollback controls, and configuration tracking, ensuring both content stability and site integrity.
  2. AI oversight infrastructure – As AI generates and modifies content at scale, CMS platforms will need structured oversight systems. This includes specialized review queues, audit logs, and governance frameworks.

March 11, 2025

Don't wait, change your paradigms!

You have to go without a car for a while to truly understand, deep down, that the solution to many of our societal problems is not the electric car but the cyclable city.

We should not look for "equivalent alternatives" to what the market offers us; we must change the paradigms, the foundations. If we do not change the problem, if we do not deeply rethink our expectations and our needs, we will always end up with the same solution.

Migrating your contacts to Signal

I receive a lot of messages asking me how I migrated to Mastodon and to Signal. And how I migrated my contacts to Signal.

There is no secret. Only one strategy really works to get your contacts interested in ethical alternatives: no longer being on the proprietary networks.

I know it is hard, that it feels like cutting yourself off from the world. But there is no other solution. The first one to leave excludes themselves, true. But the second one who, inspired, dares to follow the first sets off an inexorable movement. Because while one person who opts out is an "eccentric" or a "marginal", two people form a group. Suddenly, the followers are afraid of missing the boat.

So you have to summon your courage, announce your departure and stand firm. People need you just as you need them. They will eventually want to reach you. Yes, you will miss some information while people realise you are no longer there. Yes, some people who are on both networks will have to act as a bridge for a while.

You also have to accept the hard truth that some of your contacts are contacts only out of convenience, not out of any deep desire. Very few people truly care about you. That is humanity's lot. Even a star who leaves a social network only takes a fraction of their followers along. And not in any lasting way. Nobody is indispensable.

Refusing to leave a network until "everyone" is on the alternative amounts to the frightening realisation that the most reactionary, the most conservative member of the group dictates your choices. Their refusal to move gives them an outsized power over you and over everyone else. They represent "the majority" simply because you, who want to move, tolerate their reactionary side. But if you say you want to move and never do, are you not a conservative yourself?

You really want to do without Whatsapp and Messenger? Don't wait, do it! Delete your account for a month to see the impact on your life. Leave yourself the latitude to recreate the account if it turns out the deletion is not sustainable for you in the long run. But at least you will have tested the new paradigm, you will have become aware of your real needs.

Adopting the Fediverse

Joan Westenberg puts it very well regarding the Fediverse: the Fediverse is not the future, it is the present. Its problem is not that it is complicated or that nobody is there: it is simply that Google/Facebook/Apple marketing has formatted our brains into believing that the alternatives are not viable. The Fediverse is full of humans and creativity, but none are so blind as those who will not see.

After balking for years at committing to it fully, Thierry Crouzet reaches the same conclusion: as far as social networks go, the Fediverse is the only viable solution. Using a proprietary network is a compromise and a collaboration with that network's ideology. He encourages the players of the French-language book world to join the Fediverse.

I myself maintain a list of active speculative fiction writers on the Fediverse. There are still far too few of them.

Your favourite influencer is not on the Fediverse? But is it really essential to follow your favourite influencer on a social network? You are not on X because you want to follow that influencer. You follow that influencer because X makes you believe it is essential to being a true fan! The tool does not meet a need, it creates one from scratch.

The paradox of tolerance

You tolerate staying on Facebook/Messenger/Whatsapp out of "respect for those who are not elsewhere"? You tolerate, keeping your mouth shut, your racist and homophobic uncle Albert spouting horrors at the family dinner so as "not to make things worse"? Besides, your aunt told you that "it wasn't worth it, that you were better than that". You tolerate, without a word, smokers stinking you out on station platforms and café terraces out of "respect for their freedom"?

At some point, you have to choose: either you prefer not to make waves, or you want progress. The two are often incompatible.

You want to do without Facebook/Instagram/X? Once again, do it! Most of these networks let you restore a deleted account within 15 days of deletion. So test it! Two weeks without an account, to see whether you really feel like restoring it. It is up to you to change your paradigm!

LinkedIn, the quintessential bullshit network

There is a lot of talk about X because the platform is becoming a major promoter of fascism. But every platform carries values that it is important to identify, so as to know whether they suit us or not. LinkedIn, for example. Which is indistinguishable from the parody of it made by Babeleur (who has, as it happens, just left that network).

I burst out laughing several times, it is that good. I wonder whether some people will have the lucidity to recognise themselves in it.

Once again: if LinkedIn bores you, if you hate that network, yet it seems indispensable so as not to "miss" certain professional opportunities, well, test it! Delete it for two weeks. Restore it, then delete it again. Just to see what it feels like to no longer be on that network. What it feels like to miss that big, foul-smelling pile of manure you force yourself to dig through daily in case it contains a gold nugget. Maybe this network really is indispensable to you, but the only way to know is to try doing without it for good.

Maybe you will miss some opportunities. But I am certain of one thing: by not being on that network, you will discover others.

Poetry, fiction…

Resistance is not only technical. It must also be poetic! And for poetry to work, technology needs to fade into the background, to become minimalist and useful instead of being the centre of attention.

You cannot change the world. You can only change your own behaviour. The world is shaped by those who change their behaviour. So try to change. Try to change paradigms. For a week, a month, a year.

That said, I won't hide the fact that there is a risk: it is often hard to go back.

Once you have given up the car for the bicycle, it is impossible not to dream. You start imagining worlds where the car has completely disappeared to make way for the bicycle…

Book signings

Speaking of which, I will be signing Bikepunk (and my other books) at the Foire du livre de Bruxelles this Saturday, 15 March, from 16:30 on the stand of the province of Brabant-Wallon.

See you there to talk bikes and paradigm shifts?

I am Ploum and I have just published Bikepunk, an eco-cyclist fable typed entirely on a mechanical typewriter. To support me, buy my books (if possible from your local bookshop)!

Receive my writings in French and English directly by email. Your address will never be shared. You can also use my French-language RSS feed or the full RSS feed.

March 08, 2025

20 years of Linux on the Desktop (part 3)

Previously in "20 years of Linux on the Deskop": After contributing to the launch of Ubuntu as the "perfect Linux desktop", Ploum realises that Ubuntu is drifting away from both Debian and GNOME. But something else is about to shake the world…

The new mobile paradigm

While I was focused on Ubuntu as a desktop solution, another GNOME+Debian product had appeared and was shaking the small free software world: Maemo.

It will come as a shock to the youngest, but this was a time without smartphones (yes, we had electricity and, no, the dinosaurs were already extinct; please keep playing Pokémon instead of interrupting me). Mobile phones were still quite new and did exactly two things: calls and SMSes. In fact, they were sold as calling machines, and the SMS frenzy, which was just a technical hack around the GSM protocol, took everybody by surprise, including operators. Were people really using awkward, cramped keyboards to send each other floods of small messages?

Small pocket computers with tiny keyboards started to appear. They were running proprietary operating systems like WinCE or Symbian and browsing a mobile version of the web, called "WAP", that required specific WAP sites and that nobody used. The Blackberry was so proprietary that it had its own proprietary network. It was particularly popular amongst business people who wanted to look serious. Obama was famously addicted to his Blackberry, to the point that the firm had to create a secure proprietary network just for him once he took office in the White House. But, like the others, Blackberries were very limited, with very limited software. Nothing like a laptop computer.

N770, the precursor

In 2005, Nokia very quietly launched the N770 as an experiment. Unlike its competitors, it had no keyboard but a wide screen that could be used with a stylus. Inside, it ran a Debian system with an interface based on GNOME: Maemo.

The N770, browsing Wikipedia

Instead of doing all the development in-house, Nokia was toying with free software. Most of the software work was done by small European companies created by free software hackers between 2004 and 2005. Those companies, often created specifically to work with Nokia, had only a handful of people each and very narrow expertise. Fluendo was working on the media framework GStreamer. Imendio was working on the GTK user interface layer. Collabora was focusing on messaging software. Etc.

Far from the hegemony of the American giant monopolists, the N770 was a mostly European attempt at innovating through a collaborative network of smaller, creative actors, all led by the giant Nokia.

During FOSDEM 2005, GNOME developer Vincent Untz lent me an N770 prototype for two days. The first night was a dream come true: I was lying in bed, chatting on IRC and reading forums. Once the N770 was publicly released, I immediately bought my own. While standing in line at the bakery one Sunday morning, I discovered that there was an unprotected wifi network. I used it to post a message on the Linuxfr website telling my fellow geeks that I was waiting for my croissants and could still chat with them thanks to free software.

These days, chatting while waiting in a queue has been normalised to the point that you notice when someone is not doing it. But, in 2005, this was brand new.

So new that it started a running meme about "Ploum’s baker" on Linuxfr. Twenty years later, some people I meet for the first time still greet me with "say hello to your baker" when they learn who I am. For the record, the baker, an already-old woman at the time of the original post, retired a couple of years later and the whole building was demolished to make way for a motorbike shop.

This anecdote highlights a huge flaw of the N770: without wifi, it was a dead weight. When I showed it to people, they didn’t understand what it was and asked why I would carry it if I could not make calls with it. Not being able to use the Internet without wifi was a huge miss but, to be fair, 3G didn’t exist yet. Another flaw was that installing new software was far from user-friendly. Being based on Debian, Maemo offered a Synaptic-like interface where you had to select your software from a very long list of .deb packages, including the technical libraries.

Also, it was slow and prone to crashing, but that could be solved.

Having played with the N770 in my bed and having seen the reactions of people around me when I used it, I knew that the N770 could become a worldwide hit. It was literally the future. There were only two things that Nokia needed to solve: make it a phone and make it easy to install new software. Also, if it could crash less, that would be perfect.

The Nokia (un)management guide to failure

But development seemed to stall. It would take more than two years for Nokia to release two successors to the N770: the N800 and the N810. But, besides somewhat better performance, none of the core issues were addressed. None of them were phones. None of them offered easy installation of software. None were widely released. In fact, it was so confidential that you could only buy them through the Nokia website in a few specific countries. The devices were in neither traditional shops nor catalogues. When I asked my employer to get an N810, the purchasing department was unable to find a reference: it didn’t exist for them. Tired of multiple days of discussion with the purchasing administration, my boss gave me his own credit card, asked me to purchase it on the Nokia website and filed it as a "diverse material expense" to be reimbursed.

The thing was simply not available to businesses. It was as if Nokia wanted Maemo to fail at all costs.

While the N800 and N810 were released, a new device appeared on the market: the Apple iPhone.

I said that the problem with the N770 was that you had to carry a phone with it. Steve Jobs had come to the same conclusion with the iPod: people had to carry an iPod and a phone. So he added the phone to the iPod. It should be highlighted that the success of the iPhone took everyone by surprise, including Steve Jobs himself. The original iPhone was envisioned as an iPod and nothing else. There were no apps, no app store, no customisation (Steve Jobs was against it). It was nevertheless a hit because you could make calls, listen to music, and Apple spent a fortune in marketing to advertise it worldwide. The marketing frenzy was crazy. Multiple people who knew I was "good with computers" asked me if I could unlock the iPhone they had bought in the USA and which was not working in Europe (I could not). They had spent a fortune on a device that did not work. Those who had one showed it to everyone.

With the iPhone, you had music and a phone on a single device. In theory, you could also browse the web. Of course, there was no 3G, so browsing the web was mostly done over wifi, like on the N770. But, at the time, websites were designed with wide screens in mind and Flash was all the rage. The iPhone did not support Flash and its screen was vertical, which made web browsing a lot worse than on the N770. And, unlike on the N770, you could not install any applications.

The iPhone 1 was far from the revolution Apple wants us to believe it was. It was just very good marketing. In retrospect, the N770 could have been a huge success had Nokia done any marketing at all. They did none.

Another Linux on your mobile

In 2008, Google launched its first phone, which still had a physical keyboard. Instead of developing the software from scratch, Google used a Linux system initially developed as an embedded solution for cameras: Android. At the same time, Apple came to the realisation I had reached in 2005: installing software was a key feature. The App Store was born.

Phone, web browsing and custom applications, all on one device. Since 2005, people who had tried the N770 knew this was the answer. They simply did not expect it to come from Apple or Google.

When Android was first released, I thought it was what Maemo should have been. Because of the Linux kernel, I thought it would be a "free" operating system. I made a deep comparison with Maemo, diving into parts of the source code, and was surprised by some choices. Why Java? And why would Android avoid GStreamer in its multimedia stack? The technical explanations around those choices were not convincing. Years later, I would understand that these were not technical choices: besides the Linux kernel itself, Google explicitly avoided all GPL- and LGPL-licensed code. Android was only "free software" by accident. Gradually, the Android Open Source Project (AOSP) would be reduced to a mere skeleton while Android itself became more and more restricted and proprietary.

In reaction to the iPhone and to Android, Nokia launched the N900 at the end of 2009. At last, the N900 was a phone. It even included an app store called, for unknown marketing reasons, the "OVI store". The phone was good. The software was good, with the exception of the infamous OVI store (which was bad, had a bad name, a non-existent software offering and, worst of all, conflicted with deb packages).

The N900 would probably have taken the world by storm if it had been released three years earlier. It would have been a success and a huge competitor to the iPhone if released 18 months before. Was it too late? The world seemed to be settling into an Apple/Google duopoly. A duopoly that could have been at least slightly shaken by the N900 if Nokia had done some marketing. It should be noted that the N900 had a physical keyboard. But, at that point, nobody really cared.

When failing is not enough, dig deeper

At least, there was the Maemo platform. Four years of work. Something could be done with that. That’s why, in 2010, Nokia decided to… launch Meego, a new Linux platform which replaced the Debian infrastructure with RPMs and the GNOME infrastructure with Qt.

No, really.

Even if it was, theoretically, the continuation of Maemo (Maemo 6, codenamed Harmattan, was released as Meego 1), it felt like starting everything from scratch with a Fedora+KDE system. Instead of strong leadership, Meego was a medley of the Linux Foundation, Intel, AMD and Nokia. Design by committee, with red tape everywhere. From the outside, it looked like Nokia had outsourced its own management incompetence and administrative hubris. The N9 phone would be released in 2011, without a keyboard but with Meego.

History would repeat itself two years later when people working on Meego (without Nokia) would replace it with Tizen. Yet another committee.

From being three years ahead of the competition in 2005 thanks to Free Software, Nokia managed to become two years too late in 2010 thanks to incredibly bad management and choosing to hide its products instead of advertising them.

I have no inside knowledge of what Nokia was like at the time, but my experience in the industry allows me to easily imagine the hundreds of meetings that probably happened back then.

When business decisions look like very bad management from the outside, it is often because they are. Across the whole of Europe at the time, technical expertise was seen as the realm of those not gifted enough to become managers. As a young engineer, I thought that higher-level managers were pretentious and incompetent idiots. After climbing the ladder and becoming a manager myself years later, I got confirmation that I had even been underestimating the sheer stupidity of management. It was not just that most managers were idiots: they were also proud of their incompetence and, as this story would demonstrate, they sometimes needed to become deeply dishonest to succeed.

It looks like Nokia never really trusted its own Maemo initiative because no manager really understood what it was. To add insult to injury, the company bought Symbian OS in 2008, an operating system that was already dated and highly limited at that time. Nobody could figure out why they spent cash on it, or why Symbian suddenly became an internal competitor to Maemo (Symbian ran on much cheaper devices).

The emotional roller coaster

In 2006, I was certain that free software would take over the world. It was just a matter of time. Debian and GNOME would soon be on most desktops thanks to Ubuntu and on most mobile devices thanks to Maemo. There was no way for Microsoft to compete against such power. My wildest dreams were coming true.

Five years later, the outlook was way darker. Apple was taking the lead by being even more proprietary and closed than Microsoft. Google seemed like good guys but could we trust them? Even Ubuntu was drifting away from its own Debian and GNOME roots. The communities I loved so much were now fragmented.

Where would I go next?

(to be continued)

Subscribe by email or by rss to get the next episodes of "20 years of Linux on the Desktop".

I’m currently turning this story into a book. I’m looking for an agent or a publisher interested in working with me on this book and on an English translation of "Bikepunk", my new post-apocalyptic-cyclist typewritten novel which sold out in three weeks in France and Belgium.

I’m Ploum, a writer and an engineer. I like to explore how technology impacts society. You can subscribe by email or by rss. I value privacy and never share your address.

I write science-fiction novels in French. For Bikepunk, my new post-apocalyptic-cyclist book, my publisher is looking for contacts in other countries to distribute it in languages other than French. If you can help, contact me!

March 06, 2025

Multicloud is a cloud adoption strategy that utilizes services from multiple cloud providers rather than relying on just one. This approach enables organizations to take advantage of the best services for specific tasks, enhances resilience, and helps reduce costs. Additionally, a multicloud strategy offers the flexibility necessary to meet regulatory requirements and increases options for […]

March 05, 2025

A while back, I built a solar-powered, self-hosted website. Running a website entirely on renewable energy felt like a win, until my Raspberry Pi Zero 2 W started ghosting me.

A solar panel on a rooftop during sunset with a city skyline in the background. My solar panel and Raspberry Pi Zero 2 are set up on our rooftop deck for testing.

Every few weeks, it just disappears from the network: no ping, no SSH. Completely unreachable, yet the power LED stays on. The SD card has plenty of free space.

The solar panel and battery aren't the problem either. In fact, my solar dashboard shows they've been running for 215 days straight. Not a glitch.

Every time my Raspberry Pi goes offline, I have to go through the same frustrating ritual: get on the roof, open the waterproof enclosure, disconnect the Pi, pull the SD card, go to my office, reformat it, reinstall the OS and reconfigure everything. Then climb back up and put everything back together.

A Raspberry Pi 4 with an RS485 CAN HAT in a waterproof enclosure, surrounded by cables, screws and components. A Raspberry Pi 4 with an attached RS485 CAN HAT module is being installed in a waterproof enclosure.

A month ago, I was back on the roof deck, battling Boston winter. My fingers were numb, struggling with tiny screws and connectors. This had to stop.

The Raspberry Pi Zero 2 W is a great device for IoT projects, but only if it can run unattended for years.

Watchdogs: a safety net for when things go wrong

Enter watchdogs: tools that detect failures and trigger automatic reboots. There are two types:

  1. Hardware watchdog – Recovers from system-wide freezes like kernel panics or hardware lockups, by forcing a low-level reset.
  2. Software watchdog – Detects and fixes service-level failures, such as lost network connectivity, high CPU load or excessive RAM usage.

Running both ensures the Raspberry Pi can recover from minor issues (like a dropped connection) and system crashes (where everything becomes unresponsive).

Hardware watchdog

The hardware watchdog is a timer built into the Raspberry Pi's Broadcom chip. The operating system must reset or pet the timer regularly. If it fails to do so within a set interval, the watchdog assumes the system has frozen and forces a reboot.

Since support for the hardware watchdog is built into the Raspberry Pi's Linux kernel, it simply needs to be enabled.
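
Before touching any configuration, you can confirm that the kernel exposes the watchdog device node:

$ ls -l /dev/watchdog*

If /dev/watchdog (and, on recent kernels, /dev/watchdog0) shows up, systemd can use it.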

Edit /etc/systemd/system.conf and add:

RuntimeWatchdogSec=10s
ShutdownWatchdogSec=10min

  • RuntimeWatchdogSec – Defines how often the watchdog must be reset. On the Raspberry Pi, this must be less than 15–20 seconds due to hardware constraints.
  • ShutdownWatchdogSec – Keeps the watchdog active during shutdown to detect hangs.

Restart systemd to activate the watchdog:

 $ sudo systemctl daemon-reexec

Once restarted, systemd starts petting the hardware watchdog timer. If it ever fails, the Raspberry Pi will reboot.

To ensure full recovery, set all critical services to restart automatically. For example, my web server starts by itself, bringing my solar-powered website back online without any manual work.
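
What "restart automatically" looks like depends on the service, but as a minimal sketch, assuming a hypothetical web server unit called nginx.service, a systemd drop-in override is enough (the values below are illustrative, not my actual configuration):

$ sudo systemctl edit nginx.service

# Add this to the drop-in file that opens:
[Service]
# Restart the service if it ever exits unexpectedly
Restart=on-failure
RestartSec=10s

Also make sure the unit is enabled (sudo systemctl enable nginx.service) so it comes back after every watchdog-triggered reboot.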

Software watchdog

The hardware watchdog catches complete system freezes, while the software watchdog monitors network connectivity, CPU load and other metrics.

To install the software watchdog:

$ sudo apt update
$ sudo apt install watchdog

Next, edit /etc/watchdog.conf and add the following settings:

# Network monitoring
ping = 8.8.8.8
ping = 1.1.1.1
ping-count = 5

# Interface monitoring
interface = wlan0

# Basic settings
watchdog-device = none
retry-timeout = 180
realtime = yes
interval = 20

What this does:

  • ping = 8.8.8.8 / ping = 1.1.1.1 – Checks that the Pi can reach Google (8.8.8.8) and Cloudflare (1.1.1.1).
  • interface = wlan0 – Ensures the Wi-Fi interface is active.
  • retry-timeout = 180 – Reboots the Pi if these checks fail for 180 seconds.
  • interval = 20 – Performs checks every 20 seconds.
  • watchdog-device = none – Instead of using the hardware watchdog, the daemon monitors failures and triggers a software reboot through the operating system.

While I'm just monitoring the network, you can also configure the watchdog to check CPU usage, RAM or other system health metrics.
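
For reference, a hedged sketch of what such extra checks could look like in /etc/watchdog.conf (the thresholds are illustrative, not values from my setup):

# Reboot if the 1-minute load average climbs above 24
max-load-1 = 24

# Reboot if free memory drops below roughly 40 MB
# (min-memory is counted in memory pages, typically 4 KB each)
min-memory = 10000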

Once configured, enable and start the watchdog service:

$ sudo systemctl enable watchdog
$ sudo systemctl start watchdog

Enabling the watchdog makes sure it launches automatically on every boot, while starting it activates it immediately without requiring a restart.

Debugging watchdog reboots

When a watchdog triggers a reboot, system logs can help uncover what went wrong. To view all recent system boots, run:

$ journalctl --list-boots

This will display a list of boot sessions, each with an index (e.g. -1 for the previous boot, -2 for the one before that).

To see all shutdown events and their reason, run:

$ journalctl --no-pager | grep "shutting down the system"

If you want more details, you can check the logs leading up to a specific reboot. The following command displays the last 50 log entries immediately before the last system shutdown:

$ journalctl -b -1 -n 50 --no-pager

  • -b -1 – Retrieves logs from the previous boot.
  • -n 50 – Displays the last 50 log entries before that reboot.
  • --no-pager – Prevents logs from being paginated.

Progress, but the mystery remains

Since installing these watchdogs, my Raspberry Pi has remained accessible. It has not gone offline indefinitely. Fingers crossed it stays that way.

My logs show the software watchdog reboots the system regularly. It always reboots due to lost network connectivity.

While the watchdog is working as intended, the real mystery remains: why does the network keep failing and leave the Raspberry Pi in an unrecoverable state?

Still, this is real progress. I no longer have to climb onto the roof in freezing weather. The system recovers on its own, even when I'm away from home.

March 04, 2025

At the beginning of the year, we released MySQL 9.2, the latest Innovation Release. Sorry for the delay, but I was busy with the preFOSDEM MySQL Belgian Days and FOSDEM MySQL Belgium Days. Of course, we released bug fixes for 8.0 and 8.4 LTS, but in this post, I focus on the newest release. Within […]

February 27, 2025

The Engagement Rehab

I’ve written extensively, in French, about my quest to break my "connection addiction" by doing what I called "disconnections". At first, it was only three months without major news media and social networks. Then I tried to do one full year during which I would only connect once a day.

This proved to be too ambitious and failed around May when the amount of stuff that required me to be online (banking, travel booking, online meetings, …) became too high.

But I’m not giving up. I started 2025 by buying a new office chair and pledging to never be connected in that chair. I disabled wifi in the BIOS of my laptop. To be online, I now need to use my laptop at my standing desk, which has an RJ-45 cable.

This means I can be connected whenever I want, but I physically feel the connection because I have to stand up. There’s now a clear physical difference between "being online" and "being in my offline bubble".

This doesn’t mean that I’m as super productive as I dreamed I would be. Instead of working on my current book project, I do lots of work on Offpunk and draft blog posts like this one. Not great but, at least, I feel I’ve accomplished something at the end of the day.

Hush is addicted to YouTube and reflects on spending 28 days without it. Like me, they found themselves not that much more productive but, at the very least, not feeling like shit at the end of the day.

I’ve read that post because being truly disconnected forces me to read more of what is in my Offpunk: my RSS feeds, my to-read list and many gemlogs. This is basically how I start every day.

I’ve discovered that between 20 and 25% of what I read from online sources is from Gemini. It appears that I like "content" on Gemini. Historically, people were complaining that there was no content on Gemini, that most posts were about the protocol itself.

Then there was a frenzy of posts about why social media were bad. Those are now subtly being replaced by some kind of self-reflection about our own habits, our own addictions. Like this one, about addiction to analytics:

That’s when it struck me: we are all addicted to engagement. On both sides. We like being engaged. We like seeing engagement on our own content. Gemini is an engagement rehab!

While reading Gemini posts, I feel that I’m not alone being addicted to engagement, suffering from it and trying to find a solution.

And when people in the real world start, out of the blue, asking my opinion about Elon Musk’s latest declaration, it reminds me that the engagement addiction is not an individual problem but a societal one.

Anyway, welcome to Gemini, welcome to rehab! I’m Ploum and I’m addicted to engagement.

I’m Ploum, a writer and an engineer. I like to explore how technology impacts society. You can subscribe by email or by rss. I value privacy and never share your address.

I write science-fiction novels in French. For Bikepunk, my new post-apocalyptic-cyclist book, my publisher is looking for contacts in other countries to distribute it in languages other than French. If you can help, contact me!

February 24, 2025

I did it. I just finished generating alt-text for 9,000 images on my website.

What began as a simple task evolved into a four-part series where I compared different LLMs, evaluated local versus cloud processing, and built an automated workflow.

But this final step was different. It wasn't about technology. It was about trust and letting things go.

My AI tool in action

In my last blog post, I shared scripts to automate alt-text generation for a single image. The final step? Running my scripts on the 9,000 images missing alt-text. This covers over 20 years of images in photo albums and blog posts.

Here is my tool in action:

A terminal displays AI generating image descriptions, showing suggested title and alt-text for each photo that scrolls by.

And yes, AI generated the alt-text for this GIF. AI describing AI, a recursion that should have ripped open the space-time continuum. Sadly, no portals appeared. At best, it might have triggered a stack overflow in a distant dimension. Meanwhile, I just did the evening dishes.

ChatGPT-4o processed all 9,000 images at half a cent each, for less than $50 in total. And despite hammering their service for a couple days, I never hit a rate limit or error. Very impressive.

AI is better than me

Trusting a script to label 9,000 images made me nervous. What if mistakes in auto-generated descriptions made my website less accessible? What if future AI models trained on any mistakes?

I started cautiously, stopping after each album to check every alt-text. After reviewing 250 images, I noticed something: I wasn't fixing errors, I was just tweaking words.

Then came the real surprise. I tested my script on albums I had manually described five years ago. The result was humbling. AI wrote better alt-text: spotting details I missed, describing scenes more clearly, and capturing nuances I overlooked. Turns out, past me wasn't so great at writing alt-text.

Not just that. The LLM understood Japanese restaurant menus, decoded Hungarian text, interpreted German Drupal books, and read Dutch street signs. It recognized conference badges and correctly labeled events. It understood cultural contexts across countries. It picked up details about my photos that I had forgotten or didn't even know existed.

I was starting to understand this wasn't about AI's ability to describe images; it was about me accepting that AI often described them better than I could.

Conclusion

AI isn't perfect, but it can be very useful. People worry about hallucinations and inaccuracy, and I did too. But after generating alt-text for 9,000 images, I saw something different: real, practical value.

It didn't just make my site more accessible; it challenged me. It showed me that sometimes, the best way to improve is to step aside and let a tool do the job better.

February 20, 2025

Billions of images on the web lack proper alt-text, making them inaccessible to millions of users who rely on screen readers.

My own website is no exception, so a few weeks ago, I set out to add missing alt-text to about 9,000 images on this website.

What seemed like a simple fix became a multi-step challenge. I needed to evaluate different AI models and decide between local or cloud processing.

To make the web better, a lot of websites need to add alt-text to their images. So I decided to document my progress here on my blog so others can learn from it – or offer suggestions. This third post dives into the technical details of how I built an automated pipeline to generate alt-text at scale.

High-level architecture overview

My automation process follows three steps for each image:

  1. Check if alt-text exists for a given image
  2. Generate new alt-text using AI when missing
  3. Update the database record for the image with the new alt-text

The rest of this post goes into more detail on each of these steps. If you're interested in the implementation, you can find most of the source code on GitHub.

Retrieving image metadata

To systematically process 9,000 images, I needed a structured way to identify which ones were missing alt-text.

Since my site runs on Drupal, I built two REST API endpoints to interact with the image metadata:

  • GET /album/{album-name}/{image-name}/get – Retrieves metadata for an image, including title, alt-text, and caption.
  • PATCH /album/{album-name}/{image-name}/patch – Updates specific fields, such as adding or modifying alt-text.

I've built similar APIs before, including one for my basement's temperature and humidity monitor. That post provides a more detailed breakdown of how I build endpoints like this.

This API uses separate URL paths (/get and /patch) for different operations, rather than using a single resource URL. I'd prefer to follow RESTful principles, but this approach avoids caching problems, including content negotiation issues in CDNs.

Anyway, with the new endpoints in place, fetching metadata for an image is simple:

curl -H "Authorization: test-token" \
  "https://dri.es/album/isle-of-skye-2024/journey-to-skye/get"

Every request requires an authorization token. And no, test-token isn't the real one. Without it, anyone could edit my images. While crowdsourced alt-text might be an interesting experiment, it's not one I'm looking to run today.

This request returns a JSON object with image metadata:

{
  "title": "Journey to Skye",
  "alt": "",
  "caption": "Each year, Klaas and I pick a new destination for our outdoor adventure. In 2024, we set off for the Isle of Skye in Scotland. This stop was near Glencoe, about halfway between Glasgow and Skye."
}

Because the alt field is empty, the next step is to generate a description using AI.

Generating and refining alt-text with AI

A person stands by a small lake surrounded by grassy hills and mountains under a cloudy sky in the Scottish Highlands.

In my first post on AI-generated alt-text, I wrote a Python script to compare 10 different local Large Language Models (LLMs). The script uses PyTorch, a widely used machine learning framework for AI research and deep learning. This implementation was a great learning experience.

The original script takes an image as input and generates alt-text using multiple LLMs:

./caption.py journey-to-skye.jpg
{
  "image": "journey-to-skye.jpg",
  "captions": {
    "vit-gpt2": "A man standing on top of a lush green field next to a body of water with a bird perched on top of it.",
    "git": "A man stands in a field next to a body of water with mountains in the background and a mountain in the background.",
    "blip": "This is an image of a person standing in the middle of a field next to a body of water with a mountain in the background.",
    "blip2-opt": "A man standing in the middle of a field with mountains in the background.",
    "blip2-flan": "A man is standing in the middle of a field with a river and mountains behind him on a cloudy day.",
    "minicpm-v": "A person standing alone amidst nature, with mountains and cloudy skies as backdrop.",
    "llava-13b": "A person standing alone in a misty, overgrown field with heather and trees, possibly during autumn or early spring due to the presence of red berries on the trees and the foggy atmosphere.",
    "llava-34b": "A person standing alone on a grassy hillside with a body of water and mountains in the background, under a cloudy sky.",
    "llama32-vision-11b": "A person standing in a field with mountains and water in the background, surrounded by overgrown grass and trees."
  }
}

My original plan was to run everything locally for full control, no subscription costs, and optimal privacy. But after testing 10 local LLMs, I changed my mind.

I knew cloud-based models would be better, but wanted to see if local models were good enough for alt-texts. Turns out, they're not quite there. You can read the full comparison, but I gave the best local models a B, while cloud models earned an A.

While local processing aligned with my principles, it compromised the primary goal: creating the best possible descriptions for screen reader users. So I abandoned my local-only approach and decided to use cloud-based LLMs.

To automate alt-text generation for 9,000 images, I needed programmatic access to cloud models rather than relying on their browser-based interfaces — though browser-based AI can be tons of fun.

Instead of expanding my script with cloud LLM support, I switched to Simon Willison's llm tool: https://llm.datasette.io/. llm is a command-line tool and Python library that supports both local and cloud-based models. It takes care of installation, dependencies, API key management, and uploading images. Basically, all the things I didn't want to spend time maintaining myself.

Despite enjoying my PyTorch explorations with vision language models and multimodal encoders, I needed to focus on results. My weekly progress goal meant prioritizing working alt-text over building homegrown inference pipelines.

I also considered you, my readers. If this project inspires you to make your own website more accessible, you're better off with a script built on a well-maintained tool like llm rather than trying to adapt my custom implementation.

Scrapping my PyTorch implementation stung at first, but building on a more mature and active open-source project was far better for me and for you. So I rewrote my script, now in the v2 branch, with the original PyTorch version preserved in v1.

The new version of my script keeps the same simple interface but now supports cloud models like ChatGPT and Claude:

./caption.py journey-to-skye.jpg --model chatgpt-4o-latest claude-3-sonnet --context "Location: Glencoe, Scotland"
{
  "image": "journey-to-skye.jpg",
  "captions": {
    "chatgpt-4o-latest": "A person in a red jacket stands near a small body of water, looking at distant mountains in Glencoe, Scotland.",
    "claude-3-sonnet": "A person stands by a small lake surrounded by grassy hills and mountains under a cloudy sky in the Scottish Highlands."
  }
}

The --context parameter improves alt-text quality by adding details the LLM can't determine from the image alone. This might include GPS coordinates, album titles, or even a blog post about the trip.

In this example, I added "Location: Glencoe, Scotland". Notice how ChatGPT-4o mentions Glencoe directly while Claude-3 Sonnet references the Scottish Highlands. This contextual information makes descriptions more accurate and valuable for users. For maximum accuracy, use all available information!

Updating image metadata

With alt-text generated, the final step is updating each image. The PATCH endpoint accepts only the fields that need changing, preserving other metadata:

curl -X PATCH \
  -H "Authorization: test-token" \
  "https://dri.es/album/isle-of-skye-2024/journey-to-skye/patch" \
  -d '{
    "alt": "A person stands by a small lake surrounded by grassy hills and mountains under a cloudy sky in the Scottish Highlands.",
  }'

That's it. This completes the automation loop for one image. It checks if alt-text is needed, creates a description using a cloud-based LLM, and updates the image if necessary. Now, I just need to do this about 9,000 times.
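
To give an idea of what the batch run can look like, here is a simplified sketch of the loop, not the actual script from GitHub: it assumes a file images.txt with one "album image" pair per line, local copies of the image files, jq for JSON parsing, and it ignores quoting edge cases in the generated text:

#!/bin/sh
# Simplified batch loop: skip images that already have alt-text,
# otherwise generate a description and PATCH it back.
while read -r album image; do
  alt=$(curl -s -H "Authorization: test-token" \
    "https://dri.es/album/$album/$image/get" | jq -r '.alt')
  if [ -z "$alt" ] || [ "$alt" = "null" ]; then
    # caption.py needs the local image file; mapping the image name
    # to a local path is left out of this sketch.
    new_alt=$(./caption.py "$image.jpg" --model chatgpt-4o-latest \
      | jq -r '.captions["chatgpt-4o-latest"]')
    curl -s -X PATCH -H "Authorization: test-token" \
      "https://dri.es/album/$album/$image/patch" \
      -d "{\"alt\": \"$new_alt\"}"
  fi
done < images.txt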

Tracking AI-generated alt-text

Before running the script on all 9,000 images, I added a label to the database that marks each alt-text as either human-written or AI-generated. This makes it easy to:

  • Re-run AI-generated descriptions without overwriting human-written ones
  • Upgrade AI-generated alt-text as better models become available

With this approach, I can update the AI-generated alt-text when ChatGPT 5 is released. And eventually, it might allow me to return to my original principles: to use a high-quality local LLM trained on public domain data. In the meantime, it helps me make the web more accessible today while building toward a better long-term solution tomorrow.
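
Purely as an illustration (the exact field name in my database is a placeholder here), the PATCH request can carry that label alongside the generated text:

# "alt_generated" is a hypothetical field name used for illustration
curl -X PATCH \
  -H "Authorization: test-token" \
  "https://dri.es/album/isle-of-skye-2024/journey-to-skye/patch" \
  -d '{
    "alt": "A person stands by a small lake surrounded by grassy hills and mountains under a cloudy sky in the Scottish Highlands.",
    "alt_generated": true
  }'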

Next steps

Now that the process is automated for a single image, the last step is to run the script on all 9,000. And honestly, it makes me nervous. The perfectionist in me wants to review every single AI-generated alt-text, but that is just not feasible. So, I have to trust AI. I'll probably write one more post to share the results and what I learned from this final step.

Stay tuned.

February 11, 2025

Last week, I wrote about my plan to use AI to generate 9,000 alt-texts for images on my website. I tested 12 LLMs – 10 running locally and 2 cloud-based – to assess their accuracy in generating alt-text for images. I ended that post with two key questions:

  1. Should I use AI-generated alt-texts, even if they are not perfect?
  2. Should I generate these alt-texts with local LLMs or in the cloud?

Since then, I've received dozens of emails and LinkedIn comments. The responses were all over the place. Some swore by local models because they align with open-source values. Others championed cloud-based LLMs for better accuracy. A couple of people even ran tests using different models to help me out.

I appreciate every response. It's a great reminder of why building in the open is so valuable: it brings in diverse perspectives.

But one comment stood out. A visually impaired reader put it simply: Imperfect alt-text is better than no alt-text.

That comment made the first decision easy: AI-generated alt-text, even if not perfect, is better than nothing.

The harder question was which AI models to use. As a long-term open-source evangelist, I really want to run my own LLMs. Local AI aligns with my values: no privacy concerns, no API quotas, more transparency, and more control. They also align with my wallet: no subscription fees. And, let's be honest: running your own LLMs earns you some bragging rights at family parties.

But here is the problem: local models aren't as good as cloud models.

Most laptops and consumer desktops have 16–32GB of RAM, which limits them to small, lower-accuracy models. Even maxing out an Apple Mac Studio with 192GB of RAM doesn't change that. Gaming GPUs? Also a dead end, at least for me. Even high-end cards with 24GB of VRAM struggle with the larger models unless you stack multiple cards together.

The gap between local and cloud hardware is big. It's like racing a bicycle against a jet engine.

I could wait. Apple will likely release a new Mac Studio this year, and I'm hoping it supports more than 192GB of RAM. NVIDIA's Digits project could make consumer-grade LLM hardware even more viable.

Local models are also improving fast. Just in the past few weeks:

  • Alibaba released Qwen 2.5 VL, which performs well in benchmarks.
  • DeepSeek launched DeepSeek-VL2, a strong new open model.
  • Mark Zuckerberg shared that Meta's Llama 4 is in testing and might be released in the next few months.

Consumer hardware and local models will continue to improve. But even when they do, cloud models will still be ahead. So, I am left with this choice:

  1. Prioritize accessibility: use the best AI models available today, even if they're cloud-based.
  2. Stick to Open Source ideals: run everything locally, but accept worse accuracy.

A reader, Kris, put it well: Prioritize users while investing in your values. That stuck with me.

I'd love to run everything locally, but making my content accessible and ensuring its accuracy matters more. So, for now, I'm moving forward with cloud-based models, even if it means compromising on my open-source ideals.

It's not the perfect answer, but it's the practical one. Prioritizing accessibility and end-user needs over my own principles feels like the right choice.

That doesn't mean I'm giving up on local LLMs. I'll keep testing models, tracking improvements, and looking for the right hardware upgrades. The moment local AI is good enough for generating alt-text, I'll switch. In my next post, I'll share my technical approach to making this work.

January 31, 2025

Treasure hunters, we have an update! Unfortunately, some of our signs have been removed or stolen, but don’t worry—the hunt is still on! To ensure everyone can continue, we will be posting all signs online so you can still access the riddles and keep progressing. However, there is one exception: the 4th riddle must still be heard in person at Building H, as it includes an important radio message. Keep your eyes on our updates, stay determined, and don’t let a few missing signs stop you from cracking the code! Good luck, and see you at Infodesk K with…

January 29, 2025

Are you ready for a challenge? We’re hosting a treasure hunt at FOSDEM, where participants must solve six sequential riddles to uncover the final answer. Teamwork is allowed and encouraged, so gather your friends and put your problem-solving skills to the test! The six riddles are set up across different locations on campus. Your task is to find the correct locations, solve the riddles, and progress to the next step. No additional instructions will be given after this announcement, it’s up to you to navigate and decipher the clues! To keep things fair, no hints or tips will be given…

January 27, 2025

Core to the Digital Operational Resilience Act is the notion of a critical or important function. When a function is deemed critical or important, DORA expects the company or group to take precautions and measures to ensure the resilience of the company and the markets in which it is active.

But what exactly is a function? When do we consider it critical or important? Is there a differentiation between critical and important? Can an IT function be a critical or important function?

Defining functions

Let's start with the definition of a function. Surely that is defined in the documents, right? Right?

Eh... no. The DORA regulation does not seem to provide a definition for a function. It does, however, refer to the definition of critical function in the Bank Recovery and Resolution Directive (BRRD), aka Directive 2014/59/EU. That's one of the regulations that focus on resolution in case of severe disruptions, bankruptcy or other failures of banks at a national or European level. The Delegated regulation EU 2016/778 further provides several definitions that inspired the DORA regulation as well.

In the latter document, we do find the definition of a function:

‘function’ means a structured set of activities, services or operations that are delivered by the institution or group to third parties irrespective from the internal organisation of the institution;

Article 2, (2), of Delegated regulation 2016/778

So if you want to be blunt, you could state that an IT function which is only supporting the own group (as in, you're not insourcing IT of other companies) is not a function, and thus cannot be a "critical or important function" in DORA's viewpoint.

That is, unless you find that the definitions in previous regulations do not necessarily imply the same interpretation within DORA. After all, DORA does not amend the EU 2016/778 regulation. It amends EC 1060/2009, EU 2012/648, EU 2014/600 aka MiFIR, EU 2014/909 aka CSDR and EU 2016/1011 aka the Benchmark Regulation. But none of these have a definition for 'function' at first sight.

So let's humor ourselves and move on. What is a critical function? Is that defined in DORA? Not really; sort of. DORA has a definition for "critical or important function", but let's first look at the more distinct definitions.

In the BRRD regulation, this is defined as follows:

‘critical functions’ means activities, services or operations the discontinuance of which is likely in one or more Member States, to lead to the disruption of services that are essential to the real economy or to disrupt financial stability due to the size, market share, external and internal interconnectedness, complexity or cross-border activities of an institution or group, with particular regard to the substitutability of those activities, services or operations;

Article 2, (35), of BRRD 2014/59

This extends on the use of function, and adds in the evaluation if it is crucial for the economy, especially when it would be suddenly discontinued. The extension on the definition of function is also confirmed by guidance that the European Single Resolution Board published, namely that "the function is provided by an institution to third parties not affiliated to the institution or group".

The preamble of the Delegated regulation also mentions that its focus is at the safeguarding of the financial stability and the real economy. It gives examples of potential critical functions such as deposit taking, lending and loan services, payment, clearing, custody and settlement services, wholesale funding markets activities, and capital markets and investments activities.

Of course, your IT is supporting your company, and in case of financial institutions, IT is a very big part of the company. Is IT then not involved in all of this?

It sure is...

Defining services

The Delegated regulation EU 2016/778 in its preamble already indicates that functions are supported by services:

Critical services should be the underlying operations, activities and services performed for one (dedicated services) or more business units or legal entities (shared services) within the group which are needed to provide one or more critical functions. Critical services can be performed by one or more entities (such as a separate legal entity or an internal unit) within the group (internal service) or be outsourced to an external provider (external service). A service should be considered critical where its disruption can present a serious impediment to, or completely prevent, the performance of critical functions as they are intrinsically linked to the critical functions that an institution performs for third parties. Their identification follows the identification of a critical function.

Preamble, (8), Delegated regulation 2016/778

IT within an organization is certainly offering services to one or more of the business units within that financial institution. Once the company has defined its critical functions (or for DORA, "critical or important functions"), then the company will need to create a mapping of all assets and services that are needed to realize that function.

Out of that mapping, it is very well possible that several IT services will be considered critical services. I'm myself involved in the infrastructure side of things, which are often shared services. The delegated regulation already points to it, and a somewhat older guideline from the Financial Stability Board has the following to say about critical shared services:

a critical shared service has the following elements: (i) an activity, function or service is performed by either an internal unit, a separate legal entity within the group or an external provider; (ii) that activity, function or service is performed for one or more business units or legal entities of the group; (iii) the sudden and disorderly failure or malfunction would lead to the collapse of or present a serious impediment to the performance of, critical functions.

FSB guidance on identification of critical functions and critical shared services

For IT organizations, it is thus most important to focus on the services they offer.

Definition of critical or important function

Within DORA, the definition of critical or important function is as follows:

(22) ‘critical or important function’ means a function, the disruption of which would materially impair the financial performance of a financial entity, or the soundness or continuity of its services and activities, or the discontinued, defective or failed performance of that function would materially impair the continuing compliance of a financial entity with the conditions and obligations of its authorisation, or with its other obligations under applicable financial services law;

Article 3, (22), DORA

If we compare this definition with the previous ones about critical functions, we notice that it is extended with an evaluation of the impact on the company, rather than on the market. I think it is safe to say that this is the "or important" part of the "critical or important function": whereas a function is critical if its discontinuance has market impact, a function is important if its discontinuance causes material impairment to the company itself.

Hence, we can consider a critical or important function as one whose disruption has either market impact (critical) or company impact (important), while still being a function offered externally.

This broader definition means that DORA puts forward more expectations than previous regulation, which is one of the reasons DORA is so impactful for financial institutions.

Implications towards IT

From the above, I'd wager that IT itself is not a "critical or important function", but IT offers services which could be supporting critical or important functions. Hence, it is necessary that the company has a good mapping of the functions and their underlying services, operations and systems. From that mapping, we can then see if those underlying services are crucial for the function or not. If they are, then we should consider those as critical or important systems.

This mapping is mandated by DORA as well:

Financial entities shall identify all information assets and ICT assets, including those on remote sites, network resources and hardware equipment, and shall map those considered critical. They shall map the configuration of the information assets and ICT assets and the links and interdependencies between the different information assets and ICT assets.

Article 8, (4), DORA

as well as:

As part of the overall business continuity policy, financial entities shall conduct a business impact analysis (BIA) of their exposures to severe business disruptions. Under the BIA, financial entities shall assess the potential impact of severe business disruptions by means of quantitative and qualitative criteria, using internal and external data and scenario analysis, as appropriate. The BIA shall consider the criticality of identified and mapped business functions, support processes, third-party dependencies and information assets, and their interdependencies. Financial entities shall ensure that ICT assets and ICT services are designed and used in full alignment with the BIA, in particular with regard to adequately ensuring the redundancy of all critical components.

Article 11, paragraph 2, DORA

In more complex landscapes, it is very well possible that the mapping is a multi-layered view with different types of systems or services in between, which could make the effort to identify services as being critical or important quite challenging.

For instance, it could be that the IT organization has a service catalog, but that this catalog is defined too broadly to usefully carry the critical-or-important indication. Making a more fine-grained service catalog will be necessary to properly evaluate the dependencies, but that also implies that your business (which has defined its critical or important functions) will need to indicate which fine-grained services it depends on, rather than the high-level ones.

In later posts, I'll probably dive deeper into this layered view.

Feedback? Comments? Don't hesitate to get in touch on Mastodon.

January 26, 2025

The regular FOSDEM lightning talk track isn't chaotic enough, so this year we're introducing Lightning Lightning Talks (now with added lightning!). Update: we've had a lot of proposals, so submissions are now closed! Thought of a last minute topic you want to share? Got your interesting talk rejected? Has something exciting happened in the last few weeks you want to talk about? Get that talk submitted to Lightning Lightning Talks! This is an experimental session taking place on Sunday afternoon (13:00 in k1105), containing non-stop lightning fast 5 minute talks. Submitted talks will be automatically presented by our Lightning…

January 17, 2025

As in previous years, some small rooms will be available for Birds of a Feather sessions. The concept is simple: Any project or community can reserve a timeslot (30 minutes or 1 hour) during which they have the room just to themselves. These rooms are intended for ad-hoc discussions, meet-ups or brainstorming sessions. They are not a replacement for a developer room and they are certainly not intended for talks. Schedules: BOF Track A, BOF Track B, BOF Track C. To apply for a BOF session, enter your proposal at https://fosdem.org/submit. Select any of the BOF tracks and mention in…

January 16, 2025

With FOSDEM just a few days away, it is time for us to enlist your help. Every year, an enthusiastic band of volunteers make FOSDEM happen and make it a fun and safe place for all our attendees. We could not do this without you. This year we again need as many hands as possible, especially for heralding during the conference, during the buildup (starting Friday at noon) and teardown (Sunday evening). No need to worry about missing lunch at the weekend, food will be provided. Would you like to be part of the team that makes FOSDEM tick?…

If your non-geek partner and/or kids are joining you to FOSDEM, they may be interested in spending some time exploring Brussels while you attend the conference. Like previous years, FOSDEM is organising sightseeing tours. UPDATE: The tour is now fully booked.

January 15, 2025

We were made aware of planned protests during the upcoming FOSDEM 2025 in response to a scheduled talk which is causing controversy. The talk in question is claimed to be on the schedule for sponsorship reasons; additionally, some of the speakers scheduled to speak during this talk are controversial to some of our attendees. To be clear, in our 25 year history, we have always had the hard rule that sponsorship does not give you preferential treatment for talk selection; this policy has always applied, it applied in this particular case, and it will continue to apply in the future.舰

January 12, 2025

One of the topics that most financial institutions are (still) currently working on is their compliance with a European legislation called DORA. This abbreviation, which stands for "Digital Operational Resilience Act", is a European regulation. European regulations apply automatically and uniformly across all EU countries. This is unlike another recent piece of legislation called NIS2, the "Network and Information Security" directive. As an EU directive, NIS2 requires the EU countries to transpose the directive into local law. As a result, different EU countries can have a slightly different implementation.

The DORA regulation applies to the EU financial sector, and has some strict requirements in it that companies' IT stakeholders are affected by. It doesn't often sugar-coat things like some frameworks do. This has the advantage that its "interpretation flexibility" is quite reduced - but not zero of course. Yet, that advantage is also a disadvantage: financial entities might have had different strategies covering their resiliency, and now need to adjust their strategy.

January 09, 2025

The preFOSDEM MySQL Belgian Days 2025 will occur at the usual place (ICAB Incubator, Belgium, 1040 Bruxelles) on Thursday, January 30th, and Friday, January 31st, just before FOSDEM. Again this year, we will have the chance to have incredible sessions from our Community and the opportunity to meet some MySQL Engineers from Oracle. DimK will […]

January 07, 2025

FOSDEM Junior is a collaboration between FOSDEM, Code Club, CoderDojo, developers, and volunteers to organize workshops and activities for children during the FOSDEM weekend. These activities are for children to learn and get inspired about technology. This year’s activities include microcontrollers, embroidery, game development, music, and mobile application development. Last year we organized the first edition of FOSDEM Junior. We are pleased to announce that we will be back this year. Registration for individual workshops is required. Links can be found on the page of each activity. The full schedule can be viewed at the junior track schedule page. You…

January 01, 2025

2025 = (20 + 25)²

2025 = 45²

2025 = 1³+2³+3³+4³+5³+6³+7³+8³+9³

2025 = (1+2+3+4+5+6+7+8+9)²

2025 = 1+3+5+7+9+11+...+89

2025 = 9² x 5²

2025 = 40² + 20² + 5²

December 27, 2024

At work, I've been maintaining a perl script that needs to run a number of steps as part of a release workflow.

Initially, that script was very simple, but over time it has grown to do a number of things. And then some of those things did not need to be run all the time. And then we wanted to do this one exceptional thing for this one case. And so on; eventually the script became a big mess of configuration options and unreadable flow, and so I decided that I wanted it to be more configurable. I sat down and spent some time on this, and eventually came up with what I now realize is a domain-specific language (DSL) in JSON, implemented by creating objects in Moose, extensible by writing more object classes.

Let me explain how it works.

In order to explain, however, I need to explain some perl and Moose basics first. If you already know all that, you can safely skip ahead past the "Preliminaries" section that's next.

Preliminaries

Moose object creation, references.

In Moose, creating a class is done something like this:

package Foo;

use v5.40;
use Moose;

has 'attribute' => (
    is  => 'ro',
    isa => 'Str',
    required => 1
);

sub say_something {
    my $self = shift;
    say "Hello there, our attribute is " . $self->attribute;
}

The above is a class that has a single attribute called attribute. To create an object, you use the Moose constructor on the class, and pass it the attributes you want:

use v5.40;
use Foo;

my $foo = Foo->new(attribute => "foo");

$foo->say_something;

(output: Hello there, our attribute is foo)

This creates a new object with the attribute attribute set to foo. The attribute accessor is a method generated by Moose, which functions both as a getter and a setter (though in this particular case we made the attribute "ro", meaning read-only, so while it can be set at object creation time it cannot be changed by the setter anymore). So yay, an object.

And it has methods, things that we set ourselves. Basic OO, all that.

One of the peculiarities of perl is its concept of "lists". Not to be confused with the lists of python -- a concept that is called "arrays" in perl and is somewhat different -- in perl, lists are enumerations of values. They can be used as initializers for arrays or hashes, and they are used as arguments to subroutines. Lists cannot be nested; whenever a hash or array is passed in a list, the list is "flattened", that is, it becomes one big list.

This means that the below script is functionally equivalent to the above script that uses our "Foo" object:

use v5.40;
use Foo;

my %args;

$args{attribute} = "foo";

my $foo = Foo->new(%args);

$foo->say_something;

(output: Hello there, our attribute is foo)

This creates a hash %args wherein we set the attributes that we want to pass to our constructor. We set one attribute in %args, the one called attribute, and then use %args and rely on list flattening to create the object with the same attribute set (list flattening turns a hash into a list of key-value pairs).

Perl also has a concept of "references". These are scalar values that point to other values; the other value can be a hash, a list, or another scalar. There is syntax to create a non-scalar value at assignment time, called anonymous references, which is useful when one wants to remember non-scoped values. By default, references are not flattened, and this is what allows you to create multidimensional values in perl; however, it is possible to request list flattening by dereferencing the reference. The below example, again functionally equivalent to the previous two examples, demonstrates this:

use v5.40;
use Foo;

my $args = {};

$args->{attribute} = "foo";

my $foo = Foo->new(%$args);

$foo->say_something;

(output: Hello there, our attribute is foo)

This creates a scalar $args, which is a reference to an anonymous hash. Then, we set the key attribute of that anonymous hash to foo (note the use of the arrow operator here, which is used to indicate that we want to dereference a reference to a hash), and create the object using that reference, requesting hash dereferencing and flattening by using a double sigil, %$.

As a side note, objects in perl are references too, hence the fact that we have to use the dereferencing arrow to access the attributes and methods of Moose objects.

Moose attributes don't have to be strings or even simple scalars. They can also be references to hashes or arrays, or even other objects:

package Bar;

use v5.40;
use Moose;

extends 'Foo';

has 'hash_attribute' => (
    is => 'ro',
    isa => 'HashRef[Str]',
    predicate => 'has_hash_attribute',
);

has 'object_attribute' => (
    is => 'ro',
    isa => 'Foo',
    predicate => 'has_object_attribute',
);

sub say_something {
    my $self = shift;

    if($self->has_object_attribute) {
        $self->object_attribute->say_something;
    }

    $self->SUPER::say_something unless $self->has_hash_attribute;

    say "We have a hash attribute!"
}

This creates a subclass of Foo called Bar that has a hash attribute called hash_attribute, and an object attribute called object_attribute. Both of them are references; one to a hash, the other to an object. The hash ref is further limited in that it requires that each value in the hash must be a string (this is optional but can occasionally be useful), and the object ref in that it must refer to an object of the class Foo, or any of its subclasses.

The predicates used here are extra subroutines that Moose provides if you ask for them, and which allow you to see if an object's attribute has a value or not.
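
For instance, a quick check of those predicates could look like this (a minimal sketch, reusing the Foo and Bar classes defined above):

use v5.40;
use Foo;
use Bar;

# hash_attribute was not passed, so its predicate returns false
my $plain = Bar->new(attribute => "bar");
say $plain->has_hash_attribute ? "has a hash attribute" : "no hash attribute";

# object_attribute was passed, so its predicate returns true
my $with_object = Bar->new(
    attribute        => "bar",
    object_attribute => Foo->new(attribute => "foo"),
);
say $with_object->has_object_attribute ? "has an object attribute" : "no object attribute";

(output: no hash attribute, followed by has an object attribute)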

The example script would use an object like this:

use v5.40;
use Bar;

my $foo = Foo->new(attribute => "foo");

my $bar = Bar->new(object_attribute => $foo, attribute => "bar");

$bar->say_something;

(output:
Hello there, our attribute is foo
Hello there, our attribute is bar)

This example also shows object inheritance, and methods implemented in child classes.

Okay, that's it for perl and Moose basics. On to...

Moose Coercion

Moose has a concept of "value coercion". Value coercion allows you to tell Moose that if it sees one thing but expects another, it should convert it using a passed subroutine before assigning the value.

That sounds a bit dense without an example, so let me show you how it works. Reimagining the Bar package, we could use coercion to eliminate one object creation step from the creation of a Bar object:

package "Bar";

use v5.40;

use Moose;
use Moose::Util::TypeConstraints;

extends "Foo";

coerce "Foo",
    from "HashRef",
    via { Foo->new(%$_) };

has 'hash_attribute' => (
    is => 'ro',
    isa => 'HashRef',
    predicate => 'has_hash_attribute',
);

has 'object_attribute' => (
    is => 'ro',
    isa => 'Foo',
    coerce => 1,
    predicate => 'has_object_attribute',
);

sub say_something {
    my $self = shift;

    if($self->has_object_attribute) {
        $self->object_attribute->say_something;
    }

    $self->SUPER::say_something unless $self->has_hash_attribute;

    say "We have a hash attribute!"
}

Okay, let's unpack that a bit.

First, we add the Moose::Util::TypeConstraints module to our package. This is required to declare coercions.

Then, we declare a coercion to tell Moose how to convert a HashRef to a Foo object: by using the Foo constructor on a flattened list created from the hashref that it is given.

Then, we update the definition of the object_attribute to say that it should use coercions. This is not the default, because going through the list of coercions to find the right one has a performance penalty, so if the coercion is not requested then we do not do it.

This allows us to simplify declarations. With the updated Bar class, we can simplify our example script to this:

use v5.40;

use Bar;

my $bar = Bar->new(attribute => "bar", object_attribute => { attribute => "foo" });

$bar->say_something

(output:
Hello there, our attribute is foo
Hello there, our attribute is bar)

Here, the coercion kicks in because the value passed as object_attribute, which is supposed to be an object of class Foo, is instead a hash ref. Without the coercion, this would produce an error message saying that the type of the object_attribute attribute is not a Foo object. With the coercion, however, the value that we pass to object_attribute is passed to a Foo constructor using list flattening, and then the resulting Foo object is assigned to the object_attribute attribute.

Coercion works for more complicated things, too; for instance, you can use coercion to coerce an array of hashes into an array of objects, by creating a subtype first:

package MyCoercions;
use v5.40;

use Moose;
use Moose::Util::TypeConstraints;

use Foo;

subtype "ArrayOfFoo", as "ArrayRef[Foo]";
subtype "ArrayOfHashes", as "ArrayRef[HashRef]";

coerce "ArrayOfFoo", from "ArrayOfHashes", via { [ map { Foo->create(%$_) } @{$_} ] };

Ick. That's a bit more complex.

What happens here is that we use the map function to iterate over a list of values.

The given list of values is @{$_}, which is perl for "dereference the default value as an array reference, and flatten the list of values in that array reference".

So the ArrayRef of HashRefs is dereferenced and flattened, and each HashRef in the ArrayRef is passed to the map function.

The map function then takes each hash ref in turn and passes it to the block of code that it is also given. In this case, that block is { Foo->create(%$_) }. In other words, we invoke the create factory method with the flattened hashref as an argument. This returns an object of the correct implementation (assuming our hash ref has a type attribute set), and with all attributes of their object set to the correct value. That value is then returned from the block (this could be made more explicit with a return call, but that is optional, perl defaults a return value to the rvalue of the last expression in a block).

The map function then returns a list of all the created objects, which we capture in an anonymous array ref (the [] square brackets), i.e., an ArrayRef of Foo objects, satisfying the Moose type constraint ArrayRef[Foo].
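
As a standalone illustration of that map pattern, here is a minimal sketch that uses the plain Foo constructor rather than the create factory method (which we will only define in the next section):

use v5.40;
use Foo;

my $hashes = [ { attribute => "a" }, { attribute => "b" } ];

# dereference and flatten the array ref, turn each hash ref into a Foo object,
# and capture the results in a new anonymous array ref
my $objects = [ map { Foo->new(%$_) } @{$hashes} ];

$_->say_something for @{$objects};

(output: Hello there, our attribute is a, followed by Hello there, our attribute is b)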

Usually, I tend to put my coercions in a special-purpose package. Although it is not strictly required by Moose, I find that it is useful to do this, because Moose does not allow a coercion to be defined if a coercion for the same type had already been done in a different package. And while it is theoretically possible to make sure you only ever declare a coercion once in your entire codebase, I find that this is much easier to get right if all your coercions live in a single dedicated package.

Okay, now you understand Moose object coercion! On to...

Dynamic module loading

Perl allows loading modules at runtime. In the most simple case, you just use require inside a stringy eval:

my $module = "Foo";
eval "require $module";

This loads "Foo" at runtime. Obviously, the $module string could be a computed value, it does not have to be hardcoded.

There are some obvious downsides to doing things this way, mostly in the fact that a computed value can basically be anything and so without proper checks this can quickly become an arbitrary code vulnerability. As such, there are a number of distributions on CPAN to help you with the low-level stuff of figuring out what the possible modules are, and how to load them.
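
One such distribution, just as an illustration (it is not the one used in the script described here), is Module::Runtime, which can validate a module name before loading it. A minimal sketch:

use v5.40;
use Module::Runtime qw(is_module_name require_module);

my $module = "Foo";    # in practice this would be a computed value

# refuse anything that is not a syntactically valid perl module name
die "invalid module name: $module" unless is_module_name($module);

require_module($module);

my $obj = $module->new(attribute => "foo");
$obj->say_something;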

For the purposes of my script, I used Module::Pluggable. Its API is fairly simple and straightforward:

package Foo;

use v5.40;
use Moose;

use Module::Pluggable require => 1;

has 'attribute' => (
    is => 'ro',
    isa => 'Str',
);

has 'type' => (
    is => 'ro',
    isa => 'Str',
    required => 1,
);

sub handles_type {
    return 0;
}

sub create {
    my $class = shift;
    my %data = @_;

    foreach my $impl($class->plugins) {
        if($impl->can("handles_type") && $impl->handles_type($data{type})) {
            return $impl->new(%data);
        }
    }
    die "could not find a plugin for type " . $data{type};
}

sub say_something {
    my $self = shift;
    say "Hello there, I am a " . $self->type;
}

The new concept here is the plugins class method, which is added by Module::Pluggable, and which searches perl's library paths for all modules that are in our namespace. The namespace is configurable, but by default it is the name of our module. So, in the above example, if there were a package "Foo::Bar" that

  • has a subroutine handles_type, and
  • returns a truthy value from that subroutine when passed the value of the type key in the hash that is given to the create subroutine,

then the create subroutine creates a new Foo::Bar object with the passed key/value pairs used as attribute initializers.

Let's implement a Foo::Bar package:

package Foo::Bar;

use v5.40;
use Moose;

extends 'Foo';

has 'type' => (
    is => 'ro',
    isa => 'Str',
    required => 1,
);

has 'serves_drinks' => (
    is => 'ro',
    isa => 'Bool',
    default => 0,
);

sub handles_type {
    my $class = shift;
    my $type = shift;

    return $type eq "bar";
}

sub say_something {
    my $self = shift;
    $self->SUPER::say_something;
    say "I serve drinks!" if $self->serves_drinks;
}

We can now indirectly use the Foo::Bar package in our script:

use v5.40;
use Foo;

my $obj = Foo->create(type => "bar", serves_drinks => 1);

$obj->say_something;

output:

Hello there, I am a bar
I serve drinks!

Okay, now you understand all the bits and pieces that are needed to understand how I created the DSL engine. On to...

Putting it all together

We're actually quite close already. The create factory method in the last version of our Foo package allows us to decide at run time which module to instantiate an object of, and to load that module at run time. We can use coercion and list flattening to turn a reference to a hash into an object of the correct type.

We haven't looked yet at how to turn a JSON data structure into a hash, but that bit is actually ridiculously trivial:

use JSON::MaybeXS;

my $data = decode_json($json_string);

Tada, now $data is a reference to a deserialized version of the JSON string: if the JSON string contained an object, $data is a hashref; if the JSON string contained an array, $data is an arrayref, etc.
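
A minimal sketch of what that looks like in practice (the JSON literal here is just an example):

use v5.40;
use JSON::MaybeXS;

my $data = decode_json('{"description": "do stuff", "actions": [ { "type": "bar" } ]}');

say ref $data;                    # HASH
say ref $data->{actions};         # ARRAY
say $data->{description};         # do stuff
say $data->{actions}[0]{type};    # bar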

So, in other words, to create an extensible JSON-based DSL that is implemented by Moose objects, all we need to do is create a system that

  • takes hash refs to set arguments
  • has factory methods to create objects, which

    • uses Module::Pluggable to find the available object classes, and
    • uses the type attribute to figure out which object class to use to create the object
  • uses coercion to convert hash refs into objects using these factory methods

In practice, we could have a JSON file with the following structure:

{
    "description": "do stuff",
    "actions": [
        {
            "type": "bar",
            "serves_drinks": true,
        },
        {
            "type": "bar",
            "serves_drinks": false,
        }
    ]
}

... and then we could have a Moose object definition like this:

package MyDSL;

use v5.40;
use Moose;

use MyCoercions;

has "description" => (
    is => 'ro',
    isa => 'Str',
);

has 'actions' => (
    is => 'ro',
    isa => 'ArrayOfFoo',
    coerce => 1,
    required => 1,
);

sub say_something {
    my $self = shift;

    say "Hello there, I am described as " . $self->description . " and I am performing my actions:";

    foreach my $action(@{$self->actions}) {
        $action->say_something;
    }
}

Now, we can write a script that loads this JSON file and creates a new object using the flattened arguments:

use v5.40;
use MyDSL;
use JSON::MaybeXS;

my $input_file_name = shift;

my $args = do {
    local $/ = undef;

    open my $input_fh, "<", $input_file_name or die "could not open file";
    <$input_fh>;
};

$args = decode_json($args);

my $dsl = MyDSL->new(%$args);

$dsl->say_something;

Output:

Hello there, I am described as do stuff and I am performing my actions:
Hello there, I am a bar
I serve drinks!
Hello there, I am a bar

In some more detail, this will:

  • Read the JSON file and deserialize it;
  • Pass the object keys in the JSON file as arguments to the constructor of the MyDSL class;
  • The MyDSL class then uses those arguments to set its attributes, using Moose coercion to convert the "actions" array of hashes into an array of Foo::Bar objects;
  • Call the say_something method on the MyDSL object.

Once this is written, extending the scheme to also support a "quux" type simply requires writing a Foo::Quux class, making sure it has a method handles_type that returns a truthy value when called with quux as the argument, and installing it into the perl library path. This is rather easy to do.
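
A minimal sketch of what such a Foo::Quux class could look like, mirroring the Foo::Bar class above (the serves_food attribute is just an invented example):

package Foo::Quux;

use v5.40;
use Moose;

extends 'Foo';

has 'serves_food' => (
    is => 'ro',
    isa => 'Bool',
    default => 0,
);

sub handles_type {
    my $class = shift;
    my $type = shift;

    return $type eq "quux";
}

sub say_something {
    my $self = shift;
    $self->SUPER::say_something;
    say "I serve food!" if $self->serves_food;
}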

It can even be extended deeper, too; if the quux type requires a list of arguments rather than just a single argument, it could itself also have an array attribute with relevant coercions. These coercions could then be used to convert the list of arguments into an array of objects of the correct type, using the same schema as above.

The actual DSL is of course somewhat more complex, and also actually does something useful, in contrast to the DSL that we define here which just says things.

Creating an object that actually performs some action when required is left as an exercise to the reader.

December 20, 2024

Let’s stay a bit longer with MySQL 3.2x to advance the MySQL Retrospective in anticipation of the 30th Anniversary. The idea of this article was suggested to me by Daniël van Eeden. Did you know that in the early days, and therefore still in MySQL 3.20, MySQL used the ISAM storage format? IBM introduced the […]

December 18, 2024

To further advance the MySQL Retrospective in anticipation of the 30th Anniversary, today let’s discuss the very first version of MySQL that became available to a wide audience through the popular InfoMagic distribution: MySQL 3.20! In 1997, InfoMagic incorporated MySQL 3.20 as part of the RedHat Contrib CD-ROM (MySQL 3.20.25). Additionally, version 3.20.13-beta was also […]

December 08, 2024

It’s been 2 years since AOPro was launched and a lot has happened in that time: bugs were squashed, improvements were made and some great features were added. Taking that into account on one hand and increasing costs from suppliers on the other, prices will see a smallish increase from 2025 (exact amounts still to be determined). But rest assured: if you already signed up, you will continue to…

Source

November 26, 2024

The deadline for talk submissions is rapidly approaching! If you are interested in talking at FOSDEM this year (yes, I'm talking to you!), it's time to polish off and submit those proposals in the next few days before the 1st:

  • Devrooms: follow the instructions in each cfp listed here
  • Main tracks: for topics which are more general or don't fit in a devroom, select 'Main' as the track here
  • Lightning talks: for short talks (15 minutes) on a wide range of topics, select 'Lightning Talks' as the track here

For more details, refer to the previous post.

November 25, 2024

Last month we released MySQL 9.1, the latest Innovation Release. Of course, we released bug fixes for 8.0 and 8.4 LTS but in this post, I focus on the newest release. Within these releases, we included patches and code received by our amazing Community. Here is the list of contributions we processed and included in […]

November 15, 2024

With great pleasure we can announce that the following projects will have a stand at FOSDEM 2025 (1st & 2nd February). This is the list of stands (in alphabetic order): 0 A.D. Empires Ascendant AlekSIS and Teckids AlmaLinux OS CalyxOS Ceph Chamilo CISO Assistant Cloud Native Computing Foundation (CNCF) Codeberg and Forgejo coreboot / flashprog / EDKII / OpenBMC Debian DeepComputing's DC-ROMA RISC-V Mainboard with Framework Laptop 13 DevPod Digital Public Goods Dolibarr ERP CRM Drupal Eclipse Foundation Fedora Project FerretDB Firefly Zero FOSSASIA Free Software Foundation Europe FreeBSD Project FreeCAD and KiCAD Furi Labs Gentoo Linux & Flatcar…

November 01, 2024

Dear WordPress friends in the USA: I hope you vote and when you do, I hope you vote for respect. The world worriedly awaits your collective verdict, as do I. Peace! Watch this video on YouTube.

Source

October 29, 2024

As announced yesterday, the MySQL Devroom is back at FOSDEM! For people preparing for their travel to Belgium, we want to announce that the MySQL Belgian Days fringe event will be held on the Thursday and Friday before FOSDEM. This event will take place on January 30th and 31st, 2025, in Brussels at the usual […]

October 05, 2024

Cover Ember Knights

Proton is a compatibility layer that lets Windows games run on Linux. Running a Windows game is mostly just hitting the Play button within Steam. It’s so good that many games now run faster on Linux than on native Windows. That’s what makes the Steam Deck the best gaming handheld of the moment.

But a compatibility layer is still a layer, so you may encounter … incompatibilities. Ember Knights is a lovely game with fun co-op multiplayer support. It runs perfectly on the (Linux-based) Steam Deck, but on my Ubuntu laptop I encountered long loading times (startup was 5 minutes and loading between worlds was slow). But once the game was loaded it ran fine.

Debugging the game revealed that there were lots of EAGAIN errors while the game was trying to access the system clock. Changing the number of allowed open files fixed the problem for me.

Add this to the end of the following files:

  • in /etc/security/limits.conf:
* hard nofile 1048576
  • in /etc/systemd/system.conf and /etc/systemd/user.conf:
DefaultLimitNOFILE=1048576 

Reboot.

Cover In Game

“The Witcher 3: Wild Hunt” is considered to be one of the greatest video games of all time. I certainly agree with that sentiment.

At its core, The Witcher 3 is an action role-playing game with a third-person perspective in a huge open world. You develop your character while the story advances. At the same time you can freely roam and explore as much as you like. The main story is captivating and the world is filled with side quests and lots of interesting people. Fun for at least 200 hours, if you’re the exploring kind. If you’re not, the base game (without DLCs) will still take you 50 hours to finish.

While similar to other great games like Nintendo’s Zelda: Breath of the Wild and Sony’s Horizon Zero Dawn, the strength of the game is its deep lore, originating from the Witcher novel series written by the “Polish Tolkien” Andrzej Sapkowski. It’s not a game, but a universe (nowadays it even includes a Netflix TV series).

A must play.

Played on the Steam Deck without any issues (“Steam Deck Verified”)

September 10, 2024

In previous blog posts, we discussed setting up a GPG smartcard on GNU/Linux and FreeBSD.

In this blog post, we will configure Thunderbird to work with an external smartcard reader and our GPG-compatible smartcard.


Before Thunderbird 78, if you wanted to use OpenPGP email encryption, you had to use a third-party add-on such as https://enigmail.net/.

Thunderbird’s recent versions natively support OpenPGP. The Enigmail addon for Thunderbird has been discontinued. See: https://enigmail.net/index.php/en/home/news.

I didn’t find good documentation on how to set up Thunderbird with a GnuPG smartcard when I moved to a new coreboot laptop, so this was the reason I created this blog post series.

GnuPG configuration

We’ll not go into too much detail on how to set up GnuPG. This was already explained in the previous blog posts.

If you want to use a HSM with GnuPG you can use the gnupg-pkcs11-scd agent https://github.com/alonbl/gnupg-pkcs11-scd that translates the pkcs11 interface to GnuPG. A previous blog post describes how this can be configured with SmartCard-HSM.

We’ll go over some steps to make sure that GnuPG is set up correctly before we continue with the Thunderbird configuration. The pinentry command must be configured with graphical support so that we can type our pin code in the graphical user environment.

Import Public Key

Make sure that your public key - or the public keys of the receiver(s) - are imported.

[staf@snuffel ~]$ gpg --list-keys
[staf@snuffel ~]$ 
[staf@snuffel ~]$ gpg --import <snip>.asc
gpg: key XXXXXXXXXXXXXXXX: public key "XXXX XXXXXXXXXX <XXX@XXXXXX>" imported
gpg: Total number processed: 1
gpg:               imported: 1
[staf@snuffel ~]$ 
[staf@snuffel ~]$  gpg --list-keys
/home/staf/.gnupg/pubring.kbx
-----------------------------
pub   xxxxxxx YYYYY-MM-DD [SC]
      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
uid           [ xxxxxxx] xxxx xxxxxxxxxx <xxxx@xxxxxxxxxx.xx>
sub   xxxxxxx xxxx-xx-xx [A]
sub   xxxxxxx xxxx-xx-xx [E]

[staf@snuffel ~]$ 

Pinentry

Thunderbird will not ask for your smartcard’s pin code.

This must be done on your smartcard reader’s pin pad (if it has one) or through an external pinentry program.

The pinentry is configured in the gpg-agent.conf configuration file. As we’re using Thunderbird in a graphical environment, we’ll configure it to use a graphical version.

Installation

I’m testing KDE plasma 6 on FreeBSD, so I installed the Qt version of pinentry.

On GNU/Linux you can check the documentation of your favourite Linux distribution to install a graphical pinentry. If you use a graphical user environment, there is probably already a graphics-enabled pinentry installed.

[staf@snuffel ~]$ sudo pkg install -y pinentry-qt6
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
The following 1 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
        pinentry-qt6: 1.3.0

Number of packages to be installed: 1

76 KiB to be downloaded.
[1/1] Fetching pinentry-qt6-1.3.0.pkg: 100%   76 KiB  78.0kB/s    00:01    
Checking integrity... done (0 conflicting)
[1/1] Installing pinentry-qt6-1.3.0...
[1/1] Extracting pinentry-qt6-1.3.0: 100%
==> Running trigger: desktop-file-utils.ucl
Building cache database of MIME types
[staf@snuffel ~]$ 

Configuration

The gpg-agent is responsible for starting the pinentry program. Let’s reconfigure it to start the pinentry that we like to use.

[staf@snuffel ~]$ cd .gnupg/
[staf@snuffel ~/.gnupg]$ 
[staf@snuffel ~/.gnupg]$ vi gpg-agent.conf

The pinentry is configured in the pinentry-program directive. You’ll find the complete gpg-agent.conf that I’m using below.

debug-level expert
verbose
verbose
log-file /home/staf/logs/gpg-agent.log
pinentry-program /usr/local/bin/pinentry-qt

Reload the scdaemon and gpg-agent configuration.

staf@freebsd-gpg3:~/.gnupg $ gpgconf --reload scdaemon
staf@freebsd-gpg3:~/.gnupg $ gpgconf --reload gpg-agent
staf@freebsd-gpg3:~/.gnupg $ 

Test

To verify that gpg works correctly and that the pinentry program works in our graphical environment we sign a file.

Create a new file.

$ cd /tmp
[staf@snuffel /tmp]$ 
[staf@snuffel /tmp]$ echo "foobar" > foobar
[staf@snuffel /tmp]$ 

Try to sign it.

[staf@snuffel /tmp]$ gpg --sign foobar
[staf@snuffel /tmp]$ 

If everything works fine, the pinentry program will ask for the pincode to sign it.


Thunderbird

In this section we’ll (finally) configure Thunderbird to use GPG with a smartcard reader.

Allow external smartcard reader


Open the global settings, click on the "Hamburger" icon and select settings.

Or press [F10] to bring up the "Menu bar" in Thunderbird and select [Edit] and Settings.


In the settings window click on [Config Editor].

This will open the Advanced Preferences window.


In the Advanced Preferences window, search for "external_gnupg" and set mail.identity.allow_external_gnupg to true.


 

Setup End-To-End Encryption

The next step is to configure the GPG keypair that we’ll use for our user account.


Open the account settings by clicking on the "Hamburger" icon and selecting Account Settings, or press [F10] to open the menu bar and select Edit, Account Settings.

Select End-To-End Encryption, and in the OpenPGP section select [ Add Key ].


Select ( * ) Use your external key through GnuPG (e.g. from a smartcard)

And click on [Continue]

The next window will ask you for the Secret Key ID.


Execute gpg --list-keys to get your secret key id.

Copy/paste your key id and click on [ Save key ID ].

I found that it is sometimes required to restart Thunderbird to reload the configuration when a new key ID is added. So restart Thunderbird if it fails to find your key ID in the keyring.

Test


As a test we send an email to our own email address.

Open a new message window and enter your email address into the To: field.

Click on [OpenPGP] and Encrypt.


Thunderbird will show a warning message that it doesn't know the public key to set up the encryption.

Click on [Resolve].

In the next window, Thunderbird will ask to Discover Public Keys online or to import the Public Keys From File; we’ll import our public key from a file.

In the Import OpenPGP key File window, select your public key file and click on [ Open ].

Thunderbird will show a window with the key fingerprint. Select ( * ) Accepted.

Click on [ Import ] to import the public key.


With our public key imported, the "End-to-end encryption requires resolving key issues" warning should be gone.

Click on the [ Send ] button to send the email.


To encrypt the message, Thunderbird will start a gpg session that invokes the pinentry command; type in your pin code. gpg will encrypt the message and, if everything works fine, the email is sent.

 

Have fun!


September 09, 2024

The NBD protocol has grown a number of new features over the years. Unfortunately, some of those features are not (yet?) supported by the Linux kernel.

I suggested a few times over the years that the maintainer of the NBD driver in the kernel, Josef Bacik, take a look at these features, but he hasn't done so; presumably he has other priorities. As with anything in the open source world, if you want it done you must do it yourself.

I'd been considering, off and on, working on the kernel driver so that I could implement these new features, but I never really got anywhere.

A few months ago, however, Christoph Hellwig posted a patch set that reworked a number of block device drivers in the Linux kernel to a new type of API. Since the NBD mailing list is listed in the kernel's MAINTAINERS file, this patch series was crossposted to the NBD mailing list, too, and when I noticed that it explicitly disabled the "rotational" flag on the NBD device, I suggested to Christoph that perhaps "we" (meaning, "he") might want to vary the decision on whether a device is rotational depending on whether the NBD server signals, through the flag that exists for that very purpose, whether the device is rotational.

To which he replied "Can you send a patch".

That got me down the rabbit hole, and now, for the first time in the 20+ years of being a C programmer who uses Linux exclusively, I got a patch merged into the Linux kernel... twice.

So, what do these things do?

The first patch adds support for the ROTATIONAL flag. If the NBD server mentions that the device is rotational, it will be treated as such, and the elevator algorithm will be used to optimize accesses to the device. For the reference implementation, you can do this by adding a line "rotational = true" to the relevant section (relating to the export where you want it to be used) of the config file.

It's unlikely that this will be of much benefit in most cases (most nbd-server installations will be exporting a file on a filesystem, where the elevator algorithm is already applied server-side, so it doesn't matter whether the device has the rotational flag set), but it's there in case you wish to use it.

The second set of patches adds support for the WRITE_ZEROES command. Most devices these days allow you to tell them "please write N zeroes starting at this offset", which is a lot more efficient than sending over a buffer of N zeroes and asking the device to do DMA to copy buffers and so on, just for zeroes.

The NBD protocol has supported its own WRITE_ZEROES command for a while now, and hooking it up was reasonably simple in the end. The only problem is that it expects length values in bytes, whereas the kernel uses it in blocks. It took me a few tries to get that right -- and then I also fixed up handling of discard messages, which required the same conversion.

September 05, 2024

IT architects generally use architecture-specific languages or modeling techniques to document their thoughts and designs. ArchiMate, the framework I have the most experience with, is a specialized enterprise architecture modeling language. It is maintained by The Open Group, an organization known for its broad architecture framework titled TOGAF.

My stance, however, is that architects should not use the diagrams from their architecture modeling framework to convey their message to every stakeholder out there...

What is the definition of “Open Source”?

There’s been no shortage of contention on what “Open Source software” means. Two instances that stand out to me personally are ElasticSearch’s “Doubling down on Open” and Scott Chacon’s “public on GitHub”.

I’ve been active in Open Source for 20 years and could use a refresher on its origins and officialisms. The plan was simple: write a blog post about why the OSI (Open Source Initiative) and its OSD (Open Source Definition) are authoritative, and collect evidence in their support (confirmation that they invented the term, of widespread acceptance with little dissent, and of the OSD being a practical, well-functioning tool). That’s what I keep hearing; I just wanted to back it up. Since contention always seems to be around commercial re-distribution restrictions (which are forbidden by the OSD), I particularly wanted to confirm that there haven’t been all that many commercial vendors who’ve used, or wanted to use, the term “open source” to mean “you can view/modify/use the source, but you are limited in your ability to re-sell, or need to buy additional licenses for use in a business”.

However, the further I looked, the more I found evidence of the opposite of all of the above. I’ve spent a few weeks now digging and some of my long standing beliefs are shattered. I can’t believe some of the things I found out. Clearly I was too emotionally invested, but after a few weeks of thinking, I think I can put things in perspective. So this will become not one, but multiple posts.

The goal for the series is to look at the tensions in the community/industry (in particular those directed towards the OSD), and figure out how to resolve, or at least reduce, them.

Without further ado, let’s get into the beginnings of Open Source.

The “official” OSI story.

Let’s first get the official story out the way, the one you see repeated over and over on websites, on Wikipedia and probably in most computing history books.

Back in 1998, there was a small group of folks who felt that the verbiage at the time (Free Software) had become too politicized. (note: the Free Software Foundation was founded 13 years prior, in 1985, and informal use of “free software” had been around since the 1970s). They felt they needed a new word “to market the free software concept to people who wore ties”. (source) (somewhat ironic since today many of us like to say “Open Source is not a business model”)

Bruce Perens - an early Debian project leader and hacker on free software projects such as busybox - had authored the first Debian Free Software Guidelines in 1997 which was turned into the first Open Source Definition when he founded the OSI (Open Source Initiative) with Eric Raymond in 1998. As you continue reading, keep in mind that from the get-go, OSI’s mission was supporting the industry. Not the community of hobbyists.

Eric Raymond is of course known for his seminal 1999 essay on development models “The cathedral and the bazaar”, but he also worked on fetchmail among others.

According to Bruce Perens, there was some criticism at the time, but only about the term “Open” in general, and about “Open Source” only as used in a completely different industry.

At the time of its conception there was much criticism for the Open Source campaign, even among the Linux contingent who had already bought-in to the free software concept. Many pointed to the existing use of the term “Open Source” in the political intelligence industry. Others felt the term “Open” was already overused. Many simply preferred the established name Free Software. I contended that the overuse of “Open” could never be as bad as the dual meaning of “Free” in the English language–either liberty or price, with price being the most oft-used meaning in the commercial world of computers and software

From Open Sources: Voices from the Open Source Revolution: The Open Source Definition

Furthermore, from Bruce Perens’ own account:

I wrote an announcement of Open Source which was published on February 9 [1998], and that’s when the world first heard about Open Source.

source: On Usage of The Phrase “Open Source”

Occasionally it comes up that it may have been Christine Peterson who coined the term earlier that week in February but didn’t give it a precise meaning. That was a task for Eric and Bruce in followup meetings over the next few days.

Even when you’re the first to use or define a term, you can’t legally control how others use it, until you obtain a Trademark. Luckily for OSI, US trademark law recognizes the first user when you file an application, so they filed for a trademark right away. But what happened? It was rejected! The OSI’s official explanation reads:

We have discovered that there is virtually no chance that the U.S. Patent and Trademark Office would register the mark “open source”; the mark is too descriptive. Ironically, we were partly a victim of our own success in bringing the “open source” concept into the mainstream

This is our first 🚩 red flag and it lies at the basis of some of the conflicts which we will explore in this, and future posts. (tip: I found this handy Trademark search website in the process)

Regardless, since 1998, the OSI has vastly grown its scope of influence (more on that in future posts), with the Open Source Definition mostly unaltered for 25 years, and having been widely used in the industry.

Prior uses of the term “Open Source”

Many publications simply repeat the idea that OSI came up with the term, has the authority (if not legal, at least in practice) and call it a day. I, however, had nothing better to do, so I decided to spend a few days (which turned into a few weeks 😬) and see if I could dig up any references to “Open Source” predating OSI’s definition in 1998, especially ones with different meanings or definitions.

Of course, it’s totally possible that multiple people come up with the same term independently and I don’t actually care so much about “who was first”, I’m more interested in figuring out what different meanings have been assigned to the term and how widespread those are.

In particular, because most contention is around commercial limitations (non-competes) where receivers of the code are forbidden to resell it, this clause of the OSD stands out:

Free Redistribution: The license shall not restrict any party from selling (…)

Turns out, the term “Open Source” was already in use for more than a decade prior to the founding of the OSI.

OpenSource.com

In 1998, a business in Texas called “OpenSource, Inc” launched their website. They were a “Systems Consulting and Integration Services company providing high quality, value-added IT professional services”. Sometime during the year 2000, the website became a RedHat property. Enter the domain name in ICANN’s lookup and it reveals that the domain was registered on Jan 8, 1998, a month before the term was “invented” by Christine/Richard/Bruce. What a coincidence. We are just warming up…


Caldera announces Open Source OpenDOS

In 1996, a company called Caldera had “open sourced” a DOS operating system called OpenDos. Their announcement (accessible on google groups and a mailing list archive) reads:

Caldera Announces Open Source for DOS.
(…)
Caldera plans to openly distribute the source code for all of the DOS technologies it acquired from Novell, Inc.
(…)
Caldera believes an open source code model benefits the industry in many ways.
(…)
Individuals can use OpenDOS source for personal use at no cost.
Individuals and organizations desiring to commercially redistribute
Caldera OpenDOS must acquire a license with an associated small fee.

Today we would refer to it as dual-licensing, using Source Available due to the non-compete clause. But in 1996, actual practitioners referred to it as “Open Source” and OSI couldn’t contest it because it didn’t exist!

You can download the OpenDos package from ArchiveOS and have a look at the license file, which includes even more restrictions such as “single computer”. (like I said, I had nothing better to do).

Investigations by Martin Espinoza re: Caldera

On his blog, Martin has an article making a similar observation about Caldera’s prior use of “open source”, following up with another article which includes a response from Lyle Ball, who headed the PR department of Caldera

Quoting Martin:

As a member of the OSI, he [Bruce] frequently championed that organization’s prerogative to define what “Open Source” means, on the basis that they invented the term. But I [Martin] knew from personal experience that they did not. I was personally using the term with people I knew before then, and it had a meaning — you can get the source code. It didn’t imply anything at all about redistribution.

The response from Caldera includes such gems as:

I joined Caldera in November of 1995, and we certainly used “open source” broadly at that time. We were building software. I can’t imagine a world where we did not use the specific phrase “open source software”. And we were not alone. The term “Open Source” was used broadly by Linus Torvalds (who at the time was a student (…), John “Mad Dog” Hall who was a major voice in the community (he worked at COMPAQ at the time), and many, many others.

Our mission was first to promote “open source”, Linus Torvalds, Linux, and the open source community at large. (…) we flew around the world to promote open source, Linus and the Linux community….we specifically taught the analysts houses (i.e. Gartner, Forrester) and media outlets (in all major markets and languages in North America, Europe and Asia.) (…) My team and I also created the first unified gatherings of vendors attempting to monetize open source

So according to Caldera, “open source” was a phenomenon in the industry already and Linus himself had used the term. He mentions plenty of avenues for further research, I pursued one of them below.

Linux Kernel discussions

Mr. Ball’s mentions of Linus and Linux piqued my interest, so I started digging.

I couldn’t find a mention of “open source” in the Linux Kernel Mailing List archives prior to the OSD day (Feb 1998), though the archives only start as of March 1996. I asked ChatGPT where people used to discuss Linux kernel development prior to that, and it suggested 5 Usenet groups, which google still lets you search through:

What were the hits? Glad you asked!

comp.os.linux: a 1993 discussion about supporting binary-only software on Linux

This conversation predates the OSI by five whole years and leaves very little to the imagination:

The GPL and the open source code have made Linux the success that it is. Cygnus and other commercial interests are quite comfortable with this open paradigm, and in fact prosper. One need only pull the source code to GCC and read the list of many commercial contributors to realize this.

comp.os.linux.announce: 1996 announcement of Caldera’s open-source environment

In November 1996, Caldera shows up again, this time with a Linux-based “open-source” environment:

Channel Partners can utilize Caldera’s Linux-based, open-source environment to remotely manage Windows 3.1 applications at home, in the office or on the road. By using Caldera’s OpenLinux (COL) and Wabi solution, resellers can increase sales and service revenues by leveraging the rapidly expanding telecommuter/home office market. Channel Partners who create customized turn-key solutions based on environments like SCO OpenServer 5 or Windows NT,

comp.os.linux.announce: 1996 announcement of a trade show

On 17 Oct 1996 we find this announcement

There will be a Open Systems World/FedUnix conference/trade show in Washington DC on November 4-8. It is a traditional event devoted to open computing (read: Unix), attended mostly by government and commercial Information Systems types.

In particular, this talk stands out to me:

** Schedule of Linux talks, OSW/FedUnix'96, Thursday, November 7, 1996 ***
(…)
11:45 Alexander O. Yuriev, “Security in an open source system: Linux study

The context here seems to be open standards, and maybe also the open source development model.

1990: Tony Patti on “software developed from open source material”

In 1990, a magazine editor by the name of Tony Patti not only refers to Open Source software, but also mentions that the NSA in 1987 noted that “software was developed from open source material”.

1995: open-source changes emails on OpenBSD-misc email list

I could find one mention of “Open-source” on an OpenBSD email list; it seems there was a directory “open-source-changes” which had incoming patches, distributed over email (source). Though perhaps the way to interpret it is that it concerns “source-changes” to OpenBSD, paraphrased as “open”, so let’s not count this one.

(I did not look at other BSD’s)

Bryan Lunduke’s research

Bryan Lunduke has done similar research and found several more USENET posts about “open source”, clearly in the context of open source software, predating OSI by many years. He breaks it down on his substack. Some interesting examples he found:

19 August, 1993 post to comp.os.ms-windows

Anyone else into “Source Code for NT”? The tools and stuff I’m writing for NT will be released with source. If there are “proprietary” tricks that MS wants to hide, the only way to subvert their hoarding is to post source that illuminates (and I don’t mean disclosing stuff obtained by a non-disclosure agreement).

(source)

Then he writes:

Open Source is best for everyone in the long run.

Written as a matter-of-fact generalization to the whole community, implying the term is well understood.

December 4, 1990

BSD’s open source policy meant that user developed software could be ported among platforms, which meant their customers saw a much more cost effective, leading edge capability combined hardware and software platform.

source

1985: The “the computer chronicles documentary” about UNIX.

The Computer Chronicles was a TV documentary series about computer technology; it started as a local broadcast, but in 1983 became a national series. In February 1985, they broadcast an episode about UNIX. You can watch the entire 28 min episode on archive.org, and it’s an interesting snapshot in time, when UNIX was coming out of its shell and competing with MS-DOS with its multi-user and concurrent multi-tasking features. It contains a segment in which Bill Joy, co-founder of Sun Microsystems, is interviewed about Berkeley Unix 4.2. Sun had more than 1000 staff members, and now its CTO was on national TV in the United States. This was a big deal, with a big audience. At 13:50 min, the interviewer quotes Bill:

“He [Bill Joy] says its open source code, versatility and ability to work on a variety of machines means it will be popular with scientists and engineers for some time”

“Open Source” on national TV. 13 years before the founding of OSI.


Uses of the word “open”

We’re specifically talking about “open source” in this article. But we should probably also consider how the term “open” was used in software, as they are related, and that may have played a role in the rejection of the trademark.

Well, the Open Software Foundation launched in 1988. (10 years before the OSI). Their goal was to make an open standard for UNIX. The word “open” is also used in software, e.g. Common Open Software Environment in 1993 (standardized software for UNIX), OpenVMS in 1992 (renaming of VAX/VMS as an indication of its support of open systems industry standards such as POSIX and Unix compatibility), OpenStep in 1994 and of course in 1996, the OpenBSD project started. They have this to say about their name: (while OpenBSD started in 1996, this quote is from 2006):

The word “open” in the name OpenBSD refers to the availability of the operating system source code on the Internet, although the word “open” in the name OpenSSH means “OpenBSD”. It also refers to the wide range of hardware platforms the system supports.

Does it run DOOM?

The proof of any hardware platform is always whether it can run Doom. Since the DOOM source code was published in December 1997, I thought it would be fun to see whether id Software happened to use the term “Open Source” at that time. There are some FTP mirrors where you can still see the files with the original December 1997 timestamps (e.g. this one). However, after sifting through the README and other documentation files, I only found references to the “Doom source code”. No mention of Open Source.

The origins of the famous “Open Source” trademark application: SPI, not OSI

This is not directly relevant, but may provide useful context: In June 1997 the SPI (“Software In the Public Interest”) organization was born to support the Debian project, funded by its community, although it grew in scope to help many more free software / open source projects. It looks like Bruce, as a representative of SPI, started the “Open Source” trademark proceedings (and may have paid for it himself). But then something happened: 3/4 of the SPI board (including Bruce) left and founded the OSI, which Bruce announced along with a note that the trademark would move from SPI to OSI as well. Ian Jackson - Debian Project Leader and SPI president - expressed his “grave doubts” and lack of trust. SPI later confirmed they owned the trademark (application) and would not let any OSI members take it. The perspective of Debian developer Ean Schuessler provides more context.

A few years later, it seems wounds were healing, with Bruce re-applying to SPI, Ean making amends, and Bruce taking the blame.

All the bickering over the Trademark was ultimately pointless, since it didn’t go through.

Searching for SPI on the OSI website reveals no acknowledgment of SPI’s role in the story. You only find mentions in board meeting notes (ironically, they’re all requests to SPI to hand over domains or to share some software).

By the way, in November 1998, this is what SPI’s open source web page had to say:

Open Source software is software whose source code is freely available

A Trademark that was never meant to be.

Lawyer Kyle E. Mitchell knows how to write engaging blog posts. Here is one where he digs further into the topic of trademarking and why “open source” is one of the worst possible terms to try to trademark (in comparison to, say, Apple computers).

He writes:

At the bottom of the hierarchy, we have “descriptive” marks. These amount to little more than commonly understood statements about goods or services. As a general rule, trademark law does not enable private interests to seize bits of the English language, weaponize them as exclusive property, and sue others who quite naturally use the same words in the same way to describe their own products and services.
(…)
Christine Peterson, who suggested “open source” (…) ran the idea past a friend in marketing, who warned her that “open” was already vague, overused, and cliche.
(…)
The phrase “open source” is woefully descriptive for software whose source is open, for common meanings of “open” and “source”, blurry as common meanings may be and often are.
(…)
no person and no organization owns the phrase “open source” as we know it. No such legal shadow hangs over its use. It remains a meme, and maybe a movement, or many movements. Our right to speak the term freely, and to argue for our own meanings, understandings, and aspirations, isn’t impinged by anyone’s private property.

So, we have here a great example of the Trademark system working exactly as intended, doing the right thing in the service of the people: not giving away unique rights to common words, rights that were demonstrably never OSI’s to have.

I can’t decide which is more wild: OSI’s audacious outcries for the whole world to forget about the trademark failure and trust their “pinky promise” right to authority over a common term, or the fact that so much of the global community actually fell for it and repeated a misguided narrative without much further thought. (myself included)

I think many of us, through our desire to be part of a movement with a positive, fulfilling mission, were too easily swept away by OSI’s origin tale.

Co-opting a term

OSI was never relevant as an organization and hijacked a movement that was well underway without them.

(source: a harsh but astute Slashdot comment)

We have plentiful evidence that “Open Source” was used for at least a decade prior to OSI existing, in the industry, in the community, and possibly in government. You saw it at trade shows, in various newsgroups around Linux and Windows programming, and on national TV in the United States. The word was often uttered without any further explanation, implying it was a known term. For a movement that happened largely offline in the eighties and nineties, it seems likely there were many more examples that we can’t access today.

“Who was first?” is interesting, but more relevant is “what did it mean?”. Many of these uses were fairly informal and/or didn’t consider re-distribution. We saw these meanings:

  • a collaborative development model
  • portability across hardware platforms, open standards
  • disclosing (making available) of source code, sometimes with commercial limitations (e.g. per-seat licensing) or restrictions (e.g. non-compete)
  • possibly a buzz-word in the TV documentary

Then came the OSD which gave the term a very different, and much more strict meaning, than what was already in use for 15 years. However, the OSD was refined, “legal-aware” and the starting point for an attempt at global consensus and wider industry adoption, so we are far from finished with our analysis.

(ironically, it never quite matched with free software either - see this e-mail or this article)

Legend has it…

Repeat a lie often enough and it becomes the truth

Yet, the OSI still promotes their story around being first to use the term “Open Source”. RedHat’s article still claims the same. I could not find evidence of resolution. I hope I just missed it (please let me know!). What I did find is one request for clarification remaining unaddressed and another handled in a questionable way, to put it lightly. Expand all the comments in the thread and see for yourself. For an organization all about “open”, this seems especially strange. It seems we have veered far away from the “We will not hide problems” motto in the Debian Social Contract.

Real achievements are much more relevant than “who was first”. Here are some suggestions for actually relevant ways the OSI could introduce itself and its mission:

  • “We were successful open source practitioners and industry thought leaders”
  • “In our desire to assist the burgeoning open source movement, we aimed to give it direction and create alignment around useful terminology”.
  • “We launched a campaign to positively transform the industry by defining the term - which had thus far only been used loosely - precisely and popularizing it”

I think any of these would land well in the community. Instead, they seem strangely obsessed with “we coined the term, therefore we decide its meaning, and anything else is ‘flagrant abuse’”.

Is this still relevant? What comes next?

Trust takes years to build, seconds to break, and forever to repair

I’m quite an agreeable person, and until recently happily defended the Open Source Definition. Now, my trust has been tainted, but at the same time, there is beauty in knowing that healthy debate has existed since the day OSI was announced. It’s just a matter of making sense of it all, and finding healthy ways forward.

Most of the events covered here are from 25 years ago, so let’s not linger too much on it. There is still a lot to be said about adoption of Open Source in the industry (and the community), tension (and agreements!) over the definition, OSI’s campaigns around awareness and standardization and its track record of license approvals and disapprovals, challenges that have arisen (e.g. ethics, hyper clouds, and many more), some of which have resulted in alternative efforts and terms. I have some ideas for productive ways forward.

Stay tuned for more, sign up for the RSS feed and let me know what you think!
Comment below, on X or on HackerNews

August 29, 2024

In his latest Lex Fridman appearance, Elon Musk makes some excellent points about the importance of simplification.

Follow these steps:

  1. Simplify the requirements
  2. For each step, try to delete it altogether
  3. Implement well

1. Simplify the Requirements

Even the smartest people come up with requirements that are, in part, dumb. Start by asking yourself how they can be simplified.

There is no point in finding the perfect answer to the wrong question. Try to make the question as least wrong as possible.

I think this is so important that it is included in my first item of advice for junior developers.

There is nothing so useless as doing efficiently that which should not be done at all.

2. Delete the Step

For each step, consider if you need it at all, and if not, delete it. Certainty is not required. Indeed, if you only delete what you are 100% certain about, you will leave in junk. If you never put things back in, it is a sign you are being too conservative with deletions.

The best part is no part.

Some further commentary by me:

This applies both to the product and technical implementation levels. It’s related to YAGNI, Agile, and Lean, also mentioned in the first section of advice for junior developers.

It’s crucial to consider probabilities and compare the expected cost/value of different approaches. Don’t spend 10 EUR each day to avoid a 1% chance of needing to pay 100 EUR. Consistent Bayesian reasoning will reduce making such mistakes, though Elon’s “if you do not put anything back in, you are not removing enough” heuristic is easier to understand and implement.

3. Implement Well

Here, Elon talks about optimization and automation, which are specific to his problem domain of building a supercomputer. More generally, this can be summarized as good implementation, which I advocate for in my second section of advice for junior developers.

 

The relevant segment begins at 43:48.

The post Simplify and Delete appeared first on Entropy Wins.