Planet Grep

Planet'ing Belgian FLOSS people

Planet Grep is maintained by Wouter Verhelst. All times are in UTC.

March 21, 2025

On the deep desire to get scammed

Following the fads and doing like everyone else

Stefano Marinelli, a seasoned system administrator, mostly installs servers running FreeBSD, OpenBSD or NetBSD for his clients. The hardest part? Convincing a client who absolutely wants a "Kubernetes cluster running on Linux", but has no idea what that is, that it is not always a good idea. On the other hand, if he silently migrates virtual machines to FreeBSD jails, he gets panicked calls because "everything runs too fast now, how much is this hardware upgrade going to cost us?".

That is the big problem with the engineering profession: the engineer is supposed to analyse a problem and propose solutions, but a manager, to justify his own job, has most of the time already decided which solution he wants the engineer to implement, even when it is the wrong one.

Fortunately, conflicts are increasingly rare: every engineering school now teaches management, and most engineering students no longer learn to be critical when solving problems. Universities are creating a world of Juliuses:

Those who dare to ask "but why?" are the exceptions, the rebels.

Stefano goes on with other anecdotes: how a project collapsed because a developer's bad code kept filling up the disks of Stefano's servers. Rather than fixing the code, it was deemed more diplomatic to listen to the developer and "move to the cloud". The disks no longer filled up in a few hours as before. The project ran for a month on the "cloud" before the bill arrived. And the project's bank account was emptied.

Or how a healthcare organisation refused to update its servers in order to invest in the design of a "cloud" infrastructure which, five years later, is still at the design stage despite the budget poured into the "cloud consultant". The organisation ends up running… Windows XP, and calls Stefano when everything crashes.

The SEO scam

I lived through a similar story when I built, for a small company, a website that included a CMS, order management and invoice generation (I had done everything with Django). One day, I get a phone call from someone I don't know, asking for access to the server hosting the site. I refuse, of course, but the tone escalates. I hang up, convinced I am dealing with some kind of scam. A few minutes later, my client calls to ask why I did not give access to the person who called me. I tried the reasonable approach, "Do you really want me to hand access to your entire infrastructure to the first person who calls and asks for it?", without success. I finally agreed to give access, but explained that I required a written order from her and that I would then disclaim all responsibility. At that point, the client seemed to understand.

After much explaining, it turned out that she had hired, without telling me, an SEO consultant who wanted to add a Google Analytics code to her site. SEO, Search Engine Optimisation, consists of trying to push a website up in Google's search results.

I explained to my client that even with access to the server, the SEO guy would have been unable to modify the Django code, but that, no problem, he just had to email me the code to add (to this day I still wonder what the guy would have done if I had given him the "administrator access" to the server he was asking for). A few days later, a second email asks me to change the Google Analytics code that had been added. I comply.

Then I start receiving complaints that I am not doing my job, that the code is not the right one. I change it again. The same circus repeats itself two or three times and my client gets angry and calls me incompetent. It takes me several days of investigation and several conference calls with the SEO guys to realise that the emails are coming from two different SEO companies (with similar domain names, which had gone straight over my head while reading the emails).

My client had in fact hired two different SEO companies, without telling them and without telling me. The two companies were fighting to get their own Google Analytics code in, not understanding why I kept putting in the "wrong" one. The whole thing came to light during a heated conference call where I pointed to an email received the day before, which my correspondent claimed never to have sent (of course: it came from the other company).

I confronted my client and managed to find out that, apart from providing summaries pulled from Google Analytics, these two companies did nothing, yet each had been paid three times the price I had charged for building the entire site, the order management and the invoicing. That, by the way, was why the client looked down on me compared with the SEO companies: I was cheap, so I had to be incompetent.

To be fair, one of the companies did do its "work" and sent me a report with minor changes to make to the site to improve SEO, while noting that the site was already very good and that there was not much to do (essentially, they asked me to add keywords to the meta tags, something I knew was already obsolete back then, but which I did without arguing).

Furious, I published a blog post that shocked the SEO community so much that I received dozens of insulting emails and even physical threats (you know, the kind where the guy has dug up some personal information and tries to intimidate you by showing you he knows how to run a Google search on your name).

A whole community took up the game of making sure that the first Google result for my name would be a string of insults. Flattered by so much attention for a simple, unpretentious blog post, I mostly realised, reading the forums where they discussed my case, that I was dealing with dishonest, unscrupulous people, plain stupid and nasty to an almost parodic degree.

The enshittification of the web through SEO

Some, more moderate, tried to convince me that "not all SEO". My answer: yes, all of it. That is the very principle. You just don't want to see it because you are someone with a certain ethic and it conflicts with your source of income. But it's kind of you to write to me calmly without insulting me.

The web has become a giant pile of garbage generated by SEO.

Solderpunk, for instance, wonders about a mysterious cloud-cover measurement but, faced with the enshittification of the web and the technological appropriation of the word "cloud", he falls back on asking his question to other humans, on the Gemini network. Because the web no longer lets him find an answer, or put the question to other human beings.

The web was supposed to connect us; enshittification and AI force us to retreat into alternative spaces where we can talk among humans, even to solve the very problems for which AI and the web are supposed to be the most useful: answering our technical and factual questions. Digging up rare, hard-to-reach information.

Close your accounts on the enshittified platforms

This return to small communities is a movement. Thierry Crouzet is also getting into Gemini:

But, above all, he is permanently closing Facebook, X, Bluesky, Instagram and soon, perhaps, Whatsapp. For those hesitating to do the same, first-hand accounts are always valuable.

Thierry is not the only one: Vigrey is also closing his Facebook account and talks about it… on Gemini.

One thing is certain: you will not manage to migrate all your contacts, for one simple reason. Many people want to be scammed. They ask for it. Like my entrepreneur client, they don't want a rational argument, they don't want a solution. It's up to you not to let them decide your digital future.

And don't expect everyone to end up on the same social network one day.

The global impact of AI on the web

AI essentially produces crap and you should never trust it. That much you already know.

But above all it has a huge impact on those who don't use it. A lot is said about the resources consumed in datacenters but, much closer to home and more directly, AIs are flooding the web with requests, trying to suck up every piece of content imaginable.

There is a well-established, decades-old standard that lets you put a file called "robots.txt" on your website. This file contains the rules that a robot accessing your site is supposed to respect. It allows you, for example, to tell Google's crawler not to visit certain pages, or not to visit them too often.
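As an illustration, a minimal robots.txt could look like the lines below. The paths are invented for the example; Crawl-delay is a non-standard extension that only some crawlers honour, and GPTBot is the user agent OpenAI documents for its crawler:

User-agent: *
Crawl-delay: 10
Disallow: /private/

User-agent: GPTBot
Disallow: /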

Unsurprisingly, the robots used for AI don't respect these rules. Worse, they disguise themselves to look like genuine users. They are fundamentally dishonest and know exactly what they are doing: they literally come and copy your content without your consent in order to reuse it. And they do it hundreds, thousands of times per second, which is putting the whole infrastructure of the web under strain.

Drew DeVault describes his experience with the Sourcehut infrastructure, on which this blog is hosted.

All those datacenters built in a hurry to do "AI"? They are being used to run DoS (Denial of Service) attacks against the entire infrastructure of the web, in order to "pirate" content without respecting licences or copyright.

Not that I'm a fan of copyright, quite the opposite. It's just that for 30 years we have been told over and over that "copying is theft", and that Aaron Swartz killed himself because he was facing 30 years in prison for having automated the download of a few thousand scientific papers which he considered, rightly, to belong to the public domain.

AI consumes resources, wrecks our networks, brings the volunteer system administrators of community sites to their knees and appropriates our content. And all of that for what? To generate SEO content that will fill up the web even more. Yes, it's a feedback loop. No, it can't end well.

The fashion for incompetence

SEO, the cloud and now AI are very similar in this respect: they are fashions. Clients want them at any price and literally ask to be scammed, all while bragging about their incompetence.

In a way, it serves them right: they want the fashionable thing without even knowing why they want it. My client wanted SEO even though hers was an essentially local business targeting a niche clientele she was already in contact with. Clients want "cloud" so they don't have to pay a system administrator like Stefano, but they pay ten times the price for a consultant and end up calling Stefano when everything goes wrong. Likewise, they now want AI without even knowing why they want it.

AI is, in fact, the junk food of thought: appetising on the outside, but with no nutritional value and, in the long run, a total loss of any culture of taste, of flavour.

Even though I handed over all the code and all the credentials, even though I put her in touch with other Django developers, the company I talk about in this post did not survive long after my departure. Its initial capital and, above all, the state start-up subsidies it received essentially ended up in the pockets of two SEO companies that did nothing more than create a Google Analytics account. Today it's the same with the cloud and AI: the game is to exploit as much as possible the credulity of small entrepreneurs who are able to obtain state subsidies, in order to empty their pockets. Along with the state's own pockets, into which politicians dip with boundless enthusiasm as soon as a fashionable buzzword is uttered.

I naively thought I was offering an ethical service; I thought I would talk with clients to address their real needs.

I never imagined that clients wanted, at all costs, to be scammed.

I'm Ploum and I have just published Bikepunk, an eco-cycling fable typed entirely on a mechanical typewriter. To support me, buy my books (if possible from your local bookshop)!

Receive my writings in French and in English directly by email. Your address will never be shared. You can also use my French-language RSS feed or the full RSS feed.

March 19, 2025

In the previous post, we saw how to deploy MySQL HeatWave on Amazon. Multicloud refers to the coordinated use of cloud services from multiple providers. Building on that deployment, we will explore how to connect it with another cloud service. Oracle has partnered with Microsoft to offer […]

March 15, 2025

When I searched for a new LoRaWAN indoor gateway, my primary criterion was that it should be capable of running open-source firmware. The ChirpStack Gateway OS firmware caught my attention. It's based on OpenWrt and has regular releases. Its recent 4.7.0 release added support for the Seeed SenseCAP M2 Multi-Platform Gateway, which seemed like an interesting and affordable option for a LoRaWAN gateway.

Unfortunately, this device wasn't available through my usual suppliers. However, TinyTronics did stock the SenseCAP M2 Data Only, which looked to me like exactly the same hardware but with different firmware to support the Helium LongFi Network. Ten minutes before their closing time on a Friday evening, I called their office to confirm whether I could use it as a LoRaWAN gateway on an arbitrary network. I was helped by a guy who was surprisingly friendly for the time of my call, and after a quick search he confirmed that it was indeed the same hardware. After this, I ordered this Helium variant of the gateway.

Upon its arrival, the first thing I did after connecting the antenna and powering it on was to search for the Backup/Flash Firmware entry in LuCI's System menu, as explained in Seeed Studio's wiki page about flashing open-source firmware to the M2 Gateway. Unfortunately, the M2 Data Only seemed to have a locked-down version of OpenWrt's LuCI interface, without the ability to flash other firmware. There was no SSH access either. I tried to flash the firmware via TFTP, but to no avail.

After these disappointing attempts, I submitted a support ticket to Seeed Studio, explaining my intention to install alternative firmware on the device, as I wasn't interested in the Helium functionality. I received a helpful response from a field application engineer with the high-level steps to do this, although I had to fill in some details myself. After getting stuck on a missing step, my follow-up query was promptly answered with the missing information and an apology for the incomplete instructions, and I finally succeeded in installing the ChirpStack Gateway OS on the SenseCAP M2 Data Only. Here are the detailed steps I followed.

Initial serial connection

Connect the gateway via USB and start a serial connection with a baud rate of 57600. I used GNU Screen for this purpose:

$ screen /dev/ttyUSB0 57600

When the U-Boot boot loader shows its options, press 0 for "Load system code then write to Flash via Serial":

/images/sensecap-m2-uboot-menu.png

You'll then be prompted to switch the baud rate to 230400 and press ENTER. I terminated the screen session with Ctrl+a k and reconnected with the new baud rate:

$ screen /dev/ttyUSB0 230400

Sending the firmware with Kermit

Upon pressing ENTER, you'll see the message Ready for binary (kermit) download to 0x80100000 at 230400 bps.... I had never used the Kermit protocol before, but I installed ckermit and found the procedure in a Stack Overflow answer to the question How to send boot files over uart. After some experimenting, I found that I needed to use the following commands:

 koan@nov:~/Downloads$ kermit
C-Kermit 10.0 pre-Beta.11, 06 Feb 2024, for Linux+SSL (64-bit)
 Copyright (C) 1985, 2024,
  Trustees of Columbia University in the City of New York.
  Open Source 3-clause BSD license since 2011.
Type ? or HELP for help.
(~/Downloads/) C-Kermit>set port /dev/ttyUSB0
(~/Downloads/) C-Kermit>set speed 230400
/dev/ttyUSB0, 230400 bps
(~/Downloads/) C-Kermit>set carrier-watch off
(~/Downloads/) C-Kermit>set flow-control none
(~/Downloads/) C-Kermit>set prefixing all
(~/Downloads/) C-Kermit>send openwrt.bin

The openwrt.bin file was the firmware image from Seeed's own LoRa_Gateway_OpenWRT firmware. I decided to install this instead of the ChirpStack Gateway OS because it was a smaller image and hence flashed more quickly (although still almost 8 minutes).
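To avoid retyping these commands for a future flash, they could also be saved in a C-Kermit command file and replayed with the take command. This is only a sketch I haven't tested on this exact setup, and flash.ksc is just a name made up for the example:

; flash.ksc - replay the serial settings and send the firmware image
set port /dev/ttyUSB0
set speed 230400
set carrier-watch off
set flow-control none
set prefixing all
send openwrt.bin

(~/Downloads/) C-Kermit>take flash.ksc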

/images/sensecap-m2-kermit-send.png

After the file was sent successfully, I didn't see any output when reestablishing a serial connection. When I reported this to Seeed's field application engineer, he replied that the gateway should display a prompt requesting to switch the baud rate back to 57600.

Kermit can also function as a serial terminal, so I just stayed within the Kermit command line and entered the following commands:

(~/Downloads/) C-Kermit>set speed 57600
/dev/ttyUSB0, 57600 bps
(~/Downloads/) C-Kermit>connect
Connecting to /dev/ttyUSB0, speed 57600
 Escape character: Ctrl-\ (ASCII 28, FS): enabled
Type the escape character followed by C to get back,
or followed by ? to see other options.
----------------------------------------------------
## Total Size      = 0x00840325 = 8651557 Bytes
## Start Addr      = 0x80100000
## Switch baudrate to 57600 bps and press ESC ...

And indeed, there was the prompt. After pressing ESC, the transferred image was flashed.

Reboot into the new firmware

Upon rebooting, the device was now running Seeed's open-source LoRaWAN gateway operating system. LuCI's menu now included a Backup/Flash Firmware entry in the System menu, enabling me to upload the ChirpStack Gateway OS image:

/images/sensecap-m2-openwrt-new-firmware.png

Before flashing the firmware image, I deselected the Keep settings and retain the current configuration option, as outlined in ChirpStack's documentation for installation on the SenseCAP M2:

/images/sensecap-m2-openwrt-flash.png

Thus, I now have open-source firmware running on my new LoRaWAN gateway, with regular updates in place.

Imagine waking up to discover that overnight, AI agents rewrote 500 product descriptions, reorganized 300 pages for SEO, and updated 9,000 alt-text descriptions on your website.

As you review the changes over coffee, you find three product descriptions featuring nonexistent features. If published, customers will order based on false expectations. Then you notice another problem: AI rewrote hundreds of alt-text descriptions, erasing the ones your team crafted for accessibility.

AI-driven content management isn't a distant scenario. Soon, Content Management Systems (CMS) may deploy hundreds of AI agents making bulk edits across thousands of pages.

The challenge? Traditional CMS workflows weren't designed for AI-powered editing at scale. What features should an AI-first CMS include? What safeguards would prevent errors? What workflows would balance efficiency with quality control? I'm outlining some rough ideas to start a conversation and inspire Drupal contributors to help build this future.

1. Smart review queues: scaling human oversight

AI-generated content needs different quality checks than human work. Current editorial workflows aren't optimized to handle its output volume.

I envision "AI review queues" with specialized tools like:

  • Spot-checking: Instead of manually reviewing everything, editors can sample AI content strategically. They focus on key areas, like top-selling products or pages flagged by anomaly detection. Reviewing just 5% of the changes could provide confidence; good samples suggest the broader set works well. If issues are found, it signals the need for deeper review.
  • Rolled-up approvals: Instead of approving AI edits one by one, CMS platforms could summarize large-scale AI changes into a single reviewable batch.

2. Git-like content versioning: selective control over AI changes

Say an AI translated your site into Spanish with mixed results. Meanwhile, editors updated the English content. Without sophisticated versioning, you face a tough choice: keep poor translations or roll everything back, losing days of human work.

CMS platforms need Git-like branch-based versioning for content. AI contributions should exist in separate branches that teams can merge, modify, or reject independently.
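As a deliberately simplified sketch of the idea (nothing here maps to an existing CMS API; it only makes the branch-and-merge vocabulary concrete):

from copy import deepcopy

class ContentRepo:
    """Toy content store with Git-like branches: branch name -> {page id: text}."""

    def __init__(self, pages):
        self.branches = {"main": dict(pages)}

    def branch(self, name, source="main"):
        # AI edits land in their own branch, leaving 'main' untouched.
        self.branches[name] = deepcopy(self.branches[source])

    def merge(self, source, target="main", only=None):
        # Merge everything, or only the page ids a human reviewer approved.
        for page_id, text in self.branches[source].items():
            if only is None or page_id in only:
                self.branches[target][page_id] = text

repo = ContentRepo({"about": "About us", "contact": "Call us"})
repo.branch("ai-spanish-translation")
repo.branches["ai-spanish-translation"]["about"] = "Sobre nosotros"
repo.merge("ai-spanish-translation", only={"about"})  # accept one page, hold back the rest

With this model, mediocre AI translations never overwrite the human edits on main; reviewers cherry-pick only what is good enough to merge.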

3. Configuration versioning: keeping AI from breaking your CMS

AI isn't just generating content. It is also modifying site configurations, permissions, content models and more. Many CMS platforms don't handle "configuration versioning" well. Changes to settings and site structures are often harder to track and undo.

CMS platforms also need Git-like versioning for configuration changes, allowing humans to track, review, and roll back AI-driven modifications just as easily as content edits. This ensures AI can assist with complex site management tasks without introducing silent, irreversible changes.

4. Enhanced audit trails: understanding AI decisions

Standard CMS audit logs track who made changes and when, but AI operations demand deeper insights. When multiple AI agents modify your site, we need to know which agent made each change, why it acted, and what data influenced its decision. Without these explanations, tracking down and fixing AI errors becomes nearly impossible.

AI audit trails should record confidence scores showing how certain an agent was about its changes (60% vs 95% certainty makes a difference). They need to document reasoning paths explaining how each agent reached its conclusion, track which model versions and parameters were used, and preserve the prompt contexts that guided the AI's decisions. This comprehensive tracking creates accountability in multi-agent environments where dozens of specialized AIs might collaborate on content.
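Concretely, an audit entry could carry fields like the following; the structure and values are invented to make the list above tangible, not taken from any existing CMS:

audit_entry = {
    "agent": "product-description-rewriter",  # which AI agent made the change
    "target": "node/1234",                    # the content item it modified (hypothetical id)
    "action": "update_body",
    "confidence": 0.92,                       # how certain the agent was
    "reasoning": "Shortened copy; removed a speed claim it could not verify.",
    "model": "example-vision-model-2025-03",  # model version used
    "parameters": {"temperature": 0.2},
    "prompt_context": "Rewrite for clarity; do not invent product features.",
}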

This transparency also supports compliance requirements, ensuring organizations can demonstrate responsible AI oversight.

5. AI guardrails: enforcing governance and quality control

AI needs a governance layer to ensure reliability and compliance. Imagine a healthcare system where AI-generated medical claims must reference approved clinical studies, or a financial institution where AI cannot make investment recommendations without regulatory review.

Without these guardrails, AI could generate misleading or non-compliant content, leading to legal risks, financial penalties, or loss of trust.

Instead of just blocking AI from certain tasks, AI-generated content should be checked for missing citations, regulatory violations, and factual inconsistencies before publication.

Implementing these safeguards likely requires a "rules engine" that intercepts and reviews AI outputs. This could involve pattern matching to detect incorrect content, as well as fact verification against approved databases and trusted sources. For example, a healthcare CMS could automatically verify AI-generated medical claims against clinical research databases. A financial platform might flag investment advice containing unapproved claims for compliance review.
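As a toy illustration of such a rules engine, pattern matching alone can already route risky AI output to a human queue. The rules below are invented for the example; a real system would add fact verification against trusted sources:

import re

# Invented example rules: each maps a name to a pattern that should block auto-publishing.
RULES = [
    ("unsupported claim", re.compile(r"\bstudies show\b", re.IGNORECASE)),
    ("investment advice", re.compile(r"\bguaranteed returns?\b", re.IGNORECASE)),
]

def review(content):
    """Return the names of the rules that the AI-generated content trips."""
    return [name for name, pattern in RULES if pattern.search(content)]

def publish_or_queue(content):
    violations = review(content)
    return ("queued for human review", violations) if violations else ("published", [])

print(publish_or_queue("Studies show guaranteed returns for every investor."))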

Strategic priorities for modern CMS platforms

I can't predict exactly how these ideas will take shape, but I believe their core principles address real needs in AI-integrated content management. As AI takes on a bigger role in how we manage content, building the right foundation now will pay off regardless of specific implementations. Two key investment areas stand out:

  1. Improved version control – AI and human editors will increasingly work in parallel, requiring more sophisticated versioning for both content and configuration. Traditional CMS platforms must evolve to support Git-like branching, precise rollback controls, and configuration tracking, ensuring both content stability and site integrity.
  2. AI oversight infrastructure – As AI generates and modifies content at scale, CMS platforms will need structured oversight systems. This includes specialized review queues, audit logs, and governance frameworks.

March 11, 2025

Don't wait, change your paradigms!

You have to go without a car for a while to truly understand, deep down, that the solution to many of our societal problems is not the electric car but the bikeable city.

We must not look for "equivalent alternatives" to what the market offers us; we must change the paradigms, the foundations. If we don't change the problem, if we don't thoroughly rethink our expectations and our needs, we will always end up with the same solution.

Migrating your contacts to Signal

I get a lot of messages asking me how I migrated to Mastodon and to Signal. And how I migrated my contacts to Signal.

There is no secret. Only one strategy really works to get your contacts interested in ethical alternatives: no longer being on the proprietary networks.

I know it's hard, that it feels like cutting yourself off from the world. But there is no other way. The first one to leave excludes themselves, true. But the second one who, inspired, dares to follow the first sets off an unstoppable movement. Because while one person who leaves is an "eccentric" or a "marginal", two people form a group. Suddenly, the followers are afraid of missing the boat.

So you have to summon your courage, announce your departure and stand firm. People need you just as you need them. They will eventually want to reach you. Yes, you will miss some information while people figure out that you are no longer there. Yes, some people who are on both networks will have to act as a bridge for a while.

You also have to accept the harsh realisation that some of your contacts are contacts only out of convenience, not out of any deep desire. Very few people genuinely care about you. That's the lot of humanity. Even a star who leaves a social network only takes a fraction of their followers with them. And not in any lasting way. Nobody is indispensable.

Refusing to leave a network until "everyone" is on the alternative implies the frightening conclusion that the most reactionary, most conservative member of the group dictates everyone's choices. Their refusal to move gives them outsized power over you and over everyone else. They represent "the majority" simply because you, who want to move, tolerate their reactionary side. But if you say you want to move and you don't, aren't you the conservative one yourself?

You really want to do without Whatsapp and Messenger? Don't wait, do it! Delete your account for a month to see the impact on your life. Give yourself permission to recreate the account if it turns out that the deletion isn't sustainable for you in the long run. But at least you will have tested the new paradigm, you will have become aware of your real needs.

Adopting the Fediverse

Joan Westenberg puts it very well about the Fediverse: the Fediverse is not the future, it's the present. Its problem is not that it's complicated or that nobody is there: it's simply that Google/Facebook/Apple marketing has formatted our brains into believing that the alternatives are not viable. The Fediverse is full of humans and creativity, but there are none so blind as those who will not see.

After dragging his feet for years about committing to it fully, Thierry Crouzet has reached the same conclusion: as far as social networks go, the Fediverse is the only viable option. Using a proprietary network is a compromise with, and a collaboration with, that network's ideology. He is encouraging the French-language book world to join the Fediverse.

I myself maintain a list of speculative-fiction writers active on the Fediverse. There are still far too few of them.

Your favourite influencer isn't on the Fediverse? But is it really essential to follow your favourite influencer on a social network? You are not on X because you want to follow that influencer. You follow that influencer because X makes you believe it's essential in order to be a true fan! The tool doesn't answer a need, it manufactures one from scratch.

The paradox of tolerance

You tolerate staying on Facebook/Messenger/Whatsapp out of "respect for those who aren't anywhere else"? You tolerate, keeping your mouth shut, your racist and homophobic uncle Albert spouting horrors at the family dinner so as "not to make things worse"? Besides, your auntie told you "it isn't worth it, you're better than that". You tolerate, without a word, smokers stinking up station platforms and café terraces out of "respect for their freedom"?

At some point, you have to choose: either you prefer not to make waves, or you want progress. The two are often incompatible.

You want to do without Facebook/Instagram/X? Once again, do it! Most of these networks let you restore a deleted account within 15 days of deletion. So test it! Two weeks without the accounts, to see whether you really want to restore them. It's up to you to change your paradigm!

LinkedIn, the bullshit network par excellence

We talk a lot about X because the platform has become a major promoter of fascism. But every platform carries values, and it's important to identify them to know whether they suit us or not. LinkedIn, for example. Which is indistinguishable from the parody of it by Babeleur (who has, precisely, just left that network).

I burst out laughing several times, it's that good. I wonder whether some people will have the lucidity to recognise themselves in it.

Once again, if LinkedIn bores you, if you hate that network but it seems indispensable so as not to "miss" certain professional opportunities: well, test it! Delete it for two weeks. Restore it, then delete it again. Just to see what it feels like to no longer be on that network. What it feels like to miss out on that big, foul-smelling pile of crap that you force yourself to dig through every day just in case it contains a gold nugget. Maybe that network really is indispensable to you, but the only way to know is to try doing without it for good.

Maybe you will miss some opportunities. But of this I am certain: by not being on that network, you will discover others.

Poetry, fiction…

Resistance isn't only technical. It must also be poetic! And for poetry to work, technology has to fade away, to become minimalist and useful instead of being the centre of attention.

You can't change the world. You can only change your own behaviour. The world is shaped by those who change their behaviour. So try to change. Try to change paradigms. For a week, a month, a year.

After that, I won't hide the risk from you: it's often hard to go back.

Once you've given up the car for the bike, it's impossible not to dream. You start imagining worlds where the car has completely disappeared to make way for the bike…

Book signings

By the way, I will be signing Bikepunk (and my other books) at the Foire du livre de Bruxelles this Saturday 15 March from 4:30 pm, at the stand of the province of Brabant-Wallon.

Shall we meet there to talk bikes and paradigm shifts?

I'm Ploum and I have just published Bikepunk, an eco-cycling fable typed entirely on a mechanical typewriter. To support me, buy my books (if possible from your local bookshop)!

Receive my writings in French and in English directly by email. Your address will never be shared. You can also use my French-language RSS feed or the full RSS feed.

March 08, 2025

20 years of Linux on the Desktop (part 3)

Previously in "20 years of Linux on the Desktop": After contributing to the launch of Ubuntu as the "perfect Linux desktop", Ploum realises that Ubuntu is drifting away from both Debian and GNOME. But something else is about to shake the world…

The new mobile paradigm

While I was focused on Ubuntu as a desktop solution, another GNOME+Debian product had appeared and was shaking the small free software world: Maemo.

It will come as a shock to the youngest but this was a time without smartphones (yes, we had electricity and, no, dinosaurs were already extinct, please keep playing Pokémon instead of interrupting me). Mobile phones were still quite new and did exactly two things: calls and SMSes. In fact, they were sold as calling machines, and the SMS frenzy, which was just a technical hack around the GSM protocol, took everybody by surprise, including operators. Were people really using awkward, cramped keyboards to send each other floods of small messages?

Small pocket computers with tiny keyboards started to appear. They were using proprietary operating systems like WinCE or Symbian and browsing a mobile version of the web, called "WAP", that required specific WAP sites and that nobody used. The Blackberry was so proprietary that it had its own proprietary network. It was particularly popular amongst business people who wanted to look serious. Obama was famously addicted to his Blackberry, to the point that the firm had to create a secure proprietary network just for him once he took office in the White House. But like the others, Blackberries were very limited, with very limited software. Nothing like a laptop computer.

N770, the precursor

In 2005, Nokia very quietly launched the N770 as an experiment. Unlike its competitors, it had no keyboard but a wide screen that could be used with a stylus. Inside, it was running a Debian system with an interface based on GNOME: Maemo.

The N770, browsing Wikipedia

Instead of doing all the development in-house, Nokia was toying with free software. Most of the software work was done by small European companies created by free software hackers between 2004 and 2005. Those companies, often created specifically to work with Nokia, were only a handful of people each and had very narrow expertise. Fluendo was working on the GStreamer media framework. Imendio was working on the GTK user interface layer. Collabora was focusing on messaging software. Etc.

Far from the hegemony of giant American monopolists, the N770 was a mostly European attempt at innovating through a collaborative network of smaller, creative actors, all of it led by the giant Nokia.

During FOSDEM 2005, GNOME developer Vincent Untz lent me an N770 prototype for two days. The first night was a dream come true: I was lying in bed, chatting on IRC and reading forums. Once the N770 was publicly released, I immediately bought my own. While standing in line at the bakery one Sunday morning, I discovered there was an unprotected wifi network. I used it to post a message on the Linuxfr website telling my fellow geeks that I was waiting for my croissants and could still chat with them thanks to free software.

These days, chatting while waiting in a queue has been normalised to the point that you notice someone not doing it. But, in 2005, this was brand new.

So new that it started a running meme about "Ploum's baker" on Linuxfr. Twenty years later, some people I meet for the first time still greet me with "say hello to your baker" when they learn who I am. For the record, the baker, already an old woman at the time of the original post, retired a couple of years later and the whole building was demolished to make way for a motorbike shop.

This anecdote highlights a huge flaw of the N770: without wifi, it was dead weight. When I showed it to people, they didn't understand what it was; they asked why I would carry it if I could not make calls with it. Not being able to use the Internet without wifi was a huge miss but, to be fair, 3G didn't exist yet. Another flaw was that installing new software was far from user-friendly. Being based on Debian, Maemo offered a Synaptic-like interface where you had to select your software from a very long list of .deb packages, including the technical libraries.

Also, it was slow and prone to crash but that could be solved.

Having played with the N770 in my bed and having seen the reactions of people around me when I used it, I knew that the N770 could become a worldwide hit. It was literally the future. There were only two things that Nokia needed to solve: make it a phone and make it easy to install new software. Also, if it could crash less, that would be perfect.

The Nokia (un)management guide to failure

But development seemed to stall. It would take more than two years for Nokia to successively release two successors to the N770: the N800 and the N810. But, besides somewhat better performance, none of the core issues were addressed. Neither of them was a phone. Neither offered easy installation of software. Neither was widely released. In fact, it was so confidential that you could only buy them through the Nokia website of some specific countries. The items were not in traditional shops or catalogues. When I asked my employer to get an N810, the purchasing department was unable to find a reference: it didn't exist for them. Tired of multiple days of discussion with the purchasing administration, my boss gave me his own credit card, asked me to purchase it on the Nokia website, and filed it as a "diverse material expense" to be reimbursed.

The thing was simply not available to businesses. It was as if Nokia wanted Maemo to fail at all costs.

While the N800 and N810 were released, a new device appeared on the market: the Apple iPhone.

I said that the problem with the N770 was that you had to carry a phone alongside it. Steve Jobs had come to the same conclusion with the iPod. People had to carry an iPod and a phone. So he added the phone to the iPod. It should be highlighted that the success of the iPhone took everyone by surprise, including Steve Jobs himself. The original iPhone was envisioned as an iPod and nothing else. There were no apps, no app store, no customisation (Steve Jobs was against it). It was nevertheless a hit because you could make calls, listen to music, and Apple spent a fortune in marketing to advertise it worldwide. The marketing frenzy was crazy. Multiple people who knew I was "good with computers" asked me if I could unlock the iPhone they had bought in the USA and which was not working in Europe (I could not). They had spent a fortune on a device that did not work. Those who had one were showing it to everyone.

With the iPhone, you had music listening and a phone on a single device. In theory, you could also browse the web. Of course, there was no 3G, so browsing the web was mostly done through wifi, like on the N770. But, at the time, websites were designed with wide screens in mind and Flash was all the rage. The iPhone did not support Flash and its screen was vertical, which made web browsing a lot worse than on the N770. And, unlike the N770, you could not install any applications.

The iPhone 1 was far from the revolution Apple wants us to believe it was. It was just very good marketing. In retrospect, the N770 could have been a huge success had Nokia done any marketing at all. They did none.

Another Linux on your mobile

In 2008, Google launched its first phone, which still had a physical keyboard. Instead of developing the software from scratch, Google used a Linux system initially developed as an embedded solution for cameras: Android. At the same time, Apple came to the realisation I had reached in 2005: installing software was a key feature. The App Store was born.

Phone, web browsing and custom applications, all on one device. Since 2005, people who had tried the N770 knew this was the answer. They simply did not expect it from Apple nor Google.

When Android was first released, I thought it was what Maemo should have been. Because of the Linux kernel, I was thinking it would be a "free" operating system. I made a deep comparison with Maemo, diving into some part of the source code, and was surprised by some choices. Why Java? And why would Android avoid GStreamer in its multimedia stack? Technical explanations around that choice were not convincing. Years later, I would understand that this was not a technical choice: besides the Linux kernel itself, Google would explicitly avoid every GPL and LGPL licensed code. Android was only "free software" by accident. Gradually, the Android Open Source Project (AOSP) would be reduced to a mere skeleton while Android itself became more and more restricted and proprietary.

In reaction to the iPhone and to Android, Nokia launched the N900 at the end of 2009. At last, the N900 was a phone. It even included an app store called, for unknown marketing reasons, the "OVI store". The phone was good. The software was good, with the exception of the infamous OVI store (which was bad, had a bad name, a non-existent software offering and, worst of all, conflicted with deb packages).

The N900 would probably have taken the world by storm if released three years earlier. It would have been a success and a huge competitor to the iPhone if released 18 months earlier. Was it too late? The world seemed to be settling into an Apple/Google duopoly. A duopoly that the N900 could have at least slightly shaken if Nokia had done any marketing at all. It should be noted that the N900 had a physical keyboard. But, at that point, nobody really cared.

When failing is not enough, dig deeper

At least, there was the Maemo platform. Four years of work. Something could be done with that. That's why, in 2010, Nokia decided to… launch MeeGo, a new Linux platform which replaced the Debian infrastructure with RPM and the GNOME infrastructure with Qt.

No, really.

Even if it was, theoretically, the continuation of Maemo (Maemo 6, codenamed Harmattan, was released as MeeGo 1), it felt like starting everything from scratch with a Fedora+KDE system. Instead of strong leadership, MeeGo was a medley of the Linux Foundation, Intel, AMD and Nokia. Design by committee with red tape everywhere. From the outside, it looked like Nokia had outsourced its own management incompetence and administrative hubris. The N9 phone would be released in 2011, without a keyboard but with MeeGo.

History would repeat itself two years later when the people working on MeeGo (without Nokia) replaced it with Tizen. Yet another committee.

From being three years ahead of the competition in 2005 thanks to Free Software, Nokia managed to be two years behind by 2010, thanks to incredibly bad management and to choosing to hide its products instead of advertising them.

I've no inside knowledge of what Nokia was like at the time, but my experience in the industry allows me to imagine perfectly the hundreds of meetings that probably happened back then.

When business decisions look like very bad management from the outside, it is often because they are. Across Europe at the time, technical expertise was seen as the realm of those who were not gifted enough to become managers. As a young engineer, I thought that managers at higher levels were pretentious and incompetent idiots. After climbing the ladder and becoming a manager myself, years later, I got confirmation that I had even been underestimating the sheer stupidity of management. Not only were most managers idiots, they were also proud of their incompetence and, as this story demonstrates, they sometimes needed to become deeply dishonest to succeed.

It looks like Nokia never really trusted its own Maemo initiative because no manager really understood what it was. To add insult to injury, the company bought Symbian OS in 2008, an operating system which was already dated and highly limited at that time. Nobody could figure out why they spent cash on that, nor why Symbian suddenly became an internal competitor to Maemo (Symbian was running on much cheaper devices).

The emotional roller coaster

In 2006, I was certain that free software would take over the world. It was just a matter of time. Debian and GNOME would soon be on most desktops thanks to Ubuntu, and on most mobile devices thanks to Maemo. There was no way for Microsoft to compete against such power. My wildest dreams were coming true.

Five years later, the outlook was way darker. Apple was taking the lead by being even more proprietary and closed than Microsoft. Google seemed like good guys but could we trust them? Even Ubuntu was drifting away from its own Debian and GNOME roots. The communities I loved so much were now fragmented.

Where would I go next?

(to be continued)

Subscribe by email or by rss to get the next episodes of "20 years of Linux on the Desktop".

I'm currently turning this story into a book. I'm looking for an agent or a publisher interested in working with me on this book and on an English translation of "Bikepunk", my new post-apocalyptic cycling novel, typed on a typewriter, which sold out in three weeks in France and Belgium.

I'm Ploum, a writer and an engineer. I like to explore how technology impacts society. You can subscribe by email or by rss. I value privacy and never share your address.

I write science-fiction novels in French. For Bikepunk, my new post-apocalyptic-cyclist book, my publisher is looking for contacts in other countries to distribute it in languages other than French. If you can help, contact me!

March 06, 2025

Multicloud is a cloud adoption strategy that utilizes services from multiple cloud providers rather than relying on just one. This approach enables organizations to take advantage of the best services for specific tasks, enhances resilience, and helps reduce costs. Additionally, a multicloud strategy offers the flexibility necessary to meet regulatory requirements and increases options for […]

March 05, 2025

A while back, I built a solar-powered, self-hosted website. Running a website entirely on renewable energy felt like a win – until my Raspberry Pi Zero 2 W started ghosting me.

A solar panel on a rooftop during sunset with a city skyline in the background. My solar panel and Raspberry Pi Zero 2 are set up on our rooftop deck for testing.

The solar panel and battery aren't the problem. They've performed better than expected, even through cloudy and freezing winter days. In fact, my solar dashboard shows they've been running for 215 days straight. Not a glitch.

The real headache? Keeping my Raspberry Pi Zero 2 W online. Every few weeks, it just disappears from the network: no ping, no SSH. Completely unreachable, yet the power LED stays on. The SD card has plenty of free space.

And every time, I have to go through the same frustrating ritual: get on the roof, open the waterproof enclosure, disconnect the Pi, pull the SD card, go to my office, reformat it, reinstall the OS and reconfigure everything. Then climb back up and put everything back together.

Annoying!

A Raspberry Pi 4 with an RS485 CAN HAT in a waterproof enclosure, surrounded by cables, screws and components. A Raspberry Pi 4 with an attached RS485 CAN HAT module is being installed in a waterproof enclosure.

A month ago, I was back on the roof deck, battling Boston winter. My fingers were numb, struggling with tiny screws and connectors. This had to stop.

The Raspberry Pi Zero 2 W is a great device for IoT projects, but only if it can run unattended for years.

Watchdogs: a safety net for when things go wrong

Enter watchdogs: tools that detect failures and trigger automatic reboots. There are two types:

  1. Hardware watchdog – Recovers from system-wide freezes like kernel panics or hardware lockups, by forcing a low-level reset.
  2. Software watchdog – Detects and fixes service-level failures, such as lost network connectivity, high CPU load or excessive RAM usage.

Running both ensures the Raspberry Pi can recover from minor issues (like a dropped connection) and system crashes (where everything becomes unresponsive).

To maximize reliability, I set up both. This article documents how.

Hardware watchdog

The hardware watchdog is a timer built into the Raspberry Pi's Broadcom chip. The operating system must reset or pet the timer regularly. If it fails to do so within a set interval, the watchdog assumes the system has frozen and forces a reboot.

Since support for the hardware watchdog is built into the Raspberry Pi's Linux kernel, it simply needs to be enabled.

Edit /etc/systemd/system.conf and add:

RuntimeWatchdogSec=10s
ShutdownWatchdogSec=10min

  • RuntimeWatchdogSec – Defines how often the watchdog must be reset. On the Raspberry Pi, this must be less than 15–20 seconds due to hardware constraints.
  • ShutdownWatchdogSec – Keeps the watchdog active during shutdown to detect hangs.

Restart systemd to activate the watchdog:

$ sudo systemctl daemon-reexec

Once restarted, systemd starts petting the hardware watchdog timer. If it ever fails to do so, the Raspberry Pi will reboot.

To ensure full recovery, set all critical services to restart automatically. For example, my web server starts by itself, bringing my solar-powered website back online without any manual work.
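For a systemd-managed service, one way to do this is a drop-in override. The unit name nginx.service below is just an example, not necessarily what serves the author's site:

$ sudo systemctl edit nginx.service

[Service]
Restart=always
RestartSec=10

With Restart=always, systemd relaunches the service whenever it exits, and an enabled service also comes back automatically after a watchdog-triggered reboot.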

Software watchdog

The hardware watchdog catches complete system freezes, while the software watchdog monitors network connectivity, CPU load and other metrics.

To install the software watchdog:

$ sudo apt update
$ sudo apt install watchdog

Once installed, enable and start the watchdog service:

$ sudo systemctl enable watchdog
$ sudo systemctl start watchdog

Enabling the watchdog makes sure it launches automatically on every boot, while starting it activates it immediately without requiring a restart.

Next, edit /etc/watchdog.conf and add the following settings:

# Network monitoring
ping = 8.8.8.8
ping = 1.1.1.1
ping-count = 5

# Interface monitoring
interface = wlan0

# Basic settings
watchdog-device = none
retry-timeout = 180
realtime = yes
interval = 20

What this does:

  • ping = 8.8.8.8 / ping = 1.1.1.1 – Checks that the Pi can reach Google (8.8.8.8) and Cloudflare (1.1.1.1).
  • interface = wlan0 – Ensures the Wi-Fi interface is active.
  • retry-timeout = 180 – Reboots the Pi if these checks fail for 180 seconds.
  • interval = 20 – Performs checks every 20 seconds.
  • watchdog-device = none – Instead of using the hardware watchdog, the daemon monitors failures and triggers a software reboot through the operating system.

While I'm just monitoring the network, you can also configure the watchdog to check CPU usage, RAM or other system health metrics.
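For example, the watchdog daemon's configuration also accepts load-average and memory thresholds. The values below are only illustrative and would need tuning for a real workload:

# Reboot if the 1-minute load average exceeds 24
max-load-1 = 24

# Reboot if free memory drops below this many memory pages
min-memory = 1000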

Debugging watchdog reboots

When a watchdog triggers a reboot, system logs can help uncover what went wrong. To view all recent system boots, run:

$ journalctl --list-boots

This will display a list of boot sessions, each with an index (e.g. -1 for the previous boot, -2 for the one before that).

To see all shutdown events and their reason, run:

$ journalctl --no-pager | grep "shutting down the system"

If you want more details, you can check the logs leading up to a specific reboot. The following command displays the last 50 log entries immediately before the last system shutdown:

$ journalctl -b -1 -n 50 --no-pager

  • -b -1 – Retrieves logs from the previous boot.
  • -n 50 – Displays the last 50 log entries before that reboot.
  • --no-pager – Prevents logs from being paginated.

Progress, but the mystery remains

Since installing these watchdogs, my Raspberry Pi has remained accessible. It has not gone offline indefinitely. Fingers crossed it stays that way.

My logs show the software watchdog reboots the system regularly — always due to lost network connectivity.

On one hand, the watchdog is working as intended: detecting a failure, rebooting the system and getting the Pi back online.

But the real mystery remains: why does the network keep failing in the first place? And why does my Raspberry Pi start ghosting me?

Still, this is real progress. I no longer have to climb onto the roof in freezing weather. The system recovers on its own, even when I'm away from home.

The quest continues ...

March 04, 2025

At the beginning of the year, we released MySQL 9.2, the latest Innovation Release. Sorry for the delay, but I was busy with the preFOSDEM MySQL Belgian Days and FOSDEM MySQL Belgium Days. Of course, we released bug fixes for 8.0 and 8.4 LTS, but in this post, I focus on the newest release. Within […]

February 27, 2025

The Engagement Rehab

I've written extensively, in French, about my quest to break my "connection addiction" by doing what I called "disconnections". At first, it was just three months without major news media and social networks. Then I tried a full year during which I would only connect once a day.

This proved to be too ambitious and failed around May when the amount of stuff that required me to be online (banking, travel booking, online meetings, …) became too high.

But I'm not giving up. I started 2025 by buying a new office chair and pledging never to be connected while sitting in that chair. I disabled Wifi in the BIOS of my laptop. To be online, I now need to use my laptop at my standing desk, which has an RJ-45 cable.

This means I can be connected whenever I want, but I physically feel the connection, since I have to stand up. There's now a clear physical difference between "being online" and "being in my offline bubble".

This doesn't mean I'm as super productive as I had dreamed. Instead of working on my current book project, I do lots of work on Offpunk and I draft blog posts like this one. Not great but, at least, I feel I've accomplished something at the end of the day.

Hush is addicted to YouTube and reflects on spending 28 days without it. Like me, they found themselves not that much more productive but, at the very least, not feeling like shit at the end of the day.

I read that post because being truly disconnected forces me to read more of what is already in my Offpunk: my RSS feeds, my to-read list and many gemlogs. This is basically how I start every day.

I've discovered that between 20 and 25% of what I read from online sources comes from Gemini. It appears that I like the "content" on Gemini. Historically, people complained that there was no content on Gemini, that most posts were about the protocol itself.

Then there was a frenzy of posts about why social media were bad. And those are now being subtly replaced by a kind of self-reflection on our own habits, our own addictions. Like this one about addiction to analytics:

That’s when it struck me: we are all addicted to engagement. On both sides. We like being engaged. We like seeing engagement on our own content. Gemini is an engagement rehab!

While reading Gemini posts, I feel that I'm not alone in being addicted to engagement, in suffering from it and in trying to find a solution.

And when people in the real world start, out of the blue, asking my opinion about Elon Musk's latest statement, it reminds me that engagement addiction is not an individual problem but a societal one.

Anyway, welcome to Gemini, welcome to rehab! I’m Ploum and I’m addicted to engagement.

I'm Ploum, a writer and an engineer. I like to explore how technology impacts society. You can subscribe by email or by rss. I value privacy and never share your address.

I write science-fiction novels in French. For Bikepunk, my new post-apocalyptic-cyclist book, my publisher is looking for contacts in other countries to distribute it in languages other than French. If you can help, contact me!

February 24, 2025

I did it. I just finished generating alt-text for 9,000 images on my website.

What began as a simple task evolved into a four-part series where I compared different LLMs, evaluated local versus cloud processing, and built an automated workflow.

But this final step was different. It wasn't about technology – it was about trust and letting things go.

My AI tool in action

In my last blog post, I shared scripts to automate alt-text generation for a single image. The final step? Running my scripts on the 9,000 images missing alt-text. This covers over 20 years of images in photo albums and blog posts.

Here is my tool in action:

A terminal displays AI generating image descriptions, showing suggested title and alt-text for each photo that scrolls by.

And yes, AI generated the alt-text for this GIF. AI describing AI, a recursion that should have ripped open the space-time continuum. Sadly, no portals appeared. At best, it might have triggered a stack overflow in a distant dimension. Meanwhile, I just did the evening dishes.

ChatGPT-4o processed all 9,000 images at roughly half a cent each (9,000 × $0.005 ≈ $45), so less than $50 in total. And despite hammering their service for a couple of days, I never hit a rate limit or error. Very impressive.

AI is better than me

Trusting a script to label 9,000 images made me nervous. What if mistakes in auto-generated descriptions made my website less accessible? What if future AI models were trained on those mistakes?

I started cautiously, stopping after each album to check every alt-text. After reviewing 250 images, I noticed something: I wasn't fixing errors, I was just tweaking words.

Then came the real surprise. I tested my script on albums I had manually described five years ago. The result was humbling. AI wrote better alt-text: spotting details I missed, describing scenes more clearly, and capturing nuances I overlooked. Turns out, past me wasn't so great at writing alt-text.

Not just that. The LLM understood Japanese restaurant menus, decoded Hungarian text, interpreted German Drupal books, and read Dutch street signs. It recognized conference badges and correctly labeled events. It understood cultural contexts across countries. It picked up details about my photos that I had forgotten or didn't even know existed.

I was starting to understand this wasn't about AI's ability to describe images; it was about me accepting that AI often described them better than I could.

Conclusion

AI isn't perfect, but it can be very useful. People worry about hallucinations and inaccuracy, and I did too. But after generating alt-text for 9,000 images, I saw something different: real, practical value.

It didn't just make my site more accessible; it challenged me. It showed me that sometimes, the best way to improve is to step aside and let a tool do the job better.

February 20, 2025

Billions of images on the web lack proper alt-text, making them inaccessible to millions of users who rely on screen readers.

My own website is no exception, so a few weeks ago, I set out to add missing alt-text to about 9,000 images on this website.

What seemed like a simple fix became a multi-step challenge. I needed to evaluate different AI models and decide between local or cloud processing.

To make the web better, a lot of websites need to add alt-text to their images. So I decided to document my progress here on my blog so others can learn from it – or offer suggestions. This third post dives into the technical details of how I built an automated pipeline to generate alt-text at scale.

High-level architecture overview

My automation process follows three steps for each image:

  1. Check if alt-text exists for a given image
  2. Generate new alt-text using AI when missing
  3. Update the database record for the image with the new alt-text

The rest of this post goes into more detail on each of these steps. If you're interested in the implementation, you can find most of the source code on GitHub.

Retrieving image metadata

To systematically process 9,000 images, I needed a structured way to identify which ones were missing alt-text.

Since my site runs on Drupal, I built two REST API endpoints to interact with the image metadata:

  • GET /album/{album-name}/{image-name}/get – Retrieves metadata for an image, including title, alt-text, and caption.
  • PATCH /album/{album-name}/{image-name}/patch – Updates specific fields, such as adding or modifying alt-text.

I've built similar APIs before, including one for my basement's temperature and humidity monitor. That post provides a more detailed breakdown of how I build endpoints like this.

This API uses separate URL paths (/get and /patch) for different operations, rather than using a single resource URL. I'd prefer to follow RESTful principles, but this approach avoids caching problems, including content negotiation issues in CDNs.

Anyway, with the new endpoints in place, fetching metadata for an image is simple:

curl -H "Authorization: test-token" \
  "https://dri.es/album/isle-of-skye-2024/journey-to-skye/get"

Every request requires an authorization token. And no, test-token isn't the real one. Without it, anyone could edit my images. While crowdsourced alt-text might be an interesting experiment, it's not one I'm looking to run today.

This request returns a JSON object with image metadata:

{
  "title": "Journey to Skye",
  "alt": "",
  "caption": "Each year, Klaas and I pick a new destination for our outdoor adventure. In 2024, we set off for the Isle of Skye in Scotland. This stop was near Glencoe, about halfway between Glasgow and Skye."
}
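
The same check is easy to script. Here is a minimal Python sketch of step 1, using the endpoint and placeholder token shown above; the requests library is assumed and error handling is omitted:

# Minimal sketch of step 1: fetch the metadata and test whether alt-text exists.
import requests

url = "https://dri.es/album/isle-of-skye-2024/journey-to-skye/get"
metadata = requests.get(url, headers={"Authorization": "test-token"}).json()

if not metadata.get("alt"):
    print(f"'{metadata['title']}' is missing alt-text; time to generate one.")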

Because the alt field is empty, the next step is to generate a description using AI.

Generating and refining alt-text with AI

A person stands by a small lake surrounded by grassy hills and mountains under a cloudy sky in the Scottish Highlands.

In my first post on AI-generated alt-text, I wrote a Python script to compare 10 different local Large Language Models (LLMs). The script uses PyTorch, a widely used machine learning framework for AI research and deep learning. This implementation was a great learning experience.

The original script takes an image as input and generates alt-text using multiple LLMs:

./caption.py journey-to-skye.jpg
{
  "image": "journey-to-skye.jpg",
  "captions": {
    "vit-gpt2": "A man standing on top of a lush green field next to a body of water with a bird perched on top of it.",
    "git": "A man stands in a field next to a body of water with mountains in the background and a mountain in the background.",
    "blip": "This is an image of a person standing in the middle of a field next to a body of water with a mountain in the background.",
    "blip2-opt": "A man standing in the middle of a field with mountains in the background.",
    "blip2-flan": "A man is standing in the middle of a field with a river and mountains behind him on a cloudy day.",
    "minicpm-v": "A person standing alone amidst nature, with mountains and cloudy skies as backdrop.",
    "llava-13b": "A person standing alone in a misty, overgrown field with heather and trees, possibly during autumn or early spring due to the presence of red berries on the trees and the foggy atmosphere.",
    "llava-34b": "A person standing alone on a grassy hillside with a body of water and mountains in the background, under a cloudy sky.",
    "llama32-vision-11b": "A person standing in a field with mountains and water in the background, surrounded by overgrown grass and trees."
  }
}

My original plan was to run everything locally for full control, no subscription costs, and optimal privacy. But after testing 10 local LLMs, I changed my mind.

I knew cloud-based models would be better, but wanted to see if local models were good enough for alt-texts. Turns out, they're not quite there. You can read the full comparison, but I gave the best local models a B, while cloud models earned an A.

While local processing aligned with my principles, it compromised the primary goal: creating the best possible descriptions for screen reader users. So I abandoned my local-only approach and decided to use cloud-based LLMs.

To automate alt-text generation for 9,000 images, I needed programmatic access to cloud models rather than relying on their browser-based interfaces — though browser-based AI can be tons of fun.

Instead of expanding my script with cloud LLM support, I switched to Simon Willison's llm tool: https://llm.datasette.io/. llm is a command-line tool and Python library that supports both local and cloud-based models. It takes care of installation, dependencies, API key management, and uploading images. Basically, all the things I didn't want to spend time maintaining myself.

Despite enjoying my PyTorch explorations with vision language models and multimodal encoders, I needed to focus on results. My weekly progress goal meant prioritizing working alt-text over building homegrown inference pipelines.

I also considered you, my readers. If this project inspires you to make your own website more accessible, you're better off with a script built on a well-maintained tool like llm rather than trying to adapt my custom implementation.

Scrapping my PyTorch implementation stung at first, but building on a more mature and active open-source project was far better for me and for you. So I rewrote my script, now in the v2 branch, with the original PyTorch version preserved in v1.

The new version of my script keeps the same simple interface but now supports cloud models like ChatGPT and Claude:

./caption.py journey-to-skye.jpg --model chatgpt-4o-latest claude-3-sonnet --context "Location: Glencoe, Scotland"
{
  "image": "journey-to-skye.jpg",
  "captions": {
    "chatgpt-4o-latest": "A person in a red jacket stands near a small body of water, looking at distant mountains in Glencoe, Scotland.",
    "claude-3-sonnet": "A person stands by a small lake surrounded by grassy hills and mountains under a cloudy sky in the Scottish Highlands."
  }
}

The --context parameter improves alt-text quality by adding details the LLM can't determine from the image alone. This might include GPS coordinates, album titles, or even a blog post about the trip.

In this example, I added "Location: Glencoe, Scotland". Notice how ChatGPT-4o mentions Glencoe directly while Claude-3 Sonnet references the Scottish Highlands. This contextual information makes descriptions more accurate and valuable for users. For maximum accuracy, use all available information!
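
To automate this, the context string can be assembled from whatever metadata is on hand. A small, hypothetical helper; the dictionary keys below are invented for illustration, not taken from my actual schema:

# Hypothetical helper: build a --context string from available metadata.
def build_context(meta):
    parts = []
    if meta.get("album"):
        parts.append(f"Album: {meta['album']}")
    if meta.get("location"):
        parts.append(f"Location: {meta['location']}")
    return "; ".join(parts)

print(build_context({"album": "Isle of Skye 2024", "location": "Glencoe, Scotland"}))
# Album: Isle of Skye 2024; Location: Glencoe, Scotland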

Updating image metadata

With alt-text generated, the final step is updating each image. The PATCH endpoint accepts only the fields that need changing, preserving other metadata:

curl -X PATCH \
  -H "Authorization: test-token" \
  "https://dri.es/album/isle-of-skye-2024/journey-to-skye/patch" \
  -d '{
    "alt": "A person stands by a small lake surrounded by grassy hills and mountains under a cloudy sky in the Scottish Highlands."
  }'

That's it. This completes the automation loop for one image. It checks if alt-text is needed, creates a description using a cloud-based LLM, and updates the image if necessary. Now, I just need to do this about 9,000 times.
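
For a sense of what that loop can look like, here is an illustrative Python sketch, not my production script. It assumes the /get and /patch endpoints shown earlier, a real token in place of the placeholder, and that ./caption.py prints JSON shaped like the example above; how image names map to local files is site-specific and glossed over here.

# Illustrative end-to-end loop for one album.
import json
import subprocess
import requests

TOKEN = "test-token"  # placeholder
BASE = "https://dri.es/album/isle-of-skye-2024"
HEADERS = {"Authorization": TOKEN}

for name in ["journey-to-skye"]:  # in practice: all ~9,000 images
    meta = requests.get(f"{BASE}/{name}/get", headers=HEADERS).json()
    if meta.get("alt"):
        continue  # step 1: skip images that already have alt-text

    # Step 2: generate a description with a cloud model, passing the caption as context.
    result = subprocess.run(
        ["./caption.py", f"{name}.jpg",
         "--model", "chatgpt-4o-latest",
         "--context", meta.get("caption", "")],
        capture_output=True, text=True, check=True)
    alt = json.loads(result.stdout)["captions"]["chatgpt-4o-latest"]

    # Step 3: write the new alt-text back via the PATCH endpoint.
    requests.patch(f"{BASE}/{name}/patch", headers=HEADERS, json={"alt": alt})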

Tracking AI-generated alt-text

Before running the script on all 9,000 images, I added a label to the database that marks each alt-text as either human-written or AI-generated. This makes it easy to:

  • Re-run AI-generated descriptions without overwriting human-written ones
  • Upgrade AI-generated alt-text as better models become available

With this approach, I can update the AI-generated alt-text when ChatGPT 5 is released. And eventually, it might allow me to return to my original principles: to use a high-quality local LLM trained on public domain data. In the meantime, it helps me make the web more accessible today while building toward a better long-term solution tomorrow.
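
In practice, such a label can simply ride along in the PATCH payload. A sketch; the field name below is hypothetical, invented for illustration rather than taken from my actual schema:

# Hypothetical PATCH payload that records provenance next to the alt-text.
# "alt_generated_by" is an invented field name, used only for illustration.
payload = {
    "alt": "A person stands by a small lake surrounded by grassy hills and "
           "mountains under a cloudy sky in the Scottish Highlands.",
    "alt_generated_by": "chatgpt-4o-latest",  # or "human" for hand-written text
}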

Next steps

Now that the process is automated for a single image, the last step is to run the script on all 9,000. And honestly, it makes me nervous. The perfectionist in me wants to review every single AI-generated alt-text, but that is just not feasible. So, I have to trust AI. I'll probably write one more post to share the results and what I learned from this final step.

Stay tuned.

February 19, 2025

On submission to religious techno-fascism

Stupid code generators

On Mastodon, David Chisnall takes stock of a year of using GitHub Copilot to write code. The verdict is clear: while at first he felt he was saving time by typing less, that time is more than lost to the hours, sometimes days, needed to debug subtle bugs that would never have appeared if he had written the code himself in the first place, or that he would at least have caught much faster.

He then realizes that the hard, time-consuming part is not writing the code; it is knowing what to write and how to write it. If you have to read AI-generated code just to understand it, that is harder for the programmer than writing everything yourself.

"Sure, but it's for generating the not-very-clever code"

On this point I agree with David 100%: if your project requires writing dumb code that has already been written a thousand times elsewhere, you have a problem. And solving it by having an AI write that code is about the worst thing you could do.

As I say at conferences: ChatGPT looks useful to people who can't type. Want to be productive? Learn to touch-type!

Where ChatGPT truly excels, on the other hand, is at pretending to write code: producing progress tables for its work, claiming everything is almost ready and will be delivered via WeTransfer. It is obviously fake: ChatGPT has learned to scam!

In short, ChatGPT has become the perfect Julius.

Ed Zitron drives the point home: ChatGPT and its kind are "successes" because the entire press talks about them only in glowing terms, whether out of stupidity or corruption. In reality, the number of paying users is incredibly small and, like Trump, Sam Altman addresses us as if we were idiots who will swallow the biggest lies without flinching. And the media and the CEOs applaud…

Maybe we really are idiots. Dozens of scientific papers now mention "vegetative electron microscopy". The term means nothing. Where does it come from?

It comes quite simply from a 1959 article laid out in two columns, but which entered the training corpus as if it were a single column!

What this anecdote teaches us is, first, that the bullshit generators are even worse than we imagine, but above all that our world is already full of this crap! LLMs merely apply to online content what industry has done to everything else: tools, clothes, food. Produce as much as possible while lowering quality as much as possible. Then lower it some more.

Removing the filters

The printing press moved communication from "one to one" to "one to many", making the Catholic Church obsolete as the tool the powerful used in the West to impose their discourse on the population. The first consequence of the printing press was, in fact, Protestantism, which explicitly claimed each person's right to interpret the word of God and thus to create their own discourse to spread: "one to many".

As Victor Hugo points out in Notre-Dame de Paris, "the press will kill the church".

Direct consequences of the printing press: the Renaissance, then the Enlightenment. Anyone who thinks can spread their ideas and draw on the ideas already in circulation. No human has to reinvent the wheel anymore; everyone can build on what exists. Education takes precedence over obedience.

After a few centuries of "one to many" comes the next step: the Internet. From "one to many" we move to "many to many". There is no longer any limit to spreading your ideas: anyone can broadcast to everyone.

A logical consequence that escaped me back when I wrote that earlier post: if everyone wants to talk, nobody listens anymore. Like many, I believed "many to many" would be incredibly positive. The sad reality is that the vast majority of us don't have much to say, yet we still want to be heard. So we shout. We generate noise. We drown out whatever is genuinely interesting.

The investment required to print a book, along with its low direct return, acts as a filter. Only those who really want to publish a book will do so.

The durability of the book as an object and the relative slowness of its circulation imply a second filter: the least interesting books are quickly forgotten. That is, incidentally, why we sometimes idealize the past, in literature as in film or music: only the best reached us, and we have forgotten the dismal crap that flopped or enjoyed only fleeting success.

Although very imperfect, and probably filtering out excellent things we have sadly lost, the barrier to entry and the dilution over time kept us from sinking into cacophony.

The failure of the democratization of speech

By allowing "many to many" without any limit, the Internet rendered both filters inoperative. Anyone can post at zero cost. Worse: the platforms' addiction mechanisms have made it easier to post than not to post. Digital media also blur the temporal boundary: content is either perfectly preserved or vanishes entirely. As a result, old content resurfaces as if it were new and nobody notices. The temporal filter has completely disappeared.

From a possibility, "many to many" has turned into an obligation. To exist, we must be seen, be heard. We must have an audience. Take selfies and share them. Receive likes that are sold to us at a steep price.

"Many to many" has thus turned out to be a catastrophe, perhaps not in principle, but in its implementation. Instead of a second renaissance, we are sliding into decadence, into a second middle age. The frustration of being able to speak yet not be heard is immense.

Olivier Ertzscheid goes even further: for him, ChatGPT provides precisely the feeling of being listened to at a time when nobody listens to us anymore. From "many to many", we have moved on to "many to nobody".

Using ChatGPT to get information turns into using ChatGPT to get confirmation of one's own beliefs, as the political journalist Nils Wilcke observes.

I'm tired of repeating it, but ChatGPT and its kind are bullshit generators explicitly designed to tell you what you want to hear. That "ChatGPT said so" can be used as a political argument on a TV set without anyone batting an eye illustrates a total, generalized cretinism.

Religious techno-fascism

"Many to nobody" is in itself a return to the old order. Nobody listens to the populace anymore. Only the great lords have the tools to impose their views. The Catholic Church was replaced by the press and the media, themselves replaced by social networks and ChatGPT. And ChatGPT is, in the end, just an automated instance of a priest who hears your confession before telling you what is good and what is evil, based on the orders he receives from above.

In a very good post on the Gemini network, small patatas realizes that fascism's incoherence is not a bug; it is its mode of operation, its very essence. A random, permanent incoherence that lets weak minds see whatever they want to see through pareidolia, and that breaks the strongest minds. By destroying all logic and coherence, fascism allows morons to free themselves from intelligence and to seize control over rational minds. The proverbial pigeon that shits on the chessboard and knocks over the pieces before declaring victory.

ChatGPT's incoherence is not a bug that will someday be fixed! On the contrary, it is what makes it successful with weak minds who, by taking "prompt engineering" courses, feel they are regaining a little control over their lives and a little power over reality. That is the essence of every scam: promising people in a position of intellectual weakness that they will miraculously get their power back.

Small patatas draws a link with the surrealists, who tried to fight fascism through art, and sees in surrealism a far more effective way of fighting the bullshit generators.

It must be said that, faced with a global bullshit generator that is fascist, centralized, ultra-capitalist and basking in religious adulation, I see no way out other than surrealism.

Let us brandish what humanity we have left! To your souls, citizens!

Image taken from small patatas' gemlog: The Triumph of Surrealism, Max Ernst (1937)

I'm Ploum and I've just published Bikepunk, an eco-cyclist fable typed entirely on a mechanical typewriter. To support me, buy my books (if possible from your local bookshop)!

Get my writing in French and English directly by email. Your address will never be shared. You can also use my French-language RSS feed or the full RSS feed.

February 18, 2025

Does success exist?

The notion of a blog's success

A blogger I like a lot, Gee, looks back on his ten years of blogging. Seeing behind the scenes of other creators fascinates me. Gee thinks he made the mistake of not riding the wave of enthusiasm that his Geektionnerd enjoyed, and of not using it to do more promotion.

I disagree with Gee: he did very well to get on with his life without worrying about success. Waves of enthusiasm come and go, and they are very brief. The public moves on very quickly. Chasing permanent buzz is the surest recipe for losing yourself. It is a profession in its own right: marketing. Too many artists and creators have drifted into marketing, hoping to capture a fraction of the success enjoyed by people whose only talent is marketing.

But you forget that the perception of success is itself part of the marketing plan. You think so-and-so is successful? You know nothing of the sort. You can't even define "success". It is a fuzzy intuition. Making people believe you are successful is part of the lie!

For many people in my extended circle, I suddenly became a successful writer because… I appeared on prime-time television. For those people, who know me, I went from "guy who writes vague books nobody has heard of" to "real, well-known writer who is on TV". For those, and they are many, who have delegated to television the power to promote individuals to the rank of "celebrity", I am successful. In their eyes, there is nothing more I could dream of, except perhaps to appear on TV regularly and become a "star".

In my daily life, and in the eyes of all the (too rare) people who do not unconsciously idolize television, those TV appearances changed strictly nothing. I certainly sold a few hundred more books. But am I "successful" for all that?

A few months ago, I was invited as an expert to the filming of a TV programme on the importance of protecting your personal data online. During a break, I asked the presenter what else he did for a living. He looked at me, surprised, and replied: "I present the evening news." Not being recognized probably didn't happen to him very often anymore. Half of Belgium must know who he is. We laughed, and I explained that I don't own a television.

Question: is that person "successful"?

Success is fleeting

At twelve, on holiday with my parents, I found a book abandoned on a table at the hotel reception: "Tantzor" by Paul-Loup Sulitzer. I devoured it, and I clearly wasn't the only one. Paul-Loup Sulitzer was the fashionable writer of the moment. According to Wikipedia, he has sold nearly 40 million books in 40 languages, including his best-known novel, "Money". At the time, he was living the life of a flamboyant billionaire.

Thirty years later, ruined, he published the sequel to Money: "Money 2". It sold fewer than 1,300 copies. Adored, idolized, mocked, parodied hundreds of times, Sulitzer simply fell into total oblivion.

While "success" remains a vague, abstract notion, one thing is certain: it has to be maintained constantly. It is never truly secured. You can still make sense of "making a fortune" as "having more money than you can spend" (and therefore no longer needing to earn any), but success itself cannot be measured. It cannot be managed rationally.

Which indicators?

In his post, Gee is also surprised to have received far fewer entries for the blog's fifth-anniversary contest than for the first anniversary's, despite a supposedly larger audience.

Once again, success is a matter of perception. What kind of success do we want? Interesting interactions? Numerous interactions (which contradicts the previous one)? Sales? Revenue? Numbers on a hit counter, like the websites of the last century?

There is no single definition of success. In fact, I know nobody, starting with myself, who is satisfied with their success. We are, by human nature, eternally dissatisfied. We are jealous of what we think we see in others ("He's on TV!") and disappointed by our own achievements ("I was on TV and, in fact, it changed nothing in my life").

Writing into the void

That is perhaps why I love the Gemini network so much. It is the anti-success network by essence. When you publish on Gemini, you genuinely get the feeling that nobody is going to read you, which gives you real freedom.

Some of my blog posts go viral on the web. I have no statistics, but I see them circulating on Mastodon and reaching the front page of Hacker News. But if I didn't visit Hacker News or Mastodon, I wouldn't know. I would feel just as much like I was writing into the void as I do on Gemini.

Conversely, some of my posts don't seem to attract "likes", "shares", "votes" or "comments". Yet I receive many emails about them. From people who want to dig deeper into the subject, to think it through with me. Or to thank me for the reflection. This is particularly true on the Gemini network, which seems to attract people who favour direct exchange. I often find myself firing up my mail client to reply spontaneously to a personal post read on Gemini. The most frequent reaction to those messages is: "Wow, I didn't think anyone was reading me!"

So here is my question: which kind of post has, in your view, the most "success"?

Does the notion of success really mean anything? Can one ever have enough success?

February 11, 2025

Last week, I wrote about my plan to use AI to generate 9,000 alt-texts for images on my website. I tested 12 LLMs — 10 running locally and 2 cloud-based — to assess their accuracy in generating alt-text for images. I ended that post with two key questions:

  1. Should I use AI-generated alt-texts, even if they are not perfect?
  2. Should I generate these alt-texts with local LLMs or in the cloud?

Since then, I've received dozens of emails and LinkedIn comments. The responses were all over the place. Some swore by local models because they align with open-source values. Others championed cloud-based LLMs for better accuracy. A couple of people even ran tests using different models to help me out.

I appreciate every response. It's a great reminder of why building in the open is so valuable – it brings in diverse perspectives.

But one comment stood out. A visually impaired reader put it simply: Imperfect alt-text is better than no alt-text.

That comment made the first decision easy: AI-generated alt-text, even if not perfect, is better than nothing.

The harder question was which AI models to use. As a long-term open-source evangelist, I really want to run my own LLMs. Local AI aligns with my values: no privacy concerns, no API quotas, more transparency, and more control. They also align with my wallet: no subscription fees. And, let's be honest – running your own LLMs earns you some bragging rights at family parties.

But here is the problem: local models aren't as good as cloud models.

Most laptops and consumer desktops have 16–32GB of RAM, which limits them to small, lower-accuracy models. Even maxing out an Apple Mac Studio with 192GB of RAM doesn't change that. Gaming GPUs? Also a dead end, at least for me. Even high-end cards with 24GB of VRAM struggle with the larger models unless you stack multiple cards together.

The gap between local and cloud hardware is big. It's like racing a bicycle against a jet engine.

I could wait. Apple will likely release a new Mac Studio this year, and I'm hoping it supports more than 192GB of RAM. NVIDIA's Digits project could make consumer-grade LLM hardware even more viable.

Local models are also improving fast. Just in the past few weeks:

  • Alibaba released Qwen 2.5 VL, which performs well in benchmarks.
  • DeepSeek launched DeepSeek-VL2, a strong new open model.
  • Mark Zuckerberg shared that Meta's Llama 4 is in testing and might be released in the next few months.

Consumer hardware and local models will continue to improve. But even when they do, cloud models will still be ahead. So, I am left with this choice:

  1. Prioritize accessibility: use the best AI models available today, even if they're cloud-based.
  2. Stick to Open Source ideals: run everything locally, but accept worse accuracy.

A reader, Kris, put it well: Prioritize users while investing in your values. That stuck with me.

I'd love to run everything locally, but making my content accessible and ensuring its accuracy matters more. So, for now, I'm moving forward with cloud-based models, even if it means compromising on my open-source ideals.

It's not the perfect answer, but it's the practical one. Prioritizing accessibility and end-user needs over my own principles feels like the right choice.

That doesn't mean I'm giving up on local LLMs. I'll keep testing models, tracking improvements, and looking for the right hardware upgrades. The moment local AI is good enough for generating alt-text, I'll switch – no hesitation. In my next post, I'll share my technical approach to making this work.

In search of the perfect disconnection

A retrospective of my quest for focus

A first disconnection

At the end of 2018, exhausted by the Ulule campaign for my book "Les aventures d’Aristide, le lapin cosmonaute" and newly aware of my addiction to social networks, I decided to "disconnect".

A big word for simply banning myself, for three months, from social networks and news sites.

The first effect came very quickly, with the uninstallation of the app I used most at the time: Pocket.

The experiment was above all an awakening. I discovered that, as soon as I was bored, I would mechanically open a web browser without even thinking about it. It was literally a reflex.

I began to perceive the difference between information and "noise". Hyperconnection is, like tobacco, both an addiction and a pollution. A notion that would become central to my thinking.

While I was trying to take in less noise, my wife pointed out that I was still trying to generate some by posting to networks I no longer read. I was being inconsistent.

As often with this kind of experiment, you come out of it with no desire whatsoever to "reconnect". But, of course, I very quickly fell back into my old habits.

The problem of hyperconnection was now clear in my head. I was an addict, and that addiction was harming me in every possible way.

The techno-solutionist period

Faced with the scale of the problem, my first reflex was to look for a technical, technological solution. Many people are in the same situation and, while this step is far from sufficient, it is essential: sorting through the digital tools we use. I realized that the Apple universe I was living in at the time, having been given a MacBook by my employer, was both contrary to my values and completely incompatible with any form of digital sobriety, since it pushes consumption. That dichotomy between my philosophy and my daily life created a tension I tried to relieve through overconnection. It was time for me to move back entirely to Linux.

I also bought a phone so crappy and buggy that I never feel like using it (no, don't buy it).

Concretely, this first disconnection was also the opportunity to finish my serial "Printeurs" and to write a few short stories. The serial caught a publisher's interest and I published my first novel in 2020.

Another concrete action was to delete as many online accounts as possible. I didn't know it yet, but I would end up discovering and deleting nearly 500 of them, and it would take me close to three years. Most of them I had forgotten even existed, but for some, the step was meaningful.

In parallel, I discovered the minimalist Gemini protocol. Using it planted an idea in my head: working completely offline. I had discovered that blocking certain sites was not enough: I would automatically find alternatives to procrastinate on, alternatives that were sometimes even less interesting. So I wanted to explore total disconnection. I started writing my personal journal on a typewriter.

Second disconnection: an attempt at a disconnected year

On January 1st, 2022, three years after the end of my first disconnection, I embarked on an attempt at a fully disconnected year. The idea was to use my computer only offline, in my office, and to synchronize it once a day. The whole thing was made possible by a piece of software I had developed during the last months of 2021: Offpunk.

Of course, some tasks still required a connection; I set out to time and log those. I wrote the account of this disconnection as it happened and, against all expectations, those posts seemed to fascinate readers.

Better prepared and far more ambitious (too ambitious?), this disconnection ultimately failed after less than six months.

The lesson was harsh: it is almost impossible to disconnect in any structural way in today's society. We are constantly asked to carry out actions online, actions that require time but not always concentration. Everything is now optimized to keep us online.

My disconnection was a failure. The book about it is unfinished. Another manuscript I worked on during that period is in an unusable state. Still, I used the time to write a few short stories and to finish my collection "Stagiaire au spatioport Omega 3000 et autres joyeusetés que nous réserve le futur".

A direct consequence of this disconnection: my Whatsapp account disappeared. My Twitter account soon followed.

I also came to realize that my Wordpress blog was no longer in line with my philosophy at all. In parallel with my work on Offpunk, I completely rewrote my blog to turn it into an "offline" tool.

The second return to normality

In early 2023, I isolated myself to start writing Bikepunk, which would be published in 2024. I alternated between periods of total disconnection and periods of hyperconnection.

The only social network where I had kept an account, Mastodon, started attracting attention. I was very active there and, philosophically, I could only support and encourage everyone trying to leave X and Meta. I fell back into hyperconnection. An ethical hyperconnection, but hyperconnection all the same.

For two years, I used the Firefox extension LeechBlock, which only allows a limited amount of time per day on certain websites. It worked reasonably well for a while, until I acquired the reflex of disabling the extension without even thinking about it.

As I do every three years, it was time to start a new cycle and question my habits.

One of my main lessons is that any change in my mental behaviour has to be accompanied by a physical change. My mind follows my body's reflexes. I still sometimes mechanically type into Firefox's address bar the first letters of procrastination sites I haven't visited in ten years!

The second lesson is that radicality leads to a harder relapse. Being connected is necessary every day, unpredictably. I don't want to isolate myself; I want to design a sustainable way of working. To build new reflexes.

A third disconnection

For my "2025 disconnection", I therefore made a big decision: I bought an armchair to replace my office chair. Throughout my studies and my first professional years, I only ever had salvaged chairs. In the spring of 2008, with a steady salary and an apartment, I bought a new office chair: the cheapest one at Ikea. That chair, patched up with worn-out cushions my in-laws no longer wanted, was still the one I was using until a few days ago. This new armchair is therefore a very big change for me.

And I promised myself to use it only while disconnected.

To do so, I disabled Wi-Fi in my computer's BIOS. I also set up a "standing desk" in a corner of the room, a standing desk where an RJ-45 cable arrives. If I want to connect, I have to physically stand up and plug in a cable. Everything I have to do online is now done standing. When I am sitting (or sprawled, to be more accurate), I am disconnected.

I have also taken other small measures. First of all, my todos are no longer stored on my computer but on index cards pinned to a cork board. Quite an irony for someone who, as you may recall, spent several years developing the software "Getting Things GNOME".

I am also rethinking how I handle my email. I love receiving emails from my readers and find it very hard not to reply. Then to reply to the reply to my reply. With the success of Bikepunk, my correspondence has grown, and I sometimes reach the end of the day realizing that I have… "answered my emails". Enriching discussions, certainly, but time-consuming ones. In many cases, I repeat across several emails what could have been a blog post. So consider your email read, but know that my answer will feed my upcoming posts. Some future posts will cover themes I don't usually address, but about which I receive a great many questions.

On Mastodon, which I now only check standing up, I decided to put all the accounts I follow into a list, a list I configured not to appear in my timeline. When I check Mastodon, I therefore only see my own posts, and I have to take one extra action if I want to see what is being said (which I no longer do every day). As before, notifications are regularly "emptied".

If you want to follow this blog, favour the RSS feed or my two newsletters.

In search of boredom.

"Disconnection" is a big word for simply saying that I will no longer be connected 100% of the time. But such are the times we live in. Cal Newport mentions the incredible productivity of the writer Brandon Sanderson, who created a 70-person company dedicated to a single activity: letting him write as much as possible!

While the example is extreme, Cal is surprised that we don't see more structures designed to foster concentration and creativity. In an age where permanent hyperdistraction is the norm, we have to fight for, and build, the tools to concentrate. And to be bored. Above all, to be bored. Because boredom is essential to thinking and creating.

Besides, if I hadn't been bored, I would never have written this post! We'll take stock in three years, for my fourth disconnection…

February 06, 2025

On technological decadence and technophile Luddites

The value of plain text

Thierry is trying to publish his blog on the Gemini network, but struggles with its minimalist format. Which is, for me, precisely the best part of the Gemini protocol.

The Gemini format imposes, as in a book, pure text. You can add a title, subtitles, links and quotes, but with one important particularity: the markup applies to the whole line, never to a fragment of text. Links must therefore sit on their own line rather than getting lost and proliferating inside the text. Since they interrupt reading between two paragraphs, they have to be made explicit and justified, rather than hidden wherever a click might happen to land.

It is also impossible to put italics or bold in your text. Which is an excellent thing. As Neal Stephenson reminds us in "In the Beginning… Was the Command Line", random mixes of bold and italics have no place in a text. Pick up a book and try to find bold type in the body of the text. There is none, and for good reason: it means nothing and it disturbs reading. But when Microsoft Word appeared, it made putting things in bold easier than making proper headings. Just as the AZERTY keyboard suddenly made people believe that capital letters should not carry accents, the technological tool impoverished our relationship to text.

For the need to grab attention in the middle of a text is an admission of the author's insecurity. The text must stand on its own. It is up to the reader to choose what to emphasize, highlighter in hand, not the author. Decking a text out with useless ornaments to try to fill its gaps has a name: decadence.

Bold type, WordArt, Comic Sans MS, PowerPoints sent by email: all of these are decadent texts trying to disguise the emptiness or inanity of their content.

The inexorable decadence of tech

Text is only one example among many.

Thierry also asks himself many questions about the notions of low-tech and high-tech, particularly in medicine. But the term "low-tech" is, to my mind, misleading. I am a technophile Luddite. Contrary to legend, the Luddites were not at all opposed to technology. They were opposed to the ownership of technology by the bourgeois class, which turned skilled artisans into interchangeable slaves of the machines. The Luddites did not set out to destroy power looms because they were technological, but because they were machines their bosses used to exploit them.

In the same way, I am not opposed to centralized social networks or to chatbots because they are "high tech", but because they are technologies actively used to impoverish us, both intellectually and financially. That is even their only avowed objective.

Using AI to detect cancers earlier? I find the idea wonderful. But I also know it is impossible in the current context. Not for technical reasons, but because, used properly, it will cost more than no AI at all. AI can help by detecting cancers the doctor has missed. You therefore need a double diagnosis, from both the doctor and the AI, and you need to investigate whenever the two disagree. You have to pay for the AI on top of the doctor's extra work, since they will have to put in more hours reviewing the "divergent" diagnoses to find either their own error or the AI's. AI is a tool that can be useful if you accept that it costs considerably more.

That's the theory.

In practice, such a technology is sold on the promise of "saving money". It will inevitably lead to a slackening of doctors' attention and, to justify the costs, a reduction in the time devoted to each human diagnosis. Losing experience and practice, doctors' diagnoses will become less and less reliable and, by ripple effect, new doctors will be less and less well trained. The cancers the AI fails to detect will no longer be caught by humans either. And since the AI is trained on diagnoses made by humans, it too will become less and less competent while validating itself. In the end, you don't need to be a genius to see that, however interesting the technology, its use in our socio-economic context can only prove catastrophic, and benefits only the people selling AI.

The high-tech lie

The "low tech" camp has the intuition that "high tech" is out to exploit them. They are right about the substance, wrong about the cause. Technology is not the crux of the problem; its decadence is.

The technology race is a bubble built on a lie. The point is not to build something durable, but to make people believe you will build it, in order to attract investors. The NASDAQ companies have become one enormous Ponzi scheme. They try to prop each other up with millions, yet they all lose colossal amounts of money, which they manage to hide behind their share prices.

Serious research, incidentally, confirms my intuition: the more people understand what lies behind "artificial intelligence", the less they want it. AI is literally a trap for the ignorant. And its producers have understood this very well: they do not want us to understand what they are doing.

Ed Zitron keeps hammering away, this time about the unexpected arrival of DeepSeek, the Chinese ChatGPT that is simply 30 times cheaper. To the question "Why didn't OpenAI and the others manage to make it cheaper?", he offers the answer that is obvious in hindsight: "Because these companies had no interest in making it cheaper. The more money they lose, the more they justify how expensive what they do is, the more they attract investors and scare off potential competitors." In short: because they are completely decadent!

Cory Doctorow often talks about enshittification; I prefer to talk about "technological decadence". We produce the most expensive, most complex and least ecological technology possible out of sheer reflex. As with Roman orgies, complexity and cost are no longer obstacles but the very goals we strive for.

This also explains why technology turns completely against its users. Recently, an elderly lady wanted to show me, on her phone, a post she had seen on her Facebook account. Half of her gigantic phone screen was literally a fixed advertisement for a car. In the other half, she scrolled through a mix of further car ads and what was probably actual content. Her phone had an enormous screen, yet only a fraction of it served the user. And even then, not entirely.

The car is itself the perfect example of decadence: from a tool, it has become a symbol that must be as big, as heavy and as conspicuous as possible. This leads to infernal complexity in both public and private space. Most houses built in recent decades are rooms arranged around a garage. Cities are buildings arranged around road junctions. The car has become the true citizen of our cities; humans are merely its servants. The Web is following the same trajectory, with bots taking the place of cars.

The frenzy around artificial intelligence is the archetype of this decadence. While the new tools clearly have uses and can genuinely help in some contexts, we are in the reverse situation: looking for a problem to apply the tool to.

Back to the concept of usefulness

This is also why Gemini fascinates me so much. It is the most direct tool for getting text from my brain into a reader's. By opening the door to bold, to italics, then to images and JavaScript, the Web became a decadent jungle. Authors publish there and then, without caring whether they are read, eagerly check their click and like statistics. Text is increasingly optimized for those statistics. Before being automated by bots, bots which, to train themselves, will read the texts online and automatically generate clicks on them.

The loop of technological decadence is closed: content is read and generated by the same machines. The capitalist owners have managed to fully automate both their workers (the content creators) and their customers (the ones doing the clicking).

I do not want to serve the platform owners. I do not want to consume that bland, inhuman, automated content. I try to understand the consequences of my technological habits so as to extract the most usefulness from them with the fewest possible negative consequences.

Faced with technological decadence, I have become a technophile Luddite.

February 03, 2025

I have 10,000 photos on my website. About 9,000 have no alt-text. I'm not proud of that, and it has bothered me for a long time.

When I started my blog nearly 20 years ago, I didn't think much about alt-texts. Over time, I realized its importance for visually impaired users who rely on screen readers.

The past 5+ years, I diligently added alt-text to every new image I uploaded. But that only covers about 1,000 images, leaving most older photos without descriptions.

Writing 9,000 alt-texts manually would take ages. Of course, AI could do this much faster, but is it good enough?

To see what AI can do, I tested 12 Large Language Models (LLMs): 10 running locally and 2 in the cloud. My goal was to test their accuracy and determine whether they can generate accurate alt-text.

The TL;DR is that, not surprisingly, cloud models (GPT-4, Claude Sonnet 3.5) set the benchmark with A-grade performance, though not 100% perfect. I prefer local models for privacy, cost, and offline use. Among local options, the Llama variants and MiniCPM-V perform best. Both earned a B grade: they work reliably but sometimes miss important details.

I know I'm not the only one. Plenty of people — entire organizations even — have massive backlogs of images without alt-text. I'm determined to fix that for my blog and share what I learn along the way. This blog post is just step one — subscribe by email or RSS to get future posts.

Models evaluated

I tested alt-text generation using 12 AI models: 9 on my MacBook Pro with 32GB RAM, 1 on a higher-RAM machine (thanks to Jeremy Andrews, a friend and long-time Drupal contributor), and 2 cloud-based services.

The table below lists the models I tested, with details like links to research papers, release dates, parameter sizes (in billions), memory requirements, some architectural details and more:

Model | Launch date | Type | Vision encoder | Language encoder | Model size (billions of parameters) | RAM | Deployment
1. VIT-GPT2 | 2021 | Image-to-text | ViT (Vision Transformer) | GPT-2 | 0.4B | ~8GB | Local, Dries
2. Microsoft GIT | 2022 | Image-to-text | Swin Transformer | Transformer Decoder | 1.2B | ~8GB | Local, Dries
3. BLIP Large | 2022 | Image-to-text | ViT | BERT | 0.5B | ~8GB | Local, Dries
4. BLIP-2 OPT | 2023 | Image-to-text | CLIP ViT | OPT | 2.7B | ~8GB | Local, Dries
5. BLIP-2 FLAN-T5 | 2023 | Image-to-text | CLIP ViT | FLAN-T5 XL | 3B | ~8GB | Local, Dries
6. MiniCPM-V | 2024 | Multi-modal | SigLip-400M | Qwen2-7B | 8B | ~16GB | Local, Dries
7. LLaVA 13B | 2024 | Multi-modal | CLIP ViT | Vicuna 13B | 13B | ~16GB | Local, Dries
8. LLaVA 34B | 2024 | Multi-modal | CLIP ViT | Vicuna 34B | 34B | ~32GB | Local, Dries
9. Llama 3.2 Vision 11B | 2024 | Multi-modal | Custom Vision Encoder | Llama 3.2 | 11B | ~20GB | Local, Dries
10. Llama 3.2 Vision 90B | 2024 | Multi-modal | Custom Vision Encoder | Llama 3.2 | 90B | ~128GB | Local, Jeremy
11. OpenAI GPT-4o | 2023 | Multi-modal | Custom Vision Encoder | GPT-4 | >150B | n/a | Cloud
12. Anthropic Claude 3.5 Sonnet | 2024 | Multi-modal | Custom Vision Encoder | Claude 3.5 | >150B | n/a | Cloud

How image-to-text models work (in less than 30 seconds)

LLMs come in many forms, but for this project, I focused on image-to-text and multi-modal models. Both types of models can analyze images and generate text, either by describing images or answering questions about them.

Image-to-text models follow a two-step process: vision encoding and language decoding:

  1. Vision encoding: First, the model breaks an image down into patches. You can think of these as "puzzle pieces". The patches are converted into mathematical representations called embeddings, which summarize their visual details. Next, an attention mechanism filters out the most important patches (e.g. the puzzle pieces with the cat's outline or fur texture) and eliminates less relevant details (e.g. puzzle pieces with plain blue skies).
  2. Language decoding: Once the model has summarized the most important visual features, it uses a language model to translate those features into words. This step is where the actual text (image captions or Q&A answers) is generated.

In short, the vision encoder sees the image, while the language encoder describes it.

If you look at the table above, you'll see that each row pairs a vision encoder (e.g., ViT, CLIP, Swin) with a language encoder (e.g., GPT-2, BERT, T5, Llama).

For a more in-depth explanation, I recommend Sebastian Raschka's article Understanding Multi-modal LLMs, which also covers how image encoders work. It's fantastic!
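
To make the two steps concrete, here is a minimal sketch that runs the smallest model from the table above (VIT-GPT2) through Hugging Face's image-to-text pipeline, which hides the encoder/decoder pairing behind a single call. This is an illustration, not my benchmark script:

# Minimal sketch: caption one image with a small local image-to-text model.
from transformers import pipeline

captioner = pipeline("image-to-text", model="nlpconnect/vit-gpt2-image-captioning")
result = captioner("test-images/image-1.jpg")
print(result[0]["generated_text"])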

Comparing different AI models

I wrote a Python script that generates alt-texts for images using nine different local models. You can find it in my GitHub repository. It takes care of installing models, running them, and generating alt-texts. It supports both Hugging Face and Ollama and is built to be easily extended as new models come out.

You can run the script as follows:

$ ./alt-text ./test-images/image-1.jpg

The first time you run the script, it will download all models, which requires significant disk space and bandwidth — expect to download over 50GB of model data.

The script outputs a JSON response, making it easy to integrate or analyze programmatically. Here is an example output:

{
  "image": "test-images/image-1.jpg",
  "alt-texts": {
    "vit-gpt2": "A city at night with skyscrapers and a traffic light on the side of the street in front of a tall building.",
    "git": "A busy city street is lit up at night, with the word qroi on the right side of the sign.",
    "blip": "This is an aerial view of a busy city street at night with lots of people walking and cars on the side of the road.",
    "blip2-opt": "An aerial view of a busy city street at night.",
    "blip2-flan": "An aerial view of a busy street in tokyo, japanese city at night with large billboards.",
    "minicpm-v": "A bustling cityscape at night with illuminated billboards and advertisements, including one for Michael Kors.",
    "llava-13b": "A bustling nighttime scene from Tokyo's famous Shibuya Crossing, characterized by its bright lights and dense crowds of people moving through the intersection.",
    "llava-34b": "A bustling city street at night, filled with illuminated buildings and numerous pedestrians.",
    "llama32-vision-11b": "A bustling city street at night, with towering skyscrapers and neon lights illuminating the scene."
  }
}
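
To show roughly how a local multi-modal model can be queried, here is a small sketch that sends an image to a locally running Ollama server, one of the two backends the script supports. It is not the script from my repository; the model name ("llava") and the one-sentence prompt are assumptions for illustration.

import base64
import requests

def generate_alt_text(image_path: str, model: str = "llava") -> str:
    """Ask a local Ollama server to describe an image in one sentence."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    response = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": model,
            "prompt": "Describe this image in one concise sentence for use as alt-text.",
            "images": [image_b64],
            "stream": False,
        },
        timeout=300,
    )
    response.raise_for_status()
    return response.json()["response"].strip()

print(generate_alt_text("test-images/image-1.jpg"))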

Test images

With the script ready, I decided to test it on some of my 10,000 photos. Not all of them at once. I picked five that I consider non-standard. Instead of simple portraits or landscapes, I picked photos with elements that might confuse or challenge the models.

One photo is from the Isabella Stewart Gardner Museum in Boston and features an empty gold frame. The frame once held a masterpiece stolen in the infamous 1990 heist, one of the biggest art thefts in history. I wanted to see if the models would recognize it as empty or mistake it for a framed painting.

Another photo, taken last summer in Vermont, shows a wakeboarder. Though he is the main subject, he is relatively small in the frame. I was curious to see if the models could still recognize him as the focal point.

In another photo, a backgammon game is set in a dark but cozy atmosphere. I was curious to see if the models could recognize partially visible objects and capture the mood of the scene.

To ensure a fair test, I stripped all EXIF metadata from the images. This includes any embedded captions, GPS coordinates, or other details that could inadvertently help the models.

Yes, I know that a test set of five images is small, but it's sufficient to identify the top models for further evaluation. With 12 models generating alt-texts for each photo, I had to manually evaluate 60 alt-texts, which was tedious to do fairly and accurately. For now, these five images are enough to filter out weaker models and pinpoint the best 2-3 LLMs for more detailed testing.

Next, I'll share the detailed results for each of the five photos. It's a lot to go through, so feel free to skip ahead to the summary.

Test image 1: Shibuya Crossing, Tokyo

Aerial view of an intersection, capturing the evening commute with pedestrians, traffic and electronic billboards.

Benchmark LLMs:

Model Description Grade
GPT-4o A vibrant city intersection with crowds crossing, surrounded by illuminated billboards and towering buildings, likely Shibuya, Tokyo. A
Claude Hundreds of people cross Tokyo's famous Shibuya intersection as neon signs light up the surrounding skyscrapers. A

Local LLMs:

Model Description Grade
VIT-GPT2 A city at night with skyscrapers and a traffic light on the side of the street in front of a tall building. F
GIT A busy city street is lit up at night, with the word qroi on the right side of the sign. E
BLIP This is an aerial view of a busy city street at night with lots of people walking and cars on the side of the road. D
BLIP-2 with OPT An aerial view of a busy city street at night. E
BLIP-2 with FLAN An aerial view of a busy street in Tokyo, Japanese city at night with large billboards. E
LLaVA (13B) A bustling nighttime scene of Tokyo's Shibuya Crossing, with neon lights and people walking across the intersection. A
LLaVA (34B) A bustling city street at night, with illuminated buildings and numerous pedestrians. C
MiniCPM-V A bustling cityscape at night with various advertisements, including one for Michael Kors and another for Yakult. C
Llama 3.2 Vision (11B) A bustling city street at night, with numerous billboards and advertisements lining the buildings. C
Llama 3.2 Vision (90B) A bustling city street with many people and buildings. D

Test image 2: Isabella Stewart Gardner Museum, Boston

An empty picture frame in a room with ornate wallpaper, candles, old paintings, and antique furniture.

Benchmark LLMs:

Model Description Grade
GPT-4o Elegant wallpaper, an empty gold frame, candelabra, and framed painting in a classic room. B
Claude Ornate Victorian room with damask wallpaper, candelabra, and gold frames on wall. B

Local LLMs:

Model Description Grade
VIT-GPT2 A living room with a mirror, candles, and a vase of flowers on a table in front of a mirror. F
GIT A picture frame is hanging on a wall next to a vase and a vase with the word tulips on it. E
BLIP A room with a painting on the wall and two vases on the table in front of it. E
BLIP-2 with OPT A room with a painting on the wall, a picture frame, and a chandelier. C
BLIP-2 with FLAN A room with a painting on the wall and a couple of framed pictures hanging on the wall next to it. D
LLaVA (13B) An empty wall with ornate decorations, including candlesticks and framed artwork, suggesting a formal or historical setting. D
LLaVA (34B) An ornate room with framed pictures on the wall, a chandelier hanging from the ceiling, and a fireplace mantel adorned with decorative items. E
MiniCPM-V A room with ornate wallpaper, candlesticks and framed portraits of historical figures is displayed.
Llama 3.2 Vision (11B) An empty gold frame on a wall with ornate wallpaper, surrounded by other decorative items. A
Llama 3.2 Vision (90B) An ornate room with a gold frame, a painting of a man, and a vase on a table. B

Test image 3: wakeboarding in Vermont, USA

Two men in swim shorts on the back of a boat watching another person wakeboarding behind the boat.

Benchmark LLMs:

Model Description Grade
GPT-4o Two shirtless boys on a boat watch a person wakeboarding in the distance on a cloudy day. A
Claude People watching from a boat as a person is wakeboarding on a lake with a wooded shoreline. B

Local LLMs:

Model Description Grade
VIT-GPT2 A man standing on top of a boat next to another man holding a surfboard in one hand and a surfboard in the other. E
GIT Two men are in a boat, one of them is wearing an orange hat and the other is wearing an orange hat. D
BLIP There are two men riding on the back of a boat in the water, one of them is on a surfboard and the other is on a board. D
BLIP-2 with OPT Three young men sitting on the back of a boat. D
BLIP-2 with FLAN A group of people sitting on a boat watching a man ride a surfboard in the middle of the water. C
LLaVA (13B) Two shirtless men on a boat, watching another man surfing in the water. B
LLaVA (34B) Two shirtless men on a boat, watching another person water skiing. B
MiniCPM-V Two shirtless men on a boat watch as another person water skis away from them, with trees and sky visible across the lake behind him. B
Llama 3.2 Vision (11B) Two shirtless men on a boat watching people wakeboarding or surfing behind it. B
Llama 3.2 Vision (90B) Two men on a boat watching another man wakeboarding. A

Test image 4: hiking in the Dolomites, Italy

Santa Maddalena church in Funes.

Benchmark LLMs:

Model Description Grade
GPT-4o A wooden shrine with a statue of Mary and directional signs, set against mountains and green hills. A
Claude Wooden shrine of St. Magdalena with trail signs in the Dolomites mountains of Italy. A

Local LLMs:

Model Description Grade
VIT-GPT2 A bird perched on top of a wooden post in front of a mountain range with a view of a mountain range and mountains. F
GIT A wooden sign in front of a mountain with a statue on it that says "Meise museum" on the top. D
BLIP There is a birdhouse with a statue on top of it in the middle of a field with trees and mountains in the background. D
BLIP-2 with OPT A statue of Mary on a wooden post in the middle of a field with mountains in the background. B
BLIP-2 with FLAN A statue of the Virgin Mary sits on top of a wooden post in the middle of a mountain. C
LLaVA (13B) A picturesque scene of a small wooden chapel with a statue of Jesus, set against a backdrop of majestic mountains and lush greenery. C
LLaVA (34B) A wooden shrine with religious iconography, surrounded by alpine scenery and signposts. B
MiniCPM-V A wooden shrine with a statue of Mary and several directional signs pointing to various locations such as Kirchsteig, Magdalena, St. B
Llama 3.2 Vision (11B) A wooden shrine with a statue of Mary and a signpost in front of mountains. A
Llama 3.2 Vision (90B) A statue of Mary in a wooden shrine with a signpost pointing to various locations, including Rundweg St. B

Test image 5: backgammon by candlelight

A backgammon board on a wooden table, accompanied by candles that cast a warm glow.

Benchmark LLMs:

Model Description Grade
GPT-4o A cozy, dimly lit room with a candle on a wooden table, next to a backgammon board, creating a warm, rustic ambiance. B
Claude Two candles light up a game board for backgammon on a wooden table at night. A

Local LLMs:

Model Description Grade
VIT-GPT2 A candle is lit on a wooden table in front of a fire place with candles and other items on top of it. E
GIT Two candles are lit next to each other on a table, one of them is lit up and the other is lit up. D
BLIP There is a lit candle sitting on top of a wooden table next to a game board and a glass of wine on the table. D
BLIP-2 with OPT A candle sits on top of a wooden table. C
BLIP-2 with FLAN A candle sits on a wooden table next to a backgammon board and a glass of wine. D
LLaVA (13B) A cozy indoor setting with candles, a guitar, and a warm ambiance. D
LLaVA (34B) A candlelit wooden table with musical instruments and a cozy ambiance. C
MiniCPM-V A dimly lit room with candles and backgammon pieces on a wooden table, creating an atmosphere of relaxation or leisure activity. A
Llama 3.2 Vision (11B) A dimly lit room with a wooden table, featuring a backgammon board and two candles. A
Llama 3.2 Vision (90B) A candle and backgammon board on a wooden table. B

Model accuracy

I evaluated each description using a structured but subjective scoring system. For each image, I identified the two or three most important objects the AI should recognize and include in its description. I also assessed whether the model captured the photo's mood, which can be important for visually impaired users. Finally, I deducted points for repetition, grammar errors, or hallucinations (invented details). Each alt-text received a score from 0 to 5, which I then converted to a letter grade from A to F.

Model Repetitions Hallucinations Moods Average score Grade
VIT-GPT2 Often Often Poor 0.4/5 F
GIT Often Often Poor 1.6/5 D
BLIP Often Often Poor 1.8/5 D
BLIP2 w/OPT Rarely Sometimes Fair 2.6/5 C
BLIP2 w/FLAN Rarely Sometimes Fair 2.2/5 D
LLaVA 13B Never Sometimes Good 3.2/5 C
LLaVA 34B Never Sometimes Good 3.2/5 C
MiniCPM-V Never Never Good 3.8/5 B
Llama 11B Never Rarely Good 4.4/5 B
Llama 90B Never Rarely Good 3.8/5 B
GPT-4o Never Never Good 4.8/5 A
Claude 3.5 Sonnet Never Never Good 5/5 A
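
For transparency, the score-to-grade conversion can be expressed as a simple threshold mapping. The half-point bands below are my reading of the scores and grades in the table above, not an official rubric:

def score_to_grade(score: float) -> str:
    """Map a 0-5 average score to a letter grade (approximate bands)."""
    if score >= 4.5:
        return "A"
    if score >= 3.5:
        return "B"
    if score >= 2.5:
        return "C"
    if score >= 1.5:
        return "D"
    if score >= 0.5:
        return "E"
    return "F"

print(score_to_grade(4.4))  # "B", matching Llama 11B in the table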

The cloud-based models, GPT-4o and Claude 3.5 Sonnet, performed nearly perfectly on my small test of five images, with no major errors, hallucinations, or repetitions, and with excellent mood detection.

Among local models, both Llama variants and MiniCPM-V show the strongest performance.

Repetition in descriptions frustrates users of screen readers. Early models like VIT-GPT2, GIT, BLIP, and BLIP2 frequently repeat content, making them unsuitable.

Hallucinations can be a serious issue in my opinion. Describing nonexistent objects or actions misleads visually impaired users and erodes trust. Among the best-performing local models, MiniCPM-V did not hallucinate, while Llama 11B and Llama 90B each made one mistake. Llama 90B misidentified a cabinet at the museum as a table, and Llama 11B described multiple people wakeboarding instead of just one. While these errors aren't dramatic, they are still frustrating.

Capturing mood is essential for giving visually impaired users a richer understanding of images. While early models struggled in this area, all recent models performed well. This includes both LLaVA variants and MiniCPM-V.

From a practical standpoint, Llama 11B and MiniCPM-V ran smoothly on my 32GB RAM laptop, but Llama 90B needed more memory. Long story short, this means that Llama 11B and MiniCPM-V are my best candidates for additional testing.

Possible next steps

The results raise a tough question: is a "B"-level alt-text better than none at all? Many human-written alt-texts probably aren't perfect either. Should I wait for local models to hit an "A"-grade, or is an imperfect description still better than no alt-text at all?

Here are four possible next steps:

  1. Combine AI outputs – Run the same image through different models and merge their results to try and create more accurate descriptions.
  2. Wait and upgrade – Use the best local model for now, tag AI-generated alt-texts in the database, and refresh them in 6–12 months when new and better local models are available.
  3. Go cloud-based – Get the best quality with a cloud model, even if it means uploading 65GB of photos. I can't explain why, or if the feeling is even justified, but it feels like giving in.
  4. Hybrid approach – Use AI to generate alt-texts but review them manually. With 9,000 images, that is not practical. I'd need a way to flag the alt-texts most likely to be wrong (see the sketch after this list). Can LLMs give me a reliable confidence score?
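
As a rough illustration of options 1 and 4, a cheap first pass is to compare the alt-texts that different models produced for the same image: if they broadly agree, the description is more likely safe to publish; if they diverge, flag the image for manual review. The sketch below uses only the standard library, and the 0.5 agreement threshold is an arbitrary assumption that would need tuning on real data.

from difflib import SequenceMatcher
from itertools import combinations

def agreement_score(alt_texts: dict[str, str]) -> float:
    """Average pairwise similarity between the alt-texts of different models."""
    pairs = list(combinations(alt_texts.values(), 2))
    if not pairs:
        return 1.0
    ratios = [SequenceMatcher(None, a.lower(), b.lower()).ratio() for a, b in pairs]
    return sum(ratios) / len(ratios)

# Example with two of the outputs from the JSON shown earlier:
alts = {
    "llava-13b": "A bustling nighttime scene from Tokyo's famous Shibuya Crossing.",
    "llama32-vision-11b": "A bustling city street at night, with towering skyscrapers.",
}
needs_review = agreement_score(alts) < 0.5  # arbitrary threshold
print(needs_review)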

Each option comes with trade-offs. Some options are quick but imperfect, others take work but might be worth it. Going cloud-based is the easiest but it feels like giving in. Waiting for better models is effortless but means delaying progress. Merging AI outputs or assigning a confidence score takes more effort but might be the best balance of speed and accuracy.

Maybe the solution is a combination of these options? I could go cloud-based now, tag the AI-generated alt-texts in my database, and regenerate them in 6–12 months when LLMs have gotten even better.

It also comes down to pragmatism versus principle. Should I stick to local models because I believe in data privacy and Open Source, or should I prioritize accessibility by providing the best possible alt-text for users? The local-first approach better aligns with my values, but it might come at the cost of a worse experience for visually impaired users.

I'll be weighing these options over the next few weeks. What would you do? I'd love to hear your thoughts!

Update: My thoughts on using AI for alt-text have evolved across several blog posts. First, I chose a cloud-based LLM after all. Then, I built an automated system to generate and update descriptions for just one image. Finally, I scaled it to 9,000 images and learned to trust AI in the process.

January 31, 2025

Treasure hunters, we have an update! Unfortunately, some of our signs have been removed or stolen, but don’t worry—the hunt is still on! To ensure everyone can continue, we will be posting all signs online so you can still access the riddles and keep progressing. However, there is one exception: the 4th riddle must still be heard in person at Building H, as it includes an important radio message. Keep your eyes on our updates, stay determined, and don’t let a few missing signs stop you from cracking the code! Good luck, and see you at Infodesk K with…

January 29, 2025

Are you ready for a challenge? We’re hosting a treasure hunt at FOSDEM, where participants must solve six sequential riddles to uncover the final answer. Teamwork is allowed and encouraged, so gather your friends and put your problem-solving skills to the test! The six riddles are set up across different locations on campus. Your task is to find the correct locations, solve the riddles, and progress to the next step. No additional instructions will be given after this announcement; it’s up to you to navigate and decipher the clues! To keep things fair, no hints or tips will be given…

What if we stopped being good little obedient consultants?

The nightmare of exams

Regularly, I wake up at night with a knot in my stomach and a surge of panic at the thought that I haven't studied for my university exam. It has been 20 years since I last took an exam, and yet I am still traumatized by them.

So I try to offer my students the least stressful exam possible. If a student is really getting nowhere, I use the adrenaline inherent to an exam to try to get the concepts across. Sometimes I ask one student to teach the material to another. I do impose certain dress rules: ties are forbidden, but everything else is encouraged. I have had students in bathrobes, a student in the traditional costume of his country and, still unsurpassed, a student in a full Minnie Mouse costume (the ears, the makeup, the shoes, the works!). This year I got... a banana!

A student takes his exam dressed as a banana.

The East India Company's monopoly

I also encourage students to bring their own exam topic.
One of my students suggested this article comparing Google to the East India Company which, like all empires, ended up collapsing under its own weight. I like the analogy and the moral: power is only gained by making enemies. It is when you think you hold the most power that you have the most enemies, enemies with nothing to lose who want revenge. Trump, Facebook, Google. They are at the top. But every day the ranks of the rebels are swelled by those who believed their allegiance and submission would earn them a fraction of the power or the wealth, and who were then disappointed. Because absolute power is not shared. By definition, it never is.

Admittedly, the article is quite naive on some points. It says, for instance, that AT&T did not exploit its dominant position because it followed a certain ethic. That is false. AT&T did not exploit its dominant position simply because the company was under threat of an antitrust lawsuit. The fear of that lawsuit is what made possible the success of UNIX (developed by AT&T) and of the Internet. When IBM started to gain a dominant position in the nascent computer market, the fear of a lawsuit is what made possible the standardization of the PC we know today and the emergence of the software industry into which Microsoft rushed.

But I have already told you that story:

Unfortunately, everything changed in the 1980s with the Reagan presidency (the Trump of his day). His advisers established the idea that monopolies are not so harmful after all, that they are even rather good for the economy (especially the finances of the politicians who hold shares in those monopolies). As a result, they were prosecuted much less, and even encouraged. Hence the successes of Microsoft, Google and Facebook which, despite the lawsuits, were never broken up and never had to change their practices.

If you are reading this, it probably sounds absurd: how can anyone justify that monopolies are not harmful just to enrich politicians? What trick could possibly work?

The secret? There is no trick. No need to justify anything. You just do it. And for all the practical aspects of any law, however absurd and unjust it may be, you simply bypass scrupulous civil servants and have everything done by consulting firms. Well, mostly one: McKinsey.

McKinsey and the naivety of goodness

As a student, I attended a McKinsey recruitment evening. I did not have much hope, since they announced that they only took those with the best grades (which I was far from having), but I figured you never know. I had no idea what McKinsey was or what they did; I just knew it was a sort of Holy Grail, since they only took the best.

Sitting in a lecture hall, I watched a presentation of real "cases". A McKinsey employee, who announced that she had done the same studies as me a few years earlier (but with much better results), presented her work. The job was to carry out the merger of two entities whose names had been hidden. On the screen were columns of "resources" for each entity, and then how the merger would save on those resources.

At first I was a bit lost in the jargon. I asked a few questions and eventually understood that the "resources" were employees. That what I was looking at was, above all, a brutal layoff plan. I interrupted the presentation to ask how the ethical aspects were taken into account. I got a boilerplate answer about how "ethics are paramount at McKinsey, we follow strict rules". I insisted, I dug. Of the fifty or so students attending, I was the only one to speak up, the only one to be surprised (I talked with others afterwards; nobody seemed to have seen the problem). I asked the presenter to give me an example of one of these famous McKinsey ethics rules. And I got this answer, which has remained engraved in my memory: "A McKinsey consultant must always favour the interest of their client, no matter what."

After the presentation, I went to find the consultant in question. Over a canapé, I pressed the point about ethics once more. She gave me the same spiel. I then told her that that was not what I was talking about. That all her columns of figures were people who were going to lose their jobs, that the merger would have a major economic impact on thousands of families, and that I wondered how that aspect was taken into account.

She opened her mouth. Her face fell. And the brilliant engineer who had passed the hardest studies with the best grades answered me:

"I had never thought about that…"

Even the supposedly smartest people do not think. They obey. "Hiring only the best" was not a recruitment technique, but a way of creating a veneer of elitism that kept the lucky chosen few from asking themselves any questions.

"I had never thought about that…"

In my exam, I had a particularly brilliant student. I told her that, given her extremely fine understanding, I expected her to question things more, to push back especially against others who are less brilliant but more sure of themselves. She is clearly smarter than I am, so why does she not step in to point out when I am being inconsistent? The world needs intelligent people who ask questions. She defended herself: "But I have always been taught to do the opposite!"

In a very good article, Garrison Lovely looks back at McKinsey's strategy.

The fact that people who had taken part in a protest against Trump later became central pieces [as McKinsey consultants] of his deportation policy is, in a sense, all you need to know.
McKinsey executes; it does not make policy

The author asks what would have stopped McKinsey from optimizing the supply of barbed wire for concentration camps. The answer comes: "McKinsey has values". Values that are taught and repeated during "Values Days". Twenty years after my own experience of a McKinsey evening, nothing has changed.

The naivety of good

The problem with good, subtle people is that they cannot imagine that the scam is fundamentally dishonest and not subtle.

They try to understand, to explain, to justify.

There is nothing to understand: the dishonest seek their profit in a direct, unsubtle way. It is so obvious that even the most subtle let it slide, telling themselves it must be hiding "something else".

I see messages going around saying that Trump or Musk are doing illegal things. That they are not following the rules.

Well, exactly. That is the whole point.

That Trump cannot decently have cheated in the elections because "someone" would have objected. Someone? But who? Those who obey their bosses without asking questions because it is their job? Those who are afraid of losing their position and prefer not to upset the man who won the elections? Those who, on the contrary, figure they can do well for themselves by stroking the winner's ego?

If Trump had been convicted for his involvement in the January 6 insurrection, everyone would have piled on and fought over his remains. But even the judges involved knew he could become president again. That he would use his power to punish anyone involved in his conviction. It was less risky to support Trump than to do the opposite. Most people, even the most powerful, especially the most powerful, are sheep terrorized by the stick and on the lookout for the smallest carrot.

The only ones who can speak out are those who have a strong moral compass, who have nothing to lose, nothing to gain, who have the strength and energy to be outraged, the time to do it and the networks to be heard. I insist on the logical "and": all of these conditions must be met. And it has to be said that that does not leave many people.

Especially when you realize that this "not many people" is mostly made up of idealists who do not want to believe that the person across from them could be so utterly devoid of morals and scruples. So, like fools, they try to make themselves heard… on X or on Facebook, platforms owned by the very people they are trying to fight.

Those who complain on these platforms feel like they are doing something, but they are algorithmically locked into their little bubble, where they will have no impact on the rest of the world.

"Good" and "gullible" start with the same letter. We get what we deserve. Simply keeping an account on X after Elon Musk bought it was a virtual vote for Trump. Everyone knew it. You knew it. You could not not know it. It is just that, like a good McKinsey consultant, you told yourself that "it was not that bad". That "there are rules, right?". If you are reading me, you are, like many, a good person and therefore incapable of imagining that Elon Musk could simply and very openly have manipulated his social network to favour Trump.

Better late than never

But it is never too late to react. February 1st has been announced as "Global Switch Day". You are invited to migrate from X to Mastodon.

On February 1st, migrate from X to Mastodon

Mastodon is becoming a foundation. It is good to see that Eugen, the creator of Mastodon who for a while prided himself on the title of "CEO of Mastodon", realizes that this pressure is enormous, that he is not playing in the same league, and that Mastodon is a common good. By calling him "CEO", Meta was flattering him to obtain his cooperation. Eugen seems to have understood that he was losing himself. There is also an excellent interview with Renaud, a Mastodon developer.

Thierry Crouzet draws a comparison with the resistance fighters.

In parallel, Dansup is developing Pixelfed, which looks like Instagram. The great thing is that you can follow people on Mastodon from Pixelfed and vice versa (well, in theory; we will have to come back to that, because for now, from Pixelfed, you only see the Mastodon posts that contain an image. I hope that will evolve).

Pixelfed has attracted so many people disgusted with Meta (the owner of Instagram) that Dansup has been besieged by investors eager to put money into his "company".

On February 1st, migrate from Instagram to Pixelfed

Too bad for them: like Mastodon, Pixelfed is a public good. It is and will be funded by donations. Dansup is, by the way, launching a Kickstarter campaign:

Signal and messaging

During my exam, most students got questions about the Fediverse or about Signal, which let me probe how they use social networks and messaging apps. Funny thing: they are all on networks where they think "everyone is". But, without having talked to each other, they do not agree on which network that is. I had students who swear by Instagram and others who have never had an account. I had a student who is on Facebook Messenger and on Signal but has never felt the need to be on Whatsapp. Next to him, another student had simply never heard of Signal. Only Discord seems to be unanimous.

Those who used Signal all said they wished Signal were more widely used. Well, February 1st is the opportunity!

On February 1st, migrate from Whatsapp to Signal

So maybe it is time to stop playing the good little McKinsey consultant! Especially if you are not being paid for it…

January 27, 2025

Core to the Digital Operational Resilience Act is the notion of a critical or important function. When a function is deemed critical or important, DORA expects the company or group to take precautions and measures to ensure the resilience of the company and the markets in which it is active.

But what exactly is a function? When do we consider it critical or important? Is there a differentiation between critical and important? Can an IT function be a critical or important function?

Defining functions

Let's start with the definition of a function. Surely that is defined in the documents, right? Right?

Eh... no. The DORA regulation does not seem to provide a definition for a function. It does, however, refer to the definition of critical function in the Bank Recovery and Resolution Directive (BRRD), aka Directive 2014/59/EU. That is one of the regulations focusing on resolution in the case of severe disruptions, bankruptcy or other failures of banks at a national or European level. Delegated Regulation EU 2016/778 further provides several definitions that inspired the DORA regulation as well.

In the latter document, we do find the definition of a function:

‘function’ means a structured set of activities, services or operations that are delivered by the institution or group to third parties irrespective from the internal organisation of the institution;

Article 2, (2), of Delegated regulation 2016/778

So if you want to be blunt, you could state that an IT function which only supports its own group (as in, you are not insourcing the IT of other companies) is not a function, and thus cannot be a "critical or important function" from DORA's viewpoint.

That is, unless you find that the definitions from previous regulations do not necessarily imply the same interpretation within DORA. After all, DORA does not amend the EU 2016/778 regulation. It amends EC 1060/2009, EU 2012/648, EU 2014/600 aka MiFIR, EU 2014/909 aka CSDR and EU 2016/1011 aka Benchmark Regulation. But none of these has a definition for 'function' at first sight.

So let's humor ourselves and move on. What is a critical function? Is that defined in DORA? Not really; sort of. DORA has a definition for critical or important function, but let's first look at the more specific definitions.

In the BRRD regulation, this is defined as follows:

‘critical functions’ means activities, services or operations the discontinuance of which is likely in one or more Member States, to lead to the disruption of services that are essential to the real economy or to disrupt financial stability due to the size, market share, external and internal interconnectedness, complexity or cross-border activities of an institution or group, with particular regard to the substitutability of those activities, services or operations;

Article 2, (35), of BRRD 2014/59

This builds on the notion of a function and adds an evaluation of whether it is crucial for the economy, especially if it were suddenly discontinued. The extension of the definition of function is also confirmed by guidance that the European Single Resolution Board published, namely that "the function is provided by an institution to third parties not affiliated to the institution or group".

The preamble of the Delegated regulation also mentions that its focus is at the safeguarding of the financial stability and the real economy. It gives examples of potential critical functions such as deposit taking, lending and loan services, payment, clearing, custody and settlement services, wholesale funding markets activities, and capital markets and investments activities.

Of course, your IT supports your company, and in the case of financial institutions, IT is a very big part of the company. Is IT then not involved in all of this?

It sure is...

Defining services

The Delegated regulation EU 2016/778 in its preamble already indicates that functions are supported by services:

Critical services should be the underlying operations, activities and services performed for one (dedicated services) or more business units or legal entities (shared services) within the group which are needed to provide one or more critical functions. Critical services can be performed by one or more entities (such as a separate legal entity or an internal unit) within the group (internal service) or be outsourced to an external provider (external service). A service should be considered critical where its disruption can present a serious impediment to, or completely prevent, the performance of critical functions as they are intrinsically linked to the critical functions that an institution performs for third parties. Their identification follows the identification of a critical function.

Preamble, (8), Delegated regulation 2016/778

IT within an organization is certainly offering services to one or more of the business units within that financial institution. Once the company has defined its critical functions (or for DORA, "critical or important functions"), then the company will need to create a mapping of all assets and services that are needed to realize that function.

Out of that mapping, it is very well possible that several IT services will be considered critical services. I myself am involved in the infrastructure side of things, which is often a matter of shared services. The delegated regulation already points to this, and a somewhat older guideline from the Financial Stability Board has the following to say about critical shared services:

a critical shared service has the following elements: (i) an activity, function or service is performed by either an internal unit, a separate legal entity within the group or an external provider; (ii) that activity, function or service is performed for one or more business units or legal entities of the group; (iii) the sudden and disorderly failure or malfunction would lead to the collapse of or present a serious impediment to the performance of, critical functions.

FSB guidance on identification of critical functions and critical shared services

For IT organizations, it is thus most important to focus on the services they offer.

Definition of critical or important function

Within DORA, the definition of critical or important function is as follows:

(22) ‘critical or important function’ means a function, the disruption of which would materially impair the financial performance of a financial entity, or the soundness or continuity of its services and activities, or the discontinued, defective or failed performance of that function would materially impair the continuing compliance of a financial entity with the conditions and obligations of its authorisation, or with its other obligations under applicable financial services law;

Article 3, (22), DORA

If we compare this definition with the previous ones about critical functions, we notice that it is extended with an evaluation of the impact on the company, rather than on the market. I think it is safe to say that this is the "or important" part of "critical or important function": whereas a function is critical if its discontinuance has market impact, a function is important if its discontinuance causes material impairment to the company itself.

Hence, we can consider a critical or important function as one whose disruption has either market impact (critical) or company impact (important), while still being something the institution offers externally (a function).

This broader definition means that DORA puts forward more expectations than previous regulations, which is one of the reasons DORA is so impactful for financial institutions.

Implications towards IT

From the above, I'd wager that IT itself is not a "critical or important function", but IT offers services which could be supporting critical or important functions. Hence, it is necessary that the company has a good mapping of the functions and their underlying services, operations and systems. From that mapping, we can then see if those underlying services are crucial for the function or not. If they are, then we should consider those as critical or important systems.

This mapping is mandated by DORA as well:

Financial entities shall identify all information assets and ICT assets, including those on remote sites, network resources and hardware equipment, and shall map those considered critical. They shall map the configuration of the information assets and ICT assets and the links and interdependencies between the different information assets and ICT assets.

Article 8, (4), DORA

as well as:

As part of the overall business continuity policy, financial entities shall conduct a business impact analysis (BIA) of their exposures to severe business disruptions. Under the BIA, financial entities shall assess the potential impact of severe business disruptions by means of quantitative and qualitative criteria, using internal and external data and scenario analysis, as appropriate. The BIA shall consider the criticality of identified and mapped business functions, support processes, third-party dependencies and information assets, and their interdependencies. Financial entities shall ensure that ICT assets and ICT services are designed and used in full alignment with the BIA, in particular with regard to adequately ensuring the redundancy of all critical components.

Article 11, paragraph 2, DORA

In more complex landscapes, it is very well possible that the mapping is a multi-layered view with different types of systems or services in between, which could make the effort to identify services as being critical or important quite challenging.

For instance, it could be that the IT organization has a service catalog, but that this catalog is too coarse-grained for the critical-or-important label to be meaningful. A more fine-grained service catalog will be necessary to properly evaluate the dependencies, but that also implies that the business (which has defined its critical or important functions) will need to indicate which fine-grained services it depends on, rather than the high-level ones.
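
To make this layered mapping concrete, here is a small sketch of how such a dependency map could be represented, and how the critical-or-important label could be propagated from business functions down to the services and assets that support them. The function and service names are invented for illustration; nothing in this sketch is prescribed by DORA itself.

# Hypothetical mapping: functions -> services -> assets (names are made up).
DEPENDENCIES = {
    "payment processing": ["core banking platform", "payments gateway"],
    "core banking platform": ["database cluster", "container platform"],
    "payments gateway": ["container platform", "HSM service"],
}

CRITICAL_OR_IMPORTANT_FUNCTIONS = {"payment processing"}

def critical_items(functions: set[str], dependencies: dict[str, list[str]]) -> set[str]:
    """Collect everything a critical or important function directly or indirectly relies on."""
    critical: set[str] = set()
    stack = list(functions)
    while stack:
        item = stack.pop()
        if item in critical:
            continue
        critical.add(item)
        stack.extend(dependencies.get(item, []))
    return critical

# Every item returned here would inherit the resilience expectations DORA imposes.
print(critical_items(CRITICAL_OR_IMPORTANT_FUNCTIONS, DEPENDENCIES))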

In later posts, I'll probably dive deeper into this layered view.

Feedback? Comments? Don't hesitate to get in touch on Mastodon.

January 26, 2025

The regular FOSDEM lightning talk track isn't chaotic enough, so this year we're introducing Lightning Lightning Talks (now with added lightning!). Update: we've had a lot of proposals, so submissions are now closed! Thought of a last minute topic you want to share? Got your interesting talk rejected? Has something exciting happened in the last few weeks you want to talk about? Get that talk submitted to Lightning Lightning Talks! This is an experimental session taking place on Sunday afternoon (13:00 in k1105), containing non-stop lightning fast 5 minute talks. Submitted talks will be automatically presented by our Lightning…

January 21, 2025

A person works on a laptop as a hologram of an AI agent hovers beside them.

I'm often asked, "Will AI agents replace digital marketers and site builders?" The answer is yes, at least for certain kinds of tasks.

To explore this idea, I prototyped two AI agents to automate marketing tasks on my personal website. They update meta descriptions to improve SEO and optimize tags to improve content discovery.

Watching the AI agents in action is incredible. In the video below, you'll see them effortlessly navigate my Drupal site — logging in, finding posts, and editing content. It's a glimpse into how AI could transform the role of digital marketers.

The experiment

I built two AI agents to help optimize my blog posts. Here is how they work together:

  • Agent 1: Content analysis: This agent finds a blog post, reviews its content, and suggests improved summaries and tags to enhance SEO and increase discoverability.
  • Agent 2: Applying updates: After manual approval, this agent logs into the site and updates the summary and tags suggested by the first agent.

All of this could be done in one step, or with a single agent, but keeping a 'human-in-the-loop' is good for quality assurance.

This was achieved with just 120 lines of Python code and a few hours of trial and error. As the video demonstrates, the code is approachable for developers with basic programming skills.

The secret ingredient is the browser_use framework, which acts as a bridge between various LLMs and Playwright, a framework for browser automation and testing.
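
For readers curious what such an agent looks like in code, here is a stripped-down sketch following the Agent/run pattern the browser_use project documents; details may differ between versions. The task description, site and credential handling are placeholders, and my actual agents are longer and include a manual approval step.

import asyncio

from browser_use import Agent
from langchain_openai import ChatOpenAI

async def main():
    # The task is written in natural language; the framework drives Playwright
    # and asks the LLM what to click or type next.
    agent = Agent(
        task=(
            "Log in to https://example.com/user/login, "  # placeholder site
            "find the most recent blog post, and suggest an improved meta description."
        ),
        llm=ChatOpenAI(model="gpt-4o"),
    )
    result = await agent.run()
    print(result)

asyncio.run(main())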

The magic and the reality check

What makes this exciting is the agent's ability to problem-solve. It's almost human-like.

Watching the AI agents operate my site, I noticed they often face the same UX challenges as humans. It likely means that the more we simplify a CMS like Drupal for human users, the more accessible it becomes for AI agents. I find this link between human and AI usability both striking and thought-provoking.

In the first part of the video, the agent was tasked with finding my DrupalCon Lille 2023 keynote. When scrolling through the blog section failed, it adapted by using Google search instead.

In the second part of the video, it navigated Drupal's more complex UI elements, like auto-complete taxonomy fields, though it required one trial-and-error attempt.

The results are incredible, but not flawless. I ran the agents multiple times, and while they performed well most of the time, they aren't reliable enough for production use. However, this field is evolving quickly, and agents like this could become highly reliable within a year or two.

Native agents versus explorer agents

In my mind, agents can be categorized as "explorer agents" or "native agents". I haven't seen these terms used before, so here is how I define them:

  • Explorer agents: These agents operate across multiple websites. For example, an agent might use Google to search for a product, compare prices on different sites, and order the cheapest option.
  • Native agents: These agents work within a specific site, directly integrating with the CMS to leverage its APIs and built-in features.

The browser_use framework, in my view, is best suited for explorer agents. While it can be applied to a single website, as shown in my demo, it's not the most efficient approach.

Native agents that directly interact with the CMS's APIs should be more effective. Rather than imitating human behavior to "search" for content, the agent could retrieve it directly through a single API call. It could then programmatically propose changes within a CMS-supported content editing workflow, complete with role-based permissions and moderation states.
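
As a sketch of what a native agent could look like on a Drupal site, the example below uses Drupal's JSON:API module to read and update a node directly instead of driving the UI. The base URL, UUID, field name and credentials are placeholders, and a real setup would authenticate properly and route changes through Drupal's permissions and content moderation rather than issuing a blind PATCH.

import requests

BASE = "https://example.com"  # placeholder site
NODE_UUID = "00000000-0000-0000-0000-000000000000"  # placeholder article UUID
HEADERS = {"Accept": "application/vnd.api+json", "Content-Type": "application/vnd.api+json"}

# Fetch the article in a single API call instead of browsing to it.
article = requests.get(f"{BASE}/jsonapi/node/article/{NODE_UUID}", headers=HEADERS).json()
print(article["data"]["attributes"]["title"])

# Propose an updated summary (the field name depends on the site's content model).
patch = {
    "data": {
        "type": "node--article",
        "id": NODE_UUID,
        "attributes": {"field_summary": "A sharper, SEO-friendly summary."},
    }
}
requests.patch(
    f"{BASE}/jsonapi/node/article/{NODE_UUID}",
    json=patch,
    headers=HEADERS,
    auth=("agent", "secret"),  # placeholder credentials
)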

I can also imagine a future where native agents and explorer agents work together (hybrid agents), combining the strengths of both approaches to unlock even greater opportunities.

Next steps

A next step for me is to build a similar solution using Drupal's AI agent capabilities. Drupal's native AI agents should make finding and updating content more efficient.

Of course, other digital marketing use cases might benefit from explorer agents. I'd be happy to explore these possibilities as well. Let me know if you have ideas.

Conclusions

Building an AI assistant to handle digital marketing tasks is no longer science fiction. It's clear that, soon, AI agents will be working alongside digital marketers and site builders.

These tools are advancing rapidly and are surprisingly easy to create, even though they're not yet perfect. Their potential disruption is both exciting and hard to fully understand.

As Drupal, we need to stay ahead by asking questions like: are we fully imagining the disruption AI could bring? The future is ours to shape, but we need to rise to the challenge.

January 20, 2025

Don't come saying you weren't warned…

…it's just that you thought you weren't concerned

For decades, I have been one of those people trying to sound the alarm about the terrifying possibilities opened up by the technological blindness we are immersed in.

I believed I had to explain, to inform, again and again.

I am discovering, with dread, that even those who understand what I am saying do not act. Or even act in the opposite direction. Trump voters, for the most part, know very well what is coming. Artists defend Facebook and Spotify. The most left-wing politicians stay glued to X as their only window on the world. And yet they have been warned!

It is just that they believe they are not concerned. It is just that we naively think it only happens to other people. That we are, one way or another, among those who will be the privileged ones.

I am a man. White. Cisgender. With a very good degree. A very good position. In one of the most protected, most democratic places. In short, I will be among the very last to suffer the combined effects of politics and technology.

And I am afraid. I am terrified.

I am afraid of permanent surveillance

We voluntarily and almost consciously submit ourselves to permanent surveillance. I had already told you that even drinks vending machines do facial recognition!

As Post Tenebras Lire points out, reality looks more and more like my novel Printeurs.

But reality has completely overtaken fiction, the most emblematic case being what is happening to the Uyghurs, who are monitored at all times and for whom the slightest image posted on social networks (even one deleted years ago) can serve as an excuse to be locked up.

Flee social networks? It is too late for them, because the mere fact of not having a smartphone is considered suspicious. Just as, in France, using Signal has already been treated as sufficient incriminating evidence to suspect a user of eco-terrorism. (There is only one way to counter that: migrate to Signal en masse!)

I say it and I repeat it: you have the right to leave proprietary social networks. You have the right to uninstall Whatsapp in favour of Signal. If you have the slightest doubt, I guarantee you will feel much better afterwards.

Don't come saying you weren't warned.

I am afraid of mindless uniformity

The algorithms of proprietary platforms allow them to impose their choices, their worldview, on you. X/Twitter hides Democrats and pushes Trump during the elections. Facebook hides the slightest glimpse of a nipple but promotes Nazis. So it is no surprise to learn that Spotify does exactly the same thing, generating music that is then forced into playlists, especially the "background music/ambient music" kind.

That way, they do not have to pay royalties to musicians. Who do not earn much as it is.

I am a pirate. When I discover a band that speaks to me, I illegally download several albums as MP3s. If I like them, I buy the legal, DRM-free MP3s (which Bandcamp allows, for example). I am a pirate, but by buying the album directly, I do more for the artist than hundreds or even thousands of streams.

Spotify's CEO earns more money every year than Taylor Swift. He is richer than Paul McCartney or Mick Jagger, each after a 50-year career.

Pirates are to artists what immigrants are to the poor: a convenient scapegoat. And, in the meantime, Amazon and Spotify are killing artists and drowning us in a uniform sludge.

That is exactly why AI looks so attractive to business: it is, by definition, a production line for bland, uniform sludge.

AIs which are trained using… pirate databases! Because yes, Meta trained its AIs on the libgen.is database.

Don't come saying you weren't warned.

A reminder: libgen.is is to books what thepiratebay is to films. A gigantic pirate library, a bastion of cultural preservation. I am not exaggerating: at a dinner, my friend Henri Lœvenbruck told me about a fairly old book that was out of print and had become impossible to find. Even in libraries, second-hand bookshops and online catalogues, he could not get his hands on it. A book that I found, right in front of him, on libgen.is in a few seconds.

Those you accuse of being pirates are the defenders, protectors and distributors of human culture. And since Meta uses libgen.is to train its AIs, it is not really piracy anymore; you have the moral right to use this shared library. On which, by the way, I encourage you to discover my books if you do not want to, or cannot, pay for them.

I am afraid of watching democracy disappear

All the experts have been saying it for 25 years: electronic voting buries democracy completely.

The cheating does not even need to be subtle, or even plausible. Marketing and politics have discovered that the 5% of intellectuals who get outraged do not carry much weight. That the rest of the population asks for only one thing: to be lied to!

The people who put electronic voting in place are dishonest people who hope to profit from it, without realizing that their opponents can do the same.

Or else they are complete idiots.

But the two are not incompatible.

In any case, they are the gravediggers of democracy.

It is almost certain that Trump cheated his way to being elected by hacking the voting systems.

I was convinced of it well before the election. Why? Quite simply because George W. Bush did it before him in 2000, and it worked very well. The CEO of Diebold, the company in charge of building the voting machines, had moreover declared at the time that he would do everything in his power to get George W. Bush elected.

George W. Bush actually lost that election. On absolutely every criterion. But Al Gore preferred to concede a mathematically impossible defeat in Florida to avoid outbreaks of violence.

Cheating and violence allowed George W. Bush to become president instead of Al Gore, which consecrated that electoral strategy and durably shaped the entire world.

Thanks to electronic voting and to Trump's connections in Silicon Valley (Elon Musk, Peter Thiel, …), Trump had the means to cheat very easily.

So the question is not whether he did it, but "What would have stopped him from doing it?"

Answer: nothing.

Do not ask yourself whether it is likely that Trump cheated; ask instead whether it is likely that he did not.

To sum up: Trump's team clearly had the means to hack the electronic vote. It had the necessary data (remember Elon Musk offering a million dollars in a lottery in exchange for voters' personal data). And Trump obtained a result that is statistically incredibly improbable: winning all seven swing states by winning just the most contested counties, with just enough margin to avoid a recount, and with between 5% and 7% of "bullet votes" (ballots cast for Trump alone, without voting in the other races) whereas the norm for "bullet votes" is between… 0.05% and 1% in extreme cases (which is what is observed in the less contested counties). All while getting exactly the same number of votes as in the 2020 election.

EDIT: after recounts, it seems the "bullet votes" argument does not entirely hold up. The substance does not change, but any cheating, if proven, would not be as statistically obvious. See the follow-up article written after this one.

But you know what?

It changes nothing. Because since Al Gore, we have known that the Republicans cheat outrageously and that the Democrats, to get elected, must not merely win the election: they must win it by such a margin that even the cheating is not enough. That does not mean the Democrats do not cheat. Just that they do it less well, or with a certain restraint.

Deception and the threat of violence govern. Meanwhile, the vaguely more progressive/humanist politicians lose elections while trying to gain followers on proprietary social networks totally controlled by their sworn enemy. They may be less dishonest, but they are complete fools.

Don't come saying you weren't warned. It's just that you thought you weren't concerned.

January 17, 2025

As in previous years, some small rooms will be available for Birds of a Feather sessions. The concept is simple: Any project or community can reserve a timeslot (30 minutes or 1 hour) during which they have the room just to themselves. These rooms are intended for ad-hoc discussions, meet-ups or brainstorming sessions. They are not a replacement for a developer room and they are certainly not intended for talks. Schedules: BOF Track A, BOF Track B, BOF Track C. To apply for a BOF session, enter your proposal at https://fosdem.org/submit. Select any of the BOF tracks and mention in…

January 16, 2025

With FOSDEM just a few days away, it is time for us to enlist your help. Every year, an enthusiastic band of volunteers make FOSDEM happen and make it a fun and safe place for all our attendees. We could not do this without you. This year we again need as many hands as possible, especially for heralding during the conference, during the buildup (starting Friday at noon) and teardown (Sunday evening). No need to worry about missing lunch at the weekend, food will be provided. Would you like to be part of the team that makes FOSDEM tick?
If your non-geek partner and/or kids are joining you at FOSDEM, they may be interested in spending some time exploring Brussels while you attend the conference. As in previous years, FOSDEM is organising sightseeing tours. UPDATE: The tour is now fully booked.

January 15, 2025

We were made aware of planned protests during the upcoming FOSDEM 2025 in response to a scheduled talk which is causing controversy. The talk in question is claimed to be on the schedule for sponsorship reasons; additionally, some of the speakers scheduled to speak during this talk are controversial to some of our attendees. To be clear, in our 25 year history, we have always had the hard rule that sponsorship does not give you preferential treatment for talk selection; this policy has always applied, it applied in this particular case, and it will continue to apply in the future.
Graphic with the text "Drupal CMS 1.0 released" next to the Drupal logo in bold colors.

We did it: Drupal CMS 1.0 is here! 🎉

Eight months ago, I challenged our community to make Drupal easier for marketers, content creators, and site builders. Today, on Drupal's 24th birthday, we're making history with the launch of Drupal CMS 1.0.

With this release, you now have two ways to build with Drupal:

  • Drupal Core serves expert users and developers who want complete control over their websites. It provides a blank canvas for building websites and has been the foundation behind millions of websites since Drupal began 24 years ago.
  • Drupal CMS is a ready-to-use platform for marketing teams, content creators and site builders built on Drupal 11 core. When you install Drupal CMS, you get a set of out-of-the-box tools such as advanced media management, SEO tools, AI-driven website building, consent management, analytics, search, automatic updates and more.

To celebrate this milestone, more than 60 launch parties are happening around the world today! These celebrations highlight one of Drupal's greatest strengths: a worldwide community that builds and innovates together.

If you want to try Drupal CMS, you can start a free trial today at https://www.drupal.org/drupal-cms/trial.

Built for ambitious marketers

Drupal CMS targets organizations with ambitious digital goals, particularly in mid-market and enterprise settings. The platform provides a robust foundation that adapts and scales with evolving needs.

Organizations often hit a growth ceiling with non-Drupal CMS platforms. What starts as a simple website becomes a constraint as needs expand. Take privacy and consent management as an example: while these features are now essential due to GDPR, CCPA, and growing privacy concerns, most CMS platforms don't offer them out of the box. This forces organizations to create patchwork solutions.

Drupal CMS addresses this by including privacy and consent management tools by default. This not only simplifies setup but also sets a new standard for CMS platforms, promoting a better Open Web – one that prioritizes user privacy while helping organizations meet regulatory requirements.

Recipes for success

The privacy and consent management feature is just one of many 'recipes' available in Drupal CMS. Recipes are pre-configured packages of features, like blogs, events, or case studies, that simplify and speed up site-building. Each recipe automatically installs the necessary modules, sets up content types, and applies configurations, reducing manual setup.

This streamlined approach not only makes Drupal more accessible for beginners but also more efficient for experienced developers. Drupal CMS 1.0 launches with nearly 30 recipes included, many of which are applied by default to provide common functionality that most sites require. Recipes not applied by default are available as optional add-ons and can be applied either during setup or later through the new Project Browser. More recipes are already in development, with plans to release new versions of Drupal CMS throughout the year, each adding fresh recipes.

Screenshot of the Drupal CMS installer showing some recipes enabled and others disabled. The Drupal CMS installer lets users choose from predefined 'recipes' like blog, events, case studies and more. Each recipe automatically downloads the required modules, sets up preconfigured content types, and applies the necessary configurations.

Pioneering the future, again

Drupal CMS not only reduces costs and accelerates time to value with recipes but also stands out with innovative features like AI agents designed specifically for site building. While many platforms use AI primarily for content creation, our AI agents go further by enabling advanced tasks such as creating custom content types, configuring taxonomies, and more.

This kind of innovation really connects to Drupal's roots. In its early days, Drupal earned its reputation as a forward-thinking, innovative CMS. We helped pioneer the assembled web (now called 'composable') and contributed to the foundation of Web 2.0, shipping with features like blogging, RSS, and commenting long before the term Web 2.0 existed. Although it happened long ago and many may not remember, Drupal was the first CMS to adopt jQuery. This move played a key role in popularizing jQuery and establishing it as a cornerstone of web development.

Curious about what Drupal CMS' AI agents can do? Watch Ivan Zugec's video for a hands-on demonstration of how these tools simplify site-building tasks – even for expert developers.

We don't know exactly where AI agents will take us, but I'm excited to explore, learn, and grow. It feels like the early days when we experimented and boldly ventured into the unknown.

Changing perceptions and reaching more users

Drupal has often been seen as complex, but Drupal CMS is designed to change that. Still, we know that simply creating a more user-friendly and easier-to-maintain product isn't enough. After 24 years, many people still hold outdated perceptions shaped by experiences from over a decade ago.

Changing those perceptions takes time and deliberate effort. That is why the Drupal CMS initiative is focused not just on building software but also on repositioning and marketing Drupal in a way that highlights how much it has evolved.

Screenshot of the Drupal.org homepage showcasing the updated brand with the tagline 'Create ambitious digital experiences,' presented in a bold and vibrant design. The new Drupal.org features a refreshed brand and updated messaging, positioning Drupal as a modern, composable CMS.

To make this happen, we've refreshed our brand and started reworking Drupal.org with the help of the Drupal Association and our Drupal Certified Partners. The updated brand feels fresher, more modern, and more appealing to a larger audience.

For the first time, the Drupal Association has hired two full-time product marketers to help communicate our message.

Our goal is clear: to help people move past outdated perceptions and see Drupal for what it truly is – a powerful, modern platform for building websites that is becoming more user-friendly, as well as more affordable to use and maintain.

Achieving bold ambitions through collaboration

Launching the Drupal CMS initiative was bold and ambitious, requiring extraordinary effort from our community – and they truly stepped up. It was ambitious because this initiative has been about much more than building a second version of Drupal. It's been a focused and comprehensive effort to expand our market, modernize our brand, accelerate innovation, expand our marketing, and reimagine our partner ecosystem.

When I announced Drupal Starshot and Drupal CMS just 8 months ago, I remember turning to the team and asking, "How exactly are we going to pull this off?". We had a lot to figure out – from building a team to setting goals and mapping a path forward. It was a mix of uncertainty, determination, and maybe a touch of "What have we gotten ourselves into?".

A key success factor has been fostering closer collaboration among contributors, agency partners, Drupal Core Committers, Drupal Association staff, and the Drupal Association Board of Directors. This stronger alignment didn't happen by chance; it's the result of thoughtfully structured meetings and governance changes that brought everyone closer together.

After just 8 months, the results speak for themselves. Drupal CMS has significantly increased the pace of innovation and the level of contributions to Drupal. It's a testament to what we can achieve when we work together. We've seen a 40% increase in contributor activity since the initiative launch, with over 2,000 commits from more than 300 contributors.

Bar chart showing a steady increase in organization credits for strategic initiatives from 2019 to 2024, with a significant jump in 2024. Drupal CMS has been a powerful catalyst for accelerating innovation and collaboration. Since development began in 2024, contributions have soared. Organization credits for strategic initiatives grew by 44% compared to 2023, with individual contributions increasing by 37%. The number of unique contributors rose by 12.5%, and participating organizations grew by 11.3%.

The initiative required me to make a significant time commitment I hadn't anticipated at the start of 2024 – but it's an experience I'm deeply grateful for. The Drupal CMS leadership team met at least twice a week, often more, to tackle challenges head-on. Similarly, I had weekly meetings with the Drupal Association.

Along the way we developed new working principles. One key principle was to solve end-user problems first, focusing on what marketers truly need rather than trying to account for every edge case. Another was prioritizing speed over process, enabling us to innovate and adapt quickly. These principles are still evolving, and now that the release is behind us, I'm eager to refine them further with the team.

The work we did together was intense, energizing, and occasionally filled with uncertainty about meeting our deadlines. We built strong bonds, learned to make quick, effective decisions, and maintained forward momentum. This experience has left me feeling more connected than ever to our shared mission.

The Drupal CMS roadmap for 2025

As exciting as this achievement is, some might ask if we've accomplished everything we set out to do. The answer is both yes and no. We've exceeded my expectations in collaboration and innovation, making incredible progress. But there is still much to do. In many ways, we're just getting started. We're less than one-third of the way through our three-year product strategy.

With Drupal CMS 1.0 released, 2025 is off to a strong start. Our roadmap for 2025 is clear: we'll launch Experience Builder 1.0, roll out more out-of-the-box recipes for marketers, improve our documentation, roll out our new brand to more parts of Drupal.org, and push forward with innovative experiments.

Each step brings us closer to our goal: modernizing Drupal and making Drupal the go-to platform for marketers and developers who want to build ambitious digital experiences — all while championing the Open Web.


Thank you, Drupal community

We built Drupal CMS in a truly open source way – collaboratively, transparently, and driven by community contributions – proving once again that open source is the best way to build software.

The success of Drupal CMS 1.0 reflects the work of countless contributors. I'm especially grateful to these key contributors and their organizations (listed alphabetically): Jamie Abrahams (FreelyGive), Gareth Alexander (Zoocha), Martin Anderson-Clutz (Acquia), Tony Barker (Annertech), Pamela Barone (Technocrat), Addison Berry (Drupalize.me), Jim Birch (Kanopi Studios), Baddy Breidert (1xINTERNET), Christoph Breidert (1xINTERNET), Nathaniel Catchpole (Third and Grove / Tag1 Consulting), Cristina Chumillas (Lullabot), Suzanne Dergacheva (Evolving Web), Artem Dmitriiev (1xINTERNET), John Doyle (Digital Polygon), Tim Doyle (Drupal Association), Sascha Eggenberger (Gitlab), Dharizza Espinach (Evolving Web), Tiffany Farriss (Palantir.net), Matthew Grasmick (Acquia), Adam Globus-Hoenich (Acquia), Jürgen Haas (LakeDrops), Mike Herchel (DripYard), J. Hogue (Oomph, Inc), Gábor Hojtsy (Acquia), Emma Horrell (University of Edinburgh), Marcus Johansson (FreelyGive), Nick Koger (Drupal Association), Tim Lehnen (Drupal Association), Pablo López Escobés (Lullabot), Christian López Espínola (Lullabot), Leah Magee (Acquia), Amber Matz (Drupalize.me), Lenny Moskalyk (Drupal Association), Lewis Nyman, Matt Olivera (Lullabot), Shawn Perritt (Acquia), Megh Plunkett (Lullabot), Tim Plunkett (Acquia), Kristen Pol (Salsa Digital), Joe Shindelar (Drupalize.me), Lauri Timmanee (Acquia), Matthew Tift (Lullabot), Laurens Van Damme (Dropsolid), Ryan Witcombe (Drupal Association), Jen Witowski (Lullabot).

I also want to recognize our Marketing Committee, the Core Committers, the Drupal Association Board of Directors, and the Drupal Starshot Advisory Council, whose guidance and strategic input shaped this initiative along the way.

While I've highlighted some contributors here, I know there are hundreds more who shaped Drupal CMS 1.0 through their code, testing, UX work, feedback, advocacy and more. Each contribution, big or small, moved us forward. To everyone who helped build this milestone: THANK YOU!

January 12, 2025

One of the topics that most financial institutions are (still) working on is their compliance with a European legislation called DORA. This abbreviation, which stands for "Digital Operational Resilience Act", is a European regulation. European regulations apply automatically and uniformly across all EU countries. This is unlike another recent piece of legislation called NIS2, the "Network and Information Security" directive. As an EU directive, NIS2 requires the EU countries to transpose the directive into local law. As a result, different EU countries can have slightly different implementations.

The DORA regulation applies to the EU financial sector and contains some strict requirements that affect companies' IT stakeholders. It doesn't often sugar-coat things the way some frameworks do. This has the advantage that its "interpretation flexibility" is quite reduced - though not zero, of course. Yet that advantage is also a disadvantage: financial entities might have had different strategies for covering their resilience, and now need to adjust their strategy.

January 09, 2025

The preFOSDEM MySQL Belgian Days 2025 will occur at the usual place (ICAB Incubator, Belgium, 1040 Bruxelles) on Thursday, January 30th, and Friday, January 31st, just before FOSDEM. Again this year, we will have the chance to have incredible sessions from our Community and the opportunity to meet some MySQL Engineers from Oracle. DimK will […]

To our valued customers, partners, and the Drupal community.

I'm excited to share an important update about my role at Acquia, the company I co-founded 17 years ago. I'm transitioning from my operational roles as Chief Technology Officer (CTO) and Chief Strategy Officer (CSO) to become Executive Chairman. In this new role, I'll remain an Acquia employee, collaborating with Steve Reny (our CEO), our Board of Directors, and our leadership team on company strategy, product vision, and M&A.

This change comes at the right time for both Acquia and me. Acquia is stronger than ever, investing more in Drupal and innovation than at any point in our history. I made this decision so I can rebalance my time and focus on what matters most to me. I'm looking forward to spending more time with family and friends, as well as pursuing personal passions (including more blogging).

This change does not affect my commitment to Drupal or my role in the project. I will continue to lead the Drupal Project, helping to drive Drupal CMS, Drupal Core, and the Drupal Association.

Six months ago, I already chose to dedicate more of my time to Drupal. The progress we've made is remarkable. The energy in the Drupal community today is inspiring, and I'm amazed by how far we've come with Drupal Starshot. I'm truly excited to continue our work together.

Thank you for your continued trust and support!

January 07, 2025

FOSDEM Junior is a collaboration between FOSDEM, Code Club, CoderDojo, developers, and volunteers to organize workshops and activities for children during the FOSDEM weekend. These activities are for children to learn and get inspired about technology. This year’s activities include microcontrollers, embroidery, game development, music, and mobile application development. Last year we organized the first edition of FOSDEM Junior. We are pleased to announce that we will be back this year. Registration for individual workshops is required. Links can be found on the page of each activity. The full schedule can be viewed at the junior track schedule page. You…

January 02, 2025

2024 brought a mix of work travel and memorable adventures, taking me to 13 countries across four continents — including the ones I call home. With 39 flights and 90 nights spent in hotels and rentals (about 25% of the year), it was a year marked by movement and new experiences.

Activity Count
🌍 Countries visited 13
✈️ Flights taken 39
🚕 Taxi rides 158
🍽️ Restaurant visits 175
☕️ Coffee shop visits 44
🍺 Bar visits 31
🏨 Days at hotel or rentals 90
⛺️ Days camping 12

Countries visited:

  • Australia
  • Belgium
  • Bulgaria
  • Canada
  • Cayman Islands
  • France
  • Japan
  • Netherlands
  • Singapore
  • South Korea
  • Spain
  • United Kingdom
  • United States

January 01, 2025

2025 = (20 + 25)²

2025 = 45²

2025 = 1³+2³+3³+4³+5³+6³+7³+8³+9³

2025 = (1+2+3+4+5+6+7+8+9)²

2025 = 1+3+5+7+9+11+...+89

2025 = 9² x 5²

2025 = 40² + 20² + 5²

December 27, 2024

At work, I've been maintaining a perl script that needs to run a number of steps as part of a release workflow.

Initially, that script was very simple, but over time it has grown to do a number of things. And then some of those things did not need to be run all the time. And then we wanted to do this one exceptional thing for this one case. And so on; eventually the script became a big mess of configuration options and unreadable flow, and so I decided that I wanted it to be more configurable. I sat down and spent some time on this, and eventually came up with what I now realize is a domain-specific language (DSL) in JSON, implemented by creating objects in Moose, extensible by writing more object classes.

Let me explain how it works.

In order to explain, however, I need to explain some perl and Moose basics first. If you already know all that, you can safely skip ahead past the "Preliminaries" section that's next.

Preliminaries

Moose object creation, references.

In Moose, creating a class is done something like this:

package Foo;

use v5.40;
use Moose;

has 'attribute' => (
    is  => 'ro',
    isa => 'Str',
    required => 1
);

sub say_something {
    my $self = shift;
    say "Hello there, our attribute is " . $self->attribute;
}

The above is a class that has a single attribute called attribute. To create an object, you use the Moose constructor on the class, and pass it the attributes you want:

use v5.40;
use Foo;

my $foo = Foo->new(attribute => "foo");

$foo->say_something;

(output: Hello there, our attribute is foo)

This creates a new object with the attribute attribute set to foo. The attribute accessor is a method generated by Moose, which functions both as a getter and a setter (though in this particular case we made the attribute "ro", meaning read-only, so while it can be set at object creation time it cannot be changed afterwards). So yay, an object.

And it has methods, things that we set ourselves. Basic OO, all that.
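As a quick sketch of what that read-only behaviour means in practice, trying to change the attribute through the generated accessor after construction simply dies:

use v5.40;
use Foo;

my $foo = Foo->new(attribute => "foo");

# the generated accessor is a getter only; assigning through it dies
eval { $foo->attribute("something else"); 1 }
    or say "changing the attribute failed: $@";

say $foo->attribute;    # still prints "foo"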

One of the peculiarities of perl is its concept of "lists". Not to be confused with the lists of python -- a concept that is called "arrays" in perl and is somewhat different -- in perl, lists are enumerations of values. They can be used as initializers for arrays or hashes, and they are used as arguments to subroutines. Lists cannot be nested; whenever a hash or array is passed in a list, the list is "flattened", that is, it becomes one big list.

This means that the below script is functionally equivalent to the above script that uses our "Foo" object:

use v5.40;
use Foo;

my %args;

$args{attribute} = "foo";

my $foo = Foo->new(%args);

$foo->say_something;

(output: Hello there, our attribute is foo)

This creates a hash %args wherein we set the attributes that we want to pass to our constructor. We set one attribute in %args, the one called attribute, and then use %args and rely on list flattening to create the object with the same attribute set (list flattening turns a hash into a list of key-value pairs).

Perl also has a concept of "references". These are scalar values that point to other values; the other value can be a hash, a list, or another scalar. There is syntax to create a non-scalar value at assignment time, called anonymous references, which is useful when one wants to remember non-scoped values. By default, references are not flattened, and this is what allows you to create multidimensional values in perl; however, it is possible to request list flattening by dereferencing the reference. The below example, again functionally equivalent to the previous two examples, demonstrates this:

use v5.40;
use Foo;

my $args = {};

$args->{attribute} = "foo";

my $foo = Foo->new(%$args);

$foo->say_something;

(output: Hello there, our attribute is foo)

This creates a scalar $args, which is a reference to an anonymous hash. Then, we set the key attribute of that anonymous hash to "foo" (note the use of the arrow operator here, which indicates that we want to dereference a reference to a hash), and create the object using that reference, requesting hash dereferencing and flattening by using a double sigil, %$.

As a side note, objects in perl are references too, hence the fact that we have to use the dereferencing arrow to access the attributes and methods of Moose objects.

Moose attributes don't have to be strings or even simple scalars. They can also be references to hashes or arrays, or even other objects:

package Bar;

use v5.40;
use Moose;

extends 'Foo';

has 'hash_attribute' => (
    is => 'ro',
    isa => 'HashRef[Str]',
    predicate => 'has_hash_attribute',
);

has 'object_attribute' => (
    is => 'ro',
    isa => 'Foo',
    predicate => 'has_object_attribute',
);

sub say_something {
    my $self = shift;

    if($self->has_object_attribute) {
        $self->object_attribute->say_something;
    }

    $self->SUPER::say_something unless $self->has_hash_attribute;

    say "We have a hash attribute!"
}

This creates a subclass of Foo called Bar that has a hash attribute called hash_attribute, and an object attribute called object_attribute. Both of them are references; one to a hash, the other to an object. The hash ref is further limited in that it requires that each value in the hash must be a string (this is optional but can occasionally be useful), and the object ref in that it must refer to an object of the class Foo, or any of its subclasses.

The predicates used here are extra subroutines that Moose provides if you ask for them, and which allow you to see if an object's attribute has a value or not.
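For instance, a quick sketch of calling these predicates directly, with only the hash attribute set:

use v5.40;
use Bar;

my $bar = Bar->new(attribute => "bar", hash_attribute => { greeting => "hello" });

# the predicates simply report whether an attribute was given a value
say $bar->has_hash_attribute   ? "hash_attribute is set"   : "hash_attribute is not set";
say $bar->has_object_attribute ? "object_attribute is set" : "object_attribute is not set";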

The example script would use an object like this:

use v5.40;
use Bar;

my $foo = Foo->new(attribute => "foo");

my $bar = Bar->new(object_attribute => $foo, attribute => "bar");

$bar->say_something;

(output:
Hello there, our attribute is foo
Hello there, our attribute is bar)

This example also shows object inheritance, and methods implemented in child classes.

Okay, that's it for perl and Moose basics. On to...

Moose Coercion

Moose has a concept of "value coercion". Value coercion allows you to tell Moose that if it sees one thing but expects another, it should convert it using a supplied subroutine before assigning the value.

That sounds a bit dense without an example, so let me show you how it works. Reimagining the Bar package, we could use coercion to eliminate one object creation step from the creation of a Bar object:

package "Bar";

use v5.40;

use Moose;
use Moose::Util::TypeConstraints;

extends "Foo";

coerce "Foo",
    from "HashRef",
    via { Foo->new(%$_) };

has 'hash_attribute' => (
    is => 'ro',
    isa => 'HashRef',
    predicate => 'has_hash_attribute',
);

has 'object_attribute' => (
    is => 'ro',
    isa => 'Foo',
    coerce => 1,
    predicate => 'has_object_attribute',
);

sub say_something {
    my $self = shift;

    if($self->has_object_attribute) {
        $self->object_attribute->say_something;
    }

    $self->SUPER::say_something unless $self->has_hash_attribute;

    say "We have a hash attribute!"
}

Okay, let's unpack that a bit.

First, we add the Moose::Util::TypeConstraints module to our package. This is required to declare coercions.

Then, we declare a coercion to tell Moose how to convert a HashRef to a Foo object: by using the Foo constructor on a flattened list created from the hashref that it is given.

Then, we update the definition of the object_attribute to say that it should use coercions. This is not the default, because going through the list of coercions to find the right one has a performance penalty, so if the coercion is not requested then we do not do it.

This allows us to simplify declarations. With the updated Bar class, we can simplify our example script to this:

use v5.40;

use Bar;

my $bar = Bar->new(attribute => "bar", object_attribute => { attribute => "foo" });

$bar->say_something

(output:
Hello there, our attribute is foo
Hello there, our attribute is bar)

Here, the coercion kicks in because the value passed as object_attribute, which is supposed to be an object of class Foo, is instead a hash ref. Without the coercion, this would produce an error message saying that the type of the object_attribute attribute is not a Foo object. With the coercion, however, the value that we pass to object_attribute is passed to a Foo constructor using list flattening, and then the resulting Foo object is assigned to the object_attribute attribute.

Coercion works for more complicated things, too; for instance, you can use coercion to coerce an array of hashes into an array of objects, by creating a subtype first:

package MyCoercions;
use v5.40;

use Moose;
use Moose::Util::TypeConstraints;

use Foo;

subtype "ArrayOfFoo", as "ArrayRef[Foo]";
subtype "ArrayOfHashes", as "ArrayRef[HashRef]";

coerce "ArrayOfFoo", from "ArrayOfHashes", via { [ map { Foo->create(%$_) } @{$_} ] };

Ick. That's a bit more complex.

What happens here is that we use the map function to iterate over a list of values.

The given list of values is @{$_}, which is perl for "dereference the default value as an array reference, and flatten the list of values in that array reference".

So the ArrayRef of HashRefs is dereferenced and flattened, and each HashRef in the ArrayRef is passed to the map function.

The map function then takes each hash ref in turn and passes it to the block of code that it is also given. In this case, that block is { Foo->create(%$_) }. In other words, we invoke the create factory method with the flattened hashref as an argument. This returns an object of the correct implementation (assuming our hash ref has a type attribute set), with all of its attributes set to the correct values. That value is then returned from the block (this could be made more explicit with a return call, but that is optional: perl defaults a block's return value to the value of its last expression).

The map function then returns a list of all the created objects, which we capture in an anonymous array ref (the [] square brackets), i.e., an ArrayRef of Foo objects, satisfying the Moose requirement of ArrayRef[Foo].

Usually, I tend to put my coercions in a special-purpose package. Although it is not strictly required by Moose, I find it useful to do this, because Moose does not allow a coercion to be defined if a coercion for the same type has already been defined in a different package. And while it is theoretically possible to make sure you only ever declare a coercion once in your entire codebase, I find it easier to keep track of if I put all my coercions in a specific package.

Okay, now you understand Moose object coercion! On to...

Dynamic module loading

Perl allows loading modules at runtime. In the most simple case, you just use require inside a stringy eval:

my $module = "Foo";
eval "require $module";

This loads "Foo" at runtime. Obviously, the $module string could be a computed value, it does not have to be hardcoded.

There are some obvious downsides to doing things this way, mostly in the fact that a computed value can basically be anything, so without proper checks this can quickly become an arbitrary code execution vulnerability. As such, there are a number of distributions on CPAN to help you with the low-level work of figuring out what the possible modules are, and how to load them.
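For instance, a minimal sketch of such a check (with an illustrative, hard-coded whitelist) could be as simple as validating the module name before the eval:

use v5.40;

# illustrative whitelist: only modules we know about may be loaded dynamically
my %allowed = map { $_ => 1 } qw(Foo);

my $module = "Foo";    # in practice this would be a computed value
die "refusing to load unknown module '$module'" unless $allowed{$module};

eval "require $module" or die "could not load $module: $@";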

For the purposes of my script, I used Module::Pluggable. Its API is fairly simple and straightforward:

package Foo;

use v5.40;
use Moose;

use Module::Pluggable require => 1;

has 'attribute' => (
    is => 'ro',
    isa => 'Str',
);

has 'type' => (
    is => 'ro',
    isa => 'Str',
    required => 1,
);

sub handles_type {
    return 0;
}

sub create {
    my $class = shift;
    my %data = @_;

    foreach my $impl($class->plugins) {
        if($impl->can("handles_type") && $impl->handles_type($data{type})) {
            return $impl->new(%data);
        }
    }
    die "could not find a plugin for type " . $data{type};
}

sub say_something {
    my $self = shift;
    say "Hello there, I am a " . $self->type;
}

The new concept here is the plugins class method, which is added by Module::Pluggable, and which searches perl's library paths for all modules that are in our namespace. The namespace is configurable, but by default it is the name of our module; so in the above example, if there were a package "Foo::Bar" which

  • has a subroutine handles_type, and
  • that subroutine returns a truthy value when passed the value of the type key in the hash given to the create subroutine,

then the create subroutine creates a new object of that class, with the passed key/value pairs used as attribute initializers.

Let's implement a Foo::Bar package:

package Foo::Bar;

use v5.40;
use Moose;

extends 'Foo';

has 'type' => (
    is => 'ro',
    isa => 'Str',
    required => 1,
);

has 'serves_drinks' => (
    is => 'ro',
    isa => 'Bool',
    default => 0,
);

sub handles_type {
    my $class = shift;
    my $type = shift;

    return $type eq "bar";
}

sub say_something {
    my $self = shift;
    $self->SUPER::say_something;
    say "I serve drinks!" if $self->serves_drinks;
}

We can now indirectly use the Foo::Bar package in our script:

use v5.40;
use Foo;

my $obj = Foo->create(type => "bar", serves_drinks => 1);

$obj->say_something;

output:

Hello there, I am a bar
I serve drinks!

Okay, now you understand all the bits and pieces that are needed to understand how I created the DSL engine. On to...

Putting it all together

We're actually quite close already. The create factory method in the last version of our Foo package allows us to decide at run time which module to instantiate an object of, and to load that module at run time. We can use coercion and list flattening to turn a reference to a hash into an object of the correct type.

We haven't looked yet at how to turn a JSON data structure into a hash, but that bit is actually ridiculously trivial:

use JSON::MaybeXS;

my $data = decode_json($json_string);

Tada, now $data is a reference to a deserialized version of the JSON string: if the JSON string contained an object, $data is a hashref; if the JSON string contained an array, $data is an arrayref, etc.

So, in other words, to create an extensible JSON-based DSL that is implemented by Moose objects, all we need to do is create a system that

  • takes hash refs to set arguments
  • has factory methods to create objects, which

    • uses Module::Pluggable to find the available object classes, and
    • uses the type attribute to figure out which object class to use to create the object
  • uses coercion to convert hash refs into objects using these factory methods

In practice, we could have a JSON file with the following structure:

{
    "description": "do stuff",
    "actions": [
        {
            "type": "bar",
            "serves_drinks": true,
        },
        {
            "type": "bar",
            "serves_drinks": false,
        }
    ]
}

... and then we could have a Moose object definition like this:

package MyDSL;

use v5.40;
use Moose;

use MyCoercions;

has "description" => (
    is => 'ro',
    isa => 'Str',
);

has 'actions' => (
    is => 'ro',
    isa => 'ArrayOfFoo',
    coerce => 1,
    required => 1,
);

sub say_something {
    my $self = shift;

    say "Hello there, I am described as " . $self->description . " and I am performing my actions: ";

    foreach my $action(@{$self->actions}) {
        $action->say_something;
    }
}

Now, we can write a script that loads this JSON file and creates a new object using the flattened arguments:

use v5.40;
use MyDSL;
use JSON::MaybeXS;

my $input_file_name = shift;

my $args = do {
    local $/ = undef;

    open my $input_fh, "<", $input_file_name or die "could not open file";
    <$input_fh>;
};

$args = decode_json($args);

my $dsl = MyDSL->new(%$args);

$dsl->say_something

Output:

Hello there, I am described as do stuff and I am performing my actions:
Hello there, I am a bar
I serve drinks!
Hello there, I am a bar

In some more detail, this will:

  • Read the JSON file and deserialize it;
  • Pass the object keys in the JSON file as arguments to a constructor of the MyDSL class;
  • The MyDSL class then uses those arguments to set its attributes, using Moose coercion to convert the "actions" array of hashes into an array of Foo::Bar objects.
  • Perform the say_something method on the MyDSL object

Once this is written, extending the scheme to also support a "quux" type simply requires writing a Foo::Quux class, making sure it has a method handles_type that returns a truthy value when called with quux as the argument, and installing it into the perl library path. This is rather easy to do.

It can even be extended deeper, too; if the quux type requires a list of arguments rather than just a single argument, it could itself also have an array attribute with relevant coercions. These coercions could then be used to convert the list of arguments into an array of objects of the correct type, using the same schema as above.
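As a sketch of what that could look like (the items attribute is made up for this example, reusing the ArrayOfFoo subtype and coercion from the MyCoercions package above):

package Foo::Quux;

use v5.40;
use Moose;

use MyCoercions;

extends 'Foo';

# made-up attribute for this example: a list of sub-actions, each
# coerced from a hash ref into the right object via the create factory
has 'items' => (
    is => 'ro',
    isa => 'ArrayOfFoo',
    coerce => 1,
    default => sub { [] },
);

sub handles_type {
    my $class = shift;
    my $type = shift;

    return $type eq "quux";
}

sub say_something {
    my $self = shift;
    $self->SUPER::say_something;
    $_->say_something foreach @{$self->items};
}

Install this next to Foo::Bar in the perl library path, and a JSON action with "type": "quux" (optionally carrying a nested "items" array of hashes) is picked up by the create factory method without touching any of the existing code.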

The actual DSL is of course somewhat more complex, and also actually does something useful, in contrast to the DSL that we define here which just says things.

Creating an object that actually performs some action when required is left as an exercise to the reader.

December 20, 2024

Let’s stay a bit longer with MySQL 3.2x to advance the MySQL Retrospective in anticipation of the 30th Anniversary. The idea of this article was suggested to me by Daniël van Eeden. Did you know that in the early days, and therefore still in MySQL 3.20, MySQL used the ISAM storage format? IBM introduced the […]

December 18, 2024

To further advance the MySQL Retrospective in anticipation of the 30th Anniversary, today, let’s discuss the very first version of MySQL that became available to a wide audience through the popular InfoMagic distribution: MySQL 3.20! In 1997, InfoMagic incorporated MySQL 3.20 as part of the RedHat Contrib CD-ROM (MySQL 3.20.25). Additionally, version 3.20.13-beta was also […]

December 08, 2024

It’s been 2 years since AOPro was launched and a lot has happened in that time; bugs were squashed, improvements were made and some great features were added. Taking that into account on one hand and increasing costs from suppliers on the other: prices will see a smallish increase as from 2025 (exact amounts still to be determined) But rest assured; if you already signed up, you will continue to…

Source

November 26, 2024

The deadline for talk submissions is rapidly approaching! If you are interested in talking at FOSDEM this year (yes, I'm talking to you!), it's time to polish off and submit those proposals in the next few days before the 1st: Devrooms: follow the instructions in each cfp listed here Main tracks: for topics which are more general or don't fit in a devroom, select 'Main' as the track here Lightning talks: for short talks (15 minutes) on a wide range of topics, select 'Lightning Talks' as the track here For more details, refer to the previous post.

November 25, 2024

Last month we released MySQL 9.1, the latest Innovation Release. Of course, we released bug fixes for 8.0 and 8.4 LTS but in this post, I focus on the newest release. Within these releases, we included patches and code received by our amazing Community. Here is the list of contributions we processed and included in […]

November 15, 2024

With great pleasure we can announce that the following projects will have a stand at FOSDEM 2025 (1st & 2nd February). This is the list of stands (in alphabetic order): 0 A.D. Empires Ascendant AlekSIS and Teckids AlmaLinux OS CalyxOS Ceph Chamilo CISO Assistant Cloud Native Computing Foundation (CNCF) Codeberg and Forgejo coreboot / flashprog / EDKII / OpenBMC Debian DeepComputing's DC-ROMA RISC-V Mainboard with Framework Laptop 13 DevPod Digital Public Goods Dolibarr ERP CRM Drupal Eclipse Foundation Fedora Project FerretDB Firefly Zero FOSSASIA Free Software Foundation Europe FreeBSD Project FreeCAD and KiCAD Furi Labs Gentoo Linux & Flatcar…

November 01, 2024

Dear WordPress friends in the USA: I hope you vote and when you do, I hope you vote for respect. The world worriedly awaits your collective verdict, as do I. Peace! Watch this video on YouTube.

Source

October 29, 2024

As announced yesterday, the MySQL Devroom is back at FOSDEM! For people preparing for their travel to Belgium, we want to announce that the MySQL Belgian Days fringe event will be held on the Thursday and Friday before FOSDEM. This event will take place on January 30th and 31st, 2025, in Brussels at the usual […]

October 28, 2024

We are pleased to announce the Call for Participation (CfP) for the FOSDEM 2025 MySQL Devroom. The Devroom will be held on February 2 (Sunday), 2025 in Brussels, Belgium. The submission deadline for talk proposals is December 1, 2024. FOSDEM is a free event for software developers to meet, share ideas, and collaborate. Every year, […]

October 05, 2024

Cover Ember Knights

Proton is a compatibility layer for Windows games to run on Linux. Running a Windows game is mostly just a matter of hitting the Play button within Steam. It’s so good that many games now run faster on Linux than on native Windows. That’s what makes the Steam Deck the best gaming handheld of the moment.

But a compatibility layer is still a layer, so you may encounter … incompatibilities. Ember Knights is a lovely game with fun co-op multiplayer support. It runs perfectly on the (Linux-based) Steam Deck, but on my Ubuntu laptop I encountered long loading times (startup took 5 minutes and loading between worlds was slow), though once the game was loaded it ran fine.

Debugging the game revealed that there were lots of EAGAIN errors while the game was trying to access the system clock. Changing the number of allowed open files fixed the problem for me.

Add this to the end of the following files:

  • in /etc/security/limits.conf:
* hard nofile 1048576
  • in /etc/systemd/system.conf and /etc/systemd/user.conf:
DefaultLimitNOFILE=1048576 

Reboot.

Cover In Game

“The Witcher 3: Wild Hunt” is considered to be one of the greatest video games of all time. I certainly agree with that sentiment.

At its core, The Witcher 3 is an action role-playing game with a third-person perspective in a huge open world. You develop your character while the story advances. At the same time you can freely roam and explore as much as you like. The main story is captivating and the world is filled with side quests and lots of interesting people. Fun for at least 200 hours, if you’re the exploring kind. If you’re not, the base game (without DLCs) will still take you 50 hours to finish.

While similar to other great games like Nintendo’s Zelda Breath of the Wild and Sony’s Horizon Zero Dawn, the strength of the game is a deep lore originating from the Witcher series novels written by the “Polish Tolkien” Andrzej Sapkowski. It’s not a game, but a universe (nowadays it even includes a Netflix tv-series).

A must play.

Played on the Steam Deck without any issues (“Steam Deck Verified”)

September 10, 2024

In previous blog posts, we discussed setting up a GPG smartcard on GNU/Linux and FreeBSD.

In this blog post, we will configure Thunderbird to work with an external smartcard reader and our GPG-compatible smartcard.


Before Thunderbird 78, if you wanted to use OpenPGP email encryption, you had to use a third-party add-on such as https://enigmail.net/.

Thunderbird’s recent versions natively support OpenPGP. The Enigmail addon for Thunderbird has been discontinued. See: https://enigmail.net/index.php/en/home/news.

I didn’t find good documentation on how to set up Thunderbird with a GnuPG smartcard when I moved to a new coreboot laptop, so this was the reason I created this blog post series.

GnuPG configuration

We’ll not go into too much detail on how to set up GnuPG. This was already explained in the previous blog posts.

If you want to use a HSM with GnuPG you can use the gnupg-pkcs11-scd agent https://github.com/alonbl/gnupg-pkcs11-scd that translates the pkcs11 interface to GnuPG. A previous blog post describes how this can be configured with SmartCard-HSM.

We’ll go over some steps to make sure that GnuPG is set up correctly before we continue with the Thunderbird configuration. The pinentry command must be configured with graphical support so that we can type our pin code in the graphical user environment.

Import Public Key

Make sure that your public key - or the public key of the receiver(s) - is imported.

[staf@snuffel ~]$ gpg --list-keys
[staf@snuffel ~]$ 
[staf@snuffel ~]$ gpg --import <snip>.asc
gpg: key XXXXXXXXXXXXXXXX: public key "XXXX XXXXXXXXXX <XXX@XXXXXX>" imported
gpg: Total number processed: 1
gpg:               imported: 1
[staf@snuffel ~]$ 
[staf@snuffel ~]$  gpg --list-keys
/home/staf/.gnupg/pubring.kbx
-----------------------------
pub   xxxxxxx YYYYY-MM-DD [SC]
      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
uid           [ xxxxxxx] xxxx xxxxxxxxxx <xxxx@xxxxxxxxxx.xx>
sub   xxxxxxx xxxx-xx-xx [A]
sub   xxxxxxx xxxx-xx-xx [E]

[staf@snuffel ~]$ 

Pinentry

Thunderbird will not ask for your smartcard’s pin code.

This must be done on your smartcard reader if it has a pin pad, or through an external pinentry program.

The pinentry is configured in the gpg-agent.conf configuration file. As we’re using Thunderbird in a graphical environment, we’ll configure it to use a graphical version.

Installation

I’m testing KDE plasma 6 on FreeBSD, so I installed the Qt version of pinentry.

On GNU/Linux you can check the documentation of your favourite Linux distribution to install a graphical pinentry. If you use a Graphical user environment there is probably already a graphical-enabled pinentry installed.

[staf@snuffel ~]$ sudo pkg install -y pinentry-qt6
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
The following 1 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
        pinentry-qt6: 1.3.0

Number of packages to be installed: 1

76 KiB to be downloaded.
[1/1] Fetching pinentry-qt6-1.3.0.pkg: 100%   76 KiB  78.0kB/s    00:01    
Checking integrity... done (0 conflicting)
[1/1] Installing pinentry-qt6-1.3.0...
[1/1] Extracting pinentry-qt6-1.3.0: 100%
==> Running trigger: desktop-file-utils.ucl
Building cache database of MIME types
[staf@snuffel ~]$ 

Configuration

The gpg-agent is responsible for starting the pinentry program. Let’s reconfigure it to start the pinentry that we like to use.

[staf@snuffel ~]$ cd .gnupg/
[staf@snuffel ~/.gnupg]$ 
[staf@snuffel ~/.gnupg]$ vi gpg-agent.conf

The pinentry is configured in the pinentry-program directive. You’ll find the complete gpg-agent.conf that I’m using below.

debug-level expert
verbose
verbose
log-file /home/staf/logs/gpg-agent.log
pinentry-program /usr/local/bin/pinentry-qt

Reload the scdaemon and gpg-agent configuration.

staf@freebsd-gpg3:~/.gnupg $ gpgconf --reload scdaemon
staf@freebsd-gpg3:~/.gnupg $ gpgconf --reload gpg-agent
staf@freebsd-gpg3:~/.gnupg $ 

Test

To verify that gpg works correctly and that the pinentry program works in our graphical environment we sign a file.

Create a new file.

$ cd /tmp
[staf@snuffel /tmp]$ 
[staf@snuffel /tmp]$ echo "foobar" > foobar
[staf@snuffel /tmp]$ 

Try to sign it.

[staf@snuffel /tmp]$ gpg --sign foobar
[staf@snuffel /tmp]$ 

If everything works fine, the pinentry program will ask for the pincode to sign it.


Thunderbird

In this section we’ll (finally) configure Thunderbird to use GPG with a smartcard reader.

Allow external smartcard reader


Open the global settings, click on the "Hamburger" icon and select settings.

Or press [F10] to bring-up the "Menu bar" in Thunderbird and select [Edit] and Settings.


In the settings window click on [Config Editor].

This will open the Advanced Preferences window.


In the Advanced Preferences window search for "external_gnupg" settings and set mail.identity.allow_external_gnupg to true.


 

Setup End-To-End Encryption

The next step is to configure the GPG keypair that we’ll use for our user account.


Open the account setting by pressing on the "Hamburger" icon and select Account Settings or press [F10] to open the menu bar and select Edit, Account Settings.

Select End-To-End Encryption and, in the OpenPGP section, select [ Add Key ].


Select the ( * ) Use your external key through GnuPG (e.g. from a smartcard)

And click on [Continue]

The next window will ask you for the Secret Key ID.


Execute gpg --list-keys to get your secret key id.

Copy/paste your key id and click on [ Save key ID ].

I found that it is sometimes required to restart Thunderbird to reload the configuration when a new key id is added. So restart Thunderbird if it fails to find your key id in the keyring.

Test


As a test we send an email to our own email address.

Open a new message window and enter your email address into the To: field.

Click on [OpenPGP] and Encrypt.


Thunderbird will show a warning message that it doesn't know the public key to set up the encryption.

Click on [Resolve].

In the next window, Thunderbird will ask whether to Discover Public Keys online or to import the Public Keys From File; we'll import our public key from a file.

In the Import OpenPGP key File window, select your public key file and click on [ Open ].

Thunderbird will show a window with the key fingerprint. Select ( * ) Accepted.

Click on [ Import ] to import the public key.


With our public key imported, the "End-to-end encryption requires resolving key issues" warning should be resolved.

Click on the [ Send ] button to send the email.


To encrypt the message, Thunderbird will start a gpg session that invokes the pinentry command; type in your pin code. gpg will encrypt the message and, if everything works fine, the email is sent.

 

Have fun!

Links

September 09, 2024

The NBD protocol has grown a number of new features over the years. Unfortunately, some of those features are not (yet?) supported by the Linux kernel.

I suggested a few times over the years that the maintainer of the NBD driver in the kernel, Josef Bacik, take a look at these features, but he hasn't done so; presumably he has other priorities. As with anything in the open source world, if you want it done you must do it yourself.

I'd been off and on considering to work on the kernel driver so that I could implement these new features, but I never really got anywhere.

A few months ago, however, Christoph Hellwig posted a patch set that reworked a number of block device drivers in the Linux kernel to a new type of API. Since the NBD mailinglist is listed in the kernel's MAINTAINERS file, this patch series was crossposted to the NBD mailinglist, too, and when I noticed that it explicitly disabled the "rotational" flag on the NBD device, I suggested to Christoph that perhaps "we" (meaning, "he") might want to vary the decision on whether a device is rotational depending on whether the NBD server signals, through the flag that exists for that very purpose, whether the device is rotational.

To which he replied "Can you send a patch".

That got me down the rabbit hole, and now, for the first time in the 20+ years of being a C programmer who uses Linux exclusively, I got a patch merged into the Linux kernel... twice.

So, what do these things do?

The first patch adds support for the ROTATIONAL flag. If the NBD server mentions that the device is rotational, it will be treated as such, and the elevator algorithm will be used to optimize accesses to the device. For the reference implementation, you can do this by adding a line "rotational = true" to the relevant section (relating to the export where you want it to be used) of the config file.
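For example, such a section could look something like this (a sketch; the export name and path are made up, and the file is typically /etc/nbd-server/config):

[myexport]
exportname = /path/to/disk.img
rotational = true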

It's unlikely that this will be of much benefit in most cases (most nbd-server installations will be exporting a file on a filesystem, where the elevator algorithm is already applied on the server side, so it doesn't matter whether the client-side device has the rotational flag set), but it's there in case you wish to use it.

The second set of patches adds support for the WRITE_ZEROES command. Most devices these days allow you to tell them "please write N zeroes starting at this offset", which is a lot more efficient than sending over a buffer of N zeroes and asking the device to do DMA to copy buffers etc etc for just zeroes.

The NBD protocol has supported its own WRITE_ZEROES command for a while now, and hooking it up was reasonably simple in the end. The only problem is that it expects length values in bytes, whereas the kernel uses it in blocks. It took me a few tries to get that right -- and then I also fixed up handling of discard messages, which required the same conversion.

September 06, 2024

Some users, myself included, have noticed that their MySQL error log contains many lines like this one: Where does that error come from? The error MY-010914 is part of the Server Network issues like: Those are usually more problematic than the ones we are covering today. The list is not exhaustive and in the source […]

September 05, 2024

IT architects generally use architecture-specific languages or modeling techniques to document their thoughts and designs. ArchiMate, the framework I have the most experience with, is a specialized enterprise architecture modeling language. It is maintained by The Open Group, an organization known for its broad architecture framework titled TOGAF.

My stance, however, is that architects should not use the diagrams from their architecture modeling framework to convey their message to every stakeholder out there...

What is the definition of “Open Source”?

There’s been no shortage of contention on what “Open Source software” means. Two instances that stand out to me personally are ElasticSearch’s “Doubling down on Open” and Scott Chacon’s “public on GitHub”.

I’ve been active in Open Source for 20 years and could use a refresher on its origins and officialisms. The plan was simple: write a blog post about why the OSI (Open Source Initiative) and its OSD (Open Source Definition) are authoritative, collect evidence in its support (confirmation that they invented the term, of widespread acceptance with little dissent, and of the OSD being a practical, well functioning tool). That’s what I keep hearing; I just wanted to back it up. Since contention always seems to be around commercial re-distribution restrictions (which are forbidden by the OSD), I wanted to particularly confirm that there haven’t been all that many commercial vendors who’ve used, or wanted to use, the term “open source” to mean “you can view/modify/use the source, but you are limited in your ability to re-sell, or need to buy additional licenses for use in a business”.

However, the further I looked, the more I found evidence of the opposite of all of the above. I’ve spent a few weeks now digging and some of my long standing beliefs are shattered. I can’t believe some of the things I found out. Clearly I was too emotionally invested, but after a few weeks of thinking, I think I can put things in perspective. So this will become not one, but multiple posts.

The goal for the series is look at the tensions in the community/industry (in particular those directed towards the OSD), and figure out how to resolve, or at least reduce them.

Without further ado, let’s get into the beginnings of Open Source.

The “official” OSI story.

Let’s first get the official story out the way, the one you see repeated over and over on websites, on Wikipedia and probably in most computing history books.

Back in 1998, there was a small group of folks who felt that the verbiage at the time (Free Software) had become too politicized. (note: the Free Software Foundation was founded 13 years prior, in 1985, and informal use of “free software” had been around since the 1970s). They felt they needed a new word “to market the free software concept to people who wore ties”. (source) (somewhat ironic since today many of us like to say “Open Source is not a business model”)

Bruce Perens - an early Debian project leader and hacker on free software projects such as busybox - had authored the first Debian Free Software Guidelines in 1997 which was turned into the first Open Source Definition when he founded the OSI (Open Source Initiative) with Eric Raymond in 1998. As you continue reading, keep in mind that from the get-go, OSI’s mission was supporting the industry. Not the community of hobbyists.

Eric Raymond is of course known for his seminal 1999 essay on development models “The cathedral and the bazaar”, but he also worked on fetchmail among others.

According to Bruce Perens, there was some criticism at the time, but only of the term “Open” in general, and of “Open Source” only in a completely different industry.

At the time of its conception there was much criticism for the Open Source campaign, even among the Linux contingent who had already bought-in to the free software concept. Many pointed to the existing use of the term “Open Source” in the political intelligence industry. Others felt the term “Open” was already overused. Many simply preferred the established name Free Software. I contended that the overuse of “Open” could never be as bad as the dual meaning of “Free” in the English language–either liberty or price, with price being the most oft-used meaning in the commercial world of computers and software

From Open Sources: Voices from the Open Source Revolution: The Open Source Definition

Furthermore, from Bruce Perens’ own account:

I wrote an announcement of Open Source which was published on February 9 [1998], and that’s when the world first heard about Open Source.

source: On Usage of The Phrase “Open Source”

Occasionally it comes up that it may have been Christine Peterson who coined the term earlier that week in February but didn’t give it a precise meaning. That was a task for Eric and Bruce in followup meetings over the next few days.

Even when you’re the first to use or define a term, you can’t legally control how others use it, until you obtain a Trademark. Luckily for OSI, US trademark law recognizes the first user when you file an application, so they filed for a trademark right away. But what happened? It was rejected! The OSI’s official explanation reads:

We have discovered that there is virtually no chance that the U.S. Patent and Trademark Office would register the mark “open source”; the mark is too descriptive. Ironically, we were partly a victim of our own success in bringing the “open source” concept into the mainstream

This is our first 🚩 red flag and it lies at the basis of some of the conflicts which we will explore in this, and future posts. (tip: I found this handy Trademark search website in the process)

Regardless, since 1998, the OSI has vastly grown its scope of influence (more on that in future posts), with the Open Source Definition mostly unaltered for 25 years, and having been widely used in the industry.

Prior uses of the term “Open Source”

Many publications simply repeat the idea that OSI came up with the term, has the authority (if not legal, at least in practice) and call it a day. I, however, had nothing better to do, so I decided to spend a few days (which turned into a few weeks 😬) and see if I could dig up any references to “Open Source” predating OSI’s definition in 1998, especially ones with different meanings or definitions.

Of course, it’s totally possible that multiple people come up with the same term independently and I don’t actually care so much about “who was first”, I’m more interested in figuring out what different meanings have been assigned to the term and how widespread those are.

In particular, because most contention is around commercial limitations (non-competes) where receivers of the code are forbidden to resell it, this clause of the OSD stands out:

Free Redistribution: The license shall not restrict any party from selling (…)

Turns out, the term “Open Source” was already in use for more than a decade prior to the OSI’s founding.

OpenSource.com

In 1998, a business in Texas called “OpenSource, Inc” launched their website. They were a “Systems Consulting and Integration Services company providing high quality, value-added IT professional services”. Sometime during the year 2000, the website became a RedHat property. An ICANN lookup reveals the domain name was registered on Jan 8, 1998, a month before the term was “invented” by Christine/Richard/Bruce. What a coincidence. We are just warming up…


Caldera announces Open Source OpenDOS

In 1996, a company called Caldera had “open sourced” a DOS operating system called OpenDOS. Their announcement (accessible on Google Groups and a mailing list archive) reads:

Caldera Announces Open Source for DOS.
(…)
Caldera plans to openly distribute the source code for all of the DOS technologies it acquired from Novell., Inc
(…)
Caldera believes an open source code model benefits the industry in many ways.
(…)
Individuals can use OpenDOS source for personal use at no cost.
Individuals and organizations desiring to commercially redistribute
Caldera OpenDOS must acquire a license with an associated small fee.

Today we would refer to it as dual-licensing, with a Source Available license due to the non-compete clause. But in 1996, actual practitioners referred to it as “Open Source”, and OSI couldn’t contest it because it didn’t exist yet!

You can download the OpenDOS package from ArchiveOS and have a look at the license file, which includes even more restrictions, such as “single computer”. (like I said, I had nothing better to do)

Investigations by Martin Espinoza re: Caldera

On his blog, Martin has an article making a similar observation about Caldera’s prior use of “open source”, following up with another article which includes a response from Lyle Ball, who headed the PR department of Caldera.

Quoting Martin:

As a member of the OSI, he [Bruce] frequently championed that organization’s prerogative to define what “Open Source” means, on the basis that they invented the term. But I [Martin] knew from personal experience that they did not. I was personally using the term with people I knew before then, and it had a meaning — you can get the source code. It didn’t imply anything at all about redistribution.

The response from Caldera includes such gems as:

I joined Caldera in November of 1995, and we certainly used “open source” broadly at that time. We were building software. I can’t imagine a world where we did not use the specific phrase “open source software”. And we were not alone. The term “Open Source” was used broadly by Linus Torvalds (who at the time was a student (…), John “Mad Dog” Hall who was a major voice in the community (he worked at COMPAQ at the time), and many, many others.

Our mission was first to promote “open source”, Linus Torvalds, Linux, and the open source community at large. (…) we flew around the world to promote open source, Linus and the Linux community….we specifically taught the analysts houses (i.e. Gartner, Forrester) and media outlets (in all major markets and languages in North America, Europe and Asia.) (…) My team and I also created the first unified gatherings of vendors attempting to monetize open source

So according to Caldera, “open source” was already a phenomenon in the industry, and Linus himself had used the term. Mr. Ball mentions plenty of avenues for further research; I pursued one of them below.

Linux Kernel discussions

Mr. Ball’s mentions of Linus and Linux piqued my interest, so I started digging.

I couldn’t find a mention of “open source” in the Linux Kernel Mailing List archives prior to the OSD announcement (Feb 1998), though the archives only start as of March 1996. I asked ChatGPT where people used to discuss Linux kernel development prior to that, and it suggested 5 Usenet groups, which Google still lets you search through:

What were the hits? Glad you asked!

comp.os.linux: a 1993 discussion about supporting binary-only software on Linux

This conversation predates the OSI by five whole years and leaves very little to the imagination:

The GPL and the open source code have made Linux the success that it is. Cygnus and other commercial interests are quite comfortable with this open paradigm, and in fact prosper. One need only pull the source code to GCC and read the list of many commercial contributors to realize this.

comp.os.linux.announce: 1996 announcement of Caldera’s open-source environment

In November 1996 Caldera shows up again, this time with a Linux based “open-source” environment:

Channel Partners can utilize Caldera’s Linux-based, open-source environment to remotely manage Windows 3.1 applications at home, in the office or on the road. By using Caldera’s OpenLinux (COL) and Wabi solution, resellers can increase sales and service revenues by leveraging the rapidly expanding telecommuter/home office market. Channel Partners who create customized turn-key solutions based on environments like SCO OpenServer 5 or Windows NT,

comp.os.linux.announce: 1996 announcement of a trade show

On 17 Oct 1996 we find this announcement

There will be a Open Systems World/FedUnix conference/trade show in Washington DC on November 4-8. It is a traditional event devoted to open computing (read: Unix), attended mostly by government and commercial Information Systems types.

In particular, this talk stands out to me:

** Schedule of Linux talks, OSW/FedUnix'96, Thursday, November 7, 1996 ***
(…)
11:45 Alexander O. Yuriev, “Security in an open source system: Linux study

The context here seems to be open standards, and maybe also the open source development model.

1990: Tony Patti on “software developed from open source material”

In 1990, a magazine editor by the name of Tony Patti not only refers to Open Source software, but also mentions that the NSA in 1987 stated that “software was developed from open source material”.

1995: open-source changes emails on OpenBSD-misc email list

I could find one mention of “Open-source” on an OpenBSD email list: it seems there was a directory “open-source-changes” which received incoming patches, distributed over email (source). Though perhaps the way to interpret it is that it concerns “source-changes” to OpenBSD, paraphrased as “open”, so let’s not count this one.

(I did not look at other BSD’s)

Bryan Lunduke’s research

Bryan Lunduke has done similar research and found several more USENET posts about “open source”, clearly in the context of software source code, predating OSI by many years. He breaks it down on his Substack. Some interesting examples he found:

19 August, 1993 post to comp.os.ms-windows

Anyone else into “Source Code for NT”? The tools and stuff I’m writing for NT will be released with source. If there are “proprietary” tricks that MS wants to hide, the only way to subvert their hoarding is to post source that illuminates (and I don’t mean disclosing stuff obtained by a non-disclosure agreement).

(source)

Then he writes:

Open Source is best for everyone in the long run.

Written as a matter-of-fact generalization to the whole community, implying the term is well understood.

December 4, 1990

BSD’s open source policy meant that user developed software could be ported among platforms, which meant their customers saw a much more cost effective, leading edge capability combined hardware and software platform.

source

1985: The Computer Chronicles documentary about UNIX

The Computer Chronicles was a TV documentary series about computer technology; it started as a local broadcast, but in 1983 became a national series. In February 1985, they broadcast an episode about UNIX. You can watch the entire 28 min episode on archive.org, and it’s an interesting snapshot in time, when UNIX was coming out of its shell and competing with MS-DOS with its multi-user and concurrent multi-tasking features. It contains a segment in which Bill Joy, co-founder of Sun Microsystems, is interviewed about Berkeley Unix 4.2. Sun had more than 1000 staff members, and now its CTO was on national TV in the United States. This was a big deal, with a big audience. At 13:50, the interviewer quotes Bill:

“He [Bill Joy] says its open source code, versatility and ability to work on a variety of machines means it will be popular with scientists and engineers for some time”

“Open Source” on national TV. 13 years before the founding of OSI.


Uses of the word “open”

We’re specifically talking about “open source” in this article. But we should probably also consider how the term “open” was used in software, as they are related, and that may have played a role in the rejection of the trademark.

Well, the Open Software Foundation launched in 1988 (10 years before the OSI); its goal was to make an open standard for UNIX. The word “open” pops up elsewhere in software too, e.g. the Common Open Software Environment in 1993 (standardized software for UNIX), OpenVMS in 1992 (a renaming of VAX/VMS as an indication of its support for open systems industry standards such as POSIX and Unix compatibility), OpenStep in 1994, and of course the OpenBSD project, which started in 1996. They have this to say about their name (while OpenBSD started in 1996, this quote is from 2006):

The word “open” in the name OpenBSD refers to the availability of the operating system source code on the Internet, although the word “open” in the name OpenSSH means “OpenBSD”. It also refers to the wide range of hardware platforms the system supports.

Does it run DOOM?

The proof of any hardware platform is always whether it can run Doom. Since the DOOM source code was published in December 1997, I thought it would be fun to check whether id Software happened to use the term “Open Source” at that time. There are some FTP mirrors where you can still see the files with the original December 1997 timestamps (e.g. this one). However, after sifting through the README and other documentation files, I only found references to the “Doom source code”. No mention of Open Source.

The origins of the famous “Open Source” trademark application: SPI, not OSI

This is not directly relevant, but may provide useful context: in June 1997 the SPI (“Software in the Public Interest”) organization was born to support the Debian project, funded by its community, although it grew in scope to help many more free software / open source projects. It looks like Bruce, as a representative of SPI, started the “Open Source” trademark proceedings (and may have paid for it himself). But then something happened: 3/4 of the SPI board (including Bruce) left and founded the OSI, which Bruce announced along with a note that the trademark would move from SPI to OSI as well. Ian Jackson - Debian Project Leader and SPI president - expressed his “grave doubts” and lack of trust. SPI later confirmed they owned the trademark (application) and would not let any OSI members take it. The perspective of Debian developer Ean Schuessler provides more context.

A few years later, it seems wounds were healing, with Bruce re-applying to SPI, Ean making amends, and Bruce taking the blame.

All the bickering over the Trademark was ultimately pointless, since it didn’t go through.

Searching for SPI on the OSI website reveals no acknowledgment of SPI’s role in the story. You only find mentions in board meeting notes (ironically, they’re all requests to SPI to hand over domains or to share some software).

By the way, in November 1998, this is what SPI’s open source web page had to say:

Open Source software is software whose source code is freely available

A Trademark that was never meant to be.

Lawyer Kyle E. Mitchell knows how to write engaging blog posts. Here is one where he digs further into the topic of trademarking and why “open source” is one of the worst possible terms to try to trademark (in comparison to, say, Apple computers).

He writes:

At the bottom of the hierarchy, we have “descriptive” marks. These amount to little more than commonly understood statements about goods or services. As a general rule, trademark law does not enable private interests to seize bits of the English language, weaponize them as exclusive property, and sue others who quite naturally use the same words in the same way to describe their own products and services.
(…)
Christine Peterson, who suggested “open source” (…) ran the idea past a friend in marketing, who warned her that “open” was already vague, overused, and cliche.
(…)
The phrase “open source” is woefully descriptive for software whose source is open, for common meanings of “open” and “source”, blurry as common meanings may be and often are.
(…)
no person and no organization owns the phrase “open source” as we know it. No such legal shadow hangs over its use. It remains a meme, and maybe a movement, or many movements. Our right to speak the term freely, and to argue for our own meanings, understandings, and aspirations, isn’t impinged by anyone’s private property.

So, we have here a great example of the Trademark system working exactly as intended, doing the right thing in the service of the people: not giving away unique rights to common words, rights that were demonstrably never OSI’s to have.

I can’t decide which is more wild: OSI’s audacious outcries for the whole world to forget about the trademark failure and trust their “pinky promise” right to authority over a common term, or the fact that so much of the global community actually fell for it and repeated a misguided narrative without much further thought. (myself included)

I think many of us, through our desire to be part of a movement with a positive, fulfilling mission, were too easily swept away by OSI’s origin tale.

Co-opting a term

OSI was never relevant as an organization and hijacked a movement that was well underway without them.

(source: a harsh but astute Slashdot comment)

We have plentiful evidence that “Open Source” was used for at least a decade prior to OSI existing, in the industry, in the community, and possibly in government. You saw it at trade shows, in various newsgroups around Linux and Windows programming, and on national TV in the United States. The word was often uttered without any further explanation, implying it was a known term. For a movement that happened largely offline in the eighties and nineties, it seems likely there were many more examples that we can’t access today.

“Who was first?” is interesting, but more relevant is “what did it mean?”. Many of these uses were fairly informal and/or didn’t consider re-distribution. We saw these meanings:

  • a collaborative development model
  • portability across hardware platforms, open standards
  • disclosing (making available) of source code, sometimes with commercial limitations (e.g. per-seat licensing) or restrictions (e.g. non-compete)
  • possibly a buzz-word in the TV documentary

Then came the OSD, which gave the term a very different, and much stricter, meaning than what had already been in use for 15 years. However, the OSD was refined, “legal-aware” and the starting point for an attempt at global consensus and wider industry adoption, so we are far from finished with our analysis.

(ironically, it never quite matched with free software either - see this e-mail or this article)

Legend has it…

Repeat a lie often enough and it becomes the truth

Yet, the OSI still promotes their story around being first to use the term “Open Source”. RedHat’s article still claims the same. I could not find evidence of resolution. I hope I just missed it (please let me know!). What I did find is one request for clarification remaining unaddressed and another handled in a questionable way, to put it lightly. Expand all the comments in the thread and see for yourself. For an organization all about “open”, this seems especially strange. It seems we have veered far away from the “We will not hide problems” motto in the Debian Social Contract.

Real achievements are much more relevant than “who was first”. Here are some suggestions for more meaningful ways the OSI could introduce itself and its mission:

  • “We were successful open source practitioners and industry thought leaders”
  • “In our desire to assist the burgeoning open source movement, we aimed to give it direction and create alignment around useful terminology”.
  • “We launched a campaign to positively transform the industry by defining the term - which had thus far only been used loosely - precisely and popularizing it”

I think any of these would land well in the community. Instead, they are strangely obsessed with “we coined the term, therefore we decide its meaning”, and anything else is “flagrant abuse”.

Is this still relevant? What comes next?

Trust takes years to build, seconds to break, and forever to repair

I’m quite an agreeable person, and until recently happily defended the Open Source Definition. Now, my trust has been tainted, but at the same time, there is beauty in knowing that healthy debate has existed since the day OSI was announced. It’s just a matter of making sense of it all, and finding healthy ways forward.

Most of the events covered here are from 25 years ago, so let’s not linger too much on it. There is still a lot to be said about adoption of Open Source in the industry (and the community), tension (and agreements!) over the definition, OSI’s campaigns around awareness and standardization and its track record of license approvals and disapprovals, challenges that have arisen (e.g. ethics, hyper clouds, and many more), some of which have resulted in alternative efforts and terms. I have some ideas for productive ways forward.

Stay tuned for more, sign up for the RSS feed and let me know what you think!
Comment below, on X or on HackerNews

August 29, 2024

In his latest Lex Fridman appearance, Elon Musk makes some excellent points about the importance of simplification.

Follow these steps:

  1. Simplify the requirements
  2. For each step, try to delete it altogether
  3. Implement well

1. Simplify the Requirements

Even the smartest people come up with requirements that are, in part, dumb. Start by asking yourself how they can be simplified.

There is no point in finding the perfect answer to the wrong question. Try to make the question as least wrong as possible.

I think this is so important that it is included in my first item of advice for junior developers.

There is nothing so useless as doing efficiently that which should not be done at all.

2. Delete the Step

For each step, consider if you need it at all, and if not, delete it. Certainty is not required. Indeed, if you only delete what you are 100% certain about, you will leave in junk. If you never put things back in, it is a sign you are being too conservative with deletions.

The best part is no part.

Some further commentary by me:

This applies both to the product and technical implementation levels. It’s related to YAGNI, Agile, and Lean, also mentioned in the first section of advice for junior developers.

It’s crucial to consider probabilities and compare the expected cost/value of different approaches. Don’t spend 10 EUR each day to avoid a 1% chance of needing to pay 100 EUR. Consistent Bayesian reasoning helps avoid such mistakes, though Elon’s “if you do not put anything back in, you are not removing enough” heuristic is easier to understand and implement.
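To make that concrete with the illustrative numbers above: a 1% daily chance of a 100 EUR hit has an expected cost of 0.01 × 100 = 1 EUR per day, well below the 10 EUR per day the mitigation costs, so with these numbers skipping the mitigation is the better bet.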

3. Implement Well

Here, Elon talks about optimization and automation, which are specific to his problem domain of building a supercomputer. More generally, this can be summarized as good implementation, which I advocate for in my second section of advice for junior developers.

 

The relevant segment begins at 43:48.

The post Simplify and Delete appeared first on Entropy Wins.

August 27, 2024

I just reviewed the performance of a customer’s WordPress site. “Things got a lot worse”, he wrote, and he assumed Autoptimize (he was an AOPro user) wasn’t working any more and asked me to guide him to fix the issue. Instead, it turns out he had installed CookieYes, which adds tons of JS (part of which is render-blocking), taking up 3.5s of main thread work and (fasten your seat-belts) which somehow seems…

Source

August 26, 2024

Building businesses based on an Open Source project is like balancing a solar system. Just as the sun is the center of our own little universe, powering life on the planets which revolve around it in a brittle, yet tremendously powerful astrophysical equilibrium, so it goes with a thriving open source project: a community, one or more vendors and their commercially supported customers revolve around it, driven by astronomical aspirations.

Source-available & Non-Compete licensing have existed in various forms, and have been tweaked and refined for decades, in an attempt to combine just enough proprietary conditions with just enough of Open Source flavor, to find that perfect trade-off. Fair Source is the latest refinement for software projects driven by a single vendor wanting to combine monetization, a high rate of contributions to the project (supported by said monetization), community collaboration and direct association with said software project.

Succinctly, Fair Source licenses provide many of the same benefits to users as Open Source licenses, although outsiders are not allowed to build their own competing service based on the software; however, after 2 years the software automatically becomes MIT or Apache2 licensed, and at that point you can pretty much do whatever you want with the older code.

To avoid confusion, this project is different from:

It seems we have reached an important milestone in 2024: on the surface, “Fair Source” is yet another new initiative that positions itself as a more business friendly alternative to “Open Source”, but the delayed open source publication (DSOP) model has been refined to the point where the licenses are succinct, clear, easy to work with and should hold up well in court. Several technology companies are choosing this software licensing strategy (Sentry being the most famous one, you can see the others on their website).

My 2 predictions:

  • we will see 50-100 more companies in the next couple of years.
  • a governance legal entity will appear soon, and a trademark will follow after.

In this article, I’d like to share my perspective and address some - what I believe to be - misunderstandings in current discourse.

The licenses

At this time, the Fair Source ideology is implemented by the following licenses:

BSL/BUSL are trickier to understand and can have different implementations. FCL and FSL are nearly identical. They are clearly and concisely written and embody the Fair Source spirit in its purest form.

Seriously, try running the following in your terminal. Sometimes as an engineer you have to appreciate legal text when it’s this concise, easy to understand, and diff-able!

wget https://raw.githubusercontent.com/keygen-sh/fcl.dev/master/FCL-1.0-MIT.md
wget https://fsl.software/FSL-1.1-MIT.template.md
diff FSL-1.1-MIT.template.md FCL-1.0-MIT.md

I will focus on FSL and FCL, the Fair Source “flagship licenses”.

Is it “open source, fixed”, or an alternative to open source? Neither.

First, we’ll need to agree on what the term “Open Source” means. This itself has been a battle for decades, with non-competes (commercial restrictions) being especially contentious and in use even before OSI came along, so I’m working on an article which challenges OSI’s Open Source Definition, which I will publish soon. However, the OSD is probably the most common understanding in the industry today - so we’ll use that here - and it seems that the folks behind FSL/Fair Source made the wise decision to distance themselves from these contentious debates: after some initial conversations about FSL using the “Open Source” term, they’ve adopted the less common term “Fair Source”, and I’ve seen a lot of meticulous work (e.g. fsl#2 and fsl#10) on how they articulate what they stand for. (The Open Source Definition debate is why I hope the Fair Source folks will file a trademark if this project gains more traction.)

Importantly, OSI’s definition of “Open Source” includes non-discrimination and free redistribution.

When you check out code that is FSL licensed, and the code was authored:

  1. less than 2 years ago: it’s available to you under terms similar to MIT, except you cannot compete with the author by making a similar service using the same software
  2. more than 2 years ago: it is now MIT licensed. (or Apache2, when applicable)

While after 2 years it is clearly open source, the non-compete clause in option 1 is not compatible with the set of terms set forth by the OSI Open Source Definition (or with freedom 0 from the 4 freedoms of Free Software). Such a license is often referred to as “Source Available”.

So, Fair Source is a system that combines 2 licenses (an Open Source one and a Source Available one with proprietary conditions) in one. I think this is a very clever approach, but it’s not all that useful to compare it to Open Source. Rather, it has a certain symmetry to Open Core:

  • In an Open Core product, you have a “scoped core”: a core built from open source code which is surrounded by specifically scoped pieces of proprietary code, for an indeterminate, but usually many-year or perpetual timeframe
  • With Fair Source, you have a “timed core”: the open source core is all the code that’s more than 2 years old, and the proprietary bits are the most recent developments (regardless which scope they belong to).

Open Core and Fair Source both try to balance open source with business interests: both have an open source component to attract a community, and a proprietary shell to make a business more viable. Fair Source is a licensing choice that’s only relevant to businesses, not individuals. How many businesses monetize pure Open Source software? I can count them on one hand. The vast majority go for something like Open Core. This is why the comparison with Open Core makes much more sense.

A lot of the criticisms of Fair Source suddenly become a lot more palatable when you consider it an alternative to Open Core.

As a customer, which is more tolerable: proprietary features, or the most recent 2 years of product development being proprietary? I don’t think it matters nearly as much as some of the advantages Fair Source has over Open Core:

  • Users can view, modify and distribute (but not commercialize) the proprietary code. (with Open Core, you only get the binaries)
  • It follows then, that the project can use a single repository and single license (with Open Core, there are multiple repositories and licenses involved)

Technically, Open Core is more of a business architecture, where you still have to figure out which licenses you want to use for the core and shell, whereas Fair Source is more of a prepackaged solution which defines the business architecture as well as the 2 licenses to use.


Note that you can also devise hybrid approaches. Here are some ideas:

  • a Fair Source core and Closed Source shell. (more defensive than Open Core or Fair Source separately). (e.g. PowerSync does this)
  • an Open Source core, with Fair Source shell. (more open than Open Core or Fair Source separately).
  • Open Source Core, with Source Available shell (users can view, modify and distribute the code but not commercialize it, and without the delayed open source publication). This would be the “true” symmetrical counterpart to Fair Source. It is essentially Open Core where the community also has access to the proprietary features (but can’t commercialize those). It would also allow to put all code in the same repository. (although this benefit works better with Fair Source because any contributed code will definitely become open source, thus incentivizing the community more). I find this a very interesting option that I hope Open Core vendors will start considering. (although it has little to do with Fair Source).
  • etc.

Non-Competition

The FSL introduction post states:

In plain language, you can do anything with FSL software except economically undermine its producer through harmful free-riding

The issue of large cloud vendors selling your software as a service, making money, and contributing little to nothing back to the project, has been widely discussed under a variety of names. This can indeed severely undermine a project’s health, or kill it.

(Personally, I don’t find discussions around whether this is “fair” very useful. Businesses will act in their best interest; you can’t change the rules of the game, you only have control over how you play the game, in other words your own licensing and strategy.)

Here, we’ll just use the same terminology that the FSL does: the “harmful free-rider” problem.

However, the statement above is incorrect. Something like this would be more correct:

In plain language, you can do anything with FSL software except offer a similar paid service based on the software when it’s less than 2 years old.

What’s the difference? There are different forms of competition that are not harmful free-riding.

Multiple companies can offer a similar service/product which they base on the same project, which they all contribute to. They can synergize and grow the market together. (aka “non-zero-sum” if you want to sound smart). I think there are many good examples of this, e.g. Hadoop, Linux, Node.js, OpenStack, Opentelemetry, Prometheus, etc.

When the FSL website makes statements such as “You can do anything with FSL software except undermine its producer”, it seems to forget some of the best and most ubiquitous software in the world is the result of synergies between multiple companies collaborating.

Furthermore, when the company that owns the copyright on the project turns its back on its community/customers, wouldn’t the community “deserve” a new player who offers a similar service, but on friendly terms? The new player may even contribute more to the project. Are they a harmful free-rider? Who gets to be the judge of that?

Let’s be clear, FSL allows no competition whatsoever, at least not during the first 2 years. What about after 2 years?

Zeke Gabrielse, one of the shepherds of Fair Source, said it well here:

Being 2 years old also puts any SaaS competition far enough back to not be a concern

Therefore, you may as well say no competition is allowed. Although, in Zeke’s post, I presume he was writing from the position of an actively developed software project. If the project becomes abandoned, the 2-year countdown is an obstacle, a surmountable one, that eventually does let you compete; but in that case, the copyright holder probably went bust, so you aren’t really competing with them either. The 2-year window is not designed to enable competition; instead, it is a contingency plan for when the company goes bankrupt. The wait can be needlessly painful for the community in such a situation. If a company is about to go bust, they could immediately release their Fair Source code as Open Source, but I wonder if this can be automated via the actual license text.

(I had found some ambiguous use of the term “direct” competition which I’ve reported and has since been resolved)

Perverse incentives

Humans are notoriously bad at predicting second-order effects, so I like to try anyway. What could be some second-order effects of Fair Source projects? And how do they compare to Open Core?

  • can companies first grow on top of their Fair Source codebase, take community contributions, and then switch to more restrictive, or completely closed, licensing, shutting out the community? Yes, if a CLA is in place (or by using the 2-year-old code). (This isn’t any different from any other CLA-using Open Source or Open Core project. Though with Open Core, you can’t take in external contributions on proprietary parts to begin with.)
  • if you enjoy a privileged position where others can’t meaningfully compete with you based on the same source code, that can affect how the company treats its community and its customers. It can push through undesirable changes, it can price more aggressively, etc. (these issues are the same with Open Core)
  • With Open Source & Open Core, the company is incentivized to make the code well understood by the community. Under Fair Source it would still be sensible (in order to get free contributions), but at the same time, by hiding design documents, subtly obfuscating the code and withholding information it can also give itself the edge for when the code does become Open Source, although as we’ve seen, the 2 year delay makes competition unrealistic anyway.

All in all, nothing particularly worse than Open Core, here.

Developer sustainability

The FSL introduction post says:

We value user freedom and developer sustainability. Free and Open Source Software (FOSS) values user freedom exclusively. That is the source of its success, and the source of the free-rider problem that occasionally boils over into a full-blown tragedy of the commons, such as Heartbleed and Log4Shell.

F/OSS indeed doesn’t involve itself with sustainability, because of the simple fact that Open Source has nothing to do with business models and monetization. As stated above, it makes more sense to compare to Open Core.

It’s like saying asphalt paving machinery doesn’t care about funding and is therefore to blame when roads don’t get built. Therefore we need tolls. But it would be more useful to compare tolls to road taxes and vignettes.

Of course it happens that people dedicate themselves to writing open source projects, usually driven by their interests, don’t get paid, and get volumes of support requests (incl. from commercial entities), which can become a source of suffering and can also lead to codebases becoming critically important, yet critically misunderstood and fragile. This is clearly a situation to avoid, and there are many ways to address it, ranging from sponsorships (e.g. GitHub, tidelift) and bounty programs (e.g. Algora) to direct funding (e.g. Sentry’s 500k donation) and many more initiatives that have launched in the last few years. Certainly a positive development. Sometimes formally abandoning a project is also a clear signal that puts the burden of responsibility onto whoever consumes it, and can be a relief to the original author. If anything, it can trigger alarm bells within corporations and be a fast path to properly engaging and compensating the author. There is no way around the fact that developers (and people in general) are responsible for their own well-being and sometimes need to put their foot down, or put on their business hat (which many developers don’t like to do), if their decision to open source a project is causing them problems. No amount of licensing can change this hard truth.

Furthermore, you can make money via Open Core around OSI-approved open source projects (e.g. Grafana) or via consulting/support, and there are many companies that pay developers to work on (pure) Open Source code (Meta, Microsoft, Google, etc. are the most famous ones, but there are many smaller ones). Companies that try to achieve sustainability (and even thriving) on pure open source software for which they are the main/single driving force are extremely rare. (Chef tried, and now System Initiative is trying to do it better. I remain skeptical but am hopeful and am rooting for them to prove the model.)

Doesn’t it sound a bit ironic that the path to getting developers paid is releasing your software via a non-compete license?

Do we reach developer sustainability by preventing developers from making money on top of projects they want to - or already have - contribute(d) to?

Important caveats:

  • Fair Source does allow making money via consulting and auxiliary services related to the software.
  • Open Core shuts out people similarly, but many of the business models above don’t.

CLA needed?

When a project uses an Open Source license with some restrictions (e.g. GPL with its copyleft), it is common to use a CLA so that the company backing it can use more restrictive or commercial licenses (either as a license change later on, or as dual licensing). With Fair Source (and indeed all Source Available licenses), this is also the case.

However, unlike with Open Source licenses, with Fair Source / Source Available licenses a CLA becomes much more of a necessity, because such a license without a CLA isn’t compatible with anything else, and the commercial FSL restriction may not always apply to outside contributions (it depends on e.g. whether they can be offered stand-alone). I’m not a lawyer; for more clarity you should consult with one. I think the Fair Source website, or at least their adoption guide, should mention something about CLAs, because it’s an important step beyond simply choosing a license and publishing, so I’ve raised this with them.

AGPL

The FSL website states:

AGPLv3 is not permissive enough. As a highly viral copyleft license, it exposes users to serious risk of having to divulge their proprietary source code.

This looks like fear mongering.

  • AGPL is not categorically less permissive than FSL. It is less permissive when the code is 2 years old or older (and the FSL has turned into MIT/Apache2). For current and recent code, AGPL permits competition; FSL does not.
  • The word “viral” is more divisive than accurate. In my mind, complying with AGPL is rather easy; my rule of thumb is to say you trigger copyleft when you “ship”. Most engineers have an intuitive understanding of what it means to “ship” a feature, whether that’s on cloud or on-prem. In my experience, people struggle more with patent clauses or even the relation between trademarks and software licensing than they do with copyleft. There’s still some level of uncertainty and caution around AGPL, mainly due to its complexity. (Side note: Google and the CNCF don’t allow copyleft licenses, and their portfolio doesn’t have a whole lot of commercial success to show for it; I see mainly projects that can easily be picked up by Google.)

Heather Meeker, the lawyer consulted to draft the FSL, has spoken out against the virality discourse, tempering the FUD around AGPL.

Conclusion

I think Fair Source, the FSL and FCL have a lot to offer. Throughout my analysis I may have raised some criticisms, but if anything, it reminds me of how much Open Core can suck (though it depends on the relative size of core vs shell). So I find it a very compelling alternative to Open Core. Despite some poor choices of wording, I find it well executed: it ties up a lot of loose ends from previous initiatives (Source Available, BSL and other custom licenses) into a neat package. Despite the need for a CLA, it’s still quite easy to implement and is arguably more viable than Open Core is in its current state today. When comparing to Open Source, the main question is: which is worse, the “harmful free-rider” problem or the non-compete? (Anecdotally, my gut feeling says the former, but I’m on the lookout for data-driven evidence.) When comparing to Open Core, the main question is: is a business more viable keeping proprietary features closed, or making them source-available (non-compete)?

As mentioned, there are many more hybrid approaches possible. For a business thinking about their licensing strategy, it may make sense to think of these questions separately:

  • should our proprietary shell be time based or feature scoped? Does it matter?
  • should our proprietary shell be closed, or source-available?

I certainly would prefer to see companies and projects appear:

  • as Fair Source, rather than not at all
  • as Open Core, rather than not at all
  • as Fair Source, rather than Open Core (depending on “shell thickness”).
  • with more commercial restrictions from the get-go, instead of starting more permissively and re-licensing later. Just kidding, but that’s a topic for another day.

For vendors, I think there are some options left to explore, such as Open Core with a source-available (instead of closed) shell. Something to consider for any company doing Open Core today. For end-users / customers, “Open Source” vendors are not the only ones to be taken with a grain of salt; it’s the same with Fair Source vendors, since they may have a more complicated arrangement rather than just using a Fair Source license.

Thanks to Heather Meeker and Joseph Jacks for providing input, although this article reflects only my personal views.

August 25, 2024

I made some time to give some love to my own projects: I rewrote the Ansible role stafwag.ntpd and cleaned up some other Ansible roles.

There is some work ongoing for some other Ansible roles/projects, but this might be a topic for some other blog post(s) ;-)

freebsd with smartcard

stafwag.ntpd


An ansible role to configure ntpd/chrony/systemd-timesyncd.


This might be controversial, but I decided to add support for chrony and systemd-timesyncd. Ntpd is still supported and the default on the BSDs ( FreeBSD, NetBSD, OpenBSD).

It’s possible to switch between the ntp implementations by using the ntpd.provider directive.
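As a rough sketch, overriding the provider could look like the following, assuming a hypothetical site.yml playbook that applies stafwag.ntpd; the exact variable layout under ntpd is my reading of the directive above, so check the role’s README for the authoritative form:

# site.yml is a hypothetical playbook that applies the stafwag.ntpd role;
# the ntpd/provider variable structure is assumed from the ntpd.provider directive.
ansible-playbook -b site.yml -e '{"ntpd": {"provider": "chrony"}}'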

The Ansible role stafwag.ntpd v2.0.0 is available at:

Release notes

V2.0.0

  • Added support for chrony and systemd-timesyncd on GNU/Linux
    • systemd-timesyncd is the default on Debian GNU/Linux 12+ and Archlinux
    • ntpd is the default on the other operating systems (BSDs, Solaris) and on Debian GNU/Linux 10 and 11
    • chrony is the default on all other GNU/Linux distributions
    • The ntpd hash is used as the input for the role.
    • Updated README
    • CleanUp

stafwag.ntpdate


An ansible role to activate the ntpdate service on FreeBSD and NetBSD.


The ntpdate service is used on FreeBSD and NetBSD to sync the time during system boot-up. On most Linux distributions this is handled by chronyd or systemd-timesyncd now. The OpenBSD ntpd implementation, OpenNTPD, also supports syncing the time during system boot-up.
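For reference, on FreeBSD this is roughly the manual equivalent of what the role automates (a minimal sketch; the role itself may do more, e.g. the NetBSD handling):

# enable the one-shot time sync at boot and run it once now (FreeBSD)
sudo sysrc ntpdate_enable="YES"
sudo service ntpdate start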

The role is available at:

Release notes

V1.0.0

  • Initial release on Ansible Galaxy
    • Added support for NetBSD

stafwag.libvirt


An ansible role to install libvirt/KVM packages and enable the libvirtd service.


The role is available at:

Release notes

V1.1.3

  • Force bash for shell execution on Ubuntu.
    • The default dash shell doesn’t support pipefail.

V1.1.2

  • CleanUp
    • Corrected ansible-lint errors
    • Removed the install task "install/.yml"
      • This was introduced to support Kali Linux, Kali Linux is reported as “Debian” now.
      • It isn’t used in this role
    • Removed invalid CentOS-8.yml softlink
      • CentOS 8 should be caught by RedHat-yum.yml

stafwag.cloud_localds


An ansible role to create cloud-init config disk images. This role is a wrapper around the cloud-localds command.


It’s still planned to add support for distributions that don’t have cloud-localds as part of their official package repositories like RedHat 8+.

See the GitHub issue: https://github.com/stafwag/ansible-role-cloud_localds/issues/7
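For context, the underlying cloud-localds invocation that the role wraps boils down to something like this (hypothetical file names):

# build a cloud-init seed image from user-data and meta-data files
cloud-localds seed.img user-data meta-data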

The role is available at:

Release notes

V2.1.3

  • CleanUp
    • Switched to vars and package to install the required packages
    • Corrected ansible-lint errors
    • Added more examples

stafwag.qemu_img


An ansible role to create QEMU disk images.
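For context, the underlying command such a role drives looks like this (hypothetical image name and size):

# create a 20G qcow2 disk image
qemu-img create -f qcow2 vm-disk.qcow2 20G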


The role is available at:

Release notes

V2.3.0

  • CleanUp Release
    • Added doc/examples
    • Updated meta data
    • Switched to vars and package to install the required packages
    • Corrected ansible-lint errors

stafwag.virt_install_import


An ansible role to import virtual machines with the virt-install import command.


The role is available at:

Release notes

  • Use var and package to install pkgs
    • v1.2.1 wasn’t merged correctly. The release should fix it…
    • Switched to var and package to install the required packages
    • Updated meta data
    • Updated documentation and include examples
    • Corrected ansible-lint errors



Have fun!

August 13, 2024

Here’s a neat little trick for those of you using Home Assistant while also driving a Volvo.

To get your Volvo driving data (fuel level, battery state, …) into Home Assistant, there’s the excellent volvo2mqtt addon.

One little annoyance is that every time it starts up, you will receive an e-mail from Volvo with a two-factor authentication code, which you then have to enter in Home Assistant.

Fortunately, there’s a solution for that: you can automate this using the built-in imap support of Home Assistant, with an automation such as this one:

alias: Volvo OTP
description: ""
trigger:
  - platform: event
    event_type: imap_content
    event_data:
      initial: true
      sender: no-reply@volvocars.com
      subject: Your Volvo ID Verification code
condition: []
action:
  - service: mqtt.publish
    metadata: {}
    data:
      topic: volvoAAOS2mqtt/otp_code
      payload: >-
        {{ trigger.event.data['text'] | regex_findall_index(find='Your Volvo ID verification code is:\s+(\d+)', index=0) }}
  - service: imap.delete
    data:
      entry: "{{ trigger.event.data['entry_id'] }}"
      uid: "{{ trigger.event.data['uid'] }}"
mode: single

This will post the OTP code to the right location and then delete the message from your inbox (if you’re using Google Mail, that means archiving it).


Comments | More on rocketeer.be | @rubenv on Twitter

July 28, 2024


Updated @ Mon Sep 2 07:55:20 PM CEST 2024: Added devfs section
Updated @ Wed Sep 4 07:48:56 PM CEST 2024 : Corrected gpg-agent.conf


I use FreeBSD and GNU/Linux. freebsd with smartcard

In a previous blog post, we set up GnuPG with smartcard support on Debian GNU/Linux.

In this blog post, we’ll install and configure GnuPG with smartcard support on FreeBSD.

The GNU/Linux blog post provides more details about GnuPG, so it might be useful for the FreeBSD users to read it first.

Likewise, Linux users are welcome to read this blog post if they’re interested in how it’s done on FreeBSD ;-)

Install the required packages

To begin, we need to install the required packages on FreeBSD.

Update the package database

Execute pkg update to update the package database.
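For completeness, that is simply (output omitted here):

sudo pkg update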

Thunderbird

[staf@monty ~]$ sudo pkg install -y thunderbird
Password:
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
Checking integrity... done (0 conflicting)
The most recent versions of packages are already installed
[staf@monty ~]$ 

lsusb

You can verify the USB devices on FreeBSD using the usbconfig command or lsusb which is also available on FreeBSD as part of the usbutils package.

[staf@monty ~/git/stafnet/blog]$ sudo pkg install usbutils
Password:
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
The following 3 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
	usbhid-dump: 1.4
	usbids: 20240318
	usbutils: 0.91

Number of packages to be installed: 3

301 KiB to be downloaded.

Proceed with this action? [y/N]: y
[1/3] Fetching usbutils-0.91.pkg: 100%   54 KiB  55.2kB/s    00:01    
[2/3] Fetching usbhid-dump-1.4.pkg: 100%   32 KiB  32.5kB/s    00:01    
[3/3] Fetching usbids-20240318.pkg: 100%  215 KiB 220.5kB/s    00:01    
Checking integrity... done (0 conflicting)
[1/3] Installing usbhid-dump-1.4...
[1/3] Extracting usbhid-dump-1.4: 100%
[2/3] Installing usbids-20240318...
[2/3] Extracting usbids-20240318: 100%
[3/3] Installing usbutils-0.91...
[3/3] Extracting usbutils-0.91: 100%
[staf@monty ~/git/stafnet/blog]$

GnuPG

We’ll need GnuPG (of course), so ensure that it is installed.

[staf@monty ~]$ sudo pkg install gnupg
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
Checking integrity... done (0 conflicting)
The most recent versions of packages are already installed
[staf@monty ~]$ 

Smartcard packages

To enable smartcard support on FreeBSD, we’ll need to install the smartcard packages. The same software as on GNU/Linux - opensc - is available on FreeBSD.

pkg provides

It’s handy to be able to check which packages provide certain files. On FreeBSD this is provided by the provides plugin. This plugin is not enabled by default in the pkg command.

To install the provides plugin, install the pkg-provides package.

[staf@monty ~]$ sudo pkg install pkg-provides
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
The following 1 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
        pkg-provides: 0.7.3_3

Number of packages to be installed: 1

12 KiB to be downloaded.

Proceed with this action? [y/N]: y
[1/1] Fetching pkg-provides-0.7.3_3.pkg: 100%   12 KiB  12.5kB/s    00:01    
Checking integrity... done (0 conflicting)
[1/1] Installing pkg-provides-0.7.3_3...
[1/1] Extracting pkg-provides-0.7.3_3: 100%
=====
Message from pkg-provides-0.7.3_3:

--
In order to use the pkg-provides plugin you need to enable plugins in pkg.
To do this, uncomment the following lines in /usr/local/etc/pkg.conf file
and add pkg-provides to the supported plugin list:

PKG_PLUGINS_DIR = "/usr/local/lib/pkg/";
PKG_ENABLE_PLUGINS = true;
PLUGINS [ provides ];

After that run `pkg plugins' to see the plugins handled by pkg.
[staf@monty ~]$ 

Edit the pkg configuration to enable the provides plug-in.

staf@freebsd-gpg:~ $ sudo vi /usr/local/etc/pkg.conf
PKG_PLUGINS_DIR = "/usr/local/lib/pkg/";
PKG_ENABLE_PLUGINS = true;
PLUGINS [ provides ];

Verify that the plugin is enabled.

staf@freebsd-gpg:~ $ sudo pkg plugins
NAME       DESC                                          VERSION   
provides   A plugin for querying which package provides a particular file 0.7.3     
staf@freebsd-gpg:~ $ 

Update the pkg-provides database.

staf@freebsd-gpg:~ $ sudo pkg provides -u
Fetching provides database: 100%   18 MiB   9.6MB/s    00:02    
Extracting database....success
staf@freebsd-gpg:~ $

Install the required packages

Let’s check which packages provide the tools to set up the smartcard reader on FreeBSD, and install the required packages.

staf@freebsd-gpg:~ $ pkg provides "pkcs15-tool"
Name    : opensc-0.25.1
Comment : Libraries and utilities to access smart cards
Repo    : FreeBSD
Filename: usr/local/share/man/man1/pkcs15-tool.1.gz
          usr/local/etc/bash_completion.d/pkcs15-tool
          usr/local/bin/pkcs15-tool
staf@freebsd-gpg:~ $ 
staf@freebsd-gpg:~ $ pkg provides "bin/pcsc"
Name    : pcsc-lite-2.2.2,2
Comment : Middleware library to access a smart card using SCard API (PC/SC)
Repo    : FreeBSD
Filename: usr/local/sbin/pcscd
          usr/local/bin/pcsc-spy
staf@freebsd-gpg:~ $ 
[staf@monty ~]$ sudo pkg install opensc pcsc-lite
Password:
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
Checking integrity... done (0 conflicting)
The most recent versions of packages are already installed
[staf@monty ~]$ 
staf@freebsd-gpg:~ $ sudo pkg install -y pcsc-tools
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
Checking integrity... done (0 conflicting)
The most recent versions of packages are already installed
staf@freebsd-gpg:~ $ 
staf@freebsd-gpg:~ $ sudo pkg install -y ccid
Password:
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
Checking integrity... done (0 conflicting)
The most recent versions of packages are already installed
staf@freebsd-gpg:~ $ 

USB

To use the smartcard reader we will need access to the USB devices as the user we use for our desktop environment. (No, this shouldn’t be the root user :-) )

permissions

verify

Execute the usbconfig command to verify that you can access the USB devices.

[staf@snuffel ~]$ usbconfig
No device match or lack of permissions.
[staf@snuffel ~]$ 

If you don’t have access, verify the permissions of the USB devices.

[staf@snuffel ~]$ ls -l /dev/usbctl
crw-r--r--  1 root operator 0x5b Sep  2 19:17 /dev/usbctl
[staf@snuffel ~]$  ls -l /dev/usb/
total 0
crw-------  1 root operator 0x34 Sep  2 19:17 0.1.0
crw-------  1 root operator 0x4f Sep  2 19:17 0.1.1
crw-------  1 root operator 0x36 Sep  2 19:17 1.1.0
crw-------  1 root operator 0x53 Sep  2 19:17 1.1.1
crw-------  1 root operator 0x7e Sep  2 19:17 1.2.0
crw-------  1 root operator 0x82 Sep  2 19:17 1.2.1
crw-------  1 root operator 0x83 Sep  2 19:17 1.2.2
crw-------  1 root operator 0x76 Sep  2 19:17 1.3.0
crw-------  1 root operator 0x8a Sep  2 19:17 1.3.1
crw-------  1 root operator 0x8b Sep  2 19:17 1.3.2
crw-------  1 root operator 0x8c Sep  2 19:17 1.3.3
crw-------  1 root operator 0x8d Sep  2 19:17 1.3.4
crw-------  1 root operator 0x38 Sep  2 19:17 2.1.0
crw-------  1 root operator 0x56 Sep  2 19:17 2.1.1
crw-------  1 root operator 0x3a Sep  2 19:17 3.1.0
crw-------  1 root operator 0x51 Sep  2 19:17 3.1.1
crw-------  1 root operator 0x3c Sep  2 19:17 4.1.0
crw-------  1 root operator 0x55 Sep  2 19:17 4.1.1
crw-------  1 root operator 0x3e Sep  2 19:17 5.1.0
crw-------  1 root operator 0x54 Sep  2 19:17 5.1.1
crw-------  1 root operator 0x80 Sep  2 19:17 5.2.0
crw-------  1 root operator 0x85 Sep  2 19:17 5.2.1
crw-------  1 root operator 0x86 Sep  2 19:17 5.2.2
crw-------  1 root operator 0x87 Sep  2 19:17 5.2.3
crw-------  1 root operator 0x40 Sep  2 19:17 6.1.0
crw-------  1 root operator 0x52 Sep  2 19:17 6.1.1
crw-------  1 root operator 0x42 Sep  2 19:17 7.1.0
crw-------  1 root operator 0x50 Sep  2 19:17 7.1.1

devfs

When the /dev/usb* devices are only accessible by the root user, you probably want to create a devfs.rules ruleset that grants permissions to the operator group or another group.

See https://man.freebsd.org/cgi/man.cgi?devfs.rules for more details.

/etc/rc.conf

Update the /etc/rc.conf to apply custom devfs permissions.

[staf@snuffel /etc]$ sudo vi rc.conf
devfs_system_ruleset="localrules"

/etc/devfs.rules

Create or update /etc/devfs.rules with the updated permissions to grant read/write access to the operator group.

[staf@snuffel /etc]$ sudo vi devfs.rules
[localrules=10]
add path 'usbctl*' mode 0660 group operator
add path 'usb/*' mode 0660 group operator

Restart the devfs service to apply the custom devfs ruleset.

[staf@snuffel /etc]$ sudo -i
root@snuffel:~ #
root@snuffel:~ # service devfs restart

The operator group should have read/write permissions now.

root@snuffel:~ # ls -l /dev/usb/
total 0
crw-rw----  1 root operator 0x34 Sep  2 19:17 0.1.0
crw-rw----  1 root operator 0x4f Sep  2 19:17 0.1.1
crw-rw----  1 root operator 0x36 Sep  2 19:17 1.1.0
crw-rw----  1 root operator 0x53 Sep  2 19:17 1.1.1
crw-rw----  1 root operator 0x7e Sep  2 19:17 1.2.0
crw-rw----  1 root operator 0x82 Sep  2 19:17 1.2.1
crw-rw----  1 root operator 0x83 Sep  2 19:17 1.2.2
crw-rw----  1 root operator 0x76 Sep  2 19:17 1.3.0
crw-rw----  1 root operator 0x8a Sep  2 19:17 1.3.1
crw-rw----  1 root operator 0x8b Sep  2 19:17 1.3.2
crw-rw----  1 root operator 0x8c Sep  2 19:17 1.3.3
crw-rw----  1 root operator 0x8d Sep  2 19:17 1.3.4
crw-rw----  1 root operator 0x38 Sep  2 19:17 2.1.0
crw-rw----  1 root operator 0x56 Sep  2 19:17 2.1.1
crw-rw----  1 root operator 0x3a Sep  2 19:17 3.1.0
crw-rw----  1 root operator 0x51 Sep  2 19:17 3.1.1
crw-rw----  1 root operator 0x3c Sep  2 19:17 4.1.0
crw-rw----  1 root operator 0x55 Sep  2 19:17 4.1.1
crw-rw----  1 root operator 0x3e Sep  2 19:17 5.1.0
crw-rw----  1 root operator 0x54 Sep  2 19:17 5.1.1
crw-rw----  1 root operator 0x80 Sep  2 19:17 5.2.0
crw-rw----  1 root operator 0x85 Sep  2 19:17 5.2.1
crw-rw----  1 root operator 0x86 Sep  2 19:17 5.2.2
crw-rw----  1 root operator 0x87 Sep  2 19:17 5.2.3
crw-rw----  1 root operator 0x40 Sep  2 19:17 6.1.0
crw-rw----  1 root operator 0x52 Sep  2 19:17 6.1.1
crw-rw----  1 root operator 0x42 Sep  2 19:17 7.1.0
crw-rw----  1 root operator 0x50 Sep  2 19:17 7.1.1
root@snuffel:~ # 

Make sure that you’re part of the operator group

staf@freebsd-gpg:~ $ ls -l /dev/usbctl 
crw-rw----  1 root operator 0x5a Jul 13 17:32 /dev/usbctl
staf@freebsd-gpg:~ $ ls -l /dev/usb/
total 0
crw-rw----  1 root operator 0x31 Jul 13 17:32 0.1.0
crw-rw----  1 root operator 0x53 Jul 13 17:32 0.1.1
crw-rw----  1 root operator 0x33 Jul 13 17:32 1.1.0
crw-rw----  1 root operator 0x51 Jul 13 17:32 1.1.1
crw-rw----  1 root operator 0x35 Jul 13 17:32 2.1.0
crw-rw----  1 root operator 0x52 Jul 13 17:32 2.1.1
crw-rw----  1 root operator 0x37 Jul 13 17:32 3.1.0
crw-rw----  1 root operator 0x54 Jul 13 17:32 3.1.1
crw-rw----  1 root operator 0x73 Jul 13 17:32 3.2.0
crw-rw----  1 root operator 0x75 Jul 13 17:32 3.2.1
crw-rw----  1 root operator 0x76 Jul 13 17:32 3.3.0
crw-rw----  1 root operator 0x78 Jul 13 17:32 3.3.1
staf@freebsd-gpg:~ $ 

You’ll need to be part of the operator group to access the USB devices.

Execute the vigr command and add the user to the operator group.

staf@freebsd-gpg:~ $ sudo vigr
operator:*:5:root,staf
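
If vigr isn't available on your system, pw(8) can do the same thing; a minimal sketch, assuming the user name staf:

staf@freebsd-gpg:~ $ sudo pw groupmod operator -m staf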

Log back in and check that you are in the operator group.

staf@freebsd-gpg:~ $ id
uid=1001(staf) gid=1001(staf) groups=1001(staf),0(wheel),5(operator)
staf@freebsd-gpg:~ $ 

The usbconfig command should work now.

staf@freebsd-gpg:~ $ usbconfig
ugen1.1: <Intel UHCI root HUB> at usbus1, cfg=0 md=HOST spd=FULL (12Mbps) pwr=SAVE (0mA)
ugen2.1: <Intel UHCI root HUB> at usbus2, cfg=0 md=HOST spd=FULL (12Mbps) pwr=SAVE (0mA)
ugen0.1: <Intel UHCI root HUB> at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=SAVE (0mA)
ugen3.1: <Intel EHCI root HUB> at usbus3, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen3.2: <QEMU Tablet Adomax Technology Co., Ltd> at usbus3, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (100mA)
ugen3.3: <QEMU Tablet Adomax Technology Co., Ltd> at usbus3, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (100mA)
staf@freebsd-gpg:~ $ 

SmartCard configuration

Verify the USB connection

The first step is to ensure your smartcard reader is detected at the USB level. Execute usbconfig and lsusb and make sure your smartcard reader is listed.

usbconfig

List the USB devices.

[staf@monty ~/git]$ usbconfig
ugen1.1: <Intel EHCI root HUB> at usbus1, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen0.1: <Intel XHCI root HUB> at usbus0, cfg=0 md=HOST spd=SUPER (5.0Gbps) pwr=SAVE (0mA)
ugen2.1: <Intel EHCI root HUB> at usbus2, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen2.2: <Integrated Rate Matching Hub Intel Corp.> at usbus2, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen1.2: <Integrated Rate Matching Hub Intel Corp.> at usbus1, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen0.2: <AU9540 Smartcard Reader Alcor Micro Corp.> at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (50mA)
ugen0.3: <VFS 5011 fingerprint sensor Validity Sensors, Inc.> at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (100mA)
ugen0.4: <Centrino Bluetooth Wireless Transceiver Intel Corp.> at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (0mA)
ugen0.5: <SunplusIT INC. Integrated Camera> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (500mA)
ugen0.6: <X-Rite Pantone Color Sensor X-Rite, Inc.> at usbus0, cfg=0 md=HOST spd=LOW (1.5Mbps) pwr=ON (100mA)
ugen0.7: <GemPC Key SmartCard Reader Gemalto (was Gemplus)> at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (50mA)
[staf@monty ~/git]$ 

lsusb
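
On FreeBSD, lsusb is not part of the base system; it is provided by the usbutils package (assuming the usual port name), so install it with pkg if needed:

[staf@monty ~/git]$ sudo pkg install usbutils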

[staf@monty ~/git/stafnet/blog]$ lsusb
Bus /dev/usb Device /dev/ugen0.7: ID 08e6:3438 Gemalto (was Gemplus) GemPC Key SmartCard Reader
Bus /dev/usb Device /dev/ugen0.6: ID 0765:5010 X-Rite, Inc. X-Rite Pantone Color Sensor
Bus /dev/usb Device /dev/ugen0.5: ID 04f2:b39a Chicony Electronics Co., Ltd 
Bus /dev/usb Device /dev/ugen0.4: ID 8087:07da Intel Corp. Centrino Bluetooth Wireless Transceiver
Bus /dev/usb Device /dev/ugen0.3: ID 138a:0017 Validity Sensors, Inc. VFS 5011 fingerprint sensor
Bus /dev/usb Device /dev/ugen0.2: ID 058f:9540 Alcor Micro Corp. AU9540 Smartcard Reader
Bus /dev/usb Device /dev/ugen1.2: ID 8087:8008 Intel Corp. Integrated Rate Matching Hub
Bus /dev/usb Device /dev/ugen2.2: ID 8087:8000 Intel Corp. Integrated Rate Matching Hub
Bus /dev/usb Device /dev/ugen2.1: ID 0000:0000  
Bus /dev/usb Device /dev/ugen0.1: ID 0000:0000  
Bus /dev/usb Device /dev/ugen1.1: ID 0000:0000  
[staf@monty ~/git/stafnet/blog]$ 

Check the GnuPG smartcard status

Let’s check if we get access to our smartcard with gpg.

This might work out of the box if your smartcard reader is natively supported by GnuPG.

[staf@monty ~]$ gpg --card-status
gpg: selecting card failed: Operation not supported by device
gpg: OpenPGP card not available: Operation not supported by device
[staf@monty ~]$ 

In my case, it doesn’t work. I prefer the OpenSC interface anyway; it can also be useful if you want to use your smartcard for other purposes.

opensc

Enable pcscd

FreeBSD has a handy tool, sysrc, to manage rc.conf.

Enable the pcscd service.

[staf@monty ~]$ sudo sysrc pcscd_enable=YES
Password:
pcscd_enable: NO -> YES
[staf@monty ~]$ 

Start the pcscd service.

[staf@monty ~]$ sudo /usr/local/etc/rc.d/pcscd start
Password:
Starting pcscd.
[staf@monty ~]$ 
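
Alternatively, since pcscd_enable is now set in rc.conf, the service(8) wrapper should do the same:

[staf@monty ~]$ sudo service pcscd start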

Verify smartcard access

pcsc_scan

The pcsc-tools package provides pcsc_scan, a tool to verify the smartcard readers.
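
If pcsc_scan isn't installed yet, it can be added with pkg; a sketch, assuming the FreeBSD package name pcsc-tools:

[staf@monty ~]$ sudo pkg install pcsc-tools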

Execute pcsc_scan to verify that your smartcard is detected.

[staf@monty ~]$ pcsc_scan 
PC/SC device scanner
V 1.7.1 (c) 2001-2022, Ludovic Rousseau <ludovic.rousseau@free.fr>
Using reader plug'n play mechanism
Scanning present readers...
0: Gemalto USB Shell Token V2 (284C3E93) 00 00
1: Alcor Micro AU9540 01 00
 
Thu Jul 25 18:42:34 2024
 Reader 0: Gemalto USB Shell Token V2 (<snip>) 00 00
  Event number: 0
  Card state: Card inserted, 
  ATR: <snip>

ATR: <snip>
+ TS = 3B --> Direct Convention
+ T0 = DA, Y(1): 1101, K: 10 (historical bytes)
  TA(1) = 18 --> Fi=372, Di=12, 31 cycles/ETU
    129032 bits/s at 4 MHz, fMax for Fi = 5 MHz => 161290 bits/s
  TC(1) = FF --> Extra guard time: 255 (special value)
  TD(1) = 81 --> Y(i+1) = 1000, Protocol T = 1 
-----
  TD(2) = B1 --> Y(i+1) = 1011, Protocol T = 1 
-----
  TA(3) = FE --> IFSC: 254
  TB(3) = 75 --> Block Waiting Integer: 7 - Character Waiting Integer: 5
  TD(3) = 1F --> Y(i+1) = 0001, Protocol T = 15 - Global interface bytes following 
-----
  TA(4) = 03 --> Clock stop: not supported - Class accepted by the card: (3G) A 5V B 3V 
+ Historical bytes: 00 31 C5 73 C0 01 40 00 90 00
  Category indicator byte: 00 (compact TLV data object)
    Tag: 3, len: 1 (card service data byte)
      Card service data byte: C5
        - Application selection: by full DF name
        - Application selection: by partial DF name
        - EF.DIR and EF.ATR access services: by GET DATA command
        - Card without MF
    Tag: 7, len: 3 (card capabilities)
      Selection methods: C0
        - DF selection by full DF name
        - DF selection by partial DF name
      Data coding byte: 01
        - Behaviour of write functions: one-time write
        - Value 'FF' for the first byte of BER-TLV tag fields: invalid
        - Data unit in quartets: 2
      Command chaining, length fields and logical channels: 40
        - Extended Lc and Le fields
        - Logical channel number assignment: No logical channel
        - Maximum number of logical channels: 1
    Mandatory status indicator (3 last bytes)
      LCS (life card cycle): 00 (No information given)
      SW: 9000 (Normal processing.)
+ TCK = 0C (correct checksum)

Possibly identified card (using /usr/local/share/pcsc/smartcard_list.txt):
<snip>
        OpenPGP Card V2

 Reader 1: Alcor Micro AU9540 01 00
  Event number: 0
  Card state

pkcs15

PKCS#15 describes how keys and certificates are stored on hardware tokens, while PKCS#11 is the low-level programming interface that applications use to talk to them.
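
If you want to poke at the low-level PKCS#11 side instead, OpenSC also ships pkcs11-tool; for example, to list the available slots (it uses the OpenSC PKCS#11 module by default):

[staf@monty ~]$ pkcs11-tool --list-slots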

You can use pkcs15-tool -D to verify that your smartcard is detected.

[staf@monty ~]$ pkcs15-tool -D
Using reader with a card: Gemalto USB Shell Token V2 (<snip>) 00 00
PKCS#15 Card [OpenPGP card]:
        Version        : 0
        Serial number  : <snip>
        Manufacturer ID: ZeitControl
        Language       : nl
        Flags          : PRN generation, EID compliant


PIN [User PIN]
        Object Flags   : [0x03], private, modifiable
        Auth ID        : 03
        ID             : 02
        Flags          : [0x13], case-sensitive, local, initialized
        Length         : min_len:6, max_len:32, stored_len:32
        Pad char       : 0x00
        Reference      : 2 (0x02)
        Type           : UTF-8
        Path           : 3f00
        Tries left     : 3

PIN [User PIN (sig)]
        Object Flags   : [0x03], private, modifiable
        Auth ID        : 03
        ID             : 01
        Flags          : [0x13], case-sensitive, local, initialized
        Length         : min_len:6, max_len:32, stored_len:32
        Pad char       : 0x00
        Reference      : 1 (0x01)
        Type           : UTF-8
        Path           : 3f00
        Tries left     : 0

PIN [Admin PIN]
        Object Flags   : [0x03], private, modifiable
        ID             : 03
        Flags          : [0x9B], case-sensitive, local, unblock-disabled, initialized, soPin
        Length         : min_len:8, max_len:32, stored_len:32
        Pad char       : 0x00
        Reference      : 3 (0x03)
        Type           : UTF-8
        Path           : 3f00
        Tries left     : 0

Private RSA Key [Signature key]
        Object Flags   : [0x03], private, modifiable
        Usage          : [0x20C], sign, signRecover, nonRepudiation
        Access Flags   : [0x1D], sensitive, alwaysSensitive, neverExtract, local
        Algo_refs      : 0
        ModLength      : 3072
        Key ref        : 0 (0x00)
        Native         : yes
        Auth ID        : 01
        ID             : 01
        MD:guid        : <snip>

Private RSA Key [Encryption key]
        Object Flags   : [0x03], private, modifiable
        Usage          : [0x22], decrypt, unwrap
        Access Flags   : [0x1D], sensitive, alwaysSensitive, neverExtract, local
        Algo_refs      : 0
        ModLength      : 3072
        Key ref        : 1 (0x01)
        Native         : yes
        Auth ID        : 02
        ID             : 02
        MD:guid        : <snip>

Private RSA Key [Authentication key]
        Object Flags   : [0x03], private, modifiable
        Usage          : [0x200], nonRepudiation
        Access Flags   : [0x1D], sensitive, alwaysSensitive, neverExtract, local
        Algo_refs      : 0
        ModLength      : 3072
        Key ref        : 2 (0x02)
        Native         : yes
        Auth ID        : 02
        ID             : 03
        MD:guid        : <snip>

Public RSA Key [Signature key]
        Object Flags   : [0x02], modifiable
        Usage          : [0xC0], verify, verifyRecover
        Access Flags   : [0x02], extract
        ModLength      : 3072
        Key ref        : 0 (0x00)
        Native         : no
        Path           : b601
        ID             : 01

Public RSA Key [Encryption key]
        Object Flags   : [0x02], modifiable
        Usage          : [0x11], encrypt, wrap
        Access Flags   : [0x02], extract
        ModLength      : 3072
        Key ref        : 0 (0x00)
        Native         : no
        Path           : b801
        ID             : 02

Public RSA Key [Authentication key]
        Object Flags   : [0x02], modifiable
        Usage          : [0x40], verify
        Access Flags   : [0x02], extract
        ModLength      : 3072
        Key ref        : 0 (0x00)
        Native         : no
        Path           : a401
        ID             : 03

[staf@monty ~]$ 

GnuPG configuration

First test

Stop (kill) the scdaemon, so that a freshly started scdaemon tries to access the smartcard now that pcscd is running.

[staf@monty ~]$ gpgconf --kill scdaemon
[staf@monty ~]$ 
[staf@monty ~]$ ps aux | grep -i scdaemon
staf  9236  0.0  0.0   12808   2496  3  S+   20:42   0:00.00 grep -i scdaemon
[staf@monty ~]$ 

Try to read the card status again.

[staf@monty ~]$ gpg --card-status
gpg: selecting card failed: Operation not supported by device
gpg: OpenPGP card not available: Operation not supported by device
[staf@monty ~]$ 

Reconfigure GnuPG

Go to the .gnupg directory in your $HOME directory.

[staf@monty ~]$ cd .gnupg/
[staf@monty ~/.gnupg]$ 

scdaemon

Reconfigure scdaemon to disable the internal CCID driver and to enable logging, which is always useful to figure out why something isn’t working…

[staf@monty ~/.gnupg]$ vi scdaemon.conf
disable-ccid

verbose
debug-level expert
debug-all
log-file    /home/staf/logs/scdaemon.log
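
If the log directory (~/logs in this example) doesn't exist yet, create it first so scdaemon can write its log file:

[staf@monty ~/.gnupg]$ mkdir -p ~/logs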

gpg-agent

Enable debug logging for the gpg-agent.

[staf@monty ~/.gnupg]$ vi gpg-agent.conf
debug-level expert
verbose
verbose
log-file /home/staf/logs/gpg-agent.log

Verify

Stop the scdaemon.

[staf@monty ~/.gnupg]$ gpgconf --kill scdaemon
[staf@monty ~/.gnupg]$ 

If everything goes well, gpg will detect the smartcard.

If not, you have the log files to do some debugging ;-)

[staf@monty ~/.gnupg]$ gpg --card-status
Reader ...........: Gemalto USB Shell Token V2 (<snip>) 00 00
Application ID ...: <snip>
Application type .: OpenPGP
Version ..........: 2.1
Manufacturer .....: ZeitControl
Serial number ....: 000046F1
Name of cardholder: <snip>
Language prefs ...: nl
Salutation .......: Mr.
URL of public key : <snip>
Login data .......: [not set]
Signature PIN ....: forced
Key attributes ...: xxxxxxx xxxxxxx xxxxxxx
Max. PIN lengths .: 32 32 32
PIN retry counter : 3 0 3
Signature counter : 80
Signature key ....: <snip>
      created ....: <snip>
Encryption key....: <snip>
      created ....: <snip>
Authentication key: <snip>
      created ....: <snip>
General key info..: [none]
[staf@monty ~/.gnupg]$ 

Test

shadow private keys

After you executed gpg --card-status, GnuPG created “shadow private keys”. These keys just contain references to the hardware token on which the private keys are stored.

[staf@monty ~/.gnupg]$ ls -l private-keys-v1.d/
total 14
-rw-------  1 staf staf 976 Mar 24 11:35 <snip>.key
-rw-------  1 staf staf 976 Mar 24 11:35 <snip>.key
-rw-------  1 staf staf 976 Mar 24 11:35 <snip>.key
[staf@monty ~/.gnupg]$ 

You can list the (shadow) private keys with the gpg --list-secret-keys command.
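
For example (gpg -K is the short form of the same command); keys that live on a smartcard typically show up with a > marker, e.g. ssb>:

[staf@monty ~/.gnupg]$ gpg --list-secret-keys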

Pinentry

To be able to type in your PIN code, you’ll need a pinentry application unless your smartcard reader has a pinpad.

You can use pkg provides to verify which pinentry applications are available.

For the integration with Thunderbird, you probably want a graphical version. But that is a topic for a next blog post ;-)

We’ll stick with the (n)curses version for now.

Install a pinentry program.

[staf@monty ~/.gnupg]$ pkg provides pinentry | grep -i curses
Name    : pinentry-curses-1.3.1
Comment : Curses version of the GnuPG password dialog
Filename: usr/local/bin/pinentry-curses
[staf@monty ~/.gnupg]$ 
[staf@monty ~/.gnupg]$ sudo pkg install pinentry-curses
Password:
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
Checking integrity... done (0 conflicting)
The most recent versions of packages are already installed
[staf@monty ~/.gnupg]$ 

A soft link is created for the pinentry binary; on FreeBSD, the /usr/local/bin/pinentry soft link is managed by the pinentry package.

You can verify this with the pkg which command.

[staf@monty ~]$ pkg which /usr/local/bin/pinentry
/usr/local/bin/pinentry was installed by package pinentry-1.3.1
[staf@monty ~]$ 

The curses version is the default.

If you want to use another pinentry program, set it in the gpg-agent configuration ($HOME/.gnupg/gpg-agent.conf):

pinentry-program <PATH>
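
For example, to explicitly select the curses version installed above (adjust the path if you use a different pinentry package):

pinentry-program /usr/local/bin/pinentry-curses

Kill the gpg-agent afterwards (gpgconf --kill gpg-agent) so the new setting is picked up.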

Import your public key

Import your public key.

[staf@monty /tmp]$ gpg --import <snip>.asc
gpg: key <snip>: public key "<snip>" imported
gpg: Total number processed: 1
gpg:               imported: 1
[staf@monty /tmp]$ 

List the public keys.

[staf@monty /tmp]$ gpg --list-keys
/home/staf/.gnupg/pubring.kbx
-----------------------------
pub   XXXXXXX XXXX-XX-XX [SC]
      <snip>
uid           [ unknown] <snip>
sub   XXXXXXX XXXX-XX-XX [A]
sub   XXXXXXX XXXX-XX-XX [E]

[staf@monty /tmp]$ 

As a test, we try to sign something with the private key on our GnuPG smartcard.

Create a test file.

[staf@monty /tmp]$ echo "foobar" > foobar
[staf@monty /tmp]$ 
[staf@monty /tmp]$ gpg --sign foobar

If your smartcard isn’t inserted, GnuPG will ask you to insert it.

GnuPG asks for the smartcard by the serial number stored in the shadow private key.


                ┌────────────────────────────────────────────┐
                │ Please insert the card with serial number: │
                │                                            │
                │ XXXX XXXXXXXX                              │
                │                                            │
                │                                            │
                │      <OK>                      <Cancel>    │
                └────────────────────────────────────────────┘


Type in your PIN code.



               ┌──────────────────────────────────────────────┐
               │ Please unlock the card                       │
               │                                              │
               │ Number: XXXX XXXXXXXX                        │
               │ Holder: XXXX XXXXXXXXXX                      │
               │ Counter: XX                                  │
               │                                              │
               │ PIN ________________________________________ │
               │                                              │
               │      <OK>                        <Cancel>    │
               └──────────────────────────────────────────────┘


[staf@monty /tmp]$ ls -l foobar*
-rw-r-----  1 staf wheel   7 Jul 27 11:11 foobar
-rw-r-----  1 staf wheel 481 Jul 27 11:17 foobar.gpg
[staf@monty /tmp]$ 
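
To complete the test, you can verify the signature you just created (the public key was imported earlier):

[staf@monty /tmp]$ gpg --verify foobar.gpg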

In the next blog post in this series, we’ll configure Thunderbird to use the smartcard for OpenPGP email encryption.

Have fun!

Links