Planet Grep

Planet'ing Belgian FLOSS people

Planet Grep is maintained by Wouter Verhelst. All times are in UTC.

August 06, 2025

Not every day do I get an email from a very serious security researcher, clearly a man on a mission to save the internet — one vague, copy-pasted email at a time.

Here’s the message I received:

From: Peter Hooks <peterhooks007@gmail.com>
Subject: Security Vulnerability Disclosure

Hi Team,

I’ve identified security vulnerabilities in your app that may put users at risk. I’d like to report these responsibly and help ensure they are resolved quickly.

Please advise on your disclosure protocol, or share details if you have a Bug Bounty program in place.

Looking forward to your reply.

Best regards,
Peter Hooks

Right. Let’s unpack this.


🧯”Your App” — What App?

I’m not a company. I’m not a startup. I’m not even a garage-based stealth tech bro.
I run a personal WordPress blog. That’s it.

There is no “app.” There are no “users at risk” (unless you count me, and I’m already beyond saving).


đŸ•”ïžâ€â™‚ïž The Anatomy of a Beg Bounty Email

This little email ticks all the boxes of what the security community affectionately calls a beg bounty — someone scanning random domains, finding trivial issues or non-issues, and fishing for a payout.

Want to see how common this is? Check out:


📌 My (Admittedly Snarky) Reply

I couldn’t resist. Here’s the reply I sent:

Hi Peter,

Thanks for your email and your keen interest in my “app” — spoiler alert: there isn’t one. Just a humble personal blog here.

Your message hits all the classic marks of a beg bounty reconnaissance email:

  • ✅ Generic “Hi Team” greeting — because who needs names?
  • ✅ Vague claims of “security vulnerabilities” with zero specifics
  • ✅ Polite inquiry about a bug bounty program (spoiler: none here, James)
  • ✅ No proof, no details, just good old-fashioned mystery
  • ✅ Friendly tone crafted to reel in easy targets
  • ✅ Email address proudly featuring “007” — very covert ops of you

Bravo. You almost had me convinced.

I’ll be featuring this charming little interaction in a blog post soon — starring you, of course. If you ever feel like upgrading from vague templates to actual evidence, I’m all ears. Until then, happy fishing!

Cheers,
Amedee


😶 No Reply

Sadly, Peter didn’t write back.

No scathing rebuttal.
No actual vulnerabilities.
No awkward attempt at pivoting.
Just… silence.

#sadface
#crying
#missionfailed


đŸ›Ąïž A Note for Fellow Nerds

If you’ve got a domain name, no matter how small, there’s a good chance you’ll get emails like this.

Here’s how to handle them:

  • Stay calm — most of these are low-effort probes.
  • Don’t pay — you owe nothing to random strangers on the internet.
  • Don’t panic — vague threats are just that: vague.
  • Do check your stuff occasionally for actual issues.
  • Bonus: write a blog post about it and enjoy the catharsis.

For more context on this phenomenon, don’t miss:


🧔 tl;dr

If your “security researcher”:

  • doesn’t say what they found,
  • doesn’t mention your actual domain or service,
  • asks for a bug bounty up front,
  • signs with a Gmail address ending in 007


it’s probably not the start of a beautiful friendship.


Got a similar email? Want help crafting a reply that’s equally professional and petty?
Feel free to drop a comment or reach out — I’ll even throw in a checklist.

Until then: stay patched, stay skeptical, and stay snarky. 😎

August 05, 2025

Rethinking DOM from first principles

Cover Image

Browsers are in a very weird place. While WebAssembly has succeeded, even on the server, the client still feels largely the same as it did 10 years ago.

Enthusiasts will tell you that accessing native web APIs via WASM is a solved problem, with some minimal JS glue.

But the question not asked is why you would want to access the DOM. It's just the only option. So I'd like to explain why it really is time to send the DOM and its assorted APIs off to a farm somewhere, with some ideas on how.

I won't pretend to know everything about browsers. Nobody knows everything anymore, and that's the problem.

Netscape or something

The 'Document' Model

Few know how bad the DOM really is. In Chrome, document.body now has 350+ keys, grouped roughly like this:

document.body properties

This doesn't include the CSS properties in document.body.style of which there are... 660.
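If you want to reproduce those counts yourself, a quick console sketch works; the exact numbers vary by browser and version, so treat them as ballpark figures:

// Count enumerable properties reachable on document.body (own + inherited)
let domKeys = 0;
for (const key in document.body) domKeys++;

// Same for the inline style object, which exposes the CSS properties
let cssKeys = 0;
for (const key in document.body.style) cssKeys++;

// In current Chrome these land around the 350+ and ~660 mentioned above
console.log(domKeys, cssKeys);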

The boundary between properties and methods is very vague. Many are just facades with an invisible setter behind them. Some getters may trigger a just-in-time re-layout. There's ancient legacy stuff, like all the onevent properties nobody uses anymore.

The DOM is not lean and continues to get fatter. Whether you notice this largely depends on whether you are making web pages or web applications.

Most devs now avoid working with the DOM directly, though occasionally some purist will praise pure DOM as being superior to the various JS component/templating frameworks. What little declarative facilities the DOM has, like innerHTML, do not resemble modern UI patterns at all. The DOM has too many ways to do the same thing, none of them nice.

connectedCallback() {
  const
    shadow = this.attachShadow({ mode: 'closed' }),
    template = document.getElementById('hello-world')
      .content.cloneNode(true),
    hwMsg = `Hello ${ this.name }`;

  Array.from(template.querySelectorAll('.hw-text'))
    .forEach(n => n.textContent = hwMsg);

  shadow.append(template);
}

Web Components deserve a mention, being the web-native equivalent of JS component libraries. But they came too late and are unpopular. The API seems clunky, with its Shadow DOM introducing new nesting and scoping layers. Proponents kinda read like apologetics.

The Achilles' heel is the DOM's SGML/XML heritage, which makes everything stringly typed. React-likes do not have this problem; their syntax only looks like XML. Devs have learned not to keep state in the document, because it's inadequate for it.

W3C logo
WHATWG logo

For HTML itself, there isn't much to critique because nothing has changed in 10-15 years. Only ARIA (accessibility) is notable, and only because this was what Semantic HTML was supposed to do and didn't.

Semantic HTML never quite reached its goal. Despite dating from around 2011, there is e.g. no <thread> or <comment> tag, when those were well-established idioms. Instead, an article inside an article is probably a comment. The guidelines are... weird.

There's this feeling that HTML always had paper-envy, and couldn't quite embrace or fully define its hypertext nature, and did not trust its users to follow clear rules.

Stewardship of HTML has since firmly passed to WHATWG, really the browser vendors, who have not been able to define anything more concrete as a vision, and have instead just added epicycles at the margins.

Along the way even CSS has grown expressions, because every templating language wants to become a programming language.

netscape composer

Editability of HTML remains a sad footnote. While technically supported via contentEditable, actually wrangling this feature into something usable for applications is a dark art. I'm sure the Google Docs and Notion people have horror stories.

Nobody really believes in the old gods of progressive enhancement and separating markup from style anymore, not if they make apps.

Most of the applications you see nowadays will kitbash HTML/CSS/SVG into a pretty enough shape. But this comes with immense overhead, and is looking more and more like the opposite of a decent UI toolkit.

slack input editor

The Slack input box

layout hack

Off-screen clipboard hacks

Lists and tables must be virtualized by hand, taking over for layout, resizing, dragging, and so on. Making a chat window's scrollbar stick to the bottom is somebody's TODO, every single time. And the more you virtualize, the more you have to reinvent find-in-page, right-click menus, etc.

The web blurred the distinction between UI and fluid content, which was novel at the time. But it makes less and less sense, because the UI part is a decade obsolete, and the content has largely homogenized.

'css is awesome' mug, truncated layout

CSS is inside-out

CSS doesn't have a stellar reputation either, but few can put their finger on exactly why.

Where most people go wrong is to start with the wrong mental model, approaching it like a constraint solver. This is easy to show with e.g.:

<div>
  <div style="height: 50%">...</div>
  <div style="height: 50%">...</div>
</div>
<div>
  <div style="height: 100%">...</div>
  <div style="height: 100%">...</div>
</div>

The first might seem reasonable: divide the parent into two halves vertically. But what about the second?

Viewed as a set of constraints, it's contradictory, because the parent div is twice as tall as... itself. What will happen instead in both cases is the height is ignored. The parent height is unknown and CSS doesn't backtrack or iterate here. It just shrink-wraps the contents.

If you set e.g. height: 300px on the parent, then it works, but the latter case will still just spill out.

Outside-in vs inside-out layout

Outside-in and inside-out layout modes

Instead, your mental model of CSS should be applying two passes of constraints, first going outside-in, and then inside-out.

When you make an application frame, this is outside-in: the available space is divided, and the content inside does not affect sizing of panels.

When paragraphs stack on a page, this is inside-out: the text stretches out its containing parent. This is what HTML wants to do naturally.

By being structured this way, CSS layouts are computationally pretty simple. You can propagate the parent constraints down to the children, and then gather up the children's sizes in the other direction. This is attractive and allows webpages to scale well in terms of elements and text content.

CSS is always inside-out by default, reflecting its document-oriented nature. The outside-in is not obvious, because it's up to you to pass all the constraints down, starting with body { height: 100%; }. This is why they always say vertical alignment in CSS is hard.
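As a minimal sketch of that hand-off (class names invented for illustration), the outside-in chain only works when every level passes the constraint along:

/* Outside-in: hand the viewport height down explicitly, level by level */
html, body { height: 100%; margin: 0; }

.app  { height: 100%; }  /* receives the viewport height from body */
.half { height: 50%; }   /* now meaningful, because the parent height is known */

Break the chain anywhere and you're back to inside-out shrink-wrapping.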

Flex grow/shrink

Use flex grow and shrink for spill-free auto-layouts with completely reasonable gaps

The scenario above is better handled with a CSS3 flex box (display: flex), which provides explicit control over how space is divided.

Unfortunately flexing muddles the simple CSS model. To auto-flex, the layout algorithm must measure the "natural size" of every child. This means laying it out twice: first speculatively, as if floating in aether, and then again after growing or shrinking to fit:

Flex speculative layout

This sounds reasonable but can come with hidden surprises, because it's recursive. Doing speculative layout of a parent often requires full layout of unsized children. e.g. to know how text will wrap. If you nest it right, it could in theory cause an exponential blow up, though I've never heard of it being an issue.

Instead you will only discover this when someone drops some large content in somewhere, and suddenly everything gets stretched out of whack. It's the opposite of the problem on the mug.

To avoid the recursive dependency, you need to isolate the children's contents from the outside, thus making speculative layout trivial. This can be done with contain: size, or by manually setting the flex-basis size.
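A minimal sketch of both escape hatches, with invented selectors:

/* Give the child an explicit basis so its content isn't measured for flexing */
.child {
  flex: 1 1 0;    /* flex-basis: 0 instead of the default 'auto' */
  min-width: 0;   /* allow shrinking below the content's minimum size */
}

/* Or cut the dependency entirely: the element is sized as if it had no contents */
.isolated {
  contain: size;
}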

CSS has gained a few constructs like contain or will-change, which work directly with the layout system, and drop the pretense of one big happy layout. It reveals some of the layer-oriented nature underneath, and is a substitute for e.g. using position: absolute wrappers to do the same.

What these do is strip off some of the semantics, and break the flow of DOM-wide constraints. These are overly broad by default and too document-oriented for the simpler cases.

This is really a metaphor for all DOM APIs.

CSS props
CSS props

The Good Parts?

That said, flex box is pretty decent if you understand these caveats. Building layouts out of nested rows and columns with gaps is intuitive, and adapts well to varying sizes. There is a "CSS: The Good Parts" here, which you can make ergonomic with sufficient love. CSS grids also work similarly, they're just very painfully... CSSy in their syntax.

But if you designed CSS layout from scratch, you wouldn't do it this way. You wouldn't have a subtractive API, with additional extra containment barrier hints. You would instead break the behavior down into its component facets, and use them Ă  la carte. Outside-in and inside-out would both be legible as different kinds of containers and placement models.

The inline-block and inline-flex display models illustrate this: it's a block or flex on the inside, but an inline element on the outside. These are two (mostly) orthogonal aspects of a box in a box model.
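In code, the split looks like this (selectors invented for illustration):

.badge   { display: inline-flex; }  /* flex on the inside, flows inline like text on the outside */
.toolbar { display: flex; }         /* same inner model, but a block-level box on the outside */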

Text and font styles are in fact the odd ones out, in hypertext. Properties like font size inherit from parent to child, so that formatting tags like <b> can work. But most of those 660 CSS properties do not do that. Setting a border on an element does not apply the same border to all its children recursively, that would be silly.

It shows that CSS is at least two different things mashed together: a system for styling rich text based on inheritance... and a layout system for block and inline elements, nested recursively but without inheritance, only containment. They use the same syntax and APIs, but don't really cascade the same way. Combining this under one style-umbrella was a mistake.

Worth pointing out: early ideas of relative em scaling have largely become irrelevant. We now think of logical vs device pixels instead, which is a far more sane solution, and closer to what users actually expect.

Tiger SVG

SVG is natively integrated as well. Having SVGs in the DOM instead of just as <img> tags is useful to dynamically generate shapes and adjust icon styles.

But while SVG is powerful, it's neither a subset nor superset of CSS. Even when it overlaps, there are subtle differences, like the affine transform. It has its own warts, like serializing all coordinates to strings.

CSS has also gained the ability to round corners, draw gradients, and apply arbitrary clipping masks: it clearly has SVG-envy, but falls very short. SVG can e.g. do polygonal hit-testing for mouse events, which CSS cannot, and SVG has its own set of graphical layer effects.

Whether you use HTML/CSS or SVG to render any particular element is based on specific annoying trade-offs, even if they're all scalable vectors on the back-end.

In either case, there are also some roadblocks. I'll just mention three:

  • text-overflow: ellipsis can only be used to truncate a single unwrapped line, not entire paragraphs (see the sketch after this list). Detecting truncated text is even harder, as is just measuring text: the APIs are inadequate. Everyone just counts letters instead.
  • position: sticky lets elements stay in place while scrolling with zero jank. While tailor-made for this purpose, it's subtly broken. Having elements remain unconditionally sticky requires an absurd nesting hack, when it should be trivial.
  • The z-index property determines layering by absolute index. This inevitably leads to a z-index-war.css where everyone is putting in a new number +1 or -1 to make things layer correctly. There is no concept of relative Z positioning.
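To illustrate the first point: single-line truncation is the only case text-overflow handles, while truncating a whole paragraph still leans on the line-clamp escape hatch (class names invented for illustration):

/* Single line: the case text-overflow actually covers */
.one-line {
  white-space: nowrap;
  overflow: hidden;
  text-overflow: ellipsis;
}

/* Whole paragraph: clamp to N lines instead */
.three-lines {
  display: -webkit-box;
  -webkit-box-orient: vertical;
  -webkit-line-clamp: 3;
  overflow: hidden;
}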

For each of these features, we got stuck with v1 of whatever they could get working, instead of providing the right primitives.

Getting this right isn't easy, it's the hard part of API design. You can only iterate on it, by building real stuff with it before finalizing it, and looking for the holes.

Oil on Canvas

So, DOM is bad, CSS is single-digit X% good, and SVG is ugly but necessary... and nobody is in a position to fix it?

Well no. The diagnosis is that the middle layers don't suit anyone particularly well anymore. Just an HTML6 that finally removes things could be a good start.

But most of what needs to happen is to liberate the functionality that is there already. This can be done in good or bad ways. Ideally you design your system so the "escape hatch" for custom use is the same API you built the user-space stuff with. That's what dogfooding is, and also how you get good kernels.

A recent proposal here is HTML in Canvas, to draw HTML content into a <canvas>, with full control over the visual output. It's not very good.

While it might seem useful, the only reason the API has the shape that it does is because it's shoehorned into the DOM: elements must be descendants of <canvas> to fully participate in layout and styling, and to make accessibility work. There are also "technical concerns" with using it off-screen.

One example is this spinny cube:

html-in-canvas spinny cube thing

To make it interactive, you attach hit-testing rectangles and respond to paint events. This is a new kind of hit-testing API. But it only works in 2D... so it seems 3D-use is only cosmetic? I have many questions.

Again, if you designed it from scratch, you wouldn't do it this way! In particular, it's absurd that you'd have to take over all interaction responsibilities for an element and its descendants just to be able to customize how it looks i.e. renders. Especially in a browser that has projective CSS 3D transforms.

The use cases not covered by that, e.g. curved re-projection, will also need more complicated hit-testing than rectangles. Did they think this through? What happens when you put a dropdown in there?

To me it seems like they couldn't really figure out how to unify CSS and SVG filters, or how to add shaders to CSS. Passing it thru canvas is the only viable option left. "At least it's programmable." Is it really? Screenshotting DOM content is 1 good use-case, but not what this is sold as at all.

The whole reason to do "complex UIs on canvas" is to do all the things the DOM doesn't do, like virtualizing content, just-in-time layout and styling, visual effects, custom gestures and hit-testing, and so on. It's all nuts and bolts stuff. Having to pre-stage all the DOM content you want to draw sounds... very counterproductive.

From a reactivity point-of-view it's also a bad idea to route this stuff back through the same document tree, because it sets up potential cycles with observers. A canvas that's rendering DOM content isn't really a document element anymore, it's doing something else entirely.

sheet-happens

Canvas-based spreadsheet that skips the DOM entirely

The actual Achilles' heel of canvas is that you don't have any real access to system fonts, text layout APIs, or UI utilities. It's quite absurd how basic it is. You have to implement everything from scratch, including Unicode word splitting, just to get wrapped text.

The proposal is "just use the DOM as a black box for content." But we already know that you can't do anything except more CSS/SVG kitbashing this way. text-overflow: ellipsis and friends will still be broken, and you will still need to implement UIs circa 1990 from scratch to fix it.

It's all-or-nothing when you actually want something right in the middle. That's why the lower level needs to be opened up.

Where To Go From Here

The goals of "HTML in Canvas" do strike a chord, with chunks of HTML used as free-floating fragments, a notion that has always existed under the hood. It's a composite value type you can handle. But it should not drag 20 years of useless baggage along, while not enabling anything truly novel.

The kitbashing of the web has also resulted in enormous stagnation, and a loss of general UI finesse. When UI behaviors have to be mined out of divs, it limits the kinds of solutions you can even consider. Fixing this within DOM/HTML seems unwise, because there's just too much mess inside. Instead, new surfaces should be opened up outside of it.

use-gpu-layout use-gpu-layout

WebGPU-based box model

My schtick here has become to point awkwardly at Use.GPU's HTML-like renderer, which does a full X/Y flex model in a fraction of the complexity or code. I don't mean my stuff is super great, no, it's pretty bare-bones and kinda niche... and yet definitely nicer. Vertical centering is easy. Positioning makes sense.

There is no semantic HTML or CSS cascade, just first-class layout. You don't need 61 different accessors for border* either. You can just attach shaders to divs. Like, that's what people wanted right? Here's a blueprint, it's mostly just SDFs.

Font and markup concerns only appear at the leaves of the tree, where the text sits. It's striking how you can do like 90% of what the DOM does here, without the tangle of HTML/CSS/SVG, if you just reinvent that wheel. Done by 1 guy. And yes, I know about the second 90% too.

The classic data model here is of a view tree and a render tree. What should the view tree actually look like? And what can it be lowered into? What is it being lowered into right now, by a giant pile of legacy crud?

servo ladybird

Alt-browser projects like Servo or Ladybird are in a position to make good proposals here. They have the freshest implementations, and are targeting the most essential features first. The big browser vendors could also do it, but well, taste matters. Good big systems grow from good small ones, not bad big ones. Maybe if Mozilla hadn't imploded... but alas.

Platform-native UI toolkits are still playing catch up with declarative and reactive UI, so that's that. Native Electron-alternatives like Tauri could be helpful, but they don't treat origin isolation as a design constraint, which makes security teams antsy.

There's a feasible carrot to dangle for them though, namely in the form of better process isolation. Because of CPU exploits like Spectre, multi-threading via SharedArrayBuffer and Web Workers is kinda dead on arrival anyway, and that affects all WASM. The details are boring but right now it's an impossible sell when websites have to have things like OAuth and Zendesk integrated into them.

Reinventing the DOM to ditch all legacy baggage could coincide with redesigning it for a more multi-threaded, multi-origin, and async web. The browser engines are already multi-process... what did they learn? A lot has happened since Netscape, with advances in structured concurrency, ownership semantics, FP effects... all could come in handy here.

* * *

Step 1 should just be a data model that doesn't have 350+ properties per node tho.

Don't be under the mistaken impression that this isn't entirely fixable.

netscape wheel

August 04, 2025

 So much for the new hobby. Venus flytraps are not the easiest to keep happy.







 Some of the Sarracenia as they stand outside now. They eat a lot of insects, especially wasps.







New hobby since 2021: growing carnivorous plants.

Here are five pitchers that my Nepenthes are giving me.







August 03, 2025

lookat 2.1.0rc2

Lookat 2.1.0rc2 is the second release candidate of Lookat/Bekijk 2.1.0, a user-friendly Unix file browser/viewer that supports colored man pages.

The focus of the 2.1.0 release is to add ANSI Color support.


 

News

3 Aug 2025 Lookat 2.1.0rc2 Released

Lookat 2.1.0rc2 is the second release candidate of Lookat 2.1.0

ChangeLog

Lookat / Bekijk 2.1.0rc2
  • Corrected italic color
  • Don’t reset the search offset when cursor mode is enabled
  • Renamed strsize to charsize (ansi_strsize -> ansi_charsize, utf8_strsize -> utf8_charsize) to be less confusing
  • Support for multiple ANSI streams in ansi_utf8_strlen()
  • Update default color theme to green for this release
  • Update manpages & documentation
  • Reorganized contrib directory
    • Moved ci/cd related file from contrib/* to contrib/cicd
    • Moved debian dir to contrib/dist
    • Moved support script to contrib/scripts

Lookat 2.1.0rc2 is available at:

Have fun!

August 01, 2025

Just activated Orange via eSIM on my Fairphone 6, and before I knew it “App Center” and “Phone” (both from the Orange group) but also … TikTok had been installed. I was not happy about that. I can’t even uninstall App Center, only disable it. Fuckers!

Source

July 30, 2025

Fantastic cover of Jamie Woon’s “Night Air” by Lady Lynn. That double bass and that voice: magical! Watch this video on YouTube. 


Source

An astronaut explores a surreal landscape beneath rainbow-colored planetary rings, symbolizing the journey into AI's transformative potential for Drupal.

In my previous post, The great digital agency unbundling, I explored how AI is transforming the work of digital agencies. As AI automates more technical tasks, agencies will be shifting their focus toward orchestration, strategic thinking, and accountability. This shift also changes what they need from their tools.

Content management systems like Drupal must evolve with them. This is not just about adding AI features. It is about becoming a platform that strengthens the new agency model. Because as agencies take on new roles, they will adopt the tools that help them succeed.

As I wrote then:

"As the Project Lead of Drupal, I think about how Drupal, the product, and its ecosystem of digital agencies can evolve together. They need to move in step to navigate change and help shape what comes next"

The good news is that the Drupal community is embracing AI in a coordinated and purposeful way. Today, Drupal CMS already ships with 22 AI agents, and through the Drupal AI Initiative, we are building additional infrastructure and tooling to bring more AI capabilities to Drupal.

In this post, I want to share why I believe Drupal is not just ready to evolve, but uniquely positioned to thrive in the AI era.

Drupal is built for AI

Imagine an AI agent that plans, executes, and measures complex marketing campaigns across your CMS, CRM, email platform, and analytics tools without requiring manual handoff at every step.

To support that level of orchestration, a platform must expose its content models, configuration data, state, user roles and permissions, and business logic in a structured, machine-readable way. That means making things like entity types, field definitions, relationships, and workflows available through APIs that AI systems can discover, inspect, and act on safely.

Most platforms were not designed with this kind of structured access in mind. Drupal has been moving in that direction for more than a decade.

Since Drupal 7, the community has invested deeply in modernizing the platform. We introduced a unified Entity API, adopted a service container with dependency injection, and expanded support for REST, JSON:API, and GraphQL. We also built a robust configuration management system, improved testability, and added more powerful workflows with granular revisioning and strong rollback support. Drupal also has excellent API documentation.

These changes made Drupal not only more programmable but also more introspectable. AI agents can query Drupal's structure, understand relationships between entities, and make informed decisions based on both content and configuration. This enables AI to take meaningful action inside the system rather than just operating at the surface. And because Drupal's APIs are open and well-documented, these capabilities are easier for developers and AI systems to discover and build on.
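As a small illustration of what that introspection looks like in practice, assuming the core JSON:API module is enabled (example.com stands in for a real site, and article for a real content type):

# List the resource types the site exposes (content types, vocabularies, config entities, ...)
curl -s https://example.com/jsonapi

# Fetch article nodes: fields, relationships, and links come back as structured JSON
curl -s https://example.com/jsonapi/node/article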

Making these architectural improvements was not easy. Upgrading from Drupal 7 was painful for many, and at the time, the benefits of Drupal 8's redesign were not obvious. We were not thinking about AI at the time, but in hindsight, we built exactly the kind of modern, flexible foundation that makes deep AI integration possible today. As is often the case, there is pain before the payoff.

AI makes Drupal's power more accessible

I think this is exciting because AI can help make Drupal's flexibility more accessible. Drupal is one of the most flexible content management systems available. It powers everything from small websites to large, complex digital platforms. That flexibility is a strength, but it also introduces complexity.

For newcomers, Drupal's flexibility can be overwhelming. Building a Drupal site requires understanding how to select and configure contributed modules, creating content types and relationships, defining roles and permissions, building Views, developing a custom theme, and more. The learning curve is steep and often prevents people from experiencing Drupal's power and flexibility.

AI has the potential to change that. In the future, you might describe your needs by saying something like, "I need a multi-language news site with editorial workflows and social media integration". An AI assistant could ask a few follow-up questions, then generate a working starting point.

I've demonstrated early prototypes of this vision in recent DriesNotes, including DrupalCon Barcelona 2024 and DrupalCon Atlanta 2025. Much of that code has been productized in the Drupal AI modules.

In my Barcelona keynote, I said that "AI is the new UI". AI helps lower the barrier to entry by turning complex setup tasks into simple prompts and conversations. With the right design, it can guide new users while still giving experts full control.

In my last post, The great digital agency unbundling, I shared a similar perspective:

"Some of the hardest challenges the Drupal community has faced, such as improving usability or maintaining documentation, may finally become more manageable. I see ways AI can support Drupal's mission, lower barriers to online publishing, make Drupal more accessible, and help build a stronger, more inclusive Open Web. The future is both exciting and uncertain."

Of course, AI comes with both promise and risk. It raises ethical questions and often fails to meet expectations. But ignoring AI is not a strategy. AI is already changing how digital work gets done. If we want Drupal to stay relevant, we need to explore its potential. That means experimenting thoughtfully, sharing what we learn, and helping shape how these tools are built and used.

Drupal's AI roadmap helps agencies

AI is changing how digital work gets done. Some platforms can now generate full websites, marketing campaigns, or content strategies in minutes. For simple use cases, that may be enough.

But many client needs are more complex. As requirements grow and automations become more sophisticated, agencies continue to play a critical role. They bring context, strategy, and accountability to challenges that off-the-shelf tools cannot solve.

That is the future we want Drupal to support. We are not building AI to replace digital agencies, but to strengthen them. Through the Drupal AI Initiative, Drupal agencies are actively helping shape the tools they want to use in an AI-driven world.

As agencies evolve in response to AI, they will need tools that evolve with them. Drupal is not only keeping pace but helping lead the way. By investing in AI in collaboration with the agencies who rely on it, we are making Drupal stronger, more capable, and more relevant.

Now is the moment to move

The shift toward AI-powered digital work is inevitable. Platforms will succeed or fail based on how well they adapt to this reality.

Drupal's investments in modern architecture, open development, and community collaboration has created something unique: a platform that doesn't just add AI features but fundamentally supports AI-driven workflows. While other systems scramble to retrofit AI capabilities, Drupal's foundation makes deep integration possible.

The question isn't whether AI will change digital agencies and content management. It already has. The question is which platforms will help agencies and developers thrive in that new reality. Drupal is positioning itself to be one of them.

Ever wondered what it’s like to unleash 10 000 tiny little data beasts on your hard drive? No? Well, buckle up anyway — because today, we’re diving into the curious world of random file generation, and then nerding out by calculating their size distribution. Spoiler alert: it’s less fun than it sounds. 😏

Step 1: Let’s Make Some Files… Lots of Them

Our goal? Generate 10 000 files filled with random data. But not just any random sizes — we want a mean file size of roughly 68 KB and a median of about 2 KB. Sounds like a math puzzle? That’s because it kind of is.

If you just pick file sizes uniformly at random, you’ll end up with a median close to the mean — which is boring. We want a skewed distribution, where most files are small, but some are big enough to bring that average up.

The Magic Trick: Log-normal Distribution 🎩✹

Enter the log-normal distribution, a nifty way to generate lots of small numbers and a few big ones — just like real life. Using Python’s NumPy library, we generate these sizes and feed them to good old /dev/urandom to fill our files with pure randomness.
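A quick aside on the math, so the parameters aren't pure magic: a log-normal distribution with parameters ÎŒ and σ has median e^ÎŒ and mean e^(ÎŒ + σ²/2). With the defaults in the script below (ÎŒ = 9.0, σ = 1.5), that works out to a theoretical median of roughly 8 KB and a mean of roughly 25 KB, so treat the stated targets as approximate, or solve for the parameters directly. Here is a small sketch of the latter (my numbers, not the script's):

import numpy as np

# median = exp(mu), mean = exp(mu + sigma**2 / 2)  ->  solve for mu and sigma
target_median = 2 * 1024    # ~2 KB
target_mean   = 68 * 1024   # ~68 KB

mu    = np.log(target_median)
sigma = np.sqrt(2 * (np.log(target_mean) - mu))
print(f"mu={mu:.2f} sigma={sigma:.2f}")   # roughly mu ≈ 7.62, sigma ≈ 2.66

# Sanity check: sample statistics wobble, especially the mean, because the tail is heavy
sizes = np.random.lognormal(mu, sigma, 1_000_000)
print(f"sample mean ≈ {sizes.mean():.0f} bytes, median ≈ {np.median(sizes):.0f} bytes")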

Here’s the Bash script that does the heavy lifting:

#!/bin/bash

# Directory to store the random files
output_dir="random_files"
mkdir -p "$output_dir"

# Total number of files to create
file_count=10000

# Log-normal distribution parameters
mean_log=9.0  # Adjusted for ~68KB mean
stddev_log=1.5  # Adjusted for ~2KB median

# Function to generate random numbers based on log-normal distribution
generate_random_size() {
    python3 -c "import numpy as np; print(int(np.random.lognormal($mean_log, $stddev_log)))"
}

# Create files with random data
for i in $(seq 1 $file_count); do
    file_size=$(generate_random_size)
    file_path="$output_dir/file_$i.bin"
    head -c "$file_size" /dev/urandom > "$file_path"
    echo "Generated file $i with size $file_size bytes."
done

echo "Done. Files saved in $output_dir."

Easy enough, right? This creates a directory random_files and fills it with 10 000 files of sizes mostly small but occasionally wildly bigger. Don’t blame me if your disk space takes a little hit! 💄

Step 2: Crunching Numbers — The File Size Distribution 📊

Okay, you’ve got the files. Now, what can we learn from their sizes? Let’s find out the:

  • Mean size: The average size across all files.
  • Median size: The middle value when sizes are sorted — because averages can lie.
  • Distribution breakdown: How many tiny files vs. giant files.

Here’s a handy Bash script that reads file sizes and spits out these stats with a bit of flair:

#!/bin/bash

# Input directory (default to "random_files" if not provided)
directory="${1:-random_files}"

# Check if directory exists
if [ ! -d "$directory" ]; then
    echo "Directory $directory does not exist."
    exit 1
fi

# Array to store file sizes
file_sizes=($(find "$directory" -type f -exec stat -c%s {} \;))

# Check if there are files in the directory
if [ ${#file_sizes[@]} -eq 0 ]; then
    echo "No files found in the directory $directory."
    exit 1
fi

# Calculate mean
total_size=0
for size in "${file_sizes[@]}"; do
    total_size=$((total_size + size))
done
mean=$((total_size / ${#file_sizes[@]}))

# Calculate median
sorted_sizes=($(printf '%s\n' "${file_sizes[@]}" | sort -n))
mid=$(( ${#sorted_sizes[@]} / 2 ))
if (( ${#sorted_sizes[@]} % 2 == 0 )); then
    median=$(( (sorted_sizes[mid-1] + sorted_sizes[mid]) / 2 ))
else
    median=${sorted_sizes[mid]}
fi

# Display file size distribution
echo "File size distribution in directory $directory:"
echo "---------------------------------------------"
echo "Number of files: ${#file_sizes[@]}"
echo "Mean size: $mean bytes"
echo "Median size: $median bytes"

# Display detailed size distribution (optional)
echo
echo "Detailed distribution (size ranges):"
awk '{
    if ($1 < 1024) bins["< 1 KB"]++;
    else if ($1 < 10240) bins["1 KB - 10 KB"]++;
    else if ($1 < 102400) bins["10 KB - 100 KB"]++;
    else bins[">= 100 KB"]++;
} END {
    for (range in bins) printf "%-15s: %d\n", range, bins[range];
}' <(printf '%s\n' "${file_sizes[@]}")

Run it, and voilà — instant nerd satisfaction.

Example Output:

File size distribution in directory random_files:
---------------------------------------------
Number of files: 10000
Mean size: 68987 bytes
Median size: 2048 bytes

Detailed distribution (size ranges):
< 1 KB         : 1234
1 KB - 10 KB   : 5678
10 KB - 100 KB : 2890
>= 100 KB      : 198

Why Should You Care? đŸ€·â€â™€ïž

Besides the obvious geek cred, generating files like this can help:

  • Test backup systems — can they handle weird file size distributions?
  • Stress-test storage or network performance with real-world-like data.
  • Understand your data patterns if you’re building apps that deal with files.

Wrapping Up: Big Files, Small Files, and the Chaos In Between

So there you have it. Ten thousand random files later, and we’ve peeked behind the curtain to understand their size story. It’s a bit like hosting a party and then figuring out who ate how many snacks. 🍟

Try this yourself! Tweak the distribution parameters, generate files, crunch the numbers — and impress your friends with your mad scripting skills. Or at least have a fun weekend project that makes you sound way smarter than you actually are.

Happy hacking! đŸ”„

July 29, 2025

Have you already tried to upgrade the MySQL version of your MySQL HeatWave instance in OCI that is deployed with Terraform? When you tried, you realized (I hope you didn’t turn off backups!) that the instance is destroyed and recreated from scratch. This is our current MySQL HeatWave DB System deployed using Terraform: And this is [
]
]

July 28, 2025

The MySQL REST Service is a next-generation JSON Document Store solution, enabling fast and secure HTTPS access to data stored in MySQL, HeatWave, InnoDB Cluster, InnoDB ClusterSet, and InnoDB ReplicaSet. The MySQL REST Service was first released on https://labs.mysql.com in 2023 using MySQL Router. During spring 2025, it was released on MySQL HeatWave and standard [
]

July 27, 2025

This morning, on the daily walk in the “Mechels Bos” with our somewhat bigger dog (Maya, a Romanian rescue dog with, we suspect, some collie and some mountain dog genes; she is indeed bigger than Mamita, our quasi-chihuahua), I heard a strange sound. Have a listen below (no moving footage, just a photo from the neighbourhood); the animal stayed quiet for quite a while and it is not very loud


Source

July 26, 2025

Today I ordered the (green) Fairphone 6 to replace my Nokia X20, after much hesitation with Google Android instead of e/OS after all, because of itsme and banking apps. If it ever becomes “safe” I can still flash e/OS later. Reasons: 5 years of warranty, 7 years of updates, a whole range of replaceable parts


Source

July 23, 2025

Two small figures watch a massive vessel launch, symbolizing digital agencies witnessing the AI transformation of their industry.

"To misuse a woodworking metaphor, I think we're experiencing a shift from hand tools to power tools. You still need someone who understands the basics to get good results from the tools, but they're not chiseling fine furniture by hand anymore. They're throwing heaps of wood through the tablesaw instead. More productive, but more likely to lose a finger if you're not careful."
– mrmincent, Hacker News comment on Claude Code, via Simon Willison

If, like me, you work in web development, design, or digital strategy, this quote might hit close to home. But it may not go far enough. We are not just moving from chisels to table saws. We are about to hand out warehouse-sized CNC machines and robotic arms.

This is not just an upgrade in tools. The Industrial Revolution didn't just replace handcraft with machines. It upended entire industries.

History does not repeat itself, but it often rhymes. For over two centuries, new tools have changed not just how work gets done, but what we can accomplish.

AI is changing how websites are built and how people find information online. Individual developers are already using AI tools today, but broader organizational adoption will unfold over the years ahead.

It's clear that AI will have a deep impact on the web industry. Over time, this shift will affect those of us who have built our careers in web development, marketing, design, and digital strategy. I am one of them. Most of my career has been rooted in Drupal, which makes this both personal and difficult to write.

But this shift is bigger than any one person or platform. There are tens of thousands of digital agencies around the world, employing millions of people who design, build, and maintain the digital experiences we all rely on. Behind those numbers are teams, individuals, and livelihoods built over decades. Our foundation is shifting. It touches all of us, and we all need to adapt.

If you are feeling uncertain about where this is heading, you are not alone.

Why I am writing this

I am not writing this to be an alarmist. I actually feel a mix of emotions. I am excited about the possibilities AI offers, but also concerned about the risks and uneasy about the speed and scale of change.

As the project lead of Drupal, I ask myself: "How can I best guide our community of contributors, agencies, and end users through these changes?".

Like many of you, I am trying to understand what the rise of AI means for our users, teams, partners, contributors, products, and values. I want to help however I can.

I don't claim to have all the answers, but I hope this post sparks discussion, encourages deeper thinking, and helps us move forward together. This is not a roadmap, just a reflection of where my thinking is today.

I do feel confident that we need to keep moving forward, stay open-minded, and engage with the changes AI brings head-on.

Even with all that uncertainty, I feel energized. Some of the hardest challenges the Drupal community has faced, such as improving usability or maintaining documentation, may finally become more manageable. I see ways AI can support Drupal's mission, lower barriers to online publishing, make Drupal more accessible, and help build a stronger, more inclusive Open Web. The future is both exciting and uncertain.

But this post isn't just for the Drupal community. It's for anyone working in or around a digital agency who is asking: "What does AI mean for my team, my clients, and my future?". I will focus more directly on Drupal in my next blog post, so feel free to subscribe.

If you are thinking about how AI is affecting your work, whether in the Drupal ecosystem or elsewhere, I would love to hear from you. The more we share ideas, concerns, and experiments, the better prepared we will all be to shape this next chapter together.

The current downturn is real, but will pass

Before diving into AI, I'd be remiss not to acknowledge the current economic situation. Agencies across all platforms, not just those working with Drupal, are experiencing challenging market conditions, especially in the US and parts of Europe.

While much of the industry is focused on AI, the immediate pain many agencies are feeling is not caused by it. High interest rates, inflation, and global instability have made client organizations more cautious with spending. Budgets are tighter, sales cycles are longer, competition is fiercer, and more work is staying in-house.

As difficult as this is, it is not new. Economic cycles and political uncertainty have always come and gone. What makes this moment different is not the current downturn, but what comes next.

AI will transform the industry at an accelerating pace

AI has not yet reshaped agency work in a meaningful way, but that change is knocking at the door. At the current pace of progress, web development and digital agency work are on the verge of the most significant disruption since the rise of the internet.

One of the most visible areas of change has been content creation. AI generates draft blog posts, landing pages, social media posts, email campaigns, and more. This speeds up production but also changes the workflow. Human input shifts toward editing, strategy, and brand alignment rather than starting from a blank page.

Code generation tools are also handling more implementation tasks. Senior developers can move faster, while junior developers are taking on responsibilities that once required more experience. As a result, developers are spending more time reviewing and refining AI-generated code than writing everything from scratch.

Traditional user interfaces are becoming less important as AI shifts user interactions toward natural language, voice, and more predictive or adaptive experiences. These still require thoughtful design, but the nature of UI work is changing. AI can now turn visual mockups into functional components and, in some cases, generate complete interfaces with minimal or no human input.

These shifts also challenge the way agencies bill for their work. When AI can do in minutes what once took hours or days, hourly billing becomes harder to justify. If an agency charges $150 an hour for something clients know AI can do faster, those clients will look elsewhere. To stay competitive, agencies will need to focus less on time spent and more on outcomes, expertise, and impact.

AI is also changing how people find and interact with information online. As users turn to AI assistants for answers, the role of the website as a central destination is being disrupted. This shift changes how clients think about content, traffic, and performance, which are core areas of agency work. Traditional strategies like SEO become less effective when users get what they need without ever visiting a site.

Through all of this, human expertise will remain essential. People are needed to set direction, guide priorities, review AI output, and take responsibility for quality and business outcomes. We still rely on individuals who know what to build, why it matters, and how to ensure the results are accurate, reliable, and aligned with real-world needs. When AI gets it wrong, it is still people who are accountable. Someone must own the decisions and stand behind the results.

But taken together, these changes will reshape how agencies operate and compete. To stay viable, agencies need to evolve their service offerings and rethink how they create and deliver value. That shift will also require changes to team structures, pricing models, and delivery methods. This is not just about adopting new tools. It is about reimagining what an agency does and how it works.

The hardest part may not be the technology. It is the human cost. Some people will see their roles change faster than they can adapt. Others may lose their jobs or face pressure to use tools that conflict with their values or standards.

Adding to the challenge, adopting AI requires investment at a moment when many agencies around the world are focused on survival. For teams already stretched thin, transformation may feel out of reach. The good news is that AI's full impact will take years to unfold, giving agencies time to adapt.

Still, moments like this can create major opportunities. In past downturns, technology shifts made room for new players and helped established firms reinvent themselves. The key is recognizing that this is not just about learning new tools. It is about positioning yourself where human judgment, relationships, and accountability for outcomes remain essential, even as AI takes on more of the execution.

The diminishing value of platform expertise alone

For years, CMS-focused agencies have built their businesses on deep platform expertise. Clients relied on them for custom development, performance tuning, security, and infrastructure. This specialized knowledge commanded a premium.

In effect, AI increases the supply of skilled work without a matching rise in demand. By automating tasks that once required significant expertise, it makes technical expertise abundant and much cheaper to produce. And according to the principles of supply and demand, when supply rises and demand stays the same, prices fall.

This is not a new pattern. SaaS website builders already commoditized basic site building, reducing the perceived value of simple implementations and pushing agencies toward more complex, higher-value projects.

Now, AI is accelerating that shift. It is extending the same kind of disruption into complex and enterprise-level work, bringing speed and automation to tasks that once required expensive and experienced teams.

In other words, AI erodes the commercial value of platform expertise by making many technical tasks less scarce. Agencies responded to earlier waves of commoditization by moving up the stack, toward work that was more strategic, more customized, and harder to automate.

AI is raising the bar again. Once more, agencies need to move further up the stack. And they need to do it faster than before.

The pattern of professional survival

This is not the first time professionals have faced a major shift. Throughout history, every significant technological change has required people to adapt.

Today, skilled radiologists interpret complex scans with help from AI systems. Financial analysts use algorithmic tools to process data while focusing on high-level strategy. The professionals who understand their domain deeply find ways to work with new technology instead of competing against it.

Still, not every role survives. Elevator operators disappeared when elevators became automatic. Switchboard operators faded as direct dialing became standard.

At the same time, these shifts unlocked growth. The number of elevators increased, making tall buildings more practical. The telephone became a household staple. As routine work was automated away, new industries and careers emerged.

The same will happen with AI. Some roles will go away. Others will change. Entirely new opportunities will emerge, many in areas we have not yet imagined.

I have lived through multiple waves of technological change. I witnessed the rise of the web, which created entirely new industries and upended existing ones. I experienced the shift from hand-coding to content management systems, which helped build today's thriving agency ecosystem. I saw mobile reshape how people access information, opening up new business models.

Each transition brought real uncertainty. In the moment, the risks felt immediate and the disruption felt personal. But over time, these shifts consistently led to new forms of prosperity, new kinds of work, and new ways to create value.

The great agency unbundling

AI can help agencies streamline how they work today, but when major technology shifts happen, success rarely comes from becoming more efficient at yesterday's model.

The bigger opportunity lies in recognizing when the entire system is being restructured. The real question is not just "How do we use AI to become a more efficient agency?" but "How will the concept of an agency be redefined?".

Most agencies today bundle together strategy, design, development, project management, and ongoing maintenance. This bundle made economic sense when coordination was costly and technical skills were scarce enough to command premium rates.

AI is now unbundling that model. It separates work based on what can be automated, what clients can bring in-house, and what still requires deep expertise.

At the same time, it is rebundling services around different principles, such as speed, specialization, measurable outcomes, accountability, and the value of human judgment.

The accountability gap

As AI automates routine tasks, execution becomes commoditized. But human expertise takes on new dimensions. Strategic vision, domain expertise, and cross-industry insights remain difficult to automate.

More critically, trust and accountability stay fundamentally human. When AI hallucinates or produces unexpected results, organizations need people who can take responsibility and navigate the consequences.

We see this pattern everywhere: airline pilots remain responsible for their passengers despite autopilot handling most of the journey, insurance companies use advanced software to generate quotes but remain liable for the policies they issue, and drivers are accountable for accidents even when following GPS directions.

The tools may be automated, but responsibility for mistakes and results remains human. For agencies, this means that while AI can generate campaigns, write code, and design interfaces, clients still need someone accountable for strategy, quality, and outcomes.

This accountability gap between what AI can produce and what organizations will accept liability for creates lasting space for human expertise.

The rise of orchestration platforms

Beyond human judgment, a new architectural pattern is emerging. Traditional Digital Experience Platforms (DXPs) excel at managing complex content, workflows, and integrations within a unified system. But achieving sophisticated automation often requires significant custom development, long implementation cycles, and deep platform expertise.

Now, visual workflow builders, API orchestration platforms, and the Model Context Protocol are enabling a different approach. Instead of building custom integrations or waiting for platform vendors to add features, teams can wire together AI models, automation tools, CRMs, content systems, and analytics platforms through drag-and-drop interfaces. What once required months of development can often be prototyped in days.

But moving from prototype to production requires deep expertise. It involves architecting event-driven systems, managing state across distributed workflows, implementing proper error handling for AI failures, and ensuring compliance across automated decisions. The tools may be visual, but making them work reliably at scale, maintaining security, ensuring governance, and building systems that can evolve with changing business needs demands sophisticated technical knowledge.

This orchestration capability represents a new technical high ground. Agencies that master this expanded stack can deliver solutions faster while maintaining the reliability and scalability that enterprises require.

Six strategies for how agencies could evolve

Agencies need two types of strategies: ways to compete better in today's model and ways to position for the restructured system that's emerging.

The strategies that follow are not mutually exclusive. Many agencies will combine elements from several based on their strengths, clients, and markets.

Competing in today's market

1. Become AI-augmented, not AI-resistant. To stay competitive, agencies should explore how AI can improve efficiency across their entire operation. Developers should experiment with code assistants, project managers should use AI to draft updates and reports, and sales teams should apply it to lead qualification or proposal writing. The goal is not to replace people, but to become more effective at handling fast-paced, low-cost work while creating more space for strategic, value-added thinking.

2. Focus on outcomes, not effort. As AI reduces delivery time, billing for hours makes less sense. Agencies can shift toward pricing based on value created rather than time spent. Instead of selling a redesign, offer to improve conversion rates. This approach aligns better with client goals and helps justify pricing even as technical work becomes faster.

3. Sell through consultation, not execution. As technology changes faster than most clients can keep up with, agencies have a chance to step into a more consultative role. Instead of just delivering projects, they can help clients understand their problems and shape the right solutions. Agencies that combine technical know-how with business insight can become trusted partners, especially as clients look for clarity and results.

Positioning for what comes next

4. Become the layer between AI and clients. Don't just use AI tools to build websites faster. Position yourself as the essential layer that connects AI capabilities with real client needs. This means building quality control systems that review AI-generated code before deployment and becoming the trusted partner that translates AI possibilities into measurable results. Train your team to become "AI translators" who can explain technical capabilities in business terms and help clients understand what's worth automating versus what requires human judgment.

5. Package repeatable solutions. When custom work becomes commoditized, agencies need ways to stand out. Turn internal knowledge into named, repeatable offerings. This might look like a "membership toolkit for nonprofits" or a "lead gen system for B2B SaaS". These templated solutions are easier to explain, sell, and scale. AI lowers the cost of building and maintaining them, making this model more realistic than it was in the past. This gives agencies a way to differentiate based on expertise and value, not just technical execution.

6. Build systems that manage complex digital workflows. Stop thinking in terms of one-off websites. Start building systems that manage complex, ongoing digital workflows. Agencies should focus on orchestrating tools, data, and AI agents in real time to solve business problems and drive automation.

For example, a website might automatically generate social media posts from new blog content, update landing pages based on campaign performance, or adjust calls to action during a product launch. All of this can happen with minimal human involvement, but these systems are still non-trivial to build and require oversight and accountability.

This opportunity feels significant. As marketing stacks grow more complex and AI capabilities expand, someone needs to coordinate how these systems work together in a structured and intelligent way. This is not just about connecting APIs. It is about designing responsive, event-driven systems using low-code orchestration tools, automation platforms, and AI agents.

Open Source needs agencies, proprietary platforms don't

Every AI feature a technology platform adds potentially takes work off the agency's plate. Whether the platform is open source or proprietary, each new capability reduces the need for custom development.

But open source and proprietary platforms are driven by very different incentives.

Proprietary platforms sell directly to end clients. For them, replacing agency services is a growth strategy. The more they automate, the more revenue they keep.

This is already happening. Squarespace builds entire websites from prompts. Shopify Magic writes product descriptions and designs storefronts.

Open source platforms are adding AI features as well, but operate under different incentives. Drupal doesn't monetize end users. Drupal's success depends on a healthy ecosystem where agencies contribute improvements that keep the platform competitive. Replacing agencies doesn't help Drupal; it weakens the very ecosystem that sustains it.

As the Project Lead of Drupal, I think about how Drupal the product and its ecosystem of digital agencies can evolve together. They need to move in step to navigate change and help shape what comes next.

This creates a fundamental difference in how platforms may evolve. Proprietary platforms are incentivized to automate and sell directly. Open source platforms thrive by leaving meaningful work for agencies, who in turn strengthen the platform through contributions and market presence.

For digital agencies, one key question stands out: do you want to work with platforms that grow by replacing you, or with platforms that grow by supporting you?

Looking ahead

Digital agencies face a challenging but exciting transition. While some platform expertise is becoming commoditized, entirely new categories of value are emerging.

The long-term opportunity isn't just about getting better at being an agency using AI tools. It's about positioning yourself to capture value as digital experiences evolve around intelligent systems.

Agencies that wait for perfect tools, continue billing by the hour for custom development, try to serve all industries, or rely on platform knowledge will be fighting yesterday's battles. They're likely to struggle.

But agencies that move early, experiment with purpose, and position themselves as the essential layer between AI capabilities and real client needs are building tomorrow's competitive advantages.

Success comes from recognizing that this transition creates the biggest opportunity for differentiation that agencies have seen in years.

For those working with Drupal, the open source foundation creates a fundamental advantage. Unlike agencies dependent on proprietary platforms that might eventually compete with them, Drupal agencies can help shape the platform's AI evolution to support their success rather than replace them.

We are shifting from hand tools to power tools. The craft remains, but both how we work and what we work on are changing. We are not just upgrading our tools; we are entering a world of CNC machines and robotic arms that automate tasks once done by hand. Those who learn to use these new capabilities, combining the efficiency of automation with human judgment, will create things that were not possible before.

In the next post, I'll share why I believe Drupal is especially well positioned to lead in this new era of AI-powered digital experience.

I've rewritten this blog post at least three times. Throughout the process, I received valuable feedback from several Drupal agency leaders and contributors, whose insights helped shape the final version. In alphabetical order by last name: Jamie Abrahams, Christoph Breidert, Seth Brown, Dominique De Cooman, George DeMet, Alex Dergachev, Justin Emond, John Faber, Seth Gregory, and Michael Meyers.

If you’re running Mail-in-a-Box like me, you might rely on Duplicity to handle backups quietly in the background. It’s a great tool — until it isn’t. Recently, I ran into some frustrating issues caused by buggy Duplicity versions. Here’s the story, a useful discussion from the Mail-in-a-Box forums, and a neat trick I use to keep fallback versions handy. Spoiler: it involves an APT hook and some smart file copying! 🚀


The Problem with Duplicity Versions

Duplicity 3.0.1 and 3.0.5 have been reported to cause backup failures — a real headache when you depend on them to protect your data. The Mail-in-a-Box forum post “Something is wrong with the backup” dives into these issues in great detail. Users reported mysterious backup failures and eventually traced them back to those specific Duplicity releases.

Here’s the catch: those problematic versions sometimes sneak in during automatic updates. By the time you realize something’s wrong, you might already have upgraded to a buggy release. đŸ˜©


Pinning Problematic Versions with APT Preferences

One way to stop apt from installing those broken versions is to use APT pinning. Here’s an example file I created in /etc/apt/preferences.d/pin_duplicity.pref:

Explanation: Duplicity version 3.0.1* has a bug and should not be installed
Package: duplicity
Pin: version 3.0.1*
Pin-Priority: -1

Explanation: Duplicity version 3.0.5* has a bug and should not be installed
Package: duplicity
Pin: version 3.0.5*
Pin-Priority: -1

This tells apt to refuse to install these specific buggy versions. Sounds great, right? Except — it often comes too late. You could already have updated to a broken version before adding the pin.
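
To verify that the pin is actually being honoured, and to see which version is currently installed and which one apt would pick next, you can ask apt-cache:

apt-cache policy duplicity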

Also, since Duplicity is installed from a PPA, older versions vanish quickly as new releases push them out. This makes rolling back to a known good version a pain. đŸ˜€


My Solution: Backing Up Known Good Duplicity .deb Files Automatically

To fix this, I created an APT hook that runs after every package operation involving Duplicity. It automatically copies the .deb package files of Duplicity from apt’s archive cache — and even from my local folder if I’m installing manually — into a safe backup folder.

Here’s the script, saved as /usr/local/bin/apt-backup-duplicity.sh:

#!/bin/bash
set -x  # log every command that runs, which makes debugging the hook easier

# Folder where known-good Duplicity packages are collected
mkdir -p /var/backups/debs/duplicity

# Copy Duplicity .deb files from apt’s cache (and from /root for manual installs).
# -n never overwrites an existing backup; errors are ignored so apt is never blocked.
cp -vn /var/cache/apt/archives/duplicity_*.deb /var/backups/debs/duplicity/ 2>/dev/null || true
cp -vn /root/duplicity_*.deb /var/backups/debs/duplicity/ 2>/dev/null || true

And here’s the APT hook configuration I put in /etc/apt/apt.conf.d/99backup-duplicity-debs to run this script automatically after DPKG operations:

DPkg::Post-Invoke { "/usr/local/bin/apt-backup-duplicity.sh"; };
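
To check that the hook actually fires, you can, for example, force a reinstall of Duplicity and then look in the backup folder. If everything is wired up correctly, the current .deb should appear there:

sudo apt-get install --reinstall duplicity
ls -l /var/backups/debs/duplicity/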

Use apt-mark hold to Lock a Working Duplicity Version 🔒

Even with pinning and local .deb backups, there’s one more layer of protection I recommend: freezing a known-good version with apt-mark hold.

Once you’ve confirmed that your current version of Duplicity works reliably, run:

sudo apt-mark hold duplicity

This tells apt not to upgrade Duplicity, even if a newer version becomes available. It’s a great way to avoid accidentally replacing your working setup with something buggy during routine updates.
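
You can check at any time which packages are currently frozen:

apt-mark showhold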

🧠 Pro Tip: I only unhold and upgrade Duplicity manually after checking the Mail-in-a-Box forum for reports that a newer version is safe.

When you’re ready to upgrade, do this:

sudo apt-mark unhold duplicity
sudo apt update
sudo apt install duplicity

If everything still works fine, you can apt-mark hold it again to freeze the new version.


How to Use Your Backup Versions to Roll Back

If a new Duplicity version breaks your backups, you can easily reinstall a known-good .deb file from your backup folder:

sudo apt install --reinstall /var/backups/debs/duplicity/duplicity_<version>.deb

Replace <version> with the actual filename you want to roll back to. Because you saved the .deb files right after each update, you always have access to older stable versions — even if the PPA has moved on.
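
If you are not sure which versions you still have on hand, just list the backup folder first and pick one:

ls -1 /var/backups/debs/duplicity/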


Final Thoughts

While pinning bad versions helps, having a local stash of known-good packages is a game changer. Add apt-mark hold on top of that, and you have a rock-solid defense against regressions. đŸȘšâœš

It’s a small extra step but pays off hugely when things go sideways. Plus, it’s totally automated with the APT hook, so you don’t have to remember to save anything manually. 🎉

If you run Mail-in-a-Box or rely on Duplicity in any critical backup workflow, I highly recommend setting up this safety net.

Stay safe and backed up! 🛡✹

20 years of Linux on the Desktop (part 4)

Previously in "20 years of Linux on the Desktop": After contributing to the launch of Ubuntu as the "perfect Linux desktop", Ploum realises that Ubuntu is drifting away from both Debian and GNOME. In the meantime, mobile computing threatens to make the desktop irrelevant.

The big desktop schism

The fragmentation of the Ubuntu/GNOME communities became all too apparent when, in 2010, Mark Shuttleworth announced during the Ubuntu Developer Summit that Ubuntu would drop GNOME in favour of its own in-house, secretly developed desktop: Unity.

I was in the audience. I remember shaking my head in disbelief while Mark was talking on stage, just a few metres from me.

Working at the time in the automotive industry, I had heard rumours that Canonical was secretly talking with BMW to put Ubuntu in their cars and that there was a need for a new touchscreen interface in Ubuntu. Mark hoped to make an interface that would be the same on computers and touchscreens. Hence the name: "Unity". It made sense but I was not happy.

The GNOME community was, at the time, in great agitation about its future. Some thought that GNOME was getting boring, that there was no clear sense of direction beyond minor improvements. In 2006, the German Linux company SUSE had signed a patent agreement with Microsoft covering patents related to many Windows 95 concepts like the taskbar, the tray and the start menu. SUSE was the biggest contributor to KDE and the agreement covered that project. But Red Hat and GNOME refused to sign such an agreement, meaning that Microsoft suing the GNOME project was now plausible.

An experiment in building an alternative desktop that broke with all the Windows 95 concepts was written in JavaScript: GNOME Shell.

A JavaScript desktop? Seriously? Yeah, it was cool for screenshots but it was slow and barely usable. It was an experiment, nothing else. But there’s a rule in the software world: nobody will ever end an experiment. An experiment will always grow until it becomes too big to cancel and becomes its own project.

Providing the GNOME desktop to millions of users, Mark Shuttleworth was rightly concerned about the future of GNOME. Instead of trying to fix GNOME, he decided to abandon it. That was the end of Ubuntu as Debian+GNOME.

What concerned me was that Ubuntu was using more and more closed products. Products that were either proprietary, developed behind closed doors or, at the very least, were totally controlled by Canonical people.

In 2006, I had submitted a Summer of Code project to build a GTK interface to Ubuntu’s new bug tracker: Launchpad. Launchpad was an in-house project which looked like it was based on the Python CMS Plone and I had some experience with it. During that summer, I realised that Launchpad was, in fact, proprietary and had no API. To my surprise, there was no way I could get the source code of Launchpad. Naively, I had thought that everything Ubuntu was doing would be free software. Asking the dev team, I was promised Launchpad would become free "later". I could not understand why Canonical people were not building it in the open.

I still managed to build "Conseil" by doing web scraping but it broke with every single change done internally by the Launchpad team.

As a side note, the name "Conseil" was inspired by the book "20,000 Leagues Under the Sea" by Jules Verne, a book I had downloaded from Project Gutenberg and was reading on my Nokia 770. The device was my first e-reader and I’ve read dozens of public domain books on it. This was made possible thanks to the power of open source: FBReader, a very good epub reader, had been ported to the N770 and was easy to install.

I tried to maintain Conseil for a few months before giving up. It was my first realisation that Canonical was not 100% open source. Even software that was technically free was developed behind closed doors or, at the very least, with tight control over the community. This included Launchpad, Bzr, Upstart, Unity and later Mir. The worst offender would later be Snap.

To Mark Shuttleworth’s credit, it should be noted that, most of the time, Canonical was really trying to fix core issues with the Linux ecosystem. In retrospect, it is easy to see those moves as "bad". But, in reality, Canonical had a strong vision, and keeping control was easier than doing everything in the open. Bzr was launched before git existed (by a few days). Upstart was created before Systemd. Those decisions made sense at the time.

Even the move to Unity would later prove to be very strategic as, in 2012, GNOME would suddenly depend on Systemd, which was explicitly developed as a competitor to Upstart. Ubuntu would concede defeat in 2015 by replacing Upstart with Systemd and in 2018 by reinstating GNOME as the default desktop. But none of that was a given in 2010.

But even with the benefit of the doubt, Canonical would sometimes cross huge red lines, like the time when Unity came bundled with Amazon advertising, tracking you on your own desktop. This was, of course, not well received.

The end of Maemo: when incompetence is not enough, be malevolent

At the same time in the nascent mobile world, Nokia was not the only one suffering from the growing Apple/Google duopoly. Microsoft was going nowhere with its own mobile operating system, Windows CE, and was running around like a headless chicken. The director of Microsoft’s "Business division", a guy named Stephen Elop, signed a contract with Nokia to develop some Microsoft Office features on Symbian. This looked like an anecdotal side business until, a few months after that contract, in September 2010, Elop left Microsoft to become CEO of Nokia.

This was important news to me because, at 2010’s GUADEC (GNOME’s annual conference) in The Hague, I had met a small tribe of free software hackers called Lanedo. After a few nice conversations, I was excited to be offered a position on the team.

In my mind at the time, I would work on GNOME technologies full-time while being less and less active in the Ubuntu world! I had chosen my side: I would be a GNOME guy.

I myself was more and more invested in GNOME, selling GNOME t-shirts at FOSDEM and developing "Getting Things GNOME!", an application that would later become quite popular.

Joining Lanedo without managing to land a job at Canonical (despite several tries) was confirmation that my love affair with Ubuntu had to end.

In 2010, Lanedo’s biggest customer was, by far, Nokia. I had been hired to work on Maemo (or maybe Meego? This was unclear). We were not thrilled to see an ex-Microsoft executive take the reins of Nokia.

As we feared, one of Elop’s first actions as CEO of Nokia was to kill Maemo in an infamous "burning platform" memo. Elop is a Microsoft man and hates anything that looks like free software. In fact, like a good manager, he hates everything technical. It is all the fault of the developers, who are not "bringing their innovation to the market fast enough". Sadly, nobody highlighted the paradox that "bringing to the market" had never been the job of the developers. Elop’s impact on the Nokia company was huge and nearly immediate: the stock went into free fall.

One Nokia developer posted on Twitter: "Developers are blamed because they did what management asked them to do". But, sometimes, management even undid the work of the developers.

The Meego team at Nokia was planning a party for the release of their first mass-produced phone, the N8. While popping champagne during the public announcement of the N8 release, the whole team learned that the phone had eventually been shipped with Symbian. Nobody had informed the team. Elop had been CEO for less than a week and Nokia was in total chaos.

But Stephen Elop is your typical "successful CEO". "Successful" as in inheriting one of the biggest and most successful mobile phone makers and, in a couple of years, turning it into ashes. You can’t invent such "success".

During Elop's tenure, Nokia's stock price dropped 62%, their mobile phone market share was halved, their smartphone market share fell from 33% to 3%, and the company suffered a cumulative €4.9 billion loss.

It should be noted that, against all odds, the Meego-powered Nokia N9, which succeeded the N8, was a success and gave real hope of Meego competing with Android/iOS. The N9 was considered a "flagship" and it showed. At Lanedo, we had discussed having the company buy an N9 for each employee so we could "eat our own dog food" (something that was done at Collabora). But Elop’s announcement was clearly understood as the killing of Meego/Maemo and Symbian to leave room for Windows Phone!

The Nokia N9 was available in multiple colours (picture by Bytearray render on Wikimedia)

Well, Elop promised that, despite moving to Windows Phone, Nokia would release one Meego phone every year. I don’t remember if anyone bought that lie. We could not really believe that all those years of work would be killed just when the success of the N9 proved that we did it right. But that was it. The N9 was the first and the last of its kind.

Ironically, Nokia’s very first Windows Phone, the Lumia 800, would basically be the N9 with Windows Phone replacing Meego. And it would receive worse reviews than the N9.

At that moment, one question is on everybody's lips: is Stephen Elop such a bad CEO or is he destroying Nokia on purpose? Is it typical management incompetence or malevolence? Or both?

The answer came when Microsoft, Elop’s previous employer, bought Nokia for a fraction of the price it would have paid if Elop hadn’t been CEO. It’s hard to argue that this was not premeditated: Elop managed to discredit and kill every software-related project Nokia had ever done. That way, Nokia could be sold as a pure hardware maker to Microsoft, without being encumbered by a software culture too distant from Microsoft’s. And Elop went back to his old employer a richer man, receiving a huge bonus for having tanked a company. But remember, dear MBA students: he’s a "very successful manager", and you should aspire to become like him.

Capitalism works in mysterious ways.

As foolish as it sounds, this is what the situation was: the biggest historical phone maker in the world merged with the biggest historical software maker. Vic Gundotra, head of the Google+ social network, posted: "Two turkeys don’t make an eagle." But one thing was clear: Microsoft was entering the mobile computing market because everything else was suddenly irrelevant.

All business eyes were turned towards mobile computing where, ironically, Debian+GNOME had been a precursor.

Just when it looked like Ubuntu had managed to make Linux relevant on the desktop, nobody cared about the desktop anymore. How could Mark Shuttleworth make Ubuntu relevant in that new world?

(to be continued)

Subscribe by email or by rss to get the next episodes of "20 years of Linux on the Desktop".

I’m currently turning this story into a book. I’m looking for an agent or a publisher interested in working with me on this book and on an English translation of "Bikepunk", my new post-apocalyptic-cyclist typewritten novel, which sold out in three weeks in France and Belgium.

I’m Ploum, a writer and an engineer. I like to explore how technology impacts society. You can subscribe by email or by rss. I value privacy and never share your address.

I write science-fiction novels in French. For Bikepunk, my new post-apocalyptic-cyclist book, my publisher is looking for contacts in other countries to distribute it in languages other than French. If you can help, contact me!

July 17, 2025

We all know MySQL InnoDB ClusterSet, a solution that links multiple InnoDB Clusters and Read Replicas asynchronously to easily generate complex MySQL architectures and manage them without burdensome commands. All this thanks to the MySQL Shell’s AdminAPI. This is an example of MySQL InnoDB ClusterSet using two data centers: Let’s explore how we can automate [
]

July 16, 2025

We just got back from a family vacation exploring the Grand Canyon, Zion, and Bryce Canyon. As usual, I planned to write about our travels, but Vanessa, my wife, beat me to it.

She doesn't have a blog, but something about this trip inspired her to put pen to paper. When she shared her writing with me, I knew right away her words captured our vacation better than anything I could write.

Instead of starting from scratch, I asked if I could share her writing here. She agreed. I made light edits for publication, but the story and voice are hers. The photos and captions, however, are mine.


We just wrapped up our summer vacation with the boys, and this year felt like a real milestone. Axl graduated high school, so we let him have input on our destination. His request? To see some of the U.S. National Parks. His first pick was Yosemite, but traveling to California in July felt like gambling with wildfires. So we adjusted course, still heading west, but this time to the Grand Canyon, Zion and Bryce.

As it turned out, we didn't fully avoid the fire season. When we arrived at the Grand Canyon, we learned that wildfires had already been burning near Bryce for weeks. And by the time we were leaving Bryce, the Grand Canyon itself was under evacuation orders in certain areas due to its own active fires. We slipped through a safe window without disruption.

We kicked things off with a couple of nights in Las Vegas. The boys had never been, and it felt like a rite of passage. But after two days of blinking lights, slot machines, and entertainment, we were ready for something quieter. The highlight was seeing O by Cirque du Soleil at the Bellagio. The production had us wondering how many crew members it takes to make synchronized underwater acrobatics look effortless.

A large concrete dam spanning a deep canyon. When the Hoover Dam was built, they used an enormous amount of concrete. If they had poured it all at once, it would have taken over a century to cool and harden. Instead, they poured it in blocks and used cooling pipes to manage the heat. Even today, the concrete is still hardening through a process called hydration, so the dam keeps getting stronger over time.

On the Fourth of July, we picked up our rental car and headed to the Hoover Dam for a guided tour. We learned it wasn't originally built to generate electricity, but rather to prevent downstream flooding from the Colorado River. Built in the 1930s, it's still doing its job. And fun fact: the concrete is still curing after nearly a century. It takes about 100 years to fully cure.

While we were at the Hoover Dam, we got the news that Axl was admitted to study civil engineering. A proud moment in a special place centered on engineering and ambition.

From there, we drove to the South Rim of the Grand Canyon and checked into El Tovar. When we say the hotel sits on the rim, we mean right on the rim. Built in 1905, it has hosted an eclectic list of notable guests, including multiple U.S. presidents, Albert Einstein, Liz Taylor, and Paul McCartney. Standing on the edge overlooking the canyon, we couldn't help but imagine them taking in the same view, the same golden light, the same vast silence. That sense of shared wonder, stretched across generations, made the moment special. No fireworks in the desert this Independence Day, but the sunset over the canyon was its own kind of magic.

The next morning, we hiked the Bright Angel Trail to the 3-mile resthouse. Rangers, staff, and even Google warned us to start early. But with teenage boys and jet lag, our definition of "early" meant hitting the trail by 8:30am. By 10am, a ranger reminded us that hiking after that hour is not advised. We pressed on carefully, staying hydrated and dunking our hats and shirts at every water source. Going down was warm. Coming up? Brutal. But we made it, sweaty and proud. Our reward: showers, naps, and well-earned ice cream.

Next up: Zion. We stopped at Horseshoe Bend on the way, a worthy detour with dramatic views of the Colorado River. By the time we entered Zion National Park, we were in total disbelief. The landscape was so perfectly sculpted it didn't look real. Towering red cliffs, hanging gardens, and narrow slot canyons surrounded us. I told Dries, "It's like we're driving through Disneyland", and I meant that in the best way.

A view of a wide, U-shaped bend in the Colorado River surrounded by steep red rock cliffs. We visited Horseshoe Bend, a dramatic curve in the Colorado River near the Grand Canyon. A quiet reminder of what time and a patient river can carve out together.

After a long drive, we jumped into the shared pool at our rental house and met other first-time visitors who were equally blown away. That night, we celebrated our six-year wedding anniversary with tacos and cocktails at a cantina inside a converted gas station. Nothing fancy, but a good memory.

One thing that stood out in Zion was the deer. They roamed freely through the neighborhoods and seemed unbothered by our presence. Every evening, a small group would quietly wander through our yard, grazing on grass and garden beds like they owned the place.

The next morning, we hiked The Narrows, wading through the Virgin River in full gear. Our guide shared stories and trail history, and most importantly, brought a charcuterie board. We hike for snacks, after all. Learning how indigenous communities thrived in these canyons for thousands of years gave us a deeper connection to the land, especially for me, as someone with Native heritage.

A small group walking through water into the Narrows, a narrow canyon with glowing rock walls in Zion National Park. We hiked 7.5 miles through the Narrows in Zion National Park. Most of the hike is actually in the river itself, with towering canyon walls rising all around you. One of my favorite hikes ever.
A person standing on a rock in the Narrows at Zion National Park, looking up at the tall canyon walls. Taking a moment to look up and take it all in.
Three people walking up the river with their boots in the water in the Narrows at Zion National Park. Wading forward together through the Narrows in Zion.

The following day was for rappelling, scrambling, and hiking. The boys were hyped, memories of rappelling in Spain had them convinced there would be waterfalls. Spoiler: there weren't. It hadn't rained in Zion for months. But dry riverbeds didn't dull the excitement. We even found shell fossils embedded in the sandstone. Proof the area was once underwater.

Two young adults reaching down to help a parent climb up a steep sandstone wall in Zion National Park. Time has a way of flipping the roles.
A woman wearing a helmet and sunglasses, smiling while rappelling in Zion National Park. Getting ready to rappel in Zion, and enjoying every moment of it.
A person carefully walking along a narrow sandstone slot canyon in Zion National Park. Making our way through the narrow slots in Zion.

From Zion, we headed to Bryce Canyon. The forecast promised cooler temperatures, and we couldn't wait. We stayed at Under Canvas, a glamping site set in open range cattle territory. Canvas tents with beds and private bathrooms, but no electricity or WiFi. Cue the family debate: "Is this camping or hoteling?" Dries, Axl and Stan voted for "hoteling". I stood alone on "team camping". (Spoiler: it is camping when there are no outlets.) Without our usual creature comforts, we slowed down. We read. We played board games. We played cornhole. We watched sunsets and made s'mores.

A family sits around a fire pit at a campsite in the high desert outside Bryce, Utah. Glamping in the high desert outside Bryce, Utah. Even in summer, the high elevation brings cool evenings, and the fire felt perfect after a day on the trail.

The next day, we hiked the Fairyland Loop, eight miles along the rim with panoramic views into Bryce's otherworldly amphitheater of hoodoos. The towering spires and sculpted rock formations gave the park an almost storybook quality, as if the landscape had been carved by imagination rather than erosion. Though the temperature was cooler, the sun still packed a punch, so we were glad to finish before the midday heat. At night, the temperature dropped quickly once the sun went down. We woke up to 45°F (about 7°C) mornings, layering with whatever warm clothes we had packed, which, given we planned for desert heat, wasn't much.

One of our most memorable mornings came with a 4:30 am wake-up call to watch the sunrise at Sunrise Point. We had done something similar with the boys at Acadia in 2018. It's a tough sell at that hour, but always worth it. As the sun broke over the canyon, the hoodoos lit up in shades of orange and gold unlike anything we'd seen the day before. Afterward, we hiked Navajo Loop and Queen's Garden and were ready for a big breakfast at the lodge.

A young adult wearing a hoodie overlooking the hoodoos at sunrise in Bryce Canyon National Park. Up before sunrise to watch the hoodoos glow at Bryce Canyon in Utah. Cold and early, but unforgettable.
A woman with trekking poles hiking down a switchback trail among tall orange hoodoos in Bryce Canyon National Park. Vanessa making her way down through the hoodoos on the Navajo Loop in Bryce Canyon.

Later that day, we visited Mossy Cave Trail. We followed the stream, poked around the waterfall, and hunted for fossils. Axl and I were on a mission, cracking open sandstone rocks in hopes of finding hidden treasures. Mostly, we just made a mess (of ourselves). I did stumble upon a tiny sliver of geode ... nature's way of rewarding persistence, I suppose.

Before heading to Salt Lake City for laundry (yes, that's a thing after hiking in the desert for a week), we squeezed in one more thrill: whitewater rafting on the Sevier River. Our guide, Ryan, was part comedian, part chaos agent. His goal was to get the boys drenched! The Class II and III rapids were mellow but still a blast, especially since the river was higher than expected for July. We all stayed in the raft, mostly wet, mostly laughing.

Incredibly, throughout the trip, none of us got sunburned, despite hiking in triple digit heat, rappelling down canyon walls, and rafting under a cloudless sky. We each drank about 4 to 6 liters of water a day, and no one passed out, so we're calling it a win.

On our final evening, during dinner, I pulled out a vacation questionnaire I had created without telling anyone. Since the boys aren't always quick to share what they loved, I figured it was a better way to ask everyone to rate their experience. What did they love? What would they skip next time? What do they want more of, less of, or never again? It was a simple way to capture the moment, create conversation, reflect on what stood out, and maybe even help shape the next trip. Turns out, teens do have opinions, especially when sunrises and physical exertion are involved.

This trip was special. When I was a kid, I thought hiking and parks were boring. Now, it's what Dries and I seek out. We felt grateful we could hike, rappel, raft, and laugh alongside the boys. We created memories we hope they'll carry with them, long after this summer fades. We're proud of the young men they're becoming, and we can't wait for the next chapter in our family adventures.

File deduplication isn’t just for massive storage arrays or backup systems—it can be a practical tool for personal or server setups too. In this post, I’ll explain how I use hardlinking to reduce disk usage on my Linux system, which directories are safe (and unsafe) to link, why I’m OK with the trade-offs, and how I automated it with a simple monthly cron job using a neat tool called hadori.


🔗 What Is Hardlinking?

In a traditional filesystem, every file has an inode, which is essentially its real identity—the metadata that points to the actual data on disk. A hard link is a different filename that points to the same inode. That means:

  • The file appears to exist in multiple places.
  • But there’s only one actual copy of the data.
  • Deleting one link doesn’t delete the content, unless it’s the last one.

Compare this to a symlink, which is just a pointer to a path. A hardlink is a pointer to the data.

So if you have 10 identical files scattered across the system, you can replace them with hardlinks, and boom—nine of them stop taking up extra space.
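
If you want to see this for yourself, a quick experiment in any scratch directory shows the mechanics (the file names are just placeholders):

echo "hello" > original.txt           # example file
ln original.txt hardlink.txt          # create a second name for the same inode
ls -li original.txt hardlink.txt      # same inode number for both, link count is now 2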


đŸ€” Why Use Hardlinking?

My servers run a fairly standard Ubuntu install, and like most Linux machines, the root filesystem accumulates a lot of identical binaries and libraries—especially across /bin, /lib, /usr, and /opt.

That’s not a problem
 until you’re tight on disk space, or you’re just a curious nerd who enjoys squeezing every last byte.

In my case, I wanted to reduce disk usage safely, without weird side effects.

Hardlinking is a one-time cost with ongoing benefits. It’s not compression. It’s not archival. But it’s efficient and non-invasive.


📁 Which Directories Are Safe to Hardlink?

Hardlinking only works within the same filesystem, and not all directories are good candidates.

✅ Safe directories:

  • /bin, /sbin – system binaries
  • /lib, /lib64 – shared libraries
  • /usr, /usr/bin, /usr/lib, /usr/share, /usr/local – user-space binaries, docs, etc.
  • /opt – optional manually installed software

These contain mostly static files: compiled binaries, libraries, man pages
 not something that changes often.

⚠ Unsafe or risky directories:

  • /etc – configuration files, might change frequently
  • /var, /tmp – logs, spools, caches, session data
  • /home – user files, temporary edits, live data
  • /dev, /proc, /sys – virtual filesystems, do not touch

If a file is modified in place after being hardlinked, every link sees the change; and if a program replaces the file by writing a new copy and renaming it over the old name (which is what package managers and most editors do), the deduplication is silently undone. Either way you’re back where you started—or worse, sharing data you didn’t mean to.

That’s why I avoid any folders with volatile, user-specific, or auto-generated files.


🧹 Risks and Limitations

Hardlinking is not magic. It comes with sharp edges:

  • One inode, multiple names: All links are equal. Editing one changes the data for all.
  • Backups: Some backup tools don’t preserve hardlinks or treat them inefficiently.
    ➀ Duplicity, which I use, does not preserve hardlinks. It backs up each linked file as a full copy, so hardlinking won’t reduce backup size.
  • Security: Linking files with different permissions or owners can have unexpected results.
  • Limited scope: Only works within the same filesystem (e.g., can’t link / and /mnt if they’re on separate partitions).

In my setup, I accept those risks because:

  • I’m only linking read-only system files.
  • I never link config or user data.
  • I don’t rely on hardlink preservation in backups.
  • I test changes before deploying.

In short: I know what I’m linking, and why.


🔍 What the Critics Say About Hardlinking

Not everyone loves hardlinks—and for good reasons. Thoughtful critiques of the practice have been written, and they tend to raise the same points.

The core arguments:

  • Hardlinks violate expectations about file ownership and identity.
  • They can break assumptions in software that tracks files by name or path.
  • They complicate file deletion logic—deleting one name doesn’t delete the content.
  • They confuse file monitoring and logging tools, since it’s hard to tell if a file is “new” or just another name.
  • They increase the risk of data corruption if accidentally modified in-place by a script that assumes it owns the file.

Why I’m still OK with it:

These concerns are valid—but mostly apply to:

  • Mutable files (e.g., logs, configs, user data)
  • Systems with untrusted users or dynamic scripts
  • Software that relies on inode isolation or path integrity

In contrast, my approach is intentionally narrow and safe:

  • I only deduplicate read-only system files in /bin, /sbin, /lib, /lib64, /usr, and /opt.
  • These are owned by root, and only changed during package updates.
  • I don’t hardlink anything under /home, /etc, /var, or /tmp.
  • I know exactly when the cron job runs and what it targets.

So yes, hardlinks can be dangerous—but only if you use them in the wrong places. In this case, I believe I’m using them correctly and conservatively.


⚡ Does Hardlinking Impact System Performance?

Good news: hardlinks have virtually no impact on system performance in everyday use.

Hardlinks are a native feature of Linux filesystems like ext4 or xfs. The OS treats a hardlinked file just like a normal file:

  • Reading and writing hardlinked files is just as fast as normal files.
  • Permissions, ownership, and access behave identically.
  • Common tools (ls, cat, cp) don’t care whether a file is hardlinked or not.
  • Filesystem caches and memory management work exactly the same.

The only difference is that multiple filenames point to the exact same data.

Things to keep in mind:

  • If you edit a hardlinked file, all links see that change because there’s really just one file.
  • Some tools (backup, disk usage) might treat hardlinked files differently.
  • Debugging or auditing files can be slightly trickier since multiple paths share one inode.

But from a performance standpoint? Your system won’t even notice the difference.


🛠 Tools for Hardlinking

There are a few tools out there:

  • fdupes – finds duplicates and optionally replaces with hardlinks
  • rdfind – more sophisticated detection
  • hardlink – simple but limited
  • jdupes – high-performance fork of fdupes

📌 About Hadori

From the Debian package description:

This might look like yet another hardlinking tool, but it is the only one which only memorizes one filename per inode. That results in less memory consumption and faster execution compared to its alternatives. Therefore (and because all the other names are already taken) it’s called “Hardlinking DOne RIght”.

Advantages over other tools:

  • Predictability: arguments are scanned in order, each first version is kept
  • Much lower CPU and memory consumption compared to alternatives

This makes hadori especially suited for system-wide deduplication where efficiency and reliability matter.


⏱ How I Use Hadori

I run hadori once per month with a cron job. Here’s the actual command:

/usr/bin/hadori --verbose /bin /sbin /lib /lib64 /usr /opt

This scans those directories, finds duplicate files, and replaces them with hardlinks when safe.

And here’s the crontab entry I installed in the file /etc/cron.d/hadori:

@monthly root /usr/bin/hadori --verbose /bin /sbin /lib /lib64 /usr /opt

📉 What Are the Results?

After the first run, I saw a noticeable reduction in used disk space, especially in /usr/lib and /usr/share. On my modest VPS, that translated to about 300–500 MB saved—not huge, but non-trivial for a small root partition.

While this doesn’t reduce my backup size (Duplicity doesn’t support hardlinks), it still helps with local disk usage and keeps things a little tidier.

And because the job only runs monthly, it’s not intrusive or performance-heavy.
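
If you want to measure the effect yourself, GNU du can help: by default it counts each hardlinked file only once, while -l (--count-links) counts every name separately. The difference between the two numbers is roughly the space the hardlinks are saving:

du -sh /usr /opt    # hardlinked files counted once (actual disk usage)
du -shl /usr /opt   # every link counted as if it were a separate copy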


đŸ§Œ Final Thoughts

Hardlinking isn’t something most people need to think about. And frankly, most people probably shouldn’t use it.

But if you:

  • Know what you’re linking
  • Limit it to static, read-only system files
  • Automate it safely and sparingly


then it can be a smart little optimization.

With a tool like hadori, it’s safe, fast, and efficient. I’ve read the horror stories—and decided that in my case, they don’t apply.


✉ This post was brought to you by a monthly cron job and the letters i-n-o-d-e.

July 09, 2025

A few weeks ago, I was knee-deep in CSV files. Not the fun kind. These were automatically generated reports from Cisco IronPort, and they weren’t exactly what I’d call analysis-friendly. Think: dozens of columns wide, thousands of rows, with summary data buried in awkward corners.

I was trying to make sense of incoming mail categories—Spam, Clean, Malware—and the numbers that went with them. Naturally, I opened the file in Excel, intending to wrangle the data manually like I usually do. You know: transpose the table, delete some columns, rename a few headers, calculate percentages
 the usual grunt work.

But something was different this time. I noticed the “Get & Transform” section in Excel’s Data ribbon. I had clicked it before, but this time I gave it a real shot. I selected “From Text/CSV”, and suddenly I was in a whole new environment: Power Query Editor.


đŸ€Ż Wait, What Is Power Query?

For those who haven’t met it yet, Power Query is a powerful tool in Excel (and also in Power BI) that lets you import, clean, transform, and reshape data before it even hits your spreadsheet. It uses a language called M, but you don’t really have to write code—although I quickly did, of course, because I can’t help myself.

In the editor, every transformation step is recorded. You can rename columns, remove rows, change data types, calculate new columns—all through a clean interface. And once you’re done, you just load the result into Excel. Even better: you can refresh it with one click when the source file updates.


đŸ§Ș From Curiosity to Control

Back to my IronPort report. I used Power Query to:

  • Transpose the data (turn columns into rows),
  • Remove columns I didn’t need,
  • Rename columns to something meaningful,
  • Convert text values to numbers,
  • Calculate the percentage of each message category relative to the total.

All without touching a single cell in Excel manually. What would have taken 15+ minutes and been error-prone became a repeatable, refreshable process. I even added a “Percent” column that showed something like 53.4%—formatted just the way I wanted.


đŸ€“ The Geeky Bit (Optional)

I quickly opened the Advanced Editor to look at the underlying M code. It was readable! With a bit of trial and error, I started customizing my steps, renaming variables for clarity, and turning a throwaway transformation into a well-documented process.

This was the moment it clicked: Power Query is not just a tool; it’s a pipeline.


💡 Lessons Learned

  • Sometimes it pays to explore what’s already in the software you use every day.
  • Excel is much more powerful than most people realize.
  • Power Query turns tedious cleanup work into something maintainable and even elegant.
  • If you do something in Excel more than once, Power Query is probably the better way.

🎯 What’s Next?

I’m already thinking about integrating this into more of my work. Whether it’s cleaning exported logs, combining reports, or prepping data for dashboards, Power Query is now part of my toolkit.

If you’ve never used it, give it a try. You might accidentally discover your next favorite tool—just like I did.


Have you used Power Query before? Let me know your tips or war stories in the comments!

July 02, 2025

Lately, I’ve noticed something strange happening in online discussions: the humble em dash (—) is getting side-eyed as a telltale sign that a text was written with a so-called “AI.” I prefer the more accurate term: LLM (Large Language Model), because “artificial intelligence” is a bit of a stretch — we’re really just dealing with very complicated statistics đŸ€–đŸ“Š.

Now, I get it — people are on high alert, trying to spot generated content. But I’d like to take a moment to defend this elegant punctuation mark, because I use it often — and deliberately. Not because a machine told me to, but because it helps me think 🧠.

A Typographic Tool, Not a Trend 🖋

The em dash has been around for a long time — longer than most people realize. The oldest printed examples I’ve found are in early 17th-century editions of Shakespeare’s plays, published by the printer Okes in the 1620s. That’s not just a random dash on a page — that’s four hundred years of literary service 📜. If Shakespeare’s typesetters were using em dashes before indoor plumbing was common, I think it’s safe to say they’re not a 21st-century LLM quirk.

The Tragedy of Othello, the Moor of Venice, with long dashes (typeset here with 3 dashes)

A Dash for Thoughts 💭

In Dutch, the em dash is called a gedachtestreepje — literally, a thought dash. And honestly? I think that’s beautiful. It captures exactly what the em dash does: it opens a little mental window in your sentence. It lets you slip in a side note, a clarification, an emotion, or even a complete detour — just like a sudden thought that needs to be spoken before it disappears. For someone like me, who often thinks in tangents, it’s the perfect punctuation.

Why I Use the Em Dash (And Other Punctuation Marks)

I’m autistic, and that means a few things for how I write. I tend to overshare and infodump — not to dominate the conversation, but to make sure everything is clear. I don’t like ambiguity. I don’t want anyone to walk away confused. So I reach for whatever punctuation tools help me shape my thoughts as precisely as possible:

  • Colons help me present information in a tidy list — like this one.
  • Brackets let me add little clarifications (without disrupting the main sentence).
  • And em dashes — ah, the em dash — they let me open a window mid-sentence to give you extra context, a bit of tone, or a change in pace.

They’re not random. They’re intentional. They reflect how my brain works — and how I try to bridge the gap between thoughts and words 🌉.

It’s Not Just a Line — It’s a Rhythm đŸŽ”

There’s also something typographically beautiful about the em dash. It’s not a hyphen (-), and it’s not a middling en dash (–). It’s long and confident. It creates space for your eyes and your thoughts. Used well, it gives writing a rhythm that mimics natural speech, especially the kind of speech where someone is passionate about a topic and wants to take you on a detour — just for a moment — before coming back to the main road đŸ›€.

I’m that someone.

Don’t Let the Bots Scare You

Yes, LLMs tend to use em dashes. So do thoughtful human beings. Let’s not throw centuries of stylistic nuance out the window because a few bots learned how to mimic good writing. Instead of scanning for suspicious punctuation, maybe we should pay more attention to what’s being said — and how intentionally 💬.

So if you see an em dash in my writing, don’t assume it came from a machine. It came from me — my mind, my style, my history with language. And I’m not going to stop using it just because an algorithm picked up the habit 💛.

July 01, 2025

An astronaut (Cloudflare) facing giant glowing structures (crawlers) drawing energy in an alien sunset landscape.

AI is rewriting the rules of how we work and create. Expert developers can now build faster, non-developers can build software, research is accelerating, and human communication is improving. In the next 10 years, we'll probably see a 1,000x increase in AI demand. That is why Drupal is investing heavily in AI.

But at the same time, AI companies are breaking the web's fundamental economic model. This problem demands our attention.

The AI extraction problem

For 25 years, we built the Open Web on an implicit agreement: search engines could index our content because they sent users back to our websites. That model helped sustain blogs, news sites, and even open source projects.

AI companies broke that model. They train on our work and answer questions directly in their own interfaces, cutting creators out entirely. Anthropic's crawler reportedly makes 70,000 website requests for every single visitor it sends back. That is extraction, not exchange.

This is the Makers and Takers problem all over again.

The damage is real:

  • Chegg, an online learning platform, filed an antitrust lawsuit against Google, claiming that AI-powered search answers have crushed their website traffic and revenue.
  • Stack Overflow has seen a significant drop in daily active users and new questions (about 25-50%), as more developers turn to ChatGPT for faster answers.
  • I recently spoke with a recipe blogger who is a solo entrepreneur. With fewer visitors, they're earning less from advertising. They poured their heart, craft, and sweat into creating a high-quality recipe website, but now they believe their small business won't survive.

None of this should surprise us. According to Similarweb, since Google launched "AI Overviews", the number of searches that result in no click-throughs has increased from 56% in May 2024 to 69% in May 2025, meaning users get their answers directly on the results page.

This "zero-click" phenomenon reinforces the shift I described in my 2015 post, "The Big Reverse of the Web". Ten years ago, I argued that the web was moving away from sending visitors out to independent sites and instead keeping them on centralized platforms, all in the name of providing a faster and more seamless user experience.

However, the picture isn't entirely negative. Some companies find that visitors from AI tools, while small in volume, convert at much higher rates. At Acquia, the company I co-founded, traffic from AI chatbots makes up less than 1 percent of total visitors but converts at over 6 percent, compared to typical rates of 2 to 3 percent. We are still relatively early in the AI adoption cycle, so time will tell how this trend evolves, how marketers adapt, and what new opportunities it might create.

Finding a new equilibrium

There is a reason this trend has taken hold: users love it. AI-generated answers provide instant, direct information without extra clicks. It makes traditional search engines look complicated by comparison.

But this improved user experience comes at a long-term cost. When value is extracted without supporting the websites and authors behind it, it threatens the sustainability of the content we all rely on.

I fully support improving the user experience. That should always come first. But it also needs to be balanced with fair support for creators and the Open Web.

We should design systems that share value more fairly among users, AI companies, and creators. We need a new equilibrium that sustains creative work, preserves the Open Web, and still delivers the seamless experiences users expect.

Some might worry it is already too late, since large AI companies have massive scraped datasets and can generate synthetic data to fill gaps. But I'm not so sure. The web will keep evolving for decades, and no model can stay truly relevant without fresh, high-quality content.

From voluntary rules to enforcement

We have robots.txt, a simple text file that tells crawlers which parts of a website they can access. But it's purely voluntary. Creative Commons launched CC Signals last week, allowing content creators to signal how AI can reuse their work. But both robots.txt and CC Signals are "social contracts" that are hard to enforce.
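
As a concrete example, a site that wants to opt out of AI training crawls can add rules like these to its robots.txt. The tokens below are the crawler names the major AI vendors currently document, and, because robots.txt is voluntary, crawlers are free to ignore them:

# Example only: check each vendor's documentation for its current crawler token
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /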

Today, Cloudflare announced they will default to blocking AI crawlers from accessing content. This change lets website owners decide whether to allow access and whether to negotiate compensation. Cloudflare handles 20% of all web traffic. When an AI crawler tries to access a website protected by Cloudflare, it must pass through Cloudflare's servers first. This allows Cloudflare to detect crawlers that ignore robots.txt directives and block them.

This marks a shift from purely voluntary signals to actual technical enforcement. Large sites could already afford their own infrastructure to detect and block crawlers or negotiate licensing deals directly. For example, Reddit signed a $60 million annual deal with Google to license its content for AI training.

However, most content creators, like you and I, can do neither.

Cloudflare's actions establish a crucial principle: AI training data has a price, and creators deserve to share in the value AI generates from their work.

The missing piece: content licensing marketplaces

Accessible enforcement infrastructure is step one, and Cloudflare now provides that. Step two would be a content licensing marketplace that helps broker deals between AI companies and content creators at any scale. This would move us from simply blocking to creating a fair economic exchange.

To the best of my knowledge, such marketplaces do not exist yet, but the building blocks are starting to emerge. Matthew Prince, CEO of Cloudflare, has hinted that Cloudflare may be working on building such a marketplace, and I think it is a great idea.

I don't know what that will look like, but I imagine something like Shutterstock for AI training data, combined with programmatic pricing like Google Ads. On Shutterstock, photographers upload images, set licensing terms, and earn money when companies license their photos. Google Ads automatically prices and places millions of ads without manual negotiations. A future content licensing marketplace could work in a similar way: creators would set licensing terms (like they do on Shutterstock), while automated systems manage pricing and transactions (as Google Ads does).

Today, only large platforms like Reddit can negotiate direct licensing deals with AI companies. A marketplace with programmatic pricing would make licensing accessible to creators of all sizes. Instead of relying on manual negotiations or being scraped for free, creators could opt into fair, programmatic licensing programs.

This would transform the dynamic from adversarial blocking to collaborative value creation. Creators get compensated. AI companies get legal, high-quality training data. Users benefit from better AI tools built on ethically sourced content.

Making the Open Web sustainable

We built the Open Web to democratize access to knowledge and online publishing. AI advances this mission of democratizing knowledge. But we also need to ensure the people who write, record, code, and share that knowledge aren't left behind.

The issue is not that AI exists. The problem is that we have not built economic systems to support the people and organizations that AI relies on. This affects independent bloggers, large media companies, and open source maintainers whose code and documentation train coding assistants.

Call me naive, but I believe AI companies want to work with content creators to solve this. Their challenge is that no scalable system exists to identify, contact, and pay millions of content creators.

Content creators lack tools to manage and monetize their rights. AI companies lack systems to discover and license content at scale. Cloudflare's move is a first step. The next step is building content licensing marketplaces that connect creators directly with AI companies.

The Open Web needs economic systems that sustain the people who create its content. There is a unique opportunity here: if content creators and AI companies build these systems together, we could create a stronger, more fair, and more resilient Web than we have had in 25 years. The jury is out on that, but one can dream.

Disclaimer: Acquia, my company, has a commercial relationship with Cloudflare, but this perspective reflects my long-standing views on sustainable web economics, not any recent briefings or partnerships.

June 25, 2025

Sometimes things go your way, sometimes they just don’t. The townhouse we had completely fallen in love with has unfortunately been rented out to someone else. A pity, but we are not giving up. We are continuing our search — and hopefully you can help us with it!

We are three people who want to share a house in Ghent. We form a warm, mindful and respectful co-housing group, and we dream of a place where we can combine calm, connection and creativity.

Who are we?

đŸ‘€ Amedee (48): IT professional, balfolk dancer, amateur musician, loves board games and hiking, autistic and socially engaged
đŸ‘© ChloĂ« (almost 52): Artist, former Waldorf teacher and permaculture designer, loves creativity, cooking and nature
🎹 Kathleen (54): Doodle artist with a socio-cultural background, loves good company, being outdoors, and enjoys writing

We want to build a home together where communication, care and freedom are central. A place where you feel at home, with room for small activities such as a games night, a workshop, a creative session, or simply quiet time together.

What are we looking for?

🏡 A house (not an apartment) in Ghent, at most a 15-minute bike ride from Gent-Sint-Pieters station
🌿 Energy-efficient: EPC label B or better
🛏 At least 3 spacious bedrooms of about 20 mÂČ each
đŸ’¶ Rent:

  ‱ up to €1650/month for 3 bedrooms
  ‱ up to €2200/month for 4 bedrooms

Extra spaces such as an attic, guest room, studio, office, or hobby room are very welcome. We love airy, multifunctional rooms that can grow along with our needs.

📅 Available: from now, October at the latest

💬 Does the house have 4 bedrooms? Then we would gladly welcome a fourth housemate who shares our values. We deliberately want to avoid more than 4 residents, though: small-scale living works best for us.

Do you know of something? Let us hear from you!

Do you know a house that fits this picture?
We are open to tips via real estate agencies, friends, neighbours, colleagues, or other networks: everything helps!

đŸ“© Contact: amedee@vangasse.eu

Thank you for keeping an eye out with us, and sharing is always welcome 💜

June 24, 2025

A glowing light bulb hanging in an underground tunnel.

In my post about digital gardening and public notes, I shared a principle I follow: "If a note can be public, it should be". I also mentioned using Obsidian for note-taking. Since then, various people have asked about my Obsidian setup.

I use Obsidian to collect ideas over time rather than to manage daily tasks or journal. My setup works like a Commonplace book, where you save quotes, thoughts, and notes to return to later. It is also similar to a Zettelkasten, where small, linked notes build deeper understanding.

What makes such note-taking systems valuable is how they help ideas grow and connect. When notes accumulate over time, connections start to emerge. Ideas compound slowly. What starts as scattered thoughts or quotes becomes the foundation for blog posts or projects.

Why plain text matters

One of the things I appreciate most about Obsidian is that it stores notes as plain text Markdown files on my local filesystem.

Plain text files give you full control. I sync them with iCloud, back them up myself, and track changes using Git. You can search them with command-line tools, write scripts to process them outside of Obsidian, or edit them in other applications. Your notes stay portable and usable any way you want.
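
Here is a small, hypothetical sketch of what that control looks like in practice, using standard command-line tools (the vault path and note name are placeholders):

VAULT=~/Notes

# Find every note that mentions a topic.
grep -rl --include='*.md' 'solar' "$VAULT"

# Count the number of Markdown notes in the vault.
find "$VAULT" -name '*.md' | wc -l

# Show the edit history of a single topic page (assuming the vault is tracked with Git).
git -C "$VAULT" log --oneline -- 'Solar-powered websites.md'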

Plus, plain text files have long-term benefits. Note-taking apps come and go, companies fold, subscription models shift. But plain text files remain accessible. If you want your notes to last for decades, they need to be in a format that stays readable, editable, and portable as technology changes. A Markdown file you write today will open just fine in 2050.

All this follows what Obsidian CEO Steph Ango calls the "files over apps" philosophy: your files should outlast the tools that create them. Don't lock your thinking into formats you might not be able to access later.

My tools

Before I dive into how I use Obsidian, it is worth mentioning that I use different tools for different types of thinking. Some people use Obsidian for everything – task management, journaling, notes – but I prefer to separate those.

For daily task management and meeting notes, I rely on my reMarkable Pro. A study titled The Pen Is Mightier Than the Keyboard by Mueller and Oppenheimer found that students who took handwritten notes retained concepts better than those who typed them. Handwriting meeting notes engages deeper cognitive processing than typing, which can improve understanding and memory.

For daily journaling and event tracking, I use a custom iOS app I built myself. I might share more about that another time.

Obsidian is where I grow long-term ideas. It is for collecting insights, connecting thoughts, and building a knowledge base that compounds over time.

How I capture ideas

In Obsidian, I organize my notes around topic pages. Examples are "Coordination challenges in Open Source", "Solar-powered websites", "Open Source startup lessons", or "How to be a good dad".

I have hundreds of these topic pages. I create a new one whenever an idea feels worth tracking.

Each topic page grows slowly over time. I add short summaries, interesting links, relevant quotes, and my own thoughts whenever something relevant comes up. The idea is to build a thoughtful collection of notes that deepens and matures over time.

Some notes stay short and focused. Others grow rich with quotes, links, and personal reflections. As notes evolve, I sometimes split them into more specific topics or consolidate overlapping ones.

I do not schedule formal reviews. Instead, notes come back to me when I search, clip a new idea, or revisit a related topic. A recent thought often leads me to something I saved months or years ago, and may prompt me to reorganize related notes.

Obsidian's core features help these connections deepen. I use tags, backlinks, and the graph view to connect notes and reveal patterns between them.

How I use notes

The biggest challenge with note-taking is not capturing ideas, but actually using them. Most notes get saved and then forgotten.

Some of my blog posts grow directly from these accumulated notes. Makers and Takers, one of my most-read blog posts, pre-dates Obsidian and did not come from this system. But if I write a follow-up, it will. I have a "Makers and Takers" note where relevant quotes and ideas are slowly accumulating.

As my collection of notes grows, certain notes keep bubbling up while others fade into the background. The ones that resurface again and again often signal ideas worth writing about or projects worth pursuing.

What I like about this process is that it turns note-taking into more than just storage. As I've said many times, writing is how I think. Writing pushes me to think, and it is the process I rely on to flesh out ideas. I do not treat my notes as final conclusions, but as ongoing conversations with myself. Sometimes two notes written months apart suddenly connect in a way I had not noticed before.

My plugin setup

Obsidian has a large plugin ecosystem that reminds me of Drupal's. I mostly stick with core plugins, but use the following community ones:

  • Dataview – Think of it as SQL queries for your notes. I use it to generate dynamic lists like TABLE FROM #chess AND #opening AND #black to see all my notes on chess openings for Black. It turns your notes into a queryable database.

  • Kanban – Visual project boards for tracking progress on long-term ideas. I maintain Kanban boards for Acquia, Drupal, improvements to dri.es, and more. Unlike daily task management, these boards capture ideas that evolve over months or years.

  • Linter – Automatically formats my notes: standardizes headings, cleans up spacing, and more. It runs on save, keeping my Markdown clean.

  • Encrypt – Encrypts specific notes with password protection. Useful for sensitive information that I want in my knowledge base but need to keep secure.

  ‱ Pandoc – Exports notes to Word documents, PDFs, HTML, and other formats using Pandoc (see the example after this list).

  • Copilot – I'm still testing this, but the idea of chatting with your own knowledge base is compelling. You can also ask AI to help organize notes more effectively.
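
A quick note on the Pandoc plugin: it relies on the regular pandoc command-line tool under the hood, so the same export also works outside Obsidian. A minimal, hypothetical example (the note name is a placeholder, and pandoc must be installed):

pandoc 'Solar-powered websites.md' -o 'Solar-powered websites.docx'
pandoc 'Solar-powered websites.md' -s -o 'Solar-powered websites.html'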

The Obsidian Web Clipper

The tool I'd actually recommend most isn't a traditional Obsidian plugin: it's the official Obsidian Web Clipper browser extension. I have it installed on my desktop and phone.

When I find something interesting online, I highlight it and clip it directly into Obsidian. This removes friction from the process.

I usually save just a quote or a short section of an article, not the whole article. Some days I save several clips. Other days, I save none at all.

Why this works

For me, Obsidian is not just a note-taking tool. It is a thinking environment. It gives me a place to collect ideas, let them mature, and return to them when the time is right. I do not aim for perfect organization. I aim for a system that feels natural and helps me notice connections I would otherwise miss.

June 22, 2025

OpenTofu


Terraform or OpenTofu (the open-source fork supported by the Linux Foundation) is a nice tool to set up infrastructure on different cloud environments. There is also a provider that supports libvirt.

If you want to get started with OpenTofu, there is a free training available from the Linux Foundation:

I also joined the talk about OpenTofu and Infrastructure as Code in general in the Virtualization and Cloud Infrastructure devroom at FOSDEM this year:

I’ll not explain “Declarative” vs “Imperative” in this blog post; there are already enough blog posts and websites that (try to) explain this in more detail (the links above are a good start).

The default behaviour of OpenTofu is not to try to update an existing environment, which makes it well suited for creating disposable environments.
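
In practice, that is the familiar init/plan/apply/destroy cycle. A minimal sketch, assuming a working directory with a main.tf that declares the virtual machine:

tofu init      # download the required providers and modules
tofu plan      # preview what will be created
tofu apply     # create the disposable environment
tofu destroy   # throw the environment away again when you are done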

Tails description

Tails

Tails is a nice GNU/Linux distribution to connect to the Tor network.

Personally, I’m less into the “privacy” aspect of the Tor network (although being aware that you’re tracked and followed is important), probably because I’m lucky to live in the “Free world”.

For people who are less lucky (people who live in a country where freedom of speech isn’t valued) or for journalists, for example, there are good reasons to use the Tor network and hide their internet traffic.

tails/libvirt Terraform/OpenTofu module


To make it easier to spin up a virtual machine with the latest Tails release, I created a Terraform/OpenTofu module that does exactly that on libvirt.

There are security considerations when you run Tails in a virtual machine. See

for more information.

The source code of the module is available at the git repository:

The module is published on the Terraform Registry and the OpenTofu Registry.

Have fun!

June 18, 2025

Have you always wanted to live together with lovely people in a warm, open, and respectful atmosphere? Then this might be something for you.

Together with two friends, I am starting a new cohousing project in Ghent. We have our eye on a beautifully renovated townhouse, and we are looking for a fourth person to share the house with.

The house

It is a spacious townhouse full of character, with energy label B+. It offers:

Four full-sized bedrooms of 18 to 20 mÂČ each

One extra room that we can set up as a guest room, office, or hobby room

Two bathrooms

Two kitchens

An attic with sturdy beams: the creative ideas are already bubbling up!


The location is excellent: on Koning Albertlaan, barely a 5-minute bike ride from Gent-Sint-Pieters station and 7 minutes from the Korenmarkt. The rent is €2200 in total, which comes down to €550 per person with four residents.

The house is already available from 1 July 2025.

Who are we looking for?

We are looking for someone who identifies with a number of shared values and would like to be part of a respectful, open, and mindful living environment. Concretely, that means for us:

You are open to diversity in all its forms

You are respectful, communicative, and considerate of others

You have an affinity with themes such as inclusion, mental health, and living together with care for one another

You have a calm character and enjoy contributing to a safe, harmonious atmosphere in the house

Age is not decisive, but since we are all 40+ ourselves, we are looking for someone who recognizes themselves in that stage of life


Something for you?

Do you feel a click with this story? Or do you have questions and would you like to get to know us better? Don't hesitate to get in touch via amedee@vangasse.eu.

Is this not for you, but do you know someone who would fit perfectly into this picture? Then please share this call. Thank you!

Together we can turn this house into a warm home.

A few years ago, I quietly adopted a small principle that has changed how I think about publishing on my website. It's a principle I've been practicing for a while now, though I don't think I've ever written about it publicly.

The principle is: If a note can be public, it should be.

It sounds simple, but this idea has quietly shaped how I treat my personal website.

I was inspired by three overlapping ideas: digital gardens, personal memexes, and "Today I Learned" entries.

Writers like Tom Critchlow, Maggie Appleton, and Andy Matuschak maintain what they call digital gardens. They showed me that a personal website does not have to be a collection of polished blog posts. It can be a living space where ideas can grow and evolve. Think of it more as an ever-evolving notebook than a finished publication, constantly edited and updated over time.

I also learned from Simon Willison, who publishes small, focused Today I Learned (TIL) entries. They are quick, practical notes that capture a moment of learning. They don't aim to be comprehensive; they simply aim to be useful.

And then there is Cory Doctorow. In 2021, he explained his writing and publishing workflow, which he describes as a kind of personal memex. A memex is a way to record your knowledge and ideas over time. While his memex is not public, I found his approach inspiring.

I try to take a lot of notes. For the past four years, my tool of choice has been Obsidian. It is where I jot things down, think things through, and keep track of what I am learning.

In Obsidian, I maintain a Zettelkasten system. It is a method for connecting ideas and building a network of linked thoughts. It is not just about storing information but about helping ideas grow over time.

At some point, I realized that many of my notes don't contain anything private. If they're useful to me, there is a good chance they might be useful to someone else too. That is when I adopted the principle: If a note can be public, it should be.

So a few years ago, I began publishing these kinds of notes on my site. You might have seen examples like Principles for life, PHPUnit tests for Drupal, Brewing coffee with a moka pot when camping, or Setting up password-free SSH logins.

These pages on my website are not blog posts. They are living notes. I update them as I learn more or come back to the topic. To make that clear, each note begins with a short disclaimer that says what it is. Think of it as a digital notebook entry rather than a polished essay.

Now, I do my best to follow my principle, but I fall short more than I care to admit. I have plenty of notes in Obsidian that could have made it to my website but never did.

Often, it's simply inertia. Moving a note from Obsidian to my Drupal site involves a few steps. While not difficult, these steps consume time I don't always have. I tell myself I'll do it later, and then 'later' often never arrives.

Other times, I hold back because I feel insecure. I am often most excited to write when I am learning something new, but that is also when I know the least. What if I misunderstood something? The voice of doubt can be loud enough to keep a note trapped in Obsidian, never making it to my website.

But I keep pushing myself to share in public. I have been learning in the open and sharing in the open for 25 years, and some of the best things in my life have come from that. So I try to remember: if notes can be public, they should be.

June 11, 2025

I am excited to share some wonderful news—Sibelga and Passwerk have recently published a testimonial about my work, and it has been shared across LinkedIn, Sibelga’s website, and even on YouTube!


What Is This All About?

Passwerk is an organisation that matches talented individuals on the autism spectrum with roles in IT and software testing, creating opportunities based on strengths and precision. I have been working with them as a consultant, currently placed at Sibelga, Brussels’ electricity and gas distribution network operator.

The article and video highlight how being “different” does not have to be a limitation—in fact, it can be a real asset in the right context. It means a lot to me to be seen and appreciated for who I am and the quality of my work.


Why This Matters

For many neurodivergent people, the professional world can be full of challenges that go beyond the work itself. Finding the right environment—one that values accuracy, focus, and dedication—can be transformative.

I am proud to be part of a story that shows what is possible when companies look beyond stereotypes and embrace neurodiversity as a strength.


Thank you to Sibelga, Passwerk, and everyone who contributed to this recognition. It is an honour to be featured, and I hope this story inspires more organisations to open up to diverse talents.

👉 Want to know more? Check out the article or watch the video!

June 09, 2025

Imagine a marketer opening Drupal with a clear goal in mind: launching a campaign for an upcoming event.

They start by uploading a brand kit to Drupal CMS: logos, fonts, and color palette. They define the campaign's audience as mid-sized business owners interested in digital transformation. Then they create a creative guide that outlines the event's goals, key messages, and tone.

With this in place, AI agents within Drupal step in to assist. Drawing from existing content and media, the agents help generate landing pages, each optimized for a specific audience segment. They suggest headlines, refine copy based on the creative guide, create components based on the brand kit, insert a sign-up form, and assemble everything into cohesive, production-ready pages.

Using Drupal's built-in support for the Model Context Protocol (MCP), the AI agents connect to analytics tools and monitor performance. If a page is not converting well, the system makes overnight updates. It might adjust layout, improve clarity, or refine the calls to action.

Every change is tracked. The marketer can review, approve, revert, or adjust anything. They stay in control, even as the system takes on more of the routine work.

Why it matters

AI is changing how websites are built and managed faster than most people expected. The digital experience space is shifting from manual workflows to outcome-driven orchestration. Instead of building everything from scratch, users will set goals, and AI will help deliver results.

This future is not about replacing people. It is about empowering them. It is about freeing up time for creative and strategic work while AI handles the rest. AI will take care of routine tasks, suggest improvements, and respond to real-time feedback. People will remain in control, but supported by powerful new tools that make their work easier and faster.

The path forward won't be perfect. Change is never easy, and there are still many lessons to learn, but standing still isn't an option. If we want AI to head in the right direction, we have to help steer it. We are excited to move fast, but just as committed to doing it thoughtfully and with purpose.

The question is not whether AI will change how we build websites, but how we as a community will shape that change.

A coordinated push forward

Drupal already has a head start in AI. At DrupalCon Barcelona 2024, I showed how Drupal's AI tools help a site creator market wine tours. Since then, we have seen a growing ecosystem of AI modules, active integrations, and a vibrant community pushing boundaries. Today, about 1,000 people are sharing ideas and collaborating in the #ai channel on Drupal Slack.

At DrupalCon Atlanta in March 2025, I shared our latest AI progress. We also brought together key contributors working on AI in Drupal. Our goal was simple: get organized and accelerate progress. After the event, the group committed to align on a shared vision and move forward together.

Since then, this team has been meeting regularly, almost every day. I've been working with the team to help guide the direction. With a lot of hard work behind us, I'm excited to introduce the Drupal AI Initiative.

The Drupal AI Initiative builds on the momentum in our community by bringing structure and shared direction to the work already in progress. By aligning around a common strategy, we can accelerate innovation.

What we're launching today

The Drupal AI Initiative is closely aligned with the broader Drupal CMS strategy, particularly in its focus on making site building both faster and easier. At the same time, this work is not limited to Drupal CMS. It is also intended to benefit people building custom solutions on Drupal Core, as well as those working with alternative distributions of Drupal.

To support this initiative, we are announcing:

  • A clear strategy to guide Drupal's AI vision and priorities (PDF mirror).
  • A Drupal AI leadership team to drive product direction, fundraising, and collaboration across work tracks.
  • A funded delivery team focused on execution, with the equivalent of several full-time roles already committed, including technical leads, UX and project managers, and release coordination.
  • Active work tracks covering areas like AI Core, AI Products, AI Marketing, and AI UX.
  • USD $100,000 in operational funding, contributed by the initiative's founding companies.

For more details, read the full announcement on the Drupal AI Initiative page on Drupal.org.

Founding members and early support

Screenshot of a Google Hangout video call with nine smiling participants, the founding members of the Drupal AI initiative. Some of the founding members of the Drupal AI initiative during our launch call on Google Hangouts.

Over the past few months, we've invested hundreds of hours shaping our AI strategy, defining structure, and taking first steps.

I want to thank the founding members of the Drupal AI Initiative. These individuals and organizations played a key role in getting things off the ground. The list is ordered alphabetically by last name to recognize all contributors equally:

These individuals, along with the companies supporting them, have already contributed significant time, energy, and funding. I am grateful for their early commitment.

I also want to thank the staff at the Drupal Association and the Drupal CMS leadership team for their support and collaboration.

What comes next

I'm glad the Drupal AI Initiative is now underway. The Drupal AI strategy is published, the structure is in place, and multiple work tracks are open and moving forward. We'll share more details and updates in the coming weeks.

With every large initiative, we are evolving how we organize, align, and collaborate. The Drupal AI Initiative builds on that progress. As part of that, we are also exploring more ways to recognize and reward meaningful contributions.

We are creating ways for more of you to get involved with Drupal AI. Whether you are a developer, designer, strategist, or sponsor, there is a place for you in this work. If you're part of an agency, we encourage you to step forward and become a Maker. The more agencies that contribute, the more momentum we build.

Update: In addition to the initiative's founding members, Amazee.io already stepped forward with another commitment of USD $20,000 and one full-time contributor. Thank you! This brings the total operating budget to USD $120,000. Please consider joining as well.

AI is changing how websites and digital experiences are built. This is our moment to be part of the change and help define what comes next.

Join us in the #ai-initiative channel on Drupal Slack to get started.

June 08, 2025

lookat 2.1.0rc1

Lookat 2.1.0rc1 is the latest development release of Lookat/Bekijk, a user-friendly Unix file browser/viewer that supports colored man pages.

The focus of the 2.1.0 release is to add ANSI Color support.


 

News

8 Jun 2025 Lookat 2.1.0rc1 Released

Lookat 2.1.0rc1 is the first release candidate of Lookat 2.1.0.

ChangeLog

Lookat / Bekijk 2.1.0rc1
  • ANSI Color support

Lookat 2.1.0rc1 is available at:

Have fun!

June 04, 2025

A few weeks ago, I set off for Balilas, a balfolk festival in Janzé (near Rennes), Brittany (France). I had never been before, but as long as you have dance shoes, a tent, and good company, what more do you need?

Bananas for scale

From Ghent to Brittany with Two Dutch Strangers

My journey began in Ghent, where I was picked up by Sterre and Michelle, two dancers from the Netherlands. I did not know them too well beforehand, but in the balfolk world, that is hardly unusual — de balfolkcommunity is één grote familie — one big family.

We took turns driving, chatting, laughing, and singing along. Google Maps logged our total drive time at 7 hours and 39 minutes.

Google knows everything
Péage – one of the many

Along the way, we had the perfect soundtrack:
đŸŽ¶ French Road Trip đŸ‡«đŸ‡·đŸ„–: 7 hours and 49 minutes of French and Francophone hits.

https://open.spotify.com/playlist/3jRMHCl6qVmVIqXrASAAmZ?si=746a7f78ca30488a

🍕 A Tasty Stop in PrĂ©-en-Pail-Saint-Samson

Somewhere around dinner time, we stopped at La Sosta, a cozy Italian restaurant in Pré-en-Pail-Saint-Samson (2,300 inhabitants). I had a pizza normande: tomato base, andouille sausage, apple, mozzarella, cream, and parsley. A delicious and unexpected regional twist, definitely worth remembering!

pizza normande

The pizzas were excellent, but also generously sized, too big to finish in one sitting. Fortunately, they offered to pack up the leftovers for us to take away. That was a nice touch, and much appreciated after a long day on the road.

Just too much to eat it all

â›ș Arrival Just Before Dark

We arrived at the Balilas festival site five minutes after sunset, with just enough light left to set up our tents before nightfall. Trugarez d’an heol — thank you, sun, for holding out a little longer.

There were two other cars filled with people coming from the Netherlands, but they had booked a B&B. We chose to camp on-site to soak in the full festival atmosphere.

Enjoy the view! Banana pancakes!

Balilas itself was magical: days and nights filled with live music, joyful dancing, friendly faces, and the kind of warm atmosphere that defines balfolk festivals.

Photo: Poppy Lens

More info and photos:
🌐 balilas.lesviesdansent.bzh
📾 @balilas.balfolk on Instagram


Balfolk is more than just dancing. It is about trust, openness, and sharing small adventures with people you barely know—who somehow feel like old friends by the end of the journey.

Tot de volgende — à la prochaine — betek ar blez a zeu!
đŸ•ș💃

Thank you MaĂŻ for proofreading the Breton expressions. ❀

May 28, 2025

In the world of DevOps and continuous integration, automation is essential. One fascinating way to visualize the evolution of a codebase is with Gource, a tool that creates animated tree diagrams of project histories.

Recently, I implemented a GitHub Actions workflow in my ansible-servers repository to automatically generate and deploy Gource visualizations. In this post, I will walk you through how the workflow is set up and what it does.

But first, let us take a quick look back



🕰 Back in 2013: Visualizing Repos with Bash and XVFB

More than a decade ago, I published a blog post about Gource (in Dutch) where I described a manual workflow using Bash scripts. At that time, I ran Gource headlessly using xvfb-run, piped its output through pv, and passed it to ffmpeg to create a video.

It looked something like this:

#!/bin/bash -ex

# Render the repository history with Gource inside a virtual framebuffer,
# show progress with pv, and encode the raw frames to MP4 with ffmpeg.
xvfb-run -a -s "-screen 0 1280x720x24" \
  gource \
    --seconds-per-day 1 \
    --auto-skip-seconds 1 \
    --file-idle-time 0 \
    --max-file-lag 1 \
    --key \
    -1280x720 \
    -r 30 \
    -o - \
  | pv -cW \
  | ffmpeg \
    -loglevel warning \
    -y \
    -b:v 3000K \
    -r 30 \
    -f image2pipe \
    -vcodec ppm \
    -i - \
    -vcodec libx264 \
    -preset ultrafast \
    -pix_fmt yuv420p \
    -crf 1 \
    -threads 0 \
    -bf 0 \
    ../gource.mp4

This setup worked well for its time and could even be automated via cron or a Git hook. However, it required a graphical environment workaround and quite a bit of shell-fu.


🧬 From Shell Scripts to GitHub Actions

Fast forward to today, and things are much more elegant. The modern Gource workflow lives in .github/workflows/gource.yml and is:

  • 🔁 Reusable through workflow_call
  • 🔘 Manually triggerable via workflow_dispatch
  • 📩 Integrated into a larger CI/CD pipeline (pipeline.yml)
  • ☁ Cloud-native, with video output stored on S3

Instead of bash scripts and virtual framebuffers, I now use a well-structured GitHub Actions workflow with clear job separation, artifact management, and summary reporting.


🚀 What the New Workflow Does

The GitHub Actions workflow handles everything automatically:

  1. ⏱ Decides if a new Gource video should be generated, based on time since the last successful run.
  2. đŸ“œ Generates a Gource animation and a looping thumbnail GIF.
  3. ☁ Uploads the files to an AWS S3 bucket.
  4. 📝 Posts a clean summary with links, preview, and commit info.

It supports two triggers:

  • workflow_dispatch (manual run from the GitHub UI)
  • workflow_call (invoked from other workflows like pipeline.yml)

You can specify how frequently it should run with the skip_interval_hours input (default is every 24 hours).


🔍 Smart Checks Before Running

To avoid unnecessary work, the workflow first checks:

  • If the workflow file itself was changed.
  • When the last successful run occurred.
  • Whether the defined interval has passed.

Only if those conditions are met does it proceed to the generation step.


🛠 Building the Visualization

đŸ§Ÿ Step-by-step:

  1. Checkout the Repo
    Uses actions/checkout with fetch-depth: 0 to ensure full commit history.
  2. Generate Gource Video
    Uses nbprojekt/gource-action with configuration for avatars, title, and resolution.
  3. Install FFmpeg
    Uses AnimMouse/setup-ffmpeg to enable video and image processing.
  4. Create a Thumbnail
    Extracts preview frames and assembles a looping GIF for visual summaries (see the sketch after this list).
  5. Upload Artifacts
    Uses actions/upload-artifact to store files for downstream use.
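
These are not the exact commands the workflow runs (the heavy lifting is done by the actions listed above), but the thumbnail step boils down to something like this, with file names and timings as placeholders:

# Grab a short segment of the rendered video and turn it into a small looping GIF.
ffmpeg -y -ss 5 -t 10 -i gource.mp4 \
  -vf "fps=10,scale=480:-1:flags=lanczos" \
  -loop 0 gource-thumbnail.gif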

☁ Uploading to AWS S3

In a second job:

  • AWS credentials are securely configured via aws-actions/configure-aws-credentials.
  • Files are uploaded using a commit-specific path.
  • Symlinks (gource-latest.mp4, gource-latest.gif) are updated to always point to the latest version.
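
S3 has no real symlinks; a common way to keep a stable "latest" URL is simply to copy the object to a fixed key, which is what this rough sketch does. The bucket name and paths are placeholders, and in the actual workflow the credentials come from aws-actions/configure-aws-credentials:

# Upload the video under a commit-specific path, then refresh the stable "latest" key.
aws s3 cp gource.mp4 "s3://my-bucket/gource/${GITHUB_SHA}/gource.mp4"
aws s3 cp "s3://my-bucket/gource/${GITHUB_SHA}/gource.mp4" s3://my-bucket/gource-latest.mp4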

📄 A Clean Summary for Humans

At the end, a GitHub Actions summary is generated, which includes:

  • A thumbnail preview
  • A direct link to the full video
  • Video file size
  • Commit metadata

This gives collaborators a quick overview, right in the Actions tab.


🔁 Why This Matters

Compared to the 2013 setup:

2013 Bash Script | 2025 GitHub Actions Workflow
Manual setup via shell | Fully automated in CI/CD
Local only | Cloud-native with AWS S3
Xvfb workaround required | Headless and clean execution
Script needs maintenance | Modular, reusable, and versioned
No summaries | Markdown summary with links and preview

Automation has come a long way — and this workflow is a testament to that progress.


✅ Final Thoughts

This Gource workflow is now a seamless part of my GitHub pipeline. It generates beautiful animations, hosts them reliably, and presents the results with minimal fuss. Whether triggered manually or automatically from a central workflow, it helps tell the story of a repository in a way that is both informative and visually engaging. 📊✹

Would you like help setting this up in your own project? Let me know — I am happy to share.

May 27, 2025

Four months ago, I tested 10 local vision LLMs and compared them against the top cloud models. Vision models can analyze images and describe their content, making them useful for alt-text generation.

The result? The local models missed important details or introduced hallucinations. So I switched to using cloud models, which produced better results but meant sacrificing privacy and offline capability.

Two weeks ago, Ollama released version 0.7.0 with improved support for vision models. They added support for three vision models I hadn't tested yet: Mistral 3.1, Qwen 2.5 VL and Gemma 3.

I decided to evaluate these models to see whether they've caught up to GPT-4 and Claude 3.5 in quality. Can local models now generate accurate and reliable alt-text?

Model | Provider | Release date | Model size
Gemma 3 (27B) | Google DeepMind | March 2025 | 27B
Qwen 2.5 VL (32B) | Alibaba | March 2025 | 32B
Mistral 3.1 (24B) | Mistral AI | March 2025 | 24B

Updating my alt-text script

For my earlier experiments, I created an open-source script that generates alt-text descriptions. The script is a Python wrapper around Simon Willison's llm tool, which provides a unified interface to LLMs. It supports models from Ollama, Hugging Face and various cloud providers.

To test the new models, I added 3 new entries to my script's models.yaml, which defines each model's prompt, temperature, and token settings. Once configured, generating alt-text is simple. Here is an example using the three new vision models:

$ ./caption.py test-images/image-1.jpg --model mistral-3.1-24b gemma3-27b qwen2.5vl-32b

Which outputs something like:

{
  "image": "test-images/image-1.jpg",
  "captions": {
    "mistral-3.1-24b": "A bustling intersection at night filled with pedestrians crossing in all directions.",
    "gemma3-27b": "A high-angle view shows a crowded Tokyo street filled with pedestrians and brightly lit advertising billboards at night.",
    "qwen2.5vl-32b": "A bustling city intersection at night, crowded with people crossing the street, surrounded by tall buildings with bright, colorful billboards and advertisements."
  }
}
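
For more than a handful of images, a small shell loop around the script does the trick. A minimal sketch that reuses the invocation above (the model name and directories are just examples):

# Caption every JPEG in test-images/ and keep one JSON file per image.
mkdir -p captions
for img in test-images/*.jpg; do
  ./caption.py "$img" --model qwen2.5vl-32b \
    > "captions/$(basename "$img" .jpg).json"
done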

Evaluating the models

To keep the results consistent, I used the same test images and the same evaluation method as in my earlier blog post. The detailed results are in this Google spreadsheet.

Each alt-text was scored from 0 to 5 based on three criteria: how well it identified the most important elements in the image, how effectively it captured the mood or atmosphere, and whether it avoided repetition, grammar issues or hallucinated details. I then converted each score into a letter grade from A to F.

For comparison, the cloud models received the following scores: GPT-4o earned an average of 4.8 out of 5 (grade A), and Claude 3.5 Sonnet received a perfect 5 out of 5 (grade A).

Test image 1: Shibuya Crossing, Tokyo

Aerial view of an intersection, capturing the evening commute with pedestrians, traffic and electronic billboards.
Model | Description | Grade
Mistral 3.1 (24B) | A bustling intersection at night filled with pedestrians crossing in all directions. | C
Gemma 3 (27B) | A high-angle view shows a crowded Tokyo street filled with pedestrians and brightly lit advertising billboards. | B
Qwen 2.5 VL (32B) | A bustling city intersection at night, crowded with people crossing the street, surrounded by brightly lit billboards and tall buildings. | B

None earned an A because they failed to identify this as Shibuya Crossing in Tokyo, while the cloud models did. Mistral also missed the billboards.

Test image 2: Isabella Stewart Gardner Museum, Boston

An empty picture frame in a room with ornate wallpaper, candles, old paintings, and antique furniture.
Model | Description | Grade
Mistral 3.1 (24B) | An ornate wall features a large empty picture frame. | B
Gemma 3 (27B) | An empty, ornate gold frame hangs on a patterned green wall between two framed portraits and a candle sconce. | A
Qwen 2.5 VL (32B) | A vintage-style room features ornate wallpaper, a framed empty canvas, a lit candelabra, and a decorative vase on a table, with portraits on either side. | A

The vision models in my previous post often mistook the empty frame for a framed painting. All three models in this test correctly identified it as empty. Gemma and Qwen captured valuable details about the scene, while Mistral's description felt sparse.

Test image 3: wakeboarding in Vermont, USA

Two men in swim shorts on the back of a boat watching another person wakeboarding behind the boat.
Model | Description | Grade
Mistral 3.1 (24B) | Two shirtless men on a boat watch another person water skiing on a lake. | B
Gemma 3 (27B) | Two people on a boat watch a waterskier speeding across the lake on a sunny day. | B
Qwen 2.5 VL (32B) | Two shirtless men on a boat watch a person water skiing in the distance on a calm lake. | B

All three described a wakeboarding scene as "water skiing", while the cloud models correctly identified it as wakeboarding.

Test image 4: hiking in the Dolomites, Italy

Santa Maddalena church in Funes
Model | Description | Grade
Mistral 3.1 (24B) | A wooden statue of a saint is mounted on a post with directional signs pointing to various locations. | C
Gemma 3 (27B) | A small wooden shrine with a statue of Mary stands beside a signpost indicating hiking trails in a grassy field. | B
Qwen 2.5 VL (32B) | A wooden shrine with a statue of a figure stands on a tree stump, surrounded by a scenic mountain landscape with directional signs in the foreground. | B

Only Gemma recognized the statue as Mary. Both Mistral and Gemma missed the mountains in the background, which seems important.

Test image 5: backgammon by candlelight

A backgammon board on a wooden table, accompanied by candles that cast a warm glow.
Model | Description | Grade
Mistral 3.1 (24B) | A lit candle and a glass of liquid are on a wooden table next to a wooden board game. | B
Gemma 3 (27B) | A lit candle and glass votive sit on a wooden table, creating a warm, inviting glow in a dimly lit space. | B
Qwen 2.5 VL (32B) | A cozy scene with a lit candle on a wooden table, next to a backgammon board and a glass of liquid, creating a warm and inviting atmosphere. | A

Neither Mistral nor Gemma recognized the backgammon board. Only Qwen identified it correctly. Mistral also failed to capture the photo's mood.

Model accuracy

Model | Repetitions | Hallucinations | Moods | Average score | Grade
Mistral 3.1 (24B) | Never | Never | Fair | 3.4/5 | C
Gemma 3 (27B) | Never | Never | Good | 4.2/5 | B
Qwen 2.5 VL (32B) | Never | Never | Good | 4.4/5 | B

Qwen 2.5 VL performed best overall, with Gemma 3 not far behind.

Needless to say, these results are based on a small set of test images. And while I used a structured scoring system, the evaluation still involves subjective judgment. This is not a definitive ranking, but it's enough to draw some conclusions.

It was nice to see that all three LLMs avoided repetition and hallucinations, and generally captured the mood of the images.

Local models still make mistakes. All three described wakeboarding as "water skiing", and most failed to recognize the statue as Mary or to place the intersection in Japan. Cloud models get these details right, as I showed in my previous blog post.

Conclusion

I ran my original experiment four months ago, and at the time, none of the models I tested felt accurate enough for large-scale alt-text generation. Some, like Llama 3, showed promise but still fell short in overall quality.

Newer models like Qwen 2.5 VL and Gemma 3 have matched the performance I saw earlier with Llama 3. Both performed well in my latest test. They produced relevant, grounded descriptions without hallucinations or repetition, which earlier local models often struggled with.

Still, the quality is not yet at the level where I would trust these models to generate thousands of alt-texts without human review. They make more mistakes than GPT-4 or Claude 3.5.

My main question was: are local models now good enough for practical use? While Qwen 2.5 VL performed best overall, it still needs human review. I've started using it for small batches where manual checking is manageable. For large-scale, fully automated use, I continue using cloud models as they remain the most reliable option.

That said, local vision-language models continue to improve. My long-term goal is to return to a 100% local-first workflow that gives me more control and keeps my data private. While we're not there yet, these results show real progress.

My plan is to wait for the next generation of local vision models (or upgrade my hardware to run larger models). When those become available, I'll test them and report back.

May 23, 2025

Reducing the digital clutter of chats

I hate modern chats. They presuppose we are always online, always available to chat. They force us to see and think about them each time we get our eyes on one of our devices. Unlike mailboxes, they are never empty. We can’t even easily search through old messages (unlike the chat providers themselves, which use the logs to learn more about us). Chats are the epitome of the business idiot: they make you always busy but prevent you from thinking and achieving anything.

It is quite astonishing to realise that modern chat systems use 100 or 1000 times more resources (in size and computing power) than 30 years ago, that they are less convenient (no custom client, no search) and that they work against us (centralisation, surveillance, ads). But, yay, custom emojis!

Do not get me wrong: chats are useful! When you need an immediate interaction or a quick on-the-go message, chats are the best.

I needed to keep being able to chat while keeping the digital clutter to a minimum and preserving my own sanity. That’s how I came up with the following rules.

Rule 1: One chat to rule them all

One of the biggest problems of centralised chats is that you must be on many of them. I decided to make Signal my main chat and to remove others.

Signal was, for me, a good compromise of respecting my privacy, being open source and without ads while still having enough traction that I could convince others to join it.

Yes, Signal is centralised and has drawbacks like relying on some Google layers (which I worked around by using Molly-FOSS). I simply do not see XMPP, Matrix or SimpleX becoming popular enough in the short term. Wire and Threema had no advantages over Signal. I could not morally justify using Whatsapp or Telegram.

In 2022, as I decided to use Signal as my main chat, I deleted all accounts but Signal and Whatsapp and disabled every notification from Whatsapp, forcing myself to open it once a week to see if I had missed something important. People who really wanted to reach me quickly understood that it was better to use Signal. This worked so well that I forgot to open Whatsapp for a whole month, which was enough for Whatsapp to decide that my account was not active anymore.

Not having Whatsapp is probably the best thing which happened to me regarding chats. Suddenly, I was out of tens or hundreds of group chats. Yes, I missed lots of stuff. But, most importantly, I stopped fearing missing them. Seriously, I never missed having Whatsapp. Not once. Thanks Meta for removing my account!

While travelling in Europe, it is now standard that taxi and hotels will chat with you using Whatsapp. Not anymore for me. Guess what? It works just fine. In fact, I suspect it works even better because people are forced to either do what we agreed during our call or to call me, which requires more energy and planning.

Rule 2: Mute, mute, mute!

Now that Signal is becoming more popular, some group chats are migrating to it. But I’ve learned the lesson : I’m muting them. This allows me to only see the messages when I really want to look at them. Don’t hesitate to mute vocal group chats and people with whom you don’t need day-to-day interaction.

I’m also leaving group chats which are not essential. Whatsapp deletion told me that nearly no group chat is truly essential.

Many times, I’ve had people sending me emails about what was said in a group chat because they knew I was not there. Had I been in that group, I would probably have missed the messages but nobody would have cared. If you really want to get in touch with me, send me an email!

Rule 3: No read receipts nor typing indicators

I was busy, walking in the street with my phone in hand for directions. A notification popped up with an important message. It was important but not urgent. I could not deal with the message at that moment. I wanted to take the time. One part of my brain told me not to open the message because, if I did, the sender would see a "read receipt". He would see that I had read the message but would not receive any answer.

For him, that would probably translate to "he doesn’t care". I consciously avoided opening Signal until I was back home and could deal with the message.

That’s when I realised how invasive the "read receipt" was. I disabled it and never regretted that move. I’m reading messages on my own schedule and replying when I want to. Nobody needs to know if I’ve seen the message. It is wrong in every aspect.

Signal preferences showing read receipts and typing indicator disabled

Rule 4: Temporary discussions only

The artist Bruno Leyval, who did the awesome cover of my novel Bikepunk, is obsessed with deletion and disappearance. He set our Signal chat so that every message is deleted after a day. At first, I didn’t see the point.

Until I understood that this was not only about privacy, it also was about decluttering our mind, our memories.

Since then, I’ve set every chat in Signal to delete messages after one week.

Signal preferences showing disappearing messages set to one week

This might seem like nothing but this changes everything. Suddenly, chats are not a long history of clutter. Suddenly, you see chats as transient and save things you want to keep. Remember that you can’t search in chats? This means that chats are transient anyway. With most chats, your history is not saved and could be lost by simply dropping your phone on the floor. Something important should be kept in a chat? Save it! But it should probably have been an email.

Embracing the transient nature of chat and making it explicit greatly reduces the clutter.

Conclusion

I know that most of you will say that "That’s nice Ploum but I can’t do that because everybody is on XXX" where XXX is most often Whatsapp in my own circles. But this is wrong: you believe everybody is on XXX because you are yourself using XXX as your main chat. When surveying my students this year, I’ve discovered that nearly half of them were not on Whatsapp. Not for some hard reason but because they never saw the need for it. In fact, they were all spread over Messenger, Instagram, Snap, Whatsapp, Telegram, Discord. And they all believed that "everybody is where I am".

In the end, the only real choice to make is between being able to get immediately in touch with a lot of people or having room for your mental space. I choose the latter, you might prefer the former. That’s fine!

I still don’t like chat. I’m well aware that the centralised nature of Signal makes it a short-term solution. But I’m not looking for the best sustainable chat. I just want fewer chats in my life.

If you want to get in touch, send me an email!

I’m Ploum, a writer and an engineer. I like to explore how technology impacts society. You can subscribe by email or by rss. I value privacy and never share your address.

I write science-fiction novels in French. For Bikepunk, my new post-apocalyptic-cyclist book, my publisher is looking for contacts in other countries to distribute it in languages other than French. If you can help, contact me!

May 21, 2025

During the last MySQL & HeatWave Summit, Wim Coekaerts announced that a new optimizer is available and is already enabled in MySQL HeatWave. Let’s have a quick look at it and how to use it. The first step is to verify that Hypergraph is available: The statement won’t return any error if the Hypergraph Optimizer [
]

This spring was filled with music, learning, and connection. I had the opportunity to participate in three wonderful music courses, each offering something unique—new styles, deeper technique, and a strong sense of community. Here is a look back at these inspiring experiences.


đŸŽ¶ 1. Fiddlers on the Move – Ghent (5–9 March)

Photo: Filip Verpoest

In early March, I joined Fiddlers on the Move in Ghent, a five-day course packed with workshops led by musicians from all over the world. Although I play the nyckelharpa, I deliberately chose workshops that were not nyckelharpa-specific. This gave me the challenge and joy of translating techniques from other string traditions to my instrument.

Here is a glimpse of the week:

  • Wednesday: Fiddle singing with Laura Cortese – singing while playing was new for me, and surprisingly fun.
  • Thursday: Klezmer violin / Fiddlers down the roof with Amit Weisberger – beautiful melodies and ornamentation with plenty of character.
  • Friday: Arabic music with Layth Sidiq – an introduction to maqams and rhythmic patterns that stretched my ears in the best way.
  • Saturday: Swedish violin jamsession classics with Mia Marine – a familiar style, but always a joy with Mia’s energy and musicality.
  • Sunday: Live looping strings with Joris Vanvinckenroye – playful creativity with loops, layering, and rhythm.

Each day brought something different, and I came home with a head full of ideas and melodies to explore further.


đŸȘ— 2. Workshopweekend Stichting Draailier & Doedelzak – Sint-Michielsgestel, NL (18–21 April)

Photo: Arne de Laat

In mid-April, I traveled to Sint-Michielsgestel in the Netherlands for the annual Workshopweekend organized by Stichting Draailier & Doedelzak. This year marked the foundation’s 40th anniversary, and the event was extended to four days, from Friday evening to Monday afternoon, at the beautiful location of De Zonnewende.

I joined the nyckelharpa workshop with Rasmus Brinck. One of the central themes we explored was the connection between playing and dancing polska—a topic close to my heart. I consider myself a dancer first and a musician second, so it was especially meaningful to deepen the musical understanding of how movement and melody shape one another.

The weekend offered a rich variety of other workshops as well, including hurdy-gurdy, bagpipes, diatonic accordion, singing, and ensemble playing. As always, the atmosphere was warm and welcoming. With structured workshops during the day and informal jam sessions, concerts, and bals in the evenings, it was a perfect blend of learning and celebration.


🇾đŸ‡Ș 3. Swedish Music for Strings – Ronse (2–4 May)

At the beginning of May, I took part in a three-day course in Ronse dedicated to Swedish string music. Although we could arrive on 1 May, teaching started the next day. The course was led by David Eriksson and organized by Amate Galli. About 20 musicians participated—two violinists, one cellist, and the rest of us on nyckelharpa.

The focus was on capturing the subtle groove and phrasing that make Swedish folk music so distinctive. It was a joy to be surrounded by such a rich soundscape and to play in harmony with others who share the same passion. The music stayed with me long after the course ended.


✹ Final Thoughts

Each of these courses gave me something different: new musical perspectives, renewed technical focus, and most importantly, the joy of making music with others. I am deeply grateful to all the teachers, organizers, and fellow participants who made these experiences so rewarding. I am already looking forward to the next musical adventure!

May 19, 2025

MySQL Enterprise Monitor, aka MEM, retired in January 2025, after almost 20 years of exemplary service! What’s next? Of course, plenty of alternatives exist, open source, proprietary, and on the cloud. For MySQL customers, we provide two alternatives: This post focuses on the latter, as there is no apparent reason to deploy an Oracle Database [
]

We landed late in Portland, slept off the jet lag, and had breakfast at a small coffee shop. The owner of the coffee shop challenged me to a game of chess. Any other day, I would have accepted. But after a year of waiting, not even Magnus Carlsen himself could delay us from meeting our van.

After more than a year of planning, designing, shipping gear, and running through every what-if, we were finally in Portland to pick up our new van. It felt surreal to be handed the keys after all that waiting.

The plan: a one-week "shakedown trip", testing our new van while exploring Oregon. We'd stay close to the builder for backup, yet sleep by the ocean, explore vineyards and hike desert trails.

Day 1–2: ocean air in Pacific City

After picking up the van and completing our onboarding, we drove just two hours west to Pacific City on the Oregon coast. I was nervous at first since I'd never driven anything this big, but it felt manageable within minutes.

For the first two nights, we stayed at Hart's Camp, also known as Cape Kiwanda RV Park. The campground had full hookups, a small grocery store, and a brewery just across the road. While the RV park felt more like a parking lot than a forest, it was exactly what we needed. Something easy and practical. A soft landing.

Two people carrying surfboards walk toward the beach with a large rock formation in the water. Camping in Pacific City, a small surfer town on the Oregon coast. The main landmark is Haystack Rock, also known as Chief Kiawanda. It looks like a gorilla's head with a rat tail. Once you see it, you can't unsee it.

That night, we cracked the windows and slept with the ocean air drifting in. People always ask how someone six foot four (or almost 2 meters) manages to sleep in a van. The truth? I slept better than I do in most hotel beds.

We spent most of the next day settling in. We organized, unpacked the gear we had shipped ahead, figured out what all the switches did, cleaned the dishes, and debated what to put in which cabinets.

Day 3–4: Wine tasting in Willamette Valley

From the coast, we headed inland into Oregon wine country. The drive to Champoeg State Park took about an hour and a half, winding through farmland and small towns.

Champoeg is a beautifully laid-out campground, quiet and easy to navigate, with wide sites and excellent showers. The surrounding area is a patchwork of vineyards, barns and backroads.

We visited several vineyards in the Willamette Valley. The Four Graces was our favorite. Their 2021 Windborn Pinot Noir stood out, and the inn next door looked tempting for a future non-camping trip. We also visited Ken Wright, Lemelson, and Dominio IV. We ended up buying 15 bottles of wine. One of the perks of van life: your wine cellar travels with you.

That evening, Vanessa cooked salmon in red curry, outside in the rain, over a wood fire. We ate inside the van, warm and dry, with music playing. We opened a 2023 Still Life Viognier from Dominio IV that we had liked so much during the tasting. There was something magical about being cozy in a small space while rain drums overhead.

Person stretching and smiling next to a laundry cart in front of dryers at a laundromat. Laundry day in a small town in Oregon. We're camping in a van and stopped to wash some of our clothes. Life on the road!

The next morning, we drove to McMinnville to do laundry. Laundry isn't exactly a bucket-list item for me, but Vanessa found it oddly satisfying. There was a coffee kiosk nearby, so we walked through the drive-thru on foot, standing between cars to order lattes while our clothes spun.

Day 5: Smith Rock and steep trails

Person sitting in the doorway of a parked camper van at a quiet, remote campground. Arrived at Skull Hollow Campground. No power or water, but it's quiet and the sky is full of stars. It feels like we're exactly where we need to be.

We left wine country behind and headed into Central Oregon, crossing over the mountains on a three-hour drive that was scenic the entire way. Pine forests, wide rivers and lakes came and went as we climbed and descended. We passed through small towns and stretches of open road that felt far from anywhere.

I kept wanting to stop to take photos, but didn't want to interrupt the rhythm of the road. Instead, we made mental notes about places to come back to someday.

That night we stayed at Skull Hollow Campground, a basic campground without hookups. Our first real test of off-grid capability. This was deliberate. We wanted to know how our solar panels and batteries would handle a full night of heating, and whether our water supply would last without refilling.

It was cold, down to 37°F, but the van handled it well. We kept the heat at 60°F and slept soundly under a heavy duvet. When we woke up, the solar panels were already soaking up sunlight.

Tall rock cliffs surround a winding river and hiking trail at Smith Rock State Park in Oregon. We hiked the Misery Ridge and River Trail in Smith Rock State Park. The steep switchbacks and rocky terrain made it a tough climb, but the panoramic views were worth it.

Smith Rock State Park was just fifteen minutes away. Vanessa wanted to drive the van somewhere more rural, and this was the perfect chance. She handled the van like a boss.

Day 6: Bend and our first HipCamp

We continued to Bend, just forty minutes from Skull Hollow. Bend is a small city in Central Oregon known for its outdoor lifestyle. We resupplied, filled our water tank and stopped by REI (think "camping supermarket").

We had dinner at Wild Rose, a Northern Thai restaurant that had been nominated for a James Beard award. The food was excellent. The service was not.

That night we stayed at our first HipCamp, a campsite on a working ranch with cattle just outside of Bend. A lone bull stood at the entrance, watching us with mild interest. We followed the long gravel driveway past grazing cows, getting our first real taste of ranch life.

Old rusted Ford truck parked on dirt, with faded paint and wooden flatbed in the back. Our van's neighbor for the night was a beautifully rusted Ford. Retired but still stealing the spotlight. Rusty steering wheel and exposed seat springs inside an old, abandoned truck with cracked windows. I can't help but wonder what stories this truck could tell.

The setup was simple: a grassy parking area, a 30-amp electrical outlet, a metal trash bin and a water pump with well water. We shared the space with an old Ford that had clearly been there longer than we planned to be. A pair of baby owls watched us from a tree. It was peaceful, with wide views and almost total quiet. Nothing fancy, but memorable.

Day 7: Camp Sherman and quiet rivers

A black camper van is parked in a forested campground surrounded by tall pine trees. Parked at Camp Sherman Campground, next to the Metolius River in Oregon, USA.

On our final full day, we drove about an hour and twenty minutes to Camp Sherman Campground, nestled along the Metolius River. The Metolius is a spring-fed river known for its crystal-clear waters and world-class fly fishing.

The campground is "first-come, first-served", and we were lucky to find a site right by the water. This was our first time using an "Iron Ranger", the self-pay envelope system used in many public campgrounds. A refreshing throwback to simpler times.

We hiked a trail along the river, upstream through forests that had clearly burned in recent years. Signs along the path explained it was a prescribed burn area, which gave the charred trunks and new growth a different kind of meaning.

We watched a family of deer move through the trees at dusk very close to our van. They seemed as curious about us as we were about them.

Later that evening, we cooked, read, and talked. We sat by the fire, wrapped in the camping blankets we had picked up at REI. It was quiet in the way we needed it to be.

The next morning, it was time to move on. I had to get to Chicago for a Drupal Camp, even though neither of us felt ready to leave.

530 miles, one van and zero regrets

This loop, starting in Portland, heading to the coast, through wine country, over the mountains, and back again, turned out to be the perfect test run. Each stop offered something different, from ocean breezes to vineyard views to rugged high desert hikes. The drives were short enough to stay relaxed, ranging from 40 minutes to three hours. In total, we covered about 530 miles.

By the end of the trip, we had gone from full hookups to fully self-sufficient, using solar power and our onboard water supply. The van passed every test we threw at it. Now we knew we could take it anywhere.

May 17, 2025

It's been a while since I wrote about one of my favorite songs, but Counting Crows' "Round Here" is one that has always stuck with me.

I've listened to this song hundreds of times, and this non-standard version, where Adam Duritz stretches the lyrics and lets his emotions flow, hits even harder.

To me, it feels like a quiet cry about mental health. About someone feeling disconnected, uncertain of who they are, and not at home in their own life. There is something raw, honest, and deeply human in the way the song captures that struggle.

The song has only grown on me over time. I didn't fully understand or appreciate it in my twenties, but now that I'm in my forties, I've come to see more people around me carrying quiet struggles. If that is you, I hope you're taking care of yourself.

May 16, 2025

Petit manifeste low-tech

Ce samedi 17 mai, je pédalerai vers Massy en compagnie de Tristan Nitot pour parler "low-tech" et dédicacer Bikepunk lors du festival Parlons Vélo.

Attention, ce qui va suivre divulgĂąche une partie de ce que je dirai samedi midi Ă  Massy. Si vous venez, arrĂȘtez de lire ici, on se retrouve demain !

Qu’est-ce que la low-tech ?

Le terme low-tech nous fait intuitivement sentir une opposition contre l’excĂšs technologique (le "high tech") tout en Ă©vitant l’extrĂ©misme technophobique. Un terme qui enthousiasme, mais qu’il me semble important d’expliciter et dont je propose la dĂ©finition suivante.

Une technologie est dite « low-tech » si les personnes interagissant avec cette technologie savent et peuvent en comprendre son fonctionnement.

Savoir comprendre. Pouvoir comprendre. Deux éléments essentiels (et difficiles à distinguer pour le Belge que je suis).

Savoir comprendre

Savoir comprendre une technologie implique d’avoir la possibilitĂ© de construire un modĂšle intellectuel de son fonctionnement interne.

Il est bien Ă©vident que tout le monde n’a pas la capacitĂ© de comprendre toutes les technologies. Mais il est possible de procĂ©der par niveau. La majoritĂ© des automobilistes sait qu’une voiture Ă  essence brĂ»le le carburant qui explose dans un moteur, explosion qui entraine des pistons qui font tourner les roues. Le nom est un indice en soi : un moteur Ă  explosion !

Si je n’en comprends pas plus sur le fonctionnement d’un moteur, j’ai la certitude qu’il existe des personnes qui comprennent mieux, souvent dans mon entourage direct. Au plus la comprĂ©hension est fine, au plus les personnes deviennent rares, mais chacun peut tenter de s’amĂ©liorer.

La technologie est simple sans ĂȘtre simpliste. Cela signifie que sa complexitĂ© peut ĂȘtre apprĂ©hendĂ©e graduellement. Et qu’il existe des experts qui apprĂ©hendent une technologie particuliĂšre dans sa globalitĂ©.

Par opposition, il est aujourd’hui humainement impossible de comprendre un smartphone moderne. Seuls quelques expert·e·s dans le monde maitrisent chacun·e un point particulier de l’objet : du dessin de l’antenne 5G au logiciel retouchant automatiquement les photos en passant par le chargement rapide de la batterie. Et aucun d’entre eux ne maitrise la conception d’un compilateur nĂ©cessaire Ă  faire tourner le tout. MĂȘme un gĂ©nie passant sa vie Ă  dĂ©monter des smartphones serait dans l’incapacitĂ© totale de comprendre ce qui se passe Ă  l’intĂ©rieur d’un engin que nous avons tous en permanence soit dans une poche, soit devant notre nez !

L’immense majoritĂ© des utilisateurs de smartphones n’ont pas le moindre modĂšle mental de son fonctionnement. Je ne parle pas d’un modĂšle erronĂ© ou simpliste : non, il n’y en a pas du tout. L’objet est « magique ». Pourquoi affiche-t-il quelque chose plutĂŽt qu’un autre ? Parce que c’est « magique ». Et comme pour la magie, il ne faut pas chercher Ă  comprendre.

La low-tech peut ĂȘtre extrĂȘmement complexe, mais l’existence mĂȘme de cette complexitĂ© doit ĂȘtre comprĂ©hensible et justifiĂ©e. Une complexitĂ© transparente encourage naturellement les esprits curieux Ă  se poser des questions.

Le temps de comprendre

Comprendre une technologie prend du temps. Cela implique une relation longue, une expĂ©rience qui se crĂ©e tout au long d’une vie, qui se partage, qui se transmet.

Par opposition, la high-tech impose un renouvellement, une mise Ă  jour constante, des changements d’interface et de fonctionnalitĂ© permanents qui renforcent l’aspect « magique » et entraine le dĂ©couragement de celleux qui tentent de se construire un modĂšle mental.

La low-tech doit donc nĂ©cessairement ĂȘtre durable. PĂ©renne. Elle doit s’enseigner et permettre une construction progressive de cet enseignement.

Cela implique parfois des efforts, des difficultĂ©s. Tout ne peut pas toujours ĂȘtre progressif : Ă  un moment, il faut se lancer sur son vĂ©lo pour apprendre Ă  garder l’équilibre.

Pouvoir comprendre

Historiquement, il semble Ă©vident que toute technologie a la possibilitĂ© d’ĂȘtre comprise. Les personnes interagissant avec la technologie Ă©taient forcĂ©es de rĂ©parer, d’adapter et donc de comprendre. Une technologie Ă©tait essentiellement matĂ©rielle, ce qui implique qu’elle pouvait ĂȘtre dĂ©montĂ©e.

Avec le logiciel apparait un nouveau concept : celui de cacher le fonctionnement. Et si, historiquement, tout logiciel est open source, l’invention du logiciel propriĂ©taire rend difficile, voire impossible, de comprendre une technologie.

Le logiciel propriĂ©taire n’a pu ĂȘtre inventĂ© que grĂące Ă  la crĂ©ation d’un concept rĂ©cent, au demeurant absurde, appelĂ© « propriĂ©tĂ© intellectuelle ».

Cette propriĂ©tĂ© intellectuelle ayant permis la privatisation de la connaissance dans le logiciel, elle est ensuite Ă©tendue au monde matĂ©riel. Soudainement, il devient possible d’interdire Ă  une personne de tenter de comprendre la technologie qu’elle utilise au quotidien. GrĂące Ă  la propriĂ©tĂ© intellectuelle, des fermiers se voient soudain interdits d’ouvrir le capot de leur propre tracteur.

La low-tech doit ĂȘtre ouverte. Elle doit pouvoir ĂȘtre rĂ©parĂ©e, modifiĂ©e, amĂ©liorĂ©e et partagĂ©e.

De l’utilisateur au consommateur

GrĂące Ă  la complexification, aux changements incessants et Ă  l’imposition d’un rĂ©gime strict de « propriĂ©tĂ© intellectuelle », les utilisateurs ont Ă©tĂ© transformĂ©s en consommateurs.

Ce n’est pas un hasard. Ce n’est pas une Ă©volution inĂ©luctable de la nature. Il s’agit d’un choix conscient. Toutes les Ă©coles de commerce enseignent aux futurs entrepreneurs Ă  se construire un marchĂ© captif, Ă  priver autant que possible leur client de libertĂ©, Ă  construire ce qu’on appelle dans le jargon une "moat" (douve qui protĂšge un chĂąteau) afin d’augmenter la « rĂ©tention des utilisateurs ».

Les termes eux-mĂȘmes deviennent flous pour renforcer ce sentiment de magie. On ne parle par exemple plus de transfĂ©rer un fichier .jpg vers un ordinateur distant, mais de « sauvegarder ses souvenirs dans le cloud ».

Les marketeux nous ont fait croire qu’en supprimant les mots compliquĂ©s, ils simplifieraient la technologie. C’est Ă©videmment le contraire. L’apparence de simplicitĂ© est une complexitĂ© supplĂ©mentaire qui emprisonne l’utilisateur. Toute technologie nĂ©cessite un apprentissage. Cet apprentissage doit ĂȘtre encouragĂ©.

Pour une approche et une éthique low-tech

L’éthique low-tech consiste Ă  se remettre au service de l’utilisateur en lui facilitant la comprĂ©hension de ses outils.

La high-tech n’est pas de la magie, c’est de la prestidigitation. PlutĂŽt que de cacher les « trucs » sous des artifices, la low-tech cherche Ă  montrer et Ă  crĂ©er une utilisation en conscience de la technologie.

Cela n’implique pas nĂ©cessairement une simplification Ă  outrance.

Prenons l’exemple d’une machine Ă  laver le linge. Nous comprenons tous qu’une machine de base est un tambour qui tourne dans lequel est injectĂ© de l’eau et du savon. C’est trĂšs simple et low-tech.

On pourrait arguer que l’ajout de capteurs et de contrĂŽleurs Ă©lectroniques permet de laver le linge plus efficacement et plus Ă©cologiquement en le pesant et adaptant la vitesse de rotation en fonction du type de linge.

Dans une optique low-tech, un boitier Ă©lectronique est ajoutĂ© Ă  la machine pour faire exactement cela. Si le boitier est retirĂ© ou tombe en panne, la machine continue Ă  fonctionner simplement. L’utilisateur peut choisir de dĂ©brancher le boitier ou de le remplacer. Il en comprend l’utilitĂ© et la justification. Il construit un modĂšle mental dans lequel le boitier ne fait qu’appuyer sur les boutons de rĂ©glage au bon moment. Et, surtout, il ne doit pas envoyer toute la machine Ă  la casse parce que la puce wifi ne fonctionne plus et n’est plus mis Ă  jour ce qui a bloquĂ© le firmware (quoi ? Ma machine Ă  laver dispose d’une puce wifi ?).

Pour une communauté low-tech

Une technologie low-tech encourage et donne l’occasion Ă  l’utilisateur Ă  la comprendre, Ă  se l’approprier. Elle tente de rester stable dans le temps, se standardise. Elle ne cherche pas Ă  cacher la complexitĂ© intrinsĂšque partant du principe que la simplicitĂ© provient de la transparence.

Cette comprĂ©hension, cette appropriation ne peut se faire que dans l’interaction. Une technologie low-tech va donc, par essence, favoriser la crĂ©ation de communautĂ©s et les Ă©changes humains autour de cette mĂȘme technologie.

Pour contribuer Ă  l’humanitĂ© et aux communautĂ©s, une technologie low-tech se doit d’appartenir Ă  tou·te·s, de faire partie des communs.

J’en arrive donc Ă  cette dĂ©finition, complĂ©mentaire et Ă©quivalente Ă  la premiĂšre :

Une technologie est dite « low-tech » si elle expose sa complexitĂ© de maniĂšre simple, ouverte, transparente et durable tout en appartenant aux communs.

Je suis Ploum et je viens de publier Bikepunk, une fable Ă©colo-cycliste entiĂšrement tapĂ©e sur une machine Ă  Ă©crire mĂ©canique. Pour me soutenir, achetez mes livres (si possible chez votre libraire) !

Recevez directement par mail mes écrits en français et en anglais. Votre adresse ne sera jamais partagée. Vous pouvez également utiliser mon flux RSS francophone ou le flux RSS complet.

May 15, 2025

MySQL provides the MySQL Community Edition, the Open-Source version. In addition, there is the Enterprise Edition for our Commercial customers and MySQL HeatWave, our managed database service (DBaaS) on the cloud (OCI, AWS, etc.). But did you know that developers can freely use MySQL Enterprise for non-commercial use? The full range of MySQL Enterprise Edition features [
]

May 14, 2025

Maintaining documentation for Ansible roles can be a tedious and easily neglected task. As roles evolve, variable names change, and new tasks are added, it is easy for the README.md files to fall out of sync. To prevent this and keep documentation continuously up to date, I wrote a GitHub Actions workflow that automatically generates and formats documentation for all Ansible roles in my repository. Even better: it writes its own commit messages using AI.

Let me walk you through why I created this workflow, how it works, and what problems it solves.


đŸ€” Why Automate Role Documentation?

Ansible roles are modular, reusable components. Good roles include well-structured documentation—at the very least, variable descriptions, usage examples, and explanations of defaults.

However, writing documentation manually introduces several problems:

  • Inconsistency: Humans forget things. Updates to a role do not always get mirrored in its documentation.
  • Wasted time: Writing boilerplate documentation by hand is inefficient.
  • Error-prone: Manually copying variable names and descriptions invites typos and outdated content.

Enter ansible-doctor: a tool that analyzes roles and generates structured documentation automatically. Once I had that, it made perfect sense to automate its execution using GitHub Actions.
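
If you want to try the tool outside CI first, it is a regular Python package. A minimal local run could look like this (a sketch; adjust the roles path to your repository layout):

# Install ansible-doctor into the current Python environment
pip install ansible-doctor

# Generate a README.md for every role found under ./roles
ansible-doctor --recursive roles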


⚙ How the Workflow Works

Here is the high-level overview of what the workflow does:

  1. Triggers:
    • It can be run manually via workflow_dispatch.
    • It is also designed to be reusable in other workflows via workflow_call.
  2. Concurrency and Permissions:
    • Uses concurrency to ensure that only one documentation run per branch is active at a time.
    • Grants minimal permissions needed to write to the repository and generate OIDC tokens.
  3. Steps:
    • ✅ Check out the code.
    • 🐍 Set up Python and install ansible-doctor.
    • 📄 Generate documentation with ansible-doctor --recursive roles.
    • đŸ§Œ Format the resulting Markdown using Prettier to ensure consistency.
    • đŸ€– Configure Git with a bot identity.
    • 🔍 Detect whether any .md files changed.
    • 🧠 Generate a commit message using AI, powered by OpenRouter.ai and a small open-source model (mistralai/devstral-small:free).
    • đŸ’Ÿ Commit and push the changes if there are any.

🧠 AI-Generated Commit Messages

Why use AI for commit messages?

  • I want my commits to be meaningful, concise, and nicely formatted.
  • The AI model is given a diff of the staged Markdown changes (up to 3000 characters) and asked to:
    • Keep it under 72 characters.
    • Start with an emoji.
    • Summarize the nature of the documentation update.

This is a small but elegant example of how LLMs can reduce repetitive work and make commits cleaner and more expressive.

Fallbacks are in place: if the AI fails to generate a message, the workflow defaults to a generic 📝 Update Ansible role documentation.
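
For the curious, the AI step essentially boils down to a single API call. The snippet below is a hedged sketch of such a step, not the exact script from my workflow; it assumes curl and jq are available on the runner, that OPENROUTER_API_KEY is exported, and it uses OpenRouter's OpenAI-compatible chat completions endpoint:

# Take at most 3000 characters of the staged Markdown diff
DIFF=$(git diff --cached -- '*.md' | head -c 3000)

# Ask the model for a short, emoji-prefixed commit message
MESSAGE=$(curl -sS https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer ${OPENROUTER_API_KEY}" \
  -H "Content-Type: application/json" \
  -d "$(jq -n --arg diff "$DIFF" '{
        model: "mistralai/devstral-small:free",
        messages: [{
          role: "user",
          content: ("Write a git commit message under 72 characters, starting with an emoji, summarizing this documentation diff:\n\n" + $diff)
        }]
      }')" | jq -r '.choices[0].message.content // empty')

# Fall back to a generic message if the API returned nothing useful
[ -z "$MESSAGE" ] && MESSAGE="📝 Update Ansible role documentation"
git commit -m "$MESSAGE"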


🌍 A Universal Pattern for Automated Docs

Although this workflow is focused on Ansible, the underlying pattern is not specific to Ansible at all. You can apply the same approach to any programming language or ecosystem that supports documentation generation based on inline annotations, comments, or code structure.

The general steps are:

  1. Write documentation annotations in your code (e.g. JSDoc, Doxygen, Python docstrings, Rust doc comments, etc.).
  2. Run a documentation generator, such as ansible-doctor, Sphinx, JSDoc, Doxygen, or rustdoc.
  3. Generate a commit message from the diff using an LLM.
  4. Commit and push the updated documentation.

This automation pattern works best in projects where:

  • Documentation is stored in version control.
  • Changes to documentation should be traceable.
  • Developers want to reduce the overhead of writing and committing docs manually.

🔐 A Note on OpenRouter API Keys

The AI step relies on OpenRouter.ai to provide access to language models. To keep your API key secure, it is passed via secrets.OPENROUTER_API_KEY, which is required when calling this workflow. I recommend generating a dedicated, rate-limited key for GitHub Actions use.
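
In a calling workflow, passing that secret looks roughly like this (a sketch; the path to the reusable workflow file is illustrative):

jobs:
  docs:
    uses: ./.github/workflows/docs.yml
    secrets:
      OPENROUTER_API_KEY: ${{ secrets.OPENROUTER_API_KEY }}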


đŸ§Ș Try It Yourself

If you are working with Ansible roles—or any codebase with structured documentation—and want to keep your docs fresh and AI-assisted, this workflow might be useful for you too. Feel free to copy and adapt it for your own projects. You can find the full source in my GitHub repository.

Let the robots do the boring work, so you can focus on writing better code.


💬 Feedback?

If you have ideas to improve this workflow or want to share your own automation tricks, feel free to leave a comment or reach out on Mastodon: @amedee@lou.lt.

Happy automating!

Comment l’universitĂ© tue le livre (et les intellectuels)

Il faut sauver la bibliothĂšque de Louvain-la-Neuve

MenacĂ©e d’expulsion par l’universitĂ©, la bibliothĂšque publique de Louvain-la-Neuve risque de disparaĂźtre. Il est urgent de signer la pĂ©tition pour tenter de la sauver.

Mais ce n’est pas un Ă©vĂ©nement isolĂ©, ce n’est pas un accident. Il ne s’agit que d’une escarmouche dans la longue guerre que la ville, l’universitĂ© et la sociĂ©tĂ© de consommation mĂšnent contre les livres et, Ă  travers eux, contre l’intellectualisme.

Le livre, outil indispensable de l’intellectuel

L’une des tĂąches que je demande chaque annĂ©e Ă  mes Ă©tudiants avant l’examen est de lire un livre. Si possible de fiction ou un essai, mais un livre non technique.

Au choix.

Bien sĂ»r, je donne des idĂ©es en rapport avec mon cours. Notamment « Little Brother » de Cory Doctorow qui est facile Ă  lire, prenant, et tout Ă  fait dans le sujet. Mais les Ă©tudiants sont libres.

Chaque annĂ©e, plusieurs Ă©tudiants me glissent lors de l’examen qu’ils n’avaient plus lu un livre depuis l’école secondaire, mais que, en fait, c’était vraiment chouette et que ça fait vraiment rĂ©flĂ©chir. Que sans moi, ils auraient fait toutes leurs Ă©tudes d’ingĂ©nieur sans lire un seul livre autre que des manuels.

Les livres, qui forcent une lecture sur un temps long, qui forcent une immersion, sont l’outil indispensable de l’intellectuel et de l’humaniste. Il est impossible de rĂ©flĂ©chir sans livre. Il est impossible de prendre du recul, de faire de nouveaux liens et d’innover sans ĂȘtre baignĂ© dans la diversitĂ© d’époques, de lieux et d’expĂ©riences humaines que sont les livres. On peut surnager pendant des annĂ©es dans un domaine voire devenir compĂ©tent sans lire. Mais la comprĂ©hension profonde, l’expertise nĂ©cessite des livres.

Ceux qui ne lisent pas de livres sont condamnĂ©s Ă  se satisfaire de superficialitĂ©, Ă  se laisser manipuler, Ă  obĂ©ir aveuglĂ©ment. Et c’est peut-ĂȘtre ça l’objectif.

J’estime que l’universitĂ© ne doit pas former de bons petits consultants obĂ©issants et employables, mais des intellectuels humanistes. La mission premiĂšre de l’universitĂ© passe par la diffusion, la promotion, l’appropriation de la culture intellectuelle du livre.

Entre l’humanisme et le profit immobilier, l’universitĂ© a choisi

Mais, Ă  Louvain-la-Neuve, l’universitĂ© semble se transformer en simple agence immobiliĂšre. La ville qui, en 50 ans, s’est créée autour de l’universitĂ© est en train de se transformer pour n’offrir graduellement plus que deux choses : de la bouffe et des fringues.

En 2021, le bouquiniste de la place des Wallons, prĂ©sent depuis 40 ans grĂące Ă  un bail historique, a vu son propriĂ©taire, l’universitĂ©, lui infliger une augmentation de loyer vertigineuse. Je l’ai vu, les yeux pleins de larmes, mettant en caisse les milliers de bandes dessinĂ©es de son stock afin de laisser la place à
 un vendeur de gaufres !

Ce fut ensuite le tour du second bouquiniste de la ville, une minuscule Ă©choppe aux murs noircis de livres de philosophie oĂč nous nous retrouvions rĂ©guliĂšrement entre habituĂ©s pour nous disputer quelques piĂšces rares. Le couple qui tenait la bouquinerie m’a confiĂ© que, devant le prix du loyer, Ă©galement versĂ© Ă  l’universitĂ©, il Ă©tait plus rentable pour eux de devenir bouquinistes itinĂ©rants. « Ça ne va pas vous plaire ! » m’a confiĂ© la gĂ©rante lorsque j’ai demandĂ© qui reprendrait son espace. Quelques semaines plus tard, en effet, surgissait une vitrine vendant des sacs Ă  mains !

Quant Ă  la librairie principale de la ville, l’historique librairie Agora, elle fut rachetĂ©e par le groupe Furet du Nord dont la section belge a fait faillite. Il faut dire que la librairie occupait un Ă©norme espace appartenant en partie au promoteur immobilier KlĂ©pierre et Ă  l’universitĂ©. D’aprĂšs mes sources, le loyer mensuel s’élevait à
 35.000€ !

De cette faillite, j’ai rĂ©cupĂ©rĂ© plusieurs meubles bibliothĂšques qui Ă©taient Ă  donner. L’ouvrier qui Ă©tait en train de nettoyer le magasin me souffla, avec un air goguenard, que les Ă©tudiants allaient ĂȘtre contents du changement ! Il n’avait pas le droit de me dire ce qui remplacerait la librairie, mais, promis, ils allaient ĂȘtre contents.

En effet, il s’agissait d’un projet de
 Luna Park ! (qui, bien que terminĂ©, n’a pas obtenu l’autorisation d’ouvrir ses portes suite aux craintes des riverains concernant le tapage qu’un tel lieu engendre)

Mais l’universitĂ© ne comptait pas en rester lĂ . DĂ©sireuse de rĂ©cupĂ©rer des locaux pourtant sans aucun potentiel commercial, elle a Ă©galement mis dehors le centre de livres d’occasion Cerfaux Lefort. Une pĂ©tition pour tenter de le sauver a rĂ©coltĂ© 3000 signatures. Sans succĂšs.

Puisque ça fonctionne, enfonçons le clou !

Pendant quelques mois, Louvain-la-Neuve, ville universitaire et intellectuelle, s’est retrouvĂ©e sans librairie ! Consciente que ça faisait dĂ©sordre, l’universitĂ© a offert des conditions correctes Ă  une Ă©quipe motivĂ©e pour crĂ©er la librairie « La Page d’AprĂšs » dans une petite surface. La libraire est petite et, par consĂ©quent, doit faire des choix (la littĂ©rature de genre, mon domaine de prĂ©dilection, occupe moins d’une demi-table).

Je me suis Ă©videmment enthousiasmĂ© pour le projet de la Page d’AprĂšs, dont je suis immĂ©diatement devenu un fidĂšle. Je n’avais pas imaginĂ© l’esprit retors du promoteur immobilier qu’est devenue l’université : le soutien Ă  la Page d’AprĂšs (qui n’est que trĂšs relatif, la surface n’est pas offerte non plus) est devenu l’excuse Ă  la moindre critique !

Car c’est aujourd’hui la bibliothĂšque publique de Louvain-la-Neuve elle-mĂȘme qui est menacĂ©e Ă  trĂšs court terme. La partie ludothĂšque et livres jeunesse est d’ores et dĂ©jĂ  condamnĂ©e pour laisser la place Ă  une extension du restaurant universitaire. Le reste de la bibliothĂšque est sur la sellette. L’universitĂ© estime en effet qu’elle pourrait tirer 100.000€ par an de loyer et qu’elle n’a aucune raison d’offrir 100.000€ Ă  une institution qui ne pourrait Ă©videmment pas payer une telle somme. PrĂ©cisons plutĂŽt que l’universitĂ© ne voit plus d’intĂ©rĂȘt Ă  cette bibliothĂšque qu’elle a pourtant dĂ©sirĂ©e ardemment et qu’elle n’a obtenue que grĂące Ă  une convention signĂ©e en 1988, Ă  l’époque oĂč Louvain-la-Neuve n’était encore qu’un jeune assemblage d’auditoires et de logements Ă©tudiants.

À la remarque « Pouvez-vous imaginer une ville universitaire sans bibliothĂšque ? » posĂ©e par de multiples citoyens, la rĂ©ponse de certains dĂ©cideurs est sans ambiguĂŻté : « Nous avons la Page d’AprĂšs ». Comme si c’était pareil. Comme si c’était suffisant. Mais, comme le glissent parfois Ă  demi-mot certains politiques qui n’ont pas peur d’étaler leur dĂ©ficience intellectuelle : « Le livre, c’est mort, l’avenir c’est l’IA. Et puis, si nĂ©cessaire, il y a Amazon ».

L’universitĂ© propose Ă  la bibliothĂšque de garder une fraction de l’espace actuel Ă  la condition que les travaux d’amĂ©nagement soient pris en charge
 par la bibliothĂšque publique elle-mĂȘme (le rĂ©sultat restant propriĂ©tĂ© de l’universitĂ©). De bibliothĂšque, la section de Louvain-la-Neuve se transformerait en "antenne" avec un stock trĂšs faible et oĂč l’on pourrait se procurer les livres commandĂ©s.

Mais c’est complĂštement se mĂ©prendre sur le rĂŽle d’une bibliothĂšque. Un lieu oĂč l’on peut flĂąner et faire des dĂ©couvertes littĂ©raires improbables, dĂ©couvertes d’ailleurs encouragĂ©es par les initiatives du personnel (mise en Ă©vidence de titres mĂ©connus, tirage alĂ©atoire d’une suggestion de lecture 
). Dans la bibliothĂšque de Louvain-la-Neuve, j’ai croisĂ© des bĂ©nĂ©voles aidant des immigrĂ©s adultes Ă  se choisir des livres pour enfant afin d’apprendre le français. J’ai vu mon fils se mettre Ă  lire spontanĂ©ment les journaux quotidiens offerts Ă  la lecture.

Une bibliothùque n’est pas un point d’enlùvement ou un commerce, une bibliothùque est un lieu de vie !

La bibliothĂšque doit subsister. Il faut la sauver. (et signer la pĂ©tition si ce n’est pas encore fait)

La disparition progressive de tout un secteur

Loin de se faire de la concurrence, les diffĂ©rents acteurs du livre se renforcent, s’entraident. Les meilleurs clients de l’un sont souvent les meilleurs clients de l’autre. Un achat d’un cĂŽtĂ© entraine, par ricochet, un achat de l’autre. La bibliothĂšque publique de Louvain-la-Neuve est le plus gros client du fournisseur de BD Slumberland (ou le second aprĂšs moi, me siffle mon portefeuille). L’universitĂ© pourrait faire le choix de participer Ă  cet Ă©cosystĂšme.

Slumberland, lieu mythique vers lequel se tournent mes cinq priĂšres quotidiennes, occupe un espace KlĂ©pierre. Car, Ă  Louvain-la-Neuve, tout appartient soit Ă  l’universitĂ©, soit au groupe KlĂ©pierre, propriĂ©taire du centre commercial. Le bail de Slumberland arrivant Ă  expiration, ils viennent de se voir notifier une augmentation soudaine de plus de 30% !

15.000€ par mois. En Ă©tant ouvert 60h par semaine (ce qui est Ă©norme pour un magasin), cela signifie plus d’un euro par minute d’ouverture. Rien que pour payer son loyer, Slumberland doit vendre une bande dessinĂ©e toutes les 5 minutes ! À ce tarif-lĂ , mes (nombreux et rĂ©currents) achats ne remboursent mĂȘme pas le temps que je passe Ă  flĂąner dans le magasin !

Ces loyers m’interpellent : comment un magasin de loques criardes produites par des enfants dans des caves en Asie peut-il gagner de quoi payer de telles sommes lĂ  oĂč les meilleurs fournisseurs de livres peinent Ă  joindre les deux bouts ? Comment se fait-il que l’épicerie de mon quartier, prĂ©sente depuis 22 ans, favorisant les produits bio et locaux, remplie tous les jours Ă  ras bord de clients, doive brusquement mettre la clĂ© sous le paillasson ? Comme aux États-Unis, oĂč on ne dit pas « boire un café », mais « prendre un Starbucks », il ne nous restera bientĂŽt que les grandes chaĂźnes.

Face Ă  l’hĂ©gĂ©monie de ces monopoles, je croyais que l’universitĂ© Ă©tait un soutien. Mais force est de constater que le modĂšle est plutĂŽt celui de Monaco : le seul pays du monde qui ne dispose pas d’une seule librairie !

Quelle sociĂ©tĂ© les universitaires sont-ils en train de construire ?

Je vous rassure, Slumberland survivra encore un peu Ă  Louvain-la-Neuve. Le magasin a trouvĂ© une surface moins chĂšre (car moins bien exposĂ©e) et va dĂ©mĂ©nager. Son nouveau propriĂ©taire ? L’universitĂ© bien sĂ»r ! Derniers bastions livresques de la ville qui fĂ»t, un jour, une utopie intellectuelle et humaniste, Slumberland et La Page d’AprĂšs auront le droit de subsister jusqu’au jour oĂč les gestionnaires immobiliers qui se prĂ©tendent intellectuels dĂ©cideront que ce serait plus rentable de vendre un peu plus de gaufres, un peu plus de sacs Ă  main ou d’abrutir un peu plus les Ă©tudiants avec un Luna Park.

L’universitĂ© est devenue un business. Le verdict commercial est sans appel : la production de dĂ©biles formatĂ©s Ă  la consommation instagrammable rapporte plus que la formation d’intellectuels.

Mais ce n’est pas une fatalitĂ©.

L’avenir est ce que nous dĂ©ciderons d’en faire. L’universitĂ© n’est pas forcĂ©e de devenir un simple gestionnaire immobilier. Nous sommes l’universitĂ©, nous pouvons la transformer.

J’invite tous les membres du personnel de l’universitĂ©, les professeur·e·s, les Ă©tudiant·e·s, les lecteurices, les intellectuel·le·s et les humanistes Ă  agir, Ă  parler autour d’eux, Ă  dĂ©fendre les livres en les diffusant, en les prĂȘtant, en encourageant leur lecture, en les conseillant, en diffusant leurs opinions, en ouvrant les dĂ©bats sur la place des intellectuels dans la ville.

Pour prĂ©server le savoir et la culture, pour sauvegarder l’humanisme et l’intelligence de l’absurde marchandisation Ă  court terme, nous avons le devoir de communiquer, de partager sans restriction, de faire entendre notre voix de toutes les maniĂšres imaginables.

Je suis Ploum et je viens de publier Bikepunk, une fable Ă©colo-cycliste entiĂšrement tapĂ©e sur une machine Ă  Ă©crire mĂ©canique. Pour me soutenir, achetez mes livres (si possible chez votre libraire) !

Recevez directement par mail mes écrits en français et en anglais. Votre adresse ne sera jamais partagée. Vous pouvez également utiliser mon flux RSS francophone ou le flux RSS complet.

May 12, 2025

Pour une poignée de bits


Toute l’infrastructure gigantesque d’Internet, tous ces milliers de cĂąbles sous-marins, ces milliards de serveurs clignotants ne servent aux humains qu’à Ă©changer des sĂ©ries de bits.

Nos tĂ©lĂ©phones produisent des bits qui sont envoyĂ©s, dupliquĂ©s, stockĂ©s et, parfois, arrivent sur d’autres tĂ©lĂ©phones. Souvent, ces bits ne sont utiles que pour quelques secondes Ă  peine. Parfois, ils ne le sont pas du tout.

Nous produisons trop de bits pour ĂȘtre capables de les consommer ou pour tout simplement en avoir envie.

Or, toute la promesse de l’IA, c’est d’automatiser cette gĂ©nĂ©ration de bits en faisant deux choses : enregistrer les sĂ©quences de bits existantes pour les analyser puis reproduire des sĂ©quences de bits nouvelles, mais « ressemblantes ».

L’IA, les LLMs, ce ne sont que ça : des gĂ©nĂ©rateurs de bits.

Comme me le souffle trĂšs justement StĂ©phane "Alias" Gallay : la course Ă  l’IA, ce n’est finalement qu’un concours de bits.

Enregistrer les séquences de bits

Tous les producteurs d’IA doivent donc d’abord enregistrer autant de sĂ©quences de bits existantes que possible. Pour cette raison, le Web est en train de subir une attaque massive. Ces fournisseurs de crĂ©ateurs de bits pompent agressivement toutes les donnĂ©es qui passent Ă  leur portĂ©e. En continu. Ce qui met Ă  mal toute l’infrastructure du web.

Mais comment arrivent-ils Ă  faire cela ? Et bien une partie de la solution serait que ce soit votre tĂ©lĂ©phone qui le fasse. La sociĂ©tĂ© Infatica, met en effet Ă  disposition des dĂ©veloppeurs d’app Android et iPhone des morceaux de code Ă  intĂ©grer dans leurs apps contre paiement.

Ce que fait ce code ? Tout simplement, Ă  chaque fois que vous utilisez l’app, il donne l’accĂšs Ă  votre bande passante Ă  des clients. Clients qui peuvent donc faire les requĂȘtes de leur choix comme pomper autant de sites que possible. Cela, sans que l’utilisateur du tĂ©lĂ©phone en soi informĂ© le moins du monde.

Cela rend l’attaque impossible Ă  bloquer efficacement, car les requĂȘtes proviennent de n’importe oĂč, n’importe quand.

Tout comme le spam, l’activitĂ© d’un virus informatique se fait dĂ©sormais Ă  visage dĂ©couvert, avec de vraies sociĂ©tĂ©s qui vendent leurs « services ». Et les geeks sont trop naĂŻfs : ils cherchent des logiciels malveillants qui exploitent des failles de sĂ©curitĂ© compliquĂ©es alors que tout se fait de maniĂšre transparente, Ă  ciel ouvert, mais avec ce qu’on appelle la "plausible deniability" grĂące Ă  des couches de services commerciaux. Il y a mĂȘme des sites avec des reviews et des Ă©toiles pour choisir son meilleur rĂ©seau de botnets pseudolĂ©gal.

Le dĂ©veloppeur de l’app Android dira que « il ne savait pas que son app serait utilisĂ©e pour faire des choses nĂ©fastes ». Les fournisseurs de ce code et revendeurs diront « on voulait surtout aider la recherche scientifique et le dĂ©veloppeur est censĂ© prĂ©venir l’utilisateur ». Le client final, qui lance ces attaques pour entrainer ses gĂ©nĂ©rateurs de bits dira « je n’ai fait qu’utiliser un service commercial ».

En fait, c’est mĂȘme pire que cela : comme je l’ai dĂ©montrĂ© lorsque j’ai dĂ©tectĂ© la prĂ©sence d’un tracker Facebook dans l’application officielle de l’institut royal de mĂ©tĂ©orologie belge, il est probable que le maĂźtre d’Ɠuvre de l’application n’en sache lui-mĂȘme rien, car il aura utilisĂ© un sous-traitant pour dĂ©velopper l’app. Et le sous-traitant aura lui-mĂȘme créé l’app en question sur base d’un modĂšle existant (un template).

GrĂące Ă  ces myriades de couches, personne ne sait rien. Personne n’est responsable de rien. Et le web est en train de s’effondrer. AllĂ©gorie virtuelle du reste de la sociĂ©tĂ©.

Générer des séquences de bits

Une fois qu’on a enregistrĂ© assez de sĂ©quences de bits, on va tenter d’y trouver une logique pour gĂ©nĂ©rer des sĂ©quences nouvelles, mais « ressemblantes ». Techniquement, ce qui est trĂšs impressionnant avec les ChatGPT et consorts, c’est l’échelle Ă  laquelle est fait ce que les chercheurs en informatique font depuis vingt ans.

Mais si ça doit ĂȘtre « ressemblant », ça ne peut pas l’ĂȘtre trop ! En effet, cela fait des dĂ©cennies que l’on nous rabĂąche les oreilles avec le "plagiat", avec le "vol de propriĂ©tĂ© intellectuelle". Houlala, "pirater", c’est mal.

Eh bien non, allez-y ! Piratez mes livres ! D’ailleurs, ils sont faits pour, ils sont sous licence libre. Parce que j’ai envie d’ĂȘtre lu. C’est pour ça que j’écris. Je ne connais aucun artiste qui a augmentĂ© la taille de son public en "protĂ©geant sa propriĂ©tĂ© intellectuelle".

Have you ever considered piracy? Have you ever considered piracy?

Parait que c’est mal de pirater.

Sauf quand ce sont les IA qui le font. Ce que montre trĂšs bien Otakar G. Hubschmann dans une expĂ©rience Ă©difiante. Il demande Ă  ChatGPT de gĂ©nĂ©rer des images de « superhĂ©ros qui utilise des toiles d’araignĂ©es pour se dĂ©placer », d’un « jeune sorcier qui va Ă  l’école avec ses amis » ou un « plombier italien avec une casquette rouge ».

Et l’IA refuse. Parce que ce serait enfreindre un copyright. DĂ©solĂ© donc Ă  tous les plombiers italiens qui voudraient mettre une casquette rouge : vous ĂȘtes la propriĂ©tĂ© intellectuelle de Nintendo.

Mais lĂ  oĂč c’est encore plus hallucinant, c’est lorsqu’il s’éloigne des toutes grandes franchises actuelles. S’il demande « photo d’une femme combattant un alien », il obtient
 une image de Sigourney Weaver. Une image d'un aventurier archéologue qui porte un chapeau et utilise un fouet ? Il obtient une photo d'Harrison Ford.

Comme je vous disais : une simple sĂ©rie de bits ressemblant Ă  une autre.

Ce qui nous apprend Ă  quel point les IA n’ont aucune, mais alors lĂ  aucune originalitĂ©. Mais, surtout, que le copyright est vĂ©ritablement un outil de censure qui ne sert que les trĂšs trĂšs grands. GrĂące aux IA, il est dĂ©sormais impossible d’illustrer voire d’imaginer un enfant sorcier allant Ă  l’école parce que c’est du plagiat d’Harry Potter (lui-mĂȘme Ă©tant, selon moi, un plagiat d’un roman d’Anthony Horowitz, mais passons
).

Comme le dit IrĂ©nĂ©e RĂ©gnauld, il s’agit de pousser un usage normatif des technologies Ă  un point trĂšs effrayant.

Mais pour protĂ©ger ces franchises et ce copyright, les mĂȘmes IA n’hĂ©sitent pas Ă  se servir dans les bases de donnĂ©es pirates et Ă  foutre en l’air tous les petits services d’hĂ©bergement.

Les humains derriĂšre les bits

Mais le pire c’est que c’est tellement Ă  la mode de dire qu’on a gĂ©nĂ©rĂ© ses bits automatiquement que, souvent, on le fait faire par des humains camouflĂ©s en gĂ©nĂ©rateurs automatiques. Comme cette app de shopping "AI" qui n’était, en rĂ©alitĂ©, que des travailleurs philippins sous-payĂ©s.

Les luddites l'avaient compris, Charlie Chaplin l'avait illustré dans « Les temps modernes », Arnold Schwarzenegger a essayé de nous avertir : nous servons les machines que nous croyons avoir conçues pour nous servir. Nous sommes esclaves de générateurs de bits.

Pour l’amour des bits !

Dans le point presse de ma ville, j’ai dĂ©couvert qu’il n’y avait qu’un magazine en prĂ©sentoir consacrĂ© Ă  Linux, mais pas moins de 5 magazines consacrĂ©s entiĂšrement aux gĂ©nĂ©rateurs de bits. Avec des couvertures du genre « Mieux utiliser ChatGPT ». Comme si on pouvait l’utiliser « mieux ». Et comme si le contenu de ces magazines n’était lui-mĂȘme pas gĂ©nĂ©rĂ©.

C’est tellement fatigant que j’ai pris la rĂ©solution de ne plus lire les articles parlant de ces gĂ©nĂ©rateurs de bits, mĂȘme s’ils ont l’air intĂ©ressants. Je vais essayer de lire moins sur le sujet, d’en parler moins. AprĂšs tout, je pense que j’ai dit tout ce que j’avais Ă  dire dans ces deux billets :

Vous ĂȘtes dĂ©jĂ  assez assaillis par les gĂ©nĂ©rateurs de bits et par les bits qui parlent des gĂ©nĂ©rateurs de bits. Je vais tenter de ne pas trop en rajouter et revenir Ă  mon mĂ©tier d’artisan. Chaque sĂ©rie de bits que je vous offre est entiĂšrement façonnĂ©e Ă  la main, d’un humain vers un autre. C’est plus cher, plus rare, plus long Ă  lire, mais, je l’espĂšre, autrement plus qualitatif.

Vous sentez l’amour de l’art et la passion derriĂšre ces bits dont chacun Ă  une signification profonde et une utilitĂ© rĂ©elle ? C’est pour les transmettre, les partager que je cherche Ă  prĂ©server notre infrastructure et nos cerveaux.

Bonnes lectures et bons Ă©changes entre humains !

Je suis Ploum et je viens de publier Bikepunk, une fable Ă©colo-cycliste entiĂšrement tapĂ©e sur une machine Ă  Ă©crire mĂ©canique. Pour me soutenir, achetez mes livres (si possible chez votre libraire) !

Recevez directement par mail mes écrits en français et en anglais. Votre adresse ne sera jamais partagée. Vous pouvez également utiliser mon flux RSS francophone ou le flux RSS complet.

May 11, 2025

Plushtodon

I decided to leave twitter.
 
Yes, this has something to do with the change of ownership, the name change to x, 

 
There is only 1 X to me, and that’s X.org

Twitter has become a platform that doesn’t value #freedomofspeech anymore.

My account even got flagged as possible spam for "factchecking" #fakenews.

The main reason is that there is a better alternative in the form of the Fediverse. The #Fediverse is the federated network that Mastodon is part of, built on the ActivityPub protocol.

It allows for a truly decentralised social media platform.

It allows organizations to set up their own Mastodon instance and take ownership and accountability for their content and accounts.

Mastodon is a nice platform; you will probably feel at home there.

People who follow me on twitter can continue to follow me on Mastodon if they want.

https://mastodon.social/@stafwag

I’ll post this message a couple of times to twitter before I close my twitter account, so people can decide if they want to follow me on Mastodon 
or not ;-).

Have fun!

May 09, 2025

Before the MySQL & HeatWave Summit, we released MySQL 9.3, the latest Innovation Release. The event was terrific, and I had the chance to meet some of the MySQL contributors. As usual, we released bug fixes for 8.0 and 8.4 LTS, but I focus on the newest release in this post. We included patches and code [
]

May 07, 2025

After my last blog post about the gloriously pointless /dev/scream, a few people asked:

“Wasn’t /dev/null good enough?”

Fair question—but it misses a key point.

Let me explain: /dev/null and /dev/zero are not interchangeable. In fact, they are opposites in many ways. And to fully appreciate the joke behind /dev/scream, you need to understand where that scream is coming from—not where it ends up.


🌌 Black Holes and White Holes

To understand the difference, let us borrow a metaphor from cosmology.

  • /dev/null is like a black hole: it swallows everything. You can write data to it, but nothing ever comes out. Not even light. Not even your logs.
  • /dev/zero is like a white hole: it constantly emits data. In this case, an infinite stream of zero bytes (0x00). It produces, but does not accept.

So when I run:

dd if=/dev/zero of=/dev/null

I am pulling data out of the white hole, and sending it straight into the black hole. A perfectly balanced operation of cosmic futility.
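
Incidentally, the same pairing is handy as a quick, disk-free throughput test when you give it a size limit, because dd reports its transfer rate when it finishes:

# Move 1 GiB of zeroes from the white hole to the black hole and print the speed
dd if=/dev/zero of=/dev/null bs=1M count=1024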


📩 What Are All These /dev/* Devices?

Let us break down the core players:

| Device | Can You Write To It? | Can You Read From It? | What You Read | Commonly Used For | Nickname / Metaphor |
|---|---|---|---|---|---|
| /dev/null | Yes | Yes | Instantly empty (EOF) | Discard console output of a program | Black hole 🌑 |
| /dev/zero | Yes | Yes | Endless zeroes (0x00) | Wiping drives, filling files, or allocating memory with known contents | White hole 🌕 |
| /dev/random | No | Yes | Random bytes from entropy pool | Secure wiping drives, generating random data | Quantum noise đŸŽČ |
| /dev/urandom | No | Yes | Pseudo-random bytes (faster, less secure) | Generating random data | Pseudo-random fountain 🔀 |
| /dev/one | Yes | Yes | Endless 0xFF bytes | Wiping drives, filling files, or allocating memory with known contents | The dark mirror of /dev/zero â˜ ïž |
| /dev/scream | Yes | Yes | aHAAhhaHHAAHaAaAAAA
 | Catharsis | Emotional white hole đŸ˜± |

Note: /dev/one is not a standard part of Linux—it comes from a community kernel module, much like /dev/scream.


🗣 Back to the Screaming

/dev/scream is a parody of /dev/zero—not /dev/null.

The point of /dev/scream was not to discard data. That is what /dev/null is for.

The point was to generate data, like /dev/zero or /dev/random, but instead of silent zeroes or cryptographic entropy, it gives you something more cathartic: an endless, chaotic scream.

aHAAhhaHHAAHaAaAAAAhhHhhAAaAAAhAaaAAAaHHAHhAaaaaAaHahAaAHaAAHaaHhAHhHaHaAaHAAHaAhhaHaAaAA

So when I wrote:

dd if=/dev/scream of=/dev/null

I was screaming into the void. The scream came from the custom device, and /dev/null politely absorbed it without complaint. Not a single bit screamed back. Like pulling screams out of a white hole and throwing them into a black hole. The ultimate cosmic catharsis.


đŸ§Ș Try Them Yourself

Want to experience the universe of /dev for yourself? Try these commands (press Ctrl+C to stop each):

# Silent, empty. Nothing comes out.
cat /dev/null

# Zero bytes forever. Very chill.
hexdump -C /dev/zero

# Random bytes from real entropy (may block).
hexdump -C /dev/random

# Random bytes, fast but less secure.
hexdump -C /dev/urandom

# If you have the /dev/one module:
hexdump -C /dev/one

# If you installed /dev/scream:
cat /dev/scream

💡 TL;DR

  • /dev/null = Black hole: absorbs, never emits.
  • /dev/zero = White hole: emits zeroes, absorbs nothing.
  • /dev/random / /dev/urandom = Entropy sources: useful for cryptography.
  • /dev/one = Evil twin of /dev/zero: gives endless 0xFF bytes.
  • /dev/scream = Chaotic white hole: emits pure emotional entropy.

So no, /dev/null was not “good enough”—it was not the right tool. The original post was not about where the data goes (of=/dev/null), but where it comes from (if=/dev/scream), just like /dev/zero. And when it comes from /dev/scream, you are tapping into something truly primal.

Because sometimes, in Linux as in life, you just need to scream into the void.

May 04, 2025

Unbound

Unbound is a popular DNS resolver that has native DNS-over-TLS support.
 

Unbound and Stubby were among the first resolvers to implement DNS-over-TLS.

I wrote a few blog posts on how to use Stubby on GNU/Linux and FreeBSD.

The implementation status of DNS-over-TLS and other DNS privacy options is available at: https://dnsprivacy.org/.

See https://dnsprivacy.org/implementation_status/ for more details.

It's less known that it can also be used as an authoritative DNS server (aka a real DNS server). Since I discovered this feature and Unbound got native DNS-over-TLS support, I started using it as my DNS server.

I created a docker container for it a couple of years back to use it as an authoritative DNS server.

I recently updated the container, the latest version (2.1.0) is available at: https://github.com/stafwag/docker-stafwag-unbound

ChangeLog

Version 2.1.0

Upgrade to debian:bookworm

  • Updated BASE_IMAGE to debian:bookworm
  • Add ARG DEBIAN_FRONTEND=noninteractive
  • Run unbound-control-setup to generate the default certificate
  • Documentation updated


 

docker-stafwag-unbound

Dockerfile to run unbound inside a docker container. The unbound daemon will run as the unbound user. The uid/gid is mapped to 5000153.

Installation

clone the git repo

$ git clone https://github.com/stafwag/docker-stafwag-unbound.git
$ cd docker-stafwag-unbound

Configuration

Port

The default DNS port is set to 5353. This port is mapped to the default port 53 with the docker run command (see below). If you want to use another port, you can edit etc/unbound/unbound.conf.d/interface.conf.
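
For reference, that file is expected to contain something along these lines (a sketch, not necessarily the exact file shipped in the image); unbound uses the ip@port notation to bind a listener to a non-default port:

server:
    # Listen on all IPv4 and IPv6 addresses, on port 5353 inside the container
    interface: 0.0.0.0@5353
    interface: ::0@5353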

scripts/create_zone_config.sh helper script

The create_zone_config.sh helper script can help you create the zones.conf configuration file. It is executed during the container build and creates zones.conf from the data files in etc/unbound/zones.

If you want to use a docker volume, or configmaps/persistent volumes on Kubernetes, you can use this script to generate the zones.conf from a zones data directory.

create_zone_config.sh has the following arguments:

  • -f Default: /etc/unbound/unbound.conf.d/zones.conf The zones.conf file to create
  • -d Default: /etc/unbound/zones/ The zones data source files
  • -p Default: the realpath of zone files
  • -s Skip chown/chmod

Use unbound as an authoritative DNS server

To use unbound as an authoritative DNS server - a DNS server that hosts DNS zones - add your zone files to etc/unbound/zones/.

During the creation of the image, scripts/create_zone_config.sh is executed to create the zones configuration file.

Alternatively, you can use a docker volume to mount your zone files at /etc/unbound/zones/, and a volume mount for the zones.conf configuration file.

You can use subdirectories. Each zone file needs to have $ORIGIN set to its zone origin.

Use DNS-over-TLS

The default configuration uses quad9 to forward the DNS queries over TLS. If you want to use another vendor, or you want to query the root DNS servers directly, you can remove this file.
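
For reference, a quad9 DNS-over-TLS forwarding configuration in unbound typically looks like this (a sketch; the file shipped in the image may differ in details such as the CA bundle path):

server:
    # CA bundle used to validate the upstream TLS certificates (Debian path)
    tls-cert-bundle: /etc/ssl/certs/ca-certificates.crt

forward-zone:
    name: "."
    forward-tls-upstream: yes
    forward-addr: 9.9.9.9@853#dns.quad9.net
    forward-addr: 149.112.112.112@853#dns.quad9.net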

Build the image

$ docker build -t stafwag/unbound . 

To use a different BASE_IMAGE, you can use the --build-arg BASE_IMAGE=your_base_image.

$ docker build --build-arg BASE_IMAGE=stafwag/debian:bookworm -t stafwag/unbound .

Run

Recursive DNS server with DNS-over-TLS

Run

$ docker run -d --rm --name myunbound -p 127.0.0.1:53:5353 -p 127.0.0.1:53:5353/udp stafwag/unbound

Test

$ dig @127.0.0.1 www.wagemakers.be

Authoritative DNS server

If you want to use unbound as an authoritative DNS server, you can use the steps below.

Create a directory with your zone files:

[staf@vicky ~]$ mkdir -p ~/docker/volumes/unbound/zones/stafnet
[staf@vicky ~]$ 
[staf@vicky stafnet]$ cd ~/docker/volumes/unbound/zones/stafnet
[staf@vicky ~]$ 

Create the zone files

Zone files

stafnet.zone:

$TTL  86400 ; 24 hours
$ORIGIN stafnet.local.
@  1D  IN  SOA @  root (
            20200322001 ; serial
            3H ; refresh
            15 ; retry
            1w ; expire
            3h ; minimum
           )
@  1D  IN  NS @ 

stafmail IN A 10.10.10.10

stafnet-rev.zone:

$TTL    86400 ;
$ORIGIN 10.10.10.IN-ADDR.ARPA.
@       IN      SOA     stafnet.local. root.localhost.  (
                        20200322001; Serial
                        3h      ; Refresh
                        15      ; Retry
                        1w      ; Expire
                        3h )    ; Minimum
        IN      NS      localhost.
10      IN      PTR     stafmail.

Make sure that the volume directory and zone files have the correct permissions.

$ sudo chmod 750 ~/docker/volumes/unbound/zones/stafnet/
$ sudo chmod 640 ~/docker/volumes/unbound/zones/stafnet/*
$ sudo chown -R root:5000153 ~/docker/volumes/unbound/

Create the zones.conf configuration file.

[staf@vicky stafnet]$ cd ~/github/stafwag/docker-stafwag-unbound/
[staf@vicky docker-stafwag-unbound]$ 

The script will execute a chown and chmod on the generated zones.conf file and is executed with sudo for this reason.

[staf@vicky docker-stafwag-unbound]$ sudo scripts/create_zone_config.sh -f ~/docker/volumes/unbound/zones.conf -d ~/docker/volumes/unbound/zones/stafnet -p /etc/unbound/zones
Processing: /home/staf/docker/volumes/unbound/zones/stafnet/stafnet.zone
origin=stafnet.local
Processing: /home/staf/docker/volumes/unbound/zones/stafnet/stafnet-rev.zone
origin=1.168.192.IN-ADDR.ARPA
[staf@vicky docker-stafwag-unbound]$ 

Verify the generated zones.conf

[staf@vicky docker-stafwag-unbound]$ sudo cat ~/docker/volumes/unbound/zones.conf
auth-zone:
  name: stafnet.local
  zonefile: /etc/unbound/zones/stafnet.zone

auth-zone:
  name: 1.168.192.IN-ADDR.ARPA
  zonefile: /etc/unbound/zones/stafnet-rev.zone

[staf@vicky docker-stafwag-unbound]$ 

run the container

$ docker run --rm --name myunbound -v ~/docker/volumes/unbound/zones/stafnet:/etc/unbound/zones/ -v ~/docker/volumes/unbound/zones.conf:/etc/unbound/unbound.conf.d/zones.conf -p 127.0.0.1:53:5353 -p 127.0.0.1:53:5353/udp stafwag/unbound

Test

[staf@vicky ~]$ dig @127.0.0.1 soa stafnet.local

; <<>> DiG 9.16.1 <<>> @127.0.0.1 soa stafnet.local
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 37184
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;stafnet.local.     IN  SOA

;; ANSWER SECTION:
stafnet.local.    86400 IN  SOA stafnet.local. root.stafnet.local. 3020452817 10800 15 604800 10800

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Sun Mar 22 19:41:09 CET 2020
;; MSG SIZE  rcvd: 83

[staf@vicky ~]$ 

Test reverse lookup.

[staf@vicky ~]$ dig -x 10.10.10.10 @127.0.0.1

; <<>> DiG 9.16.21 <<>> -x 10.10.10.10 @127.0.0.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36250
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;10.10.10.10.in-addr.arpa.	IN	PTR

;; ANSWER SECTION:
10.10.10.10.in-addr.arpa. 86400	IN	PTR	stafmail.

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Oct 19 19:51:47 CEST 2021
;; MSG SIZE  rcvd: 75

[staf@vicky ~]$ 

Have fun!

April 30, 2025

If you are part of the Fediverse—on Mastodon, Pleroma, or any other ActivityPub-compatible platform—you can now follow this blog directly from your favorite platform.

Thanks to the excellent ActivityPub plugin for WordPress, each blog post I publish on amedee.be is now automatically shared in a way that federated social platforms can understand and display.

Follow me from Mastodon

If you are on Mastodon, you can follow this blog just like you would follow another person:

Search for: @amedee.be@amedee.be

Or click this link if your Mastodon instance supports it:
https://amedee.be/@amedee.be

New blog posts will appear in your timeline, and you can even reply to them from Mastodon. Your comments will appear as replies on the blog post page—Fediverse and WordPress users interacting seamlessly!

Why I enabled ActivityPub

I have been active on Mastodon for a while as @amedee@lou.lt, and I really enjoy the decentralized, open nature of the Fediverse. It is a refreshing change from the algorithm-driven social media platforms.

Adding ActivityPub support to my blog aligns perfectly with those values: open standards, decentralization, and full control over my own content.

This change was as simple as adding the activitypub plugin to my blog’s Ansible configuration on GitHub:

 blog_wp_plugins_install:
+  - activitypub
   - akismet
   - google-site-kit
   - health-check

Once deployed, GitHub Actions and Ansible took care of the rest.
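
If you manage your site by hand rather than with Ansible, the equivalent step is a single WP-CLI command (or installing the plugin from the WordPress admin UI):

# Install and activate the ActivityPub plugin from the WordPress.org plugin directory
$ wp plugin install activitypub --activate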

What this means for you

If you already follow me on Mastodon (@amedee@lou.lt), nothing changes—you will still see the occasional personal post, boost, or comment.

But if you are more interested in my blog content—technical articles, tutorials, and occasional personal reflections—you might prefer following @amedee.be@amedee.be. It is an automated account that only shares blog posts.

This setup lets me keep content separate and organized, while still engaging with the broader Fediverse community.

Want to do the same for your blog?

Setting this up is easy:

  1. Make sure you are running WordPress version 6.4 or later.
  2. Install and activate the ActivityPub plugin.
  3. After activation, your author profile (and optionally, your blog itself) becomes followable via the Fediverse.
  4. Start publishing—and federate your writing with the world!

April 27, 2025

2025

While the code (if you call YAML "code") is already more than 5 years old, I finally took the time to make a proper release of my test "hello" OCI container.

I use this container to demo a container build and how to deploy it with helm on a Kubernetes cluster. Some test tools (ping, DNS, curl, wget) are included to execute some tests on the deployed pod.

It also includes a Makefile to build the container and deploy it on a Red Hat OpenShift Local (formerly Red Hat CodeReady Containers) cluster.

To deploy the container with the included helm charts to OpenShift local (Code Ready Containers), execute make crc_deploy.

This will:

  1. Build the container image
  2. Login to the internal OpenShift registry
  3. Push the image to the internal OpenShift registry
  4. Deploy the helm chart in the tsthelm namespace; the helm chart will also create a route for the application.

I might include support for other Kubernetes distributions in the future when I find the time.

docker-stafwag-hello_nginx v1.0.0 is available at:

https://github.com/stafwag/docker-stafwag-hello_nginx

ChangeLog

v1.0.0 Initial stable release

  • Included dns utilities and documentation update by @stafwag in #3
  • Updated Run section by @stafwag in #4

Have fun!

April 25, 2025

Performance hack seen on a customer site; fix the bad LCP (due to an animation in revslider) by loading an inline (base64’ed) png image which according to FF is broken and later in the rendering process hiding & removing it. Even though that image is not *really* used, tools such as Google Pagespeed Insights pick it up as the LCP image and the score is “in the green”. Not sure this is really…

Source

April 23, 2025

It started innocently enough. I was reading a thread about secure file deletion on Linux—a topic that has popped up in discussions for decades. You know the kind: “Is shred still reliable? Should I overwrite with random data or zeroes? What about SSDs and wear leveling?”

As I followed the thread, I came across a mention of /dev/zero, the classic Unix device that outputs an endless stream of null bytes (0x00). It is often used in scripts and system maintenance tasks like wiping partitions or creating empty files.

That led me to wonder: if there is /dev/zero, is there a /dev/one?

Turns out, not in the standard kernel—but someone did write a kernel module to simulate it. It outputs a continuous stream of 0xFF, which is essentially all bits set to one. It is a fun curiosity with some practical uses in testing or wiping data in a different pattern.

But then came the real gem of the rabbit hole: /dev/scream.

Yes, it is exactly what it sounds like.

What is /dev/scream?

/dev/scream is a Linux kernel module that creates a character device which, when read, outputs a stream of text that mimics a chaotic, high-pitched scream. Think:

aHAAhhaHHAAHaAaAAAAhhHhhAAaAAAhAaaAAAaHHAHhAaaaaAaHahAaAHaAAHaaHhAHhHaHaAaHAAHaAhhaHaAaAA

It is completely useless
 and completely delightful.

Originally written by @matlink, the module is a humorous take on the Unix philosophy: “Everything is a file”—even your existential dread. It turns your terminal into a primal outlet. Just run:

cat /dev/scream

And enjoy the textual equivalent of a scream into the void.

Why?

Why not?

Sometimes the joy of Linux is not about solving problems, but about exploring the weird and wonderful corners of its ecosystem. From /dev/null swallowing your output silently, to /dev/urandom serving up chaos, to /dev/scream venting it—all of these illustrate the creativity of the open source world.

Sure, shred and secure deletion are important. But so is remembering that your system is a playground.

Try it Yourself

If you want to give /dev/scream a go, here is how to install it:

⚠ Warning

This is a custom kernel module. It is not dangerous, but do not run it on production systems unless you know what you are doing.

Build and Load the Module

git clone https://github.com/matlink/dev_scream.git
cd dev_scream
make build
sudo make install
sudo make load
sudo insmod dev_scream.ko
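
A quick sanity check that the module is actually loaded before you start screaming (the module name follows from dev_scream.ko above):

lsmod | grep dev_scream
ls -l /dev/scream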

Now read from the device:

cat /dev/scream

Or, if you are feeling truly poetic, try screaming into the void:

dd if=/dev/scream of=/dev/null

In space, nobody can hear you scream
 but on Linux, /dev/scream is loud and clear—even if you pipe it straight into oblivion.

When you are done screaming:

sudo rmmod dev_scream

Final Thoughts

I started with secure deletion, and I ended up installing a kernel module that screams. This is the beauty of curiosity-driven learning in Linux: you never quite know where you will end up. And sometimes, after a long day, maybe all you need is to cat /dev/scream.

Let me know if you tried it—and whether your terminal feels a little lighter afterward.

April 17, 2025

Signing session at Trolls & Vélo, and cycling magic

This Saturday, April 19, I will be in Mons at the Trolls & Légendes festival, signing books at the PVH stand.

The star of the table will undoubtedly be Sara Schneider, fantasy author of the saga of the children of Aliel, freshly crowned with the 2024 Swiss SFFF Prize for her superb novel « Place d’ñmes » (which I have already told you about).

It is the first time I will be signing next to an author who has received a major award. I am not sure she will still let me address her informally.

Sara Schneider with her novel and her 2024 Swiss SFFF Prize

In short, if Sara is coming to supply the legend, the festival’s name implies that it has to be rounded out with trolls. Hence the presence at the PVH table of Tirodem, Allius and myself. Trolling, now that is something we know how to do!

The beautiful machinery of the imagination

If there are trolls and legends, there is also a whole Steampunk side. And what could be more Steampunk than a bicycle?

What makes the bicycle beautiful is its sincerity. It hides nothing, its movements are visible, its effort can be seen and understood; it proclaims its purpose, it says that it wants to go fast, silently and lightly. Why is the automobile so ugly, and why does it give us a feeling of unease? Because it conceals its organs as if they were something shameful. We do not know what it wants. It seems unfinished.
– Voici des ailes, Maurice Leblanc

The bicycle is the culmination of a humanist transhumanism dreamed up by science fiction.

The bicycle has solved the problem: it remedies our slowness and abolishes fatigue. Man is now equipped with all his means. Steam and electricity were only forms of progress serving his well-being; the bicycle is a perfection of his very body, a completion, one might say. It is a faster pair of legs that he is being offered. He and his machine are one; they are not two different beings like man and horse, two opposing instincts; no, it is a single being, an automaton in one piece. There is not a man and a machine, there is a faster man.
– Voici des ailes, Maurice Leblanc

A technological achievement which, paradoxically, connects us with nature. The bicycle is a respectful technology, usable by korrigans, fairies, elves and all the peoples who suffer from our technological growth. The bicycle extends our brain to connect us to nature and induces a shamanic trance as soon as the pedals start to turn.

Our relationship with nature is turned upside down! Imagine two men on a long road: one walks, the other rides; will their relation to nature be the same? Oh no. One will receive from her small sensations of detail, the other a vast overall impression. On foot, you breathe the scent of this plant, you admire the shade of that flower, you hear the song of that bird; on a bicycle, you breathe, you admire, you hear nature herself. It is because the movement produced stretches our nerves to their maximum intensity and endows us with a sensitivity unknown until then.
– Voici des ailes, Maurice Leblanc

Yes, the bicycle amply deserves its place at Trolls & Légendes, as these excerpts from « Voici des ailes » by Maurice Leblanc demonstrate, a novel written
 in 1898, a few years before the creation of ArsÚne Lupin!

Celebrating the Bikepunk universe

I too like to wax lyrical in celebration of the bicycle, as the excerpts that reviewers pick from my novel Bikepunk go to prove.

Chemical crapstorm of a nuclear shitshow of vomit-inducing permashit!
— Bikepunk, Ploum

Yeah, okay, fine
 It is a slightly different style. I am just trying to reach a slightly more modern audience, that’s all. And besides, we had agreed: “not that excerpt!”

Come on, as they say among cyclists: keep it rolling, keep it rolling


So, to celebrate the bicycle and the cycling imagination, I am offering a little surprise to anyone who shows up at the PVH stand this Saturday in a Bikepunk-themed costume (and if you let me know in advance, that is even better).

Because we are going to show those elves, barbarians and mages what real magic, real power is: pedals, two wheels and a handlebar!

See you Saturday, cyclotrolls!

I am Ploum and I have just published Bikepunk, an eco-cycling fable typed entirely on a mechanical typewriter. To support me, buy my books (if possible from your local bookshop)!

Receive my writing in French and English directly by email. Your address will never be shared. You can also use my French-language RSS feed or the complete RSS feed.

April 16, 2025

Managing multiple servers can be a daunting task, especially when striving for consistency and efficiency. To tackle this challenge, I developed a robust automation system using Ansible, GitHub Actions, and Vagrant. This setup not only streamlines server configuration but also ensures that deployments are repeatable and maintainable.

A Bit of History: How It All Started

This project began out of necessity. I was maintaining a handful of Ubuntu servers — one for email, another for a website, and a few for experiments — and I quickly realized that logging into each one to make manual changes was both tedious and error-prone. My first step toward automation was a collection of shell scripts. They worked, but as the infrastructure grew, they became hard to manage and lacked the modularity I needed.

That is when I discovered Ansible. I created the ansible-servers repository in early 2024 as a way to centralize and standardize my infrastructure automation. Initially, it only contained a basic playbook for setting up users and updating packages. But over time, it evolved to include multiple roles, structured inventories, and eventually CI/CD integration through GitHub Actions.

Every addition was born out of a real-world need. When I got tired of testing changes manually, I added Vagrant to simulate my environments locally. When I wanted to be sure my configurations stayed consistent after every push, I integrated GitHub Actions to automate deployments. When I noticed the repo growing, I introduced linting and security checks to maintain quality.

The repository has grown steadily and organically, each commit reflecting a small lesson learned or a new challenge overcome.

The Foundation: Ansible Playbooks

At the core of my automation strategy are Ansible playbooks, which define the desired state of my servers. These playbooks handle tasks such as installing necessary packages, configuring services, and setting up user accounts. By codifying these configurations, I can apply them consistently across different environments.

To manage these playbooks, I maintain a structured repository that includes:

  • Inventory Files: Located in the inventory directory, these YAML files specify the hosts and groups for deployment targets.
  • Roles: Under the roles directory, I define reusable components that encapsulate specific functionalities, such as setting up a web server or configuring a database.
  ‱ Configuration File: The ansible.cfg file sets important defaults, like enabling fact caching and specifying the inventory path, to optimize Ansible’s behavior. A minimal sketch follows this list.
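
For illustration, a minimal ansible.cfg along those lines (a sketch with assumed values, not the repository’s actual file):

# ansible.cfg (illustrative sketch, not the repository's actual file)
[defaults]
inventory = inventory/
# Cache gathered facts between runs so repeated plays start faster
gathering = smart
fact_caching = jsonfile
fact_caching_connection = .ansible_fact_cache
fact_caching_timeout = 86400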

Seamless Deployments with GitHub Actions

To automate the deployment process, I leverage GitHub Actions. This integration allows me to trigger Ansible playbooks automatically upon code changes, ensuring that my servers are always up-to-date with the latest configurations.

One of the key workflows is Deploy to Production, which executes the main playbook against the production inventory. This workflow is defined in the ansible-deploy.yml file and is triggered on specific events, such as pushes to the main branch.
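A deployment workflow along these lines could look roughly as follows (a hedged sketch: the file name and trigger come from the post, but the job steps, inventory path and playbook name are assumptions, and SSH or vault credential setup is omitted):

# .github/workflows/ansible-deploy.yml (illustrative sketch only)
name: Deploy to Production

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install Ansible
        run: sudo apt-get update && sudo apt-get install -y ansible

      # SSH key / vault credential setup omitted for brevity
      - name: Run the playbook against the production inventory
        run: ansible-playbook -i inventory/production.yml site.yml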

Additionally, I have set up other workflows to maintain code quality and security:

  • Super-Linter: Automatically checks the codebase for syntax errors and adherence to best practices.
  • Codacy Security Scan: Analyzes the code for potential security vulnerabilities.
  • Dependabot Updates: Keeps dependencies up-to-date by automatically creating pull requests for new versions.

Local Testing with Vagrant

Before deploying changes to production, it is crucial to test them in a controlled environment. For this purpose, I use Vagrant to spin up virtual machines that mirror my production servers.

The deploy_to_staging.sh script automates this process (a rough sketch follows the list) by:

  1. Starting the Vagrant environment and provisioning it.
  2. Installing required Ansible roles specified in requirements.yml.
  3. Running the Ansible playbook against the staging inventory.
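
A rough sketch of what such a script can look like (the actual deploy_to_staging.sh lives in the repository; the inventory and playbook names used here are assumptions):

#!/bin/bash
# deploy_to_staging.sh (illustrative sketch; inventory and playbook names are assumed)
set -euo pipefail

# 1. Start the Vagrant environment and provision it
vagrant up --provision

# 2. Install the required Ansible roles
ansible-galaxy install -r requirements.yml

# 3. Run the playbook against the staging inventory
ansible-playbook -i inventory/staging.yml site.yml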

This approach allows me to validate changes in a safe environment before applying them to live servers.

Embracing Open Source and Continuous Improvement

Transparency and collaboration are vital in the open-source community. By hosting my automation setup on GitHub, I invite others to review, suggest improvements, and adapt the configurations for their own use cases.

The repository is licensed under the MIT License, encouraging reuse and modification. Moreover, I actively monitor issues and welcome contributions to enhance the system further.


In summary, by combining Ansible, GitHub Actions, and Vagrant, I have created a powerful and flexible automation framework for managing my servers. This setup not only reduces manual effort but also increases reliability and scalability. I encourage others to explore this approach and adapt it to their own infrastructure needs. What began as a few basic scripts has now evolved into a reliable automation pipeline I rely on every day.

If you are managing servers and find yourself repeating the same configuration steps, I invite you to check out the ansible-servers repository on GitHub. Clone it, explore the structure, try it in your own environment — and if you have ideas or improvements, feel free to open a pull request or start a discussion. Automation has made a huge difference for me, and I hope it can do the same for you.


April 15, 2025

If you are testing MySQL with sysbench, here is an RPM version for Fedora 31 and OL 8 & 9, linked with the latest libmysql (libmysqlclient.so.24) from MySQL 9.3. This version of sysbench is built from the latest master branch on GitHub. I used version 1.1, but this is to make a differentiation with the code [
]

April 14, 2025

In search of lost attention

Instant messaging and politics

You have surely seen it go by: an American journalist was invited by mistake into a Signal chat in which very highly placed members of the American administration (including the vice-president) were discussing the top-secret organisation of a military strike in Yemen on March 15.

The reason for this mistake is that Trump’s spokesperson, Brian Hughes, had, during the election campaign, received an email from the journalist in question asking for details on another subject. Brian Hughes then copy/pasted the entire email, including the signature containing the journalist’s phone number, into an Apple iMessage to Mike Waltz, who would go on to become Trump’s national security advisor. Having received this number in a message from Brian Hughes, Mike Waltz apparently saved it under the name Brian Hughes. When he later wanted to invite Brian Hughes into the Signal chat, Mike Waltz mistakenly invited the American journalist.

This anecdote teaches us several things:

First, Signal has become genuine critical security infrastructure, including in the highest circles.

Second, ultra-strategic war discussions now take place
 over chat. It is not hard to imagine each participant replying on autopilot, posting an emoji between two meetings or during a toilet break. And that is where the life and death of the rest of the world gets decided: in the toilets and in meetings that have nothing to do with it!

The initial error stems from the fact that Mike Waltz apparently does not read his emails (otherwise the email would have been forwarded to him instead of being sent by message) and that Brian Hughes is incapable of summarising a long text effectively (otherwise he would not have pasted the entire message).

Not only does Mike Waltz not read his emails, but we may suspect that he does not read messages that are too long either: he did, after all, add a phone number that sat at the end of a message without taking the time to read and understand said message. In his defence, it seems possible that it was the iPhone’s “artificial intelligence” that automatically added this number to the contact.

I do not know whether this feature exists, but the very fact of using a phone that can automatically decide to change the numbers of your contacts is rather frightening. And very much in the style of Apple, whose marketing slogans I interpret as “buy, with our products, the intelligence you lack, you bunch of idiots!”.

Political attention crisis and generalised surveillance

The attention crisis is real: we are less and less capable of concentrating, and we vote for people who are even less so! A friend who was hired to work on an election campaign in Belgium told me he was stunned by how addicted the most prominent politicians are to social networks. They are permanently glued to their screens counting the likes and shares of their posts and, when they receive a brief longer than ten lines, they ask their advisors for an ultra-short summary.

Your politicians understand nothing about anything. They pretend. And now they ask ChatGPT, which has the advantage of not sleeping, unlike human advisors. The same famous artificial intelligences which, precisely, may be guilty of having added the number to that contact and of having drafted Trump’s tax policy.

But why use Signal rather than an official solution that would prevent this kind of leak? Officially, there is supposedly no alternative that is as easy. But I see a very simple unofficial reason: highly placed people are now afraid of their own infrastructure, because they know that everything is saved and may be used against them in a possible investigation or trial, even years later.

Trump was elected the first time by campaigning on the fact that Hillary Clinton had used a personal email server, which, according to Trump himself, allowed her to escape justice by keeping her emails out of the reach of America’s internal surveillance services.

Even those who put the generalised surveillance system in place are afraid of it.

Education in understanding

The last lesson I draw from this anecdote is, once again, about education: you can have the most secure cryptographic infrastructure there is, but if you are incompetent enough to invite just anyone into your chat, nothing can be done for you.

The biggest security hole is always between the chair and the keyboard; the only way to secure a system is to make sure the user is educated.

The best example remains self-driving cars: we are putting entire generations into Teslas that drive themselves 99% of the time. And when an accident happens, in the remaining 1%, we ask the driver: “Why didn’t you react like a good driver?”

And the answer is very simple: “Because I have never driven in my life, I do not know what driving is, I have never learned how to react when the system does not work properly.”

You think I am exaggerating? Wait


Getting hired thanks to AI

Eric Lu received the résumé of a very promising candidate to work at his startup. A résumé that looked heavily keyword-optimised, but that was remarkably sharp on the very technologies Eric uses. So he offered the candidate a video interview.

At first everything went very well, until the candidate started getting tangled up in his answers. “You say the SMS-sending service you worked on was saturated, but you describe the service as being used by a class of 30 people. How can 30 text messages saturate the service?” 
 uh
 “Can you tell me what user interface you built on top of what you say you implemented?” 
 uh, I don’t remember anymore


Eric then realises that the candidate is bluffing. The résumé was generated by ChatGPT. The candidate prepared by simulating a job interview with ChatGPT and learning his answers by heart. He panics as soon as the conversation strays from his script.

What is particularly unfortunate is that the candidate had a genuinely suitable profile. If he had been honest and upfront about his lack of experience, he could have been hired as a junior and gained the experience he wanted. If he had spent his time reading technical explanations of the technologies involved rather than using ChatGPT, he could have convinced the employer of his motivation and his curiosity. “I don’t know much yet, but I am eager to learn.”

But the saddest thing in all of this is that he sincerely thought it could work. He destroyed his reputation because it never even crossed his mind that, even if he had been hired, he would not have lasted two days in the job before coming across as an idiot. He was dishonest because he was convinced that this was the right way to operate.

In short, he was a true Julius.

He “learned to drive a Tesla” by sitting in the seat and watching it go around the block a hundred times. Confident, he set off for another city and drove straight into the first plane tree.

Saving a generation

Smartphones, AI, advertising monopolies, social networks are all facets of the same problem: the will to make technology incomprehensible in order to enslave us commercially and keep our minds occupied.

I have written about how I think we should act to educate the next generation of adults:

But that is a parent’s point of view. That is why I find the analysis by Thual, a young adult barely out of adolescence, so relevant. He can talk about all of this in the first person.

The big lesson I take from it is that the generation coming after us is far from lost. Like every generation, it is eager to learn and to fight. We must have the humility to realise that my generation has completely screwed up. That we are destroying everything, that we are fascists addicted to Facebook and Candy Crush driving around in SUVs.

We have no lessons to give them. We have a duty to help them, to put ourselves at their service by switching off the autopilot and burning the PowerPoint slides we are so proud of.

I am Ploum and I have just published Bikepunk, an eco-cycling fable typed entirely on a mechanical typewriter. To support me, buy my books (if possible from your local bookshop)!

Receive my writing in French and English directly by email. Your address will never be shared. You can also use my French-language RSS feed or the complete RSS feed.

April 10, 2025

On the use of smartphones and tablets by teenagers

Dear parents, dear teachers, dear educators,

As we all know, the smartphone has become an unavoidable part of our daily lives, connecting us permanently to the Internet which, before that, remained confined to the computers on our desks. As we watch our children grow up, the question arises: when, how and why should we bring them into this world of permanent hyperconnection?

Adolescence is a critical phase of life during which the brain is particularly receptive and forms reflexes that will remain anchored for a lifetime. It is also the period during which peer pressure and the desire for social conformity are at their strongest. It is no coincidence that cigarette and alcohol producers explicitly target teenagers in the marketing of their products.

The smartphone being an incredibly recent invention, we completely lack perspective on the impact it may have during those growing years. Is it totally harmless, or will it be regarded in a few years the way tobacco is today? Nobody knows for certain. Our children are the guinea pigs of this technology.

It seems important to me to underline a few key points, which are only some of the many issues studied in this field.

Attention and concentration

It has now been demonstrated that the smartphone greatly disrupts attention and concentration, including in adults. That is no accident: it is designed to do so. Companies like Google and Meta (Facebook, WhatsApp, Instagram) are paid in proportion to the time we spend in front of the screen. Everything is optimised to that end. The mere fact of having a phone nearby, even switched off, disrupts reasoning and noticeably lowers IQ test results.

The brain acquires the reflex of waiting for notifications of new messages from this device; its mere presence is therefore a major handicap in every task that requires attention: reading, learning, thinking, calculating. Switching it off is not enough: it has to be kept at a distance, if possible in another room!

It has been shown that using social networks like TikTok completely disrupts the sense of time and the formation of memory. We have all experienced it: we swear we spent 10 minutes on our smartphone when in reality nearly an hour has gone by.

To memorise and to learn, the brain needs time to rest, emptiness, boredom and reflection. These necessary “dead” moments during journeys, in queues, in the solitude of a teenager’s bedroom, or even during a tedious class, have been supplanted by hyperconnection.

Social anxiety and sleep disruption

Even when we are not using it, we know the conversations go on. That important messages may be being exchanged in our absence. This well-known feeling, called “FOMO” (Fear Of Missing Out), drives us to check our phone late into the night and as soon as we wake up. A worrying proportion of young people admit to waking up during the night to check their smartphone. Yet sleep quality is fundamental to the process of learning and brain development.

Mental health

Recent findings show a strong correlation between the degree of social media use and symptoms of depression. The Western world appears to be in the grip of an epidemic of teenage depression, an epidemic whose timing corresponds exactly with the appearance of the smartphone. Girls under 16 are the most affected group.

Harassment and predation

On social networks, it is trivial to create an account that is anonymous or that impersonates someone else (contrary to what is sometimes claimed in the media, you do not need to be a computer genius to type a fake name into a form). Sheltered by this anonymity, it can be very tempting to make tasteless jokes, to post insults, to expose the secrets teenagers are so fond of, or even to spread slander to settle playground scores. These behaviours have always been part of adolescence and belong to a normal, natural exploration of social relationships. However, the way social networks work greatly amplifies the impact of these actions while encouraging the impunity of whoever is responsible. This can lead to serious consequences, going far beyond what the participants initially imagine.

This pseudonymity is also a blessing for ill-intentioned people who pass themselves off as children and, after weeks of chatting, suggest that the child meet them in real life, without saying anything to the adults.

Instead of drawing educational social lessons from this, we call teenagers who make tasteless jokes “hackers”, stigmatising the use of technology rather than the behaviour. The theme of sexual predators is played up to loudly demand technological control solutions. Solutions that the computing giants are only too happy to sell us, playing on fear and stigmatising technology as well as those who have the misfortune of understanding it intuitively.

Fear and incomprehension become the central drivers for promoting a single educational value: blindly obeying what is incomprehensible and what one must above all not try to understand.

The false idea of learning computing

Because the myth of the “digital native generation” must be deconstructed at all costs.

Contrary to what is sometimes claimed, using a smartphone or a tablet in no way prepares you for learning computing. Smartphones are, on the contrary, designed to hide how they work, and are mostly used to chat and to follow sponsored posts. They prepare you for computing about as much as reading a celebrity magazine in the back of a taxi prepares you to become a mechanic. Sitting in a car does not teach you how it works.

An 87-year-old lady uses a tablet without ever having been trained, yet we are supposed to train children at school?

Training them to use Word or PowerPoint? Children should learn to discover the general principles of software, to experiment, to tinker, not to blindly reproduce behaviour specific to a given proprietary program so as to prepare them to become captive customers. And what about PowerPoint, which forces you to break textuality, the ability to write, in order to reduce complex ideas to bullet points? Teaching PowerPoint amounts to taking your pupils to a fast-food joint under the pretext of teaching them to cook.

The proprietary, closed nature of this software is incredibly perverse. Bringing Microsoft Windows, Google Android or Apple iOS into classrooms means forcing students to smoke indoors without opening the windows in order to turn them into good freedivers who know how to hold their breath. It is both dangerously stupid and counterproductive.

Surprisingly, it is in professional IT circles that you find the most people going back to “dumbphones”, simple phones. Because, as the saying goes, “once you know how the sausage is made, you lose the appetite for it.”

What can we do?

The smartphone is omnipresent. Every generation passes its own fears on to its children. If there is so much discussion, so much worry, so much desire for “education”, it is above all because today’s generation of parents is the one most addicted to its smartphone and the one most spied on by the advertising monopolies. We fear the smartphone’s impact on our children because we are dimly aware of what it inflicts on us.

But teenagers are not condemned to be as naive as we are when it comes to technology.

Start as late as possible

Paediatricians and psychiatrists recommend no regular smartphone use before the age of 15 or 16, the nervous and visual systems still being too sensitive before then. Teenagers themselves, when asked, consider that they should not have a phone before 12 or 13.

Even if an age limit is not realistic for everyone, it seems important to delay daily, regular smartphone use for as long as possible. When your child becomes independent, favour a “dumbphone”, a simple phone that lets them call you and send you text messages. Your child will argue, of course, that they are the only one in their group without a smartphone. We were all teenagers once, and we all used that argument to get the latest fashionable pair of jeans.

As Jonathan Haidt points out in his book “The Anxious Generation”, there is an urgent need for collective action. We give phones to our children earlier and earlier because they tell us “everyone has one except me”. We give in and, without knowing it, we force other parents to give in too. Pilot experiments with “phone-free” schools show immediate results in terms of children’s well-being and mental health.

Talk about it with other parents. Develop strategies together that keep smartphone use reasonable while avoiding exclusion from the group, which is a teenager’s greatest dread.

Talk with your child beforehand

Explain to your child the problems associated with the smartphone. Rather than making arbitrary decisions, consult them and discuss with them the best way for them to enter the connected world. Build a relationship of trust by explaining that they should never blindly trust what they read on the phone.

When in doubt, their reflex should be to discuss it with you.

Introduce the tool gradually

Do not leave your child to figure out a smartphone on their own as soon as your age limit is reached.

Well before that, show them how you use your own smartphone and your computer. Show them the same Wikipedia page on both devices, explaining that it is just one way of displaying content that lives on another computer.

When your child gets their own device, introduce it gradually by allowing its use only in specific cases. You can, for example, keep the phone yourself and only hand it over when the child asks for it, for a limited time and a specific purpose. Do not immediately create accounts on every trendy platform. Observe with them the reflexes they acquire, and talk about the permanent flood that WhatsApp groups are.

Talk about privacy

Remind your child that the goal of the monopolistic platforms is to spy on you permanently in order to resell your private life and bombard you with advertising. That everything said and posted on social networks, including photos, must be considered public; secrecy is only an illusion. A golden rule: do not post anything you would not be comfortable seeing displayed in large print on the school walls.

In Denmark, schools are no longer allowed to use Chromebooks, so as not to infringe on children’s privacy. But do not believe that Android, Windows or iOS are any better in terms of privacy.

Not in the bedroom

Never let your child sleep with their phone. In the evening, the phone should be put away in a neutral place, out of reach. Likewise, do not leave the phone within arm’s reach while the child is doing homework. The same goes for tablets and laptops, which serve exactly the same functions. Ideally, screens should be avoided before school, so as not to start the day already in a state of attention fatigue. Do not forget that the smartphone can carry disturbing, even shocking, yet strangely hypnotic messages and images. The effect of screen light on sleep quality is also a problem that is still poorly understood.

Keep the conversation going

So-called “parental control” software exists. But no software will ever replace the presence of parents. Worse: the most resourceful children will very quickly find tricks to get around these limitations, or will be tempted to get around them simply because they are arbitrary. Rather than imposing electronic control, take the time to ask your children what they do on their phone, who they talk to, what is being said, and which apps they use.

Using the Internet can also be very beneficial, allowing the child to learn about subjects outside the school curriculum or to discover communities with interests different from those found at school.

In the same way that you let your child join a sports club or the scouts while keeping them from hanging around the streets with a gang of hooligans, you must keep an eye on who your children spend time with online. Far from the school WhatsApp groups, your child can find online communities that share their interests, communities in which they can learn, discover and flourish if they are well guided.

Set the example, be the example!

Our children do not do what we tell them to do; they do what they see us do. Children who have seen their parents smoke are the most likely to become smokers themselves. The same goes for smartphones. If our child sees us permanently on our phone, they have no choice but to want to imitate us. One of the finest gifts you can give is therefore not to use your phone compulsively in your child’s presence.

Yes, you need to acknowledge and deal with your own addiction!

Plan periods when you put it on silent or in airplane mode and store it out of the way. When you do pick up your phone, explain to your child what you are using it for.

In front of them, sit down and read a paper book. And no, reading on an iPad is not “the same”.

By the way, if you are short of ideas, I can only recommend my latest novel: a thrilling adventure written on a typewriter, about bicycles, adolescence, the end of the world and smartphones switched off forever. Yes, advertising has even slipped into this text, what a scandal!

Give them a taste for computing, not for being controlled

Do not shoot the messenger: the culprit is not “the screen” but the use we make of it. The computing monopolies try to make users addicted and captive in order to bombard them with advertising and make them consume. That is where the responsibility lies.

Learning to program (which, at the start, works perfectly well without a screen), playing deep video games with complex stories, or simply funny ones just to have a good time, chatting online with fellow enthusiasts, devouring Wikipedia
 Modern computing opens magnificent doors that it would be a shame to deprive our children of.

Instead of giving in to our own fears, anxieties and incomprehension, we must give our children the desire to take back control of computing and of our lives, control that we have handed over a little too easily to the advertising monopolies in exchange for a glass rectangle displaying coloured icons.

A child is surprised that a book has disappeared from her tablet; the teacher explains that companies decided the book was not good for her.

Accepting imperfection

“I used to have principles; now I have children,” as the saying goes. It is impossible to be perfect. Whatever we do, our children will be confronted with toxic conversations and idiotic cartoons, and that is perfectly normal. As parents, we do what we can, within our own realities.

Nobody is perfect. Least of all a parent.

What matters is not preventing our children from ever being in front of a screen, but realising that a smartphone is in no way an educational tool, that it prepares us for nothing other than being good passive consumers.

The only learning that is truly necessary is that of a critical mind in the use of any computing tool.

And in that apprenticeship, children often have a lot to teach adults!

UPDATE June 2025: a large panel of experts attempted to establish a genuine scientific consensus. The result is that nobody disputes the harmful impacts of the smartphone on teenagers’ sleep, on attention, and on the deterioration of mental health.

I am Ploum and I have just published Bikepunk, an eco-cycling fable typed entirely on a mechanical typewriter. To support me, buy my books (if possible from your local bookshop)!

Receive my writing in French and English directly by email. Your address will never be shared. You can also use my French-language RSS feed or the complete RSS feed.

April 09, 2025

Introduction

In my previous post, I shared the story of why I needed a new USB stick and how I used ChatGPT to write a benchmark script that could measure read performance across various methods. In this follow-up, I will dive into the technical details of how the script evolved—from a basic prototype into a robust and feature-rich tool—thanks to incremental refinements and some AI-assisted development.


Starting Simple: The First Version

The initial idea was simple: read a file using dd and measure the speed.

dd if=/media/amedee/Ventoy/ISO/ubuntu-24.10-desktop-amd64.iso \
   of=/dev/null bs=8k

That worked, but I quickly ran into limitations:

  • No progress indicator
  • Hardcoded file paths
  • No USB auto-detection
  • No cache flushing, leading to inflated results when repeating the measurement

With ChatGPT’s help, I started addressing each of these issues one by one.


Tools check

On a default Ubuntu installation, some tools are available by default, while others (especially benchmarking tools) usually need to be installed separately.

Tools used in the script:

Tool        Installed by default?       Needs require?
hdparm      ❌ Not installed             ✅ Yes
dd          ✅ Yes                       ❌ No
pv          ❌ Not installed             ✅ Yes
cat         ✅ Yes                       ❌ No
ioping      ❌ Not installed             ✅ Yes
fio         ❌ Not installed             ✅ Yes
lsblk       ✅ Yes (in util-linux)       ❌ No
awk         ✅ Yes (in gawk)             ❌ No
grep        ✅ Yes                       ❌ No
basename    ✅ Yes (in coreutils)        ❌ No
find        ✅ Yes                       ❌ No
sort        ✅ Yes                       ❌ No
stat        ✅ Yes                       ❌ No

This function ensures the system has all tools needed for benchmarking. It exits early if any tool is missing.

This was the initial version:

check_required_tools() {
  local required_tools=(dd pv hdparm fio ioping awk grep sed tr bc stat lsblk find sort)
  for tool in "${required_tools[@]}"; do
    if ! command -v "$tool" &>/dev/null; then
      echo "❌ Required tool '$tool' is not installed."
      exit 1
    fi
  done
}

That’s already nice, but maybe I just want to run the script anyway if some of the tools are missing.

This is a more advanced version:

ALL_TOOLS=(hdparm dd pv ioping fio lsblk stat grep awk find sort basename column gnuplot)
MISSING_TOOLS=()

require() {
  if ! command -v "$1" >/dev/null; then
    return 1
  fi
  return 0
}

check_required_tools() {
  echo "🔍 Checking required tools..."
  for tool in "${ALL_TOOLS[@]}"; do
    if ! require "$tool"; then
      MISSING_TOOLS+=("$tool")
    fi
  done

  if [[ ${#MISSING_TOOLS[@]} -gt 0 ]]; then
    echo "⚠  The following tools are missing: ${MISSING_TOOLS[*]}"
    echo "You can install them using: sudo apt install ${MISSING_TOOLS[*]}"
    if [[ -z "$FORCE_YES" ]]; then
      read -rp "Do you want to continue and skip tests that require them? (y/N): " yn
      case $yn in
        [Yy]*)
          echo "Continuing with limited tests..."
          ;;
        *)
          echo "Aborting. Please install the required tools."
          exit 1
          ;;
      esac
    else
      echo "Continuing with limited tests (auto-confirmed)..."
    fi
  else
    echo "✅ All required tools are available."
  fi
}

Device Auto-Detection

One early challenge was identifying which device was the USB stick. I wanted the script to automatically detect a mounted USB device. My first version was clunky and error-prone.

detect_usb() {
  USB_DEVICE=$(lsblk -o NAME,TRAN,MOUNTPOINT -J | jq -r '.blockdevices[] | select(.tran=="usb") | .name' | head -n1)
  if [[ -z "$USB_DEVICE" ]]; then
    echo "❌ No USB device detected."
    exit 1
  fi
  USB_PATH="/dev/$USB_DEVICE"
  MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_PATH" | head -n1)
  if [[ -z "$MOUNT_PATH" ]]; then
    echo "❌ USB device is not mounted."
    exit 1
  fi
  echo "✅ Using USB device: $USB_PATH"
  echo "✅ Mounted at: $MOUNT_PATH"
}

After a few iterations, we (ChatGPT and I) settled on parsing lsblk with filters on tran=usb and hotplug=1, and selecting the first mounted partition.

We also added a fallback prompt in case auto-detection failed.

detect_usb() {
  if [[ -n "$USB_DEVICE" ]]; then
    echo "📎 Using provided USB device: $USB_DEVICE"
    MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
    return
  fi

  echo "🔍 Detecting USB device..."
  USB_DEVICE=""
  while read -r dev tran hotplug type _; do
    if [[ "$tran" == "usb" && "$hotplug" == "1" && "$type" == "disk" ]]; then
      base="/dev/$dev"
      part=$(lsblk -nr -o NAME,MOUNTPOINT "$base" | awk '$2 != "" {print "/dev/"$1; exit}')
      if [[ -n "$part" ]]; then
        USB_DEVICE="$part"
        break
      fi
    fi
  done < <(lsblk -o NAME,TRAN,HOTPLUG,TYPE,MOUNTPOINT -nr)

  if [ -z "$USB_DEVICE" ]; then
    echo "❌ No mounted USB partition found on any USB disk."
    lsblk -o NAME,TRAN,HOTPLUG,TYPE,SIZE,MOUNTPOINT -nr | grep part
    read -rp "Enter the USB device path manually (e.g., /dev/sdc1): " USB_DEVICE
  fi

  MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
  if [ -z "$MOUNT_PATH" ]; then
    echo "❌ USB device is not mounted."
    exit 1
  fi

  echo "✅ Using USB device: $USB_DEVICE"
  echo "✅ Mounted at: $MOUNT_PATH"
}

Finding the Test File

To avoid hardcoding filenames, we implemented logic to search for the latest Ubuntu ISO on the USB stick.

find_ubuntu_iso() {
  # Function to find an Ubuntu ISO on the USB device
  find "$MOUNT_PATH" -type f -regextype posix-extended \
    -regex ".*/ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\\.iso" | sort -V | tail -n1
}

Later, we enhanced it to accept a user-provided file, and even verify that the file was located on the USB stick. If it was not, the script would gracefully fall back to the Ubuntu ISO search.

find_test_file() {
  if [[ -n "$TEST_FILE" ]]; then
    echo "📎 Using provided test file: $(basename "$TEST_FILE")"
    
    # Check if the provided test file is on the USB device
    TEST_FILE_MOUNT_PATH=$(realpath "$TEST_FILE" | grep -oP "^$MOUNT_PATH")
    if [[ -z "$TEST_FILE_MOUNT_PATH" ]]; then
      echo "❌ The provided test file is not located on the USB device."
      # Look for an Ubuntu ISO if it's not on the USB
      TEST_FILE=$(find_ubuntu_iso)
    fi
  else
    TEST_FILE=$(find_ubuntu_iso)
  fi

  if [ -z "$TEST_FILE" ]; then
    echo "❌ No valid test file found."
    exit 1
  fi

  if [[ "$TEST_FILE" =~ ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\.iso ]]; then
    UBUNTU_VERSION=$(basename "$TEST_FILE" | grep -oP 'ubuntu-\d{2}\.\d{2}')
    echo "đŸ§Ș Selected Ubuntu version: $UBUNTU_VERSION"
  else
    echo "📎 Selected test file: $(basename "$TEST_FILE")"
  fi
}

Read Methods and Speed Extraction

To get a comprehensive view, we added multiple methods:

  • hdparm (direct disk access)
  • dd (simple block read)
  • dd + pv (with progress bar)
  • cat + pv (alternative stream reader)
  • ioping (random access)
  • fio (customizable benchmark tool)
    if require hdparm; then
      drop_caches
      speed=$(sudo hdparm -t --direct "$USB_DEVICE" 2>/dev/null | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    drop_caches
    speed=$(dd if="$TEST_FILE" of=/dev/null bs=8k 2>&1 |& extract_speed)
    mb=$(speed_to_mb "$speed")
    echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
    TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
    echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    ((idx++))

    if require pv; then
      drop_caches
      FILESIZE=$(stat -c%s "$TEST_FILE")
      speed=$(dd if="$TEST_FILE" bs=8k status=none | pv -s "$FILESIZE" -f -X 2>&1 | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require pv; then
      drop_caches
      speed=$(cat "$TEST_FILE" | pv -f -X 2>&1 | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require ioping; then
      drop_caches
      speed=$(ioping -c 10 -A "$USB_DEVICE" 2>/dev/null | grep 'read' | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require fio; then
      drop_caches
      speed=$(fio --name=readtest --filename="$TEST_FILE" --direct=1 --rw=read --bs=8k \
            --size=100M --ioengine=libaio --iodepth=16 --runtime=5s --time_based --readonly \
            --minimal 2>/dev/null | awk -F';' '{print $6" KB/s"}' | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi

Parsing their outputs proved tricky. For example, pv outputs speed with or without spaces, and with different units. We created a robust extract_speed function with regex, and a speed_to_mb function that could handle both MB/s and MiB/s, with or without a space between value and unit.

extract_speed() {
  grep -oP '(?i)[\d.,]+\s*[KMG]i?B/s' | tail -1 | sed 's/,/./'
}

speed_to_mb() {
  if [[ "$1" =~ ([0-9.,]+)[[:space:]]*([a-zA-Z/]+) ]]; then
    value="${BASH_REMATCH[1]}"
    unit=$(echo "${BASH_REMATCH[2]}" | tr '[:upper:]' '[:lower:]')
  else
    echo "0"
    return
  fi

  case "$unit" in
    kb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v / 1000 }' ;;
    mb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
    gb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v * 1000 }' ;;
    kib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v / 1024 }' ;;
    mib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
    gib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v * 1024 }' ;;
    *) echo "0" ;;
  esac
}
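
A quick usage sketch of the two helpers together (the sample output line is invented for illustration):

# Pick the last speed value out of a tool's output, then normalise it to MB/s
speed=$(echo "819200000 bytes (819 MB) copied, 2 s, 409,6 MB/s" | extract_speed)
echo "$speed"            # -> 409.6 MB/s
speed_to_mb "$speed"     # -> 409.60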

Dropping Caches for Accurate Results

To prevent cached reads from skewing the results, each test run begins by dropping system caches using:

sync && echo 3 | sudo tee /proc/sys/vm/drop_caches

What it does:

Command                              Purpose
sync                                 Flushes all dirty (pending write) pages to disk
echo 3 > /proc/sys/vm/drop_caches    Clears page cache, dentries, and inodes from RAM

We wrapped this in a helper function and used it consistently.
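
For reference, this is the helper exactly as it appears in the full script further down:

drop_caches() {
  echo "đŸ§č Dropping system caches..."
  if [[ $EUID -ne 0 ]]; then
    echo "  (requires sudo)"
  fi
  sudo sh -c "sync && echo 3 > /proc/sys/vm/drop_caches"
}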


Multiple Runs and Averaging

We made the script repeat each test N times (default: 3), collect results, compute averages, and display a summary at the end.

  echo "📊 Read-only USB benchmark started ($RUNS run(s))"
  echo "==================================="

  declare -A TEST_NAMES=(
    [1]="hdparm"
    [2]="dd"
    [3]="dd + pv"
    [4]="cat + pv"
    [5]="ioping"
    [6]="fio"
  )

  declare -A TOTAL_MB
  for i in {1..6}; do TOTAL_MB[$i]=0; done
  CSVFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).csv"
  echo "Test,Run,Speed (MB/s)" > "$CSVFILE"

  for ((run=1; run<=RUNS; run++)); do
    echo "▶ Run $run"
    idx=1

  ### tests run here

  echo "📄 Summary of average results for $UBUNTU_VERSION:"
  echo "==================================="
  SUMMARY_TABLE=""
  for i in {1..6}; do
    if [[ ${TOTAL_MB[$i]} != 0 ]]; then
      avg=$(echo "scale=2; ${TOTAL_MB[$i]} / $RUNS" | bc)
      echo "${TEST_NAMES[$i]} average: $avg MB/s"
      RESULTS+=("${TEST_NAMES[$i]} average: $avg MB/s")
      SUMMARY_TABLE+="${TEST_NAMES[$i]},$avg\n"
    fi
  done

Output Formats

To make the results user-friendly, we added:

  • A clean table view
  • CSV export for spreadsheets
  • Log file for later reference
  if [[ "$VISUAL" == "table" || "$VISUAL" == "both" ]]; then
    echo -e "📋 Table view:"
    echo -e "Test Method,Average MB/s\n$SUMMARY_TABLE" | column -t -s ','
  fi

  if [[ "$VISUAL" == "bar" || "$VISUAL" == "both" ]]; then
    if require gnuplot; then
      echo -e "$SUMMARY_TABLE" | awk -F',' '{print $1" "$2}' | \
      gnuplot -p -e "
        set terminal dumb;
        set title 'USB Read Benchmark Results ($UBUNTU_VERSION)';
        set xlabel 'Test Method';
        set ylabel 'MB/s';
        plot '-' using 2:xtic(1) with boxes notitle
      "
    fi
  fi

  LOGFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).log"
  {
    echo "Benchmark for USB device: $USB_DEVICE"
    echo "Mounted at: $MOUNT_PATH"
    echo "Ubuntu version: $UBUNTU_VERSION"
    echo "Test file: $TEST_FILE"
    echo "Timestamp: $(date)"
    echo "Number of runs: $RUNS"
    echo ""
    echo "Read speed averages:"
    for line in "${RESULTS[@]}"; do
      echo "$line"
    done
  } > "$LOGFILE"

  echo "📝 Results saved to: $LOGFILE"
  echo "📈 CSV exported to: $CSVFILE"
  echo "==================================="

The Full Script

Here is the complete version of the script used to benchmark the read performance of a USB drive:

#!/bin/bash

# ==========================
# CONFIGURATION
# ==========================
RESULTS=()
USB_DEVICE=""
TEST_FILE=""
RUNS=1
VISUAL="none"
SUMMARY=0

# (Consider grouping related configuration into a config file or associative array if script expands)

# ==========================
# ARGUMENT PARSING
# ==========================
while [[ $# -gt 0 ]]; do
  case $1 in
    --device)
      USB_DEVICE="$2"
      shift 2
      ;;
    --file)
      TEST_FILE="$2"
      shift 2
      ;;
    --runs)
      RUNS="$2"
      shift 2
      ;;
    --visual)
      VISUAL="$2"
      shift 2
      ;;
    --summary)
      SUMMARY=1
      shift
      ;;
    --yes|--force)
      FORCE_YES=1
      shift
      ;;
    *)
      echo "Unknown option: $1"
      exit 1
      ;;
  esac
done

# ==========================
# TOOL CHECK
# ==========================
ALL_TOOLS=(hdparm dd pv ioping fio lsblk stat grep awk find sort basename column gnuplot)
MISSING_TOOLS=()

require() {
  if ! command -v "$1" >/dev/null; then
    return 1
  fi
  return 0
}

check_required_tools() {
  echo "🔍 Checking required tools..."
  for tool in "${ALL_TOOLS[@]}"; do
    if ! require "$tool"; then
      MISSING_TOOLS+=("$tool")
    fi
  done

  if [[ ${#MISSING_TOOLS[@]} -gt 0 ]]; then
    echo "⚠  The following tools are missing: ${MISSING_TOOLS[*]}"
    echo "You can install them using: sudo apt install ${MISSING_TOOLS[*]}"
    if [[ -z "$FORCE_YES" ]]; then
      read -rp "Do you want to continue and skip tests that require them? (y/N): " yn
      case $yn in
        [Yy]*)
          echo "Continuing with limited tests..."
          ;;
        *)
          echo "Aborting. Please install the required tools."
          exit 1
          ;;
      esac
    else
      echo "Continuing with limited tests (auto-confirmed)..."
    fi
  else
    echo "✅ All required tools are available."
  fi
}

# ==========================
# AUTO-DETECT USB DEVICE
# ==========================
detect_usb() {
  if [[ -n "$USB_DEVICE" ]]; then
    echo "📎 Using provided USB device: $USB_DEVICE"
    MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
    return
  fi

  echo "🔍 Detecting USB device..."
  USB_DEVICE=""
  while read -r dev tran hotplug type _; do
    if [[ "$tran" == "usb" && "$hotplug" == "1" && "$type" == "disk" ]]; then
      base="/dev/$dev"
      part=$(lsblk -nr -o NAME,MOUNTPOINT "$base" | awk '$2 != "" {print "/dev/"$1; exit}')
      if [[ -n "$part" ]]; then
        USB_DEVICE="$part"
        break
      fi
    fi
  done < <(lsblk -o NAME,TRAN,HOTPLUG,TYPE,MOUNTPOINT -nr)

  if [ -z "$USB_DEVICE" ]; then
    echo "❌ No mounted USB partition found on any USB disk."
    lsblk -o NAME,TRAN,HOTPLUG,TYPE,SIZE,MOUNTPOINT -nr | grep part
    read -rp "Enter the USB device path manually (e.g., /dev/sdc1): " USB_DEVICE
  fi

  MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
  if [ -z "$MOUNT_PATH" ]; then
    echo "❌ USB device is not mounted."
    exit 1
  fi

  echo "✅ Using USB device: $USB_DEVICE"
  echo "✅ Mounted at: $MOUNT_PATH"
}

# ==========================
# FIND TEST FILE
# ==========================
find_ubuntu_iso() {
  # Function to find an Ubuntu ISO on the USB device
  find "$MOUNT_PATH" -type f -regextype posix-extended \
    -regex ".*/ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\\.iso" | sort -V | tail -n1
}

find_test_file() {
  if [[ -n "$TEST_FILE" ]]; then
    echo "📎 Using provided test file: $(basename "$TEST_FILE")"
    
    # Check if the provided test file is on the USB device
    TEST_FILE_MOUNT_PATH=$(realpath "$TEST_FILE" | grep -oP "^$MOUNT_PATH")
    if [[ -z "$TEST_FILE_MOUNT_PATH" ]]; then
      echo "❌ The provided test file is not located on the USB device."
      # Look for an Ubuntu ISO if it's not on the USB
      TEST_FILE=$(find_ubuntu_iso)
    fi
  else
    TEST_FILE=$(find_ubuntu_iso)
  fi

  if [ -z "$TEST_FILE" ]; then
    echo "❌ No valid test file found."
    exit 1
  fi

  if [[ "$TEST_FILE" =~ ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\.iso ]]; then
    UBUNTU_VERSION=$(basename "$TEST_FILE" | grep -oP 'ubuntu-\d{2}\.\d{2}')
    echo "đŸ§Ș Selected Ubuntu version: $UBUNTU_VERSION"
  else
    echo "📎 Selected test file: $(basename "$TEST_FILE")"
  fi
}



# ==========================
# SPEED EXTRACTION
# ==========================
extract_speed() {
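  # Grab the last throughput figure (e.g. "123.4 MB/s" or "98,7 MiB/s") from a
  # tool's output and turn a decimal comma into a dot.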
  grep -oP '(?i)[\d.,]+\s*[KMG]i?B/s' | tail -1 | sed 's/,/./'
}

speed_to_mb() {
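  # Normalise a "<value> <unit>" speed string to MB/s so results from different
  # tools can be averaged; binary units (KiB/MiB/GiB) are treated as their
  # decimal cousins, which is close enough for this comparison.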
  if [[ "$1" =~ ([0-9.,]+)[[:space:]]*([a-zA-Z/]+) ]]; then
    value="${BASH_REMATCH[1]}"
    unit=$(echo "${BASH_REMATCH[2]}" | tr '[:upper:]' '[:lower:]')
  else
    echo "0"
    return
  fi

  case "$unit" in
    kb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v / 1000 }' ;;
    mb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
    gb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v * 1000 }' ;;
    kib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v / 1024 }' ;;
    mib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
    gib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v * 1024 }' ;;
    *) echo "0" ;;
  esac
}

drop_caches() {
  echo "đŸ§č Dropping system caches..."
  if [[ $EUID -ne 0 ]]; then
    echo "  (requires sudo)"
  fi
  sudo sh -c "sync && echo 3 > /proc/sys/vm/drop_caches"
}

# ==========================
# RUN BENCHMARKS
# ==========================
run_benchmarks() {
  echo "📊 Read-only USB benchmark started ($RUNS run(s))"
  echo "==================================="

  declare -A TEST_NAMES=(
    [1]="hdparm"
    [2]="dd"
    [3]="dd + pv"
    [4]="cat + pv"
    [5]="ioping"
    [6]="fio"
  )

  declare -A TOTAL_MB
  for i in {1..6}; do TOTAL_MB[$i]=0; done
  CSVFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).csv"
  echo "Test,Run,Speed (MB/s)" > "$CSVFILE"

  for ((run=1; run<=RUNS; run++)); do
    echo "▶ Run $run"
    idx=1

    if require hdparm; then
      drop_caches
      speed=$(sudo hdparm -t --direct "$USB_DEVICE" 2>/dev/null | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    drop_caches
    speed=$(dd if="$TEST_FILE" of=/dev/null bs=8k 2>&1 |& extract_speed)
    mb=$(speed_to_mb "$speed")
    echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
    TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
    echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    ((idx++))

    if require pv; then
      drop_caches
      FILESIZE=$(stat -c%s "$TEST_FILE")
      speed=$(dd if="$TEST_FILE" bs=8k status=none | pv -s "$FILESIZE" -f -X 2>&1 | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require pv; then
      drop_caches
      speed=$(cat "$TEST_FILE" | pv -f -X 2>&1 | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require ioping; then
      drop_caches
      speed=$(ioping -c 10 -A "$USB_DEVICE" 2>/dev/null | grep 'read' | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require fio; then
      drop_caches
      # Field 7 of fio's terse (--minimal) output is the read bandwidth in KiB/s;
      # field 6 is the total amount of data read, which is not a speed.
      speed=$(fio --name=readtest --filename="$TEST_FILE" --direct=1 --rw=read --bs=8k \
            --size=100M --ioengine=libaio --iodepth=16 --runtime=5s --time_based --readonly \
            --minimal 2>/dev/null | awk -F';' '{print $7" KB/s"}' | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
  done

  echo "📄 Summary of average results for $UBUNTU_VERSION:"
  echo "==================================="
  SUMMARY_TABLE=""
  for i in {1..6}; do
    if [[ ${TOTAL_MB[$i]} != 0 ]]; then
      avg=$(echo "scale=2; ${TOTAL_MB[$i]} / $RUNS" | bc)
      echo "${TEST_NAMES[$i]} average: $avg MB/s"
      RESULTS+=("${TEST_NAMES[$i]} average: $avg MB/s")
      SUMMARY_TABLE+="${TEST_NAMES[$i]},$avg\n"
    fi
  done

  if [[ "$VISUAL" == "table" || "$VISUAL" == "both" ]]; then
    echo -e "📋 Table view:"
    echo -e "Test Method,Average MB/s\n$SUMMARY_TABLE" | column -t -s ','
  fi

  if [[ "$VISUAL" == "bar" || "$VISUAL" == "both" ]]; then
    if require gnuplot; then
      echo -e "$SUMMARY_TABLE" | awk -F',' '{print $1" "$2}' | \
      gnuplot -p -e "
        set terminal dumb;
        set title 'USB Read Benchmark Results ($UBUNTU_VERSION)';
        set xlabel 'Test Method';
        set ylabel 'MB/s';
        plot '-' using 2:xtic(1) with boxes notitle
      "
    fi
  fi

  LOGFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).log"
  {
    echo "Benchmark for USB device: $USB_DEVICE"
    echo "Mounted at: $MOUNT_PATH"
    echo "Ubuntu version: $UBUNTU_VERSION"
    echo "Test file: $TEST_FILE"
    echo "Timestamp: $(date)"
    echo "Number of runs: $RUNS"
    echo ""
    echo "Read speed averages:"
    for line in "${RESULTS[@]}"; do
      echo "$line"
    done
  } > "$LOGFILE"

  echo "📝 Results saved to: $LOGFILE"
  echo "📈 CSV exported to: $CSVFILE"
  echo "==================================="
}

# ==========================
# MAIN
# ==========================
check_required_tools
detect_usb
find_test_file
run_benchmarks

You can also find the latest revision of this script as a GitHub Gist.
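As a quick reference, here is a minimal usage sketch. It assumes the script was saved as usb-benchmark.sh (any filename works) and that the device and file paths are adjusted to your own setup:

chmod +x usb-benchmark.sh

# Auto-detect the USB device, run every test 3 times,
# and show both the summary table and the ASCII bar chart:
./usb-benchmark.sh --runs 3 --visual both

# Benchmark a specific partition with a specific test file, skipping the
# confirmation prompt when some tools are missing
# (example paths only; use your own device and mount point):
./usb-benchmark.sh --device /dev/sdc1 \
  --file /media/usb/ubuntu-24.04-desktop-amd64.iso \
  --yes

The script calls sudo itself for hdparm and for dropping the caches, so expect a password prompt during the first run.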


Lessons Learned

This script has grown from a simple one-liner into a reliable tool to test USB read performance. Working with ChatGPT sped up development significantly, especially for bash edge cases and regex. But more importantly, it helped guide the evolution of the script in a structured way, with clean modular functions and consistent formatting.


Conclusion

This has been a fun and educational project. Whether you are benchmarking your own USB drives or just want to learn more about shell scripting, I hope this walkthrough is helpful.

Next up? Maybe a graphical version, or write benchmarking, prototyped on a RAM disk to avoid wearing out the flash storage.

Stay tuned—and let me know if you use this script or improve it!

April 08, 2025

The end of a world?

The end of our memories

We are being invaded by AI. Far more than you think.

Every time your phone takes a photo, what you see is not reality but a "probable" reconstruction of what you want to see. That is why photos now look so beautiful, so vivid, so sharp: they are not a reflection of reality, but a reflection of what we want to see, of what we are most likely to find "beautiful". It is also why de-Googled systems take less impressive photos: they do not benefit from Google's algorithms enhancing the picture in real time.

Hallucinations seem rare to our naive eyes because they are believable. We do not see them. But they are there. Like that bride-to-be trying on her dress in front of mirrors and discovering that every reflection is different.

I have managed to confuse the algorithms myself. On the left, the photo as I took it and as it appears in any photo viewer. On the right, the same photo displayed in Google Photos. For some hard-to-fathom reason, the algorithm tries to reconstruct the picture and fails spectacularly.

A photo of my hand on the left and the same photo, completely distorted, on the right

Yet these AI-reconstructed images are what our brains will remember. Our memories are literally being altered by AI.

The end of truth

Everything you think you are reading on LinkedIn was probably generated by a bot. To give you an idea: on April 2 there were already bots on that network bragging about migrating from Offpunk to XKCDpunk.

Screenshot of LinkedIn showing a post by one Arthur Howell touting a blog post about migrating from Offpunk to XKCDpunk.

The Offpunk-to-XKCDpunk transition was an April Fools' joke so specific that only a handful of insiders could understand it. It took less than 24 hours for the topic to be picked up on LinkedIn.

Honestly, you can switch LinkedIn off. Even your contacts' posts are probably largely AI-generated, prompted by an algorithmic nudge to post.

Three years ago, I warned that chatbots were generating content that was filling the web and serving as training material for the next generation of chatbots.

I called it a silent war. It is not so silent anymore. Russia, in particular, uses this principle to flood the web with automatically generated articles repeating its propaganda.

The principle is simple: since chatbots are statistical machines, if you publish a million articles describing the biological weapons experiments the Americans are running in Ukraine (which is false), the chatbot will treat that piece of text as statistically frequent and will be highly likely to serve it back to you.

And even if you do not use ChatGPT, your politicians and journalists do. They are even proud of it.

They hear ChatGPT braying in a field and turn it into a speech that will itself be picked up by ChatGPT. They poison reality and, in doing so, change it. They know perfectly well that they are lying. That is the point.

I used to think that using these tools was a somewhat stupid waste of time. In fact, it is also dangerous for everyone else. You are probably wondering what the fuss is about the border tariffs Trump has just announced. Economists are scratching their heads. Geeks have figured it out: the entire tariff plan and its explanation appear to have been literally generated by a chatbot answering the question "how do I impose customs tariffs to reduce the deficit?".

The world is not run by Trump, it is run by ChatGPT. But where is the Sarah Connor who will pull the plug?

Panel from Tintin, L'Étoile mystérieuse

The end of learning

Slack steals our attention, but it also steals our learning by letting anyone interrupt, with a private message, the senior developer who knows the answers because he built the system.

The ability to learn is exactly what phones and AI are taking from us. As Hilarius Bookbinder, a philosophy professor at an American university, points out, the major generational difference he observes is that today's students feel no shame in simply emailing the professor to ask him to summarize what they need to know.

In his March journal, Thierry Crouzet makes a similar observation. When he announces that he is leaving Facebook, the only response he gets is "But why?", even though he has been posting links on the subject for ages.

Chatbots themselves are not systems that can be learned. They are statistical and constantly changing. By using them, the only skill we acquire is the impression that learning is not possible. These systems are literally stealing our reflex to think and to learn.

As a result, without even wanting to search, part of the population now expects a personal, immediate, short, summarized answer. And if possible, in video form.

The end of trust

Learning requires self-confidence. It is impossible to learn without the certainty that you are capable of learning. Conversely, once you acquire that certainty, almost anything can be learned.

A study by Microsoft researchers shows that the more confident you are in yourself, the less you trust chatbot answers. Conversely, at the slightest self-doubt, you suddenly trust the results you are given.

Because chatbots talk like CEOs, marketers or con artists: they project confidence in their own answers. People, even the most expert, who lack the reflex to seek out conflict and question authority end up converting their confidence in themselves into confidence in a tool.

A random-generation tool owned by multinationals.

Companies are stealing our self-confidence. They are stealing our competence. They are stealing our most brilliant scientists.

And it is already doing damage in the field of "strategic intelligence" (that is, the secret services).

As well as in healthcare: doctors tend to place excessive trust in automated diagnoses, particularly for cancers. The most experienced doctors hold up better, but they remain vulnerable: they make mistakes they would never normally make when the mistake is encouraged by an artificial assistant.

The end of knowledge

With chatbots, an idea as old as computing itself is resurfacing: "What if we could tell the machine what we want without having to program it?"

It is the dream of that whole category of managers who see programmers as nothing more than button-pushers who unfortunately have to be paid, but whom they would love to do without.

A dream that, it should be said, is completely stupid.

Because humans do not know what they want. Because speech is by its very nature imprecise. Because when we speak, we exchange sensations and intuitions, but we cannot be precise, rigorous, in short, scientific.

Humanity left the Middle Ages when the likes of Newton, Leibniz and Descartes began inventing a language of rational logic: mathematics. Just as, only slightly earlier, a precise language had been invented to describe music.

Settling for running a program you have merely described to a chatbot is an intellectual return to the Middle Ages.

But then, you still need to master a language. When you spend your school years asking a chatbot to summarize the books you were supposed to read, it is far from certain that you will ever manage to describe precisely what you want.

In fact, it is not even certain that we will still manage to think what we want. Or even to want at all. The ability to think and reason is strongly correlated with the ability to put things into words.

What is well conceived is clearly expressed, and the words to say it come easily. (Boileau)

That is no longer a return to the Middle Ages, it is a return to the Stone Age.

Or to the future described in my (excellent) novel Printeurs: advertising injunctions that have taken the place of the will. (Yes, really, buy it! It is both thrilling and thought-provoking.)

Panel from Tintin, L'Étoile mystérieuse

The end of different voices

I criticize the need to get answers in video form because the notion of reading matters. I am realizing that an incredible proportion of people, including academics, do not know how to "read". They can decode, certainly, but they cannot really read. There is a very simple test to find out whether you can read: if you find it easier to listen to a YouTube video of someone talking than to read the text yourself, you are probably just decoding. You are reading aloud inside your head in order to listen to yourself speak.

There are of course plenty of contexts where video or voice has advantages, but when it comes to learning, say, a set of commands and their parameters, video is unbearably ill-suited. Yet I have lost count of the students who recommend videos on the subject to me.

Reading is not simply turning letters into sound. It is perceiving the meaning directly, which allows constant back-and-forth, pauses and quick skims in order to understand the text. Between a writer and a reader there is a communication, a telepathic communion, that makes oral exchange seem slow, inefficient, clumsy, even crude.

That exchange is not always ideal. A writer has a personal "voice" that does not suit everyone. I regularly come across blogs whose subject interests me, yet I cannot bring myself to subscribe because the blogger's "voice" does not work for me at all.

That is normal and even desirable. It is one of the reasons we need a multitude of voices. We need people who read and then write, who mix ideas and transform them in order to pass them on in their own voice.

The end of human relationships

In a checkout line, I overheard the person in front of me bragging about telling ChatGPT all about their love life and constantly asking it for advice on how to handle it.

As if the situation called for an answer from a computer rather than a conversation with another human being who understands, or who has even lived through the same problem.

After stealing our every moment of solitude with the incessant notifications of our phones and the messages on social networks, AI is now going to steal our sociability.

We will no longer be connected to anything but the provider, the Company.

On Gopher, szczezuja speaks of the other people posting on Gopher as his friends:

Not everyone knows that they are my friends, but what else would you call someone you read regularly and whose private life you know a little about?

The end of the end

The end of one era is always the beginning of another. Announcing the end is preparing a rebirth, learning from our mistakes so we can rebuild something better.

Perhaps that is what I appreciate so much about Gemini: the feeling of discovering and following unique, human "voices". I feel as though I am witnessing a micro-faction of humanity breaking away from the rest and rebuilding something else. People who read what other humans have written simply because another human needed to write it, without expecting anything in return.

Do you remember "planets"? They are blog aggregators that gather the participants of a project into a single feed. The idea was historically launched by GNOME with planet.gnome.org (which still exists) before becoming widespread.

Well, bacardi55 is launching Planet Gemini FR, an aggregator of French-speaking Gemini capsules.

It is great and perfect for anyone who wants to discover content on Gemini.

It is great for anyone who wants to read other humans who have nothing to sell you. In short, to discover the finest of the fine


All the images are illegally taken from Hergé's masterpiece, L'Étoile mystérieuse. There is no reason chatbots should be the only ones helping themselves.

I am Ploum and I have just published Bikepunk, an eco-cyclist fable typed entirely on a mechanical typewriter. To support me, buy my books (if possible from your local bookshop)!

Receive my writing in French and English directly by email. Your address will never be shared. You can also use my French-language RSS feed or the full RSS feed.