Planet Grep

Planet'ing Belgian FLOSS people

Planet Grep is maintained by Wouter Verhelst. All times are in UTC.

March 19, 2024

In MySQL 8.0.12, we introduced a new algorithm for DDLs that doesn't block the table while changing its definition. The first instant operation was adding a column at the end of a table; this was a contribution from Tencent Games.

Then in MySQL 8.0.29 we added the possibility to add (or remove) a column anywhere in the table.

For more information, please check these articles from Mayank Prasad: [1], [2]

In this article, I want to focus on some dangers that can arise when using this feature blindly.

Default Algorithm

Since MySQL 8.0.12, for any supported DDL, the default algorithm is INSTANT. This means that the ALTER statement will only modify the table’s metadata in the data dictionary. No exclusive metadata locks are taken on the table during the preparation and execution phases of the operation, and table data is unaffected, making the operations instantaneous.

The other two algorithms are COPY and INPLACE, see the manual for the online DDL operations.

However, there is a limitation for INSTANT DDLs even when the operation is supported: a table supports a maximum of 64 instant changes. After reaching that limit, the table needs to be rebuilt.

If the algorithm is not specified in the ALTER statement (DDL operation), the appropriate algorithm is chosen silently. Of course, this can lead to a nightmare situation in production if it is not expected.

Always specify the ALGORITHM

So the first recommendation is to always specify the algorithm when performing DDLs, even if it's the default one. When the algorithm is specified and MySQL is not able to use it, it throws an error instead of executing the operation with another algorithm:

SQL > ALTER TABLE t1 DROP col1, ALGORITHM=INSTANT;
ERROR: 4092 (HY000): Maximum row versions reached for table test/t1.
No more columns can be added or dropped instantly. Please use COPY/INPLACE.
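When the requested algorithm can be applied, the same explicit statement simply succeeds. A minimal illustration (the table and column names below are hypothetical, not taken from the example above):

SQL > ALTER TABLE t2 ADD COLUMN col2 INT, ALGORITHM=INSTANT;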

Monitor the instant changes

The second recommendation is to monitor the number of instant changes performed on your tables.

MySQL keeps track of the row versions in Information_Schema:

SQL > SELECT NAME, TOTAL_ROW_VERSIONS 
      FROM INFORMATION_SCHEMA.INNODB_TABLES WHERE NAME LIKE 'test/t1';
+---------+--------------------+
| NAME    | TOTAL_ROW_VERSIONS |
+---------+--------------------+
| test/t1 |                 63 |
+---------+--------------------+

In the example above, the DBA can still perform one more INSTANT DDL operation, but after that one, MySQL won't be able to perform another.

As a DBA, it's good practice to monitor all your tables and decide when a table needs to be rebuilt (to reset that counter).

This is an example of a recommended query to add to your monitoring tool:

SQL > SELECT NAME, TOTAL_ROW_VERSIONS, 64-TOTAL_ROW_VERSIONS AS 
             "REMAINING_INSTANT_DDLs", 
             ROUND(TOTAL_ROW_VERSIONS/64 * 100,2) AS "DDLs %"
      FROM INFORMATION_SCHEMA.INNODB_TABLES 
      WHERE TOTAL_ROW_VERSIONS > 0 ORDER BY 2 DESC;
+--------------------------+--------------------+------------------------+--------+
| NAME                     | TOTAL_ROW_VERSIONS | REMAINING_INSTANT_DDLs | DDLs % |
+--------------------------+--------------------+------------------------+--------+
| test/t1                  |                 63 |                      1 |  98.44 |
| test/t                   |                  4 |                     60 |   6.25 |
| test2/t1                 |                  3 |                     61 |   4.69 |
| sbtest/sbtest1           |                  2 |                     62 |   3.13 |
| test/deprecation_warning |                  1 |                     63 |   1.56 |
+--------------------------+--------------------+------------------------+--------+

To reset the counter and rebuild the table, you can use OPTIMIZE TABLE <table> or ALTER TABLE <table> ENGINE=InnoDB.
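For example (the table name is illustrative), keeping in mind that a rebuild is a regular COPY/INPLACE operation and therefore not instant:

SQL > OPTIMIZE TABLE t1;
SQL > ALTER TABLE t1 ENGINE=InnoDB;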

Conclusion

In conclusion, MySQL 8.0's introduction of the INSTANT algorithm for DDL operations has revolutionized schema changes by avoiding blocking operations. However, with the limit of 64 instant changes before a table rebuild is required, it's crucial to specify the algorithm explicitly in ALTER statements to avoid unexpected behavior. Monitoring the number of instant changes through Information_Schema is also recommended, so you don't hit the instant change limit unaware and can plan table rebuilds carefully.

Enjoy MySQL!

March 18, 2024

A Society That Lost Focus

Our Mind, the Bottleneck

In the early 90s, after tweaking my MS-DOS computer, I was able to play games. One of those games was called "Battle Chess": a chess game where the pieces actually fought each other. It was fun. I was, and still am, a mediocre chess player. I was mated in fewer than 10 or 15 moves at the easiest level.

For the sake of the experiment, I turned the difficulty up to the hardest level and started playing. Something strange happened: I was still losing, but it took a lot more moves. I was able to defend my position and even manage a few draws.

Was it a bug in the game?

Even as a young teenager, I quickly understood the reason. With the setting set to "hard", the game would try a lot harder to find a good move. On my 386 processor, without a mathematical coprocessor, this took time: several seconds or even a minute per move. During that time, I was thinking, anticipating.

With the easiest setting, computer moves happened immediately. I knew I had all the time I wanted, but I felt compelled to move fast. I could not bring myself to take my time while the other side reacted instantly to my moves.

The world we live in is that same chess game on the easiest setting. Everything happens immediately, all the time. White-collar work can now be summarised as trying to reply as fast as possible to every single email until calling it a day and starting again in the morning, a process which essentially prevents any deep thinking, as pointed out by Cal Newport in his book "A World Without Email".

As we don't have time to think anymore, we masquerade our lack of ideas with behavioural tricks. We replaced documents with PowerPoints because they allow lack of structure and emptiness to look professional (just copy-paste the content of the last PowerPoint you received into a text file and see for yourself how pitiful it is). PowerPoint communication at NASA was even diagnosed by Edward R. Tufte, author of "The Cognitive Style of PowerPoint", as one of the causes that led to the Space Shuttle Columbia disaster.

The root problem is that, for the first time in human history, our brain is the bottleneck. For all of history, transmitting information was slow. Brains were fast. After sending a letter, we had days or months to think before receiving an answer. Erasmus wrote his famous "Éloge de la folie" over several days while travelling across Europe. He would never have done it in a couple of hours on a plane while the small screen in the seat back showed him advertisements.

In 2012, the French writer Thierry Crouzet had one of the first recorded "online burnouts". Being connected all the time with interesting strangers and interesting ideas he wanted to reply to quickly was too much for his brain. One night, he had a severe panic attack and decided to spend six months without the Internet, an experience he recounted in his book "J'ai débranché".

The Oversold Internet

The instant feedback of permanent connectivity is clearly a bad thing. But the worst was yet to come. After the 2000s bubble popped and told us that the Internet was not "magic money", the question became "how do we monetise the Internet?" A few idealistic geeks replied, "You don't monetise it, it's a non-commercial world." But geeks, like everyone else, wanted or needed to be paid.

To earn money, they handed the reins of the whole new world they were creating to marketers. That's it: hackers sold the Internet in exchange for a salary. Until 2000, marketers played along with the idea of selling the work hackers were doing. With one small problem: they oversold it completely, buying into the geek fantasy that, soon, everybody would be on the Internet buying stuff online.

In the 2000s, nobody but geeks wanted to spend their life behind a huge radiating screen. Marketers suddenly woke up to that reality with the dot-com bubble. If not everybody wanted to be on the Internet and nobody would buy anything on the Internet, there were two potential solutions: either monetise the fact that some people were already spending lots of time on the Internet, or convince more people to come to the Internet.

Surviving companies such as Google went for the easier one: monetising what people were already giving to the Internet, namely their time and attention. Advertising was, of course, already part of the web (mostly through the infamous "popups"), but Google innovated by inventing a whole new way of exploiting attention: trying to learn as much as possible about users in order to show them the advertising they are most likely to click on. The whole story is told in great detail in the book "Surveillance Capitalism" by Shoshana Zuboff.

Whether this "personalised advertising" really works better than the traditional kind is up for debate. For Tim Hwang, author of "Subprime Attention Crisis", and for Cory Doctorow, author of "How to Destroy Surveillance Capitalism", the real impact on sales is negligible, but as marketers think it works, they pour massive amounts of money into it, making the whole technology a very lucrative bubble.

But the real impact is undisputed: as long as someone buys it, it is really lucrative to sell consumers' attention and whatever information about them you can gather. As a consequence, the practice has become generalised, and nearly every website, every app on the Internet is trying to get both. And they are very scientific about the process.

We forgot how not to spy and steal attention

It is now considered "normal practice" to try to get the attention and the data of your users, even when it doesn't make sense from a business perspective.

Banking apps send notifications to show you their shiny new logo, good old e-commerce websites ask their customers how many children they have or what their income bracket is. Even non-commercial personal blogs or some websites dedicated to privacy contain analytics software to track their users. Not tracking your users is harder than tracking them! Every single vendor you shop from, even a brick-and-mortar one, will bury you with mailings.

One could assume that buying a new mattress is something you do only every decade and that the prospective market for mattress vendors is those who didn’t buy a mattress in the last five years. So why did anybody think that, right after buying a mattress, I would be interested in receiving news about mattresses every single week of my life?

The two consequences of all this are that our privacy is invaded as much as is technologically possible and that our attention is scientifically captured as much as is technologically possible. And in both respects, the technology keeps "improving", as all the smartest minds in the world are hired to do just that.

While working at Google, Tristan Harris realised how much of what they were building was designed to capture people's focus and attention. He left Google to create the "Center for Humane Technology", which tries to raise awareness of the fact that… our attention is being captured by monopolistic technologies.

The irony is palpable: Tristan Harris had a very good intuition but can't imagine doing anything other than either "raising awareness" through social networks or building technologies that would notify you that you should be focused. Let's build yet another layer of complexity on top of everything else and raise awareness so this layer is adopted widely enough to become the foundation of the next complexity paradigm.

Worshipping Shallow Ideas

Being distracted all the time prevents us from having any ideas or understanding. We need a catchy slogan. Instead of reading a three-page report, we prefer a 60-slide PowerPoint, containing mostly stock pictures and out-of-context charts.

We have valorised the heroic image of the CEO who comes into a meeting and tells the engineers, "I have ten minutes left before my next meeting. Tell me everything in five and I'll make a billion-dollar decision."

In retrospect, it is obvious that making good decisions in that context is nothing more than rolling a die. Funnily enough, it has been shown multiple times that high-profile CEOs are no better than a random decision algorithm. But, unlike algorithms, CEOs usually have charisma and confidence. They may make a very wrong decision, but they can convince everybody that it's the right one. Which is exactly the definition of a salesman's job.

In "Deep Work", Cal Newport tries to promote the opposite stance: the art of taking the time to think, to ponder. In "The Ideas Industry", Daniel Drezner observes that long, subtle and complex ideas are more and more being replaced by simplistic slogans, the epitome being the famous TED conferences. In 18 minutes, people are sold an idea and, if the speaker is a good salesman, feel like they've learned something deep and new. The very idea that you could learn something deep in 18 minutes is an insult to the whole academic world. Unsurprisingly, that same academic world is seen by many as boring old people spending their time writing long articles instead of coming up with a catchy slogan to change the world.

Succumbing to Our Addictions

Most monopolies were built by removing choices. You could not buy a computer without Microsoft Windows. You could not visit some websites without Internet Explorer. You can't find a phone without Google in a shop (Google pays many billions of dollars every year to be the default search engine on Apple devices). And if you manage to remove Google from your phone, you will lose the ability to run some apps, including most banking apps. Most apps even check at startup whether Google services are installed on the phone and refuse to start if they aren't. If it's really hard not to use Google, it is by definition a forced monopoly. Similarly, it is very hard to avoid Amazon when shopping online.

There's one exception: Facebook. There's nothing forcing us to go to Facebook or Instagram. There's nothing forcing us to spend time on them. It looks like we have a choice. But it seems we don't.

Why is this? Why do we play an hour of what was supposed to be five minutes of a stupid smartphone game instead of reading a book? Why do we spend every waking minute checking our smartphone and replying to mundane chitchat, even when we are in the middle of a conversation with someone else? Why do we feel compelled to put our life and the lives of our children at risk just to fire off a quick reply while driving?

Because of the way the human brain is wired. Evolutionarily speaking, we crave new experiences. New experiences, good or bad, may help your chromosomes survive more generations than others'. We get that famous "dopamine rush", described in great detail by Lieberman and Long in "The Molecule of More".

Each time there's a notification, each time there's a red bubble in some corner of the screen, the brain acts as if it's a new vital opportunity. We can't miss it. A study showed that the notification sound alone was enough to distract a driver as much as texting while driving. Yes, even without looking at your phone, you are distracted as much as if you had looked (which is not an excuse to look at it).

The brain has learned that the phone is a random provider of "new experiences". Even in airplane mode, having the phone on your desk or in your bag has been shown to heavily degrade your attention and your thinking performance. Performance went back to normal only when the phone was put in another room.

Fighting to Get Our Focus Back

That's it: the only way to avoid temptation is not to have the phone within arm's reach. The aforementioned French writer Thierry Crouzet once told me that it is very difficult to focus on writing when you know you only have to move the word processor window with the mouse to get to the Internet. On the web, writers' forums are full of discussions about "distraction-free" devices. Some, including yours truly, are going back to old typewriters, a paradigm described as a true form of resistance by Richard Polt in the excellent book "The Typewriter Revolution".

One may even wonder if the epidemic of "electro-sensitivity", feeling bad or sick when exposed to wifi or similar wireless emissions, may simply be a psychological reaction to the overstimulation. It has been observed that the symptoms are real (people really do feel bad and are not faking it) but that, in double-blind controlled environments, the symptoms are linked to the belief in wireless emissions: if you simulate a blinking wireless router without emitting anything, people feel bad; if there are wireless emissions but you tell people they are disabled, they feel better.

In his landmark book "Digital Minimalism", Cal Newport offers a framework to rethink the way we use digital technologies. The central idea is to balance costs and benefits consciously, highlighting the mostly hidden costs. Facebook might be free in the sense that you don't have to pay for it. But being exposed to advertising, being exposed to angry political rants, feeling compelled to answer, being exposed to pictures of people you once knew who seem to have an extraordinary (even if virtual) life is a very high cost.

Simply do the math. If you have 180 friends on Facebook, which seems to be a low number these days, and your friends take, on average, 10 days of vacation per year, you will have, on average, five friends on vacation every day. Add the fact that some people like to re-post pictures of old vacations, and you will be bombarded daily with pictures of sunny beaches and beautiful landscapes while you wait under neon lights for your next boring meeting in a grey office. By design, Facebook makes you feel miserable.

That's not to say that Facebook cannot be useful and have benefits. As Cal Newport highlights, you need to adapt your use to maximise the benefits while trying to avoid the costs as much as possible. You have to think consciously about what you really want to achieve.

This idea of digital minimalism prompted a revival of so-called "dumb phones": phones which are not smart and which can only make phone calls and send/receive SMS. Some brands, like Mudita and Lightphone, are even starting to innovate in that particular market.

Ironically, they are advertising mindfulness and being focused. They are trying to catch your attention to sell you back… your own attention.

Focus Against Consumerism

One of the consumerist credos is that the market will fix everything. If there's a problem, someone will quickly sell a solution. As pointed out by Evgeny Morozov in "To Save Everything, Click Here", this is not just wrong thinking. It is actually harmful.

With public money, we actively fund companies and startups in the belief that they will both create jobs and sell solutions to every problem. It is implied that every solution should be technological, sellable and intuitive. That's it: you should not think too much about a problem but instead blindly build whatever solution comes to mind using the currently trending technology stack. French author Antoine Gouritin wrote a funny and interesting book about that whole philosophy, which he calls "Le Startupisme".

The root cause is there: we don't have any mental framework left other than spying on people and stealing their attention. Business schools teach how to make catchy PowerPoints while stealing people's attention. Every business is at war with every other to catch your attention and your brain cycles. Even academia now fights for grants based on catchy PowerPoints and raw numbers of publications. This was David Graeber's blunt observation: even academics have stopped thinking in order to play the "catch your attention" game.

There's no silver bullet. There will not be any technological solution. If we want to claim back our focus and our brain cycles, we will need to walk away and normalise disconnected time. To recognise and share the work of those who are not seeking attention at all costs, who don't have catchy slogans or spectacular conclusions. We need to start appreciating harder work which doesn't offer us immediate short-term profit.

Our mind, not the technology, is the bottleneck. We need to care about our minds. To dedicate time to think slowly and deeply.

We need to bring back the Sapiens in Homo Sapiens Sapiens.

As a writer and an engineer, I like to explore how technology impacts society. You can subscribe by email or by RSS. I value privacy and never share your address.

If you read French, you can support me by buying/sharing/reading my books and subscribing to my newsletter in French or RSS. I also develop Free Software.

March 15, 2024

This is simply impossible in Belgium, because it requires a serious, non-populist debate between all the political parties (one where they bring their calculators and their logical reasoning, and make sure the budget really stays balanced). That hasn't existed at the federal level for decades. So, yes.

But one way to solve the country's budget problem could be to tax consumption more by raising the VAT rate from 21% to, say, 25% or 30%, to tax income (much) less, and finally to move more products and services to the 6% and 12% brackets.

In other words: all essential expenses such as electricity, Internet, (gas) heating, water, bread, and so on at 6%. A (much) larger basket of (for example, healthy) foodstuffs, housing (and related costs such as renovations) at 12%.

But 25% or 30%, or possibly even 40%, on all luxury consumption.

Coupled with a substantial reduction in personal income tax.

The result: more VAT revenue from consumption. A better competitive position relative to our neighbouring countries, because salaries don't need to rise (thanks to the lower personal income tax). Purchasing power is strengthened as a result, and that strengthening is reinforced by the larger basket of goods at the 6% and 12% VAT rates.

In other words, only luxury consumption gets taxed (much) more. Essential consumption moves to 6% or 12% (or keeps the current 21%).

The biggest drawback is that our accountants get more work. But I don't think they will mind all that much (besides, they bill their extra hours at the luxury-consumption VAT rate).

One example of an advantage is that you can steer consumption more precisely. Do you want us Belgians to buy more electric cars? Put such a car at 6% VAT and a car with a diesel engine at 40% VAT. Do you want young people to eat healthier food? Coca-Cola at 40% VAT and fruit juice at 6% VAT.

Of course, you also have to cut spending and carry out a serious efficiency exercise.

Readings: A Society of Lies

Productivism

Reading this simple and magnificent text by Bruno Leyval describing working-class life, I can't help hearing François Béranger's "Tranche de vie" in my head. And fully realising just how inhuman shift work in factories is.

Night work is necessary for plenty of reasons: health, safety, caring for people. Or in agriculture, in the care of everything that lives, from plants to animals. But what can we say about the fact that we force humans to work at night and irreversibly destroy their health in order to build cars or package peas that nobody would want without the relentless bludgeoning of marketing? Who is this "we" that does the forcing? The bosses? The politicians? Or we ourselves, the consumers?

Generalised Surveillance

Surveillance cameras everywhere can't hurt, right? Including in vending machines. And then, because of a bug, the student population of a Canadian university discovers that the automatic chocolate vending machines are doing facial recognition. Every customer's face is scanned without their knowledge, and statistics on age, gender and potentially plenty of other things are sent to the marketing department. All in order to sell us the junk produced by workers spending their nights on assembly lines.

The question this raises: who has access to these millions of data points captured around us at all times, whether by our watch, our car, a surveillance camera or a drinks dispenser?

Aram Sinnreich and Jesse Gilbert found the answer. It is simple: everyone. You just have to pay, and it really isn't very expensive.

And even if you own no connected device and are very careful, in the United States it is possible to buy data about every person who visits a given address. Your house, for example. And to find out where they come from and what their standard of living is.

But in this particular case, the maker of the vending machine tries to be reassuring: the machine complies with the law, including the European GDPR.

Actually, no. The GDPR does not allow scanning faces without consent. But if the company says it complies with the GDPR, who will check? And how?

Using the GDPR, I had more than 300 online accounts in my name deleted (it took me three years). After several years, some supposedly deleted accounts started sending me "newsletters" again. So, in fact, nothing had been deleted at all. Special mention goes to the sushi restaurant that replaced my login "ploum" with "deleted_ploum" and claimed everything had been erased while I still had access to the account through the cookie on my computer. Or the real estate site that suddenly started sending me, daily, the results of a search I had saved… 10 years ago! (And I couldn't unsubscribe since I had officially deleted my account more than two years earlier.)

Marketing is, by definition, lying. People who work in marketing are liars. In marketing schools, you are taught to lie, as outrageously as possible. Everything they say must be considered a lie. We are surrounded by a cloud of lies. Our entire society is built on the rejection of any form of truth.

When a liar is caught red-handed, the detected lie is not the problem. The problem is, on the contrary, realising all the other lies that we never detected.

Turing's Hallucinations

The Turing test says that an artificial intelligence is truly intelligent if you cannot distinguish what it writes from what a human would write. Turing had not anticipated that the immense majority of human writing would be piles of lies produced by marketing departments. In this whole debate, what impresses me is not so much the intelligence of our computers as human stupidity…

It isn't even hidden. When a journalist at The Verge writes an article to say that a printer works well (because it prints, full stop, which is a crazy thing in the modern world), he is forced, by his own admission, to pad it with bullshit generated by ChatGPT so that the article looks serious in Google's eyes. He, at least, admits it and warns you not to read that part. But for one article like that, how many never say so?

Large language models (LLMs) are condemned to "hallucinate", the politically correct term for "saying absolutely anything". Exactly like marketing departments. Or like religions or any other superstition. Human beings have a propensity to love utter nonsense and to find arbitrary meaning in it (that's the phenomenon of pareidolia, which among other things lets us see shapes in clouds).

So, my question: what were you honestly expecting from artificial intelligence?

The Lies We Tell Ourselves

And what do you expect from humanity? Not only do we spend our time lying to ourselves, we also "debate" our own actions, which shouldn't even be open to discussion.

Here, let me give you an example of behaviour that we find "normal" simply because we are too stupid to tell marketers that they are dangerous, psychopathic liars, and so we tell ourselves they can't always be completely wrong.

Yes, I want to save the planet! Every day, I take ten or twenty euros out of my salary to donate to a huge multinational organisation that has children work in horrible conditions and is responsible for nearly 1% of all the planet's CO2 emissions. Not to mention the pollutants: every day I contribute to one of the biggest sources of pollution of groundwater and the oceans.

That's why I support them every day. It costs me a lot, but I keep going! And by supporting them, I help kill 8 million people every year, cause 50 times more cancers in France than any other form of pollution, and actively contribute to worsening every respiratory disease around me, especially among the youngest.

In short, I smoke.

On top of that, I stink. I bother everyone, but nobody dares tell me, because it's "my freedom". Non-smokers are so well-behaved that they don't dare tell me that, yes, it bothers them when I light one up. So to hell with non-smokers, to hell with asthmatics, to hell with the planet, because smoking is my freedom. Giving my money to one of the most polluting companies in the world is my freedom. Giving my children asthma is my freedom. In fact, I'll go march for the climate while lighting one up!

Tobacco is a very symptomatic example: it is extremely dangerous, extremely polluting, extremely harmful even to non-smokers, and has no practical justification. None, zero, nada. Banning the sale of tobacco to young people should not even be up for debate. Banning smoking within a hundred metres of a minor should strike you as obvious.

It doesn't. And many people find that simply telling a smoker "you stink, you're poisoning me" is "not very respectful", while suffering an asthma attack or a cancer caused by second-hand smoke is "normal". Socially, I don't understand why we don't regard someone who lights a cigarette in public the way we would regard someone who takes off his trousers and starts defecating in the middle of the street: as a filthy slob with no manners who fouls his surroundings (with the difference that excrement isn't carcinogenic).

To put things in perspective, tobacco amounts to three COVID crises every year. The annual deaths from second-hand smoke alone are on the same level as COVID's victims. But unlike with COVID, we take no measures, even though they would be easy and obvious. Worse, we actively encourage our youth to smoke!

The conclusion has to be that as long as we tolerate tobacco sales and smoking in public spaces, we don't actually want to save the planet or human lives. There, at least things are clear. It's not that we can't. We don't want to. Period. All right? Can we drop the pretence and detach the caps from our plastic bottles?

An engineer and writer, I explore the impact of technology on humans, both in writing and in my talks.

Receive my writings in French and English directly by email. Your address will never be shared. You can also use my French-language RSS feed or the complete RSS feed.

To support me, buy my books (if possible from your local bookshop)! I have just published a collection of short stories that should make you laugh and think. I am also part of the free and ethical "SF en VF" box set.

March 12, 2024

Nintendo killed the Switch emulator Yuzu. Sadly, the official Citra distribution disappeared as well because the two projects shared some developers. The good news is that Citra, being open source, is still available as archived builds and new forks. EmuDeck on the Steam Deck adapted to this new reality and adopted a bring-your-own Yuzu and Citra model. The changes to these emulators came at the same time as a big EmuDeck release, complicating things a little.

Last fall, we toured the Champagne region in France, famous for its sparkling wines. We explored the ancient, underground cellars where Champagne undergoes its magical transformation from grape juice to sparkling wine. These cellars, often 30 meters deep and kilometers long, maintain a constant temperature of around 10-12°C, providing the perfect conditions for aging and storing Champagne.

A glowing light bulb hanging 25 meters underground in a champagne tunnel, which often stretches for kilometers.

After sampling various Champagnes, we returned home with eight cases to store in our home's basement. However, unlike those deep cellars, our basement is just a few meters deep, prompting a simple question that sent me down a rabbit hole: how does our basement's temperature compare?

Rather than just buying a thermometer, I decided to build my own "temperature monitoring system" using open hardware and custom-built software. After all, who needs a simple solution when you can spend evenings tinkering with hardware, sensors, wires and writing your own software? Sometimes, more is more!

The basic idea is this: track the temperature and humidity of our basement every 15 minutes and send this information to a web service. This web service analyzes the data and alerts us if our basement becomes too cold or warm.

I launched this monitoring system around Christmas last year, so it's been running for nearly three months now. You can view the live temperature and historical data trends at https://dri.es/sensors. Yes, publishing our basement's temperature online is a bit quirky, but it's all in good fun.

A screenshot of my basement temperature dashboard: a webpage displaying temperature and humidity readings for our basement in Belgium.

So far, the temperature in our basement has been ideal for storing wine. However, I expect it will change during the summer months.

In the rest of this blog post, I'll share how I built the client that collects and sends the data, as well as the web service backend that processes and visualizes that data.

Hardware used

For this project, I bought:

  1. Adafruit ESP32-S3 Feather: A microcontroller board with Wi-Fi and Bluetooth capabilities, serving as the central processing unit of my project.
  2. Adafruit SHT4x sensor: A high-accuracy temperature and humidity sensor.
  3. 3.7v 500mAh battery: A small and portable power source.
  4. STEMMA QT / Qwiic JST SH 4-pin cable: To connect the sensor to the board without soldering.

The total hardware cost was $32.35 USD. I like Adafruit a lot, but it's worth noting that their products often come at a higher cost. You can find comparable hardware for as little as $10-15 elsewhere. Adafruit's premium cost is understandable, considering how much valuable content they create for the maker community.

An ESP32-S3 development board (middle) linked to an SHT41 temperature and humidity sensor (left) and powered by a battery pack (right). For scale, a 2 Euro coin is included: the SHT41 sensor is roughly the size of the coin, and the ESP32-S3 board is about twice the coin's diameter.

Client code for Adafruit ESP32-S3 Feather

I developed the client code for the Adafruit ESP32-S3 Feather using the Arduino IDE, a widely used platform for developing and uploading code to Arduino-compatible boards.

The code measures temperature and humidity every 15 minutes, connects to WiFi, and sends this data to https://dri.es/sensors, my web service endpoint.

One of my goals was to create a system that could operate for a long time without needing to recharge the battery. The ESP32-S3 supports a "deep sleep" mode where it powers down almost all its functions, except for the clock and memory. By placing the ESP32-S3 into deep sleep mode between measurements, I was able to significantly reduce power consumption.

Now that you understand the high-level design goals, including deep sleep mode, I'll share the complete client code below. It includes detailed code comments, making it self-explanatory.

#include "Adafruit_SHT4x.h"
#include "Adafruit_MAX1704X.h"
#include "WiFiManager.h"
#include "ArduinoJson.h"
#include "HTTPClient.h"

// The Adafruit_SHT4x sensor is a high-precision, temperature and humidity
// sensor with I2C interface.
Adafruit_SHT4x sht4 = Adafruit_SHT4x();

// The Adafruit ESP32-S3 Feather comes with a built-in MAX17048 LiPoly / LiIon
// battery monitor. The MAX17048 provides accurate monitoring of the battery's
// voltage. Utilizing the Adafruit library, not only helps us obtain the raw
// voltage data from the battery cell, but also converts this data into a more
// intuitive battery percentage or charge level. We will pass on the battery
// percentage to the web service endpoint, which can visualize it or use it to
// send notifications when the battery needs recharging.
Adafruit_MAX17048 maxlipo;

void setup() {
   Serial.begin(115200);

   // Wait for the serial connection to establish before proceeding further.
   // This is crucial for boards with native USB interfaces. Without this loop,
   // initial output sent to the serial monitor is lost. This code is not
   // needed when running on battery.
   //delay(1000);

   // Generates a unique device ID from a segment of the MAC address.
   // Since the MAC address is permanent and unchanged after reboots,
   // this guarantees the device ID remains consistent. To achieve a
   // compact ID, only a specific portion of the MAC address is used,
   // specifically the range between 0x10000 and 0xFFFFF. This range
   // translates to a hexadecimal string of a fixed 5-character length,
   // giving us roughly 1 million unique IDs. This approach balances
   // uniqueness with compactness.
   uint64_t chipid = ESP.getEfuseMac();
   uint32_t deviceValue = ((uint32_t)(chipid >> 16) & 0x0FFFFF) | 0x10000;
   char device[6]; // 5 characters for the hex representation + the null terminator.
   sprintf(device, "%x", deviceValue); // Use '%x' for lowercase hex letters

   // Initialize the SHT4x sensor:
  if (sht4.begin()) {
    Serial.println(F("SHT4 temperature and humidity sensor initialized."));
    sht4.setPrecision(SHT4X_HIGH_PRECISION);
    sht4.setHeater(SHT4X_NO_HEATER);
  }
  else {
   Serial.println(F("Could not find SHT4 sensor."));
  }

  // Initialize the MAX17048 sensor:
  if (maxlipo.begin()) {
    Serial.println(F("MAX17048 battery monitor initialized."));
  }
  else {
    Serial.println(F("Could not find MAX17048 battery monitor!"));
  }

  // Insert a short delay to ensure the sensors are ready and their data is stable:
  delay(200);

  // Retrieve temperature and humidity data from SHT4 sensor:
  sensors_event_t humidity, temp;
  sht4.getEvent(&humidity, &temp);

  // Get the battery percentage and calibrate if it's over 100%:
  float batteryPercent = maxlipo.cellPercent();
  batteryPercent = (batteryPercent > 100) ? 100 : batteryPercent;

  WiFiManager wifiManager;

  // Uncomment the following line to erase all saved WiFi credentials.
  // This can be useful for debugging or reconfiguration purposes.
  // wifiManager.resetSettings();
 
  // This WiFi manager attempts to establish a WiFi connection using known
  // credentials, stored in RAM. If it fails, the device will switch to Access
  // Point mode, creating a network named "Temperature Monitor". In this mode,
  // connect to this network, navigate to the device's IP address (default IP
  // is 192.168.4.1) using a web browser, and a configuration portal will be
  // presented, allowing you to enter new WiFi credentials. Upon submission,
  // the device will reboot and try connecting to the specified network with
  // these new credentials.
  if (!wifiManager.autoConnect("Temperature Monitor")) {
    Serial.println(F("Failed to connect to WiFi ..."));

    // If the device fails to connect to WiFi, it will restart to try again.
    // This approach is useful for handling temporary network issues. However,
    // in scenarios where the network is persistently unavailable (e.g. router
    // down for more than an hour, consistently poor signal), the repeated
    // restarts and WiFi connection attempts can quickly drain the battery.
    ESP.restart();

    // Mandatory delay to allow the restart process to initiate properly:
    delay(1000);
  }

  // Send collected data as JSON to the specified URL:
  sendJsonData("https://dri.es/sensors", device, temp.temperature, humidity.relative_humidity, batteryPercent);


  // WiFi consumes significant power so turn it off when done:
  WiFi.disconnect(true);
 
  // Enter deep sleep for 15 minutes. The ESP32-S3's deep sleep mode minimizes
  // power consumption by powering down most components, except the RTC. This
  // mode is efficient for battery-powered projects where constant operation
  // isn't needed. When the device wakes up after the set period, it runs
  // setup() again, as the state  isn't preserved.
  Serial.println(F("Going to sleep for 15 minutes ..."));
  ESP.deepSleep(15 * 60 * 1000000); // 15 mins * 60 secs/min * 1,000,000 μs/sec.
}

bool sendJsonData(const char* url, const char* device, float temperature, float humidity, float battery) {
  StaticJsonDocument<200> doc;

  // Round floating-point values to one decimal place for efficient data
  // transmission. This approach reduces the JSON payload size, which is
  // important for IoT applications running on batteries.
  doc["device"] = device;
  doc["temperature"] = String(temperature, 1);
  doc["humidity"] = String(humidity, 1);
  doc["battery"] = String(battery, 1);

  // Serialize JSON to a string:
  String jsonData;
  serializeJson(doc, jsonData);

  // Initialize an HTTP client with the provided URL:
  HTTPClient httpClient;
  httpClient.begin(url);
  httpClient.addHeader("Content-Type", "application/json");

  // Send a HTTP POST request:
  int httpCode = httpClient.POST(jsonData);

  // Close the HTTP connection:
  httpClient.end();

  // Print debug information to the serial console:
  Serial.println("Sent '" + jsonData + "' to " + String(url) + ", return code " + httpCode);
  return (httpCode == 200);
}

void loop() {
  // The ESP32-S3 resets and runs setup() after waking up from deep sleep,
  // making this continuous loop unnecessary.
}

Further optimizing battery usage

When I launched my thermometer around Christmas 2023, the battery was at 88%. Today, it is at 52%. Some quick math suggests it's using approximately 12% of its battery per month. Given its current rate of usage, it needs recharging about every 8 months.

Connecting to the WiFi and sending data are by far the main power drains. To extend the battery life, I could send updates less frequently than every 15 minutes, only send them when there is a change in temperature (which is often unchanged or only different by 0.1°C), or send batches of data points together. Any of these methods would work for my needs, but I haven't implemented them yet.
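As a rough illustration of the second option, here is a minimal sketch, not part of my current code, of how the last transmitted value could be remembered across deep sleep using the ESP32's RTC memory, so that an upload is skipped when the temperature hasn't moved:

// Variables marked RTC_DATA_ATTR are kept in RTC memory and survive deep
// sleep, so the last transmitted temperature can be remembered between
// wake-ups without any extra hardware.
RTC_DATA_ATTR float lastSentTemperature = -1000.0; // Sentinel forces the first send.

// Returns true when the new reading differs enough from the last one sent.
bool shouldSendReading(float temperature) {
  if (fabs(temperature - lastSentTemperature) < 0.1) {
    return false; // Change is below 0.1°C: skip this upload to save power.
  }
  lastSentTemperature = temperature;
  return true;
}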

Alternatively, I could hook the microcontroller up to a 5V power adapter, but where is the fun in that? It goes against the project's "more is more" principle.

Handling web service requests

With the client code running on the ESP32-S3 and sending sensor data to https://dri.es/sensors, the next step is to set up a web service endpoint to receive this incoming data.

As I use Drupal for my website, I implemented the web service endpoint in Drupal. Drupal uses Symfony, a popular PHP framework, for large parts of its architecture. This combination offers an easy but powerful way for implementing web services, similar to those found across other modern server-side web development frameworks like Laravel, Django, etc.

Here is what my Drupal routing configuration looks like:

sensors.sensor_data:
  path: '/sensors'
  methods: [POST]
  defaults:
    _controller: '\Drupal\sensors\Controller\SensorMonitorController::postSensorData'
  requirements:
    _access: 'TRUE'

The above configuration directs Drupal to send POST requests made to https://dri.es/sensors to the postSensorData() method of the SensorMonitorController class.

The implementation of this method handles request authentication, validates the JSON payload, and saves the data to a MariaDB database table. Pseudo-code:

public function postSensorData(Request $request) : JsonResponse {
  $content = $request->getContent();
  $data = json_decode($content, TRUE);

  // Validate the JSON payload:
  …

  // Authenticate the request:
  … 

  $device = DeviceFactory::getDevice($data['device']);
  if ($device) {
    $device->recordSensorEvent($data);
  }

  return new JsonResponse(['message' => 'Thank you!']);
 }

For testing your web service, you can use tools like cURL:

$ curl -X POST -H "Content-Type: application/json" -d '{"device":"0xdb123", "temperature":21.5, "humidity":42.5, "battery":90.0}' https://localhost/sensors

While cURL is great for quick tests, I use PHPUnit tests for automated testing in my CI/CD workflow. This ensures that everything keeps working, even when upgrading Drupal, Symfony, or other components of my stack.
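For reference, a bare-bones test along those lines could look like the sketch below. It is a hypothetical example rather than my actual test suite: it assumes a local test site at https://localhost/sensors, uses Guzzle (which ships with Drupal) instead of Drupal's test base classes, leaves out authentication just like the pseudo-code above, and the device ID is made up:

<?php

use GuzzleHttp\Client;
use PHPUnit\Framework\TestCase;

class SensorEndpointTest extends TestCase {

  public function testPostSensorData(): void {
    // Post a fake reading to the endpoint, mirroring the cURL example above.
    $client = new Client(['base_uri' => 'https://localhost', 'verify' => FALSE]);
    $response = $client->post('/sensors', [
      'json' => ['device' => '0xdb123', 'temperature' => 21.5, 'humidity' => 42.5, 'battery' => 90.0],
    ]);

    // The controller replies with a 200 status and a small JSON body.
    $this->assertSame(200, $response->getStatusCode());
    $this->assertStringContainsString('Thank you!', (string) $response->getBody());
  }

}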

Storing sensor data in a database

The primary purpose of $device->recordSensorEvent() in SensorMonitorController::postSensorData() is to store sensor data into a SQL database. So, let's delve into the database design.

My main design goals for the database backend were:

  1. Instead of storing every data point indefinitely, only keep the daily average, minimum, maximum, and the latest readings for each sensor type across all devices.
  2. Make it easy to add new devices and new sensors in the future. For instance, if I decide to add a CO2 sensor for our bedroom one day (a decision made in my head but not yet pitched to my better half), I want that to be easy.

To this end, I created the following MariaDB table:

CREATE TABLE sensor_data (
  date DATE,
  device VARCHAR(255),
  sensor VARCHAR(255),
  avg_value DECIMAL(5,1),
  min_value DECIMAL(5,1),
  max_value DECIMAL(5,1),
  min_timestamp DATETIME,
  max_timestamp DATETIME,
  readings SMALLINT NOT NULL,
  UNIQUE KEY unique_stat (date, device, sensor)
);

A brief explanation for each field:

  • date: The date for each sensor reading. It doesn't include a time component as we aggregate data on a daily basis.
  • device: The device ID of the device providing the sensor data, such as 'basement' or 'bedroom'.
  • sensor: The type of sensor, such as 'temperature', 'humidity' or 'co2'.
  • avg_value: The average value of the sensor readings for the day. Since individual readings are not stored, a rolling average is calculated and updated with each new reading using the formula: avg_value = avg_value + (new_value - avg_value) / new_total_readings. This method can accumulate minor rounding errors, but simulations show these are negligible for this use case.
  • min_value and max_value: The daily minimum and maximum sensor readings.
  • min_timestamp and max_timestamp: The exact moments when the minimum and maximum values for that day were recorded.
  • readings: The number of readings (or measurements) taken throughout the day, which is used for calculating the rolling average.

In essence, the recordSensorEvent() method needs to determine if a record already exists for the current date. Depending on this determination, it will either insert a new record or update the existing one.

In Drupal this process is streamlined with the merge() function in Drupal's database layer. This function handles both inserting new data and updating existing data in one step.

private function updateDailySensorEvent(string $sensor, float $value): void {
  $timestamp = \Drupal::time()->getRequestTime();
  $date = date('Y-m-d', $timestamp);
  $datetime = date('Y-m-d H:i:s', $timestamp);

  $connection = Database::getConnection();

  $result = $connection->merge('sensor_data')
    ->keys([
     'device' => $this->id,
     'sensor' => $sensor,
     'date' => $date,
    ])
    ->fields([
     'avg_value' => $value,
     'min_value' => $value,
     'max_value' => $value,
     'min_timestamp' => $datetime,
     'max_timestamp' => $datetime,
     'readings' => 1,
    ])
    ->expression('avg_value', 'avg_value + ((:new_value - avg_value) / (readings + 1))', [':new_value' => $value])
    ->expression('min_value', 'LEAST(min_value, :value)', [':value' => $value])
    ->expression('max_value', 'GREATEST(max_value, :value)', [':value' => $value])
    ->expression('min_timestamp', 'IF(LEAST(min_value, :value) = :value, :timestamp, min_timestamp)', [':value' => $value, ':timestamp' => $datetime])
    ->expression('max_timestamp', 'IF(GREATEST(max_value, :value) = :value, :timestamp, max_timestamp)', [':value' => $value, ':timestamp' => $datetime])
    ->expression('readings', 'readings + 1')
    ->execute();
 }

Here is what the query does:

  • It checks if a record for the current sensor and date exists.
  • If not, it creates a new record with the sensor data, including the initial average, minimum, maximum, and latest value readings, along with the timestamp for these values.
  • If a record does exist, it updates the record with the new sensor data, adjusting the average value, and updating minimum and maximum values and their timestamps if the new reading is a new minimum or maximum.
  • The function also increments the count of readings.

For those not using Drupal, similar functionality can be achieved with MariaDB's INSERT ... ON DUPLICATE KEY UPDATE command, which allows for the same conditional insert or update logic based on whether the specified unique key already exists in the table.

Here are example queries, extracted from MariaDB's General Query Log to help you get started:

INSERT INTO sensor_data (device, sensor, date, min_value, min_timestamp, max_value, max_timestamp, readings) 
VALUES ('0xdb123', 'temperature', '2024-01-01', 21, '2024-01-01 00:00:00', 21, '2024-01-01 00:00:00', 1);

UPDATE sensor_data 
SET min_value = LEAST(min_value, 21), 
   min_timestamp = IF(LEAST(min_value, 21) = 21, '2024-01-01 00:00:00', min_timestamp), 
   max_value = GREATEST(max_value, 21), 
   max_timestamp = IF(GREATEST(max_value, 21) = 21, '2024-01-01 00:00:00', max_timestamp), 
   readings = readings + 1
WHERE device = '0xdb123' AND sensor = 'temperature' AND date = '2024-01-01';
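If you prefer a single statement, the same conditional logic can be folded into one upsert. The sketch below is not taken from my logs; it relies on the unique_stat key on (date, device, sensor), and the assignment order matters because MariaDB evaluates the assignments left to right (each timestamp is updated before its corresponding value):

INSERT INTO sensor_data
  (date, device, sensor, avg_value, min_value, max_value, min_timestamp, max_timestamp, readings)
VALUES
  ('2024-01-01', '0xdb123', 'temperature', 21, 21, 21, '2024-01-01 00:00:00', '2024-01-01 00:00:00', 1)
ON DUPLICATE KEY UPDATE
  avg_value = avg_value + ((VALUES(avg_value) - avg_value) / (readings + 1)),
  min_timestamp = IF(VALUES(min_value) < min_value, VALUES(min_timestamp), min_timestamp),
  min_value = LEAST(min_value, VALUES(min_value)),
  max_timestamp = IF(VALUES(max_value) > max_value, VALUES(max_timestamp), max_timestamp),
  max_value = GREATEST(max_value, VALUES(max_value)),
  readings = readings + 1;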

Generating graphs

With the data securely stored in the database, the next step involved generating the graphs. To accomplish this, I wrote some custom PHP code that generates Scalable Vector Graphics (SVGs).

Given that this blog post is already quite long, I'll spare you the details. For now, those curious can use the 'View source' feature in their web browser to examine the SVGs on the thermometer page.
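To give a rough idea anyway, a toy version of the approach, not my actual code and reduced to a single polyline, could look like this:

function renderSparkline(array $values, int $width = 400, int $height = 100): string {
  // Scale the readings so the smallest value sits at the bottom of the graph
  // and the largest at the top.
  $min = min($values);
  $max = max($values);
  $range = max($max - $min, 0.1); // Avoid division by zero on flat data.

  $points = [];
  foreach (array_values($values) as $i => $value) {
    $x = $i * $width / max(count($values) - 1, 1);
    $y = $height - (($value - $min) / $range) * $height;
    $points[] = round($x, 1) . ',' . round($y, 1);
  }

  // Return a self-contained SVG string that can be embedded in the page.
  return '<svg xmlns="http://www.w3.org/2000/svg" width="' . $width . '" height="' . $height . '">'
    . '<polyline fill="none" stroke="currentColor" points="' . implode(' ', $points) . '"/>'
    . '</svg>';
}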

Conclusion

It's funny how a visit to the Champagne cellars in France sparked an unexpected project. Choosing to build a thermometer rather than buying one allowed me to dive back into an old passion for hardware and low-level software.

I also like taking control of my own data and software. It gives me a sense of control and creativity.

As Drupal's project lead, using Drupal for an Internet-of-Things (IoT) backend brought me unexpected joy. I just love the power and flexibility of open-source platforms like Drupal.

As a next step, I hope to design and 3D print a case for my thermometer, something I've never done before. And as mentioned, I'm also considering integrating additional sensors. Stay tuned for updates!

March 05, 2024

Readings: A Small Ecology of Education and Computing

On the Importance of Handwriting

Writing by hand, whether with a pen, a pencil, a marker or whatever you like, is an essential step in the development of the brain and in the future understanding of written language.

The literature on this subject seems unanimous, but the study cited above goes even further by measuring neuronal activity while writing by hand versus with a keyboard. There is no contest: learning to write must therefore happen first, and this is essential, by writing by hand and by deciphering different styles of handwriting.

Then, if computers interest you, I very strongly recommend learning touch typing. It only takes a few weeks of effort and it completely changes how you interact with a computer. As a reminder, touch typing on a keyboard rests on two fundamental principles:

First, each key corresponds to a particular finger. You don't use your fingers at random.

Second, touch typing absolutely must be learned blind. It is not about knowing the keyboard layout by heart or visualising it. It is about creating a muscle reflex, a movement of a particular finger for each letter.

That is how I learned to type in Bépo, and it is, I think, the best investment of time I have ever made in my life. Typing blind, at the pace of my thoughts, turned writing into a true extension of my brain. I don't write down what I am thinking about; my thoughts simply appear on the screen.

I type everything in Bépo in Vim, because Vim lets me extend those typing reflexes to actions on the text itself: moving around, deleting, replacing, copying and pasting. My children may still be a bit young to take up Vim, but if the urge to Vim ever tickles you, I recommend Vincent Jousse's short manual.

The Digitalisation of Education

The blind "digitalisation" of classrooms is an absolute heresy. Especially since the vast majority of teachers are completely incompetent with computers and don't themselves understand what a computer is (which is normal and to be expected; they were never trained for it).

At my children's primary school, they are very proud to offer lessons in… PowerPoint!

So, two important reminders:

First, learning to use commercial software is not computer science. It is learning to use a commercial tool, a skill that does not generalise and is therefore fundamentally useless in the long run. The children learned to click on two buttons? In the next version, the buttons will be somewhere else and the children will have no intuitive understanding of their tool. And that is probably because the teacher has none either; he is simply the teacher at the school who "quite likes computers" and can click on .exe files without breaking into a sweat, so he gets handed the role.

Second, if you want a child to get by with any piece of software, there is a very simple solution: let them at it. Seriously, let them fiddle, experiment, do whatever they want. In the case of PowerPoint, I am certain that if you leave a class alone with the software for an hour, they will know more about it than the struggling adult. Put computers in their hands that they are allowed to "break".

That said, there is a huge problem with the current generation, who are handed a tablet practically from the incubator: they have never tinkered. They were born with devices covered in big icons you just tap to buy applications. Devices deliberately designed to prevent you from understanding how they work and that cannot be "broken". Proprietary tools are, in essence, black boxes that are meant to be arbitrary and incomprehensible.

That is not computer science. It is not useful knowledge. It is not the school's role to deal with it.

I think a lot about how computer science should be taught. And I have come to a conclusion: learning the basics of computing should be done without a computer. There are so many fun things to do on a blackboard: counting in binary, writing little algorithms, forming groups of children who each apply an algorithm and seeing what happens when the "data" is passed around in one order rather than another…

The Concept of the Black Box

The engineer tries to use a black box they have learned to interact with thanks to the scientist. The scientist, for their part, tries to understand how the inside of the black box works (a box which itself contains other black boxes).

Beware of people who sell you black boxes but are astonished that you would ask what is inside.

Social networks

Beyond understanding computers, it is true that the new generations have to learn to live in a world of social networks. But, once again, most adults understand nothing about them and try to impose their narrow vision of something they have not understood.

I once wrote that a true social network cannot be a success. Not everyone can be on it; otherwise it is no longer really a social network.

Winter puts this feeling into words very well: on each platform, we have a different identity, a creativity that sometimes comes with a great deal of reserve.

After all, when I created this blog, I conscientiously avoided any reference to my official name so that I could express myself without fear of being judged by those close to me. Only little by little did I gain the confidence to link the identity of Ploum with that of Lionel Dricot. I ran two other websites, now gone, one of which was a blog before the word existed, under identities that were never tied to me. In his book "Mémoires Vives" ("Permanent Record"), Edward Snowden insists on this multi-identity aspect as fundamental to his vocation. Aaron Swartz also used these tools to help define the RSS standard while hiding the fact that he was still a teenager.

That apprenticeship, those freedoms and those explorations of one's own identities are unfortunately completely lost with centralizing social networks, whose very purpose is to impose one and only one identity.

The only thing young people can do now is create an account under a "realistic" fake name, which encourages them to pass themselves off as a classmate and thereby harm one another, much to the dismay of schools and parents, who have to take forceful measures to make clear that "impersonating someone else is just not done".

No, it is not done. But inventing identities for yourself is. Creating different universes for yourself that interact in different communities.

And we sold that freedom back in exchange for being surveilled by advertising, with the obligation to use our real name everywhere because "it was easier".

JavaScript bloat

And even on the "easier" front, we got swindled by the "ever-changing black box" side of things.

Niki, who blogs at tonsky.me, had fun measuring the amount of JavaScript that major sites load… by default! On the "minimalist" Medium, it is 3 MB. On LinkedIn, it is 31 MB.

As a reminder, there is zero JavaScript on ploum.net. In fact, for relatively similar output (text aligned in the middle of a screen), you would have to download 1,000 times more data to read one of my posts on Medium and 10,000 times more data to read one of my posts on LinkedIn.

On top of the download time, your device's processor would be put under heavy strain for a few tenths of a second, or even whole seconds, increasing power consumption (significantly) and giving you a slight impression of slowness or strain while the page renders. If you have a somewhat old device or a somewhat poor connection, these difficulties are multiplied many times over.

Oh, and on top of all that, on Medium and LinkedIn you are thoroughly tracked, and the data about your reading goes to fatten thousands of marketing databases.

There is even a good chance that, to read the article on Medium or LinkedIn, you use the "reader" mode of your browser or of some other piece of software. A mode which, after downloading and computing everything, will try to extract the article's content and display it in a style similar to ploum.net.

All that back and forth, when it is so simple, as a webmaster, to serve the text directly, without frills. To make everyone's life simpler…

Waste and security

So these systems are heavier, more power-hungry, and much more complicated to produce. Why do we produce them?

But that is not all! Their complexity increases the potential attack surface and therefore the number of security vulnerabilities. A delight for script kiddies. Except that script kiddies are no longer even needed: artificial intelligences can now exploit security flaws automatically.

So we are going to get, on the one hand, automatically generated spam/SEO sites and, on the other, legitimate sites that have been hacked and had spam/SEO content injected into them.

There you have it; don't tell me I didn't warn you.

Ecology

The hardest part of all this is certainly the social pressure. If you like living as a hermit, as I do, it is easy to refuse WhatsApp or Google. But when you are a teenager, you could not care less about those principles. You want to be part of the group. To have an iPhone. To be on TikTok. To play the latest trendy game while talking about the latest YouTuber sponsored by an alcohol or tobacco brand. To wear branded clothes made by children in basements in Thailand.

For adolescents, moral precepts and prohibitions exist to be circumvented (and that is a good thing). All we can offer is education: an understanding of the stakes and consequences of the actions taken in the world they live in.

In short, in computing as in every other field, we must teach ecology.

But for that, we first have to understand it ourselves.

We live in a world where rainwater is now contaminated by dangerous pollutants… everywhere on the planet! There is no longer a single drop of rainwater that is considered safe to drink.

So perhaps we have to accept that we have no ecology lessons to give our children…

An engineer and writer, I explore the impact of technology on humans, both in writing and in my talks.

Receive my writings in French and English directly by email. Your address will never be shared. You can also use my French-language RSS feed or the full RSS feed.

To support me, buy my books (if possible from your local bookshop)! I have just published a collection of short stories that should make you laugh and think. I am also part of the free and ethical box set "SF en VF".

March 04, 2024

Next week, I'm traveling to Japan and Australia. I've been to both countries before and can't wait to return – they're among my favorite places in the world.

My goal is to connect with the local Drupal community in each country, discussing the future of Drupal, learning from each other, and collaborating.

I'll also be connecting with Acquia's customers and partners in both countries, sharing our vision, strategy and product roadmap. As part of that, I look forward to spending some time with the Acquia teams as well – about 20 employees in Japan and 35 in Australia.

I'll present at a Drupal event in Tokyo the evening of March 14th at Yahoo! Japan.

While in Australia, I'll be attending Drupal South, held at the Sydney Masonic Centre from March 20-22. I'm excited to deliver the opening keynote on the morning of March 20th, where I'll delve into Drupal's past, present, and future.

I look forward to being back in Australia and Japan, reconnecting with old friends and the local communities.

February 29, 2024

Two people on a platform observe a decentralized web of nodes.

Two years ago, I launched a simple Web3 website using IPFS (InterPlanetary File System) and ENS (Ethereum Name Service). Back then, Web3 tools were getting a lot of media attention and I wanted to try it out.

Since setting up my Web3 website two years ago, I have basically forgotten about it: I didn't update it or pay any attention to it. But now that we've hit the two-year mark, I'm curious: is my Web3 website still online?

At that time, I also stated that Web3 was not fit for hosting modern web applications, except for a small niche: static sites requiring high resilience and infrequent content updates.

I was also curious to explore the evolution of Web3 technologies to see if they became more applicable for website hosting.

My original Web3 experiment

In my original blog post, I documented the process of setting up what could be called the "Hello World" of Web3 hosting. I stored an HTML file on IPFS, ensured its availability using "pinning services", and made it accessible using an ENS domain.

For those with a basic understanding of Web3, here is a summary of the steps I took to launch my first Web3 website two years ago:

  1. Purchased an ENS domain name: I used a crypto wallet with Ethereum to acquire dries.eth through the Ethereum Name Service, a decentralized alternative to the traditional DNS (Domain Name System).
  2. Uploaded an HTML File to IPFS: I uploaded a static HTML page to the InterPlanetary File System (IPFS), which involved running my own IPFS node and utilizing various pinning services like Infura, Fleek, and Pinata. These pinning services ensure that the content remains available online even when my own IPFS node is offline.
  3. Accessed the website: I confirmed that my website was accessible through IPFS-compatible browsers.
  4. Mapped my webpage to my domain name: As the last step, I linked my IPFS-hosted site to my ENS domain dries.eth, making the web page accessible under an easy domain name.

If the four steps above are confusing to you, I recommend reading my original post. It is over 2,000 words, complete with screenshots and detailed explanations of the steps above.
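
Step 2 above is where pinning comes in. As a rough, minimal sketch of what that can look like with the IPFS command line (Kubo): the file name, the service label "pinata", its endpoint URL and the API token below are illustrative placeholders; every pinning service documents its own remote-pinning endpoint and credentials.

# Register a remote pinning service with the local IPFS node
# (service name, endpoint and token are placeholders).
$ ipfs pin remote service add pinata https://api.pinata.cloud/psa <API_TOKEN>

# Add the page to IPFS, then pin the resulting CID on the remote service
# so it stays available even when this node goes offline.
$ ipfs add --cid-version=1 index.html
$ ipfs pin remote add --service=pinata --name=my-web3-page <CID>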

Checking the pulse of various Web3 services

As the first step in my check-up, I wanted to verify if the various services I referenced in my original blog post are still operational.

The results, displayed in the table below, are really encouraging: Ethereum, ENS, IPFS, Filecoin, Infura, Fleek, Pinata, and web3.storage are all operational.

The two main technologies – ENS and IPFS – are both actively maintained and developed. This indicates that Web3 technology has built a robust foundation.

Service      | Description                                                                                              | Still around in February 2024?
ENS          | A blockchain-based naming protocol offering DNS for Web3, mapping domain names to Ethereum addresses.   | Yes
IPFS         | A peer-to-peer protocol for storing and sharing data in a distributed file system.                      | Yes
Filecoin     | A blockchain-based storage network and cryptocurrency that incentivizes data storage and replication.   | Yes
Infura       | Provides tools and infrastructure to manage content on IPFS and other tools for developers to connect their applications to blockchain networks and deploy smart contracts. | Yes
Fleek        | A platform for building websites using IPFS and ENS.                                                    | Yes
Pinata       | Provides tools and infrastructure to manage content on IPFS, and more recently Farcaster applications.  | Yes
web3.storage | Provides tools and infrastructure to manage content on IPFS with support for Filecoin.                  | Yes

Is my Web3 website still up?

Seeing all these Web3 services operational is encouraging, but the ultimate test is to check if my Web3 webpage, dries.eth, remained live. It's one thing for these services to work, but another for my site to function properly. Here is what I found in a detailed examination:

  1. Domain ownership verification: A quick check on etherscan.io confirmed that dries.eth is still registered to me. Relief!
  2. ENS registrar access: Using my crypto wallet, I could easily log into the ENS registrar and manage my domains. I even successfully renewed dries.eth as a test.
  3. IPFS content availability: My webpage is still available on IPFS, thanks to having pinned it two years ago. Logging into Fleek and Pinata, I found my content on their admin dashboards.
  4. Web3 and ENS gateway access: I can visit dries.eth using a Web3 browser, and also via an IPFS-compatible ENS gateway like https://dries.eth.limo/ – a privacy-centric service, new since my initial blog post.

The verdict? Not only are these Web3 services still operational, but my webpage also continues to work!

This is particularly noteworthy given that I haven't logged in to these services, performed any maintenance, or paid any hosting fees for two years (the pinning services I'm using have a free tier).

Visit my Web3 page yourself

For anyone interested in visiting my Web3 page (perhaps your first Web3 visit?), there are several methods to choose from, each with a different level of Web3-ness.

  • Use a Web3-enabled browser: Browsers such as Brave and Opera offer built-in ENS and IPFS support. They can resolve ENS addresses and interpret IPFS addresses, making it as easy to navigate IPFS content as traditional web content served over HTTP or HTTPS.
  • Install a Web3 browser extension: If your favorite browser does not support Web3 out of the box, adding a browser extension like MetaMask can help you access Web3 applications. MetaMask works with Chrome, Firefox, and Edge. It enables you to use .eth domains for doing Ethereum transactions or for accessing content on IPFS.
  • Access through an ENS gateway: For those looking for the simplest way to access Web3 content without installing anything new, using an ENS gateway, such as eth.limo, is the easiest method. This gateway maps ENS domains to DNS, offering direct navigation to Web3 sites like mine at https://dries.eth.limo/. It serves as a simple bridge between Web2 (the conventional web) and Web3; a quick command-line check against this gateway is sketched below.
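
As a minimal sanity check of that last option, assuming nothing more than curl on the client side, you can fetch the page through the gateway from a terminal:

# Request only the response headers from the eth.limo gateway;
# a 200 status means the ENS name resolved and the IPFS-hosted page was served.
$ curl -sI https://dries.eth.limo/

# Or fetch the page body itself.
$ curl -s https://dries.eth.limo/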

Streamlining content updates with IPNS

In my original post, I highlighted various challenges, such as the limitations for hosting dynamic applications, the cost of updates, and the slow speed of these updates. Although these issues still exist, my initial analysis was conducted with an incomplete understanding of the available technology. I want to delve deeper into these limitations, and refine my previous statements.

Some of these challenges stem from the fact that IPFS operates as a "content-addressed network". Unlike traditional systems that use URLs or file paths to locate content, IPFS uses a unique hash of the content itself. This hash is used to locate and verify the content, but also to facilitate decentralized storage.

While the principle of addressing content by a hash is super interesting, it also introduces some complications: whenever content is updated, its hash changes, making it tricky to link to the updated content. Specifically, every time I updated my Web3 site's content, I had to update my ENS record and pay a transaction fee on the Ethereum network.
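
To make the consequence concrete, here is a small sketch with the IPFS CLI. The file name is an assumption, and --cid-version=1 is only there to produce the base32 "bafy…" style CIDs used elsewhere in this post:

$ ipfs add --cid-version=1 index.html
added bafybeibbkhmln7o4ud6an4qk6bukcpri7nhiwv6pz6ygslgtsrey2c3o3q index.html

# Change even a single byte and add the file again: a completely different
# CID comes back, so anything that referenced the old CID (such as an ENS
# record) has to be updated.
$ echo ' ' >> index.html
$ ipfs add --cid-version=1 index.html
added <a completely different CID> index.html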

At the time, I wasn't familiar with the InterPlanetary Name System (IPNS). IPNS, not to be confused with IPFS, addresses this challenge by assigning a mutable name to content on IPFS. You can think of IPNS as providing an "alias" or "redirect" for IPFS addresses: the IPNS address always stays the same and points to the latest IPFS address. It effectively eliminates the necessity of updating ENS records with each content change, cutting down on expenses and making the update process more automated and efficient.

To leverage IPNS, you have to take the following steps:

  1. Upload your HTML file to IPFS and receive an IPFS hash.
  2. Publish this hash to IPNS, creating an IPNS hash that directs to the latest IPFS hash.
  3. Link your ENS domain to this IPNS hash. Since the IPNS hash remains constant, you only need to update your ENS record once.

Without IPNS, updating content involved:

  1. Update the HTML file.
  2. Upload the revised file to IPFS, generating a new IPFS hash.
  3. Update the ENS record with the new IPFS hash, which costs some Ether and can take a few minutes.

With IPNS, updating content involves:

  1. Update the HTML file.
  2. Upload the revised file to IPFS, generating a new IPFS hash.
  3. Update the IPNS record to reference this new hash, which is free and almost instant.

Although IPNS is a faster and more cost-effective approach compared to the original method, it still carries a level of complexity. There is also a minor runtime delay due to the extra redirection step. However, I believe this tradeoff is worth it.
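
A sketch of what that update loop can look like with the IPFS CLI, assuming the IPNS name was published with the node's default "self" key (file name and placeholders are illustrative):

# Step 2: add the revised file, which yields a new CID.
$ ipfs add --cid-version=1 index.html

# Step 3: point the existing IPNS name at the new CID; the IPNS address
# itself does not change, so the ENS record can stay as it is.
$ ipfs name publish --key=self /ipfs/<new CID>

Because the IPNS address never changes, no new Ethereum transaction is needed for content updates.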

Updating my Web3 site to use IPNS

With this newfound knowledge, I decided to use IPNS for my own site. I generated an IPNS hash using both the IPFS desktop application (see screenshot) and IPFS' command line tools:

$ ipfs name publish /ipfs/bafybeibbkhmln7o4ud6an4qk6bukcpri7nhiwv6pz6ygslgtsrey2c3o3q
> Published to k51qzi5uqu5dgy8mzjtcqvgr388xjc58fwprededbb1fisq1kvl34sy4h2qu1a: /ipfs/bafybeibbkhmln7o4ud6an4qk6bukcpri7nhiwv6pz6ygslgtsrey2c3o3q
The IPFS Desktop application showing my index.html file with an option to 'Publish to IPNS'.

After generating the IPNS hash, I was able to visit my site in Brave using the IPFS protocol at ipfs://bafybeibbkhmln7o4ud6an4qk6bukcpri7nhiwv6pz6ygslgtsrey2c3o3q, or via the IPNS protocol at ipns://k51qzi5uqu5dgy8mzjtcqvgr388xjc58fwprededbb1fisq1kvl34sy4h2qu1a.

My Web3 site in Brave using IPNS.

Next, I updated the ENS record for dries.eth to link to my IPNS hash. This change cost me 0.0011 ETH (currently $4.08 USD), as shown in the Etherscan transaction. Once the transaction was processed, dries.eth began directing to the new IPNS address.

A transaction confirmation on the ENS website, showing a successful update for dries.eth.

Rolling back my IPNS record in ENS

Unfortunately, my excitement was short-lived. A day later, dries.eth stopped working. IPNS records, it turns out, need to be kept alive – a lesson learned the hard way.

While IPFS content can be persisted through "pinning", IPNS records require periodic "republishing" to remain active. Essentially, the network's Distributed Hash Table (DHT) may drop IPNS records after a certain amount of time, typically 24 hours. To prevent an IPNS record from being dropped, the owner must "republish" it before the DHT forgets it.
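
As far as I understand it, "republishing" simply means announcing the same IPNS-to-IPFS mapping again from a node that holds the key; a running IPFS daemon does this on its own while it is online. A blunt fallback, sketched here with an illustrative schedule and the CID of my page, is a cron entry on an always-on machine running an IPFS node:

# Re-announce the IPNS record twice a day from a node that has the key.
0 */12 * * *  ipfs name publish --key=self /ipfs/bafybeibbkhmln7o4ud6an4qk6bukcpri7nhiwv6pz6ygslgtsrey2c3o3q

None of this helps, of course, if the only node holding the key is regularly offline, which is exactly the situation described below.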

I found out that the pinning services I use – Dolphin, Fleek and Pinata – don't support IPNS republishing. Looking into it further, it turns out few IPFS providers do.

During my research, I discovered Filebase, a small Boston-based company with fewer than five employees that I hadn't come across before. Interestingly, they provide both IPFS pinning and IPNS republishing. However, to pin my existing HTML file and republish its IPNS hash, I had to subscribe to their service at a cost of $20 per month.

Faced with the challenge of keeping my IPNS hash active, I found myself at a crossroads: either fork out $20 a month for a service like Filebase that handles IPNS republishing for me, or take on the responsibility of running my own IPFS node.

Of course, the whole point of decentralized storage is that people run their own nodes. However, considering the scope of my project – a single HTML file – the effort of running a dedicated node seemed disproportionate. I'm also running my IPFS node on my personal laptop, which is not always online. Maybe one day I'll try setting up a dedicated IPFS node on a Raspberry Pi or similar setup.

Ultimately, I decided to switch my ENS record back to the original IPFS link. This change, documented in the Etherscan transaction, cost me 0.002 ETH (currently $6.88 USD).

Although IPNS works, or can work, it just didn't work for me. Despite the setback, the whole experience was a great learning journey.

(Update: A couple of days after publishing this blog post, someone kindly recommended https://dwebservices.xyz/, claiming their free tier includes IPNS republishing. Although I haven't personally tested it yet, a quick look at their about page suggests they might be a promising solution.)

Web3 remains too complex for most people

Over the past two years, Web3 hosting hasn't disrupted the mainstream website hosting market. Despite the allure of Web3, mainstream website hosting is simple, reliable, and meets the needs of nearly all users.

Despite a significant upgrade of the Ethereum network that reduced energy consumption by over 99% through its transition to a Proof of Stake (PoS) consensus mechanism, environmental considerations, especially the carbon footprint associated with blockchain technologies, continue to create further challenges for the widespread adoption of Web3 technologies. (Note: ENS operates on the blockchain but IPFS does not.)

As I went through the check-up, I discovered islands of innovation and progress. Wallets and ENS domains got easier to use. However, the overall process of creating a basic website with IPFS and ENS remains relatively complex compared to the simplicity of Web2 hosting.

The need for a SQL-compatible Web3 database

Modern web applications like those built with Drupal and WordPress rely on a technology stack that includes a file system, a domain name system (e.g. DNS), a database (e.g. MariaDB or MySQL), and a server-side runtime environment (e.g. PHP).

While IPFS and ENS offer decentralized alternatives for the first two, the equivalents for databases and runtime environments are less mature. This limits the types of applications that can easily move from Web2 to Web3.

A major breakthrough would be the development of a decentralized database that is compatible with SQL, but currently, this does not seem to exist. The complexity of ensuring data integrity and confidentiality across multiple nodes without a central authority, along with meeting the throughput demands of modern web applications, may be too complex to solve.

After all, blockchains, as decentralized databases, have been in development for over a decade, yet lack support for the SQL language and fall short in speed and efficiency required for dynamic websites.

The need for a distributed runtime

Another critical component for modern websites is the runtime environment, which executes the server-side logic of web applications. Traditionally, this has been the domain of PHP, Python, Node.js, Java, etc.

WebAssembly (WASM) could emerge as a potential solution. It could make for an interesting decentralized solution as WASM binaries can be hosted on IPFS.

However, when WASM runs on the client-side – i.e. in the browser – it can't deliver the full capabilities of a server-side environment. This limitation makes it challenging to fully replicate traditional web applications.

So for now, Web3's applications are quite limited. While it's possible to host static websites on IPFS, dynamic applications requiring database interactions and server-side processing are difficult to transition to Web3.

Bridging the gap between Web2 and Web3

In the short term, the most likely path forward is blending decentralized and traditional technologies. For example, a website could store its static files on IPFS while relying on traditional Web2 solutions for its dynamic features.

Looking to the future, initiatives like OrbitDB's peer-to-peer database, which integrates with IPFS, show promise. However, OrbitDB lacks compatibility with SQL, meaning applications would need to be redesigned rather than simply transferred.

Web3 site hosting remains niche

Even the task of hosting static websites, which don't need a database or server-side processing, is relatively niche within the Web3 ecosystem.

As I wrote in my original post: "In its current state, IPFS and ENS offer limited value to most website owners, but tremendous value to a very narrow subset of all website owners." This observation remains accurate today.

IPFS and ENS stand out for their strengths in censorship resistance and reliability. However, for the majority of users, the convenience and adequacy of Web2 for hosting static sites often outweigh these benefits.

The key to broader acceptance of new technologies, like Web3, hinges on either discovering new mass-market use cases or significantly enhancing the user experience for existing ones. Web3 has not found a universal application or surpassed Web2 in user experience.

The popularity of SaaS platforms underscores this point. They dominate not because they're the most resilient or robust options, but because they're the most convenient. Despite the benefits of resilience and autonomy offered by Web3, most individuals opt for less resilient but more convenient SaaS solutions.

Conclusion

Despite the billions invested in Web3 and notable progress, its use for website hosting still has significant limitations.

The main challenge for the Web3 community is to either develop new, broadly appealing applications or significantly improve the usability of existing technologies.

Website hosting falls into the category of existing use cases.

Unfortunately, Web3 remains mostly limited to static websites, as it does not yet offer robust alternatives to SQL databases and server-side runtime.

Even within the limited scope of static websites, improvements to the user experience have been marginal, focused on individual parts of the technology stack. The overall end-to-end experience remains complex.

Nonetheless, the fact that my Web3 page is still up and running after two years is encouraging, showing the robustness of the underlying technology, even if its current use remains limited. I've grown quite fond of IPFS, and I hope to do more useful experiments with it in the future.

All things considered, I don't see Web3 taking the website hosting world by storm any time soon. That said, over time, Web3 could become significantly more attractive and functional. All in all, keeping an eye on this space is definitely fun and worthwhile.

February 26, 2024

A graph showing the state of the Digital Experience Platforms in 2024. Vendors are plotted on a grid based on their ability to execute and completeness of vision. Acquia is placed in the 'Leaders' quadrant, indicating strong performance in both vision and execution.

For the fifth year in a row, Acquia has been named a Leader in the Gartner Magic Quadrant for Digital Experience Platforms (DXP).

Acquia received this recognition from Gartner based on both the completeness of product vision and ability to execute.

Central to our vision and execution is a deep commitment to openness. Leveraging Drupal, Mautic and open APIs, we've built the most open DXP, empowering customers and partners to tailor our platform to their needs.

Our emphasis on openness extends to ensuring our solutions are accessible and inclusive, making them available to everyone. We also prioritize building trust through data security and compliance, integral to our philosophy of openness.

We're proud to be included in this report and thank our customers and partners for their support and collaboration.

Mandatory disclaimer from Gartner

Gartner, Magic Quadrant for Digital Experience Platforms, Irina Guseva, Jim Murphy, Mike Lowndes, John Field - February 21, 2024.

This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Acquia.

Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

Gartner is a registered trademark and service mark of Gartner and Magic Quadrant is a registered trademark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and are used herein with permission. All rights reserved.

February 25, 2024

Through Euroclear, we are holding some 260 billion euro in frozen Russian assets.

If we were to use those, it would mean that for the next hundred years and more our country will be seen internationally as a thief.

At the same time, we spend only about one percent of our GDP on defence. That is not good. But well. Perhaps, if the US is so very keen for us to use those Russian assets for its political amusements, it could count these "favours" as Belgian defence spending?

Moreover, there ought to be a number of guarantees regarding the reprisals Russia will devise against Belgium when Euroclear reinvests the assets (which, as far as Russia is concerned, means: steals them).

Where are those guarantees? What are they? How much money are we talking about? Is it an extremely large amount? Because it has to be. Is it a disgustingly, extremely large amount? Because it has to be.

Do those guarantees cover more than a hundred years? Because they have to.

It is all rather easy to want this and that. But what is being offered in return?

Martti Malmi, an early contributor to the Bitcoin project, recently shared a fascinating piece of internet history: an archive of private emails between himself and Satoshi Nakamoto, Bitcoin's mysterious founder.

The identity of Satoshi Nakamoto remains one of the biggest mysteries in the technology world. Despite extensive investigations, speculative reports, and numerous claims over the years, the true identity of Bitcoin's creator(s) is still unknown.

Martti Malmi released these private conversations in reaction to a court case focused on the true identity of Satoshi Nakamoto and the legal entitlements to the Bitcoin brand and technology.

The emails provide some interesting details into Bitcoin's early days, and might also provide some new clues about Satoshi's identity.

Satoshi and Martti worked together on a variety of different things, including the relaunch of the Bitcoin website. Their goal was to broaden public understanding and awareness of Bitcoin.

And to my surprise, the emails reveal they chose Drupal as their preferred CMS! (Thanks to Jeremy Andrews for making me aware.)

Email from Satoshi Nakamoto to Martti Malmi, dated December 2009, about Bitcoin's new Drupal site going online and being an improvement over the old bitcoin.org page.

The emails detail Satoshi's hands-on involvement, from installing Drupal themes, to configuring Drupal's .htaccess file, to exploring Drupal's multilingual capabilities.

Email from Satoshi Nakamoto to Martti Malmi, dated November 2009, discussing Drupal themes installation and comparing Drupal and Joomla!.

At some point in the conversation, Satoshi expressed reservations about Drupal's forum module.

Email from Satoshi Nakamoto to Martti Malmi, dated November 2009, about preferring Drupal as a CMS over a Wiki, and expressing that Drupal's forum capabilities are not ideal but better than Wikis.

For what it is worth, this proves that I'm not Satoshi Nakamoto. Had I been, I'd have picked Drupal right away, and I would never have questioned Drupal's forum module.

Jokes aside, as Drupal's Founder and Project Lead, learning about Satoshi's use of Drupal is a nice addition to Drupal's rich history. Almost every day, I'm inspired by the unexpected impact Drupal has.

I started to migrate all the services that I use on my internal network to my Raspberry Pi 4 cluster. I migrated my FreeBSD jails to BastilleBSD on a virtual machine running on a Raspberry Pi. See my blog post on how to migrate from ezjail to BastilleBSD: https://stafwag.github.io/blog/blog/2023/09/10/migrate-from-ezjail-to-bastille-part1-introduction-to-bastillebsd/

tianocore

Running FreeBSD as a virtual machine with UEFI on ARM64 has come to the point that it just works. With older FreeBSD versions, I had to use QEMU with u-boot to get FreeBSD up and running as a virtual machine on the Raspberry Pi: https://stafwag.github.io/blog/blog/2021/03/14/howto_run_freebsd_as_vm_on_pi/.

But with the latest versions of FreeBSD (not sure when it started to work, but it works on FreeBSD 14), you can run FreeBSD as a virtual machine on ARM64 with UEFI, just like on x86, on GNU/Linux with KVM.

UEFI on KVM is generally provided by the open-source tianocore project.

I didn’t find much information on how to run OpenBSD with UEFI on x86 or ARM64.

OpenBSD 7.4

So I decided to write a blog post about it, in the hope that this information might be useful to somebody else. First I tried to download the OpenBSD 7.4 ISO image and boot it as a virtual machine on KVM (x86), but the ISO image failed to boot on a virtual machine with UEFI enabled. It looks like the ISO image only supports a legacy BIOS.

ARM64 doesn’t support a “legacy BIOS”. The ARM64 download page for OpenBSD 7.4 doesn’t even have an ISO image, but there is an install-<version>.img image available. So I tried to boot this image on one of my Raspberry Pi systems, and it worked. I had more trouble getting NetBSD working as a virtual machine on the Raspberry Pi, but that might be a topic for another blog post :-)

You’ll find my journey with my installation instructions below.

Download

Download the installation image

Download the latest OpenBSD installation ARM64 image from: https://www.openbsd.org/faq/faq4.html#Download

The complete list of the mirrors is available at https://www.openbsd.org/ftp.html

Download the image.

[staf@staf-pi002 openbsd]$ wget https://cdn.openbsd.org/pub/OpenBSD/7.4/arm64/install74.img
--2024-02-13 19:04:52--  https://cdn.openbsd.org/pub/OpenBSD/7.4/arm64/install74.img
Connecting to xxx.xxx.xxx.xxx:3128... connected.
Proxy request sent, awaiting response... 200 OK
Length: 528482304 (504M) [application/octet-stream]
Saving to: 'install74.img'

install74.img       100%[===================>] 504.00M  3.70MB/s    in 79s     

2024-02-13 19:06:12 (6.34 MB/s) - 'install74.img' saved [528482304/528482304]

[staf@staf-pi002 openbsd]$ 

Download the checksum and the signed checksum.

[staf@staf-pi002 openbsd]$ wget https://cdn.openbsd.org/pub/OpenBSD/7.4/arm64/SHA256
--2024-02-13 19:07:00--  https://cdn.openbsd.org/pub/OpenBSD/7.4/arm64/SHA256
Connecting to xxx.xxx.xxx.xxx:3128... connected.
Proxy request sent, awaiting response... 200 OK
Length: 1392 (1.4K) [text/plain]
Saving to: 'SHA256'

SHA256                  100%[=============================>]   1.36K  --.-KB/s    in 0s      

2024-02-13 19:07:01 (8.09 MB/s) - 'SHA256' saved [1392/1392]

[staf@staf-pi002 openbsd]$ 
[staf@staf-pi002 openbsd]$ wget https://cdn.openbsd.org/pub/OpenBSD/7.4/arm64/SHA256.sig
--2024-02-13 19:08:01--  https://cdn.openbsd.org/pub/OpenBSD/7.4/arm64/SHA256.sig
Connecting to xxx.xxx.xxx.xxx:3128... connected.
Proxy request sent, awaiting response... 200 OK
Length: 1544 (1.5K) [text/plain]
Saving to: 'SHA256.sig'

SHA256.sig              100%[=============================>]   1.51K  --.-KB/s    in 0s      

2024-02-13 19:08:02 (3.91 MB/s) - 'SHA256.sig' saved [1544/1544]

[staf@staf-pi002 openbsd]$ 

Verify

OpenBSD uses signify to validate the cryptographic signatures. signify is also available for GNU/Linux (at least on Debian GNU/Linux and Arch Linux).

More details on how to verify the signature with signify is available at: https://www.openbsd.org/74.html

This blog post was also useful: https://www.msiism.org/blog/2019/10/20/authentic_pufferfish_for_penguins.html

Install OpenBSD signify

Download the signify public key from: https://www.openbsd.org/74.html

[staf@staf-pi002 openbsd]$ wget https://ftp.openbsd.org/pub/OpenBSD/7.4/openbsd-74-base.pub
--2024-02-13 19:14:25--  https://ftp.openbsd.org/pub/OpenBSD/7.4/openbsd-74-base.pub
Connecting to xxx.xxx.xxx.xxx:3128... connected.
Proxy request sent, awaiting response... 200 OK
Length: 99 [text/plain]
Saving to: 'openbsd-74-base.pub'

openbsd-74-base.pub     100%[=============================>]      99   397 B/s    in 0.2s    

2024-02-13 19:14:26 (397 B/s) - 'openbsd-74-base.pub' saved [99/99]

[staf@staf-pi002 openbsd]$

I run Debian GNU/Linux on my Raspberry Pis; let's see which signify packages are available.

[staf@staf-pi002 openbsd]$ sudo apt search signify
sudo: unable to resolve host staf-pi002: Name or service not known
[sudo] password for staf: 
Sorting... Done
Full Text Search... Done
chkrootkit/stable 0.57-2+b1 arm64
  rootkit detector

elpa-diminish/stable 0.45-4 all
  hiding or abbreviation of the mode line displays of minor-modes

fcitx-sayura/stable 0.1.2-2 arm64
  Fcitx wrapper for Sayura IM engine

fcitx5-sayura/stable 5.0.8-1 arm64
  Fcitx5 wrapper for Sayura IM engine

signify/stable 1.14-7 all
  Automatic, semi-random ".signature" rotator/generator

signify-openbsd/stable 31-3 arm64
  Lightweight cryptographic signing and verifying tool

signify-openbsd-keys/stable 2022.2 all
  Public keys for use with signify-openbsd

[staf@staf-pi002 openbsd]$

There are two OpenBSD signify packages available on Debian 12 (bookworm):

  • signify-openbsd: The OpenBSD signify tool.
  • signify-openbsd-keys: This package contains the OpenBSD release public keys, installed in /usr/share/signify-openbsd-keys/. Unfortunately, the keys for the OpenBSD 7.4 release aren't (yet) included in Debian 12 (bookworm).

[staf@staf-pi002 openbsd]$ sudo apt install signify-openbsd signify-openbsd-keys
sudo: unable to resolve host staf-pi002: Name or service not known
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following NEW packages will be installed:
  signify-openbsd signify-openbsd-keys
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 70.4 kB of archives.
After this operation, 307 kB of additional disk space will be used.
Get:1 http://deb.debian.org/debian bookworm/main arm64 signify-openbsd arm64 31-3 [62.3 kB]
Get:2 http://deb.debian.org/debian bookworm/main arm64 signify-openbsd-keys all 2022.2 [8020 B]
Fetched 70.4 kB in 0s (404 kB/s)          
Selecting previously unselected package signify-openbsd.
(Reading database ... 94575 files and directories currently installed.)
Preparing to unpack .../signify-openbsd_31-3_arm64.deb ...
Unpacking signify-openbsd (31-3) ...
Selecting previously unselected package signify-openbsd-keys.
Preparing to unpack .../signify-openbsd-keys_2022.2_all.deb ...
Unpacking signify-openbsd-keys (2022.2) ...
Setting up signify-openbsd-keys (2022.2) ...
Setting up signify-openbsd (31-3) ...
[staf@staf-pi002 openbsd]$ 

Verify the checksum

Verify the checksum.

[staf@staf-pi002 openbsd]$ sha256sum install74.img 
09e4d0fe6d3f49f2c4c99b6493142bb808253fa8a8615ae1ca8e5f0759cfebd8  install74.img
[staf@staf-pi002 openbsd]$ 
[staf@staf-pi002 openbsd]$ grep 09e4d0fe6d3f49f2c4c99b6493142bb808253fa8a8615ae1ca8e5f0759cfebd8 SHA256
SHA256 (install74.img) = 09e4d0fe6d3f49f2c4c99b6493142bb808253fa8a8615ae1ca8e5f0759cfebd8
[staf@staf-pi002 openbsd]$ 

Verify with signify

Execute the signify command to verify the checksum. See the OpenBSD signify manpage for more information.

You’ll find a brief list of the arguments that are used to verify the authenticity of the image.

  • -C: Will verify the Checksum.
  • -p <path>: The path to the Public key.
  • -x <path>: The path to the signature file.

Verify the image with signify.

[staf@staf-pi002 openbsd]$ signify-openbsd -C -p openbsd-74-base.pub -x SHA256.sig install74.img
Signature Verified
install74.img: OK
[staf@staf-pi002 openbsd]$

Secure boot

The Debian UEFI package for libvirt, ovmf, is based on https://github.com/tianocore/tianocore.github.io/wiki/OVMF.

Debian Bookworm comes with the following UEFI firmware images:

  • /usr/share/AAVMF/AAVMF_CODE.ms.fd: with secure boot enabled.
  • /usr/share/AAVMF/AAVMF_CODE.fd: without secure boot enabled.

The full description is available at /usr/share/doc/ovmf/README.Debian on a Debian system when the ovmf package is installed.

To install OpenBSD we need to disable secure boot.

Test boot

I first started a test boot.

Logon to the Raspberry Pi.

[staf@vicky ~]$ ssh -X -CCC staf-pi002 
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
Linux staf-pi002 6.1.0-17-arm64 #1 SMP Debian 6.1.69-1 (2023-12-30) aarch64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Wed Feb 14 06:08:45 2024 from xxx.xxx.xxx.xxx
[staf@staf-pi002 ~]$ 


virt-manager

Start virt-manager and click on the [ Create a new VM ] icon.

new vm

This will bring up the new vm window. Select ( ) Import existing disk image; you can review the architecture options by expanding the \/ Architecture options drop-down. The defaults are fine. Click on [ Forward ].

import vm

This will open the “import vm” window. Click on [ Browse ] to select the OpenBSD installation image, or just copy/paste the path.

At the bottom of the screen, you'll see Choose the operating system you are installing. Start typing openbsd and select [ X ] Include end-of-life operating systems. Debian 12 (bookworm) doesn't include support for OpenBSD 7.4 (yet), so we need to set it to "OpenBSD 7.0". Click on [ Forward ].

select custom

In the next window, keep the default Memory and CPU settings, as we're just verifying that we can boot from the installation image.

Debian uses “secure boot” by default, and we need to disable it. Select [ X ] Customize configuration before install; this allows us to set the UEFI firmware image.

begin install

Set the Firmware to /usr/share/AAVMF/AAVMF_CODE.fd to disable secure boot and click on [ Begin Installation ].

Let's check if OpenBSD can boot. Great, it works!

Installation with virt-install

I prefer to use the command line to install as this allows me to make the installation reproducible and automated.

Create a ZFS dataset

I use ZFS on my Raspberry Pis; this makes it easier to create snapshots, etc., when you're testing software.

root@staf-pi002:/var/lib/libvirt/images# zfs create staf-pi002_pool/root/var/lib/libvirt/images/openbsd-gitlabrunner001
root@staf-pi002:/var/lib/libvirt/images# 
root@staf-pi002:/var/lib/libvirt/images/openbsd-gitlabrunner001# pwd
/var/lib/libvirt/images/openbsd-gitlabrunner001
root@staf-pi002:/var/lib/libvirt/images/openbsd-gitlabrunner001# 

Get the correct os-variant

To get the operating system settings, you can execute the command virt-install --osinfo list:

root@staf-pi002:/var/lib/libvirt/images/openbsd-gitlabrunner001# virt-install --osinfo list | grep -i openbsd7
openbsd7.0
root@staf-pi002:/var/lib/libvirt/images/openbsd-gitlabrunner001# 

We’ll use openbsd7.0 as the operating system variant.

Create QEMU image

Create a destination disk image.

root@staf-pi002:/var/lib/libvirt/images/openbsd-gitlabrunner001# qemu-img create -f qcow2 openbsd-gitlabrunner001.qcow2 50G
Formatting 'openbsd-gitlabrunner001.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=53687091200 lazy_refcounts=off refcount_bits=16
root@staf-pi002:/var/lib/libvirt/images/openbsd-gitlabrunner001# 

Run virt-install

Run virt-install to import the virtual machine.

#!/bin/bash

virt-install --name openbsd-gitlabrunner001  \
 --noacpi \
 --boot loader=/usr/share/AAVMF/AAVMF_CODE.fd \
 --os-variant openbsd7.0 \
 --ram 2048 \
 --import \
 --disk /home/staf/Downloads/isos/openbsd/install74.img  \
 --disk /var/lib/libvirt/images/openbsd-gitlabrunner001/openbsd-gitlabrunner001.qcow2

If everything goes well the virtual machine gets booted.

BdsDxe: loading Boot0001 "UEFI Misc Device" from PciRoot(0x0)/Pci(0x1,0x3)/Pci(0x0,0x0)
BdsDxe: starting Boot0001 "UEFI Misc Device" from PciRoot(0x0)/Pci(0x1,0x3)/Pci(0x0,0x0)
disks: sd0*
>> OpenBSD/arm64 BOOTAA64 1.18
boot> 
cannot open sd0a:/etc/random.seed: No such file or directory
booting sd0a:/bsd: 2861736+1091248+12711584+634544 [233295+91+666048+260913]=0x13d5cf8
Copyright (c) 1982, 1986, 1989, 1991, 1993
	The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2023 OpenBSD. All rights reserved.  https://www.OpenBSD.org

OpenBSD 7.4 (RAMDISK) #2131: Sun Oct  8 13:35:40 MDT 2023
    deraadt@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/RAMDISK
real mem  = 2138013696 (2038MB)
avail mem = 2034593792 (1940MB)
random: good seed from bootblocks
mainbus0 at root: linux,dummy-virt
psci0 at mainbus0: PSCI 1.1, SMCCC 1.1
efi0 at mainbus0: UEFI 2.7
efi0: EDK II rev 0x10000
smbios0 at efi0: SMBIOS 3.0.0
smbios0:
sd1 at scsibus1 targ 0 lun 0: <VirtIO, Block Device, >
sd1: 51200MB, 512 bytes/sector, 104857600 sectors
virtio35: msix per-VQ
ppb5 at pci0 dev 1 function 5 vendor "Red Hat", unknown product 0x000c rev 0x00: irq
pci6 at ppb5 bus 6
ppb6 at pci0 dev 1 function 6 vendor "Red Hat", unknown product 0x000c rev 0x00: irq
pci7 at ppb6 bus 7
ppb7 at pci0 dev 1 function 7 vendor "Red Hat", unknown product 0x000c rev 0x00: irq
pci8 at ppb7 bus 8
ppb8 at pci0 dev 2 function 0 vendor "Red Hat", unknown product 0x000c rev 0x00: irq
pci9 at ppb8 bus 9
ppb9 at pci0 dev 2 function 1 vendor "Red Hat", unknown product 0x000c rev 0x00: irq
pci10 at ppb9 bus 10
ppb10 at pci0 dev 2 function 2 vendor "Red Hat", unknown product 0x000c rev 0x00: irq
pci11 at ppb10 bus 11
ppb11 at pci0 dev 2 function 3 vendor "Red Hat", unknown product 0x000c rev 0x00: irq
pci12 at ppb11 bus 12
ppb12 at pci0 dev 2 function 4 vendor "Red Hat", unknown product 0x000c rev 0x00: irq
pci13 at ppb12 bus 13
ppb13 at pci0 dev 2 function 5 vendor "Red Hat", unknown product 0x000c rev 0x00: irq
pci14 at ppb13 bus 14
pluart0 at mainbus0: rev 1, 16 byte fifo
pluart0: console
"pmu" at mainbus0 not configured
agtimer0 at mainbus0: 54000 kHz
"apb-pclk" at mainbus0 not configured
softraid0 at root
scsibus2 at softraid0: 256 targets
root on rd0a swap on rd0b dump on rd0b
WARNING: CHECK AND RESET THE DATE!
erase ^?, werase ^W, kill ^U, intr ^C, status ^T

Welcome to the OpenBSD/arm64 7.4 installation program.
(I)nstall, (U)pgrade, (A)utoinstall or (S)hell? 

Continue with the OpenBSD installation as usual. Make sure that you select the second disk during the installation process.

To fully automate the installation, we need a system that executes the post-configuration at the first boot. On GNU/Linux this is normally done by cloud-init; there are solutions to get cloud-init working on the BSDs, but I didn't look into this (yet).

Have fun!

Links

February 24, 2024

If you use MySQL Shell for Visual Studio Code, using a bastion host is the easiest method to connect to a MySQL HeatWave DB Instance on OCI.

If you already have a connection set up using a bastion host, you may experience the same problem as I did: MySQL Shell complains that an invalid fingerprint was detected:

This error has nothing to do with the fingerprint of your user OCI Key. The problem is related to the key of your bastion host as you can see in the output window:

This happens, for example, if you have changed your bastion host.

To resolve the problem, remove the current SSH host key for the bastion host stored in your known_hosts file:

$ ssh-keygen -R "host.bastion.us-ashburn-1.oci.oraclecloud.com"

Use the name of your own bastion host, of course.
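
If you want to double-check that a stale entry is the culprit before deleting it, ssh-keygen can also look up the host in your known_hosts file (same hostname caveat applies):

# Print the entry currently stored for the bastion host, if any.
$ ssh-keygen -F "host.bastion.us-ashburn-1.oci.oraclecloud.com"

If an entry is printed, that is the stored key causing the mismatch; removing it with -R as shown above lets the new host key be recorded on the next connection.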

Once done, the problem is fixed and you can connect back to your MySQL HeatWave DB Instance using MySQL Shell for Visual Studio Code.

February 23, 2024

Sorry, but shamefully driving a business into the ground is unlawful. Full stop. That may, no, must be met with an appropriate punishment.

Kurt Meers – opinion piece

I also wonder why the journalists who invite Vansteenbrugge never push back on his completely wrong definitions:

Slander ("laster") is maliciously accusing someone of something in a way that exposes that person to public contempt, without managing to provide the legal proof of it, while the law presumes that such proof must be provided. This can happen, for example, when you are publicly accused of paedophilia while there is absolutely no evidence for it.

Defamation ("eerroof") is maliciously accusing someone of something so that the person's honour is damaged or they are exposed to public contempt, in circumstances where the proof legally cannot or may not be provided. For example, someone is accused of theft, but the legal proof can no longer be provided because the case is time-barred.

Advo-Recht

In other words, slander and defamation have absolutely nothing to do with what Vansteenbrugge claims about them: something about your opinion not being allowed to endanger democracy, which has nothing to do with it at all, and which, incidentally, is permitted as an expression of opinion. Sharia4Belgium, for example, was allowed to voice its opinion. They were not punished for that. They were punished for recruiting terrorists.

I have rarely seen such a bad lawyer. He is a populist who performs bad theatre and advises the most foolish strategy for his client (as a result of which the client now faces a prison sentence instead of community service). But hey. That was clearly part of the theatre, so that public opinion could be worked with: "what a heavy sentence!" His own choice. His own fault.

Let us compare the definition of slander with what happened here: pillorying the owners of a restaurant who have absolutely nothing to do with the case. That is literally and exactly the same as exposing someone to public contempt without managing to provide the legal proof.

That does not fall under the heading of freedom of expression. It falls under the heading of crime.

Damages are also being compared to punishment, which is completely wrong. Damages are not a punishment. They are compensation for the harm you have done to others. That is not a punishment. Incidentally, that compensation may still increase (that is, exceed the 20,000 euro), since an expert has been appointed who will now assess the actual damage.

Honestly, I think that restaurant in Antwerp has suffered far more than 20,000 euro in damage. All of it must be compensated in full, however high that amount turns out to be.

Comparing those damages with the damages that had to be paid to the parents of a student who died during a student initiation: a) those cases have absolutely nothing to do with each other, and b) it is ghoulish exploitation of a death. There is little I find more despicable. Except the political exploitation of a death. That is even worse, and moreover makes a mockery of the separation of powers.

What you could argue for in the Federal Parliament is raising the damages for unintentional killing through negligence. But that first requires a properly conducted democratic debate, which is not the populist nonsense we have come to expect from Sammy Mahdi.

The law presumes that the proof must be provided, so here it is: in all his years as a politician, he has made no attempt whatsoever to initiate that democratic debate.

So he is a populist ghoul exploiting a death.

Love is like the Twitter bird
You're smitten ("bleu") with it, for only 48 hours

Stromae, Carmen

Q.E.D.

PS: Be sure to also read the open letter from Sanda Dia's mother. She, too, is completely fed up with the exploitation of her son's death.

Readings: utopias, ecology, pirates and meta-bullshit

The consequences of utopia

I came across several analyses of Iain M. Banks's Culture cycle, a series of SF books I am very fond of and which Elon Musk and Jeff Bezos claim as an inspiration.

Utopian science fiction, the Culture cycle is a genuine reflection on the very notion of utopia. Not to be confused with the pseudo-utopian (and, to me, deeply boring) SF in the Becky Chambers vein, where the world has been destroyed and humanity decimated, but since everyone drinks tea made from recycled water heated by solar panels, everything is fine and any questioning of the millions of past deaths is waved away.

The very opposite of "feel good" novels, the Culture describes, with a sometimes demanding approach, a post-scarcity society in which every human (or artificial intelligence) can pursue their own personal path as they please, and in which any attempt to acquire power of any kind is gently suppressed (and is in any case pointless, since everyone can obtain whatever they want). Seen that way, it is hard to understand how this could inspire capitalist billionaires.

Two possible theories: the first is that they stopped at the adolescent appeal of super-cool spaceships crossing space going pew-pew. The second is that they see themselves as gods whose every desire is satisfied, completely disregarding their own impact on others. Either way, they have entirely missed the point of an author who is, by his own admission, deeply socialist and anti-capitalist.

One of the Culture's main themes is the meaning of life for a humanity that no longer has any material constraints. With this superb analogy: no, the invention of the helicopter did not kill mountaineering. If the goal were to stand on the summit of a mountain, there would be no mountaineers left. But the objective is not to stand on the summit for a few seconds. It is to climb. To trace your own path.

And we are willing to follow the adventures of those who climb. We could not care less about the life of someone who takes the helicopter. The same goes for art: art has nothing to fear from artificial intelligence, because art is, in essence, a process of creation. A path. Four years ago, I described the creative process as a bond between humans.

The art-gorithm…

I personally define art as that which a human cannot not do. A writer is not someone who writes; a writer is someone who cannot not write. A musician cannot not make music. A programmer cannot not program.

That is the whole difference between art and work. It is also why some great artists fade away when they turn professional: art then becomes an obligation rather than an urge.

Granted, the algorithms threaten artists' often precarious income. But that is because the vast majority of artists do not make a living from their art. They scrape by, prostituting their technical skills and their talent to produce marketing, logos, web design and images to stick on packaging. Starved by an ultra-consumerist society, they survive by becoming its slave.

With AI, consumerist society is not threatening artists. It has merely added another string to its bow for negotiating its slaves' wages downwards. As Cory Doctorow puts it: "AIs are not capable of replacing you at your job. But AI vendors are capable of convincing your boss that they are."

For my part, I like to point out that we are still not capable of making a damn toilet that cleans itself properly, that every single day we all have to scrub away our own filth (or pay someone to do it for us), but don't worry, we are spending billions to automate the work of artists… The sense of priorities is clear.

Personally, I think the first way to support artists is to discover them and make them known around you, to talk about them as humans climbing mountains rather than worshipping the photo of them at the summit. Sharing of course includes piracy (yes, pirate my books! Lend them! Give them away!). If you want to go further, ask the artist in question what the best way to support them is and which suits you best. Personally, I like my publisher and independent bookshops to be supported as well, so I encourage you to buy my dead-tree books in small bookshops. But for others it might be a donation, crowdfunding… In short, do whatever you like!

Piracy and the media

Speaking of piracy, my attention was caught by the "free to take" bookshelf at my town's public library. My librarian explained to me that they have a formal obligation to keep more than 50% of the books in their stock less than 10 years old. So the library regularly gets rid of books that have passed that fateful age. In this case, the book that caught my eye is "Capitaine Paul Watson : Entretien avec un pirate" by Lamya Essemlali. As my wife had just given me a Sea Shepherd jumper, the man was already on my mind, so I took the book home.

Not at all literary, the book is simply one long interview with the co-founder of Greenpeace and, later, of Sea Shepherd. What is fascinating is discovering a "no bullshit" character of a kind I wish there were more of.

Paul Watson is not exactly a likeable character. But he does not care.

Our objective is not to protest against whaling, it is to stop it.
– Entretien avec un pirate, p. 77

Throughout the book, he hammers home that it is stupid to fight for the preservation of jobs and against poverty if it comes at the expense of the planet. Because if the ecosystem is destroyed, a lot of good our jobs and our wealth will do us. We will all be poor. Or dead. Ecology is the foundational fight of every activist. It comes before social issues. He takes consistency as far as banning cigarettes aboard the entire Sea Shepherd fleet. The meals are also vegan.

He fires broadsides at Greenpeace, which has become an organization whose purpose is to collect donations. He claims that Greenpeace now earns more money from whaling than the whaling industry itself, all while taking no action: Greenpeace canvassers go as far as using footage from… Sea Shepherd campaigns. Greenpeace does marketing and, like the entire marketing industry, it profits from not solving the problem. The problem is the essence of the business; it must be preserved, or even amplified (while claiming to develop solutions). Paul Watson goes further in his critique of marketing: there is no point convincing people not to kill whales, since they would not have done so anyway. You have to act directly on the whaling ships. The parallel with the IT industry is left as an exercise for the reader.

People who act are, by their very nature, vilified in the media.

Tout ce que l’on fait et ce que l’on pense est défini et contrôlé par les médias. Ce sont eux qui définissent notre réalité. Et c’est la raison pour laquelle nous nous trouvons sur une voie rapide au bout de laquelle nous attend une récompense darwinienne collective : l’extinction de notre espèce.
— Entretien avec un pirate, p. 158

Avec cynisme, Paul Watson invite donc des célébrités qui n’ont rien à voir avec la cause afin d’asséner, durant la conférence de presse, que s’il avait invité un spécialiste du domaine au lieu d’un acteur de cinéma, les médias ne seraient pas là.

Paul Watson est également placé sur la liste rouge d’Interpol. Il est considéré comme un « écoterroriste ». Ce qui est fascinant, c’est qu’il respecte parfaitement les lois et n’a jamais été condamné ni arrêté pour quoi que ce soit, malgré ses tentatives en se rendant de lui-même dans les commissariats. Sea Shepperd ne fait qu’ennuyer voir éperonner des navires qui pratiquent des activités complètement illégales. Et donc, personne ne porte plainte parce que les navires éperonnés n’avaient rien à faire là en premier lieu !

Il va plus loin en se demandant qui sont réellement les « écoterroristes » alors que personne n’a jamais été tué ni même blessé dans une action Sea Sheperd.

Les gens nous accusent de jeter des bombes puantes sur les baleiniers japonais. Oui, on fait ça, c’est vrai. […] Ceci dit, j’ai du mal à imaginer Oussama Ben Laden en train de jeter une bombe puante sur un bus de touristes.
— Entretien avec un pirate, p. 169

La médiatisation et la corruption de la rébellion

Dans « Mémoires vive », Edward Snowden raconte son incroyable histoire et un point m’a frappé : travaillant à l’intérieur même du service de renseignement, Snowden sait exactement ce qui va lui arriver. Il sait comment on va l’attaquer. Il prédit même comment il va être décrédibilisé dans les médias. Le même phénomène est à l’œuvre avec Paul Watson. Face aux rebelles, le capilo-consumérisme passe par trois lignes de défense:

Premièrement, en rendant la vie aussi difficile au rebelle en rendant ses ressources rares, en l’épuisant, en le décourageant voire en le harcelant. C’est normal, c’est le jeu.

En second lieu, si le rebelle persiste, en l’achetant, en le rémunérant pour sa complicité. Ce que nous faisons tous en travaillant pour payer notre shopping afin de nous changer les idées après une dure semaine de boulot… pour payer le shopping.

Mais les véritables rebelles charismatiques, les artistes, sont achetés bien plus chers. Ils deviennent des stars. Ils sont célébrés par le système, ils sont riches. Ils n’ont plus envie de se rebeller et utilisent leur image de rebelle pour exhorter leur audience à se conformer. Leur situation leur fait perdre toute crédibilité et légitimité dans la rébellion. Comme les vieux et riches musiciens qui luttent de toutes leur force contre le piratage. Ou les groupes punk postant sur Facebook et Instagram. Riches, on se contentera de faire des dons à des associations adoubées par le système et servant à relancer la carrière médiatique de chanteurs has been et de stars de télé-réalité autrement oubliées.

Facebook, au fait, n’est qu’une gigantesque machine à corrompre à moindre coût les rebelles. On étouffe la rébellion en échange… de likes et de followers ! Toute association écologiste présente sur Facebook est, en soit complètement corrompue, car, de par sa seule présence sur la plateforme, elle justifie et promeut le consumérisme et un monde soumis à la publicité. Facebook, c’est la « Planète à gogos » virtuelle (en référence au chef-d’œuvre de Pohl et Kornbluth). Même plus besoin de payer les rebelles, ils viennent gratuitement pour quelques likes. Quoi ? Ce sont eux qui paient pour mettre en avant leur site web et obtenir encore plus de followers ?

Je suis sûr qu’on va me dire que Sea Shepperd a malheureusement une page Facebook. J’en suis triste. La convergence des luttes n’a pas encore percolé à ce niveau-là…

Mais il reste les rebelles incorruptibles, inaliénables. De type Watson ou Snowden. Face à eux, on utilise la troisième ligne de défense. Les médias, littéralement financés par la publicité, vont alors mener une destruction en règle de l’image publique. Il va s’agir de construire de toutes pièces et de manière consciente un personnage détestable. Un mauvais digne d’un film afin de décrédibiliser ses actions.

C’est tellement puissant qu’il est parfois intéressant de prendre un peu de recul et de réaliser ce qu’on reproche vraiment à la personne. À Julian Assange, on reproche littéralement… d’avoir fait un travail de journaliste. Essayez d’imaginer « Les hommes du président », mais où Dustin Hoffman et Robert Redford seraient les méchants qui veulent faire tomber le bon président qui surveille ses opposants politiques pour le bien de son peuple. Et bien voilà, on est en plein dedans. Et à Paul Watson, on reproche… d’empêcher des massacres illégaux de faune marine !

Imagine que tu te rendes dans la ville de La Mecque, que tu marches jusqu’au centre vers la pierre noire et que tu craches dessus. Eh bien, tes chances de sortir de là en vie sont pour ainsi dire très minces et peu de gens éprouveront une quelconque sympathie à ton égard, car tu auras commis un blasphème […] et les gens comprendraient la violence qui te serait faite en réponse.
[…]
Si les forêts tropicales, si les océans et toute la vie qu’ils contiennent avaient autant de valeur, s’ils étaient aussi sacrés à nos yeux qu’une vieille pierre, […] nous taillerons en pièce ces bûcherons et ces chaluts pour leurs actions. Nous ne le faisons pas, car nous sommes totalement aliénés du monde naturel.
— Entretien avec un pirate, p. 143

L’impact de l’homme sur le climat

Étant fanatique d’apnée, je vois chaque année les dégâts que l’humanité inflige aux océans. Se retrouver dans une mer magnifique, au large, pour se rendre compte que ce que je prenais pour des reflets à la surface sont des billes de polystyrène qui couvrent l’océan. Ou réaliser que, dès que mes palmes me portent en dehors d’une minuscule zone « réserve naturelle », la végétation laisse place au sable à perte de vue et que les poissons sont plus rares que les sacs plastiques et les bouteilles vides.

Nous sommes huit milliards. C’est un chiffre inimaginable. Si vous rencontriez une personne sur la planète toutes les secondes, il vous faudrait… 350 ans pour voir tous les humains. Huit milliards de personnes qui tentent de se faire consommer le plus possible les uns les autres. La consommation étant définie comme l’extraction d’une ressource naturelle pour la transformer en déchet (la consommation proprement dite se faisant au milieu). Nous sommes huit milliards à transformer la planète en déchet, en récompensant comme nos maîtres absolus ceux qui le font le plus efficacement et le plus rapidement possible.

Notre impact est énorme, inimaginable. Et l’était déjà il y a des siècles.

La colonisation des Amériques par les Européens entraina, en un siècle, la mort de près de 90% des indigènes. Que ce soit par des massacres directs ou via les épidémies, certaines ayant été propagées à dessein, par exemple en distribuant des couvertures ayant servi à recouvrir des malades de la variole.

On estime que les Européens ont causé 56 millions de morts entre Christophe Colomb et l’année 1600. Un mort chaque minute pendant plus d’un siècle ! C’est un nombre tellement important que d’immenses étendues de champs se sont soudainement retrouvées en friche, que la végétation a explosé capturant beaucoup de carbone présent dans l’atmosphère et contribuant au « petit âge de glace » de cette époque. Le massacre serait, à lui seul, responsable pour une diminution de 0,15°C du climat mondial, avec un effet bien plus prononcé en Europe.

Conclusion et meta-bullshit

J’aime bien tenter de trouver une sorte de conclusion commune à ces réflexions en vrac (qui ne deviendront pas quotidienne, je vous rassure. C’est juste que j’avais du retard dans mes notes). Ici, il y en a une qui me frappe : c’est l’importance, la prépondérance des médias entièrement financés par la publicité sur ce qui fait notre culture et notre perception de la réalité.

Nous vivons dans une société où le bullshit, le mensonge sont l’état de base. Nous sommes tellement accros que l’on discute de la moralité d’installer un bloqueur de publicité sur notre navigateur alors que l’acte même de se rendre sur un site publicitaire devrait être vu comme l’équivalent climatique de la consultation d’un site pornographique : on devrait avoir honte de le consulter et encore plus d’y travailler.

Mais nous revenons toujours, pour nous tenir « informés ». Aaron Swartz, je t’en supplie, reviens !

L’intelligence artificielle est la cerise sur le gâteau. Le bullshit ne sert plus à nous vendre de la merde, mais il se produit tout seul pour vendre de générateurs de bullshit !

Ça y’est, nous sommes entrés dans l’ère du meta-bullshit

An engineer and writer, I explore the impact of technology on people, both in writing and in my talks.

Receive my writings in French and English directly by email. Your address will never be shared. You can also use my French-language RSS feed or the full RSS feed.

To support me, buy my books (ideally from your local bookshop)! I have just published a short-story collection that should make you laugh and think. I am also part of the free and ethical "SF en VF" box set.

February 22, 2024

For close to a decade, I dedicated myself to building out Grafana Labs and subsequently took some personal time to recharge. I came out of these experiences as a different person, and am ready to share a few things I learned.

First step is bringing my abandoned blog back to life:

  • Rewrote the about/bio and projects pages
  • Switched from lighttpd to Caddy (automagic https! wow!)
  • Enabled Google Cloud CDN
  • Switched from Google Analytics to the open source PostHog
  • Tried Cloudflare Turnstile for comment spam prevention, but it's quite complicated and has un-googleable error codes, so I switched to something simpler
  • Upgraded hugo from 0.15 to 0.122 with very few breaking changes (!) and found a nice new minimalist theme
  • I took more care than probably necessary to avoid accidentally republishing articles in RSS feeds, since that's such an easy mistake to make with RSS GUIDs.

I don’t know who still reads this blog, but if you do, stay tuned for more!

Meta-blogging, reading on Gemini, and space exploration

Meta-blogging

Pluralistic, Cory Doctorow's blog, is celebrating its fourth anniversary.

Cory is a huge source of inspiration for me. He posts almost every day, following a method he calls the Memex method. I have been reading him since his very first post, and I keep wondering how he sustains such a pace. I think I have figured out part of his secret: his blog is simply a set of notes he takes during the day and posts in the evening or the next morning. I caught myself thinking about all the writing energy I spend answering long emails, posting on forums, posting and reading on Mastodon, filling my personal notes. What if I borrowed even more from his method?

As the readership of ploum.net has grown, I have unconsciously imposed on myself a kind of perfectionism in my posts. I reject many drafts that do not feel good enough; I reread and correct, sometimes for days, weeks or months, until the post is no longer relevant anyway. I have lost a great deal of spontaneity. I am also afraid of "spamming" you.

Sometimes I allow myself to publish a spontaneous post. I put it out while forcing myself not to think too much, and it feels good. I had also tried writing "grab bag" posts inspired by Tristan Nitot. Two examples, picked at random:

When I think back, those posts were genuinely useful to me. I have gone back to them several times to find links or to revisit thoughts I had.

I am very often asked for advice on starting a blog. The most important advice I give is always to blog for yourself. Not to wonder what might please your audience, but to publish. To publish what is in your guts.

I have enough experience with going viral to know that the short-term success of a blog post is very hard to predict. And even when a post does create an incredible buzz, you generally will not notice unless you have analytics or are on social networks. Being at the centre of a buzz delivers a shot of dopamine and then… that's it! It is addictive, but often more harmful than anything else, because you catch yourself wanting ever more, watching the votes on Hacker News, getting worked up over the comments of people who disagree with you.

The long-term impact of a post is even less predictable. It is also very personal: I catch myself rereading some of my own posts, referring back to them, using them as anchor points for certain ideas even when everyone else seems not to care.

I say "seems not to care" because you can never know. Beneath the mass of people who post on social networks and forums and who comment, there are ten times more people who read in silence and share in private.

"Ploum, I'm so glad to meet you. Your post X changed my life!" a reader once told me, referring to a post published five years earlier for which I had received no feedback at all.

"I printed out your post Y for my father to read. We spent the whole evening discussing it," another told me.

I advise every apprentice blogger to let go and publish what is in their head. So why shouldn't I apply my own principles?

Because letting go is hard. It is hard not to come across as someone whose prose is always well constructed and whose opinions are always settled. It is hard to show your doubts, to take reasoning shortcuts you will be ashamed of after the first reactions, to admit your mistakes.

And so, without further transition…

A few links from my day

A short slice of life from the parent of a child with type 1 diabetes. So short it reads like a Formula 1 pit stop.

Yes, that is a Gemini link. Gemini is a network where the absence of metrics such as statistics or likes makes the writing much more personal, much less calibrated. I find myself reading slices of life and reflections with no pretension whatsoever, and it is very inspiring. Conversely, I have seen several people abandon the network because they had no likes and no statistics and felt they were publishing into the void. The fact that I was reading their complaints proved they were not speaking into the void. But what is frightening is realizing that a simple artificial number under a heart, or on a Google Analytics dashboard, is what makes us believe that, phew, at least here I am not writing into the void.

I had an exchange on precisely this subject with a reader who pointed me to the concept of the Web Revival:

I should say that I am rather against the term "revival". It tends to idealize a past that never existed: the Web was invaded very quickly by advertising and popups. But I obviously support the idea of a minimalist web. My correspondent tells me that the Melonking site boasts 38 million views per month (I have not found the source). And that is exactly the whole problem with the Web: how does the webmaster count those 38 million views? Why count them? Why brag about them? And why does it impress readers so much?

I have disabled every statistic I possibly can on my blog. The only numbers I cannot hide are my follower count on Mastodon (which is public) and the number of subscribers to my mailing lists (unless you subscribe to the text-only version on Sourcehut, where I have no visibility at all). Ideally, I would like to have access to none of these numbers. And that is perhaps why I enjoy Gemini so much.

But rest assured, the content is identical to the Web version of my blog; you are not missing anything by staying away from Gemini. For those who are interested, the Gemini protocol is very simple and owes its name to the Gemini space program. The port it uses, 1965, is an explicit reference to the year of the first crewed Gemini flight, by Virgil Grissom and John Young. To seal the simplicity of the Gemini protocol, Solderpunk, its creator, has just announced his intention to definitively close work on the specification by 18 March 2027 at the latest and, if all goes well, on 8 May 2025. Yes, those dates have a special meaning linked to space exploration.
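
As an illustration of that simplicity (a minimal sketch in Node.js, not a real client: the certificate is blindly accepted and the host is just an example), a Gemini request is a single TLS connection to port 1965, one URL terminated by CRLF, and a short status line such as "20 text/gemini" before the content:

import tls from 'node:tls';

const host = 'example.org';
const socket = tls.connect({ host, port: 1965, servername: host, rejectUnauthorized: false }, () => {
  socket.write(`gemini://${host}/\r\n`);                   // the entire request
});
socket.on('data', (chunk) => process.stdout.write(chunk)); // status line, then the page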

Space exploration

The Voyager 1 probe is giving up the ghost.

The fact that it has worked this long is frankly mind-blowing. As a teenager, I devoured Pierre Kohler's book recounting its epic journey. Nowadays, when a probe is sent into space, an exact replica, down to the last bit and bolt, is kept in a laboratory to debug any problem and to test every fix. ESA notably had to reprogram the Cassini probe in a hurry as it approached Saturn, after realizing that the Doppler effect had not been correctly taken into account in the receiver's original calculations.

But back in Voyager 1's day, nobody had thought of that technique yet. This means there is no copy of the Voyager probe, and the onboard code has been completely lost. The probe is now glitching, but nobody knows exactly what code, or even what operating system, it is running, which makes debugging particularly difficult, especially with a 48-hour ping. So there is, in interstellar space, beyond the heliosphere, a computer running an operating system that no longer exists on Earth.

Conclusion

I had not planned to post anything today. But look where Cory Doctorow's method takes me. Unlike him, I do not intend to do this every day, but I do hope to allow myself to post without second-guessing. I am also going to reply to your emails (which I still read with as much pleasure as ever) even less. Rather than writing individual, sometimes very long replies, I will draw on your reflections to write public posts.

I hope you will understand that these posts are moving reflections, sometimes imprecise or mistaken. As, in fact, all my blog posts have always been. I just used to try to convince myself (and you) otherwise.


February 20, 2024

I noticed at the end of an episode of Terzake that our dearly beloved ADIV is looking for people.

I do hope they read planet.grep, because yes, part of the kind of people the ADIV needs reads our blogs. Thankfully.

Now, back in 2014 I already described what Miguel De Bruycker, at the time, needed for his Centre for Cybersecurity.

Mr. Michel Van Strythem: you need exactly the same thing Miguel needed back then. Right? And you will find those people in exactly the same way Miguel found them: protect us against eavesdropping, and only eavesdrop yourselves within a legal framework.

I wrote in 2014:

As long as the government follows its own moral compass, these people are willing to put their skills at its service.

But the law stands above the military. It must be followed, including by the intelligence services. It is our only guarantee of a free society: I do not want to work on, or to have contributed to, a world in which citizens lose freedoms such as privacy to technology.

In between, that piece of mine listed a few small assignments that are just a little bit difficult.

Such as hiding a Linux kernel module, which is what a standard Linux virus would do. Exfiltrating data over a network device without it being too noticeable: that is something the ADIV will want to do. Receiving commands: that is basically what Back Orifice already did.

I also wrote that they must know how a buffer overflow bug works, because it is a common programming mistake that leads to security problems. In other words, they should at the very least know how they could create such a vulnerability themselves, if only to be able to prepare their own attack techniques.

I also expected them to be able to set up a standard socket server. Naturally. That is the bare minimum.

The average programmer probably cannot do any of that. But then, we do not need the average programmer.

If the ADIV abides by the law and always operates within its bounds, then the ADIV will find the best people in the country. Just as the fire brigade manages to mobilize the best people in the country, the ADIV will be able to mobilize the best programmers.

But that is because it sticks to its own moral compass.

If it does not, then it will not.

Kind regards,

Philip. Programmer.

February 19, 2024

My bibliotaphe

The few thousand books that make up our family library mingle, travel, migrate and sprawl, colonizing the bedrooms, the living room, the dining room, the toilet. Yet they keep a semblance of order, a rough sketch of discipline obeying a logic that I must be the only one to understand: the comics are grouped more or less by genre, the fiction shelves follow the alphabetical order of the authors, and so on.

That still leaves a good hundred or so volumes piled on the shelves of my study, forming my living reading list, a bibliotaphe in which any kind of order is strictly forbidden.

You wretch! A bibliotaphe is not to be sorted! Shelving the books keeps the ideas from talking to one another. Worse, it deprives the researcher of serendipitous discoveries!

— Henri Lœvenbruck, Les Disparus de Blackmore, page 278

The magical power of a bibliotaphe is sometimes frightening.

I was deep in "Lettre Ouverte à cet Autre qui est Moi", the story that closes the collection "Les Cahiers du labyrinthe", which its author, Léo Henry, shared with me. In this story, which struck a particular chord with me, Léo Henry imagines that every writer has a double, and explores the consequences of such a premise to their very limit.

Reading is like wine tasting. Savouring a passage or an idea requires turning it over and over in your mind, inspecting it, breathing it in. Getting up in the middle of the story to head to my study, my eye was caught by a slim volume sticking out of my bibliotaphe, exactly at nose height. The title made me jump.

The Double, by Dostoevsky.

I picked it up and contemplated its cover, a little frightened by the coincidence.

I hesitated. I had not planned to read Dostoevsky. But I had no choice. My bibliotaphe had just given me an implacable order; my next read had chosen itself.

Obeying that imperious force, I slipped under the duvet that evening, in between the pages of the brilliant author of The Idiot.

Ever since, whenever I walk past the motley mass of my bibliotaphe, a kind of dread comes over me. It looms over me, as if it could crush me under its weight. Then I remember that it is my benevolent double, that it is me, that it knows me better than anyone and protects me. And that, once again, it was absolutely right.

I still love Dostoevsky as much as ever.


February 15, 2024

Recently during the Swedish MySQL User Group (SMUG), I presented a session dedicated to MySQL InnoDB Primary Keys.

I forgot to mention a detail that many people are not aware of, but which Jeremy Cole has pointed out.

Primary Key always included in secondary indexes as the right-most column(s)

When we define a secondary index, the secondary index includes the Primary Key as the right-most column(s) of the index. It is added silently: it is not visible, but it is used to point back to the record in the clustered index.

This is an example with a table having a Primary Key composed of multiple columns:

CREATE TABLE `t1` (
`a` int NOT NULL,
`b` int NOT NULL, 
`c` int NOT NULL,
`d` int NOT NULL,
`e` int NOT NULL,
`f` varchar(10) DEFAULT 'aaa',
`inserted` datetime DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY(`a`,`b`,`c`) ) ENGINE=InnoDB; 

And this is the table’s content (only 2 records):

SELECT * FROM t1;
+---+---+---+----+----+-----+---------------------+
| a | b | c | d  | e  | f   | inserted            |
+---+---+---+----+----+-----+---------------------+
| 1 | 2 | 3 |  4 |  5 | abc | 2024-02-11 17:37:16 |
| 7 | 8 | 9 | 10 | 11 | def | 2024-02-11 17:37:26 |
+---+---+---+----+----+-----+---------------------+

Now let’s create a secondary key for the column f:

ALTER TABLE t1 ADD INDEX f_idx(f);

This key will then include the Primary Key as the right-most (hidden) columns of the secondary index:

The orange-filled entries are the hidden ones.

Let’s verify this on the InnoDB page for that index:

And indeed, we can see that the Primary Key columns (in red) are included in each entry of the secondary index (in purple).
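
A practical way to observe this (a hedged example based on the t1 table above; the exact plan depends on table size and optimizer statistics): because the Primary Key (a,b,c) is appended to f_idx, a query that only touches f and the Primary Key columns can be resolved from the secondary index alone, without any lookup into the clustered index:

EXPLAIN SELECT a, b, c FROM t1 WHERE f = 'abc'\G
-- expected on a realistically sized table:
--   key: f_idx
--   Extra: Using index   (covering index scan)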

But not always!

When the Primary Key, or part of it, is already included in a secondary index, only the missing Primary Key columns are added as hidden right-most entries of that secondary index.

Let’s create a secondary index where the column b will be missing:

ALTER TABLE t1 ADD INDEX sec_idx (`d`,`c`,`e`,`a`);

The column b will indeed be added as the right-most hidden column of the index. Let's verify this:

We can see above that the value of column b is indeed added, and the same holds for the second record.
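
The covering behavior can be observed here too (again a hedged sketch; actual plans depend on statistics): column b is not part of sec_idx's definition, yet a query that needs it can still be answered from the index alone, because b is stored as the hidden right-most column:

EXPLAIN SELECT a, b, c, d, e FROM t1 WHERE d = 10 AND c = 9 AND e = 11 AND a = 7\G
-- expected on a realistically sized table:
--   key: sec_idx
--   Extra: Using index   (b is read from the hidden part of the index)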

If we check in the InnoDB source code, this is also commented:

But what will happen, if we just use a prefix part of the Primary Key in the secondary index?

Let’s try:

CREATE TABLE `t1` (
  `a` varchar(10) NOT NULL DEFAULT 'aaaaaaaaaa',
  `b` varchar(10) NOT NULL DEFAULT 'bbbbbbbbbb',
  `c` int NOT NULL DEFAULT '1',
  `f` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`a`,`b`,`c`),
  KEY `sec_idx` (`c`,`f`,`a`(2))
) ENGINE=InnoDB 

SELECT * FROM t1;
+------------+------------+---+-----+
| a          | b          | c | f   |
+------------+------------+---+-----+
| aaaaaaaaaa | bbbbbbbbbb | 1 | abc |
| ccccccccc  | dddddddddd | 2 | def |
+------------+------------+---+-----+

We can see that only 2 characters of the column a are used in the secondary index.

If we check the InnoDB page, we can see that, in fact, the full column a is also added as the hidden right-most part of the secondary index:

So InnoDB always needs the full Primary Key, visible or hidden, in the secondary index.

This is not widely known, but hopefully this blog post clarifies it.

Enjoy MySQL & InnoDB… and of course, don't forget to use Primary Keys!

February 14, 2024

So I like and use Spectra (aka "ultimate addons for gutenberg") on some of my WordPress sites. If you use Spectra as well, you might want to know that Spectra 2.12 has a significant performance regression, sucking in almost all of […]. Here's hoping this was not intentional and that it'll get fixed soon. More info in https://wordpress.org/support/topic/2-12-significant-performance-regression/

Source

Our MySQL Belgian Days were a complete success. With 120 participants, the room was packed throughout the two days, from the first to the last session. A record for this event!

The audience was made up of customers (some from very far away), users, contributors, community members and, of course, old friends.

The atmosphere was relaxed and everyone seemed happy to be there and to be talking about MySQL.

This is one of the rare opportunities to meet the MySQL developers who also operate our MySQL service on OCI and, for the first time, to meet the developers who code the unique features relating to HeatWave. Their presentations were technical, explaining the problems they encountered and which algorithms they used to solve them; it was very interesting. The feedback from the audience was very positive.

The first day was dedicated to the MySQL Community. Several speakers presented their solutions, ranging from caching and monitoring to distributed systems.

Some other speakers explained their architecture and how they are using hybrid systems, on premise and cloud.

The second day was dedicated to the MySQL Team. The engineers presented content related to their current projects. Geir Hoydalsvik, the Senior Director of Software Development, presented the new MySQL release model, emphasizing the imminent arrival of the very first Long-Term Support (LTS) version. The need for this session was underlined by how unclear the subject still seemed to be for some participants.

It was also very clear that many people are using MySQL Shell and that they love it! Many users are in fact already deploying their architectures using the Admin API, which is great news!

The MySQL Shell Dump & Load utilities are also very popular tools, and everyone praised their effectiveness.

The developers present in the room were also interested in Oystein's presentation regarding the use of GraalVM in MySQL to write Stored Procedures in JavaScript.

Before the end of this two-day conference, we held the MySQL Rockstar ceremony. The winners were chosen unanimously, and our first recipient of the MySQL Rockstar Legend award received the loudest applause: well done Datacharmer!

The conference concluded with the traditional MySQL Community Dinner, expertly organised by Matthias. Kudos!

We now have one year to prepare for our next MySQL Belgian Days, when we will celebrate MySQL's 30th anniversary!

Here are the slides of the sessions:

Day 1 – MySQL Ecosystem Day

Day 2 – Oracle MySQL Day

MySQL Rockstar 2023 Awards

February 12, 2024

The little pleasures of modern shopping

— Hello, can I help you?

Stuck in her shop deep in the bowels of the shopping mall, the saleswoman probably did not see daylight very often. Judging by the garish sheen of her make-up, I got the impression that she did not particularly mind.

— Yes, I replied, resting a hand on a display model. I'm looking to equip myself from scratch.
— This one has Wi-Fi and ultra-fast 5G. You can control it from your phone with our dedicated app…
— No thanks, I interrupted with a small, unconvinced pout.
— Or we can move into the high end. These models are linked to the cloud and have built-in artificial intelligence.
— Meh, forget it!
— Finally, the top of the top, the nec plus ultra. Data storage on a proprietary blockchain, built-in interaction with your virtual-reality headset…
— Listen, I just want to buy a fridge. Just a fridge!
— Well, these are fridges. The best ones…
— No, a fridge that makes cold and nothing else. With no connection, without a single electronic chip.

Taken aback, the saleswoman opened her mouth without uttering a sound. For a few seconds she sized me up before breaking the silence.

— Are you a collector of antiques?
— No, I want a modern fridge, just without any electronics.
— But why?
— Because I want a thing that makes cold, nothing else.
— That makes no sense!

Drawn by the noise of our discussion, another saleswoman approached with a smile dripping with hypocrisy.

— Leave it, Amanda, I'll handle this. Sir is surely an engineer, in computing or in electronics.

I nodded silently with a small smile. She had guessed right!

— Sir, I regret to inform you that we are an ordinary demonstration showroom. We only display what is available for delivery on our website, not niche products. But I believe there is a very high-end store that does this kind of thing: appliances without electronic chips. Obviously, it is very expensive. Far too expensive for our clientele. But perhaps you will find what you are looking for there? Be warned, though: you cannot order online. It is a thing for eccentric rich people.
— Thanks, I grumbled. I'll go and have a look.

As I turned my back, I heard Amanda ask her colleague:
— But why would an engineer refuse to…
— Never understood it. Those people are not like everyone else. For all we know, he doesn't even have an account on…

Their voices faded into the hubbub of the mall. Fleeing as fast as I could, I passed through the thick air-conditioned airlock and found myself facing the cars which, as far as the eye could see, were recharging their batteries under a greyish drizzle. I was not about to get my fridge any time soon. Not to mention that, being new to the country, I also needed a microwave, a washing machine, an electric toothbrush and a coffee maker… It was not going to be restful.

As for the car, I did not even dare think about it. All things considered, maybe it was time to get serious about cycling…

— Ding! Firmware update of your battery in progress. Checking the status of your subscription.

Argh! Even the bikes!


February 06, 2024

FOSDEM is over, but our video infrastructure is still working hard! This year, there were 855 events in rooms where the FOSDEM video team had equipment to stream, record, and publish everything, for a total of 365 scheduled hours of content. Postprocessing that much video is a massive task that takes quite a bit of effort; for this reason, since the 2017 edition, we have crowdsourced most of the manual work that needs to happen for this postprocessing to the devroom organizers and the speakers, as explained in this talk at FOSDEM 2019. As of this writing, out of those…

February 05, 2024


For nearly three decades, third-party cookies have been a major privacy concern on the web. They allow organizations, such as advertisers, to track users' browsing activities across multiple websites, often without explicit consent.

Unlike third-party cookies, first-party cookies are restricted to the website you are on. They are often used to improve the user experience, such as keeping you logged in, remembering what is in your shopping cart, and more.

Of course, first-party cookies can also be used to track your activity, like with Google Analytics, but they can't be used to follow you beyond that site. While both types of cookies can be used to track users, third-party cookies are much more invasive and problematic.

In 2018, I made the decision to remove all tracking tools, including Google Analytics, from my personal site. My website aspires to the privacy of printed works. The anonymity of a book holds a unique charm, and I find joy in not knowing who visits my website.

That said, I have no issue with the use of first-party cookies, provided it's consensual and used to improve the user experience. I understand their importance for many organizations, especially in marketing.

Fortunately, the era of third-party cookies is coming to a close. Browsers like Safari and Firefox have already taken steps to limit third-party tracking. They still allow certain third-party cookies to ensure websites work properly. Two examples include:

  1. E-commerce sites often rely on third-party cookies for integrating payment systems. Blocking these cookies could disrupt the payment process.
  2. Complex digital platforms, like healthcare and government websites, sometimes use cross-site authentication to link departmental websites.  Blocking these could prevent access to important services.

While Safari and Firefox have been limiting third-party cookies for some time, Google Chrome is lagging behind.  Google only began phasing out third-party cookies in early 2024 (a few weeks ago), starting with just 1% of its users. Their plan is to expand this to all users by the end of 2024.

Google's strategy can be viewed in two ways:

  1. Negatively, it could look like Google is delaying the phase-out because it profits from the advertising revenue these cookies generate.
  2. Positively, Google is cautious in removing third-party cookies to avoid disrupting websites that rely on them for non-advertising purposes.

As a regular Chrome user, I didn't want to wait until the end of 2024. If you feel the same, you can block third-party cookies in Chrome now: head to Settings, select Privacy and Security, and activate Block third-party cookies. Be aware that some sites might stop working.

If you manage or develop websites, check whether they rely on third-party cookies. Use my HTTP Header Analyzer to look for SameSite=None attributes, which indicate cookies intended for third-party use, or test by disabling third-party cookies in Chrome.
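
For reference, here is what to look for in the Set-Cookie response headers (the cookie names below are made up for illustration): a cookie intended for third-party contexts has to be marked SameSite=None and Secure, whereas typical first-party cookies use SameSite=Lax or Strict:

Set-Cookie: session_id=abc123; Path=/; Secure; HttpOnly; SameSite=Lax
Set-Cookie: tracker_id=xyz789; Path=/; Secure; SameSite=None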

February 04, 2024

Why your app turned into spaghetti

Cover Image

"I do not like your software sir,
your architecture's poor.

Your users can't do anything,
unless you code some more.

This isn't how it used to be,
we had this figured out.

But then you had to mess it up
by moving into clouds."

There's a certain kind of programmer. Let's call him Stanley.

Stanley has been around for a while, and has his fair share of war stories. The common thread is that poorly conceived and specced solutions lead to disaster, or at least, ongoing misery. As a result, he has adopted a firm belief: it should be impossible for his program to reach an invalid state.

Stanley loves strong and static typing. He's a big fan of pattern matching, and enums, and discriminated unions, which allow correctness to be verified at compile time. He also has strong opinions on errors, which must be caught, logged and prevented. He uses only ACID-compliant databases, wants foreign keys and triggers to be enforced, and wraps everything in atomic transactions.

He hates any source of uncertainty or ambiguity, like untyped JSON or plain-text markup. His APIs will accept data only in normalized and validated form. When you use a Stanley lib, and it doesn't work, the answer will probably be: "you're using it wrong."

Stanley is most likely a back-end or systems-level developer. Because nirvana in front-end development is reached when you understand that this view of software is not just wrong, but fundamentally incompatible with the real world.

I will prove it.

Alice in wonderland

State Your Intent

Take a text editor. What happens if you press the up and down arrows?

The keyboard cursor (aka caret) moves up and down. Duh. Except it also moves left and right.

The editor state at the start has the caret on line 1 column 6. Pressing down will move it to line 2 column 6. But line 2 is too short, so the caret is forcibly moved left to column 1. Then, pressing down again will move it back to column 6.

It should be obvious that any editor that didn't remember which column you were actually on would be a nightmare to use. You know it in your bones. Yet this only works because the editor allows the caret to be placed on a position that "does not exist." What is the caret state in the middle? It is both column 1 and column 6.

Intent - State - View

To accommodate this, you need more than just a View that is a pure function of a State, as is now commonly taught. Rather, you need an Intent, which is the source of truth that you mutate... and which is then parsed and validated into a State. Only then can it be used by the View to render the caret in the right place.

To edit the intent, aka what a classic Controller does, is a bit tricky. When you press left/right, it should determine the new Intent.column based on the validated State.column +/- 1. But when you press up/down, it should keep the Intent.column you had before and instead change only Intent.line. New intent is a mixed function of both previous intent and previous state.

The general pattern is that you reuse Intent if it doesn't change, but that new computed Intent should be derived from State. Note that you should still enforce normal validation of Intent.column when editing too: you don't allow a user to go past the end of a line. Any new intent should be as valid as possible, but old intent should be preserved as is, even if non-sensical or inapplicable.
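
As a rough sketch of the caret example (the names and types here are mine, not a prescribed API), the edit logic mixes previous Intent and validated State exactly as described: left/right derive the new column from State, while up/down preserve the old Intent.column even when it is currently out of range:

type Intent = { line: number; column: number };
type State  = { line: number; column: number };

// Clamp the raw intent against the document to get a renderable State.
function validate(intent: Intent, lines: string[]): State {
  const line = Math.min(Math.max(intent.line, 0), lines.length - 1);
  const column = Math.min(Math.max(intent.column, 0), lines[line].length);
  return { line, column };
}

// New intent is a mixed function of previous intent and previous state.
function onArrowKey(key: string, intent: Intent, state: State): Intent {
  switch (key) {
    case 'ArrowLeft':  return { ...intent, column: state.column - 1 };
    case 'ArrowRight': return { ...intent, column: state.column + 1 };
    case 'ArrowUp':    return { ...intent, line: intent.line - 1 };
    case 'ArrowDown':  return { ...intent, line: intent.line + 1 };
    default:           return intent;
  }
}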

Validate vs Mutate

Functionally, for most of the code, it really does look and feel as if the state is just State, which is valid. It's just that when you make 1 state change, the app may decide to jump into a different State than one would think. When this happens, it means some old intent first became invalid, but then became valid again due to a subsequent intent/state change.

This is how applications actually work IRL. FYI.

Dining Etiquette

Knives and Forks

I chose a text editor as an example because Stanley can't dismiss this as just frivolous UI polish for limp wristed homosexuals. It's essential that editors work like this.

The pattern is far more common than most devs realize:

  • A tree view remembers the expand/collapse state for rows that are hidden.
  • Inspector tabs remember the tab you were on, even if currently disabled or inapplicable.
  • Toggling a widget between type A/B/C should remember all the A, B and C options, even if mutually exclusive.

All of these involve storing and preserving something unknown, invalid or unused, and bringing it back into play later.

More so, if software matches your expected intent, it's a complete non-event. What looks like a "surprise hidden state transition" to a programmer is actually the exact opposite. It would be an unpleasant surprise if that extra state transition didn't occur. It would only annoy users: they already told the software what they wanted, but it keeps forgetting.

The ur-example is how nested popup menus should work: good implementations track the motion of the cursor so you can move diagonally from parent to child, without falsely losing focus:

This is an instance of the goalkeeper's curse: people rarely compliment or notice the goalkeeper if they do their job, only if they mess up. Successful applications of this principle are doomed to remain unnoticed and unstudied.

Validation is not something you do once, discarding the messy input and only preserving the normalized output. It's something you do continuously and non-destructively, preserving the mess as much as possible. It's UI etiquette: the unspoken rules that everyone expects but which are mostly undocumented folklore.

This poses a problem for most SaaS in the wild, both architectural and existential. Most APIs will only accept mutations that are valid. The goal is for the database to be a sequence of fully valid states:

Mutate

The smallest possible operation in the system is a fully consistent transaction. This flattens any prior intent.

In practice, a lot of software deviates from this ad hoc. For example, spreadsheets let you create cyclic references, which is by definition invalid. The reason it must let you do this is because fixing one side of a cyclic reference also fixes the other side. A user wants and needs to do these operations in any order. So you must allow a state transition through an invalid state:

Google sheets circular reference
Mutate through invalid state

This requires an effective Intent/State split, whether formal or informal.

Edsger Dijkstra

Because cyclic references can go several levels deep, identifying one cyclic reference may require you to spider out the entire dependency graph. This is functionally equivalent to identifying all cyclic references—dixit Dijkstra. Plus, you need to produce sensible, specific error messages. Many "clever" algorithmic tricks fail this test.

Now imagine a spreadsheet API that doesn't allow for any cyclic references ever. This still requires you to validate the entire resulting model, just to determine if 1 change is allowed. It still requires a general validate(Intent). In short, it means your POST and PUT request handlers need to potentially call all your business logic.

That seems overkill, so the usual solution is bespoke validators for every single op. If the business logic changes, there is a risk your API will now accept invalid intent. And the app was not built for that.

If you flip it around and assume intent will go out-of-bounds as a normal matter, then you never have this risk. You can write the validation in one place, and you reuse it for every change as a normal matter of data flow.
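
A hedged sketch of what that can look like on the server (Express-style; loadIntent, saveIntent and validate are hypothetical helpers, not an existing API): the request handler stores whatever intent arrives and reuses the same validate() as the front-end, instead of a bespoke validator per endpoint:

app.post('/sheet/:id/cells', (req, res) => {
  // Accept possibly out-of-bounds intent as-is, merged onto what we had.
  const intent = patch(loadIntent(req.params.id), req.body);
  // The shared business logic turns it into a State, flagging cycles as
  // error annotations instead of rejecting the request.
  const state = validate(intent);
  saveIntent(req.params.id, intent);
  res.json({ state });
});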

Note that this is not cowboy coding. Records and state should not get irreversibly corrupted, because you only ever use valid inputs in computations. If the system is multiplayer, distributed changes should still be well-ordered and/or convergent. But the data structures you're encoding should be, essentially, entirely liberal to your user's needs.

Git diff

Consider git. Here, a "unit of intent" is just a diff applied to a known revision ID. When something's wrong with a merge, it doesn't crash, or panic, or refuse to work. It just enters a conflict state. This state is computed by merging two incompatible intents.

It's a dirty state that can't be turned into a clean commit without human intervention. This means git must continue to work, because you need to use git to clean it up. So git is fully aware when a conflict is being resolved.

As a general rule, the cases where you actually need to forbid a mutation which satisfies all the type and access constraints are small. A good example is trying to move a folder inside itself: the file system has to remain a sensibly connected tree. Enforcing the uniqueness of names is similar, but also comes with a caution: falsehoods programmers believe about names. Adding (Copy) to a duplicate name is usually better than refusing to accept it, and most names in real life aren't unique at all. Having user-facing names actually requires creating tools and affordances for search, renaming references, resolving duplicates, and so on.

Even among front-end developers, few people actually grok this mental model of a user. It's why most React(-like) apps in the wild are spaghetti, and why most blog posts about React gripes continue to miss the bigger picture. Doing React (and UI) well requires you to unlearn old habits and actually design your types and data flow so it uses potentially invalid input as its single source of truth. That way, a one-way data flow can enforce the necessary constraints on the fly.

The way Stanley likes to encode and mutate his data is how programmers think about their own program: it should be bug-free and not crash. The mistake is to think that this should also apply to any sort of creative process that program is meant to enable. It would be like making an IDE that only allows you to save a file if the code compiles and passes all the tests.

surprised, mind blown, cursing, thinking, light bulb

Trigger vs Memo

Coding around intent is a very hard thing to teach, because it can seem overwhelming. But what's overwhelming is not doing this. It leads to codebases where every new feature makes ongoing development harder, because no part of the codebase is ever finished. You will sprinkle copies of your business logic all over the place, in the form of request validation, optimistic local updaters, and guess-based cache invalidation.

If this is your baseline experience, your estimate of what is needed to pull this off will simply be wrong.

In the traditional MVC model, intent is only handled at the individual input widget or form level. e.g. While typing a number, the intermediate representation is a string. This may be empty, incomplete or not a number, but you temporarily allow that.

I've never seen people formally separate Intent from State in an entire front-end. Often their state is just an ad-hoc mix of both, where validation constraints are relaxed in the places where that was most needed. They might just duplicate certain fields to keep a validated and an unvalidated variant side by side.

There is one common exception. In a React-like, when you do a useMemo with a derived computation of some state, this is actually a perfect fit. The eponymous useState actually describes Intent, not State, because the derived state is ephemeral. This is why so many devs get lost here.

const state = useMemo(
  () => validate(intent),
  [intent]
);

Their usual instinct is that every action that has knock-on effects should be immediately and fully realized, as part of one transaction. Only, they discover some of those knock-on effects need to be re-evaluated if certain circumstances change. Often to do so, they need to undo and remember what it was before. This is then triggered anew via a bespoke effect, which requires a custom trigger and mutation. If they'd instead deferred the computation, it could have auto-updated itself, and they would've still had the original data to work with.

e.g. In a WYSIWYG scenario, you often want to preview an operation as part of mouse hovering or dragging. It should look like the final result. You don't need to implement custom previewing and rewinding code for this. You just need the ability to layer on some additional ephemeral intent on top of the intent that is currently committed. Rewinding just means resetting that extra intent back to empty.

You can make this easy to use by treating previews as a special kind of transaction: now you can make preview states with the same code you use to apply the final change. You can also auto-tag the created objects as being preview-only, which is very useful. That is: you can auto-translate editing intent into preview intent, by messing with the contents of a transaction. Sounds bad, is actually good.
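
In code, that preview layer can stay tiny (a sketch reusing the hypothetical patch and validate from above): the preview is just extra ephemeral intent merged on top of the committed intent, and rewinding means setting it back to null:

const state = useMemo(
  () => validate(previewIntent ? patch(committedIntent, previewIntent) : committedIntent),
  [committedIntent, previewIntent]
);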

The same applies to any other temporary state, for example, highlighting of elements. Instead of manually changing colors, and creating/deleting labels to pull this off, derive the resolved style just-in-time. This is vastly simpler than doing it all on 1 classic retained model. There, you run the risk of highlights incorrectly becoming sticky, or shapes reverting to the wrong style when un-highlighted. You can architect it so this is simply impossible.

The trigger vs memo problem also happens on the back-end, when you have derived collections. Each object of type A must have an associated type B, created on-demand for each A. What happens if you delete an A? Do you delete the B? Do you turn the B into a tombstone? What if the relationship is 1-to-N, do you need to garbage collect?

If you create invisible objects behind the scenes as a user edits, and you never tell them, expect to see a giant mess as a result. It's crazy how often I've heard engineers suggest a user should only be allowed to create something, but then never delete it, as a "solution" to this problem. Everyday undo/redo precludes it. Don't be ridiculous.

The problem is having an additional layer of bookkeeping you didn't need. The source of truth was collection A, but you created a permanent derived collection B. If you instead make B ephemeral, derived via a stateless computation, then the problem goes away. You can still associate data with B records, but you don't treat B as the authoritative source for itself. This is basically what a WeakMap is.
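
A minimal sketch of that idea (deriveB is a placeholder for whatever stateless computation produces a B from an A): derive B on demand and cache it weakly, so each B disappears together with its A and never needs garbage collection of its own:

const bFor = new WeakMap();        // A -> derived B, never a source of truth
function getB(a) {
  let b = bFor.get(a);
  if (!b) {
    b = deriveB(a);                // stateless derivation from A
    bFor.set(a, b);
  }
  return b;
}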

Event Sourcing

In database land this can be realized with a materialized view, which can be incremental and subscribed to. Taken to its extreme, this turns into event-based sourcing, which might seem like a panacea for this mindset. But in most cases, the latter is still a system by and for Stanley. The event-based nature of those systems exists to support housekeeping tasks like migration, backup and recovery. Users are not supposed to be aware that this is happening. They do not have any view into the event log, and cannot branch and merge it. The exceptions are extremely rare.

It's not a system for working with user intent, only for flattening it, because it's append-only. It has a lot of the necessary basic building blocks, but substitutes programmer intent for user intent.

What's most nefarious is that the resulting tech stacks are often quite big and intricate, involving job queues, multi-layered caches, distribution networks, and more. It's a bunch of stuff that Stanley can take joy and pride in, far away from users, with "hard" engineering challenges. Unlike all this *ugh* JavaScript, which is always broken and unreliable and uninteresting.

Except it's only needed because Stanley only solved half the problem, badly.

Patch or Bust

When factored in from the start, it's actually quite practical to split Intent from State, and it has lots of benefits. Especially if State is just a more constrained version of the same data structures as Intent. This doesn't need to be fully global either, but it needs to encompass a meaningful document or workspace to be useful.

It does create an additional problem: you now have two kinds of data in circulation. If reading or writing requires you to be aware of both Intent and State, you've made your code more complicated and harder to reason about.

Validate vs Mutate

More so, making a new Intent requires a copy of the old Intent, which you mutate or clone. But you want to avoid passing Intent around in general, because it's fishy data. It may have the right types, but the constraints and referential integrity aren't guaranteed. It's a magnet for the kind of bugs a type-checker won't catch.

I've published my common solution before: turn changes into first-class values, and make a generic update of type Update<T> be the basic unit of change. As a first approximation, consider a shallow merge {...value, ...update}. This allows you to make an updateIntent(update) function where update only specifies the fields that are changing.

In other words, Update<Intent> looks just like Update<State> and can be derived 100% from State, without Intent. Only one place needs to have access to the old Intent, all other code can just call that. You can make an app intent-aware without complicating all the code.
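
As a first-approximation sketch (illustrative names only, shallow-merge case):

type Update<T> = Partial<T>;

type Intent = { column: number; line: number; label: string };

// The one place that holds the old Intent; everything else just passes updates.
let currentIntent: Intent = { column: 3, line: 14, label: 'cursor' };

function updateIntent(update: Update<Intent>): Intent {
  currentIntent = { ...currentIntent, ...update };
  return currentIntent;
}

updateIntent({ column: 7 }); // only the fields that change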

Validate vs Mutate 2

If your state is cleaved along orthogonal lines, then this is all you need. i.e. If column and line are two separate fields, then you can selectively change only one of them. If they are stored as an XY tuple or vector, now you need to be able to describe a change that only affects either the X or Y component.

const value = {
  hello: 'text',
  foo: { bar: 2, baz: 4 },
};

const update = {
  hello: 'world',
  foo: { baz: 50 },
};

expect(
  patch(value, update)
).toEqual({
  hello: 'world',
  foo: { bar: 2, baz: 50 },
});

So in practice I have a function patch(value, update) which implements a comprehensive superset of a deep recursive merge, with full immutability. It doesn't try to do anything fancy with arrays or strings, they're just treated as atomic values. But it allows for precise overriding of merging behavior at every level, as well as custom lambda-based updates. You can patch tuples by index, but this is risky for dynamic lists. So instead you can express e.g. "append item to list" without the entire list, as a lambda.
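
A toy version to make that concrete (assuming plain JSON-ish data; the real thing also handles per-index patches and per-level merge overrides):

type Obj = Record<string, unknown>;
type Patch = Obj | ((prev: unknown) => unknown);

const isObj = (v: unknown): v is Obj =>
  typeof v === 'object' && v !== null && !Array.isArray(v);

function patch(value: unknown, update: Patch): unknown {
  if (typeof update === 'function') return update(value); // lambda update
  if (!isObj(value) || !isObj(update)) return update;      // atomic override
  const out: Obj = { ...value };
  for (const [k, u] of Object.entries(update)) {
    out[k] = patch(value[k], u as Patch);                  // recurse, immutably
  }
  return out;
}

const doc = { hello: 'text', foo: { bar: 2, baz: 4 }, list: [1, 2] };

patch(doc, { foo: { baz: 50 } });
// { hello: 'text', foo: { bar: 2, baz: 50 }, list: [1, 2] }

patch(doc, { list: (prev: unknown) => [...(prev as number[]), 3] });
// "append item to list" without shipping the entire list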

I've been using patch for years now, and the uses are myriad. To overlay a set of overrides onto a base template, patch(base, overrides) is all you need. It's the most effective way I know to erase a metric ton of {...splats} and ?? defaultValues and != null from entire swathes of code. This is a real problem.

You could also view this as a "poor man's OT", with the main distinction being that a patch update only describes the new state, not the old state. Such updates are not reversible on their own. But they are far simpler to make and apply.

It can still power a global undo/redo system, in combination with its complement diff(A, B): you can reverse an update by diffing in the opposite direction. This is an operation which is formalized and streamlined into revise(…), so that it retains the exact shape of the original update, and doesn't require B at all. The structure of the update is sufficient information: it too encodes some intent behind the change.
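
A sketch of the diff half, shallow only (the revise(…) described above is more refined and isn't shown here):

type Point = { column: number; line: number };
type Update<T> = Partial<T>;

// Collect only the fields of `from` that must change to get back to `to`.
function diff(from: Point, to: Point): Update<Point> {
  const out: Update<Point> = {};
  for (const key of Object.keys(to) as (keyof Point)[]) {
    if (from[key] !== to[key]) out[key] = to[key];
  }
  return out;
}

const before: Point = { column: 3, line: 14 };
const update: Update<Point> = { column: 7 };
const after: Point = { ...before, ...update };

const undo = diff(after, before);       // { column: 3 }
const restored = { ...after, ...undo }; // equals `before` again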

With patch you also have a natural way to work with changes and conflicts as values. The earlier WYSIWYG scenario is just patch(committed, ephemeral) with bells on.

The net result is that mutating my intent or state is as easy as doing a {...value, ...update} splat, but I'm not falsely incentivized to flatten my data structures.

Instead it frees you up to think about what the most practical schema actually is from the data's point of view. This is driven by how the user wishes to edit it, because that's what you will connect it to. It makes you think about what a user's workspace actually is, and lets you align boundaries in UX and process with boundaries in data structure.

Array vs Linked List

Remember: most classic "data structures" are not about the structure of data at all. They serve as acceleration tools to speed up specific operations you need on that data. Having the reads and writes drive the data design was always part of the job. What's weird is that people don't apply that idea end-to-end, from database to UI and back.

SQL tables are shaped the way they are because it enables complex filters and joins. However, I find this pretty counterproductive: it produces derived query results that are difficult to keep up to date on a client. They also don't look like any of the data structures I actually want to use in my code.

A Bike Shed of Schemas

This points to a very under-appreciated problem: it is completely pointless to argue about schemas and data types without citing specific domain logic and code that will be used to produce, edit and consume it. Because that code determines which structures you are incentivized to use, and which structures will require bespoke extra work.

From afar, column and line are just XY coordinates. Just use a 2-vector. But once you factor in the domain logic and etiquette, you realize that the horizontal and vertical directions have vastly different rules applied to them, and splitting might be better. Which one do you pick?

This applies to all data. Whether you should put items in a List<T> or a Map<K, V> largely depends on whether the consuming code will loop over it, or need random access. If an API only provides one, consumers will just build the missing Map or List as a first step. This is O(n log n) either way, because of sorting.

The method you use to read or write your data shouldn't limit use of everyday structure. Not unless you have a very good reason. But this is exactly what happens.

A lot of bad choices in data design come down to picking the "wrong" data type simply because the most appropriate one is inconvenient in some cases. This then leads to Conway's law, where one team picks the types that are most convenient only for them. The other teams are stuck with it, and end up writing bidirectional conversion code around their part, which will never be removed. The software will now always have this shape, reflecting which concerns were considered essential. What color are your types?

{
  order: [4, 11, 9, 5, 15, 43],
  values: {
    4: {...},
    5: {...},
    9: {...},
    11: {...},
    15: {...},
    43: {...},
  },
}

For List vs Map, you can have this particular cake and eat it too. Just provide a List<Id> for the order and a Map<Id, T> for the values. If you structure a list or tree this way, then you can do both iteration and ID-based traversal in the most natural and efficient way. Don't underestimate how convenient this can be.

This also has the benefit that "re-ordering items" and "editing items" are fully orthogonal operations. It decomposes the problem of "patching a list of objects" into "patching a list of IDs" and "patching N separate objects". It makes code for manipulating lists and trees universal. It lets you decide on a case-by-case basis whether you need to garbage collect the map, or whether preserving unused records is actually desirable.
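
A minimal sketch of the shape this takes (illustrative types):

type Id = number;
type Item = { label: string };

type Collection = {
  order: Id[];              // List<Id>: iteration order
  values: Record<Id, Item>; // Map<Id, T>: random access by ID
};

const items: Collection = {
  order: [4, 11, 9],
  values: { 4: { label: 'a' }, 9: { label: 'b' }, 11: { label: 'c' } },
};

// Re-ordering touches only the ID list.
const reordered: Collection = { ...items, order: [9, 4, 11] };

// Editing one item touches only that record in the map.
const edited: Collection = {
  ...items,
  values: { ...items.values, 9: { label: 'B!' } },
};

// Iterate in order, with ID-based lookup.
for (const id of edited.order) console.log(id, edited.values[id].label);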

Limiting it to ordinary JSON or JS types, rather than going full-blown OT or CRDT, is a useful baseline. With sensible schema design, at ordinary editing rates, CRDTs are overkill compared to the ability to just replay edits, or notify conflicts. This only requires version numbers and retries.

Users need those things anyway: just because a CRDT converges when two people edit, doesn't mean the result is what either person wants. The only case where OTs/CRDTs are absolutely necessary is rich-text editing, and you need bespoke UI solutions for that anyway. For simple text fields, last-write-wins is perfectly fine, and also far superior to what 99% of RESTy APIs do.

CRDT

A CRDT is just a mechanism that translates partially ordered intents into a single state. Like, it's cool that you can make CRDT counters and CRDT lists and whatnot... but each CRDT implements only one particular resolution strategy. If it doesn't produce the desired result, you've created invalid intent no user expected. With last-write-wins, you at least have something 1 user did intend. Whether this is actually destructive or corrective is mostly a matter of schema design and minimal surface area, not math.

The main thing that OTs and CRDTs do well is resolve edits on ordered sequences, like strings. If two users are typing text in the same doc, edits higher-up will shift edits down below, which means the indices change when rebased. But if you are editing structured data, you can avoid referring to indices entirely, and just use IDs instead. This sidesteps the issue, like splitting order from values.

For the order, there is a simple solution: a map with a fractional index, effectively a dead-simple list CRDT. It just comes with some overhead.
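
A toy sketch of that (real implementations typically use string keys rather than floats, to avoid running out of precision):

type Id = string;

const order = new Map<Id, number>([
  ['a', 1.0],
  ['b', 2.0],
]);

// Insert 'c' between 'a' and 'b' without renumbering anything else.
order.set('c', (order.get('a')! + order.get('b')!) / 2); // 1.5

const sorted = [...order.entries()]
  .sort(([, i], [, j]) => i - j)
  .map(([id]) => id);

console.log(sorted); // ['a', 'c', 'b']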

Google docs comment

Using a CRDT for string editing might not even be enough. Consider Google Docs-style comments anchored to that text: their indices also need to shift on every edit. Now you need a bespoke domain-aware CRDT. Or you work around it by injecting magic markers into the text. Either way, it seems non-trivial to decouple a CRDT from the specific target domain of the data inside. The constraints get mixed in.

If you ask me, this is why the field of real-time web apps is still in somewhat of a rut. It's mainly viewed as a high-end technical problem: how do we synchronize data structures over a P2P network without any data conflicts? What they should be asking is: what is the minimal amount of structure we need to reliably synchronize, so that users can have a shared workspace where intent is preserved, and conflicts are clearly signposted. And how should we design our schemas, so that our code can manipulate the data in a straightforward and reliable way? Fixing non-trivial user conflicts is simply not your job.

Most SaaS out there doesn't need any of this technical complexity. Consider that a good multiplayer app requires user presence and broadcast anyway. The simplest solution is just a persistent process on a single server coordinating this, one per live workspace. It's what most MMOs do. In fast-paced video games, this even involves lag compensation. Reliable ordering is not the big problem.

The situations where this doesn't scale, or where you absolutely must be P2P, are a minority. If you run into them, you must be doing very well. The solution that I've sketched out here is explicitly designed so it can comfortably be done by small teams, or even just 1 person.

Private CAD app

The (private) CAD app I showed glimpses of above is entirely built this way. It's patch all the way down and it's had undo/redo from day 1. It also has a developer mode where you can just edit the user-space part of the data model, and save/load it.

When the in-house designers come to me with new UX requests, they often ask: "Is it possible to do ____?" The answer is never a laborious sigh from a front-end dev with too much on their plate. It's "sure, and we can do more."

If you're not actively aware that the design of schemas and code is tightly coupled, your codebase will explode, and the bulk of it will be glue. Much of it just serves to translate generalized intent into concrete state or commands. Arguments about schemas are usually just hidden debates about whose job it is to translate, split or join something. This isn't just an irrelevant matter of "wire formats", because changing the structure and format of data also changes how you address specific parts of it.

In an interactive UI, you also need a reverse path, to apply edits. What I hope you are starting to realize is that this is really just the forward path in reverse, on so many levels. The result of a basic query is just the ordered IDs of the records that it matched. A join returns a tuple of record IDs per row. If you pre-assemble the associated record data for me, you actually make my job as a front-end dev harder, because there are multiple forward paths for the exact same data, in subtly different forms. What I want is to query and mutate the same damn store you do, and be told when what changes. It's table-stakes now.

With well-architected data, this can be wired up mostly automatically, without any scaffolding. The implementations you encounter in the wild just obfuscate this, because they don't distinguish between the data store and the model it holds. The fact that the data store should not be corruptible, and should enforce permissions and quotas, is incorrectly extended to the entire model stored inside. But that model doesn't belong to Stanley, it belongs to the user. This is why desktop applications didn't have a "Data Export". It was just called Load and Save, and what you saved was the intent, in a file.

Windows 95 save dialog

Having a universal query or update mechanism doesn't absolve you from thinking about this either, which is why I think the patch approach is so rare: it looks like cowboy coding if you don't have the right boundaries in place. Patch is mainly for user-space mutations, not kernel-space, a concept that applies to more than just OS kernels. User-space must be very forgiving.

If you avoid it, you end up with something like GraphQL, a good example of solving only half the problem badly. Its getter assembles data for consumption by laboriously repeating it in dozens of partial variations. And it turns the setter part into an unsavory mix of lasagna and spaghetti. No wonder, it was designed for a platform that owns and hoards all your data.

* * *

Viewed narrowly, Intent is just a useful concept to rethink how you enforce validation and constraints in a front-end app. Viewed broadly, it completely changes how you build back-ends and data flows to support that. It will also teach you how adding new aspects to your software can reduce complexity, not increase it, if done right.

A good metric is to judge implementation choices by how many other places of the code need to care about them. If a proposed change requires adjustments literally everywhere else, it's probably a bad idea, unless the net effect is to remove code rather than add.

Live Canvas

I believe reconcilers like React or tree-sitter are a major guide stone here. What they do is apply structure-preserving transforms on data structures, and incrementally. They actually do the annoying part for you. I based Use.GPU on the same principles, and use it to drive CPU canvases too. The tree-based structure reflects that one function's state just might be the next function's intent, all the way down. This is a compelling argument that the data and the code should have roughly the same shape.

Back-end vs Front-end split

You will also conclude there is nothing more nefarious than a hard split between back-end and front-end. You know, coded by different people, where each side is only half-aware of the other's needs, but one sits squarely in front of the other. Well-intentioned guesses about what the other end needs will often be wrong. You will end up with data types and query models that cannot answer questions concisely and efficiently, and which must be babysat to not go stale.

In the last 20 years, little has changed here in the wild. On the back-end, it still looks mostly the same. Even when modern storage solutions are deployed, people end up putting SQL- and ORM-like layers on top, because that's what's familiar. The split between back-end and database has the exact same malaise.

None of this work actually helps make the app more reliable, it's the opposite: every new feature makes on-going development harder. Many "solutions" in this space are not solutions, they are copes. Maybe we're overdue for a NoSQL-revival, this time with a focus on practical schema design and mutation? SQL was designed to model administrative business processes, not live interaction. I happen to believe a front-end should sit next to the back-end, not in front of it, with only a thin proxy as a broker.

What I can tell you for sure is: it's so much better when intent is a first-class concept. You neither need nor want to treat user data as something to pussy-foot around, or handle like it's radioactive. You can manipulate and transport it without a care. You can build rich, comfy functionality on top. Once implemented, you may find yourself not touching your network code for a very long time. It's the opposite of overwhelming, it's lovely. You can focus on building the tools your users need.

This can pave the way for more advanced concepts like OT and CRDT, but will show you that neither of them is a substitute for getting your application fundamentals right.

In doing so, you reach a synthesis of Dijkstra and anti-Dijkstra: your program should be provably correct in its data flow, which means it can safely break in completely arbitrary ways.

Because the I in UI meant "intent" all along.


February 03, 2024

A while ago I wrote an article comparing Waterloo with Bakhmut. That was in August 2023. We are in February 2024, and I predict that soon I will have to say the same about Avdiivka.

The city is about to be encircled in the coming month and is in fact already more or less technically encircled (all of its supply roads are within Russian fire range). I think Avdiivka has already surpassed the death toll of Bakhmut many times over.

The number of deaths is also becoming comparable to WWII's Operation Bagration, which saw about 400,000 German and 180,000 Red Army deaths.

The Soviet operation was named after the Georgian prince Pyotr Bagration (1765–1812), a general of the Imperial Russian Army during the Napoleonic Wars.

Wikipedia

February 01, 2024

HTTP headers play a crucial part in making your website fast and secure. For that reason, I often inspect HTTP headers to troubleshoot caching problems or review security settings.

The complexity of the HTTP standard and the challenge of remembering all the best practices led me to develop an HTTP Header Analyzer four years ago.

It is pretty simple: enter a URL, and the tool will analyze the headers sent by your webserver, CMS or web application. It then explains these headers, offers a score, and suggests possible improvements.

For a demonstration, click https://dri.es/headers?url=https://www.reddit.com. As the URL suggests, it will analyze the HTTP headers of Reddit.com.

I began this as a weekend project in the early days of COVID, seeing it as just another addition to my toolkit. At the time, I simply added it to my projects page but never announced or mentioned it on my blog.

So why write about it now? Because I happened to check my log files and, lo and behold, the little scanner that could clocked in more than 5 million scans, averaging over 3,500 scans a day.

So four years and five million scans later, I'm finally announcing it to the world!

If you haven't tried my HTTP header analyzer, check it out. It's free, easy to use, requires no sign-up, and is built to help improve your website's performance and security.

The crawler works with all websites, but naturally, I added some special checks for Drupal sites.

I don't have any major plans for the crawler. At some point, I'd like to move it to its own domain, as it feels a bit out of place as part of my personal website. But for now, that isn't a priority.

For the time being, I'm open to any feedback or suggestions and will commit to making any necessary corrections or improvements.

It's rewarding to know that this tool has made thousands of websites faster and safer! It's also a great reminder to share your work, even in the simplest way – you never know the impact it could have.

January 27, 2024

With FOSDEM just a few days away, it is time for us to enlist your help. Every year, an enthusiastic band of volunteers make FOSDEM happen and make it a fun and safe place for all our attendees. We could not do this without you. This year we again need as many hands as possible, especially for heralding during the conference, during the buildup (starting Friday at noon) and teardown (Sunday evening). No need to worry about missing lunch at the weekend, food will be provided. Would you like to be part of the team that makes FOSDEM tick?

January 25, 2024

I recently added a new page to my website, which displays the top 10 topics of each year. This page serves a dual purpose: it visualizes the evolution of my interests, and provides visitors with an overview of the core content of my site.

As I stared at this new page, it became very obvious that my blog has maintained a consistent focus throughout the years. The top categories have not changed meaningfully in 18 years. The recurring themes include Drupal, Acquia, Open Source, and photography. The latter, a long-term interest, has experienced ebbs and flows during this time.

Just as having the same cereal each day can become boring – a fact my wife kindly reminds me of – covering the same topics for 18 years can lead to a certain "predictability". This realization sparked a desire to spice things up!

A cabinet displaying jars, each with a different interest inside: travel, photography, electronics, tennis, food, investing, coffee, and more. A cabinet with all my interests.

My blog history wouldn't suggest it, but I consider myself a person with many interests. Sometimes I have too many interests. As a result, I'm often drawn into various side projects – another fact my wife kindly reminds me of.

My point is that there are certainly different aspects of my life and interests that I could write more about. With that in mind, a goal for 2024 is to diversify my blog's content. I set a goal to introduce at least two new topics into my top 10 list for 2024, alongside Drupal and Acquia.

For example, I've been passionate about investing for 15 years. A few of my investments have generated a 5,000% return. Yet, I've been reluctant to write about investing for 15 years. The reasons for this hesitation are as diverse as a well-balanced investment portfolio: the challenge of finding enough time, moments of impostor syndrome, fear of confusing my long-term readers, and the sensitivity around discussing financial successes and failures.

Some might say that finance-related blog posts don't exactly bring the heat when it comes to spicing up a blog. They are probably right. But then I reminded myself that this blog is, first and foremost, for me. I write to learn and engage with readers. Consider this a heads-up: brace yourselves for some new flavors on my blog, from finance to who knows what else.

What I love most about blogging is how it stimulates my thoughts and sparks engaging discussions, often offline or via email. Introducing two new topics should most certainly lead to more of that. I look forward to such conversations.

January 21, 2024

As of Autoptimize Pro 2.3 there is an option to delay all JavaScript. Delaying can have a bigger positive performance impact than asyncing or deferring, because AOPro will only load the delayed JS after a delay you define or (better still) only once user interaction (such as swiping, scrolling or mouse movement) is detected. Obviously, when delaying all JS you might want to be able to…

Source

January 18, 2024

2024 will be the first year with a FOSDEM junior track, offering workshops for children aged 7-17. Developers from the FOSS Educational Programming Languages devroom will be organizing these workshops with the help of CoderDojo. At FOSDEM, volunteers from CoderDojo International, CoderDojo Netherlands, and CoderDojo Belgium will be helping during the workshops. There will be workshops from MicroBlocks, Snap!, Zim, MIT App Inventor, Raspberry Pi, Hedy, and CoderDojo. For these workshops, you will need to get a free ticket. Visit https://fosdem.org/2024/schedule/track/junior and click on the workshop of your choice. On the workshop page, you…

On the brevity of life and the permanence of a blog

The year 2024 is particularly symbolic for me. In October, I will celebrate this blog's 20th anniversary and will not be far from one million published words. Twenty years that have made this blog a central, constitutive part of my identity. While I dream of being recognized as a writer, or even as a developer or scientist, I will always remain, first and foremost, a blogger. Blogger is, in fact, the first label attached to my name in the media or on Wikipedia.

Ironically, this 20th anniversary year began with the longest outage this site has ever known. For several days, ploum.net was completely removed from the network. The cause was a massive attack against Sourcehut, my hosting provider, forcing it to migrate in a hurry to new infrastructure.

While I experienced my blog going offline as an amputation, the loss of a part of my identity, I know full well that it causes me no lasting harm and that I am not losing any data. Incidents like this are part of life, and I want to thank the Sourcehut team for their total transparency and the work they have accomplished over the past few days.

This event also made me realize how quickly a lifetime's work can disappear, to general indifference. While I hope my dead-tree books will keep falling off shelves at random or being discovered in second-hand bookshops, my online writings are ephemeral messages in a bottle, liable to sink into the abyss of oblivion at the slightest incident.

This question has obsessed me since I became a father: how do I make sure my online writings outlive me?

Domain name and independence

I hope my blog will still be readable on the inevitable day when Facebook, Medium and Youtube have been relegated, alongside Myspace and Skyblog, to the cabinet of the web's historical memories. And when I see Alias celebrating the 15th anniversary of his blog, I don't think I'm the only one. Blogs are the long-term archives of the web:

Concretely, I avoid all proprietary platforms and tie all my writings to my own domain name: ploum.net. In fact, to be safe, I even have three names, on three different top-level domains: ploum.net, ploum.be, ploum.eu.

In the short term, this may be a bad bet: the Facebook pages of my deceased friends are still online, while domains expire quickly. So I need to prepare the handover of these domains to ensure they can outlive me.

Simplicity and minimalism

Beyond the domain name, the main cause of a site going offline is usually lack of maintenance. A WordPress install hacked because an update was never applied. A database incompatible with a new version of the CMS. Even static site generators need to be updated, along with their dependencies, and are sometimes abandoned by their developers.

To solve this problem, my blog has been auto-generated since December 2022 and produces only very simple HTML, unambiguously compatible with most past, present and future browsers. Each page is completely self-contained and includes its own CSS (42 lines). You can save an article from my blog in your browser and reopen it anywhere, even without a connection. No JavaScript, no framework, no dynamic loading, no fonts, no external calls. It seems extraordinary nowadays, but it's so simple to manage that I still wonder why we bother learning and maintaining layers of frameworks. Thinking about permanence can have immediate benefits!

Another point: I take the greatest care to ensure that the URLs of my posts never change. A link to an article should remain valid as long as the domain name exists.

Available and distributed sources

The sources of my blog are written in Gemtext, the format used on the Gemini network. The particularity of this format is that it is essentially plain text. As long as we have computers, these files will remain readable.

My posts, along with the Python generator that creates the HTML pages, are available via Git, a decentralized tool well known to every developer and certain to be around for decades to come.

The command

git clone https://git.sr.ht/~lioploum/ploum.net

gives you access to all the sources of my blog, which you can build with a simple "python publish.py". It requires no particular dependency other than Python3.

But what if Sourcehut is unavailable? I realize I need to push my sources to several mirrors.

Mirrors?

Speaking of mirrors: a copy of my site, in both web and Gemini versions, is available on rawtext.club.

But this mirror is "accidental". Rawtext.club is a small server run by an enthusiast whom I warmly thank. It is thanks to him that I was able to explore the Gemini universe. I cannot impose serious hosting duties on him, nor count on its permanence.

Sebsauvage also does me the immense honor of maintaining a copy of my publications, in case I am ever forced to take content down:

For the future, I realize today that it would make sense to find a host other than Sourcehut, one that can be synchronized via Git and that also offers Gemini hosting. I could then point ploum.be or ploum.eu at that backup server.

Mailing lists

While I admit I prefer RSS for my own use, mailing lists do offer one advantage: you keep a copy of all past posts in your mailbox. Every new subscriber is therefore, without knowing it, a new archivist of my writings.

I also send a "text-only" version (no HTML) to a second mailing list whose archives are public. Major flaw: unlike the aforementioned mailing lists, these text-only archives are hosted by… Sourcehut, the same host as my blog!

The future

The Internet Protocol (IP) was designed at a time when people thought storage would always be very expensive, but bandwidth essentially free. The principle of IP is therefore to transmit data packets as fast as possible and to forget everything instantly.

As a consequence, protocols built on IP, such as Gemini and HTTPS, are particularly fragile. The loss of a single server can mean the permanent disappearance of entire sites.

New protocols, or new ways of using existing ones, are needed to transform the Internet from a simple "real-time data exchange" into a "planetary archive". This planetary-archive use obviously runs against current financial interests, which want you to pay, directly or through advertising, every time you consult the same content. "Copyright", as the name indicates, seeks to prevent any copying! But, little by little, solutions are appearing, whether new ones like IPFS or rethinking how to use pre-existing technologies.

All of this is still experimental, but I'm keeping an eye on it and thinking about making ploum.net available on IPFS.

The present

The permanence of online content is a complex, long-haul quest. I invite you to think about it for your own creations. What do you want to remain of you online in a few years?

Paradoxically, we leave too many involuntary traces (personal data, comments written in anger, product reviews on Amazon) and we too easily lose the works we actually care about (how many personal videos will disappear with the inevitable end of Youtube?). I myself regret having lost many texts written and published here and there before this blog existed, or, giving in to the sirens of fashion, posted on Google+, Facebook or Medium.

I've drawn one major lesson from this: we don't know in advance which content will matter to our future selves. It is also essential to date content. With the year. I've lost count of the note sheets I've found again but can connect to nothing, unable even to guess which year, or even which half-decade, the text refers to. This lesson is also what motivates my attempt to preserve my writings: I don't feel I've ever written anything that deserves eternal preservation. But a future historian might one day find, in the writings of this obscure and forgotten blogger, a crucial piece of information, slipped by chance into a post, that allows them to understand some of the paradoxes of our era.

For this mission, the technology that seems the most robust, the most resistant for crossing centuries or even millennia, remains the book. Books that my family collects and piles up in every corner of our home. Thousands of authors, living or dead, world-famous or obscure, who keep speaking to us every day!

If you wish to help me in my quest for permanence, I invite you to buy, lend, give away, or abandon my books on a bench.

For when the computers are switched off, the only things that will keep lulling us and speaking to us will be the words engraved on paper, those ephemeral imaginings outliving the pretended immortality of decomposed bodies.

An engineer and writer, I explore the impact of technology on humans, both in writing and in my talks.

Receive my writings in French and English directly by email. Your address will never be shared. You can also use my French-language RSS feed or the full RSS feed.

To support me, buy my books (if possible from your local bookshop)! I have just published a collection of short stories that should make you laugh and think. I am also part of the free and ethical "SF en VF" box set.

January 16, 2024

Maintaining a production dataset at a manageable size can present a considerable challenge during the administration of a MySQL InnoDB Cluster.

Old Days

Back in the day, when we only had one main copy of our data (the source) and one read copy (the replica) that we used to look at current and old data from our main system, we used a special trick to remove data from the source without affecting the replica. The trick was to turn off binary logging for our delete statements on the source. External tools like pt-archiver were also able to use that trick. To bypass writing into the binary log, we used the command: SET SQL_LOG_BIN=0.

This meant that on the main production server (the replication source), we were purging the data without writing the delete operations into the binary logs.

Current Days

These days, nobody runs a single MySQL node in production. High Availability is highly recommended (required!).

So people use MySQL InnoDB Cluster as their main production solution. In such an environment, keeping the dataset at a reasonable size is also recommended (CLONE, backups, …).

However, the old solution for archiving data no longer works: if we skip writing events to the binary logs, those events won't be replicated to the other members of the cluster, since Group Replication also relies on those binary logs.

This means we will end up with an inconsistent cluster where all the nodes have a different dataset!

GTIDs can assist in the process; however, it is often challenging to recall the particular UUID component of the GTID that is utilized for archiving purposes. Moreover, it gets complicated to figure out where these transactions come from.

MySQL 8.3, which launched today, introduces even more efficient ways to streamline our purging and archiving workflows.

GTID TAGS

With MySQL 8.3, the GTID format has evolved and now includes the possibility of adding an identifiable tag:

source_id:tag:transaction_id

Tags are of course not mandatory and GTIDs will work as before too.

So, on the replica we want to use as the archiving server, we will change the GTID set to include a large range of transactions coming from the cluster, but with a tag we define. In this example, I will use "archiving".

On a member of the MySQL InnoDB Cluster, the following query was executed to retrieve the UUID segment of the GTID used by the Group:

InnoDB Cluster (any node) > show global variables like '%group_name';
+------------------------------+--------------------------------------+
| Variable_name                | Value                                |
+------------------------------+--------------------------------------+
| group_replication_group_name | 1e4516ec-b47c-11ee-bb6e-c8cb9e32df8e |
+------------------------------+--------------------------------------+

Now, on the replica used as the archiving server, I run the following statement:

Replica > set global gtid_purged=
             "+1e4516ec-b47c-11ee-bb6e-c8cb9e32df8e:archiving:1-9999";

On the Primary instance, we can now start the archiving process like this:

InnoDB Cluster Primary > set gtid_next="automatic:archiving";
InnoDB Cluster Primary > delete from t1 where date(timestamp)<"2023-02-01";
InnoDB Cluster Primary > set gtid_next="automatic";

It’s of course better to use a dedicated session to archive the data.

This is the illustration of the process:

It’s imperative to ensure to use a large range and closely monitor its usage on the cluster. In this example we could only have 9,999 purging transaction, upon reaching this threshold, next actions will be applied on the replica too.

Of course, it was already possible to perform such an operation with standard GTIDs, but it was more complicated, or you needed to use a specific UUID. With the new GTID tags, we can directly and easily identify the purging transactions.

Integration with external tools is also easier. For example, here is how to use GTIDs and tags with pt-archiver:

./pt-archiver \
--source h=production_cluster,P=3306,u=user,p=xxxx,D=mydata,t=t1 \
--purge --where 'date(timestamp)<"2023-04-01"' \
--set-vars 'GTID_NEXT="automatic:archiving"'

Conclusion

MySQL 8.3 marks a notable improvement for database administrators who manage High Availability environments using MySQL InnoDB Cluster. With the introduction of GTID tags, the archiving and purging process becomes substantially more manageable and less prone to human error. These tags allow for clear identification of transaction sources, thus avoiding the pitfalls of inconsistent datasets across cluster nodes.

Beyond simplifying the identification process, GTID tags facilitate the integration with external tools.

Enjoy archiving and purging your data!

January 12, 2024

The MySQL Belgian Days are sold out: our event, taking place on February 1st and 2nd, 2024 in Brussels, has clearly attracted a lot of interest.

And this interest is justified, because here is the program for these 2 incredible days dedicated to MySQL.

Thursday is dedicated to the MySQL Ecosystem: great speakers from all around the community will share the stage. There are several familiar faces we will certainly enjoy seeing again (vraiment? oui oui!)*.

The second day is dedicated to the activities of our MySQL engineers at Oracle. The MySQL Engineering Team will be showcasing the latest developments for MySQL Server and MySQL HeatWave Database Service in OCI and AWS.

During the intermissions, do not hesitate to strike up a conversation with the speakers and take the opportunity to engage directly with the MySQL engineers.

On Friday, we will conclude the day by extending a warm welcome to the new MySQL Rockstars. Be sure not to miss the award ceremony.

And finally, on Friday night, the Belgian Party Squad is organizing the traditional MySQL & Friends Community Dinner. Don't forget to register; tickets are going fast!

See you soon in Belgium!!

(*) private joke

January 08, 2024

At the beginning of every year, I publish a retrospective that looks back at the previous year at Acquia. I also discuss the changing dynamics in our industry, focusing on Content Management Systems (CMS) and Digital Experience Platforms (DXP).

If you'd like, you can read all of my retrospectives for the past 15 years: 2022, 2021, 2020, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012, 2011, 2010, 2009.

Resilience and growth amid market turbulence

At the beginning of 2023, interest rates were 4.5%. Technology companies, investors, and PE firms were optimistic, anticipating modest growth. However, as inflation persisted and central banks raised rates more than expected, this optimism dwindled.

The first quarter also saw a regional bank crisis, notably the fall of Silicon Valley Bank, which many tech firms, including Acquia, relied on. Following these events, the market's pace slowed, and the early optimism shifted to a more cautious outlook.

Despite these challenges, Acquia thrived. We marked 16 years of revenue increase, achieved record renewal rates, and continued our five-year trend of rising profitability. 2023 was another standout year for Acquia.

One of our main objectives for 2023 was to expand our platform through M&A. However, tighter credit lending, valuation discrepancies, and economic uncertainty complicated these efforts. By the end of 2023, with the public markets rebounding, the M&A landscape showed slight improvement.

In November, we announced Acquia's plan to acquire Monsido, a platform for improving website accessibility, content quality, SEO, privacy, and performance. The acquisition closed last Friday. I'm excited about expanding the value we offer to our customers and look forward to welcoming Monsido's employees to Acquia.

Working towards a safer, responsible and inclusive digital future

Image with panels showcasing various digital trends: one emphasizes inclusivity with a woman in a wheelchair at a workstation, and another shows a man interacting with an AI robot, illustrating human-AI collaboration.

Looking ahead to 2024, I anticipate these to be the dominant trends in the CMS and DXP markets:

  • Converging technology ecosystems: MACH and Jamstack are evolving beyond their original approaches. As a result, we'll see their capabilities converge with one another, and with those of traditional CMSes. I wrote extensively about this in Jamstack and MACH's journey towards traditional CMS concepts.
  • Navigating the cookie-less future: Marketers will need to navigate a cookie-less future. This means organizations will depend more and more on data they collect from their own digital channels (websites, newsletters, video platforms, etc).
  • Digital decentralization: The deterioration of commercial social media platforms has been a positive development in my opinion. I anticipate users will continue to reduce their time on these commercial platforms. The steady shift towards open, decentralized alternatives like Mastodon, Nostr, and personal websites is a welcome trend.
  • Growth in digital accessibility: The importance of accessibility is growing and will become even more important in 2024 as organizations prepare for enforcement of the European Accessibility Act in 2025. This trend isn't just about responding to legislation; it's about making sure digital experiences are inclusive to everyone, including individuals with disabilities.
  • AI's impact on digital marketing and websites: As people start getting information directly from Artificial Intelligence (AI) tools, organic website traffic will decline. Just like with the cookie-less future, organizations will need to focus more on growing their own digital channels with exclusive and personalized content.
  • AI's impact on website building: We'll witness AI's evolution from assisting in content production to facilitating the building of applications. Instead of laboriously piecing together landing pages or campaigns with a hundred clicks, users will simply be able to guide the process with AI prompts. AI will evolve to become the new user interface for complex tasks.
  • Cybersecurity prioritization: As digital landscapes expand, so do vulnerabilities. People and organizations will become more protective of their personal and privacy data, and will demand greater control over the sharing and storage of their information. This means a growing focus on regulation, more strict compliance rules, automatic software updates, AI-driven monitoring and threat detection, passwordless authentication, and more.
  • Central content and data stores: Organizations are gravitating more and more towards all-in-one platforms that consolidate data and content. This centralization enables businesses to better understand and anticipate customer needs, and deliver better, personalized customer experiences.

While some of these trends suggest a decline in the importance of traditional websites, other trends point towards a positive future for websites. On one side, the rise of AI in information gathering will decrease the need for traditional websites. On the other side, the decline of commercial social media and the shift to a cookie-less future suggest that websites will continue to be important, perhaps even more so.

What I like most about many of these trends is that they are shaping a more intuitive, inclusive, and secure digital future. Their impact on end-users will be profound, making every interaction more personal, accessible, and secure.

However, I suspect the ways in which we do digital marketing will need to change quite a bit. Marketing teams will need to evolve how they generate leads. They'll have to use privacy-friendly methods to develop strong customer relationships and offer more value than what AI tools provide.

This means getting closer to customers with content that is personal and relevant. The use of intent data, first-party data and predictive marketing for determining the "next best actions" will continue to grow in importance.

It also means that more content may transition into secure areas such as newsletters, members-only websites, or websites that tailor content dynamically for each user, where it can't be mined by AI tools.

All this bodes well for CMSes, Customer Data Platforms (CDPs), personalization software, Account-Based Marketing (ABM), etc. By utilizing these platforms, marketing teams can better engage with individuals and offer more meaningful experiences. Acquia is well-positioned based on these trends.

Reaffirming our DXP strategy, with a focus on openness

On a personal front, my title expanded from CTO to CTO & Chief Strategy Officer. Since Acquia's inception, I've always played a key role in shaping both our technology and business strategies. This title change reflects my ongoing responsibilities.

Until 2018, Acquia mainly focused on CMS. In 2018, we made a strategic shift from being a leader in CMS to becoming a leader in DXP. We have greatly expanded our product portfolio since then. Today, Acquia's DXP includes CMS, Digital Asset Management (DAM), Customer Data Platform (CDP), Marketing Automation, Digital Experience Optimization, and more. We've been recognized as leaders in DXP by analyst firms including Gartner, GigaOm, Aragon Research and Omdia.

As we entered 2023, we felt we had successfully executed our big 2018 strategy shift. With my updated title, I spearheaded an effort to revise our corporate strategy to figure out what is next. The results were that we reaffirmed our commitment to our core DXP market with the goal of creating the best "Open DXP" in the market.

We see "Open" as a key differentiator. As part of our updated strategy, we explicitly defined what "Open" means to us. While this topic deserves a blog post on its own, I will touch on it here.

Being "Open" means we actively promote integrations with third-party vendors. When you purchase an Acquia product, you're not just buying a tool; you're also buying into a technology ecosystem.

However, our definition of "Open" extends far beyond mere integrations. It's also about creating an inclusive environment where everyone is empowered to participate and contribute to meaningful digital experiences in a safe and secure manner. Our updated strategy, while still focused on the DXP ecosystem, champions empowerment, inclusivity, accessibility, and safety.

A slide titled "The Open DXP" detailing its four core principles: integrations that support customization without vendor lock-in, empowerment for user-driven development, accessibility for all backgrounds and abilities, and a focus on safety with security and compliance. A slide from our strategy presentation, summarizing our definition of an Open DXP. The definition is the cornerstone of Acquia's "Why"-statement.

People who have followed me for a while know that I've long advocated for an Open Web, promoting inclusivity, accessibility, and safety. It's inspiring to see Acquia fully embrace these principles, a move I hope will inspire not just me, but our employees, customers, and partners too. It's not just a strategy; it's a reflection of our core values.

It probably doesn't come as a surprise that our updated strategy aligns with the trends I outlined above, many of which also point towards a safer, more responsible, and inclusive digital future. Our enthusiasm for the Monsido acquisition is also driven by these core principles.

Needless to say, our strategy update is about much more than a commitment to openness. Our commitment to openness drives a lot of our strategic decisions. Here are a few key examples to illustrate our direction.

  • Expanding into the mid-market: Acquia has primarily catered to the enterprise and upper mid-market sectors. We're driven by the belief that an open platform, dedicated to inclusivity, accessibility, and safety, enhances the web for everyone. Our commitment to contributing to a better web is motivating us to broaden our reach, making expanding into the mid-market a logical strategic move.
  • Expanding partnerships, empowering co-creation: Our partnership program is well-established with Drupal, and we're actively expanding partnerships for Digital Asset Management (DAM), CDP, and marketing automation. We aim to go beyond a standard partner program by engaging more deeply in co-creation with our partners, similar to what we do in the Open Source community. The goal is to foster an open ecosystem where everyone can contribute to developing customer solutions, embodying our commitment to empowerment and collaboration. We've already launched a marketplace in 2023, Acquia Exchange, featuring more than 100 co-created solutions, with the goal of expanding to 500 by the end of 2024.
  • Be an AI-fueled organization: In 2023, we launched numerous AI features and we anticipate introducing even more in 2024. Acquia already adheres to responsible AI principles. This aligns with our definition of "Open", emphasizing accountability and safety for the AI systems we develop. We want to continue to be a leader in this space.

Stronger together

We've always been very focused on our greatest asset: our people. This year, we welcomed exceptional talent across the organization, including two key additions to our Executive Leadership Team (ELT): Tarang Patel, leading Corporate Development, and Jennifer Griffin Smith, our Chief Market Officer. Their expertise has already made a significant impact.

Eight logos in a row with different '2023 Best Places to Work' awards. Eight awards for "Best Places to Work 2023" in various categories such as IT, mid-size workplaces, female-friendly environments, wellbeing, and appeal for millennials, across India, the UK, and the USA.

In 2023, we dedicated ourselves to redefining and enhancing the Acquia employee experience, committing daily to its principles through all our programs. This focus, along with our strong emphasis on diversity, equity, and inclusion (DEI), has cultivated a culture of exceptional productivity and collaboration. As a result, we've seen not just record-high employee retention rates but also remarkable employee satisfaction and engagement. Our efforts have earned us various prestigious "Best Place to Work" awards.

Customer-centric excellence, growth, and renewals

Our commitment to delivering great customer experiences is evident in the awards and recognition we received, many of which are influenced by customer feedback. These accolades include recognition on platforms like TrustRadius and G2, as well as the prestigious 2023 CODiE Award.

A list of 32 awards across various products for 2023, including 'Leader' and 'High Performer' from G2, 'Best Of,' 'Best Feature Set,' 'Best Value for Price,' and 'Best Relationship' from TrustRadius, and the '2023 SIIA CODiE Winner' recognition, all highlighting top customer ratings. Acquia received 32 awards for leadership in various categories across its products.

As mentioned earlier, we delivered consistently excellent, and historically high, renewal rates throughout 2023. It means our customers are voting with their feet (and wallets) to stay with Acquia.

Furthermore, we achieved remarkable growth within our customer base with record rates of expansion growth. Not only did customers choose to stay with Acquia, they chose to buy more from Acquia as well.

To top it all off, we experienced a substantial increase in the number of customers who were willing to serve as references for Acquia, endorsing our products and services to prospects.

Many of the notable customer stories for 2023 came from some of the world's most recognizable organizations, including:

  • Nestlé: With thousands of brand sites hosted on disparate technologies, Nestlé brand managers had difficulty maintaining, updating, and securing brand assets globally. Not only was it a challenge to standardize and govern the multiple brands, it was costly to maintain resources for each technology and inefficient with work being duplicated across sites. Today, Nestlé uses Drupal, Acquia Cloud Platform and Acquia Site Factory to face these challenges. Nearly all (90%) of Nestlé sites are built using Drupal. Across the brand's entire portfolio of sites, approximately 60% are built on a shared codebase – made possible by Acquia and Acquia Site Factory.
  • Novartis: The Novartis product sites in the U.S. were fragmented across multiple platforms, with different approaches and capabilities and varying levels of technical debt. This led to uncertainty in the level of effort and time to market for new properties. Today, the Novartis platform built with Acquia and EPAM has become a model within the larger Novartis organization for how a design system can seamlessly integrate with Drupal to build a decoupled front end. The new platform allows Novartis to create new or move existing websites in a standardized design framework, leading to more efficient development cycles and more functionality delivered in each sprint.
  • US Drug Enforcement Administration: The U.S. DEA wanted to create a campaign site to increase public awareness regarding the increasing danger of fake prescription pills laced with fentanyl. Developed with Tactis and Acquia, the campaign website One Pill Can Kill highlights the lethal nature of fentanyl. The campaign compares real and fake pills through videos featuring parents and teens who share their experiences with fentanyl. It also provides resources and support for teens, parents, and teachers and discusses the use of Naloxone in reversing the effects of drug overdose.
  • Cox Automotive: Cox Automotive uses first-party data through Acquia Campaign Studio for better targeted marketing. With their Automotive Marketing Platform (AMP) powered by Acquia, they access real-time data and insights, delivering personalized messages at the right time. The results? Dealers using AMP see consumers nine times more likely to purchase within 45 days and a 14-fold increase in sales gross profit ROI.

I'm proud of outcomes like this: they show how valuable our DXP is to our customers.

Product innovation

In 2023, we remained focused on solving problems for our current and future customers. We use both quantitative and qualitative data to assess areas of opportunities and run hypothesis-driven experiments with design prototypes, hackathons, and proofs-of-concept. This approach has led to hundreds of improvements across our products, both by our development teams and through partnerships. Below are some key innovations that have transformed the way our customers operate:

  • We released many AI features in 2023, including AI assistants and automations for Acquia DAM, Acquia CDP, Acquia Campaign Studio, and Drupal. This includes: AI assist during asset creation in Drupal and Campaign Studio, AI-generated descriptions for assets and products in DAM, auto-tagging in DAM with computer vision, next best action/channel predictions in CDP, ML-powered customer segmentation in CDP, and much more.
  • Our Drupal Acceleration Team (DAT) worked with the Drupal community on a major upgrade of the Drupal field UI, which makes it significantly faster and more user-friendly to perform content modeling. We also open sourced Acquia Migrate Accelerate as part of the run-up to the Drupal 7 community end-of-life in January 2025. Finally, DAT contributed to a number of major ongoing initiatives, including Project Browser, Automatic Updates, Page Building, Recipes, and more, that will be seen in later versions of Drupal.
  • We launched a new trial experience for Acquia Cloud Platform, our Drupal platform. Organizations can now explore Acquia's hosting and developer tools to see how their Drupal applications perform on our platform.
  • Our Kubernetes-native Drupal hosting platform backed by AWS, Acquia Cloud Next, continued to roll out to more customers. Over two-thirds of our customers are now enjoying Acquia Cloud Next, which provides them with the highest levels of performance, self-healing, and dynamic scaling. We've seen a 50% decrease in critical support tickets since transitioning customers to Acquia Cloud Next, all while maintaining an impressive uptime record of 99.99% or higher.
  • Our open source marketing automation tool, Acquia Campaign Studio, is now running on Acquia Cloud Next as its core processing platform. This consolidation benefits everyone: it streamlines and accelerates innovation for us while enabling our customers to deliver targeted and personalized messages at a massive scale.
  • We decided to make Mautic a completely independent Open Source project, letting it grow and change freely. We've remained the top contributor ever since.
  • Marketers can now easily shape the Acquia CDP data model using low-code tools, custom attributes and custom calculations features. This empowers all Acquia CDP users, regardless of technical skill, to explore new use cases.
  • Acquia CDP's updated architecture enables nearly limitless elasticity, which allows the platform to scale automatically based on demand. We put this to the test during Black Friday, when our CDP efficiently handled billions of events. Our new architecture has led to faster, more consistent processing times, with speeds improving by over 70%.
  • With Snowflake as Acquia's data backbone, Acquia customers can now collaborate on their data within their organization and across business units. Customers can securely share and access governed data while preserving privacy, offering them advanced data strategies and solutions.
  • Our DAM innovation featured 47 updates and 13 new integrations. These updates included improved Product Information Management (PIM) functionality, increased accessibility, and a revamped search experience. Leveraging AI, we automated the generation of alt-text and product descriptions, which streamlines content management. Additionally, we established three partnerships to enhance content creation, selection, and distribution in DAM: Moovly for AI-driven video creation and translation, Vizit for optimizing content based on audience insights, and Syndic8 for distributing visual and product content across online commerce platforms.
  • With the acquisition of Monsido and new partnerships with VWO (tools for optimizing website engagement and conversions) and Conductor (SEO platform), Acquia DXP now offers an unparalleled suite of tools for experience optimization. Acquia already provided the best tools to build, manage and operate websites. With these additions, Acquia DXP also offers the best solution for experience optimization.
  • Acquia also launched Acquia TV, a one-stop destination for all things digital experience. It features video podcasts, event highlights, product breakdowns, and other content from a diverse collection of industry voices. This is a great example of how we use our own technology to connect more powerfully with our audiences. It's something our customers strive to do every day.

Conclusion

In spite of the economic uncertainties of 2023, Acquia had a remarkable year. We achieved our objectives, overcame challenges, and delivered outstanding results. I'm grateful to be in the position that we are in.

Our achievements in 2023 underscore the importance of putting our customers first and nurturing exceptional teams. Alongside effective management and financial responsibility, these elements fuel ongoing growth, irrespective of economic conditions.

Of course, none of our results would be possible without the support of our customers, our partners, and the Drupal and Mautic communities. Last but not least, I'm grateful for the dedication and hard work of all Acquians who made 2023 another exceptional year.

January 03, 2024

A skilled craftsperson works on the intricate mechanism of a droplet-shaped vintage watch.

Since 1999, I've been consistently working on this website, making it one of my longest-standing projects. Even after all these years, the satisfaction of working on my website remains strong. Remarkable, indeed.

During rare moments of calm — be it a slow holiday afternoon, a long flight home, or the early morning stillness — I'm often drawn to tinkering with my website.

When working on my website, I often make small tweaks and improvements. Much like a watchmaker meticulously fine-tuning the gears of an antique clock, I pay close attention to details.

This holiday, I improved the lazy loading of images in my blog posts, leading to a perfect Lighthouse score. A perfect score isn't necessary, but it shows the effort and care I put into my website.

Screenshot of dri.es' Lighthouse scores showing 100% in performance, accessibility, best practices, and SEO (via https://pagespeed.web.dev/).

I also validated my RSS feeds, uncovering a few opportunities for improvement. Like a good Belgian schoolboy, I promptly implemented these improvements, added new PHPUnit tests and integrated them into my CI/CD pipeline. Some might consider this overkill for a personal site, but for me, it's about mastering the craft, adhering to high standards, and building something that is durable.

Last year, I added 135 new photos to my website, a way for me to document my adventures and family moments. As the year drew to a close, I made sure all new photos have descriptive alt-texts, ensuring they're accessible to all. Writing alt-texts can be tedious, yet it's these small but important details that give me satisfaction.

Just like the watchmaker working on an antique watch, it's not just about keeping time better; it's about cherishing the process and craft. There is something uniquely calming about slowly iterating on the details of a website. I call it The Watchmaker's Approach to Web Development, where the process holds as much value as the result.

I'm thankful for my website as it provides me a space where I can create, share, and unwind. Why share all this? Perhaps to encourage more people to dive into the world of website creation and maintenance.

January 02, 2024

December 27, 2023

I have now been writing software for high-end five-axis CNC machines at Heidenhain for about seven years.

We are working on the TNC 7 software, the successor to the TNC 640 software. Together with a team of developers based in Traunreut, I am the developer of mainly the NC editor and everything that goes with it (all the integrations with the surrounding components of everything else involved, and that is a lot).

If I look at the TNC 640, this is software that will last forty years.

If I look at our own Belgian aerospace industry, but also for instance the arms industry, or really just about anything, this is software that is used pretty much everywhere. For pretty much everything. The jobs page of FN Herstal, for example, has for a few years shown people actually operating a CNC machine with a Heidenhain TNC 640. So soon our new TNC 7 will be used there! (To make, who knows, for example the machine guns for our soldiers.)

Anyway. Enough bragging about that!

So lately I have become interested in the world of metalworking. When I was a young guy of about 15 or 16, my grandfather was a lathe metalworker (at AGFA-Gevaert). That man was always proud to explain it all to me, over and over.

At school, during my years of electromechanics, I also got to (or had to) take a few tests on a simple manual lathe.

That was actually quite interesting. But computers, those were very interesting too back then!

Today it all comes together for me.

But I have to say. No, really. The real 'hackers' (modifying a system so that it does more than what it was designed for) are found far more in the machining world than in the, or our, software world. Although we do have a few of them walking around too.

You (or we) other software developers often have no idea how enormously vast that other world is. Both in wood and in metal. It is overflowing with hackers.

All I want to say, especially to the youngsters: keep your interests broad. Go really far. It is all so incredibly interesting.

And above all: it is in the combination of software and something else. That can be security. But also machining. Or medical (imaging and such)? A combination of everything. I have spoken with people who make tools for CNC machines that produce parts for surgeons.

Just cranking out Python code is not the essence of our passion. You write that Python code so that it works with or for something else. So you also need to know, or want to know, that something else.

December 23, 2023

What we lack in the West are analysts who can move in their opinions.

A big problem is that anyone who says anything is immediately pinned down.

Such a person can never go in a different direction again.

We have now ended up in a military conflict with Russia. Our analysts must be able to move again the way the wind blows. Because it blows all the time.

New insights are actually more important than all our old nonsense.

We should not simply cancel people. We must once again allow analysts to have made mistakes, to make mistakes, and therefore to want to adjust.

I also ask unambiguously that Germany stand up and bring its army fully to where it belongs: the greatest power of Europe. The World Wars are over and what happened has happened. We need Germany to protect us. We must contribute to that.

Zeit bleibt nicht stehen. (Time does not stand still.)

December 22, 2023

The current times are, I think, best described by Rammstein's Zeit.

The war between Russia and Ukraine is essentially completely lost for the West. Russia will almost certainly take a large part of Ukraine.

Masses of men in Ukraine are dying right now. Roughly 20,000 people per month. That was about 650 today.

This matches what DPA, Weeb Union, History Legends and Military Summary are saying.

I concluded a while ago that Western propaganda (yes, sorry guys, by now it really is propaganda) is lying massively. And that the sources above give a much better and clearer picture of reality.

That reality is bitterly hard. Namely that Russia is winning this. Not a little. But seriously and truly.

An extremely large number of people in Ukraine are dying. Every day.

There are also no gained territories. You can hardly find them on a map if they are there at all. And that is if they have not already been retaken by Russia.

Zeit, bitte bleib stehen, bleib stehen. Zeit, das soll immer so weitergehen. (Time, please stand still, stand still. Time, it should always go on like this.)

Rammstein, song Zeit

The reality is that from now on Russia will start taking territory. So time will not stand still. Let alone come to a stop.

That is hard and these are hard words. These are hard times.

The West will lose this.

Whatever you may think of that.

I could have put MySQL between parentheses in the title, as this article is more about how to use OpenTofu to deploy on OCI.

I will explain how to install OpenTofu and how to use it to deploy on OCI. I will also mention the changes required to be able to use my previous Terraform deployment files.

As an example, let’s use the modules to deploy WordPress with MySQL HeatWave Database Service: oci-wordpress-mds.

Installing OpenTofu

If, like me, you are using an RPM-based Linux distro, you can find the necessary information to create the yum repository on OpenTofu's website:

$ sudo su -
# cat >/etc/yum.repos.d/opentofu.repo <<EOF
[opentofu]
name=opentofu
baseurl=https://packages.opentofu.org/opentofu/tofu/rpm_any/rpm_any/\$basearch
repo_gpgcheck=0
gpgcheck=1
enabled=1
gpgkey=https://get.opentofu.org/opentofu.gpg
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300

[opentofu-source]
name=opentofu-source
baseurl=https://packages.opentofu.org/opentofu/tofu/rpm_any/rpm_any/SRPMS
repo_gpgcheck=0
gpgcheck=1
enabled=1
gpgkey=https://get.opentofu.org/opentofu.gpg
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300
EOF

When the repo is created, you just need to use yum (or dnf) like this:

$ sudo dnf install -y tofu

You can verify that OpenTofu is installed correctly by running the following command that returns the installed version:

$ tofu version
OpenTofu v1.6.0-rc1
on linux_amd64

Terraform code

To test, we download the code from GitHub (v1.9.2):

$ git clone https://github.com/lefred/oci-wordpress-mds.git
cd oci-wordpress-mds

We first need to copy the file terraform.tfvars.template to terraform.tfvars and edit the content with our OCI information (tenancy, ocids, keys, …).

When ready, we can start with the initialization of the environment:

$ tofu init

Initializing the backend...
Initializing modules...
- mds-instance in modules/mds-instance
- wordpress in modules/wordpress

Initializing provider plugins...
- Finding latest version of hashicorp/template...
- Finding latest version of hashicorp/oci...
- Finding latest version of hashicorp/tls...
- Installing hashicorp/template v2.2.0...
- Installed hashicorp/template v2.2.0 (signed, key ID 0C0AF313E5FD9F80)
- Installing hashicorp/tls v4.0.5...
- Installed hashicorp/tls v4.0.5 (signed, key ID 0C0AF313E5FD9F80)

Providers are signed by their developers.
If you'd like to know more about provider signing, you can read about it here:
https://opentofu.org/docs/cli/plugins/signing/
╷
│ Error: Failed to query available provider packages
│ 
│ Could not retrieve the list of available versions for provider hashicorp/oci: provider registry
│ registry.opentofu.org does not have a provider named registry.opentofu.org/hashicorp/oci
│ 
│ All modules should specify their required_providers so that external consumers will get the correct
│ providers when using a module. To see which modules are currently depending on hashicorp/oci, run the
│ following command:
│     tofu providers
│ 
│ If you believe this provider is missing from the registry, please submit a issue on the OpenTofu
│ Registry https://github.com/opentofu/registry/issues/
╵

With Terraform, the same code will work but will return the following warning:

│ Warning: Additional provider information from registry
│ 
│ The remote registry returned warnings for registry.terraform.io/hashicorp/oci:
│ - For users on Terraform 0.13 or greater, this provider has moved to oracle/oci. Please update your
│ source in required_providers.
╵

Provider

Step 1 is to fix the provider and use Oracle's OCI provider. We edit provider.tf and add the following lines:

terraform {
  required_providers {
    oci = {
      source  = "oracle/oci"
    }
  }
}

We can run the init command again:

$ tofu init

Initializing the backend...
Initializing modules...

Initializing provider plugins...
- Finding latest version of hashicorp/template...
- Finding latest version of hashicorp/oci...
- Finding latest version of oracle/oci...
- Finding latest version of hashicorp/tls...
- Installing hashicorp/template v2.2.0...
- Installed hashicorp/template v2.2.0 (signed, key ID 0C0AF313E5FD9F80)
- Installing oracle/oci v5.23.0...
- Installed oracle/oci v5.23.0 (signed, key ID 1533A49284137CEB)
- Installing hashicorp/tls v4.0.5...
- Installed hashicorp/tls v4.0.5 (signed, key ID 0C0AF313E5FD9F80)

Providers are signed by their developers.
If you'd like to know more about provider signing, you can read about it here:
https://opentofu.org/docs/cli/plugins/signing/
╷
│ Error: Failed to query available provider packages
│ 
│ Could not retrieve the list of available versions for provider hashicorp/oci: provider registry
│ registry.opentofu.org does not have a provider named registry.opentofu.org/hashicorp/oci
│ 
│ Did you intend to use oracle/oci? If so, you must specify that source address in each module which
│ requires that provider. To see which modules are currently depending on hashicorp/oci, run the
│ following command:
│     tofu providers
│ 
│ If you believe this provider is missing from the registry, please submit a issue on the OpenTofu
│ Registry https://github.com/opentofu/registry/issues/
╵

We can see that the oracle/oci v5.23.0 provider plugin was installed, but it still fails. Let’s run the recommended tofu providers command:

$ tofu providers

Providers required by configuration:
.
├── provider[registry.opentofu.org/hashicorp/tls]
├── provider[registry.opentofu.org/hashicorp/template]
├── provider[registry.opentofu.org/oracle/oci]
├── module.wordpress
│   ├── provider[registry.opentofu.org/hashicorp/oci]
│   └── provider[registry.opentofu.org/hashicorp/template]
└── module.mds-instance
    └── provider[registry.opentofu.org/hashicorp/oci]

We can observe that the oracle/oci provider plugin is indeed used for the root (.) but we can also see that for the 2 modules (wordpress and mds-instance), hashicorp/oci is still used (and not found).

Let’s add a provider.tf file containing the following lines in both modules:

terraform {
  required_providers {
    oci = {
      source  = "oracle/oci"
    }
  }
}

If we check the providers again, we can see that they now all use the correct one:

$ tofu providers

Providers required by configuration:
.
├── provider[registry.opentofu.org/oracle/oci]
├── provider[registry.opentofu.org/hashicorp/tls]
├── provider[registry.opentofu.org/hashicorp/template]
├── module.mds-instance
│   └── provider[registry.opentofu.org/oracle/oci]
└── module.wordpress
    ├── provider[registry.opentofu.org/oracle/oci]
    └── provider[registry.opentofu.org/hashicorp/template]

We can run init again:

$ tofu init

Initializing the backend...
Initializing modules...

Initializing provider plugins...
- Finding latest version of oracle/oci...
- Finding latest version of hashicorp/tls...
- Finding latest version of hashicorp/template...
- Installing oracle/oci v5.23.0...
- Installed oracle/oci v5.23.0 (signed, key ID 1533A49284137CEB)
- Installing hashicorp/tls v4.0.5...
- Installed hashicorp/tls v4.0.5 (signed, key ID 0C0AF313E5FD9F80)
- Installing hashicorp/template v2.2.0...
- Installed hashicorp/template v2.2.0 (signed, key ID 0C0AF313E5FD9F80)

Providers are signed by their developers.
If you'd like to know more about provider signing, you can read about it here:
https://opentofu.org/docs/cli/plugins/signing/

OpenTofu has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that OpenTofu can guarantee to make the same selections by default when
you run "tofu init" in the future.

OpenTofu has been successfully initialized!

You may now begin working with OpenTofu. Try running "tofu plan" to see
any changes that are required for your infrastructure. All OpenTofu commands
should now work.

If you ever set or change modules or backend configuration for OpenTofu,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

Perfect ! Note that with Terraform the warnings will also be gone.

OpenTofu Plan & Apply

We can now run the plan and, if there is no error, easily apply:

$ tofu plan
[...]
Plan: 14 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + mds_instance_ip       = (known after apply)
  + ssh_private_key       = "/home/fred/.ssh/id_rsa_oci"
  + wordpress_db_password = "MyWPpassw0rd!"
  + wordpress_db_user     = "wp"
  + wordpress_public_ip   = (known after apply)
  + wordpress_schema      = "wordpress"

$ tofu apply

And it’s deployed !

Conclusion

OpenTofu works as expected, but it requires the code and modules to follow the latest Terraform specifications.

Of course, in the future, the compatibility might change, but at present, deploying on OCI using OpenTofu instead of Terraform works perfectly.

Enjoy deploying in OCI !

December 19, 2023

We recently saw that we can benefit from using JSON documents with MySQL HeatWave cluster in OCI (HeatWave accelerator for MySQL Database Service).

However, sometimes the data can't be loaded into the HeatWave Cluster, or a query cannot use it.

And it's not always easy to understand why. Let's get familiar with how to find that information.

Data not loaded in HW Cluster

Let’s see an example with this table (collection):

SQL > show create table listingsAndReviews\G
*************************** 1. row ***************************
       Table: listingsAndReviews
Create Table: CREATE TABLE `listingsAndReviews` (
  `doc` json DEFAULT NULL,
  `_id` char(28) GENERATED ALWAYS AS
 (json_unquote(json_extract(`doc`,_utf8mb4'$._id'))) STORED NOT NULL,
  `_json_schema` json GENERATED ALWAYS AS 
  (_utf8mb4'{"type":"object"}') VIRTUAL,
  PRIMARY KEY (`_id`),
  CONSTRAINT `$val_strict_B70EB65BDDBAB20B0EE5BB7532C9060C422A06F8` 
  CHECK (json_schema_valid(`_json_schema`,`doc`)) /*!80016 NOT ENFORCED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
 SECONDARY_ENGINE=rapid

Now when we try to load the data into the HeatWave Cluster we get the following output:

SQL > alter table listingsAndReviews secondary_load;
Query OK, 0 rows affected, 3 warnings (19.1727 sec)
Warning (code 3877): Guided Load executed 
'ALTER TABLE `docstore`.`listingsAndReviews` MODIFY `_json_schema` json 
GENERATED ALWAYS AS (_utf8mb4'{"type":"object"}') VIRTUAL 
COMMENT '' NOT SECONDARY'.
Warning (code 3877): Guided Load executed 'ALTER TABLE
`docstore`.`listingsAndReviews` MODIFY `doc` json 
DEFAULT NULL COMMENT '' NOT SECONDARY'.
Warning (code 3877): Guided Load requested 11 InnoDB parallel read 
threads to load docstore.listingsAndReviews (session variable value: 32).

We can verify which columns are loaded in HeatWave cluster from that table:

SQL> select name, column_name, data_placement_type, nrows 
   from rpd_column_id ci 
   join rpd_tables t on t.id=ci.table_id 
     join rpd_table_id ri on ri.id=t.id 
     where table_name='listingsAndReviews';
+-----------------------------+-------------+---------------------+-------+
| name                        | column_name | data_placement_type | nrows |
+-----------------------------+-------------+---------------------+-------+
| docstore.listingsAndReviews | _id         | PrimaryKey          |  5555 |
+-----------------------------+-------------+---------------------+-------+
1 row in set (0.0010 sec)

We can see that the column doc, the JSON one, is not loaded. But why ?

If we use the HeatWave Cluster node estimation tool in the OCI Console, we also get a warning notifying us that only one column will be loaded:

But once again, we don't know the reason, and as we are using MySQL 8.2.0-u1, JSON is supported, so the column doc should also be loaded into the secondary engine.

When we try with the auto pilot load (call sys.heatwave_load(JSON_ARRAY('docstore'), NULL)) we don't get many more details.

But in the error_log (available in performance_schema) we have the info we are looking for:

{"warn": "Column size is more than 65532 bytes", "table_name": "listingsAndReviews", "column_name": "doc", "schema_name": "docstore"}

We can see why the JSON column was not loaded: it exceeds the maximum allowed column size of 65532 bytes.

The information is also available in sys.heatwave_autopilot_report after using the auto pilot load procedure (call sys.heatwave_load()).
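
For example, after calling the load procedure, a query like this should list the warnings (a sketch, assuming the report table exposes the type and log columns as in recent HeatWave releases):

SQL > SELECT log FROM sys.heatwave_autopilot_report WHERE type = 'warn';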

Query not off-loaded to secondary engine

Now let's see if a query is off-loaded to the HeatWave cluster (secondary engine). First, we check how many queries have been offloaded:

SQL >  show status like 'rapid%query%';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| rapid_query_offload_count | 59    |
+---------------------------+-------+

And then we run a query:

SQL > with cte1 as (
       select doc->>"$.name" as name,  doc->>"$.cuisine" as cuisine, 
       (
        select avg(score) 
         from json_table (
            doc, "$.grades[*]" columns (score int path "$.score")) as r
         ) as  avg_score 
        from restaurants) select *, rank() 
        over (  
          partition by cuisine order by avg_score desc) as `rank`  
     from cte1 order by `rank`, avg_score desc limit 5;
+-----------------------+--------------------------------+-----------+------+
| name                  | cuisine                        | avg_score | rank |
+-----------------------+--------------------------------+-----------+------+
| Juice It Health Bar   | Juice, Smoothies, Fruit Salads |   75.0000 |    1 |
| Golden Dragon Cuisine | Chinese                        |   73.0000 |    1 |
| Palombo Pastry Shop   | Bakery                         |   69.0000 |    1 |
| Go Go Curry           | Japanese                       |   65.0000 |    1 |
| Koyla                 | Middle Eastern                 |   61.0000 |    1 |
+-----------------------+--------------------------------+-----------+------+
5 rows in set (31.4319 sec)

31.4 seconds: this doesn't seem to be using the HeatWave cluster.

Let's check the number of offloaded queries again:

SQL > show status like 'rapid%query%';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| rapid_query_offload_count | 59    |
+---------------------------+-------+

Same amount…

We can verify the Query Execution Plan using EXPLAIN:

EXPLAIN: -> Limit: 5 row(s)
    -> Sort row IDs: rank, cte1.avg_score DESC
        -> Table scan on <temporary>  (cost=2.5..2.5 rows=0)
            -> Temporary table  (cost=0..0 rows=0)
                -> Window aggregate: rank() OVER (PARTITION BY cte1.cuisine ORDER BY cte1.avg_score desc ) 
                    -> Sort row IDs: cte1.cuisine, cte1.avg_score DESC  (cost=5.08e+6..5.08e+6 rows=2.1e+6)
                        -> Table scan on cte1  (cost=438291..464507 rows=2.1e+6)
                            -> Materialize CTE cte1  (cost=438291..438291 rows=2.1e+6)
                                -> Table scan on restaurants  (cost=228577 rows=2.1e+6)
                                -> Select #3 (subquery in projection; dependent)
                                    -> Aggregate: avg(r.score)  (cost=0 rows=1)
                                        -> Materialize table function

There is indeed no sign of the secondary engine being used.

Usually, when using HeatWave cluster, we can enable the optimizer trace:

SQL > set  optimizer_trace="enabled=on";
SQL > explain format=tree with cte1 as .... <THE QUERY> ... desc limit 5\G
[the output of the QEP]

And then we verify the reason in information_schema.optimizer_trace:

SQL > select query, trace->'$**.Rapid_Offload_Fails' 'Offload Failed',
             trace->'$**.secondary_engine_not_used' 'HeatWave Not Used'
      from information_schema.optimizer_trace\G
*************************** 1. row ***************************
            query: explain format=tree with cte1 as (select doc->>"$.name" as name,
  doc->>"$.cuisine" as cuisine, (select avg(score) from json_table (doc,
 "$.grades[*]" columns (score int path "$.score")) as r) as  avg_score from
 restaurants) select *, rank() over (  partition by cuisine order by avg_score
 desc) as `rank`  from cte1 order by `rank`, avg_score desc limit 5
   Offload Failed: NULL
HeatWave Not Used: NULL

Most of the time, when a query is not offloaded to the secondary engine, we have the reason… but not this time, the value is NULL !

This could be related to the JSON datatype which was very recently supported.

Let’s have a look in the error log then:

SQL> select * from (select * from performance_schema.error_log 
     where subsystem='RAPID' order by 1 desc limit 3) a order by 1\G
*************************** 1. row ***************************
    LOGGED: 2023-12-15 17:41:58.876588
 THREAD_ID: 3535
      PRIO: System
ERROR_CODE: MY-011071
 SUBSYSTEM: RAPID
      DATA: Attempting offload of query for HeatWave. qid=142, my_qid=5253596,
 DB = "docstore". Query = "explain format=tree with cte1 as 
(select doc->>"$.name" as name,  doc->>"$.cuisine" as cuisine,
(select avg(score) from json_table (doc, "$.grades[*]" columns 
(score int path "$.score")) as r) as  avg_score from restaurants) 
select *, rank() over (  partition by cuisine order by avg_score desc)
 as `rank`  from cte1 order by `rank`, avg_score desc limit 5"
*************************** 2. row ***************************
    LOGGED: 2023-12-15 17:41:58.876733
 THREAD_ID: 3535
      PRIO: System
ERROR_CODE: MY-011071
 SUBSYSTEM: RAPID
      DATA: Starting costing for HeatWave query qid=142
*************************** 3. row ***************************
    LOGGED: 2023-12-15 17:41:58.876763
 THREAD_ID: 3535
      PRIO: System
ERROR_CODE: MY-011071
 SUBSYSTEM: RAPID
      DATA: Heatwave chosen plan quality accepted: qid=142 : total SG pairs: 0, Plan Quality metric: 1.0 , Plan Quality threshold threshold: 90.0

We can see that MySQL attempted to offload the query…

So, how can we know the reason why the secondary engine is not used ?

There is an Optimizer Hint that will help us finding the reason: /*+ set_var(use_secondary_engine=forced) */.

When used, this optimizer hint will try to offload the query to the HeatWave cluster, but won't fall back to InnoDB when it fails to run on the secondary engine. Let's try:

SQL > with cte1 as (
       select /*+ set_var(use_secondary_engine=forced) */
       doc->>"$.name" as name,  doc->>"$.cuisine" as cuisine, 
       (
        select avg(score) 
         from json_table (
            doc, "$.grades[*]" columns (score int path "$.score")) as r
         ) as  avg_score 
        from restaurants) select *, rank() 
        over (  
          partition by cuisine order by avg_score desc) as `rank`  
     from cte1 order by `rank`, avg_score desc limit 5;
ERROR: 3889: Secondary engine operation failed. Reason: "SUBQUERY not yet
 supported", "Table ``.`/db/tmp/#sql36f2_dcf_135` is not loaded in HeatWave".

Here we have the reason why the query is not running in our HeatWave Cluster.

And in fact this unsupported subquery is related to the JSON_TABLE() function, which is not supported in the HeatWave cluster.

Conclusion

When encountering issues while using JSON documents with MySQL HeatWave Database Service in OCI with the HeatWave cluster enabled, it is important to know how to obtain the information needed to resolve them. We saw where to find the resources that can guide you in troubleshooting and fixing unexpected behavior.

Enjoy fast JSON in MySQL HeatWave on OCI !

December 17, 2023

Like many games of that time, Snow Bros. is a simple game that can be mastered with practice (or rather, coins). It's fun all the way, especially when playing with 2 players (in co-op). The gameplay is somewhat reminiscent of Bubble Bobble, but instead of bubbles you shoot snow to bury enemies in snowballs. Push the snowball to get rid of all the enemies in its path! The arcade game was ported to a lot of game consoles and gaming computers.

December 16, 2023

Since the release of MySQL 8.0, the MySQL X Dev API has provided users with the convenient ability to utilize MySQL without the need to write a single line of SQL!

The MySQL X Dev API brings support for CRUD operations on JSON documents stored in MySQL. MySQL Document Store is ACID compliant and is compatible with everything MySQL, like replication, InnoDB Cluster, …

The MySQL X Dev API is available using the MySQL X Protocol, listening by default on port 33060.
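
On a running server, you can quickly verify the port the X Plugin is listening on (33060 being the default):

SQL > SHOW VARIABLES LIKE 'mysqlx_port';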

If you are interested in learning more about MySQL Document Store, please refer to these presentations: [1], [2], [3].

OCI MySQL HeatWave Database Service

The MySQL DBaaS on OCI is the only one providing access to the X Protocol. This means you can create applications that don't use any SQL to add, delete, modify, and retrieve JSON documents stored in the DB System.

Let’s add some documents to our DB System. We will use the restaurants collection used in various MongoDB examples.

We first connect to our DB System (MySQL HeatWave instance in OCI) using MySQL Shell and the X Protocol (default). You can use a VPN, a Compute Instance, a bastion host…

Next, we create a new schema and we import all the JSON documents using the importJson() utility:

JS > session.createSchema('docstore')
<Schema:docstore>
JS > \u docstore
Default schema `docstore` accessible through db.
JS > util.importJson('restaurants.json', {convertBsonOid: true})
Importing from file "restaurants.json" to collection 
`docstore`.`restaurants` in MySQL Server at 10.0.1.249:33060

.. 25359.. 25359
Processed 15.60 MB in 25359 documents in 0.9856 sec (25.36K documents/s)
Total successfully imported documents 25359 (25.36K documents/s)

Done ! It’s fast and easy.

CRUD

It’s time to test the import and read the data we have imported.

Let’s start by just displaying the first document and verify it’s indeed a valid JSON document:

JS > db.restaurants.find().limit(1)
{
    "_id": "000065796b9c0000000000000001",
    "name": "Howard-Reyes",
    "grades": [
        {
            "date": {
                "$date": "2023-03-12T00:00:00.000+0000"
            },
            "grade": "B",
            "score": 5
        }
    ],
    "address": {
        "coord": [
            8.7301765,
            52.705775
        ],
        "street": "Ward Branch",
        "zipcode": "22737",
        "building": "74523"
    },
    "borough": "West Brandimouth",
    "cuisine": "Hawaiian"
}
1 document in set (0.0016 sec)

Let's display 10 restaurants located in Brooklyn and their type of cuisine:

JS > db.restaurants.find("borough = 'Brooklyn'").fields("name", "cuisine").limit(10)
{
    "name": "Wendy'S",
    "cuisine": "Hamburgers"
}
{
    "name": "Riviera Caterer",
    "cuisine": "American"
}
{
    "name": "Wilken'S Fine Food",
    "cuisine": "Delicatessen"
}
{
    "name": "Regina Caterers",
    "cuisine": "American"
}
{
    "name": "Taste The Tropics Ice Cream",
    "cuisine": "Ice Cream, Gelato, Yogurt, Ices"
}
{
    "name": "C & C Catering Service",
    "cuisine": "American"
}
{
    "name": "May May Kitchen",
    "cuisine": "Chinese"
}
{
    "name": "Seuda Foods",
    "cuisine": "Jewish/Kosher"
}
{
    "name": "Carvel Ice Cream",
    "cuisine": "Ice Cream, Gelato, Yogurt, Ices"
}
{
    "name": "Nordic Delicacies",
    "cuisine": "Delicatessen"
}
10 documents in set (0.3392 sec)

Adding Documents

We can also generate JSON documents without a single line of SQL using the X Dev API.

In this example, we use Python (mysql-connector-python3-8.2.0-1.el8.x86_64) and we add a document using the add() method of a collection (the col.add(doc).execute() call near the end of the script below):

#!/usr/bin/python3.8
import mysqlx
import sys
from random import choice, randrange
from faker import Faker


def connect():
    session = mysqlx.get_session(
        {
            "host": "db instance IP",
            "port": 33060,
            "user": "login",
            "password": "password",
            "ssl-mode": "REQUIRED",
        }
    )
    return session


def gen_cuisine():
    cuisine_list = [
        "Belgian",
        "Italian"
    ]
    return choice(cuisine_list)


session = connect()
db = session.get_schema("docstore")
col = db.get_collection("restaurants")
fake = Faker()
print("Generating new documents.", end="", flush=True)
total = 1000
if len(sys.argv) > 1:
    if sys.argv[1]:
        total = int(sys.argv[1])
for _ in range(total):
    doc = {}
    doc["name"] = fake.company()
    address = {}
    address["street"] = fake.street_name()
    address["building"] = fake.building_number()
    address["zipcode"] = fake.postcode()
    doc["borough"] = fake.city()
    doc["cuisine"] = gen_cuisine()
    coord = []
    coord.append(float(fake.latitude()))
    coord.append(float(fake.longitude()))
    address["coord"] = coord
    doc["address"] = address
    grades = []
    for _ in range(randrange(5)):
        grade = {}
        grade_date = {}
        date = fake.date_time_this_decade()
        grade_date["$date"] = date.strftime("%Y-%m-%dT00:00:00.000+0000")
        grade_note = choice(["A", "B", "C"])
        grade_score = randrange(20)
        grade["date"] = grade_date
        grade["grade"] = grade_note
        grade["score"] = grade_score
        grades.append(grade)
    doc["grades"] = grades
    col.add(doc).execute()
    if total > 100000 and not _ % 1000:
        print(".", end="", flush=True)
    else:
        print(".", end="", flush=True)
print("\nDone ! {} documents generated.".format(total))

You can find all the required files on GitHub: https://github.com/lefred/restaurants-mysql-ds

In case you need large JSON documents, you can increase the mysqlx_max_allowed_packet value in your DB System's configuration:

This will allow you to store JSON documents of up to 1GB !
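
To check the value currently in effect on the DB System (the change itself is done through the DB System's configuration, as explained above), you can run:

SQL > SHOW GLOBAL VARIABLES LIKE 'mysqlx_max_allowed_packet';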

HeatWave Accelerator

Now that we have all our documents stored in a MySQL HeatWave instance on OCI, could we also benefit from a HeatWave Cluster to boost our queries ?

Since the release of 8.2.0-u1 in OCI, MySQL HeatWave also supports the JSON datatype in the HeatWave Cluster. See Accelerating JSON Query Processing using MySQL HeatWave.

When using MySQL Document Store, to be able to load a collection into the HeatWave Cluster, we need to modify how the Primary Key is stored: by default a VARBINARY is used, which is not supported in the HeatWave secondary engine.

If you run the HeatWave cluster nodes estimation you will get the following message:

Error comment: UNABLE TO LOAD TABLE WHEN PRIMARY KEY COLUMN(S) CANNOT BE LOADED
Columns to be loaded:1 (100% VARLEN-encoded columns)
Estimated number of rows:24082

This is the statement to perform the Primary Key change:

SQL> alter table restaurants modify _id char(28) GENERATED ALWAYS AS
     (json_unquote(json_extract(`doc`,_utf8mb4'$._id'))) STORED NOT NULL;

As soon as we have enabled the HeatWave Cluster we can load our collection into it:

But before loading it, let's try a SQL query (yes, you can run SQL queries on the JSON documents inside a collection) so we can later see if there is some improvement:

SQL > select doc->>"$.borough" borough, count(*) tot,
             max(length(doc->>"$.name")) longest_name, 
             min(length(doc->>"$.name")) shortest_name,
             round(avg(length(doc->>"$.name")),2) avg_name_length 
      from restaurants 
      where doc->>"$.cuisine" collate utf8mb4_0900_ai_ci like 'belgian' 
      group by borough order by tot desc limit 10;
+---------------+-----+--------------+---------------+-----------------+
| borough       | tot | longest_name | shortest_name | avg_name_length |
+---------------+-----+--------------+---------------+-----------------+
| Lake Michael  |  17 |           28 |             9 |           18.59 |
| Port Michael  |  11 |           29 |             8 |           19.91 |
| East Michael  |  11 |           25 |             8 |           13.64 |
| South Michael |  11 |           27 |            11 |           19.64 |
| South David   |  11 |           30 |             8 |           13.82 |
| Port Matthew  |  11 |           27 |            10 |           18.45 |
| Michaelmouth  |  10 |           25 |             9 |           15.50 |
| Port Jennifer |  10 |           29 |             7 |           18.50 |
| West Michael  |  10 |           26 |            10 |           21.10 |
| Lake James    |  10 |           27 |             9 |           15.70 |
+---------------+-----+--------------+---------------+-----------------+
10 rows in set (0.5191 sec)

This took a bit more than half a second (0.5191 sec) to parse 962,520 JSON documents.

Now we load our data into our HeatWave Cluster:

SQL > call sys.heatwave_load(JSON_ARRAY('docstore'), NULL);
+------------------------------------------+
| INITIALIZING HEATWAVE AUTO PARALLEL LOAD |
+------------------------------------------+
| Version: 2.41                            |
|                                          |
| Load Mode: normal                        |
| Load Policy: disable_unsupported_columns |
| Output Mode: normal                      |
|                                          |
+------------------------------------------+
6 rows in set (19.47 sec)
...
+------------------------------------------+
| LOADING TABLE                            |
+------------------------------------------+
| TABLE (2 of 2): `docstore`.`restaurants` |
| Commands executed successfully: 2 of 2   |
| Warnings encountered: 0                  |
| Table loaded successfully!               |
|   Total columns loaded: 2                |
|   Table loaded using 32 thread(s)        |
|   Elapsed time: 19.47 s                  |
|                                          |
+------------------------------------------+
8 rows in set (19.47 sec)
+----------------------------------------------------------------+
| LOAD SUMMARY                                                   |
+----------------------------------------------------------------+
|                                                                |
| SCHEMA           TABLES       TABLES      COLUMNS         LOAD |
| NAME             LOADED       FAILED       LOADED     DURATION |
| ------           ------       ------      -------     -------- |
| `docstore`            1            0            2      19.47 s |
|                                                                |
+----------------------------------------------------------------+
6 rows in set (19.47 sec)

Let’s now run the query again:

SQL > select doc->>"$.borough" borough, count(*) tot,
             max(length(doc->>"$.name")) longest_name, 
             min(length(doc->>"$.name")) shortest_name,
             round(avg(length(doc->>"$.name")),2) avg_name_length 
      from restaurants 
      where doc->>"$.cuisine" collate utf8mb4_0900_ai_ci like 'belgian' 
      group by borough order by tot desc limit 10;
+---------------+-----+--------------+---------------+-----------------+
| borough       | tot | longest_name | shortest_name | avg_name_length |
+---------------+-----+--------------+---------------+-----------------+
| Lake Michael  |  17 |           28 |             9 |           18.59 |
| South David   |  11 |           30 |             8 |           13.82 |
| Port Michael  |  11 |           29 |             8 |           19.91 |
| South Michael |  11 |           27 |            11 |           19.64 |
| Port Matthew  |  11 |           27 |            10 |           18.45 |
| East Michael  |  11 |           25 |             8 |           13.64 |
| Port Jennifer |  10 |           29 |             7 |           18.50 |
| West Michael  |  10 |           26 |            10 |           21.10 |
| Lake James    |  10 |           27 |             9 |           15.70 |
| Michaelmouth  |  10 |           25 |             9 |           15.50 |
+---------------+-----+--------------+---------------+-----------------+
10 rows in set (0.1017 sec)

It's faster ! But it was already very fast. I even had to reduce the value of secondary_engine_cost_threshold because the query cost was just a little lower than the threshold and MySQL HeatWave was not offloading the query to the accelerator.

But with even more data you could really benefit from the acceleration.

For example, with 2,162,520 documents we go from 1.1713 seconds to 0.1922 seconds when using the HeatWave Cluster.
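
As a side note, the secondary_engine_cost_threshold variable mentioned above can be inspected and, if needed, lowered for the current session (a sketch, the value 1000 is only an illustration):

SQL > SHOW VARIABLES LIKE 'secondary_engine_cost_threshold';
SQL > SET SESSION secondary_engine_cost_threshold = 1000;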

Limitations

All is well in the best of worlds! Almost! When it comes to using MySQL Document Store with MySQL HeatWave Database Service in OCI, there are no limitations. However, if you intend to enable the HeatWave Cluster (Accelerator), certain restrictions should be taken into consideration.

A JSON document cannot be larger than 65532 bytes. As the default character set is utf8mb4, where a character can take up to 4 bytes, this corresponds to a maximum of roughly 16K characters per document.
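
A quick way to spot documents that would exceed this limit before loading the collection is to check their storage size, for example (a sketch using the JSON_STORAGE_SIZE() function):

SQL > SELECT COUNT(*) FROM restaurants WHERE JSON_STORAGE_SIZE(doc) > 65532;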

Currently, the JSON_TABLE() function is not supported.

Conclusion

MySQL Document Store is highly valued by developers and is available in MySQL Community Edition; when opting for a hosted and managed MySQL service in the cloud, it is only available in MySQL HeatWave.

With the latest update, it's also possible to benefit from the HeatWave Cluster Accelerator when using MySQL Document Store on OCI, which makes it a very good choice for those having to deal with a large number of JSON documents, as in IoT systems for example.

Enjoy using MySQL without a single line of SQL… even in the cloud !

Every time Jonathan Holslag once again gets to invoice the VRT because they trot him out on De Afspraak or Terzake, I wonder: with what expertise does this man actually speak?

Almost every statement he makes is ideologically pro NATO, pro US, anti China and (of course) anti Russia. The complete Atlanticist.

I can entirely follow it when we convict Frank Creyelman for aiding and abetting Chinese intelligence services.

But where the idea suddenly comes from that we should also link that fact to what Yves Leterme does escapes me completely.

If we are to accuse Yves of anything, it is that he did not respect the separation of powers by influencing the Fortis judge. That had the appropriate consequences.

But espionage for China?

I actually find it unworthy of the title of professor at the VUB to make such false accusations, just to be able to invoice the VRT for some money.

Dear Mr. Holslag, do you actually have anything to say that is based on facts? Do you have evidence?

I expect that from someone who is allowed to call himself a professor. And I also expect a resignation when someone with such a title does not live up to that expectation.

In other words, dear professor Holslag: prove that Leterme is a Chinese spy. Or resign. False accusations are, by the way, punishable as libel and slander.

Do we really want professors who commit criminal offences?

December 14, 2023

Time can be cruel to games. Some games don't age well. The advance in computational and graphical power is painfully clear. But time can also be a filter that separates great games from mediocre ones. If a game survives for decades, there is a big chance it's a great game. Games are released to run on very specific hardware and operating systems. Older games often do not run out of the box on modern computers.

December 13, 2023

Some time ago, we saw how we could deploy WordPress on OCI using MySQL HeatWave Database Service with Read Replicas. We had to modify WordPress to use a specific plugin that configures the Read/Write Splitting on the application (WordPress): LudicrousDB.

Today, we will not modify WordPress to split the Read and Write operations, but we will use MySQL Router 8.2.0 (see [1], [2], [3]).

Architecture

The architecture is simple: we will use one application server (Oracle Linux 8 on an Ampere Compute Instance) with Apache, PHP 8.2 and MySQL Router. The latest WordPress will be installed on it:

You can see that we will use different versions of MySQL in MySQL HeatWave Database Service. We can use the latest 8.0 and an Innovation Release as a Read Replica.

WordPress

We deploy WordPress (6.4.2) on an Always Free Ampere Compute instance using 4 cores and 24GB of RAM:

As a reminder, this is how we install Apache, PHP 8.2 and WordPress on the compute instance:

ssh -i <path to the ssh key> opc@<public IP address>
sudo su -
yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
rpm --import https://repo.mysql.com/RPM-GPG-KEY-mysql-2022
yum -y install https://rpms.remirepo.net/enterprise/remi-release-8.rpm
rpm -ivh https://dev.mysql.com/get/mysql80-community-release-el8-9.noarch.rpm

yum install -y mysql-shell --enablerepo=mysql-tools-innovation-community
yum install -y mysql-router --enablerepo=mysql-tools-innovation-community

dnf -y module enable php:remi-8.2
dnf -y install httpd php php-cli php-mysqlnd php-zip php-gd \
 php-mcrypt php-mbstring php-xml php-json

cd /var/www/
wget https://wordpress.org/latest.tar.gz
tar zxvf latest.tar.gz
rm -rf html/ latest.tar.gz
mv wordpress html
chown apache. -R html

systemctl start httpd
systemctl enable httpd

firewall-cmd --zone=public --permanent --add-port=80/tcp
firewall-cmd --zone=public --permanent --add-port=443/tcp
firewall-cmd --reload

chcon --type httpd_sys_rw_content_t /var/www/html
chcon --type httpd_sys_rw_content_t /var/www/html/*
setsebool -P httpd_can_network_connect_db 1
setsebool -P httpd_can_network_connect=1

MySQL HeatWave Database Service

Now we need to deploy a MySQL HeatWave instance in OCI. To use Read Replicas, the current requirement is to use at least 4 OCPUs:

The IP of our new instance is available in the DB system’s connections tab:

Read Replicas

We can create one or more Read Replicas, and we can specify the required version; there is no need to use the same version as the Primary DB System (source):

We need to keep the following information (IP of the Primary DB System and the Load Balancer one) for later:

WordPress Database and User

It’s time to create the WordPress database and its dedicated user in MySQL. We first connect to our new DB System using MySQL Shell from the Compute Instance (the WordPress instance):

mysqlsh --sql admin@10.0.1.191

And we can create the database and the dedicated user:

create database wordpress;
create user wp identified by 'W0rdPress4#';
grant all privileges on wordpress.* to wp;

MySQL Router

Transparent Read/Write splitting in MySQL Router is compatible with all integrated solutions for replication, high availability (HA), and disaster recovery (DR). These solutions are MySQL InnoDB ReplicaSet, MySQL InnoDB Cluster, and MySQL InnoDB ClusterSet.

To enable transparent Read/Write splitting, it is necessary to manually create certain tables and populate them with the IP address of our MySQL HeatWave Instance and the IP address of the Read Replica Load Balancer.

Download the following file, unzip it and load it in the DB System (via the compute instance):

[root@wordpress opc]# wget https://lefred.be/wp-content/uploads/2023/12/router_metadata.zip

[root@wordpress opc]# unzip router_metadata.zip 
Archive:  router_metadata.zip
  inflating: router_metadata.sql
     
[root@wordpress opc]# mysqlsh --sql admin@10.0.1.191 < router_metadata.sql 

Records: 3  Duplicates: 0  Warnings: 0
Records: 3  Duplicates: 0  Warnings: 0
Records: 2  Duplicates: 0  Warnings: 0

And we change the instances data to match what we have in OCI:

mysqlsh --sql admin@10.0.1.191

use mysql_innodb_cluster_metadata

update instances set address="10.0.1.191:3306", 
       mysql_server_uuid=@@server_uuid, instance_name=@@hostname
  where instance_id=1;
update instances set addresses = '{"mysqlX": "10.0.1.191:33060", 
       "mysqlClassic": "10.0.1.191:3306"}' 
  where instance_id=1;

update instances set address="10.0.1.16:3306", 
       mysql_server_uuid="86b0b07b-98d2-11ee-9c5a-020017241124",
       instance_name="read_replica_load_balancer" 
  where instance_id=2;
update instances set addresses = '{"mysqlX": "10.0.1.16:33060", 
       "mysqlClassic": "10.0.1.16:3306"}' 
  where instance_id=2;

I used the uuid of one of the replicas for the load balancer entry.
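
The uuid of a replica can be retrieved by connecting to it (for example through the load balancer IP) and running a quick query:

SQL > SELECT @@server_uuid, @@hostname;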

Bootstrap

We can now bootstrap MySQL Router on the WordPress application server (the compute instance):

mysqlrouter --bootstrap admin@10.0.1.191 \
--conf-set-option=routing:bootstrap_rw_split.bind_port=3306 \
--user=mysqlrouter

You need to bootstrap using the IP of the Primary DB instance.

We can verify in the newly generated configuration (/etc/mysqlrouter/mysqlrouter.conf) that the default R/W splitting port 6450 has been replaced by 3306:

We can start MySQL Router:

systemctl start mysqlrouter.service

Finishing WordPress Installation

Everything is now prepared to complete the WordPress installation process.

We enter the public IP of the compute instance in a browser and we follow the WordPress installation wizard:

Testing

To test, we will add a PHP code snippet to a new WordPress post that displays some information related to the database it's connected to:

And now we load the article several times:

We can observe that the PHP snippet is sometimes executed on one Read Replica and sometimes on the other one.

We can see that we can create posts, meaning writes are indeed redirected as expected to the Primary DB System.

We can also test that by putting the query in a transaction, which will have the same effect as a write:
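
If you prefer to verify this from the command line instead of a WordPress post, the same behavior can be observed through the Router port with a session like this (a minimal sketch, the hostnames will of course differ in your setup):

SQL > SELECT @@hostname;   -- outside a transaction: served by one of the Read Replicas
SQL > START TRANSACTION;
SQL > SELECT @@hostname;   -- inside a transaction: routed to the Primary DB System
SQL > ROLLBACK;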

Perfect, it's evident that we are reaching the MySQL HeatWave instance that is the source of the Read Replicas !

Conclusion

We have observed that the MySQL HeatWave Database Service Read Replicas capability can be advantageous for WordPress without necessitating any modifications or configuration changes within WordPress itself. With the assistance of MySQL Router 8.2.0, which enables Transparent Read/Write Splitting, we can automatically direct read and write queries to the appropriate instance. This eliminates the need for plugins or altering the WordPress configuration on disk.

Enjoy deploying WordPress in OCI, MySQL HeatWave, Read Replicas and MySQL Router !

The firewood boat

Once upon a time there was a group of humans lost in the middle of a gigantic ocean. As far back as memory went, the gigantic wooden ship had always sailed on, no land ever having been found or even sighted.

The weather was never freezing, but never too hot either. A chilly little breeze often ran through the cabins. To keep warm, the first-class passengers had hired those from third class, tasking them with cutting up the wood of the hull to burn it in their enormous stoves. Since the cabins were far more comfortable when heated, the first-class passengers had the idea of selling their surplus wood to the second-class passengers.

Business was lucrative. Part of the ship's piping was converted into wood stoves, sold at a very good price to equip the second-class cabins.

Naturally, the ship was now taking on water everywhere. An eccentric with unruly hair put forward the idea of no longer cutting up the hull if they did not want to sink.

— And my profits? said the first class.
— And my heating? said the second class.
— And my job? said the third class.
— Well, I don't really know. We could put on sweaters?
— Ahaha, sneered the first class. An eccentric who knows nothing about wood burning and wants to lecture us with a sweater!

Taken aback for a moment, the eccentric persisted.
— Still, right now we are sinking. The third class will soon be under water.
— You are right, declared the captain. This is an important issue. I am immediately organizing a meeting in the first-class lounge. We will consider a solution.

When he came back down a few hours later, his belly round with petits fours, the captain was accosted by the eccentric, who had been refused access to the first-class deck.
— So, captain? What are we going to do to keep the ship from sinking?
— The meeting was very productive. From now on we will build more efficient stoves to equip the cabins that do not yet have one, which will then be able to heat themselves with less wood. We are, in fact, considering gradually reducing the speed at which we cut up the wood of the hull. That will not please the third class, who will have less work, nor the second class, because the wood will be more expensive to compensate for the lost profit, but it takes what it takes to save the ship.
— Captain, are you sure that will be enough to keep us from sinking?
— Don't worry, the captain smiled slyly. We have also asked the second class to design pumps to extract the water from the ship. Surely one of them will invent that for us, won't they?
— But time is running out…
— And besides, who knows… Maybe we will soon discover a shore, reach land. Why worry?
— Why indeed, the eccentric murmured pitifully, taking off his socks to wring them out as the water began to rise in the passageways.

An engineer and writer, I explore the impact of technology on humans, both in writing and in my talks.

Receive my writings in French and English directly by email. Your address will never be shared. You can also use my French-language RSS feed or the full RSS feed.

To support me, buy my books (if possible from your local bookshop)! I have just published a collection of short stories that should make you laugh and think. I am also part of the free and ethical « SF en VF » box set.

December 12, 2023

SF en VF, the box set for discovering European science fiction

"It's the story of a Belgian writer, a Swiss writer and a French writer who walk into a 'boîte'."

Well, put like that, it sounds like a joke. But wait, that is only the beginning! Because each of the three writers has committed a French-language science fiction book, and if you put the titles of these books together, you get, all in all, a single French word: the word "projet".

Excellent joke, isn't it?

The story in question is that of a "boîte" (in French, both a nightclub and a box), or rather a box set entitled "SF en VF", containing three French-language science fiction novels. A perfect gift idea, to give or to receive, for anyone who wants to discover European science fiction or introduce it to someone else.

Three books, three visions of science fiction.

First up is Ploum, the Belgian who thinks you need an English-sounding title to write SF but cannot help frenchifying it: "Printeurs". A dystopian, cyberpunk novel about the hidden side of surveillance capitalism: invasive advertising, privacy reduced to nothing, sponsored terror attacks, the unemployed hypnotized by TV stars while, behind the scenes, slaves work themselves to death producing disposable consumer goods.

Next comes the Swiss writer, Pascal Lovis, and his "Projet Idaho". The only title in French. The story of a man on holiday who wakes up after a monumental bender and finds he no longer has a network connection. No more contacts. No more friends. And, literally, no more hotel room. But very quickly the story takes an unexpected turn and drags you into a frantic space opera. The second volume, "Mémoires Spectrales", is available and closes the cycle. It also explains the reason behind the name "Idaho" (because I, for one, had not caught the reference to Dune).

To finish in style, the inescapable Thierry Crouzet and the first volume of his insane project: "One Minute". Because it sounds punchier in English. No hero. No traditional narrative arc. Here, the reader is invited to live, and relive, the same minute in the planet's history: the one in which humanity suddenly understood, from Paris to Bangkok and from New York to Ouagadougou, that it was no longer alone. That it was in contact with an extraterrestrial intelligence.

Three books. Three countries. Three accents. Three stories. Three futures.

In short, a triple gift in a superb silver box set, to order from your bookseller or, if that is not possible, directly from the publisher's website. But support your favorite bookshop, it really needs it!

The "SF en VF" box set and the three books it contains

Note that Pascal Lovis, the great purple mage, is also part of the regional "Fantasy Suisse" box set, alongside Aequilegia Nox and Stéphane Paccaud. Magic and fantasy with a lovely whiff of raclette.


December 11, 2023

Firewatch is a single-player adventure that I enjoyed from the beginning to (almost) the end. Certainly recommended. My only criticism is the rather abrupt ending. The game is short (around 5 hours), but the story is by no means rushed. Your life, as the protagonist, is in shambles. What better way to get your thoughts straight than wandering all day in nature while spotting wildfires? As a firewatcher in a very remote area, your only lifeline to the outside world is a warm and friendly voice through the walkie-talkie.

December 10, 2023

A Short Hike is a small indie game. And by small, well, I mean small: you'll finish the story in around an hour and a half. You can spend a few additional hours to 100% the game, but there isn't much incentive to do so once you've finished it. That said, it is a great example of a cozy game. The artwork is beautiful, the mechanics are good and the quests are easy.

December 09, 2023

Is a game released in 2017 still retrogaming? It is if you played it on a Wii U, Nintendo's abandoned previous-generation console. Yes, it was also released on the Nintendo Switch, as exactly the same game with a very small bump in resolution (from 720p to 900p). It was originally planned as a Wii U exclusive, but it took Nintendo a few extra years to get it right. And they did.

December 02, 2023

Misfortune Ramirez Hernandez, a cute 8-year-old girl, loves to tell the world she is "a little Miss". Little Misfortune is more than a play on words about the child's story. It's a lovely, funny and sad little adventure game, all at the same time. It's about a child's sense of wonder and about the loss of innocence. Although short (I finished it in around 3 hours), it's rewarding. No need to make it longer just for the sake of it.
A fun, fairly realistic tennis game. Best played in Quick Tournament mode (the history mode is silly). Not easy, but certainly worth it. Handhelds like the rp3+ can upscale it to about 3x the original resolution. My old 2016 Celeron NUC with Batocera can upscale it to 1080p without any issues. Have a look at "How to play retrogames?" if you don't know how.
This is a beautiful game with interesting puzzles and… a story. Another perfect example of SNES-level graphics on a small handheld. The game takes a little investment at the beginning because it starts slowly, especially if you're not familiar with the action-adventure genre. The quests will become clearer and the pace will pick up. You will be stuck more than once (actually, lots of times). You'll want to finish it in one go, but this open-world game will keep you occupied for way, way longer than you expected.
This game is probably my favorite in the Metroid series. It's a perfect example of SNES-level graphics on a small handheld. This is the most story-driven installment of all the 2D Metroid games. Be ready to be lost for hours and to experience a huge adrenaline boost once you finally figure out what you need to do or beat that impossible boss. Oh, and in this Metroid there are situations where there is no way you can win, and you have to run.
The 2D Metroid series offers a perfect mix of platformer action, adventure and intelligent puzzles. While preceded by Metroid (1986) on the NES and Metroid II: Return of Samus (1991) on the Game Boy, this is the game that gave the series its distinct look and identity. And while Metroidvania is a thing and has inspired many great games for decades, it is Metroid, not Castlevania, that immerses the player in an eerie, dreadful, immense and, most of all, lonely universe.