Planet Grep

Planet'ing Belgian FLOSS people

Planet Grep is maintained by Wouter Verhelst. All times are in UTC.

March 21, 2024

Mario & Luigi: Superstar Saga is a surprisingly fun game. At its core it’s a role-playing game, but you wouldn’t know it, given the amount of platforming, the quantity and quality of small puzzles and, most of all, the funny self-deprecating dialogue. As the player you control Mario and Luigi simultaneously, and that takes some time to master. As you learn techniques along the way, each character gets special abilities allowing you to solve new puzzles and advance on your journey.

You may have noticed that MySQL now supports creating functions (and stored procedures) in JavaScript using GraalVM.

This new functionality is only available in MySQL Enterprise and MySQL HeatWave.

As a developer, you can also get free access to MySQL Enterprise from Oracle Technology Network (OTN): MySQL Enterprise Download.

I also recommend you look at Øystein’s presentation during the Belgian MySQL Days: [1], [2].

Why use a JS function?

As you may know, UUIDs are increasingly popular, but MySQL's native support is limited to UUIDv1: the UUID() function only generates version 1 values.

I already covered this in the blog post titled MySQL & UUIDs and in the following presentation.

So we know that it’s possible to extract the timestamp from a UUIDv1. I’ve created a MySQL Component (C++) that provides 3 UDFs (User Defined Functions) for that purpose. The component is available on GitHub: mysql-component-uuid_v1.

When using MySQL DBaaS, it’s not possible to install such components, and the same applies to MySQL HeatWave. This is where the ability to write JavaScript functions becomes particularly useful.

Reminder: if you use the native UUIDs in MySQL as a Primary Key, you need to specify them like this:

uuid BINARY(16) DEFAULT (UUID_TO_BIN(UUID(), 1)) PRIMARY KEY

Functions to handle UUIDv1

We will then create functions to parse the MySQL UUID and display the timestamp.

Let’s start with extracting the timestamp as Unix time (in seconds, with millisecond precision):

USE test;
DROP function IF EXISTS js_uuid_to_unixtime;
CREATE FUNCTION js_uuid_to_unixtime (uuid_in CHAR(36))
RETURNS CHAR(23) LANGUAGE JAVASCRIPT AS $$
    
const UUID_T_LENGTH = 16;
const UNIX_TS_LENGTH = 6;

function uuidToUnixTs(uuid_str) {
    const MS_FROM_100NS_FACTOR = 10000;
    const OFFSET_FROM_15_10_1582_TO_EPOCH = 122192928000000000n;

    // Split UUID with '-' as delimiter
    let uuid_parts = uuid_str.split('-');

    // Construct UUID timestamp from its parts
    let uuid_timestamp =
        uuid_parts[2].substring(1) +
        uuid_parts[1] +
        uuid_parts[0];

    // Parse hexadecimal timestamp string to integer
    let timestamp = BigInt('0x' + uuid_timestamp);

    // Calculate Unix timestamp in milliseconds
    let unixTimestampMs = Number((timestamp - OFFSET_FROM_15_10_1582_TO_EPOCH) / BigInt(MS_FROM_100NS_FACTOR));

    return unixTimestampMs;
}

// Basic validation: returns 0 if the string looks like a UUIDv1, 1 otherwise.
function stringToUuid(str) {
    // A canonical UUID string is 36 characters long.
    if (str.length !== 36) {
        return 1;
    }
    // The version nibble must be '1' for a UUIDv1.
    if (str[14] !== '1') {
        return 1;
    }
    return 0;
}

let result = stringToUuid(uuid_in);
let timestamp_out;
if (result === 0) {
    timestamp_out = uuidToUnixTs(uuid_in)/1000;
} else {
    timestamp_out="Error parsing UUID";
}

return (timestamp_out);

$$    

; 

Let’s test it:

MySQL > select now(), from_unixtime(js_uuid_to_unixtime(uuid()));
+---------------------+--------------------------------------------+
| now()               | from_unixtime(js_uuid_to_unixtime(uuid())) |
+---------------------+--------------------------------------------+
| 2024-03-20 22:24:13 | 2024-03-20 22:24:13.503000                 |
+---------------------+--------------------------------------------+
1 row in set (0.0009 sec)

I’ve created other JavaScript functions that offer different output formats:

  • js_uuid_to_datetime()
  • js_uuid_to_datetime_long()
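
To give an idea of what these look like, here is a minimal sketch of a possible js_uuid_to_datetime() implementation. It reuses the same UUIDv1 parsing logic as js_uuid_to_unixtime() above and formats the result with the JavaScript Date object; it is an illustrative version, not necessarily the exact code from the repository:

USE test;
DROP FUNCTION IF EXISTS js_uuid_to_datetime;
CREATE FUNCTION js_uuid_to_datetime (uuid_in CHAR(36))
RETURNS CHAR(23) LANGUAGE JAVASCRIPT AS $$

// Offset, in 100ns units, between the UUIDv1 epoch (1582-10-15) and the Unix epoch.
const OFFSET_FROM_15_10_1582_TO_EPOCH = 122192928000000000n;

// Same basic validation as above: 36 characters and a version nibble of '1'.
if (uuid_in.length !== 36 || uuid_in[14] !== '1') {
    return "Error parsing UUID";
}

// Rebuild the 60-bit timestamp: time_hi (without the version nibble) + time_mid + time_low.
let parts = uuid_in.split('-');
let ts_100ns = BigInt('0x' + parts[2].substring(1) + parts[1] + parts[0]);

// Convert to Unix time in milliseconds and format as 'YYYY-MM-DD hh:mm:ss.mmm' (UTC).
let unix_ms = Number((ts_100ns - OFFSET_FROM_15_10_1582_TO_EPOCH) / 10000n);
return new Date(unix_ms).toISOString().replace('T', ' ').replace('Z', '');

$$;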

Let’s see them in action:

MySQL > select js_uuid_to_datetime(uuid());
+-----------------------------+
| js_uuid_to_datetime(uuid()) |
+-----------------------------+
| 2024-03-20 22:31:20.824     |
+-----------------------------+
1 row in set (0.0009 sec)

MySQL > select js_uuid_to_datetime_long(uuid());
+----------------------------------------------+
| js_uuid_to_datetime_long(uuid())             |
+----------------------------------------------+
| Wednesday, March 20, 2024 at 10:31:30 PM GMT |
+----------------------------------------------+
1 row in set (0.0012 sec)

UUIDv7

More and more people are using UUIDv7. Unfortunately, these are not available natively in MySQL. UUIDv7 values are time-ordered (sequential) and, unlike UUIDv1, they do not embed the MAC address.

We can also create JavaScript functions to generate and handle UUIDv7 in MySQL HeatWave.

The functions are available here.
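
To give an idea of how such a generator can be written, here is a minimal sketch of a js_uuidv7() function following the UUIDv7 layout: a 48-bit millisecond timestamp, the version and variant bits, and random bits for the rest. This is an illustrative version that may differ from the functions linked above; in particular, Math.random() is not a cryptographically strong source of randomness:

USE test;
DROP FUNCTION IF EXISTS js_uuidv7;
CREATE FUNCTION js_uuidv7 ()
RETURNS CHAR(36) LANGUAGE JAVASCRIPT AS $$

// 48-bit Unix timestamp in milliseconds, as 12 hexadecimal characters.
let ts = Date.now().toString(16).padStart(12, '0');

// Helper returning n random hexadecimal characters.
function randomHex(n) {
    let out = '';
    for (let i = 0; i < n; i++) {
        out += Math.floor(Math.random() * 16).toString(16);
    }
    return out;
}

// Variant nibble: binary '10xx', i.e. one of 8, 9, a or b.
let variant = (8 + Math.floor(Math.random() * 4)).toString(16);

// Layout: time_high - time_low - version nibble '7' + 12 random bits -
// variant nibble + 12 random bits - 48 random bits.
return ts.substring(0, 8) + '-' + ts.substring(8) + '-' +
       '7' + randomHex(3) + '-' + variant + randomHex(3) + '-' + randomHex(12);

$$;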

Let’s test UUIDv7 as the Primary Key for a table:

MySQL > CREATE TABLE `item` (
  `id` varbinary(16) NOT NULL,
  `name` varchar(20) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

MySQL > insert into item values (uuid_to_bin(js_uuidv7()),'item01'),
                                (uuid_to_bin(js_uuidv7()),'item02'), 
                                (uuid_to_bin(js_uuidv7()),'item03');
Query OK, 3 rows affected (0.0060 sec)

MySQL > select * from item;
+------------------------------------+--------+
| id                                 | name   |
+------------------------------------+--------+
| 0x018E5E04BA3376AED34A5B5EB51720A5 | item01 |
| 0x018E5E04BA34732799E01BED0C1A06F9 | item03 |
| 0x018E5E04BA34780F6A683BA37F00CA27 | item02 |
+------------------------------------+--------+
3 rows in set (0.0007 sec)

MySQL > select bin_to_uuid(id) uuid, name from item;
+--------------------------------------+--------+
| uuid                                 | name   |
+--------------------------------------+--------+
| 018e5e04-ba33-76ae-d34a-5b5eb51720a5 | item01 |
| 018e5e04-ba34-7327-99e0-1bed0c1a06f9 | item03 |
| 018e5e04-ba34-780f-6a68-3ba37f00ca27 | item02 |
+--------------------------------------+--------+
3 rows in set (0.0008 sec)

We can also retrieve the creation time of each record:

MySQL > insert into item values (uuid_to_bin(js_uuidv7()),'item04');
Query OK, 1 row affected (0.0035 sec)

MySQL > select js_uuidv7_to_datetime(bin_to_uuid(id)) insert_date, 
        name from item;
+-------------------------+--------+
| insert_date             | name   |
+-------------------------+--------+
| 2024-03-20 22:39:11.923 | item01 |
| 2024-03-20 22:39:11.924 | item03 |
| 2024-03-20 22:39:11.924 | item02 |
| 2024-03-20 22:41:54.567 | item04 |
+-------------------------+--------+
4 rows in set (0.0041 sec)

For performance, it’s better to store the UUIDs as BINARY(16) using uuid_to_bin(), especially if we plan to have secondary indexes too.

But maybe you prefer to see the UUID’s text representation directly when querying the table.

We can then modify our table like this:

MySQL > alter table item add column uuid char(36) 
        generated always as (bin_to_uuid(id)) virtual after id,
        alter column id set invisible;

MySQL > insert into item (id, name) 
        values (uuid_to_bin(js_uuidv7()),'item05');

MySQL > select * from item;
+--------------------------------------+--------+
| uuid                                 | name   |
+--------------------------------------+--------+
| 018e5e04-ba33-76ae-d34a-5b5eb51720a5 | item01 |
| 018e5e04-ba34-7327-99e0-1bed0c1a06f9 | item03 |
| 018e5e04-ba34-780f-6a68-3ba37f00ca27 | item02 |
| 018e5e07-3587-73af-6535-f805bee0cc04 | item04 |
| 018e5e15-12bf-7590-a010-7a5b5457a6c3 | item05 |
+--------------------------------------+--------+
5 rows in set (0.0006 sec)

MySQL > select *, js_uuidv7_to_datetime(uuid) inserted_at from item;
+--------------------------------------+--------+-------------------------+
| uuid                                 | name   | inserted_at             |
+--------------------------------------+--------+-------------------------+
| 018e5e04-ba33-76ae-d34a-5b5eb51720a5 | item01 | 2024-03-20 22:39:11.923 |
| 018e5e04-ba34-7327-99e0-1bed0c1a06f9 | item03 | 2024-03-20 22:39:11.924 |
| 018e5e04-ba34-780f-6a68-3ba37f00ca27 | item02 | 2024-03-20 22:39:11.924 |
| 018e5e07-3587-73af-6535-f805bee0cc04 | item04 | 2024-03-20 22:41:54.567 |
| 018e5e15-12bf-7590-a010-7a5b5457a6c3 | item05 | 2024-03-20 22:57:03.167 |
+--------------------------------------+--------+-------------------------+
5 rows in set (0.0014 sec)

Conclusion

Having the possibility to write JavaScript programs directly in MySQL is a nice addition, especially when running MySQL HeatWave in the cloud, where extra components or plugins cannot be installed.

If you already have such User-Defined Functions (UDFs), you can now rewrite them in JavaScript and use them wherever MySQL supports GraalVM: MySQL Enterprise and MySQL HeatWave.

The examples are available on GitHub: mysql-graalvm-examples.

Enjoy writing JavaScript functions in MySQL!

March 19, 2024

In MySQL 8.0.12, we introduced a new algorithm for DDLs that won’t block the table when changing its definition. The first instant operation was adding a column at the end of a table; this was a contribution from Tencent Games.

Then in MySQL 8.0.29 we added the possibility to add (or remove) a column anywhere in the table.

For more information, please check these articles from Mayank Prasad: [1], [2].

In this article, I want to focus on some dangers that can arise when using this feature blindly.

Default Algorithm

Since MySQL 8.0.12, for any supported DDL, the default algorithm is INSTANT. This means that the ALTER statement will only modify the table’s metadata in the data dictionary. No exclusive metadata locks are taken on the table during the preparation and execution phases of the operation, and table data is unaffected, making the operations instantaneous.

The other two algorithms are COPY and INPLACE, see the manual for the online DDL operations.

However, there is a limitation for INSTANT DDLs even when the operation is supported: a table supports 64 instant changes. After reaching that counter, the table needs to be “rebuilt”.

If the algorithm is not specified during the ALTER statement (DDL operation), the appropriate algorithm will be chosen silently. Of course, this can lead to a nightmare situation in production if not expected.

Always specify the ALGORITHM

So the first recommendation is to always specify the algorithm when performing DDLs, even if it’s the default one. When the algorithm is specified, if MySQL is not able to use it, it will throw an error instead of silently executing the operation with another algorithm:

SQL > ALTER TABLE t1 DROP col1, ALGORITHM=INSTANT;
ERROR: 4092 (HY000): Maximum row versions reached for table test/t1.
No more columns can be added or dropped instantly. Please use COPY/INPLACE.
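
For comparison, when a table is still below the limit, the same explicit request succeeds and only the table's metadata is touched. A minimal sketch, using a hypothetical table t2:

SQL > ALTER TABLE t2 ADD COLUMN c2 INT, ALGORITHM=INSTANT;
SQL > ALTER TABLE t2 DROP COLUMN c2, ALGORITHM=INSTANT;

Each of these operations also increments the table's instant change counter, which brings us to the next recommendation.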

Monitor the instant changes

The second recommendation is to monitor the number of instant changes performed on the tables.

MySQL keeps the row versions in Information_Schema:

SQL > SELECT NAME, TOTAL_ROW_VERSIONS 
      FROM INFORMATION_SCHEMA.INNODB_TABLES WHERE NAME LIKE 'test/t1';
+---------+--------------------+
| NAME    | TOTAL_ROW_VERSIONS |
+---------+--------------------+
| test/t1 |                 63 |
+---------+--------------------+

In the example above, the DBA can perform one more INSTANT DDL operation, but after that one, MySQL won’t be able to perform another one without a rebuild.

As a DBA, it’s good practice to monitor all the tables and decide when a table needs to be rebuilt (to reset that counter).

This is an example of a recommended query to add to your monitoring tool:

SQL > SELECT NAME, TOTAL_ROW_VERSIONS, 64-TOTAL_ROW_VERSIONS AS 
             "REMAINING_INSTANT_DDLs", 
             ROUND(TOTAL_ROW_VERSIONS/64 * 100,2) AS "DDLs %"
      FROM INFORMATION_SCHEMA.INNODB_TABLES 
      WHERE TOTAL_ROW_VERSIONS > 0 ORDER BY 2 DESC;
+--------------------------+--------------------+------------------------+--------+
| NAME                     | TOTAL_ROW_VERSIONS | REMAINING_INSTANT_DDLs | DDLs % |
+--------------------------+--------------------+------------------------+--------+
| test/t1                  |                 63 |                      1 |  98.44 |
| test/t                   |                  4 |                     60 |   6.25 |
| test2/t1                 |                  3 |                     61 |   4.69 |
| sbtest/sbtest1           |                  2 |                     62 |   3.13 |
| test/deprecation_warning |                  1 |                     63 |   1.56 |
+--------------------------+--------------------+------------------------+--------+

To reset the counter and rebuild the table, you can use OPTIMIZE TABLE <table> or ALTER TABLE <table> ENGINE=InnoDB.
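
For example, a minimal sequence to rebuild test/t1 and then verify the counter could look like this (on InnoDB, OPTIMIZE TABLE is internally mapped to a table recreate):

SQL > OPTIMIZE TABLE test.t1;

SQL > SELECT NAME, TOTAL_ROW_VERSIONS
      FROM INFORMATION_SCHEMA.INNODB_TABLES WHERE NAME = 'test/t1';

After the rebuild, TOTAL_ROW_VERSIONS for the table should be back to 0.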

Conclusion

In conclusion, MySQL 8.0’s introduction of the INSTANT algorithm for DDL operations has revolutionized schema changes by avoiding blocking operations. However, since a table only supports 64 instant changes before a rebuild is required, it’s crucial to specify the algorithm explicitly during ALTER statements to avoid unexpected behavior. Monitoring the number of instant changes through INFORMATION_SCHEMA is also recommended, so you don’t reach the limit unaware and can plan table rebuilds carefully.

Enjoy MySQL!

March 15, 2024

This is completely impossible in Belgium, because it requires a serious, non-populist debate between all political parties (in which they bring their calculators and logical reasoning, and make sure the budget really stays in balance). That hasn't existed at the Federal level for decades. So yeah.

But one way to solve the country's budget problem could be to tax consumption more by raising the VAT rate from 21% to, say, 25% or 30%, to tax income (much) less, and finally to move more products and services to the 6% and 12% rates.

In other words, all essential expenses such as electricity, Internet, (gas) heating, water, bread, and so on at 6%. A (much) larger basket of (for example, healthy) foods and housing (including costs such as renovations) at 12%.

But for all luxury consumption, 25% or 30%, or possibly even 40%.

Coupled with that, a substantial reduction in personal income tax.

The result: more VAT revenue from consumption. A better competitive position compared to our neighboring countries, because salaries don't need to rise (thanks to the reduction in personal income tax). Purchasing power is strengthened as a result, and that strengthening is reinforced by the larger basket of goods at the 6% and 12% VAT rates.

In other words, only luxury consumption is taxed (much) more. Essential consumption goes to 6% or 12% (or keeps the current 21%).

The biggest drawback is that our accountants will have more work. But I don't think they'll mind all that much (they bill their extra hours at the luxury-consumption VAT rate, by the way).

An example of an advantage is that you can steer consumption better. Do you want us Belgians to buy more electric cars? Put such a car at 6% VAT and a car with a diesel combustion engine at 40% VAT. Do you want young people to eat healthier food? Coca-Cola at 40% VAT and fruit juice at 6% VAT.

Of course, you also need to cut spending and do a serious efficiency exercise.

March 12, 2024

Nintendo killed the Switch emulator Yuzu. Sadly, the official Citra distribution disappeared as well, because the two projects shared some developers. The good news is that Citra, being open source, is still available as archive builds and new forks. EmuDeck on the Steam Deck adapted to this new reality and adopted a bring-your-own Yuzu and Citra model. The changes to these emulators came at the same time as a big EmuDeck release, complicating things a little.

Last fall, we toured the Champagne region in France, famous for its sparkling wines. We explored the ancient, underground cellars where Champagne undergoes its magical transformation from grape juice to sparkling wine. These cellars, often 30 meters deep and kilometers long, maintain a constant temperature of around 10-12°C, providing the perfect conditions for aging and storing Champagne.

A glowing light bulb hanging in an underground tunnel. 25 meters underground in a champagne tunnel, which often stretches for miles/kilometers.

After sampling various Champagnes, we returned home with eight cases to store in our home's basement. However, unlike those deep cellars, our basement is just a few meters deep, prompting a simple question that sent me down a rabbit hole: how does our basement's temperature compare?

Rather than just buying a thermometer, I decided to build my own "temperature monitoring system" using open hardware and custom-built software. After all, who needs a simple solution when you can spend evenings tinkering with hardware, sensors, wires and writing your own software? Sometimes, more is more!

The basic idea is this: track the temperature and humidity of our basement every 15 minutes and send this information to a web service. This web service analyzes the data and alerts us if our basement becomes too cold or warm.

I launched this monitoring system around Christmas last year, so it's been running for nearly three months now. You can view the live temperature and historical data trends at https://dri.es/sensors. Yes, publishing our basement's temperature online is a bit quirky, but it's all in good fun.

A webpage displaying temperature and humidity readings for a basement in Belgium. A screenshot of my basement temperature dashboard.

So far, the temperature in our basement has been ideal for storing wine. However, I expect it will change during the summer months.

In the rest of this blog post, I'll share how I built the client that collects and sends the data, as well as the web service backend that processes and visualizes that data.

Hardware used

For this project, I bought:

  1. Adafruit ESP32-S3 Feather: A microcontroller board with Wi-Fi and Bluetooth capabilities, serving as the central processing unit of my project.
  2. Adafruit SHT4x sensor: A high-accuracy temperature and humidity sensor.
  3. 3.7v 500mAh battery: A small and portable power source.
  4. STEMMA QT / Qwiic JST SH 4-pin cable: To connect the sensor to the board without soldering.

The total hardware cost was $32.35 USD. I like Adafruit a lot, but it's worth noting that their products often come at a higher cost. You can find comparable hardware for as little as $10-15 elsewhere. Adafruit's premium cost is understandable, considering how much valuable content they create for the maker community.

An ESP32-S3 development board is linked to an SHT41 temperature and humidity sensor and powered by a battery pack. For scale, a 2 Euro coin is included. The SHT41 sensor is roughly equivalent in size to the coin, and the ESP32-S3 board is about twice the coin's diameter. An ESP32-S3 development board (middle) linked to an SHT41 temperature and humidity sensor (left) and powered by a battery pack (right).

Client code for Adafruit ESP32-S3 Feather

I developed the client code for the Adafruit ESP32-S3 Feather using the Arduino IDE, a widely used platform for developing and uploading code to Arduino-compatible boards.

The code measures temperature and humidity every 15 minutes, connects to WiFi, and sends this data to https://dri.es/sensors, my web service endpoint.

One of my goals was to create a system that could operate for a long time without needing to recharge the battery. The ESP32-S3 supports a "deep sleep" mode where it powers down almost all its functions, except for the clock and memory. By placing the ESP32-S3 into deep sleep mode between measurements, I was able to significantly reduce power.

Now that you understand the high-level design goals, including deep sleep mode, I'll share the complete client code below. It includes detailed code comments, making it self-explanatory.

#include "Adafruit_SHT4x.h"
#include "Adafruit_MAX1704X.h"
#include "WiFiManager.h"
#include "ArduinoJson.h"
#include "HTTPClient.h"

// The Adafruit_SHT4x sensor is a high-precision, temperature and humidity
// sensor with I2C interface.
Adafruit_SHT4x sht4 = Adafruit_SHT4x();

// The Adafruit ESP32-S3 Feather comes with a built-in MAX17048 LiPoly / LiIon
// battery monitor. The MAX17048 provides accurate monitoring of the battery's
// voltage. Utilizing the Adafruit library, not only helps us obtain the raw
// voltage data from the battery cell, but also converts this data into a more
// intuitive battery percentage or charge level. We will pass on the battery
// percentage to the web service endpoint, which can visualize it or use it to
// send notifications when the battery needs recharging.
Adafruit_MAX17048 maxlipo;

void setup() {
   Serial.begin(115200);

   // Wait for the serial connection to establish before proceeding further.
   // This is crucial for boards with native USB interfaces. Without this loop,
   // initial output sent to the serial monitor is lost. This code is not
   // needed when running on battery.
   //delay(1000);

   // Generates a unique device ID from a segment of the MAC address.
   // Since the MAC address is permanent and unchanged after reboots,
   // this guarantees the device ID remains consistent. To achieve a
   // compact ID, only a specific portion of the MAC address is used,
   // specifically the range between 0x10000 and 0xFFFFF. This range
   // translates to a hexadecimal string of a fixed 5-character length,
   // giving us roughly 1 million unique IDs. This approach balances
   // uniqueness with compactness.
   uint64_t chipid = ESP.getEfuseMac();
   uint32_t deviceValue = ((uint32_t)(chipid >> 16) & 0x0FFFFF) | 0x10000;
   char device[6]; // 5 characters for the hex representation + the null terminator.
   sprintf(device, "%x", deviceValue); // Use '%x' for lowercase hex letters

   // Initialize the SHT4x sensor:
  if (sht4.begin()) {
    Serial.println(F("SHT4 temperature and humidity sensor initialized."));
    sht4.setPrecision(SHT4X_HIGH_PRECISION);
    sht4.setHeater(SHT4X_NO_HEATER);
  }
  else {
   Serial.println(F("Could not find SHT4 sensor."));
  }

  // Initialize the MAX17048 sensor:
  if (maxlipo.begin()) {
    Serial.println(F("MAX17048 battery monitor initialized."));
  }
  else {
    Serial.println(F("Could not find MAX17048 battery monitor!"));
  }

  // Insert a short delay to ensure the sensors are ready and their data is stable:
  delay(200);

  // Retrieve temperature and humidity data from SHT4 sensor:
  sensors_event_t humidity, temp;
  sht4.getEvent(&humidity, &temp);

  // Get the battery percentage and calibrate if it's over 100%:
  float batteryPercent = maxlipo.cellPercent();
  batteryPercent = (batteryPercent > 100) ? 100 : batteryPercent;

  WiFiManager wifiManager;

  // Uncomment the following line to erase all saved WiFi credentials.
  // This can be useful for debugging or reconfiguration purposes.
  // wifiManager.resetSettings();
 
  // This WiFI manager attempts to establish a WiFi connection using known
  // credentials, stored in RAM. If it fails, the device will switch to Access
  // Point mode, creating a network named "Temperature Monitor". In this mode,
  // connect to this network, navigate to the device's IP address (default IP
  // is 192.168.4.1) using a web browser, and a configuration portal will be
  // presented, allowing you to enter new WiFi credentials. Upon submission,
  // the device will reboot and try connecting to the specified network with
  // these new credentials.
  if (!wifiManager.autoConnect("Temperature Monitor")) {
    Serial.println(F("Failed to connect to WiFi ..."));

    // If the device fails to connect to WiFi, it will restart to try again.
    // This approach is useful for handling temporary network issues. However,
    // in scenarios where the network is persistently unavailable (e.g. router
    // down for more than an hour, consistently poor signal), the repeated
    // restarts and WiFi connection attempts can quickly drain the battery.
    ESP.restart();

    // Mandatory delay to allow the restart process to initiate properly:
    delay(1000);
  }

  // Send collected data as JSON to the specified URL:
  sendJsonData("https://dri.es/sensors", device, temp.temperature, humidity.relative_humidity, batteryPercent);


  // WiFi consumes significant power so turn it off when done:
  WiFi.disconnect(true);
 
  // Enter deep sleep for 15 minutes. The ESP32-S3's deep sleep mode minimizes
  // power consumption by powering down most components, except the RTC. This
  // mode is efficient for battery-powered projects where constant operation
  // isn't needed. When the device wakes up after the set period, it runs
  // setup() again, as the state  isn't preserved.
  Serial.println(F("Going to sleep for 15 minutes ..."));
  ESP.deepSleep(15 * 60 * 1000000); // 15 mins * 60 secs/min * 1,000,000 μs/sec.
}

bool sendJsonData(const char* url, const char* device, float temperature, float humidity, float battery) {
  StaticJsonDocument<200> doc;

  // Round floating-point values to one decimal place for efficient data
  // transmission. This approach reduces the JSON payload size, which is
  // important for IoT applications running on batteries.
  doc["device"] = device;
  doc["temperature"] = String(temperature, 1);
  doc["humidity"] = String(humidity, 1);
  doc["battery"] = String(battery, 1);

  // Serialize JSON to a string:
  String jsonData;
  serializeJson(doc, jsonData);

  // Initialize an HTTP client with the provided URL:
  HTTPClient httpClient;
  httpClient.begin(url);
  httpClient.addHeader("Content-Type", "application/json");

  // Send a HTTP POST request:
  int httpCode = httpClient.POST(jsonData);

  // Close the HTTP connection:
  httpClient.end();

  // Print debug information to the serial console:
  Serial.println("Sent '" + jsonData + "' to " + String(url) + ", return code " + httpCode);
  return (httpCode == 200);
}

void loop() {
  // The ESP32-S3 resets and runs setup() after waking up from deep sleep,
  // making this continuous loop unnecessary.
}

Further optimizing battery usage

When I launched my thermometer around Christmas 2023, the battery was at 88%. Today, it is at 52%. Some quick math suggests it's using approximately 12% of its battery per month. Given its current rate of usage, it needs recharging about every 8 months.

Connecting to the WiFi and sending data are by far the main power drains. To extend the battery life, I could send updates less frequently than every 15 minutes, only send them when there is a change in temperature (which is often unchanged or only different by 0.1°C), or send batches of data points together. Any of these methods would work for my needs, but I haven't implemented them yet.

Alternatively, I could hook the microcontroller up to a 5V power adapter, but where is the fun in that? It goes against the project's "more is more" principle.

Handling web service requests

With the client code running on the ESP32-S3 and sending sensor data to https://dri.es/sensors, the next step is to set up a web service endpoint to receive this incoming data.

As I use Drupal for my website, I implemented the web service endpoint in Drupal. Drupal uses Symfony, a popular PHP framework, for large parts of its architecture. This combination offers an easy but powerful way to implement web services, similar to what you'd find in other modern server-side web development frameworks like Laravel, Django, etc.

Here is what my Drupal routing configuration looks like:

sensors.sensor_data:
  path: '/sensors'
  methods: [POST]
  defaults:
    _controller: '\Drupal\sensors\Controller\SensorMonitorController::postSensorData'
  requirements:
    _access: 'TRUE'

The above configuration directs Drupal to send POST requests made to https://dri.es/sensors to the postSensorData() method of the SensorMonitorController class.

The implementation of this method handles request authentication, validates the JSON payload, and saves the data to a MariaDB database table. Pseudo-code:

public function postSensorData(Request $request) : JsonResponse {
  $content = $request->getContent();
  $data = json_decode($content, TRUE);

  // Validate the JSON payload:
  …

  // Authenticate the request:
  … 

  $device = DeviceFactory::getDevice($data['device']);
  if ($device) {
    $device->recordSensorEvent($data);
  }

  return new JsonResponse(['message' => 'Thank you!']);
 }

For testing your web service, you can use tools like cURL:

$ curl -X POST -H "Content-Type: application/json" -d '{"device":"0xdb123", "temperature":21.5, "humidity":42.5, "battery":90.0}' https://localhost/sensors

While cURL is great for quick tests, I use PHPUnit tests for automated testing in my CI/CD workflow. This ensures that everything keeps working, even when upgrading Drupal, Symfony, or other components of my stack.

Storing sensor data in a database

The primary purpose of $device->recordSensorEvent() in SensorMonitorController::postSensorData() is to store sensor data into a SQL database. So, let's delve into the database design.

My main design goals for the database backend were:

  1. Instead of storing every data point indefinitely, only keep the daily average, minimum, maximum, and the latest readings for each sensor type across all devices.
  2. Make it easy to add new devices and new sensors in the future. For instance, if I decide to add a CO2 sensor for our bedroom one day (a decision made in my head but not yet pitched to my better half), I want that to be easy.

To this end, I created the following MariaDB table:

CREATE TABLE sensor_data (
  date DATE,
  device VARCHAR(255),
  sensor VARCHAR(255),
  avg_value DECIMAL(5,1),
  min_value DECIMAL(5,1),
  max_value DECIMAL(5,1),
  min_timestamp DATETIME,
  max_timestamp DATETIME,
  readings SMALLINT NOT NULL,
  UNIQUE KEY unique_stat (date, device, sensor)
);

A brief explanation for each field:

  • date: The date for each sensor reading. It doesn't include a time component as we aggregate data on a daily basis.
  • device: The device ID of the device providing the sensor data, such as 'basement' or 'bedroom'.
  • sensor: The type of sensor, such as 'temperature', 'humidity' or 'co2'.
  • avg_value: The average value of the sensor readings for the day. Since individual readings are not stored, a rolling average is calculated and updated with each new reading using the formula: avg_value = avg_value + (new_value - avg_value) / new_total_readings. This method can accumulate minor rounding errors, but simulations show these are negligible for this use case.
  • min_value and max_value: The daily minimum and maximum sensor readings.
  • min_timestamp and max_timestamp: The exact moments when the minimum and maximum values for that day were recorded.
  • readings: The number of readings (or measurements) taken throughout the day, which is used for calculating the rolling average.
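
To illustrate how these fields are used together, here is a sketch of the kind of read query a dashboard or graphing script could run against this table (the device ID is a hypothetical example):

SELECT date, avg_value, min_value, min_timestamp, max_value, max_timestamp, readings
FROM sensor_data
WHERE device = '0xdb123'
  AND sensor = 'temperature'
  AND date >= CURDATE() - INTERVAL 30 DAY
ORDER BY date;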

In essence, the recordSensorEvent() method needs to determine if a record already exists for the current date. Depending on this determination, it will either insert a new record or update the existing one.

In Drupal this process is streamlined with the merge() function in Drupal's database layer. This function handles both inserting new data and updating existing data in one step.

private function updateDailySensorEvent(string $sensor, float $value): void {
  $timestamp = \Drupal::time()->getRequestTime();
  $date = date('Y-m-d', $timestamp);
  $datetime = date('Y-m-d H:i:s', $timestamp);

  $connection = Database::getConnection();

  $result = $connection->merge('sensor_data')
    ->keys([
     'device' => $this->id,
     'sensor' => $sensor,
     'date' => $date,
    ])
    ->fields([
     'avg_value' => $value,
     'min_value' => $value,
     'max_value' => $value,
     'min_timestamp' => $datetime,
     'max_timestamp' => $datetime,
     'readings' => 1,
    ])
    ->expression('avg_value', 'avg_value + ((:new_value - avg_value) / (readings + 1))', [':new_value' => $value])
    ->expression('min_value', 'LEAST(min_value, :value)', [':value' => $value])
    ->expression('max_value', 'GREATEST(max_value, :value)', [':value' => $value])
    ->expression('min_timestamp', 'IF(LEAST(min_value, :value) = :value, :timestamp, min_timestamp)', [':value' => $value, ':timestamp' => $datetime])
    ->expression('max_timestamp', 'IF(GREATEST(max_value, :value) = :value, :timestamp, max_timestamp)', [':value' => $value, ':timestamp' => $datetime])
    ->expression('readings', 'readings + 1')
    ->execute();
 }

Here is what the query does:

  • It checks if a record for the current sensor and date exists.
  • If not, it creates a new record with the sensor data, including the initial average, minimum, maximum, and latest value readings, along with the timestamp for these values.
  • If a record does exist, it updates the record with the new sensor data, adjusting the average value, and updating minimum and maximum values and their timestamps if the new reading is a new minimum or maximum.
  • The function also increments the count of readings.

For those not using Drupal, similar functionality can be achieved with MariaDB's INSERT ... ON DUPLICATE KEY UPDATE command, which allows for the same conditional insert or update logic based on whether the specified unique key already exists in the table.

Here are example queries, extracted from MariaDB's General Query Log to help you get started:

INSERT INTO sensor_data (device, sensor, date, min_value, min_timestamp, max_value, max_timestamp, readings) 
VALUES ('0xdb123', 'temperature', '2024-01-01', 21, '2024-01-01 00:00:00', 21, '2024-01-01 00:00:00', 1);

UPDATE sensor_data 
SET min_value = LEAST(min_value, 21), 
   min_timestamp = IF(LEAST(min_value, 21) = 21, '2024-01-01 00:00:00', min_timestamp), 
   max_value = GREATEST(max_value, 21), 
   max_timestamp = IF(GREATEST(max_value, 21) = 21, '2024-01-01 00:00:00', max_timestamp), 
   readings = readings + 1
WHERE device = '0xdb123' AND sensor = 'temperature' AND date = '2024-01-01';
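
Combined into a single statement, the equivalent upsert could look roughly like this. It mirrors the logged queries above, adds the rolling-average expression used in the Drupal code, and relies on the unique key on (date, device, sensor); treat it as a sketch rather than the exact query my site runs:

INSERT INTO sensor_data
  (date, device, sensor, avg_value, min_value, min_timestamp,
   max_value, max_timestamp, readings)
VALUES
  ('2024-01-01', '0xdb123', 'temperature', 21,
   21, '2024-01-01 00:00:00', 21, '2024-01-01 00:00:00', 1)
ON DUPLICATE KEY UPDATE
  avg_value     = avg_value + ((21 - avg_value) / (readings + 1)),
  min_value     = LEAST(min_value, 21),
  min_timestamp = IF(LEAST(min_value, 21) = 21, '2024-01-01 00:00:00', min_timestamp),
  max_value     = GREATEST(max_value, 21),
  max_timestamp = IF(GREATEST(max_value, 21) = 21, '2024-01-01 00:00:00', max_timestamp),
  readings      = readings + 1;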

Generating graphs

With the data securely stored in the database, the next step involved generating the graphs. To accomplish this, I wrote some custom PHP code that generates Scalable Vector Graphics (SVGs).

Given that this blog post is already quite long, I'll spare you the details. For now, those curious can use the 'View source' feature in their web browser to examine the SVGs on the thermometer page.

Conclusion

It's fun how a visit to the Champagne cellars in France sparked an unexpected project. Choosing to build a thermometer rather than buying one allowed me to dive back into an old passion for hardware and low-level software.

I also like taking control of my own data and software. It gives me a sense of control and creativity.

As Drupal's project lead, using Drupal for an Internet-of-Things (IoT) backend brought me unexpected joy. I just love the power and flexibility of open-source platforms like Drupal.

As a next step, I hope to design and 3D print a case for my thermometer, something I've never done before. And as mentioned, I'm also considering integrating additional sensors. Stay tuned for updates!

March 04, 2024

Next week, I'm traveling to Japan and Australia. I've been to both countries before and can't wait to return – they're among my favorite places in the world.

My goal is to connect with the local Drupal community in each country, discussing the future of Drupal, learning from each other, and collaborating.

I'll also be connecting with Acquia's customers and partners in both countries, sharing our vision, strategy and product roadmap. As part of that, I look forward to spending some time with the Acquia teams as well – about 20 employees in Japan and 35 in Australia.

I'll present at a Drupal event in Tokyo the evening of March 14th at Yahoo! Japan.

While in Australia, I'll be attending Drupal South, held at the Sydney Masonic Centre from March 20-22. I'm excited to deliver the opening keynote on the morning of March 20th, where I'll delve into Drupal's past, present, and future.

I look forward to being back in Australia and Japan, reconnecting with old friends and the local communities.

February 29, 2024

Two people on a platform observe a decentralized web of nodes.

Two years ago, I launched a simple Web3 website using IPFS (InterPlanetary File System) and ENS (Ethereum Name Service). Back then, Web3 tools were getting a lot of media attention and I wanted to try it out.

Since I set up my Web3 website two years ago, I basically forgot about it. I didn't update it or pay attention to it for two years. But now that we hit the two-year mark, I'm curious: is my Web3 website still online?

At that time, I also stated that Web3 was not fit for hosting modern web applications, except for a small niche: static sites requiring high resilience and infrequent content updates.

I was also curious to explore the evolution of Web3 technologies to see if they became more applicable for website hosting.

My original Web3 experiment

In my original blog post, I documented the process of setting up what could be called the "Hello World" of Web3 hosting. I stored an HTML file on IPFS, ensured its availability using "pinning services", and made it accessible using an ENS domain.

For those with a basic understanding of Web3, here is a summary of the steps I took to launch my first Web3 website two years ago:

  1. Purchased an ENS domain name: I used a crypto wallet with Ethereum to acquire dries.eth through the Ethereum Name Service, a decentralized alternative to the traditional DNS (Domain Name System).
  2. Uploaded an HTML File to IPFS: I uploaded a static HTML page to the InterPlanetary File System (IPFS), which involved running my own IPFS node and utilizing various pinning services like Infura, Fleek, and Pinata. These pinning services ensure that the content remains available online even when my own IPFS node is offline.
  3. Accessed the website: I confirmed that my website was accessible through IPFS-compatible browsers.
  4. Mapped my webpage to my domain name: As the last step, I linked my IPFS-hosted site to my ENS domain dries.eth, making the web page accessible under an easy domain name.

If the four steps above are confusing to you, I recommend reading my original post. It is over 2,000 words, complete with screenshots and detailed explanations of the steps above.

Checking the pulse of various Web3 services

As the first step in my check-up, I wanted to verify if the various services I referenced in my original blog post are still operational.

The results, displayed in the table below, are really encouraging: Ethereum, ENS, IPFS, Filecoin, Infura, Fleek, Pinata, and web3.storage are all operational.

The two main technologies – ENS and IPFS – are both actively maintained and developed. This indicates that Web3 technology has built a robust foundation.

Service, description, and whether it was still around in February 2024:

  • ENS – A blockchain-based naming protocol offering DNS for Web3, mapping domain names to Ethereum addresses. Still around in February 2024: yes.
  • IPFS – A peer-to-peer protocol for storing and sharing data in a distributed file system. Still around in February 2024: yes.
  • Filecoin – A blockchain-based storage network and cryptocurrency that incentivizes data storage and replication. Still around in February 2024: yes.
  • Infura – Provides tools and infrastructure to manage content on IPFS and other tools for developers to connect their applications to blockchain networks and deploy smart contracts. Still around in February 2024: yes.
  • Fleek – A platform for building websites using IPFS and ENS. Still around in February 2024: yes.
  • Pinata – Provides tools and infrastructure to manage content on IPFS, and more recently Farcaster applications. Still around in February 2024: yes.
  • web3.storage – Provides tools and infrastructure to manage content on IPFS with support for Filecoin. Still around in February 2024: yes.

Is my Web3 website still up?

Seeing all these Web3 services operational is encouraging, but the ultimate test is to check if my Web3 webpage, dries.eth, remained live. It's one thing for these services to work, but another for my site to function properly. Here is what I found in a detailed examination:

  1. Domain ownership verification: A quick check on etherscan.io confirmed that dries.eth is still registered to me. Relief!
  2. ENS registrar access: Using my crypto wallet, I could easily log into the ENS registrar and manage my domains. I even successfully renewed dries.eth as a test.
  3. IPFS content availability: My webpage is still available on IPFS, thanks to having pinned it two years ago. Logging into Fleek and Pinata, I found my content on their admin dashboards.
  4. Web3 and ENS gateway access: I can visit dries.eth using a Web3 browser, and also via an IPFS-compatible ENS gateway like https://dries.eth.limo/ – a privacy-centric service, new since my initial blog post.

The verdict? Not only are these Web3 services still operational, but my webpage also continues to work!

This is particularly noteworthy given that I haven't logged in to these services, performed any maintenance, or paid any hosting fees for two years (the pinning services I'm using have a free tier).

Visit my Web3 page yourself

For anyone interested in visiting my Web3 page (perhaps your first Web3 visit?), there are several methods to choose from, each with a different level of Web3-ness.

  • Use a Web3-enabled browser: Browsers such as Brave and Opera offer built-in ENS and IPFS support. They can resolve ENS addresses and interpret IPFS addresses, making it as easy to navigate IPFS content as it is to browse traditional web content over HTTP or HTTPS.
  • Install a Web3 browser extension: If your favorite browser does not support Web3 out of the box, adding a browser extension like MetaMask can help you access Web3 applications. MetaMask works with Chrome, Firefox, and Edge. It enables you to use .eth domains for doing Ethereum transactions or for accessing content on IPFS.
  • Access through an ENS gateway: For those looking for the simplest way to access Web3 content without installing anything new, using an ENS gateway, such as eth.limo, is the easiest method. This gateway maps ENS domains to DNS, offering direct navigation to Web3 sites like mine at https://dries.eth.limo/. It serves as a simple bridge between Web2 (the conventional web) and Web3.

Streamlining content updates with IPNS

In my original post, I highlighted various challenges, such as the limitations for hosting dynamic applications, the cost of updates, and the slow speed of these updates. Although these issues still exist, my initial analysis was conducted with an incomplete understanding of the available technology. I want to delve deeper into these limitations, and refine my previous statements.

Some of these challenges stem from the fact that IPFS operates as a "content-addressed network". Unlike traditional systems that use URLs or file paths to locate content, IPFS uses a unique hash of the content itself. This hash is used to locate and verify the content, but also to facilitate decentralized storage.

While the principle of addressing content by a hash is super interesting, it also introduces some complications: whenever content is updated, its hash changes, making it tricky to link to the updated content. Specifically, every time I updated my Web3 site's content, I had to update my ENS record and pay a transaction fee on the Ethereum network.

At the time, I wasn't familiar with the InterPlanetary Name System (IPNS). IPNS, not to be confused with IPFS, addresses this challenge by assigning a mutable name to content on IPFS. You can think of IPNS as providing an "alias" or "redirect" for IPFS addresses: the IPNS address always stays the same and points to the latest IPFS address. It effectively eliminates the necessity of updating ENS records with each content change, cutting down on expenses and making the update process more automated and efficient.

To leverage IPNS, you have to take the following steps:

  1. Upload your HTML file to IPFS and receive an IPFS hash.
  2. Publish this hash to IPNS, creating an IPNS hash that directs to the latest IPFS hash.
  3. Link your ENS domain to this IPNS hash. Since the IPNS hash remains constant, you only need to update your ENS record once.

Without IPNS, updating content involved:

  1. Update the HTML file.
  2. Upload the revised file to IPFS, generating a new IPFS hash.
  3. Update the ENS record with the new IPFS hash, which costs some Ether and can take a few minutes.

With IPNS, updating content involves:

  1. Update the HTML file.
  2. Upload the revised file to IPFS, generating a new IPFS hash.
  3. Update the IPNS record to reference this new hash, which is free and almost instant.

Although IPNS is a faster and more cost-effective approach compared to the original method, it still carries a level of complexity. There is also a minor runtime delay due to the extra redirection step. However, I believe this tradeoff is worth it.

Updating my Web3 site to use IPNS

With this newfound knowledge, I decided to use IPNS for my own site. I generated an IPNS hash using both the IPFS desktop application (see screenshot) and IPFS' command line tools:

$ ipfs name publish /ipfs/bafybeibbkhmln7o4ud6an4qk6bukcpri7nhiwv6pz6ygslgtsrey2c3o3q
> Published to k51qzi5uqu5dgy8mzjtcqvgr388xjc58fwprededbb1fisq1kvl34sy4h2qu1a: /ipfs/bafybeibbkhmln7o4ud6an4qk6bukcpri7nhiwv6pz6ygslgtsrey2c3o3q
A screenshot of the IPFS Desktop application displaying an HTML file with a drop-down menu option to 'Publish to IPNS'. The IPFS Desktop application showing my index.html file with an option to 'Publish to IPNS'.

After generating the IPNS hash, I was able to visit my site in Brave using the IPFS protocol at ipfs://bafybeibbkhmln7o4ud6an4qk6bukcpri7nhiwv6pz6ygslgtsrey2c3o3q, or via the IPNS protocol at ipns://k51qzi5uqu5dgy8mzjtcqvgr388xjc58fwprededbb1fisq1kvl34sy4h2qu1a.

A browser window displaying a simple "Hello world!" webpage on IPNS. My Web3 site in Brave using IPNS.

Next, I updated the ENS record for dries.eth to link to my IPNS hash. This change cost me 0.0011 ETH (currently $4.08 USD), as shown in the Etherscan transaction. Once the transaction was processed, dries.eth began directing to the new IPNS address.

A screen confirming an updated content hash for 'dries.eth' on the Ethereum Name Service, with a link to view the transaction on Etherscan. A transaction confirmation on the ENS website, showing a successful update for dries.eth.

Rolling back my IPNS record in ENS

Unfortunately, my excitement was short-lived. A day later, dries.eth stopped working. IPNS records, it turns out, need to be kept alive – a lesson learned the hard way.

While IPFS content can be persisted through "pinning", IPNS records require periodic "republishing" to remain active. Essentially, the network's Distributed Hash Table (DHT) may drop IPNS records after a certain amount of time, typically 24 hours. To prevent an IPNS record from being dropped, the owner must "republish" it before the DHT forgets it.

I found out that the pinning services I use – Dolphin, Fleek and Pinata – don't support IPNS republishing. Looking into it further, it turns out few IPFS providers do.

During my research, I discovered Filebase, a small Boston-based company with fewer than five employees that I hadn't come across before. Interestingly, they provide both IPFS pinning and IPNS republishing. However, to pin my existing HTML file and republish its IPNS hash, I had to subscribe to their service at a cost of $20 per month.

Faced with the challenge of keeping my IPNS hash active, I found myself at a crossroads: either fork out $20 a month for a service like Filebase that handles IPNS republishing for me, or take on the responsibility of running my own IPFS node.

Of course, the whole point of decentralized storage is that people run their own nodes. However, considering the scope of my project – a single HTML file – the effort of running a dedicated node seemed disproportionate. I'm also running my IPFS node on my personal laptop, which is not always online. Maybe one day I'll try setting up a dedicated IPFS node on a Raspberry Pi or similar setup.

Ultimately, I decided to switch my ENS record back to the original IPFS link. This change, documented in the Etherscan transaction, cost me 0.002 ETH (currently $6.88 USD).

Although IPNS works, or can work, it just didn't work for me. Despite the setback, the whole experience was a great learning journey.

(Update: A couple of days after publishing this blog post, someone kindly recommended https://dwebservices.xyz/, claiming their free tier includes IPNS republishing. Although I haven't personally tested it yet, a quick look at their about page suggests they might be a promising solution.)

Web3 remains too complex for most people

Over the past two years, Web3 hosting hasn't disrupted the mainstream website hosting market. Despite the allure of Web3, mainstream website hosting is simple, reliable, and meets the needs of nearly all users.

Despite a significant upgrade of the Ethereum network that reduced energy consumption by over 99% through its transition to a Proof of Stake (PoS) consensus mechanism, environmental considerations, especially the carbon footprint associated with blockchain technologies, continue to create further challenges for the widespread adoption of Web3 technologies. (Note: ENS operates on the blockchain but IPFS does not.)

As I went through the check-up, I discovered islands of innovation and progress. Wallets and ENS domains got easier to use. However, the overall process of creating a basic website with IPFS and ENS remains relatively complex compared to the simplicity of Web2 hosting.

The need for a SQL-compatible Web3 database

Modern web applications like those built with Drupal and WordPress rely on a technology stack that includes a file system, a domain name system (e.g. DNS), a database (e.g. MariaDB or MySQL), and a server-side runtime environment (e.g. PHP).

While IPFS and ENS offer decentralized alternatives for the first two, the equivalents for databases and runtime environments are less mature. This limits the types of applications that can easily move from Web2 to Web3.

A major breakthrough would be the development of a decentralized database that is compatible with SQL, but currently, this does not seem to exist. Ensuring data integrity and confidentiality across multiple nodes without a central authority, while meeting the throughput demands of modern web applications, may simply be too hard a problem to solve.

After all, blockchains, as decentralized databases, have been in development for over a decade, yet lack support for the SQL language and fall short in speed and efficiency required for dynamic websites.

The need for a distributed runtime

Another critical component for modern websites is the runtime environment, which executes the server-side logic of web applications. Traditionally, this has been the domain of PHP, Python, Node.js, Java, etc.

WebAssembly (WASM) could emerge as a potential solution. It could make for an interesting decentralized solution as WASM binaries can be hosted on IPFS.

However, when WASM runs on the client-side – i.e. in the browser – it can't deliver the full capabilities of a server-side environment. This limitation makes it challenging to fully replicate traditional web applications.

So for now, Web3's applications are quite limited. While it's possible to host static websites on IPFS, dynamic applications requiring database interactions and server-side processing are difficult to transition to Web3.

Bridging the gap between Web2 and Web3

In the short term, the most likely path forward is blending decentralized and traditional technologies. For example, a website could store its static files on IPFS while relying on traditional Web2 solutions for its dynamic features.

Looking to the future, initiatives like OrbitDB's peer-to-peer database, which integrates with IPFS, show promise. However, OrbitDB lacks compatibility with SQL, meaning applications would need to be redesigned rather than simply transferred.

Web3 site hosting remains niche

Even the task of hosting static websites, which don't need a database or server-side processing, is relatively niche within the Web3 ecosystem.

As I wrote in my original post: "In its current state, IPFS and ENS offer limited value to most website owners, but tremendous value to a very narrow subset of all website owners." This observation remains accurate today.

IPFS and ENS stand out for their strengths in censorship resistance and reliability. However, for the majority of users, the convenience and adequacy of Web2 for hosting static sites often outweigh these benefits.

The key to broader acceptance of new technologies, like Web3, hinges on either discovering new mass-market use cases or significantly enhancing the user experience for existing ones. Web3 has not found a universal application or surpassed Web2 in user experience.

The popularity of SaaS platforms underscores this point. They dominate not because they're the most resilient or robust options, but because they're the most convenient. Despite the benefits of resilience and autonomy offered by Web3, most individuals opt for less resilient but more convenient SaaS solutions.

Conclusion

Despite the billions invested in Web3 and notable progress, its use for website hosting still has significant limitations.

The main challenge for the Web3 community is to either develop new, broadly appealing applications or significantly improve the usability of existing technologies.

Website hosting falls into the category of existing use cases.

Unfortunately, Web3 remains mostly limited to static websites, as it does not yet offer robust alternatives to SQL databases and server-side runtime.

Even within the limited scope of static websites, improvements to the user experience have been marginal, focused on individual parts of the technology stack. The overall end-to-end experience remains complex.

Nonetheless, the fact that my Web3 page is still up and running after two years is encouraging, showing the robustness of the underlying technology, even if its current use remains limited. I've grown quite fond of IPFS, and I hope to do more useful experiments with it in the future.

All things considered, I don't see Web3 taking the website hosting world by storm any time soon. That said, over time, Web3 could become significantly more attractive and functional. All in all, keeping an eye on this space is definitely fun and worthwhile.

February 26, 2024

A graph showing the state of the Digital Experience Platforms in 2024. Vendors are plotted on a grid based on their ability to execute and completeness of vision. Acquia is placed in the 'Leaders' quadrant, indicating strong performance in both vision and execution.

For the fifth year in a row, Acquia has been named a Leader in the Gartner Magic Quadrant for Digital Experience Platforms (DXP).

Acquia received this recognition from Gartner based on both the completeness of product vision and ability to execute.

Central to our vision and execution is a deep commitment to openness. Leveraging Drupal, Mautic and open APIs, we've built the most open DXP, empowering customers and partners to tailor our platform to their needs.

Our emphasis on openness extends to ensuring our solutions are accessible and inclusive, making them available to everyone. We also prioritize building trust through data security and compliance, integral to our philosophy of openness.

We're proud to be included in this report and thank our customers and partners for their support and collaboration.

Mandatory disclaimer from Gartner

Gartner, Magic Quadrant for Digital Experience Platforms, Irina Guseva, Jim Murphy, Mike Lowndes, John Field - February 21, 2024.

This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Acquia.

Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

Gartner is a registered trademark and service mark of Gartner and Magic Quadrant is a registered trademark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and are used herein with permission. All rights reserved.

February 25, 2024

Through Euroclear, we are holding some 260 billion euros in frozen Russian assets.

If we were to use those, it would mean that for the next hundred years and more our country would be seen internationally as a thief.

At the same time, we spend only about one percent of our GDP on defence. That is not good. But well. Perhaps, if the US is so very keen for us to use those Russian assets for its political amusements, it could count these 'favours' as Belgian defence spending?

Moreover, there ought to be a number of guarantees against the reprisals Russia will devise against Belgium when Euroclear reinvests the assets (which, as far as Russia is concerned, means: steals them).

Where are those guarantees? What are they? How much money are we talking about? Is it an extreme amount? Because it has to be. Is it a disgustingly extreme amount? Because it has to be.

Those guarantees, do they cover more than a hundred years? Because they have to.

It is all rather easy to want this and that. But what is being offered in return?

Martti Malmi, an early contributor to the Bitcoin project, recently shared a fascinating piece of internet history: an archive of private emails between himself and Satoshi Nakamoto, Bitcoin's mysterious founder.

The identity of Satoshi Nakamoto remains one of the biggest mysteries in the technology world. Despite extensive investigations, speculative reports, and numerous claims over the years, the true identity of Bitcoin's creator(s) is still unknown.

Martti Malmi released these private conversations in reaction to a court case focused on the true identity of Satoshi Nakamoto and the legal entitlements to the Bitcoin brand and technology.

The emails provide some interesting details into Bitcoin's early days, and might also provide some new clues about Satoshi's identity.

Satoshi and Martti worked together on a variety of different things, including the relaunch of the Bitcoin website. Their goal was to broaden public understanding and awareness of Bitcoin.

And to my surprise, the emails reveal they chose Drupal as their preferred CMS! (Thanks to Jeremy Andrews for making me aware.)

Email from Satoshi Nakamoto to Martti Malmi, dated December 2009, about Bitcoin's new Drupal site going online and being an improvement over the old bitcoin.org page.

The emails detail Satoshi's hands-on involvement, from installing Drupal themes, to configuring Drupal's .htaccess file, to exploring Drupal's multilingual capabilities.

Email from Satoshi Nakamoto to Martti Malmi, dated November 2009, discussing Drupal themes installation and comparing Drupal and Joomla!.

At some point in the conversation, Satoshi expressed reservations about Drupal's forum module.

Email from Satoshi Nakamoto to Martti Malmi, dated November 2009, about preferring Drupal as a CMS over a Wiki, and expressing that Drupal's forum capabilities are not ideal but better than Wikis.

For what it is worth, this proves that I'm not Satoshi Nakamoto. Had I been, I'd have picked Drupal right away, and I would never have questioned Drupal's forum module.

Jokes aside, as Drupal's Founder and Project Lead, learning about Satoshi's use of Drupal is a nice addition to Drupal's rich history. Almost every day, I'm inspired by the unexpected impact Drupal has.

I started to migrate all the services that I use on my internal network to my Raspberry Pi 4 cluster. I migrated my FreeBSD jails to BastilleBSD on a virtual machine running on a Raspberry Pi. See my blog post on how to migrate from ezjail to BastilleBSD: https://stafwag.github.io/blog/blog/2023/09/10/migrate-from-ezjail-to-bastille-part1-introduction-to-bastillebsd/

tianocore

Running FreeBSD as a virtual machine with UEFI on ARM64 has reached the point where it just works. With older FreeBSD versions I had to use QEMU with u-boot to get FreeBSD up and running on the Raspberry Pi as a virtual machine: https://stafwag.github.io/blog/blog/2021/03/14/howto_run_freebsd_as_vm_on_pi/.

But with the latest versions of FreeBSD (not sure when it started to work, but it works on FreeBSD 14), you can run FreeBSD as a virtual machine on ARM64 with UEFI, just like on x86 on GNU/Linux with KVM.

UEFI on KVM is in general provided by the open-source tianocore project.

I didn’t find much information on how to run OpenBSD with UEFI on x86 or ARM64.

OpenBSD 7.4

So I decided to write a blog post about it, in the hope that this information might be useful to somebody else. First I tried to download the OpenBSD 7.4 ISO image and boot it as a virtual machine on KVM (x86). But the ISO image failed to boot on a virtual machine with UEFI enabled. It looks like the ISO image only supports a legacy BIOS.

ARM64 doesn't support a "legacy BIOS". The ARM64 download page for OpenBSD 7.4 doesn't even have an ISO image, but there is an install-<version>.img image available. So I tried to boot this image on one of my Raspberry Pi systems, and this worked. I had more trouble getting NetBSD working as a virtual machine on the Raspberry Pi, but that might be a topic for another blog post :-)

You’ll find my journey with my installation instructions below.

Download

Download the installation image

Download the latest OpenBSD installation ARM64 image from: https://www.openbsd.org/faq/faq4.html#Download

The complete list of the mirrors is available at https://www.openbsd.org/ftp.html

Download the image.

[staf@staf-pi002 openbsd]$ wget https://cdn.openbsd.org/pub/OpenBSD/7.4/arm64/install74.img
--2024-02-13 19:04:52--  https://cdn.openbsd.org/pub/OpenBSD/7.4/arm64/install74.img
Connecting to xxx.xxx.xxx.xxx:3128... connected.
Proxy request sent, awaiting response... 200 OK
Length: 528482304 (504M) [application/octet-stream]
Saving to: 'install74.img'

install74.img       100%[===================>] 504.00M  3.70MB/s    in 79s     

2024-02-13 19:06:12 (6.34 MB/s) - 'install74.img' saved [528482304/528482304]

[staf@staf-pi002 openbsd]$ 

Download the checksum and the signed checksum.

[staf@staf-pi002 openbsd]$ wget https://cdn.openbsd.org/pub/OpenBSD/7.4/arm64/SHA256
--2024-02-13 19:07:00--  https://cdn.openbsd.org/pub/OpenBSD/7.4/arm64/SHA256
Connecting to xxx.xxx.xxx.xxx:3128... connected.
Proxy request sent, awaiting response... 200 OK
Length: 1392 (1.4K) [text/plain]
Saving to: 'SHA256'

SHA256                  100%[=============================>]   1.36K  --.-KB/s    in 0s      

2024-02-13 19:07:01 (8.09 MB/s) - 'SHA256' saved [1392/1392]

[staf@staf-pi002 openbsd]$ 
[staf@staf-pi002 openbsd]$ wget https://cdn.openbsd.org/pub/OpenBSD/7.4/arm64/SHA256.sig
--2024-02-13 19:08:01--  https://cdn.openbsd.org/pub/OpenBSD/7.4/arm64/SHA256.sig
Connecting to xxx.xxx.xxx.xxx:3128... connected.
Proxy request sent, awaiting response... 200 OK
Length: 1544 (1.5K) [text/plain]
Saving to: 'SHA256.sig'

SHA256.sig              100%[=============================>]   1.51K  --.-KB/s    in 0s      

2024-02-13 19:08:02 (3.91 MB/s) - 'SHA256.sig' saved [1544/1544]

[staf@staf-pi002 openbsd]$ 

Verify

OpenBSD uses signify to validate the cryptographic signatures. signify is also available for GNU/Linux (at least on Debian GNU/Linux and Arch Linux).

More details on how to verify the signature with signify are available at: https://www.openbsd.org/74.html

This blog post was also useful: https://www.msiism.org/blog/2019/10/20/authentic_pufferfish_for_penguins.html

Install OpenBSD signify

Download the signify public key from: https://www.openbsd.org/74.html

[staf@staf-pi002 openbsd]$ wget https://ftp.openbsd.org/pub/OpenBSD/7.4/openbsd-74-base.pub
--2024-02-13 19:14:25--  https://ftp.openbsd.org/pub/OpenBSD/7.4/openbsd-74-base.pub
Connecting to xxx.xxx.xxx.xxx:3128... connected.
Proxy request sent, awaiting response... 200 OK
Length: 99 [text/plain]
Saving to: 'openbsd-74-base.pub'

openbsd-74-base.pub     100%[=============================>]      99   397 B/s    in 0.2s    

2024-02-13 19:14:26 (397 B/s) - 'openbsd-74-base.pub' saved [99/99]

[staf@staf-pi002 openbsd]$

I run Debian GNU/Linux on my Raspberry Pis; let's see which signify packages are available.

[staf@staf-pi002 openbsd]$ sudo apt search signify
sudo: unable to resolve host staf-pi002: Name or service not known
[sudo] password for staf: 
Sorting... Done
Full Text Search... Done
chkrootkit/stable 0.57-2+b1 arm64
  rootkit detector

elpa-diminish/stable 0.45-4 all
  hiding or abbreviation of the mode line displays of minor-modes

fcitx-sayura/stable 0.1.2-2 arm64
  Fcitx wrapper for Sayura IM engine

fcitx5-sayura/stable 5.0.8-1 arm64
  Fcitx5 wrapper for Sayura IM engine

signify/stable 1.14-7 all
  Automatic, semi-random ".signature" rotator/generator

signify-openbsd/stable 31-3 arm64
  Lightweight cryptographic signing and verifying tool

signify-openbsd-keys/stable 2022.2 all
  Public keys for use with signify-openbsd

[staf@staf-pi002 openbsd]$

There are two OpenBSD signify packages available on Debian 12 (bookworm):

  • signify-openbsd: the OpenBSD signify tool.
  • signify-openbsd-keys: this package contains the OpenBSD release public keys, installed in /usr/share/signify-openbsd-keys/. Unfortunately, the OpenBSD 7.4 release key isn't (yet) included in Debian 12 (bookworm).
[staf@staf-pi002 openbsd]$ sudo apt install signify-openbsd signify-openbsd-keys
sudo: unable to resolve host staf-pi002: Name or service not known
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following NEW packages will be installed:
  signify-openbsd signify-openbsd-keys
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 70.4 kB of archives.
After this operation, 307 kB of additional disk space will be used.
Get:1 http://deb.debian.org/debian bookworm/main arm64 signify-openbsd arm64 31-3 [62.3 kB]
Get:2 http://deb.debian.org/debian bookworm/main arm64 signify-openbsd-keys all 2022.2 [8020 B]
Fetched 70.4 kB in 0s (404 kB/s)          
Selecting previously unselected package signify-openbsd.
(Reading database ... 94575 files and directories currently installed.)
Preparing to unpack .../signify-openbsd_31-3_arm64.deb ...
Unpacking signify-openbsd (31-3) ...
Selecting previously unselected package signify-openbsd-keys.
Preparing to unpack .../signify-openbsd-keys_2022.2_all.deb ...
Unpacking signify-openbsd-keys (2022.2) ...
Setting up signify-openbsd-keys (2022.2) ...
Setting up signify-openbsd (31-3) ...
[staf@staf-pi002 openbsd]$ 

Verify the checksum

Verify the checksum.

[staf@staf-pi002 openbsd]$ sha256sum install74.img 
09e4d0fe6d3f49f2c4c99b6493142bb808253fa8a8615ae1ca8e5f0759cfebd8  install74.img
[staf@staf-pi002 openbsd]$ 
[staf@staf-pi002 openbsd]$ grep 09e4d0fe6d3f49f2c4c99b6493142bb808253fa8a8615ae1ca8e5f0759cfebd8 SHA256
SHA256 (install74.img) = 09e4d0fe6d3f49f2c4c99b6493142bb808253fa8a8615ae1ca8e5f0759cfebd8
[staf@staf-pi002 openbsd]$ 

Verify with signify

Execute the signify command to verify the checksum. See the OpenBSD signify manpage for more information.

You’ll find a brief list of the arguments that are used to verify the authenticity of the image.

  • -C: Will verify the Checksum.
  • -p <path>: The path to the Public key.
  • -x <path>: The path to the signature file.

Verify the image with signify.

[staf@staf-pi002 openbsd]$ signify-openbsd -C -p openbsd-74-base.pub -x SHA256.sig install74.img
Signature Verified
install74.img: OK
[staf@staf-pi002 openbsd]$

Secure boot

The Debian UEFI firmware package for libvirt, ovmf, is based on https://github.com/tianocore/tianocore.github.io/wiki/OVMF.

Debian Bookworm comes with the following UEFI firmware images:

  • /usr/share/AAVMF/AAVMF_CODE.ms.fd: with secure boot enabled.
  • /usr/share/AAVMF/AAVMF_CODE.fd: without secure boot.

The full description is available at /usr/share/doc/ovmf/README.Debian on a Debian system when the ovmf package is installed.

To install OpenBSD we need to disable secure boot.

Test boot

I first started a test boot.

Log on to the Raspberry Pi.

[staf@vicky ~]$ ssh -X -CCC staf-pi002 
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
Linux staf-pi002 6.1.0-17-arm64 #1 SMP Debian 6.1.69-1 (2023-12-30) aarch64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Wed Feb 14 06:08:45 2024 from xxx.xxx.xxx.xxx
[staf@staf-pi002 ~]$ 


virt-manager

Start virt-manager and click on the [ Create a new VM ] icon.

new vm

This will bring up the "new vm" window. Select ( ) Import existing disk image; you can review the architecture options by expanding \/ Architecture options. The defaults are fine. Click on [ Forward ].

import vm

This will open the "import vm" window. Click on [ Browse ] to select the OpenBSD installation image, or just copy/paste the path.

At the bottom of the screen, you'll see Choose the operating system you are installing. Start typing openbsd and select [ x ] Include end-of-life operating systems. Debian 12 (bookworm) doesn't include support for OpenBSD 7.4 (yet), so we need to set it to "OpenBSD 7.0". Click on [ Forward ].

select custom

In the next window, keep the default Memory and CPU settings, as we're just verifying that we can boot from the installation image.

Debian uses "secure boot" by default; we need to disable it. Select [ X ] Customize configuration before install, which allows us to set the UEFI firmware image.

begin install

Set the Firmware to /usr/share/AAVMF/AAVMF_CODE.fd to disable secure boot and click on [ Begin Installation ].

Let's check if OpenBSD can boot. Great, it works!

Installation with virt-install

I prefer to use the command line to install as this allows me to make the installation reproducible and automated.

Create a ZFS dataset

I use ZFS on my Raspberry Pis; this makes it easier to create snapshots, etc., when you're testing software (a snapshot example follows the next command block).

root@staf-pi002:/var/lib/libvirt/images# zfs create staf-pi002_pool/root/var/lib/libvirt/images/openbsd-gitlabrunner001
root@staf-pi002:/var/lib/libvirt/images# 
root@staf-pi002:/var/lib/libvirt/images/openbsd-gitlabrunner001# pwd
/var/lib/libvirt/images/openbsd-gitlabrunner001
root@staf-pi002:/var/lib/libvirt/images/openbsd-gitlabrunner001# 
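For example (the snapshot name is made up), you can snapshot the new dataset before the installation and roll back if the experiment goes wrong:

# Take a snapshot before experimenting ...
zfs snapshot staf-pi002_pool/root/var/lib/libvirt/images/openbsd-gitlabrunner001@before-install
# ... and roll back to it if needed (rollback targets the most recent snapshot).
zfs rollback staf-pi002_pool/root/var/lib/libvirt/images/openbsd-gitlabrunner001@before-install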

Get the correct os-variant

To get the operating system settings, you can execute the command virt-install --osinfo list

root@staf-pi002:/var/lib/libvirt/images/openbsd-gitlabrunner001# virt-install --osinfo list | grep -i openbsd7
openbsd7.0
root@staf-pi002:/var/lib/libvirt/images/openbsd-gitlabrunner001# 

We’ll use openbsd7.0 as the operating system variant.

Create QEMU image

Create a destination disk image.

root@staf-pi002:/var/lib/libvirt/images/openbsd-gitlabrunner001# qemu-img create -f qcow2 openbsd-gitlabrunner001.qcow2 50G
Formatting 'openbsd-gitlabrunner001.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=53687091200 lazy_refcounts=off refcount_bits=16
root@staf-pi002:/var/lib/libvirt/images/openbsd-gitlabrunner001# 

Run virt-install

Run virt-install to import the virtual machine.

#!/bin/bash

virt-install --name openbsd-gitlabrunner001  \
 --noacpi \
 --boot loader=/usr/share/AAVMF/AAVMF_CODE.fd \
 --os-variant openbsd7.0 \
 --ram 2048 \
 --import \
 --disk /home/staf/Downloads/isos/openbsd/install74.img  \
 --disk /var/lib/libvirt/images/openbsd-gitlabrunner001/openbsd-gitlabrunner001.qcow2

If everything goes well, the virtual machine boots.

BdsDxe: loading Boot0001 "UEFI Misc Device" from PciRoot(0x0)/Pci(0x1,0x3)/Pci(0x0,0x0)
BdsDxe: starting Boot0001 "UEFI Misc Device" from PciRoot(0x0)/Pci(0x1,0x3)/Pci(0x0,0x0)
disks: sd0*
>> OpenBSD/arm64 BOOTAA64 1.18
boot> 
cannot open sd0a:/etc/random.seed: No such file or directory
booting sd0a:/bsd: 2861736+1091248+12711584+634544 [233295+91+666048+260913]=0x13d5cf8
Copyright (c) 1982, 1986, 1989, 1991, 1993
	The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2023 OpenBSD. All rights reserved.  https://www.OpenBSD.org

OpenBSD 7.4 (RAMDISK) #2131: Sun Oct  8 13:35:40 MDT 2023
    deraadt@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/RAMDISK
real mem  = 2138013696 (2038MB)
avail mem = 2034593792 (1940MB)
random: good seed from bootblocks
mainbus0 at root: linux,dummy-virt
psci0 at mainbus0: PSCI 1.1, SMCCC 1.1
efi0 at mainbus0: UEFI 2.7
efi0: EDK II rev 0x10000
smbios0 at efi0: SMBIOS 3.0.0
smbios0:
sd1 at scsibus1 targ 0 lun 0: <VirtIO, Block Device, >
sd1: 51200MB, 512 bytes/sector, 104857600 sectors
virtio35: msix per-VQ
ppb5 at pci0 dev 1 function 5 vendor "Red Hat", unknown product 0x000c rev 0x00: irq
pci6 at ppb5 bus 6
ppb6 at pci0 dev 1 function 6 vendor "Red Hat", unknown product 0x000c rev 0x00: irq
pci7 at ppb6 bus 7
ppb7 at pci0 dev 1 function 7 vendor "Red Hat", unknown product 0x000c rev 0x00: irq
pci8 at ppb7 bus 8
ppb8 at pci0 dev 2 function 0 vendor "Red Hat", unknown product 0x000c rev 0x00: irq
pci9 at ppb8 bus 9
ppb9 at pci0 dev 2 function 1 vendor "Red Hat", unknown product 0x000c rev 0x00: irq
pci10 at ppb9 bus 10
ppb10 at pci0 dev 2 function 2 vendor "Red Hat", unknown product 0x000c rev 0x00: irq
pci11 at ppb10 bus 11
ppb11 at pci0 dev 2 function 3 vendor "Red Hat", unknown product 0x000c rev 0x00: irq
pci12 at ppb11 bus 12
ppb12 at pci0 dev 2 function 4 vendor "Red Hat", unknown product 0x000c rev 0x00: irq
pci13 at ppb12 bus 13
ppb13 at pci0 dev 2 function 5 vendor "Red Hat", unknown product 0x000c rev 0x00: irq
pci14 at ppb13 bus 14
pluart0 at mainbus0: rev 1, 16 byte fifo
pluart0: console
"pmu" at mainbus0 not configured
agtimer0 at mainbus0: 54000 kHz
"apb-pclk" at mainbus0 not configured
softraid0 at root
scsibus2 at softraid0: 256 targets
root on rd0a swap on rd0b dump on rd0b
WARNING: CHECK AND RESET THE DATE!
erase ^?, werase ^W, kill ^U, intr ^C, status ^T

Welcome to the OpenBSD/arm64 7.4 installation program.
(I)nstall, (U)pgrade, (A)utoinstall or (S)hell? 

Continue with the OpenBSD installation as usual. Make sure that you select the second disk during the installation process.

To fully automate the installation, we need a system that executes the post-configuration at the first boot. On GNU/Linux this is normally done by cloud-init; there are solutions to get cloud-init working on the BSDs, but I didn't look into this (yet).

Have fun!

Links

February 24, 2024

If you use MySQL Shell for Visual Studio Code, using a bastion host is the easiest method to connect to a MySQL HeatWave DB Instance on OCI.

If you already have a connection set up using a bastion host, you may experience the same problem I did: MySQL Shell complains that an invalid fingerprint was detected:

This error has nothing to do with the fingerprint of your OCI user key. The problem is related to the key of your bastion host, as you can see in the output window:

This happens, for example, if you have changed your bastion host.

To resolve the problem, remove the current SSH host key for the bastion host stored in your known_hosts:

$ ssh-keygen -R "host.bastion.us-ashburn-1.oci.oraclecloud.com"

Use the name of your bastion host of course.

When done, the problem is fixed and you can connect back to your MySQL HeatWave DB Instance using MySQL Shell for Visual Studio Code.

February 23, 2024

Sorry, but shamefully driving a business into the ground is illegal. Period. That may, no, must, be met with an appropriate punishment.

Kurt Meers – opinion piece

I also wonder why the journalists who invite Vansteenbrugge never push back on his completely wrong definitions:

Slander is maliciously accusing someone of something in a way that exposes that person to public contempt, without succeeding in providing the legal proof of it, while the law presumes that such proof be provided. This can happen, for example, when you are publicly accused of paedophilia while there is absolutely no evidence for it.

Defamation is maliciously accusing someone of something so that the person's honour is damaged or he/she is exposed to public contempt, without it being legally possible or permissible to provide the proof. For example, someone is accused of theft, but the legal proof can no longer be provided because the case is time-barred.

Advo-Recht

In other words, slander and defamation have nothing at all to do with what Vansteenbrugge claims about them: something about your opinion not being allowed to endanger democracy, which has nothing to do with it, and which, incidentally, is permitted as an expression of opinion. Sharia4Belgium, for example, was allowed to express its opinion. They were not punished for that, but for recruiting terrorists.

I have rarely seen such a bad lawyer. He is a populist who performs bad theatre and advises the most foolish strategy for his client (who, as a result, now faces a prison sentence instead of community service). But well. That was clearly part of the theatre, so that public opinion could be played with: what a heavy sentence! His own choice. His own fault.

Let's compare the definition of slander with what happened here: pillorying the owners of a restaurant who have absolutely nothing to do with the case. That is literally and exactly the same as exposing someone to public contempt without succeeding in providing the legal proof.

That does not fall under the heading of freedom of expression. It falls under the heading of crime.

Damages are also being compared to punishment, which is completely wrong. Damages are not a punishment; they are compensation for the harm you have caused others. That is not a punishment. That compensation may, by the way, still increase (so, exceed the 20,000 euros), since an expert has been appointed who will now assess the actual damage.

Honestly, I think that restaurant in Antwerp suffered far more than 20,000 euros in damage. All of that must be compensated, in full, however high the amount turns out to be.

Comparing those damages with the damages that had to be paid to the parents of a student who died during a student initiation: a) the two cases have nothing whatsoever to do with each other, and b) it is ghoulishly exploiting someone's death. There is little I find more despicable. Except politicians ghoulishly exploiting someone's death. That is even worse, and moreover a mockery of the separation of powers.

What you could argue for in the Federal Parliament is raising the damages for involuntary manslaughter. But that should first be preceded by a properly conducted democratic debate. Which is not the populist nonsense we have come to expect from Sammy Mahdi.

The law presumes that proof of this be provided: in recent years he has, as a politician, not made a single attempt to start that democratic debate.

So he is a populist exploiting a death.

L'amour est comme l'oiseau de Twitter
On est bleu de lui, seulement pour 48 heures
("Love is like the Twitter bird; you're smitten with it for only 48 hours.")

Stromae – Carmen

Q.E.D.

P.S. Be sure to also read the open letter from Sanda Dia's mother. She, too, is completely fed up with the way her son's death is being exploited.

February 22, 2024

For close to a decade, I dedicated myself to building out Grafana Labs and subsequently took some personal time to recharge. I came out of these experiences as a different person, and am ready to share a few things I learned.

First step is bringing my abandoned blog back to life:

  • Rewrote the about/bio and projects pages
  • Switched from lighttpd to Caddy (automagic https! wow!)
  • Enabled Google Cloud CDN
  • Switched from Google Analytics to the open source PostHog
  • Tried CloudFlare Turnstile for comment spam prevention, but it's quite complicated and has un-googleable error codes, so I switched to something simpler
  • Upgraded hugo from 0.15 to 0.122 with very few breaking changes (!) and found a nice new minimalist theme
  • Took more care than probably necessary to avoid accidental republishing of articles in RSS feeds, since that's such an easy mistake to make with RSS GUIDs.

I don’t know who still reads this blog, but if you do, stay tuned for more!

February 20, 2024

At the end of an episode of Terzake I noticed that our dearly beloved ADIV is looking for people.

I should hope they read planet.grep, because yes, part of the kind of people the ADIV needs read our blogs. Thankfully.

Now, back in 2014 I already described what Miguel De Bruycker needed at the time for his Centre for Cybersecurity.

Mr. Van Strythem, Michel: you need exactly the same thing Miguel did back then. Right? And you will find those people in exactly the same way Miguel found them: protect us against eavesdropping, and only eavesdrop yourself within a legal framework.

I wrote in 2014:

As long as the government follows its own moral compass, these people are willing to put their skills to work for the government.

But the law stands above the military. It must be followed, also by the intelligence services. It is our only guarantee of a free society: I do not want to work on, or to have helped build, a world in which citizens lose freedoms such as privacy to technology.

In between, that piece of mine listed a few small assignments that are a little bit difficult.

Such as hiding a Linux kernel module, which is what a standard Linux virus would do. Exfiltrating data over a network device without it being too noticeable; that is something the ADIV will want to do. Receiving commands; that is basically what Back Orifice already did.

I also described that they have to know how a buffer overflow bug works, because it is a common programming mistake that leads to security problems. In other words, they should at the very least know how they could create such a security problem themselves, if only to be able to prepare their attack techniques.

I also expected them to be able to set up a standard little socket server. Of course. That is the bare minimum.

The average programmer probably can't do all of that. But then, we don't need the average programmer.

If the ADIV abides by the law and always works within the context of the law, then the ADIV will find the best people in our country. Just as the fire brigade manages to mobilize the best people in the country, the ADIV will be able to mobilize the best programmers.

But that is because it sticks to its own moral compass.

If it doesn't, then it won't.

Kind regards,

Philip. Programmer.

February 15, 2024

Recently during the Swedish MySQL User Group (SMUG), I presented a session dedicated to MySQL InnoDB Primary Keys.

I forgot to mention a detail that many people are not aware of, but that Jeremy Cole has pointed out.

Primary Key always included in secondary indexes as the right-most column

When we define a secondary index, the secondary index includes the Primary Key as the right-most column(s) of the index. It's silently added, meaning that it's not visible, but it's used to point back to the record in the clustered index.

This is an example with a table having a Primary Key composed of multiple columns:

CREATE TABLE `t1` (
`a` int NOT NULL,
`b` int NOT NULL, 
`c` int NOT NULL,
`d` int NOT NULL,
`e` int NOT NULL,
`f` varchar(10) DEFAULT 'aaa',
`inserted` datetime DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY(`a`,`b`,`c`) ) ENGINE=InnoDB; 

And this is the table’s content (only 2 records):

SELECT * FROM t1;
+---+---+---+----+----+-----+---------------------+
| a | b | c | d  | e  | f   | inserted            |
+---+---+---+----+----+-----+---------------------+
| 1 | 2 | 3 |  4 |  5 | abc | 2024-02-11 17:37:16 |
| 7 | 8 | 9 | 10 | 11 | def | 2024-02-11 17:37:26 |
+---+---+---+----+----+-----+---------------------+

Now let’s create a secondary key for the column f:

ALTER TABLE t1 ADD INDEX f_idx(f);

This key will then include the Primary Key as the right-most column(s) on the secondary index:

The orange-filled entries are the hidden ones.

Let’s verify this on the InnoDB page for that index:

And indeed, we can see that the Primary Key columns (in red) are included in each entry of the secondary index (in purple).
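A quick way to see those hidden columns at work (an illustrative query of mine, not from the original test): because f_idx already carries the Primary Key columns, a query that only needs f, a, b and c can be resolved from the secondary index alone, which EXPLAIN reports as a covering index:

EXPLAIN SELECT a, b, c FROM t1 WHERE f = 'abc'\G
-- With enough rows, you should typically see:
--   key: f_idx
--   Extra: Using index   (no lookup in the clustered index needed)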

But not always !

When the Primary Key, or part of it, is already included in a secondary index, only the missing columns of the Primary Key (if any) are added as right-most hidden entries to the secondary index.

Let’s create a secondary index where the column b will be missing:

ALTER TABLE t1 ADD INDEX sec_idx (`d`,`c`,`e`,`a`);

The column b will indeed be added as the right-most hidden column of the index. Let's verify this:

We can see above that, indeed, the value of column b is added. The same goes for the second record:

If we check in the InnoDB source code, this is also commented:

But what happens if we use only a prefix of a Primary Key column in the secondary index?

Let’s try:

CREATE TABLE `t1` (
  `a` varchar(10) NOT NULL DEFAULT 'aaaaaaaaaa',
  `b` varchar(10) NOT NULL DEFAULT 'bbbbbbbbbb',
  `c` int NOT NULL DEFAULT '1',
  `f` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`a`,`b`,`c`),
  KEY `sec_idx` (`c`,`f`,`a`(2))
) ENGINE=InnoDB 

SELECT * FROM t1;
+------------+------------+---+-----+
| a          | b          | c | f   |
+------------+------------+---+-----+
| aaaaaaaaaa | bbbbbbbbbb | 1 | abc |
| ccccccccc  | dddddddddd | 2 | def |
+------------+------------+---+-----+

We can see that only 2 characters of the column a are used in the secondary index.

If we check the InnoDB page, we can see that, in fact, the full column is also added as the right-most hidden part of the secondary index:

So InnoDB needs to have the full Primary Key, visible or hidden, in the secondary index.

This is not widely known, but let's hope this blog post clarifies it.

Enjoy MySQL & InnoDB… and of course, don’t forget to use Primary Keys !

February 14, 2024

So I like and use Spectra (aka “ultimate addons for gutenberg”) on some of my WordPress sites. If you use Spectra as well, you might want to know that Spectra 2.12 has a significant performance regression, sucking in almost all of … Here's hoping this was not intentional and that it'll get fixed soon. More info in https://wordpress.org/support/topic/2-12-significant-performance-regression/

Source

Our MySQL Belgian Days were a complete success. With 120 participants, the room was packed throughout the 2 days from the first to the last session. A record for this event !

The audience was made up of customers (some from very far away), users, contributors, community members and, of course, old friends.

The atmosphere was relaxed and everyone seemed happy to be there and to be talking about MySQL.

This is one of the rare opportunities to meet the MySQL developers who also operate our MySQL service on OCI and, for the first time, to meet the developers who code the unique features relating to HeatWave. Their presentations were technical, explaining the problems they encountered and which algorithms they used to solve them; this was very interesting. The feedback from the audience was very positive.

The first day was dedicated to the MySQL Community. Several speakers presented their solutions, ranging from caching and monitoring to distributed systems.

Some other speakers explained their architecture and how they are using hybrid systems, on premise and cloud.

The second day was dedicated to the MySQL Team. The engineers presented content related to their current projects. Geir Hoydalsvik, the Senior Director of Software Development, presented the new MySQL release model, emphasizing the imminent arrival of the very first Long-Term Support (LTS) version. The need for this session was underlined by the apparent lack of clarity of some participants on the subject.

It was also very clear that many people are using MySQL Shell and that they love it ! Many users are in fact already deploying their architectures using the Admin API, which is great news!

The MySQL Shell Dump & Load utilities are also very popular tools, and everyone praised their effectiveness.

The developers present in the room were also interested in Oystein‘s presentation regarding the use of GraalVM in MySQL to write Stored Procedures in JavaScript.

Before the end of this 2-day conference, we had the MySQL Rockstar ceremony. All the winners were unanimously acclaimed, and our first winner of the MySQL Rockstar Legend award received the loudest applause: well done Datacharmer!

The conference concluded with the traditional MySQL Community Dinner, expertly organised by Matthias, kudos !

We now have one year to prepare for our next MySQL Belgian Days, when we will celebrate MySQL’s 30th anniversary !

Here are the slides of the sessions:

Day 1 – MySQL Ecosystem Day

Day 2 – Oracle MySQL Day

MySQL Rockstar 2023 Awards

February 06, 2024

FOSDEM is over, but our video infrastructure is still working hard! This year, there were 855 events in rooms where the FOSDEM video team had equipment to stream, record, and publish everything, for a total of 365 scheduled hours of content. Postprocessing that much video is a massive task that takes quite a bit of effort; for this reason, since the 2017 edition, we have crowdsourced most of the manual work that needs to happen for this postprocessing to the devroom organizers and the speakers, as explained in this talk at FOSDEM 2019. As of this writing, out of those…

February 05, 2024

A male business person working on a laptop in a futuristic, cryptopunk setting.

For nearly three decades, third-party cookies have been a major privacy concern on the web. They allow organizations, such as advertisers, to track users' browsing activities across multiple websites, often without explicit consent.

Unlike third-party cookies, first-party cookies are restricted to the website you are on. They are often used to improve the user experience, such as keeping you logged in, remembering what is in your shopping cart, and more.

Of course, first-party cookies can also be used to track your activity, like with Google Analytics, but they can't be used to follow you beyond that site. While both types of cookies can be used to track users, third-party cookies are much more invasive and problematic.

In 2018, I made the decision to remove all tracking tools, including Google Analytics, from my personal site. My website aspires to the privacy of printed works. The anonymity of a book holds a unique charm, and I find joy in not knowing who visits my website.

That said, I have no issue with the use of first-party cookies, provided it's consensual and used to improve the user experience. I understand their importance for many organizations, especially in marketing.

Fortunately, the era of third-party cookies is coming to a close. Browsers like Safari and Firefox have already taken steps to limit third-party tracking. They still allow certain third-party cookies to ensure websites work properly. Two examples include:

  1. E-commerce sites often rely on third-party cookies for integrating payment systems. Blocking these cookies could disrupt the payment process.
  2. Complex digital platforms, like healthcare and government websites, sometimes use cross-site authentication to link departmental websites.  Blocking these could prevent access to important services.

While Safari and Firefox have been limiting third-party cookies for some time, Google Chrome is lagging behind.  Google only began phasing out third-party cookies in early 2024 (a few weeks ago), starting with just 1% of its users. Their plan is to expand this to all users by the end of 2024.

Google's strategy can be viewed in two ways:

  1. Negatively, it could look like Google is delaying the phase-out because it profits from the advertising revenue these cookies generate.
  2. Positively, Google is cautious in removing third-party cookies to avoid disrupting websites that rely on them for non-advertising purposes.

As a regular Chrome user, I didn't want to wait until the end of 2024. If you feel the same, you can block third-party cookies in Chrome now: head to Settings, select Privacy and Security, and activate Block third-party cookies. Just beware that some sites might stop working.

If you manage or develop websites, check whether they rely on third-party cookies. Use my HTTP Header Analyzer to check for SameSite=None attributes indicating third-party cookies, or test by disabling them in Chrome.
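As a quick illustration (the URL, cookie names, and values below are made up), a cookie that can be used in a third-party context must be marked SameSite=None; Secure, while typical first-party cookies use Lax or Strict, so the response headers are an easy thing to inspect:

$ curl -sI https://www.example.com/ | grep -i '^set-cookie'
Set-Cookie: session=abc123; Path=/; Secure; HttpOnly; SameSite=Lax
Set-Cookie: tracker=xyz789; Path=/; Secure; SameSite=None

The second line is the pattern to look for: without SameSite=None (and Secure), a modern browser won't send the cookie in cross-site requests.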

February 04, 2024

Why your app turned into spaghetti

Cover Image

"I do not like your software sir,
your architecture's poor.

Your users can't do anything,
unless you code some more.

This isn't how it used to be,
we had this figured out.

But then you had to mess it up
by moving into clouds."

There's a certain kind of programmer. Let's call him Stanley.

Stanley has been around for a while, and has his fair share of war stories. The common thread is that poorly conceived and specced solutions lead to disaster, or at least, ongoing misery. As a result, he has adopted a firm belief: it should be impossible for his program to reach an invalid state.

Stanley loves strong and static typing. He's a big fan of pattern matching, and enums, and discriminated unions, which allow correctness to be verified at compile time. He also has strong opinions on errors, which must be caught, logged and prevented. He uses only ACID-compliant databases, wants foreign keys and triggers to be enforced, and wraps everything in atomic transactions.

He hates any source of uncertainty or ambiguity, like untyped JSON or plain-text markup. His APIs will accept data only in normalized and validated form. When you use a Stanley lib, and it doesn't work, the answer will probably be: "you're using it wrong."

Stanley is most likely a back-end or systems-level developer. Because nirvana in front-end development is reached when you understand that this view of software is not just wrong, but fundamentally incompatible with the real world.

I will prove it.

Alice in wonderland

State Your Intent

Take a text editor. What happens if you press the up and down arrows?

The keyboard cursor (aka caret) moves up and down. Duh. Except it also moves left and right.

The editor state at the start has the caret on line 1 column 6. Pressing down will move it to line 2 column 6. But line 2 is too short, so the caret is forcibly moved left to column 1. Then, pressing down again will move it back to column 6.

It should be obvious that any editor that didn't remember which column you were actually on would be a nightmare to use. You know it in your bones. Yet this only works because the editor allows the caret to be placed on a position that "does not exist." What is the caret state in the middle? It is both column 1 and column 6.

Intent - State - View

To accommodate this, you need more than just a View that is a pure function of a State, as is now commonly taught. Rather, you need an Intent, which is the source of truth that you mutate... and which is then parsed and validated into a State. Only then can it be used by the View to render the caret in the right place.

To edit the intent, aka what a classic Controller does, is a bit tricky. When you press left/right, it should determine the new Intent.column based on the validated State.column +/- 1. But when you press up/down, it should keep the Intent.column you had before and instead change only Intent.line. New intent is a mixed function of both previous intent and previous state.

The general pattern is that you reuse Intent if it doesn't change, but that new computed Intent should be derived from State. Note that you should still enforce normal validation of Intent.column when editing too: you don't allow a user to go past the end of a line. Any new intent should be as valid as possible, but old intent should be preserved as is, even if non-sensical or inapplicable.
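To make this concrete, here is a minimal sketch of the caret example (my illustration with made-up names, not an API from the article):

type Caret = { line: number; column: number };

// State is Intent clamped to what actually exists in the document.
const validate = (intent: Caret, lines: string[]): Caret => {
  const line = Math.max(0, Math.min(intent.line, lines.length - 1));
  const column = Math.max(0, Math.min(intent.column, lines[line].length));
  return { line, column };
};

// Up/down: keep Intent.column, change only the line (derived from State).
const moveVertical = (intent: Caret, state: Caret, dy: number): Caret =>
  ({ ...intent, line: state.line + dy });

// Left/right: derive the whole new Intent from the validated State.
const moveHorizontal = (state: Caret, dx: number): Caret =>
  ({ line: state.line, column: state.column + dx });

Rendering only ever sees validate(intent), while the raw intent is what gets mutated and preserved.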

Validate vs Mutate

Functionally, for most of the code, it really does look and feel as if the state is just State, which is valid. It's just that when you make 1 state change, the app may decide to jump into a different State than one would think. When this happens, it means some old intent first became invalid, but then became valid again due to a subsequent intent/state change.

This is how applications actually work IRL. FYI.

Dining Etiquette

Knives and Forks

I chose a text editor as an example because Stanley can't dismiss this as just frivolous UI polish for limp wristed homosexuals. It's essential that editors work like this.

The pattern is far more common than most devs realize:

  • A tree view remembers the expand/collapse state for rows that are hidden.
  • Inspector tabs remember the tab you were on, even if currently disabled or inapplicable.
  • Toggling a widget between type A/B/C should remember all the A, B and C options, even if mutually exclusive.

All of these involve storing and preserving something unknown, invalid or unused, and bringing it back into play later.

More so, if software matches your expected intent, it's a complete non-event. What looks like a "surprise hidden state transition" to a programmer is actually the exact opposite. It would be an unpleasant surprise if that extra state transition didn't occur. It would only annoy users: they already told the software what they wanted, but it keeps forgetting.

The ur-example is how nested popup menus should work: good implementations track the motion of the cursor so you can move diagonally from parent to child, without falsely losing focus:

This is an instance of the goalkeeper's curse: people rarely compliment or notice the goalkeeper if they do their job, only if they mess up. Successful applications of this principle are doomed to remain unnoticed and unstudied.

Validation is not something you do once, discarding the messy input and only preserving the normalized output. It's something you do continuously and non-destructively, preserving the mess as much as possible. It's UI etiquette: the unspoken rules that everyone expects but which are mostly undocumented folklore.

This poses a problem for most SaaS in the wild, both architectural and existential. Most APIs will only accept mutations that are valid. The goal is for the database to be a sequence of fully valid states:

Mutate

The smallest possible operation in the system is a fully consistent transaction. This flattens any prior intent.

In practice, a lot of software deviates from this ad hoc. For example, spreadsheets let you create cyclic references, which are by definition invalid. The reason they must let you do this is that fixing one side of a cyclic reference also fixes the other side. A user wants and needs to do these operations in any order. So you must allow a state transition through an invalid state:

Google sheets circular reference
Mutate through invalid state

This requires an effective Intent/State split, whether formal or informal.

Edsger Dijkstra

Because cyclic references can go several levels deep, identifying one cyclic reference may require you to spider out the entire dependency graph. This is functionally equivalent to identifying all cyclic references—dixit Dijkstra. Plus, you need to produce sensible, specific error messages. Many "clever" algorithmic tricks fail this test.

Now imagine a spreadsheet API that doesn't allow for any cyclic references ever. This still requires you to validate the entire resulting model, just to determine if 1 change is allowed. It still requires a general validate(Intent). In short, it means your POST and PUT request handlers need to potentially call all your business logic.

That seems overkill, so the usual solution is bespoke validators for every single op. If the business logic changes, there is a risk your API will now accept invalid intent. And the app was not built for that.

If you flip it around and assume intent will go out-of-bounds as a normal matter, then you never have this risk. You can write the validation in one place, and you reuse it for every change as a normal matter of data flow.

Note that this is not cowboy coding. Records and state should not get irreversibly corrupted, because you only ever use valid inputs in computations. If the system is multiplayer, distributed changes should still be well-ordered and/or convergent. But the data structures you're encoding should be, essentially, entirely liberal to your user's needs.

Git diff

Consider git. Here, a "unit of intent" is just a diff applied to a known revision ID. When something's wrong with a merge, it doesn't crash, or panic, or refuse to work. It just enters a conflict state. This state is computed by merging two incompatible intents.

It's a dirty state that can't be turned into a clean commit without human intervention. This means git must continue to work, because you need to use git to clean it up. So git is fully aware when a conflict is being resolved.

As a general rule, the cases where you actually need to forbid a mutation which satisfies all the type and access constraints are small. A good example is trying to move a folder inside itself: the file system has to remain a sensibly connected tree. Enforcing the uniqueness of names is similar, but also comes with a caution: falsehoods programmers believe about names. Adding (Copy) to a duplicate name is usually better than refusing to accept it, and most names in real life aren't unique at all. Having user-facing names actually requires creating tools and affordances for search, renaming references, resolving duplicates, and so on.

Even among front-end developers, few people actually grok this mental model of a user. It's why most React(-like) apps in the wild are spaghetti, and why most blog posts about React gripes continue to miss the bigger picture. Doing React (and UI) well requires you to unlearn old habits and actually design your types and data flow so it uses potentially invalid input as its single source of truth. That way, a one-way data flow can enforce the necessary constraints on the fly.

The way Stanley likes to encode and mutate his data is how programmers think about their own program: it should be bug-free and not crash. The mistake is to think that this should also apply to any sort of creative process that program is meant to enable. It would be like making an IDE that only allows you to save a file if the code compiles and passes all the tests.

surprised, mind blown, cursing, thinking, light bulb

Trigger vs Memo

Coding around intent is a very hard thing to teach, because it can seem overwhelming. But what's overwhelming is not doing this. It leads to codebases where every new feature makes ongoing development harder, because no part of the codebase is ever finished. You will sprinkle copies of your business logic all over the place, in the form of request validation, optimistic local updaters, and guess-based cache invalidation.

If this is your baseline experience, your estimate of what is needed to pull this off will simply be wrong.

In the traditional MVC model, intent is only handled at the individual input widget or form level. e.g. While typing a number, the intermediate representation is a string. This may be empty, incomplete or not a number, but you temporarily allow that.

I've never seen people formally separate Intent from State in an entire front-end. Often their state is just an ad-hoc mix of both, where validation constraints are relaxed in the places where that was most needed. They might just duplicate certain fields to keep a validated and unvalidated variant side by side.

There is one common exception. In a React-like, when you do a useMemo with a derived computation of some state, this is actually a perfect fit. The eponymous useState actually describes Intent, not State, because the derived state is ephemeral. This is why so many devs get lost here.

const state = useMemo(
  () => validate(intent),
  [intent]
);

Their usual instinct is that every action that has knock-on effects should be immediately and fully realized, as part of one transaction. Only, they discover some of those knock-on effects need to be re-evaluated if certain circumstances change. Often to do so, they need to undo and remember what it was before. This is then triggered anew via a bespoke effect, which requires a custom trigger and mutation. If they'd instead deferred the computation, it could have auto-updated itself, and they would've still had the original data to work with.

e.g. In a WYSIWYG scenario, you often want to preview an operation as part of mouse hovering or dragging. It should look like the final result. You don't need to implement custom previewing and rewinding code for this. You just need the ability to layer on some additional ephemeral intent on top of the intent that is currently committed. Rewinding just means resetting that extra intent back to empty.

You can make this easy to use by treating previews as a special kind of transaction: now you can make preview states with the same code you use to apply the final change. You can also auto-tag the created objects as being preview-only, which is very useful. That is: you can auto-translate editing intent into preview intent, by messing with the contents of a transaction. Sounds bad, is actually good.

The same applies to any other temporary state, for example, highlighting of elements. Instead of manually changing colors, and creating/deleting labels to pull this off, derive the resolved style just-in-time. This is vastly simpler than doing it all on 1 classic retained model. There, you run the risk of highlights incorrectly becoming sticky, or shapes reverting to the wrong style when un-highlighted. You can architect it so this is simply impossible.

The trigger vs memo problem also happens on the back-end, when you have derived collections. Each object of type A must have an associated type B, created on-demand for each A. What happens if you delete an A? Do you delete the B? Do you turn the B into a tombstone? What if the relationship is 1-to-N, do you need to garbage collect?

If you create invisible objects behind the scenes as a user edits, and you never tell them, expect to see a giant mess as a result. It's crazy how often I've heard engineers suggest a user should only be allowed to create something, but then never delete it, as a "solution" to this problem. Everyday undo/redo precludes it. Don't be ridiculous.

The problem is having an additional layer of bookkeeping you didn't need. The source of truth was collection A, but you created a permanent derived collection B. If you instead make B ephemeral, derived via a stateless computation, then the problem goes away. You can still associate data with B records, but you don't treat B as the authoritative source for itself. This is basically what a WeakMap is.
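A tiny sketch of that idea (illustrative only; computeB stands in for whatever stateless derivation you need):

// Derive B from A on demand and cache it weakly: when an A record is
// dropped, its derived B disappears with it, with no garbage collection code.
const derived = new WeakMap();

const getB = (a) => {
  if (!derived.has(a)) derived.set(a, computeB(a));
  return derived.get(a);
};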

Event Sourcing

In database land this can be realized with a materialized view, which can be incremental and subscribed to. Taken to its extreme, this turns into event-based sourcing, which might seem like a panacea for this mindset. But in most cases, the latter is still a system by and for Stanley. The event-based nature of those systems exists to support housekeeping tasks like migration, backup and recovery. Users are not supposed to be aware that this is happening. They do not have any view into the event log, and cannot branch and merge it. The exceptions are extremely rare.

It's not a system for working with user intent, only for flattening it, because it's append-only. It has a lot of the necessary basic building blocks, but substitutes programmer intent for user intent.

What's most nefarious is that the resulting tech stacks are often quite big and intricate, involving job queues, multi-layered caches, distribution networks, and more. It's a bunch of stuff that Stanley can take joy and pride in, far away from users, with "hard" engineering challenges. Unlike all this *ugh* JavaScript, which is always broken and unreliable and uninteresting.

Except it's only needed because Stanley only solved half the problem, badly.

Patch or Bust

When factored in from the start, it's actually quite practical to split Intent from State, and it has lots of benefits. Especially if State is just a more constrained version of the same data structures as Intent. This doesn't need to be fully global either, but it needs to encompass a meaningful document or workspace to be useful.

It does create an additional problem: you now have two kinds of data in circulation. If reading or writing requires you to be aware of both Intent and State, you've made your code more complicated and harder to reason about.

Validate vs Mutate

More so, making a new Intent requires a copy of the old Intent, which you mutate or clone. But you want to avoid passing Intent around in general, because it's fishy data. It may have the right types, but the constraints and referential integrity aren't guaranteed. It's a magnet for the kind of bugs a type-checker won't catch.

I've published my common solution before: turn changes into first-class values, and make a generic update of type Update<T> be the basic unit of change. As a first approximation, consider a shallow merge {...value, ...update}. This allows you to make an updateIntent(update) function where update only specifies the fields that are changing.

In other words, Update<Intent> looks just like Update<State> and can be derived 100% from State, without Intent. Only one place needs to have access to the old Intent, all other code can just call that. You can make an app intent-aware without complicating all the code.

Validate vs Mutate 2

If your state is cleaved along orthogonal lines, then this is all you need. i.e. If column and line are two separate fields, then you can selectively change only one of them. If they are stored as an XY tuple or vector, now you need to be able to describe a change that only affects either the X or Y component.

const value = {
  hello: 'text',
  foo: { bar: 2, baz: 4 },
};

const update = {
  hello: 'world',
  foo: { baz: 50 },
};

expect(
  patch(value, update)
).toEqual({
  hello: 'world',
  foo: { bar: 2, baz: 50 },
});

So in practice I have a function patch(value, update) which implements a comprehensive superset of a deep recursive merge, with full immutability. It doesn't try to do anything fancy with arrays or strings, they're just treated as atomic values. But it allows for precise overriding of merging behavior at every level, as well as custom lambda-based updates. You can patch tuples by index, but this is risky for dynamic lists. So instead you can express e.g. "append item to list" without the entire list, as a lambda.
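The recursive core of such a function is small; here is a stripped-down sketch (mine, not the author's actual patch, which also supports per-level merge overrides and lambda updates):

const isObject = (v) =>
  v != null && typeof v === 'object' && !Array.isArray(v);

const patch = (value, update) => {
  // Anything that isn't a plain object (arrays, strings, numbers, null)
  // is treated as an atomic value and simply replaced.
  if (!isObject(value) || !isObject(update)) return update;
  const out = { ...value };
  for (const key of Object.keys(update)) {
    out[key] = patch(value[key], update[key]);
  }
  return out;
};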

I've been using patch for years now, and the uses are myriad. To overlay a set of overrides onto a base template, patch(base, overrides) is all you need. It's the most effective way I know to erase a metric ton of {...splats} and ?? defaultValues and != null from entire swathes of code. This is a real problem.

You could also view this as a "poor man's OT", with the main distinction being that a patch update only describes the new state, not the old state. Such updates are not reversible on their own. But they are far simpler to make and apply.

It can still power a global undo/redo system, in combination with its complement diff(A, B): you can reverse an update by diffing in the opposite direction. This is an operation which is formalized and streamlined into revise(…), so that it retains the exact shape of the original update, and doesn't require B at all. The structure of the update is sufficient information: it too encodes some intent behind the change.

With patch you also have a natural way to work with changes and conflicts as values. The earlier WYSIWYG scenario is just patch(committed, ephemeral) with bells on.

The net result is that mutating my intent or state is as easy as doing a {...value, ...update} splat, but I'm not falsely incentivized to flatten my data structures.

Instead it frees you up to think about what the most practical schema actually is from the data's point of view. This is driven by how the user wishes to edit it, because that's what you will connect it to. It makes you think about what a user's workspace actually is, and lets you align boundaries in UX and process with boundaries in data structure.

Array vs Linked List

Remember: most classic "data structures" are not about the structure of data at all. They serve as acceleration tools to speed up specific operations you need on that data. Having the reads and writes drive the data design was always part of the job. What's weird is that people don't apply that idea end-to-end, from database to UI and back.

SQL tables are shaped the way they are because it enables complex filters and joins. However, I find this pretty counterproductive: it produces derived query results that are difficult to keep up to date on a client. They also don't look like any of the data structures I actually want to use in my code.

A Bike Shed of Schemas

This points to a very under-appreciated problem: it is completely pointless to argue about schemas and data types without citing specific domain logic and code that will be used to produce, edit and consume it. Because that code determines which structures you are incentivized to use, and which structures will require bespoke extra work.

From afar, column and line are just XY coordinates. Just use a 2-vector. But once you factor in the domain logic and etiquette, you realize that the horizontal and vertical directions have vastly different rules applied to them, and splitting might be better. Which one do you pick?

This applies to all data. Whether you should put items in a List<T> or a Map<K, V> largely depends on whether the consuming code will loop over it, or need random access. If an API only provides one, consumers will just build the missing Map or List as a first step. This is O(n log n) either way, because of sorting.

The method you use to read or write your data shouldn't limit use of everyday structure. Not unless you have a very good reason. But this is exactly what happens.

A lot of bad choices in data design come down to picking the "wrong" data type simply because the most appropriate one is inconvenient in some cases. This then leads to Conway's law, where one team picks the types that are most convenient only for them. The other teams are stuck with it, and end up writing bidirectional conversion code around their part, which will never be removed. The software will now always have this shape, reflecting which concerns were considered essential. What color are your types?

{
  order: [4, 11, 9, 5, 15, 43],
  values: {
    4: {...},
    5: {...},
    9: {...},
    11: {...},
    15: {...},
    43: {...},
  },
}

For List vs Map, you can have this particular cake and eat it too. Just provide a List<Id> for the order and a Map<Id, T> for the values. If you structure a list or tree this way, then you can do both iteration and ID-based traversal in the most natural and efficient way. Don't underestimate how convenient this can be.

This also has the benefit that "re-ordering items" and "editing items" are fully orthogonal operations. It decomposes the problem of "patching a list of objects" into "patching a list of IDs" and "patching N separate objects". It makes code for manipulating lists and trees universal. It lets you decide on a case-by-case basis whether you need to garbage collect the map, or whether preserving unused records is actually desirable.
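Using the hypothetical patch sketch from earlier (and a made-up label field), a reorder and an edit on the order/values structure above are two independent updates that never touch each other's data:

const state = {
  order: [4, 11, 9],
  values: { 4: { label: 'a' }, 9: { label: 'b' }, 11: { label: 'c' } },
};

// Re-ordering only patches the list of IDs...
const reordered = patch(state, { order: (ids) => [...ids].reverse() });

// ...while editing only patches one record, leaving the order untouched.
const edited = patch(state, { values: { 9: { label: 'B' } } });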

Limiting it to ordinary JSON or JS types, rather than going full-blown OT or CRDT, is a useful baseline. With sensible schema design, at ordinary editing rates, CRDTs are overkill compared to the ability to just replay edits, or notify conflicts. This only requires version numbers and retries.

Users need those things anyway: just because a CRDT converges when two people edit, doesn't mean the result is what either person wants. The only case where OTs/CRDTs are absolutely necessary is rich-text editing, and you need bespoke UI solutions for that anyway. For simple text fields, last-write-wins is perfectly fine, and also far superior to what 99% of RESTy APIs do.

CRDT

A CRDT is just a mechanism that translates partially ordered intents into a single state. Like, it's cool that you can make CRDT counters and CRDT lists and whatnot... but each CRDT implements only one particular resolution strategy. If it doesn't produce the desired result, you've created invalid intent no user expected. With last-write-wins, you at least have something 1 user did intend. Whether this is actually destructive or corrective is mostly a matter of schema design and minimal surface area, not math.

The main thing that OTs and CRDTs do well is resolve edits on ordered sequences, like strings. If two users are typing text in the same doc, edits higher-up will shift edits down below, which means the indices change when rebased. But if you are editing structured data, you can avoid referring to indices entirely, and just use IDs instead. This sidesteps the issue, like splitting order from values.

For the order, there is a simple solution: a map with a fractional index, effectively a dead-simple list CRDT. It just comes with some overhead.
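A dead-simple sketch of that idea (hypothetical; real implementations use string keys and occasional rebalancing so they never run out of precision):

// Each ID carries a fractional key; sorting by key yields the order.
const keys = new Map([[4, 1.0], [11, 2.0], [9, 3.0]]);

// Insert an item between two neighbours by taking the midpoint of their keys.
const insertBetween = (id, before, after) =>
  keys.set(id, (keys.get(before) + keys.get(after)) / 2);

insertBetween(15, 4, 11); // 15 now sorts between 4 and 11
const order = [...keys.entries()].sort((a, b) => a[1] - b[1]).map(([id]) => id);
// [4, 15, 11, 9]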

Google docs comment

Using a CRDT for string editing might not even be enough. Consider Google Docs-style comments anchored to that text: their indices also need to shift on every edit. Now you need a bespoke domain-aware CRDT. Or you work around it by injecting magic markers into the text. Either way, it seems non-trivial to decouple a CRDT from the specific target domain of the data inside. The constraints get mixed in.

If you ask me, this is why the field of real-time web apps is still in somewhat of a rut. It's mainly viewed as a high-end technical problem: how do we synchronize data structures over a P2P network without any data conflicts? What they should be asking is: what is the minimal amount of structure we need to reliably synchronize, so that users can have a shared workspace where intent is preserved, and conflicts are clearly signposted. And how should we design our schemas, so that our code can manipulate the data in a straightforward and reliable way? Fixing non-trivial user conflicts is simply not your job.

Most SaaS out there doesn't need any of this technical complexity. Consider that a good multiplayer app requires user presence and broadcast anyway. The simplest solution is just a persistent process on a single server coordinating this, one per live workspace. It's what most MMOs do. In fast-paced video games, this even involves lag compensation. Reliable ordering is not the big problem.

The situations where this doesn't scale, or where you absolutely must be P2P, are a minority. If you run into them, you must be doing very well. The solution that I've sketched out here is explicitly designed so it can comfortably be done by small teams, or even just 1 person.

Private CAD app

The (private) CAD app I showed glimpses of above is entirely built this way. It's patch all the way down and it's had undo/redo from day 1. It also has a developer mode where you can just edit the user-space part of the data model, and save/load it.

When the in-house designers come to me with new UX requests, they often ask: "Is it possible to do ____?" The answer is never a laborious sigh from a front-end dev with too much on their plate. It's "sure, and we can do more."

If you're not actively aware the design of schemas and code is tightly coupled, your codebase will explode, and the bulk of it will be glue. Much of it just serves to translate generalized intent into concrete state or commands. Arguments about schemas are usually just hidden debates about whose job it is to translate, split or join something. This isn't just an irrelevant matter of "wire formats" because changing the structure and format of data also changes how you address specific parts of it.

In an interactive UI, you also need a reverse path, to apply edits. What I hope you are starting to realize is that this is really just the forward path in reverse, on so many levels. The result of a basic query is just the ordered IDs of the records that it matched. A join returns a tuple of record IDs per row. If you pre-assemble the associated record data for me, you actually make my job as a front-end dev harder, because there are multiple forward paths for the exact same data, in subtly different forms. What I want is to query and mutate the same damn store you do, and be told when what changes. It's table-stakes now.

With well-architected data, this can be wired up mostly automatically, without any scaffolding. The implementations you encounter in the wild just obfuscate this, because they don't distinguish between the data store and the model it holds. The fact that the data store should not be corruptible, and should enforce permissions and quotas, is incorrectly extended to the entire model stored inside. But that model doesn't belong to Stanley, it belongs to the user. This is why desktop applications didn't have a "Data Export". It was just called Load and Save, and what you saved was the intent, in a file.

Windows 95 save dialog

Having a universal query or update mechanism doesn't absolve you from thinking about this either, which is why I think the patch approach is so rare: it looks like cowboy coding if you don't have the right boundaries in place. Patch is mainly for user-space mutations, not kernel-space, a concept that applies to more than just OS kernels. User-space must be very forgiving.

If you avoid it, you end up with something like GraphQL, a good example of solving only half the problem badly. Its getter assembles data for consumption by laboriously repeating it in dozens of partial variations. And it turns the setter part into an unsavory mix of lasagna and spaghetti. No wonder, it was designed for a platform that owns and hoards all your data.

* * *

Viewed narrowly, Intent is just a useful concept to rethink how you enforce validation and constraints in a front-end app. Viewed broadly, it completely changes how you build back-ends and data flows to support that. It will also teach you how adding new aspects to your software can reduce complexity, not increase it, if done right.

A good metric is to judge implementation choices by how many other places of the code need to care about them. If a proposed change requires adjustments literally everywhere else, it's probably a bad idea, unless the net effect is to remove code rather than add.

Live Canvas

I believe reconcilers like React or tree-sitter are a major guide stone here. What they do is apply structure-preserving transforms on data structures, and incrementally. They actually do the annoying part for you. I based Use.GPU on the same principles, and use it to drive CPU canvases too. The tree-based structure reflects that one function's state just might be the next function's intent, all the way down. This is a compelling argument that the data and the code should have roughly the same shape.

Back-end vs Front-end split

You will also conclude there is nothing more nefarious than a hard split between back-end and front-end. You know, coded by different people, where each side is only half-aware of the other's needs, but one sits squarely in front of the other. Well-intentioned guesses about what the other end needs will often be wrong. You will end up with data types and query models that cannot answer questions concisely and efficiently, and which must be babysat to not go stale.

In the last 20 years, little has changed here in the wild. On the back-end, it still looks mostly the same. Even when modern storage solutions are deployed, people end up putting SQL- and ORM-like layers on top, because that's what's familiar. The split between back-end and database has the exact same malaise.

None of this work actually helps make the app more reliable, it's the opposite: every new feature makes on-going development harder. Many "solutions" in this space are not solutions, they are copes. Maybe we're overdue for a NoSQL-revival, this time with a focus on practical schema design and mutation? SQL was designed to model administrative business processes, not live interaction. I happen to believe a front-end should sit next to the back-end, not in front of it, with only a thin proxy as a broker.

What I can tell you for sure is: it's so much better when intent is a first-class concept. You don't need nor want to treat user data as something to pussy-foot around, or handle like it's radioactive. You can manipulate and transport it without a care. You can build rich, comfy functionality on top. Once implemented, you may find yourself not touching your network code for a very long time. It's the opposite of overwhelming, it's lovely. You can focus on building the tools your users need.

This can pave the way for more advanced concepts like OT and CRDT, but will show you that neither of them is a substitute for getting your application fundamentals right.

In doing so, you reach a synthesis of Dijkstra and anti-Dijkstra: your program should be provably correct in its data flow, which means it can safely break in completely arbitrary ways.

Because the I in UI meant "intent" all along.

More:

February 03, 2024

A while ago I wrote an article comparing Waterloo with Bakhmut. That was in August 2023. We are in February 2024 and I predict that soon I will have to say the same about Avdiivka.

The city is about to be encircled in the coming month and is in fact already more or less technically encircled (all of its supply roads are within fire range of the Russians). I think that Avdiivka has already surpassed the death toll of Bakhmut many times over.

The number of deaths is now also becoming comparable to the WWII Operation Bagration, which saw about 400,000 German and 180,000 Red Army deaths.

The Soviet operation was named after the Georgian prince Pyotr Bagration (1765–1812), a general of the Imperial Russian Army during the Napoleonic Wars.

Wikipedia

February 01, 2024

HTTP headers play a crucial part in making your website fast and secure. For that reason, I often inspect HTTP headers to troubleshoot caching problems or review security settings.

The complexity of the HTTP standard and the challenge to remember all the best practices led me to develop an HTTP Header Analyzer four years ago.

It is pretty simple: enter a URL, and the tool will analyze the headers sent by your webserver, CMS or web application. It then explains these headers, offers a score, and suggests possible improvements.

For a demonstration, click https://dri.es/headers?url=https://www.reddit.com. As the URL suggests, it will analyze the HTTP headers of Reddit.com.

I began this as a weekend project in the early days of COVID, seeing it as just another addition to my toolkit. At the time, I simply added it to my projects page but never announced or mentioned it on my blog.

So why write about it now? Because I happened to check my log files and, lo and behold, the little scanner that could clocked in more than 5 million scans, averaging over 3,500 scans a day.

So four years and five million scans later, I'm finally announcing it to the world!

If you haven't tried my HTTP header analyzer, check it out. It's free, easy to use, requires no sign-up, and is built to help improve your website's performance and security.

The crawler works with all websites, but naturally, I added some special checks for Drupal sites.

I don't have any major plans for the crawler. At some point, I'd like to move it to its own domain, as it feels a bit out of place as part of my personal website. But for now, that isn't a priority.

For the time being, I'm open to any feedback or suggestions and will commit to making any necessary corrections or improvements.

It's rewarding to know that this tool has made thousands of websites faster and safer! It's also a great reminder to share your work, even in the simplest way – you never know the impact it could have.

January 27, 2024

With FOSDEM just a few days away, it is time for us to enlist your help. Every year, an enthusiastic band of volunteers make FOSDEM happen and make it a fun and safe place for all our attendees. We could not do this without you. This year we again need as many hands as possible, especially for heralding during the conference, during the buildup (starting Friday at noon) and teardown (Sunday evening). No need to worry about missing lunch at the weekend, food will be provided. Would you like to be part of the team that makes FOSDEM tick?

January 25, 2024

I recently added a new page to my website, which displays the top 10 topics of each year. This page serves a dual purpose: it visualizes the evolution of my interests, and provides visitors with an overview of the core content of my site.

As I stared at this new page, it became very obvious that my blog has maintained a consistent focus throughout the years. The top categories have not changed meaningfully in 18 years. The recurring themes include Drupal, Acquia, Open Source, and photography. The latter, a long-term interest, has experienced ebbs and flows during this time.

Just as having the same cereal each day can become boring – a fact my wife kindly reminds me of – covering the same topics for 18 years can lead to a certain "predictability". This realization sparked a desire to spice things up!

A cabinet displaying jars, each with a different interest inside: travel, photography, electronics, tennis, food, investing, coffee, and more. A cabinet with all my interests.

My blog history wouldn't suggest it, but I consider myself a person with many interests. Sometimes I have too many interests. As a result, I'm often drawn into various side projects – another fact my wife kindly reminds me of.

My point is that there are certainly different aspects of my life and interests that I could write more about. With that in mind, a goal for 2024 is to diversify my blog's content. I set a goal to introduce at least two new topics into my top 10 list for 2024, alongside Drupal and Acquia.

For example, I've been passionate about investing for 15 years. A few of my investments have generated a 5,000% return. Yet, I've been reluctant to write about investing for 15 years. The reasons for this hesitation are as diverse as a well-balanced investment portfolio: the challenge of finding enough time, moments of impostor syndrome, fear of confusing my long-term readers, and the sensitivity around discussing financial successes and failures.

Some might say that finance-related blog posts don't exactly bring the heat when it comes to spicing up a blog. They are probably right. But then I reminded myself that this blog is, first and foremost, for me. I write to learn and engage with readers. Consider this a heads-up: brace yourselves for some new flavors on my blog, from finance to who knows what else.

What I love most about blogging is how it stimulates my thoughts and sparks engaging discussions, often offline or via email. Introducing two new topics should most certainly lead to more of that. I look forward to such conversations.

January 21, 2024

As of Autoptimize Pro 2.3 there is an option to delay all JavaScript. Delaying can have a bigger positive performance impact than asyncing or deferring, because AOPro will only load the delayed JS after a configurable delay or (better even) only at the moment user interaction (such as swiping/scrolling or mouse movement) is noticed. Obviously when delaying all JS you might want to be able to…

Source

January 18, 2024

2024 will be the first year there will be a FOSDEM junior track, organizing workshops for children aged 7-17. Developers from the FOSS Educational Programming Languages devroom will be organizing workshops for children from 7-17 with the help of CoderDojo. At FOSDEM, volunteers from CoderDojo International, CoderDojo Netherlands, and CoderDojo Belgium will be helping during the workshops. There will be workshops from MicroBlocks, Snap!, Zim, MIT App Inventor, Raspberry Pi, Hedy, and CoderDojo. For these workshops, you will need to get a free ticket. Visit https://fosdem.org/2024/schedule/track/junior and click on the workshop of your choice. On the workshop page, you…

January 16, 2024

Maintaining a production dataset at a manageable size can present a considerable challenge during the administration of a MySQL InnoDB Cluster.

Old Days

Back in the day, when we only had one main copy of our data (the source) and one read copy (the replica) that we used to look at current and old data from our main system, we used a special trick to remove data without affecting the replica. The trick was to turn off writes to the binary log for our delete statements on the main system. External tools like pt-archiver were also able to use that trick. To bypass writing into the binary log, we used the command: SET SQL_LOG_BIN=0.

This means that on the main production server (the replication source), we were purging the data without writing the delete operations into the binary logs:
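A minimal sketch of what that looked like (reusing the t1 table and date filter from the examples further down; the exact statements of course depended on the data being archived):

SET SQL_LOG_BIN=0;
DELETE FROM t1 WHERE date(timestamp) < "2023-02-01";
SET SQL_LOG_BIN=1;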

Current Days

These days, nobody runs a single MySQL node in production. High Availability is highly recommended (required!).

So people are using MySQL InnoDB Cluster as their main production solution. In such an environment, keeping the dataset at a reasonable size is also recommended (CLONE, backups, …).

However, the old solution for archiving data no longer works: if we skip writing events to the binary logs, those events won't be replicated to the other members of the cluster, since Group Replication also relies on those binary logs.

This means we will end up with an inconsistent cluster where all the nodes have a different dataset!

GTIDs can assist in the process; however, it is often challenging to recall the particular UUID component of the GTID that is utilized for archiving purposes. Moreover, it gets complicated to figure out where these transactions come from.

MySQL 8.3, which launched today, introduces even more efficient ways to streamline our purging and archiving workflows.

GTID TAGS

With MySQL 8.3, the GTID format has evolved and now includes the possibility of adding an identifiable tag to it:

source_id:tag:transaction_id

Tags are of course not mandatory and GTIDs will work as before too.

So on the replica that we want to use as the archiving server, we will change the GTID set to include a large range of transactions coming from the cluster, but with a tag we define. In this example, I will use “archiving”.

On a member of the MySQL InnoDB Cluster, the following query was executed to retrieve the UUID segment of the GTID used by the Group:

InnoDB Cluster (any node) > show global variables like '%group_name';
+------------------------------+--------------------------------------+
| Variable_name                | Value                                |
+------------------------------+--------------------------------------+
| group_replication_group_name | 1e4516ec-b47c-11ee-bb6e-c8cb9e32df8e |
+------------------------------+--------------------------------------+

Now, on the replica used as the archiving server, I run the following statement:

Replica > set global gtid_purged=
             "+1e4516ec-b47c-11ee-bb6e-c8cb9e32df8e:archiving:1-9999";

On the Primary instance, we can now start the archiving process like this:

InnoDB Cluster Primary > set gtid_next="automatic:archiving";
InnoDB Cluster Primary > delete from t1 where date(timestamp)<"2023-02-01";
InnoDB Cluster Primary > set gtid_next="automatic";

It’s of course better to use a dedicated session to archive the data.

This is the illustration of the process:

It’s imperative to use a large range and to closely monitor its usage on the cluster. In this example we can only have 9,999 purging transactions; upon reaching this threshold, the next operations will be applied on the replica too.

Of course, it was already possible to perform such an operation with standard GTIDs, but it was more complicated, or you needed to use a specific UUID. With the new GTID tags, we can directly and easily identify the purging transactions.

Integration with external tools is also easier. For example, here is how to use GTID tags with pt-archiver:

./pt-archiver \
--source h=production_cluster,P=3306,u=user,p=xxxx,D=mydata,t=t1 \
--purge --where 'date(timestamp)<"2023-04-01"' \
--set-vars 'GTID_NEXT="automatic:archiving"'

Conclusion

MySQL 8.3 marks a notable improvement for database administrators who manage High Availability environments using MySQL InnoDB Cluster. With the introduction of GTID tags, the archiving and purging process becomes substantially more manageable and less prone to human error. These tags allow for clear identification of transaction sources, thus avoiding the pitfalls of inconsistent datasets across cluster nodes.

Beyond simplifying the identification process, GTID tags facilitate the integration with external tools.

Enjoy archiving and purging your data!

January 12, 2024

The MySQL Belgian Days are sold out. Our event, taking place on the 1st and 2nd of February 2024 in Brussels, seems to have already attracted a lot of interest.

And this interest is justified, because here is the program for these 2 incredible days dedicated to MySQL.

Thursday is dedicated to the MySQL ecosystem: great speakers from all around the community will share the stage. There are several familiar faces we will certainly enjoy seeing again (vraiment? oui oui!)*.

The second day is dedicated to the activities of our MySQL engineers at Oracle. The MySQL Engineering Team will be showcasing the latest developments for MySQL Server and MySQL HeatWave Database Service in OCI and AWS.

During the intermissions, do not hesitate to engage in conversation with the speakers and take the opportunity to directly engage with the MySQL Engineers.

On Friday, we will conclude the day by extending a warm welcome to the new MySQL Rockstars. Be sure not to miss the award ceremony.

And finally, on Friday night, the Belgian Party Squad is organizing the traditional MySQL & Friends Community Dinner. Don't forget to register, tickets are going fast!

See you soon in Belgium!!

(*) private joke

January 08, 2024

At the beginning of every year, I publish a retrospective that looks back at the previous year at Acquia. I also discuss the changing dynamics in our industry, focusing on Content Management Systems (CMS) and Digital Experience Platforms (DXP).

If you'd like, you can read all of my retrospectives for the past 15 years: 2022, 2021, 2020, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012, 2011, 2010, 2009.

Resilience and growth amid market turbulence

At the beginning of 2023, interest rates were 4.5%. Technology companies, investors, and PE firms were optimistic, anticipating modest growth. However, as inflation persisted and central banks raised rates more than expected, this optimism dwindled.

The first quarter also saw a regional bank crisis, notably the fall of Silicon Valley Bank, which many tech firms, including Acquia, relied on. Following these events, the market's pace slowed, and the early optimism shifted to a more cautious outlook.

Despite these challenges, Acquia thrived. We marked 16 years of revenue increase, achieved record renewal rates, and continued our five-year trend of rising profitability. 2023 was another standout year for Acquia.

One of our main objectives for 2023 was to expand our platform through M&A. However, tighter credit lending, valuation discrepancies, and economic uncertainty complicated these efforts. By the end of 2023, with the public markets rebounding, the M&A landscape showed slight improvement.

In November, we announced Acquia's plan to acquire Monsido, a platform for improving website accessibility, content quality, SEO, privacy, and performance. The acquisition closed last Friday. I'm excited about expanding the value we offer to our customers and look forward to welcoming Monsido's employees to Acquia.

Working towards a safer, responsible and inclusive digital future

Image with panels showcasing various digital trends: one emphasizes inclusivity with a woman in a wheelchair at a workstation, and another shows a man interacting with an AI robot, illustrating human-AI collaboration.

Looking ahead to 2024, I anticipate these to be the dominant trends in the CMS and DXP markets:

  • Converging technology ecosystems: MACH and Jamstack are evolving beyond their original approaches. As a result, we'll see their capabilities converge with one another, and with those of traditional CMSes. I wrote extensively about this in Jamstack and MACH's journey towards traditional CMS concepts.
  • Navigating the cookie-less future: Marketers will need to navigate a cookie-less future. This means organizations will depend more and more on data they collect from their own digital channels (websites, newsletters, video platforms, etc).
  • Digital decentralization: The deterioration of commercial social media platforms has been a positive development in my opinion. I anticipate users will continue to reduce their time on these commercial platforms. The steady shift towards open, decentralized alternatives like Mastodon, Nostr, and personal websites is a welcome trend.
  • Growth in digital accessibility: The importance of accessibility is growing and will become even more important in 2024 as organizations prepare for enforcement of the European Accessibility Act in 2025. This trend isn't just about responding to legislation; it's about making sure digital experiences are inclusive to everyone, including individuals with disabilities.
  • AI's impact on digital marketing and websites: As people start getting information directly from Artificial Intelligence (AI) tools, organic website traffic will decline. Just like with the cookie-less future, organizations will need to focus more on growing their own digital channels with exclusive and personalized content.
  • AI's impact on website building: We'll witness AI's evolution from assisting in content production to facilitating the building of applications. Instead of laboriously piecing together landing pages or campaigns with a hundred clicks, users will simply be able to guide the process with AI prompts. AI will evolve to become the new user interface for complex tasks.
  • Cybersecurity prioritization: As digital landscapes expand, so do vulnerabilities. People and organizations will become more protective of their personal and privacy data, and will demand greater control over the sharing and storage of their information. This means a growing focus on regulation, more strict compliance rules, automatic software updates, AI-driven monitoring and threat detection, passwordless authentication, and more.
  • Central content and data stores: Organizations are gravitating more and more towards all-in-one platforms that consolidate data and content. This centralization enables businesses to better understand and anticipate customer needs, and deliver better, personalized customer experiences.

While some of these trends suggest a decline in the importance of traditional websites, others trends point towards a positive future for websites. On one side, the rise of AI in information gathering will decrease the need for traditional websites. On the other side, the decline of commercial social media and the shift to a cookie-less future suggest that websites will continue to be important, perhaps even more so.

What I like most about many of these trends is that they are shaping a more intuitive, inclusive, and secure digital future. Their impact on end-users will be profound, making every interaction more personal, accessible, and secure.

However, I suspect the ways in which we do digital marketing will need to change quite a bit. Marketing teams will need to evolve how they generate leads. They'll have to use privacy-friendly methods to develop strong customer relationships and offer more value than what AI tools provide.

This means getting closer to customers with content that is personal and relevant. The use of intent data, first-party data and predictive marketing for determining the "next best actions" will continue to grow in importance.

It also means that more content may transition into secure areas such as newsletters, members-only websites, or websites that tailor content dynamically for each user, where it can't be mined by AI tools.

All this bodes well for CMSes, Customer Data Platforms (CDPs), personalization software, Account-Based Marketing (ABM), etc. By utilizing these platforms, marketing teams can better engage with individuals and offer more meaningful experiences. Acquia is well-positioned based on these trends.

Reaffirming our DXP strategy, with a focus on openness

On a personal front, my title expanded from CTO to CTO & Chief Strategy Officer. Since Acquia's inception, I've always played a key role in shaping both our technology and business strategies. This title change reflects my ongoing responsibilities.

Until 2018, Acquia mainly focused on CMS. In 2018, we made a strategic shift from being a leader in CMS to becoming a leader in DXP. We have greatly expanded our product portfolio since then. Today, Acquia's DXP includes CMS, Digital Asset Management (DAM), Customer Data Platform (CDP), Marketing Automation, Digital Experience Optimization, and more. We've been recognized as leaders in DXP by analyst firms including Gartner, GigaOm, Aragon Research and Omdia.

As we entered 2023, we felt we had successfully executed our big 2018 strategy shift. With my updated title, I spearheaded an effort to revise our corporate strategy to figure out what is next. The results were that we reaffirmed our commitment to our core DXP market with the goal of creating the best "Open DXP" in the market.

We see "Open" as a key differentiator. As part of our updated strategy, we explicitly defined what "Open" means to us. While this topic deserves a blog post on its own, I will touch on it here.

Being "Open" means we actively promote integrations with third-party vendors. When you purchase an Acquia product, you're not just buying a tool; you're also buying into a technology ecosystem.

However, our definition of "Open" extends far beyond mere integrations. It's also about creating an inclusive environment where everyone is empowered to participate and contribute to meaningful digital experiences in a safe and secure manner. Our updated strategy, while still focused on the DXP ecosystem, champions empowerment, inclusivity, accessibility, and safety.

A slide titled "The Open DXP" detailing its four core principles: integrations that support customization without vendor lock-in, empowerment for user-driven development, accessibility for all backgrounds and abilities, and a focus on safety with security and compliance. A slide from our strategy presentation, summarizing our definition of an Open DXP. The definition is the cornerstone of Acquia's "Why"-statement.

People who have followed me for a while know that I've long advocated for an Open Web, promoting inclusivity, accessibility, and safety. It's inspiring to see Acquia fully embrace these principles, a move I hope will inspire not just me, but our employees, customers, and partners too. It's not just a strategy; it's a reflection of our core values.

It probably doesn't come as a surprise that our updated strategy aligns with the trends I outlined above, many of which also point towards a safer, more responsible, and inclusive digital future. Our enthusiasm for the Monsido acquisition is also driven by these core principles.

Needless to say, our strategy update is about much more than a commitment to openness. Our commitment to openness drives a lot of our strategic decisions. Here are a few key examples to illustrate our direction.

  • Expanding into the mid-market: Acquia has primarily catered to the enterprise and upper mid-market sectors. We're driven by the belief that an open platform, dedicated to inclusivity, accessibility, and safety, enhances the web for everyone. Our commitment to contributing to a better web is motivating us to broaden our reach, making expanding into the mid-market a logical strategic move.
  • Expanding partnerships, empowering co-creation: Our partnership program is well-established with Drupal, and we're actively expanding partnerships for Digital Asset Management (DAM), CDP, and marketing automation. We aim to go beyond a standard partner program by engaging more deeply in co-creation with our partners, similar to what we do in the Open Source community. The goal is to foster an open ecosystem where everyone can contribute to developing customer solutions, embodying our commitment to empowerment and collaboration. We've already launched a marketplace in 2023, Acquia Exchange, featuring more than 100 co-created solutions, with the goal of expanding to 500 by the end of 2024.
  • Be an AI-fueled organization: In 2023, we launched numerous AI features and we anticipate introducing even more in 2024. Acquia already adheres to responsible AI principles. This aligns with our definition of "Open", emphasizing accountability and safety for the AI systems we develop. We want to continue to be a leader in this space.

Stronger together

We've always been very focused on our greatest asset: our people. This year, we welcomed exceptional talent across the organization, including two key additions to our Executive Leadership Team (ELT): Tarang Patel, leading Corporate Development, and Jennifer Griffin Smith, our Chief Market Officer. Their expertise has already made a significant impact.

Eight logos in a row with different '2023 Best Places to Work' awards. Eight awards for "Best Places to Work 2023" in various categories such as IT, mid-size workplaces, female-friendly environments, wellbeing, and appeal for millennials, across India, the UK, and the USA.

In 2023, we dedicated ourselves to redefining and enhancing the Acquia employee experience, committing daily to its principles through all our programs. This focus, along with our strong emphasis on diversity, equity, and inclusion (DEI), has cultivated a culture of exceptional productivity and collaboration. As a result, we've seen not just record-high employee retention rates but also remarkable employee satisfaction and engagement. Our efforts have earned us various prestigious "Best Place to Work" awards.

Customer-centric excellence, growth, and renewals

Our commitment to delivering great customer experiences is evident in the awards and recognition we received, many of which are influenced by customer feedback. These accolades include recognition on platforms like TrustRadius and G2, as well as the prestigious 2023 CODiE Award.

A list of 32 awards across various products for 2023, including 'Leader' and 'High Performer' from G2, 'Best Of,' 'Best Feature Set,' 'Best Value for Price,' and 'Best Relationship' from TrustRadius, and the '2023 SIIA CODiE Winner' recognition, all highlighting top customer ratings. Acquia received 32 awards for leadership in various categories across its products.

As mentioned earlier, we delivered consistently excellent, and historically high, renewal rates throughout 2023. It means our customers are voting with their feet (and wallets) to stay with Acquia.

Furthermore, we achieved remarkable growth within our customer base with record rates of expansion growth. Not only did customers choose to stay with Acquia, they chose to buy more from Acquia as well.

To top it all off, we experienced a substantial increase in the number of customers who were willing to serve as references for Acquia, endorsing our products and services to prospects.

Many of the notable customer stories for 2023 came from some of the world's most recognizable organizations, including:

  • Nestlé: With thousands of brand sites hosted on disparate technologies, Nestlé brand managers had difficulty maintaining, updating, and securing brand assets globally. Not only was it a challenge to standardize and govern the multiple brands, it was costly to maintain resources for each technology and inefficient with work being duplicated across sites. Today, Nestlé uses Drupal, Acquia Cloud Platform and Acquia Site Factory to face these challenges. Nearly all (90%) of Nestlé sites are built using Drupal. Across the brand's entire portfolio of sites, approximately 60% are built on a shared codebase – made possible by Acquia and Acquia Site Factory.
  • Novartis: The Novartis product sites in the U.S. were fragmented across multiple platforms, with different approaches and capabilities and varying levels of technical debt. This led to uncertainty in the level of effort and time to market for new properties. Today, the Novartis platform built with Acquia and EPAM has become a model within the larger Novartis organization for how a design system can seamlessly integrate with Drupal to build a decoupled front end. The new platform allows Novartis to create new or move existing websites in a standardized design framework, leading to more efficient development cycles and more functionality delivered in each sprint.
  • US Drug Enforcement Administration: The U.S. DEA wanted to create a campaign site to increase public awareness regarding the increasing danger of fake prescription pills laced with fentanyl. Developed with Tactis and Acquia, the campaign website One Pill Can Kill highlights the lethal nature of fentanyl. The campaign compares real and fake pills through videos featuring parents and teens who share their experiences with fentanyl. It also provides resources and support for teens, parents, and teachers and discusses the use of Naloxone in reversing the effects of drug overdose.
  • Cox Automotive: Cox Automotive uses first-party data through Acquia Campaign Studio for better targeted marketing. With their Automotive Marketing Platform (AMP) powered by Acquia, they access real-time data and insights, delivering personalized messages at the right time. The results? Dealers using AMP see consumers nine times more likely to purchase within 45 days and a 14-fold increase in sales gross profit ROI.

I'm proud of outcomes like these: they show how valuable our DXP is to our customers.

Product innovation

In 2023, we remained focused on solving problems for our current and future customers. We use both quantitative and qualitative data to assess areas of opportunities and run hypothesis-driven experiments with design prototypes, hackathons, and proofs-of-concept. This approach has led to hundreds of improvements across our products, both by our development teams and through partnerships. Below are some key innovations that have transformed the way our customers operate:

  • We released many AI features in 2023, including AI assistants and automations for Acquia DAM, Acquia CDP, Acquia Campaign Studio, and Drupal. This includes: AI assist during asset creation in Drupal and Campaign Studio, AI-generated descriptions for assets and products in DAM, auto-tagging in DAM with computer vision, next best action/channel predictions in CDP, ML-powered customer segmentation in CDP, and much more.
  • Our Drupal Acceleration Team (DAT) worked with the Drupal community on major upgrade of the Drupal field UI, which makes it significantly faster and more user-friendly to perform content modeling. We also open sourced Acquia Migrate Accelerate as part of the run-up to the Drupal 7 community end-of-life in January 2025. Finally, DAT contributed to a number of major ongoing initiatives including Project Browser, Automatic Updates, Page Building, Recipes, and more that will be seen in later versions of Drupal.
  • We launched a new trial experience for Acquia Cloud Platform, our Drupal platform. Organizations can now explore Acquia's hosting and developer tools to see how their Drupal applications perform on our platform.
  • Our Kubernetes-native Drupal hosting platform backed by AWS, Acquia Cloud Next, continued to roll out to more customers. Over two-thirds of our customers are now enjoying Acquia Cloud Next, which provides them the highest levels of performance, self-healing, and dynamic scaling. We've seen a 50% decrease in critical support tickets since transitioning customers to Acquia Cloud Next, all while maintaining an impressive uptime record of 99.99% or higher.
  • Our open source marketing automation tool, Acquia Campaign Studio, is now running on Acquia Cloud Next as its core processing platform. This consolidation benefits everyone: it streamlines and accelerates innovation for us while enabling our customers to deliver targeted and personalized messages at a massive scale.
  • We decided to make Mautic a completely independent Open Source project, letting it grow and change freely. We've remained the top contributor ever since.
  • Marketers can now easily shape the Acquia CDP data model using low-code tools, custom attributes and custom calculations features. This empowers all Acquia CDP users, regardless of technical skill, to explore new use cases.
  • Acquia CDP's updated architecture enables nearly limitless elasticity, which allows the platform to scale automatically based on demand. We put this to the test during Black Friday, when our CDP efficiently handled billions of events. Our new architecture has led to faster, more consistent processing times, with speeds improving by over 70%.
  • With Snowflake as Acquia's data backbone, Acquia customers can now collaborate on their data within their organization and across business units. Customers can securely share and access governed data while preserving privacy, offering them advanced data strategies and solutions.
  • Our DAM innovation featured 47 updates and 13 new integrations. These updates included improved Product Information Management (PIM) functionality, increased accessibility, and a revamped search experience. Leveraging AI, we automated the generation of alt-text and product descriptions, which streamlines content management. Additionally, we established three partnerships to enhance content creation, selection, and distribution in DAM: Moovly for AI-driven video creation and translation, Vizit for optimizing content based on audience insights, and Syndic8 for distributing visual and product content across online commerce platforms.
  • With the acquisition of Monsido and new partnerships with VWO (tools for optimizing website engagement and conversions) and Conductor (SEO platform), Acquia DXP now offers an unparalleled suite of tools for experience optimization. Acquia already provided the best tools to build, manage and operate websites. With these additions, Acquia DXP also offers the best solution for experience optimization.
  • Acquia also launched Acquia TV, a one-stop destination for all things digital experience. It features video podcasts, event highlights, product breakdowns, and other content from a diverse collection of industry voices. This is a great example of how we use our own technology to connect more powerfully with our audiences. It's something our customers strive to do everyday.

Conclusion

In spite of the economic uncertainties of 2023, Acquia had a remarkable year. We achieved our objectives, overcame challenges, and delivered outstanding results. I'm grateful to be in the position that we are in.

Our achievements in 2023 underscore the importance of putting our customers first and nurturing exceptional teams. Alongside effective management and financial responsibility, these elements fuel ongoing growth, irrespective of economic conditions.

Of course, none of our results would be possible without the support of our customers, our partners, and the Drupal and Mautic communities. Last but not least, I'm grateful for the dedication and hard work of all Acquians who made 2023 another exceptional year.

January 03, 2024

A skilled craftsperson works on the intricate mechanism of a droplet-shaped vintage watch.

Since 1999, I've been consistently working on this website, making it one of my longest-standing projects. Even after all these years, the satisfaction of working on my website remains strong. Remarkable, indeed.

During rare moments of calm — be it a slow holiday afternoon, a long flight home, or the early morning stillness — I'm often drawn to tinkering with my website.

When working on my website, I often make small tweaks and improvements. Much like a watchmaker meticulously fine-tuning the gears of an antique clock, I pay close attention to details.

This holiday, I improved the lazy loading of images in my blog posts, leading to a perfect Lighthouse score. A perfect score isn't necessary, but it shows the effort and care I put into my website.

A screenshot of dri.es' Lighthouse scores showing 100% scores in performance, accessibility, best practices, and SEO. Screenshot of Lighthouse scores via https://pagespeed.web.dev/.

I also validated my RSS feeds, uncovering a few opportunities for improvement. Like a good Belgian school boy, I promptly implemented these improvements, added new PHPUnit tests and integrated these into my CI/CD pipeline. Some might consider this overkill for a personal site, but for me, it's about mastering the craft, adhering to high standards, and building something that is durable.

Last year, I added 135 new photos to my website, a way for me to document my adventures and family moments. As the year drew to a close, I made sure all new photos have descriptive alt-texts, ensuring they're accessible to all. Writing alt-texts can be tedious, yet it's these small but important details that give me satisfaction.

Just like the watchmaker working on an antique watch, it's not just about keeping time better; it's about cherishing the process and craft. There is something uniquely calming about slowly iterating on the details of a website. I call it The Watchmaker's Approach to Web Development, where the process holds as much value as the result.

I'm thankful for my website as it provides me a space where I can create, share, and unwind. Why share all this? Perhaps to encourage more people to dive into the world of website creation and maintenance.

January 02, 2024

December 27, 2023

I have been writing software for high-end five-axis CNC machines at Heidenhain for about seven years now.

We are working on the TNC 7 software, the successor to the TNC 640. Together with a team of developers based in Traunreut, I am the developer of mainly the NC editor and everything that goes with it (all the integrations with the surrounding components of everything else involved – and that is a lot).

Looking at the TNC 640, this is software that will last forty years.

Looking at our own Belgian space industry, but also, say, the weapons industry, or really just about anything, this is software that is used more or less everywhere, for more or less everything. The jobs page of FN Herstal, for example, has for a few years been showing people actually operating a CNC machine with a Heidenhain TNC 640. Soon our new TNC 7 will be used there too! (to make, who knows, the machine guns for our soldiers, for example).

Anyway. Enough bragging about that!

So lately I have become interested in the world of metalworking. When I was a young guy of about 15 or 16, my grandfather was a lathe operator (at AGFA-Gevaert). He was always proud to explain it all to me, at length.

At school, during my years of electromechanics, I also got to (or had to) do a few exercises on a simple manual lathe.

That was actually quite interesting. But computers, those were also very interesting back then!

Today it all comes together for me.

But I have to say, no really: the real 'hackers' (people who modify a system so that it does more than what it was designed for) are found far more in the machining world than in our software world. Although we do have a fair number walking around too.

You (or we) other software developers often have no idea how enormously vast that other world is, in wood as well as in metal. It is overflowing with hackers.

My point, especially to the young chicks just starting out: spread your interests wide. Go very far. It is all so incredibly interesting.

And above all: it is in the combination of software and something else. That can be security, but also machining. Or medical (imaging and the like)? A combination of everything. I spoke with people who make tools for CNC machines that produce parts for surgeons.

Just banging out Python code is not the essence of our passion. You write that Python code so that it works with, or for, something else. That something else you must also know, or want to know.

December 23, 2023

In the West we lack analysts who can shift their opinion.

A big problem is that anyone who says anything is immediately pinned down to it.

Such a person can never go in another direction again.

We have now arrived in a military conflict with Russia. Our analysts must be able to move again the way the wind blows. Because it blows all the time.

New insights are actually more important than all our old nonsense.

We should not just cancel people. We must again allow analysts to have made mistakes, to make mistakes, and therefore to want to adjust.

I also ask unambiguously that Germany stands up and brings its army fully to where it belongs: the largest force in Europe. The World Wars are over, and what happened has happened. We need Germany to protect us. We must contribute to that.

Zeit bleibt nicht stehen.

December 22, 2023

The current time is, I think, best described by Rammstein's Zeit.

The war between Russia and Ukraine has, for the West, effectively been completely lost. Russia will almost certainly take a large part of Ukraine.

Masses of men in Ukraine are dying right now. Roughly 20,000 people a month. That was about 650 today.

This matches what DPA, Weeb Union, History Legends and Military Summary are saying.

I concluded a while ago that the Western propaganda (yes, sorry guys, by now it really is propaganda) is lying through its teeth, and that the sources above give a much better and clearer picture of reality.

That reality is bitterly hard: Russia is winning this. Not a little, but seriously and truly.

Extremely many people from Ukraine are dying. Every day.

There are no gained territories either. You can hardly find them on a map when they exist, and that is if they have not already been retaken by Russia.

Zeit, bitte bleib stehen, bleib stehen. Zeit, das soll immer so weitergehen.

Rammstein, song Zeit

The reality is that from now on Russia will start taking territory. Time will not stay put, let alone stand still.

That is harsh and these are harsh words. These are harsh times.

The West will lose this.

Whatever you may think of that.

I could have put MySQL between parentheses in the title, as this article is more about how to use OpenTofu to deploy on OCI.

I will explain how to install OpenTofu and how to use it to deploy on OCI. I will also mention the changes required to be able to reuse my previous Terraform deployment files.

As an example, let’s use the modules to deploy WordPress with MySQL HeatWave Database Service: oci-wordpress-mds.

Installing OpenTofu

If, like me, you are using an RPM-based Linux distro, you can find the necessary information to create the yum repository on OpenTofu's website:

$ sudo su -
# cat >/etc/yum.repos.d/opentofu.repo <<EOF
[opentofu]
name=opentofu
baseurl=https://packages.opentofu.org/opentofu/tofu/rpm_any/rpm_any/\$basearch
repo_gpgcheck=0
gpgcheck=1
enabled=1
gpgkey=https://get.opentofu.org/opentofu.gpg
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300

[opentofu-source]
name=opentofu-source
baseurl=https://packages.opentofu.org/opentofu/tofu/rpm_any/rpm_any/SRPMS
repo_gpgcheck=0
gpgcheck=1
enabled=1
gpgkey=https://get.opentofu.org/opentofu.gpg
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300
EOF

When the repo is created, you just need to use yum (or dnf) like this:

$ sudo dnf install -y tofu

You can verify that OpenTofu is installed correctly by running the following command that returns the installed version:

$ tofu version
OpenTofu v1.6.0-rc1
on linux_amd64

Terraform code

To test, we download the code from GitHub (v1.9.2):

$ git clone https://github.com/lefred/oci-wordpress-mds.git
$ cd oci-wordpress-mds

We first need to copy the file terraform.tfvars.template to terraform.tfvars and edit its content with our OCI information (tenancy, OCIDs, keys, …).
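In other words, something like this (the editor is of course up to you):

$ cp terraform.tfvars.template terraform.tfvars
$ vi terraform.tfvars   # fill in the tenancy, OCIDs, keys, …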

When ready, we can start with the initialization of the environment:

$ tofu init

Initializing the backend...
Initializing modules...
- mds-instance in modules/mds-instance
- wordpress in modules/wordpress

Initializing provider plugins...
- Finding latest version of hashicorp/template...
- Finding latest version of hashicorp/oci...
- Finding latest version of hashicorp/tls...
- Installing hashicorp/template v2.2.0...
- Installed hashicorp/template v2.2.0 (signed, key ID 0C0AF313E5FD9F80)
- Installing hashicorp/tls v4.0.5...
- Installed hashicorp/tls v4.0.5 (signed, key ID 0C0AF313E5FD9F80)

Providers are signed by their developers.
If you'd like to know more about provider signing, you can read about it here:
https://opentofu.org/docs/cli/plugins/signing/
╷
│ Error: Failed to query available provider packages
│ 
│ Could not retrieve the list of available versions for provider hashicorp/oci: provider registry
│ registry.opentofu.org does not have a provider named registry.opentofu.org/hashicorp/oci
│ 
│ All modules should specify their required_providers so that external consumers will get the correct
│ providers when using a module. To see which modules are currently depending on hashicorp/oci, run the
│ following command:
│     tofu providers
│ 
│ If you believe this provider is missing from the registry, please submit a issue on the OpenTofu
│ Registry https://github.com/opentofu/registry/issues/
╵

With Terraform, the same code will work but will return the following warning:

│ Warning: Additional provider information from registry
│ 
│ The remote registry returned warnings for registry.terraform.io/hashicorp/oci:
│ - For users on Terraform 0.13 or greater, this provider has moved to oracle/oci. Please update your
│ source in required_providers.
╵

Provider

Step 1 is to fix the provider and use Oracle's OCI provider. We edit provider.tf and add the following lines:

terraform {
  required_providers {
    oci = {
      source  = "oracle/oci"
    }
  }
}

We can run the init command again:

$ tofu init

Initializing the backend...
Initializing modules...

Initializing provider plugins...
- Finding latest version of hashicorp/template...
- Finding latest version of hashicorp/oci...
- Finding latest version of oracle/oci...
- Finding latest version of hashicorp/tls...
- Installing hashicorp/template v2.2.0...
- Installed hashicorp/template v2.2.0 (signed, key ID 0C0AF313E5FD9F80)
- Installing oracle/oci v5.23.0...
- Installed oracle/oci v5.23.0 (signed, key ID 1533A49284137CEB)
- Installing hashicorp/tls v4.0.5...
- Installed hashicorp/tls v4.0.5 (signed, key ID 0C0AF313E5FD9F80)

Providers are signed by their developers.
If you'd like to know more about provider signing, you can read about it here:
https://opentofu.org/docs/cli/plugins/signing/
╷
│ Error: Failed to query available provider packages
│ 
│ Could not retrieve the list of available versions for provider hashicorp/oci: provider registry
│ registry.opentofu.org does not have a provider named registry.opentofu.org/hashicorp/oci
│ 
│ Did you intend to use oracle/oci? If so, you must specify that source address in each module which
│ requires that provider. To see which modules are currently depending on hashicorp/oci, run the
│ following command:
│     tofu providers
│ 
│ If you believe this provider is missing from the registry, please submit a issue on the OpenTofu
│ Registry https://github.com/opentofu/registry/issues/
╵

We can see that the oracle/oci v5.23.0 provider plugin was installed, but it still fails. Let’s run the recommended tofu providers command:

$ tofu providers

Providers required by configuration:
.
├── provider[registry.opentofu.org/hashicorp/tls]
├── provider[registry.opentofu.org/hashicorp/template]
├── provider[registry.opentofu.org/oracle/oci]
├── module.wordpress
│   ├── provider[registry.opentofu.org/hashicorp/oci]
│   └── provider[registry.opentofu.org/hashicorp/template]
└── module.mds-instance
    └── provider[registry.opentofu.org/hashicorp/oci]

We can observe that the oracle/oci provider plugin is indeed used for the root (.), but we can also see that the 2 modules (wordpress and mds-instance) still use hashicorp/oci (which is not found).

Let’s add a provider.tf file containing the following lines in both modules:

terraform {
  required_providers {
    oci = {
      source  = "oracle/oci"
    }
  }
}

If we check the providers again, we can see that they all now use the correct one:

$ tofu providers

Providers required by configuration:
.
├── provider[registry.opentofu.org/oracle/oci]
├── provider[registry.opentofu.org/hashicorp/tls]
├── provider[registry.opentofu.org/hashicorp/template]
├── module.mds-instance
│   └── provider[registry.opentofu.org/oracle/oci]
└── module.wordpress
    ├── provider[registry.opentofu.org/oracle/oci]
    └── provider[registry.opentofu.org/hashicorp/template]

We can run init again:

$ tofu init

Initializing the backend...
Initializing modules...

Initializing provider plugins...
- Finding latest version of oracle/oci...
- Finding latest version of hashicorp/tls...
- Finding latest version of hashicorp/template...
- Installing oracle/oci v5.23.0...
- Installed oracle/oci v5.23.0 (signed, key ID 1533A49284137CEB)
- Installing hashicorp/tls v4.0.5...
- Installed hashicorp/tls v4.0.5 (signed, key ID 0C0AF313E5FD9F80)
- Installing hashicorp/template v2.2.0...
- Installed hashicorp/template v2.2.0 (signed, key ID 0C0AF313E5FD9F80)

Providers are signed by their developers.
If you'd like to know more about provider signing, you can read about it here:
https://opentofu.org/docs/cli/plugins/signing/

OpenTofu has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that OpenTofu can guarantee to make the same selections by default when
you run "tofu init" in the future.

OpenTofu has been successfully initialized!

You may now begin working with OpenTofu. Try running "tofu plan" to see
any changes that are required for your infrastructure. All OpenTofu commands
should now work.

If you ever set or change modules or backend configuration for OpenTofu,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

Perfect ! Note that with Terraform the warnings will also be gone.

OpenTofu Plan & Apply

We can now run plan and, if there are no errors, easily apply:

$ tofu plan
[...]
Plan: 14 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + mds_instance_ip       = (known after apply)
  + ssh_private_key       = "/home/fred/.ssh/id_rsa_oci"
  + wordpress_db_password = "MyWPpassw0rd!"
  + wordpress_db_user     = "wp"
  + wordpress_public_ip   = (known after apply)
  + wordpress_schema      = "wordpress"

$ tofu apply

And it’s deployed !
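
If you want to skip the interactive confirmation, for example in a CI pipeline, both tofu and terraform accept the -auto-approve flag:

$ tofu apply -auto-approve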

Conclusion

OpenTofu works as expected, but it requires the code and modules to follow the latest Terraform specifications.

Of course, in the future, the compatibility might change, but at present, deploying on OCI using OpenTofu instead of Terraform works perfectly.

Enjoy deploying in OCI !

December 19, 2023

We recently saw that we can benefit from using JSON documents with MySQL HeatWave cluster in OCI (HeatWave accelerator for MySQL Database Service).

However, sometimes the data can't be loaded into the HeatWave Cluster, or the query cannot use it.

And it's not always easy to understand why. Let's get familiar with how to find that information.

Data not loaded in HW Cluster

Let’s see an example with this table (collection):

SQL > show create table listingsAndReviews\G
*************************** 1. row ***************************
       Table: listingsAndReviews
Create Table: CREATE TABLE `listingsAndReviews` (
  `doc` json DEFAULT NULL,
  `_id` char(28) GENERATED ALWAYS AS
 (json_unquote(json_extract(`doc`,_utf8mb4'$._id'))) STORED NOT NULL,
  `_json_schema` json GENERATED ALWAYS AS 
  (_utf8mb4'{"type":"object"}') VIRTUAL,
  PRIMARY KEY (`_id`),
  CONSTRAINT `$val_strict_B70EB65BDDBAB20B0EE5BB7532C9060C422A06F8` 
  CHECK (json_schema_valid(`_json_schema`,`doc`)) /*!80016 NOT ENFORCED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
 SECONDARY_ENGINE=rapid

Now when we try to load the data into the HeatWave Cluster, we get the following output:

SQL > alter table listingsAndReviews secondary_load;
Query OK, 0 rows affected, 3 warnings (19.1727 sec)
Warning (code 3877): Guided Load executed 
'ALTER TABLE `docstore`.`listingsAndReviews` MODIFY `_json_schema` json 
GENERATED ALWAYS AS (_utf8mb4'{"type":"object"}') VIRTUAL 
COMMENT '' NOT SECONDARY'.
Warning (code 3877): Guided Load executed 'ALTER TABLE
`docstore`.`listingsAndReviews` MODIFY `doc` json 
DEFAULT NULL COMMENT '' NOT SECONDARY'.
Warning (code 3877): Guided Load requested 11 InnoDB parallel read 
threads to load docstore.listingsAndReviews (session variable value: 32).

We can verify which columns are loaded in HeatWave cluster from that table:

SQL> select name, column_name, data_placement_type, nrows 
   from rpd_column_id ci 
   join rpd_tables t on t.id=ci.table_id 
     join rpd_table_id ri on ri.id=t.id 
     where table_name='listingsAndReviews';
+-----------------------------+-------------+---------------------+-------+
| name                        | column_name | data_placement_type | nrows |
+-----------------------------+-------------+---------------------+-------+
| docstore.listingsAndReviews | _id         | PrimaryKey          |  5555 |
+-----------------------------+-------------+---------------------+-------+
1 row in set (0.0010 sec)

We can see that the doc column, the JSON one, is not loaded. But why?

If we use in OCI Console the HeatWave Cluster node estimation tool, we also get a warning notifying us that only one column will be loaded:

But once again, we don’t know the reason and as we are using MySQL 8.2.0-u1, JSON is supported, so the column doc should also be loaded into the secondary engine.

When we try with the auto pilot load (call sys.heatwave_load(JSON_ARRAY('docstore'), NULL)) we don’t get much more details.

But in the error_log (available in performance_schema) we have the info we are looking for:

{"warn": "Column size is more than 65532 bytes", "table_name": "listingsAndReviews", "column_name": "doc", "schema_name": "docstore"}

We can see why the JSON column was not loaded: it exceeds the maximum allowed column size of 65532 bytes.

The information is also available in sys.heatwave_autopilot_report after using the auto pilot load procedure (call sys.heatwave_load()).
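
For example, right after running the Auto Parallel Load procedure, you can dump the report from the same session (the exact columns can vary between versions, so a plain SELECT is shown here):

SQL > call sys.heatwave_load(JSON_ARRAY('docstore'), NULL);
SQL > select * from sys.heatwave_autopilot_report;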

Query not off-loaded to secondary engine

Now let's see whether a query is off-loaded to the HeatWave cluster (secondary engine). First, we check how many queries have been offloaded so far:

SQL >  show status like 'rapid%query%';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| rapid_query_offload_count | 59    |
+---------------------------+-------+

And then we run a query:

SQL > with cte1 as (
       select doc->>"$.name" as name,  doc->>"$.cuisine" as cuisine, 
       (
        select avg(score) 
         from json_table (
            doc, "$.grades[*]" columns (score int path "$.score")) as r
         ) as  avg_score 
        from restaurants) select *, rank() 
        over (  
          partition by cuisine order by avg_score desc) as `rank`  
     from cte1 order by `rank`, avg_score desc limit 5;
+-----------------------+--------------------------------+-----------+------+
| name                  | cuisine                        | avg_score | rank |
+-----------------------+--------------------------------+-----------+------+
| Juice It Health Bar   | Juice, Smoothies, Fruit Salads |   75.0000 |    1 |
| Golden Dragon Cuisine | Chinese                        |   73.0000 |    1 |
| Palombo Pastry Shop   | Bakery                         |   69.0000 |    1 |
| Go Go Curry           | Japanese                       |   65.0000 |    1 |
| Koyla                 | Middle Eastern                 |   61.0000 |    1 |
+-----------------------+--------------------------------+-----------+------+
5 rows in set (31.4319 sec)

31.4 seconds: this doesn't seem to be using the HeatWave cluster.

Let's check the number of offloaded queries again:

SQL > show status like 'rapid%query%';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| rapid_query_offload_count | 59    |
+---------------------------+-------+

Same amount….

We can verify the Query Execution Plan using EXPLAIN:
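
The tree output below was produced with EXPLAIN FORMAT=TREE; the query itself is abbreviated here as <THE QUERY>:

SQL > explain format=tree <THE QUERY>\G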

EXPLAIN: -> Limit: 5 row(s)
    -> Sort row IDs: rank, cte1.avg_score DESC
        -> Table scan on <temporary>  (cost=2.5..2.5 rows=0)
            -> Temporary table  (cost=0..0 rows=0)
                -> Window aggregate: rank() OVER (PARTITION BY cte1.cuisine ORDER BY cte1.avg_score desc ) 
                    -> Sort row IDs: cte1.cuisine, cte1.avg_score DESC  (cost=5.08e+6..5.08e+6 rows=2.1e+6)
                        -> Table scan on cte1  (cost=438291..464507 rows=2.1e+6)
                            -> Materialize CTE cte1  (cost=438291..438291 rows=2.1e+6)
                                -> Table scan on restaurants  (cost=228577 rows=2.1e+6)
                                -> Select #3 (subquery in projection; dependent)
                                    -> Aggregate: avg(r.score)  (cost=0 rows=1)
                                        -> Materialize table function

There is indeed no sign of the secondary engine being used.

Usually, when using HeatWave cluster, we can enable the optimizer trace:

SQL > set  optimizer_trace="enabled=on";
SQL > explain format=tree with cte1 as .... <THE QUERY> ... desc limit 5\G
[the output of the QEP]

And then we verify the reason in information_schema.optimizer_trace:

SQL > select query, trace->'$**.Rapid_Offload_Fails' 'Offload Failed',
             trace->'$**.secondary_engine_not_used' 'HeatWave Not Used'
      from information_schema.optimizer_trace\G
*************************** 1. row ***************************
            query: explain format=tree with cte1 as (select doc->>"$.name" as name,
  doc->>"$.cuisine" as cuisine, (select avg(score) from json_table (doc,
 "$.grades[*]" columns (score int path "$.score")) as r) as  avg_score from
 restaurants) select *, rank() over (  partition by cuisine order by avg_score
 desc) as `rank`  from cte1 order by `rank`, avg_score desc limit 5
   Offload Failed: NULL
HeatWave Not Used: NULL

Most of the time, when a query is not offloaded to the secondary engine, we have the reason… but not this time, the value is NULL !

This could be related to the JSON datatype which was very recently supported.

Let’s have a look in the error log then:

SQL> select * from (select * from performance_schema.error_log 
     where subsystem='RAPID' order by 1 desc limit 3) a order by 1\G
*************************** 1. row ***************************
    LOGGED: 2023-12-15 17:41:58.876588
 THREAD_ID: 3535
      PRIO: System
ERROR_CODE: MY-011071
 SUBSYSTEM: RAPID
      DATA: Attempting offload of query for HeatWave. qid=142, my_qid=5253596,
 DB = "docstore". Query = "explain format=tree with cte1 as 
(select doc->>"$.name" as name,  doc->>"$.cuisine" as cuisine,
(select avg(score) from json_table (doc, "$.grades[*]" columns 
(score int path "$.score")) as r) as  avg_score from restaurants) 
select *, rank() over (  partition by cuisine order by avg_score desc)
 as `rank`  from cte1 order by `rank`, avg_score desc limit 5"
*************************** 2. row ***************************
    LOGGED: 2023-12-15 17:41:58.876733
 THREAD_ID: 3535
      PRIO: System
ERROR_CODE: MY-011071
 SUBSYSTEM: RAPID
      DATA: Starting costing for HeatWave query qid=142
*************************** 3. row ***************************
    LOGGED: 2023-12-15 17:41:58.876763
 THREAD_ID: 3535
      PRIO: System
ERROR_CODE: MY-011071
 SUBSYSTEM: RAPID
      DATA: Heatwave chosen plan quality accepted: qid=142 : total SG pairs: 0, Plan Quality metric: 1.0 , Plan Quality threshold threshold: 90.0

We can see that MySQL attempted to offload the query…

So, how can we find out why the secondary engine is not used?

There is an Optimizer Hint that will help us find the reason: /*+ set_var(use_secondary_engine=forced) */.

When used, this optimizer hint will try to offload the query to HeatWave cluster but won’t run it on InnoDB when it fails to run on the secondary engine. Let’s try:

SQL > with cte1 as (
       select /*+ set_var(use_secondary_engine=forced) */
       doc->>"$.name" as name,  doc->>"$.cuisine" as cuisine, 
       (
        select avg(score) 
         from json_table (
            doc, "$.grades[*]" columns (score int path "$.score")) as r
         ) as  avg_score 
        from restaurants) select *, rank() 
        over (  
          partition by cuisine order by avg_score desc) as `rank`  
     from cte1 order by `rank`, avg_score desc limit 5;
ERROR: 3889: Secondary engine operation failed. Reason: "SUBQUERY not yet
 supported", "Table ``.`/db/tmp/#sql36f2_dcf_135` is not loaded in HeatWave".

Here we have the reason why the query is not running in our HeatWave Cluster.

And in fact this unsupported subquery is related to the JSON_TABLE() function, which is not supported in the HeatWave cluster.

Conclusion

When encountering issues while using JSON documents with the MySQL HeatWave Database Service in OCI and the HeatWave cluster enabled, it is crucial to know how to obtain the information needed for resolution. We have seen where users can find the resources that guide them in troubleshooting and fixing any unexpected behavior.

Enjoy fast JSON in MySQL HeatWave on OCI !

December 17, 2023

Like many games of that time, Snow Bros. is a simple game that can be mastered with practice (or rather, coins). It’s fun all the way, specially when playing with 2 players (in co-op). The gameplay is somewhat reminiscent of Bubble Bobble, but instead of bubbles you shoot snow to bury enemies in snowballs. Push the snowball to get rid of all the enemies on its path! The arcade game was ported to a lot of game consoles and gaming computers.

December 16, 2023

Since the release of MySQL 8.0, the MySQL X Dev API has provided users with the convenient ability to utilize MySQL without the need to write a single line of SQL!

The MySQL X Dev API brings support for CRUD operations on JSON documents stored in MySQL. MySQL Document Store is ACID compliant and is compatible with everything MySQL: replication, InnoDB Cluster, …

The MySQL X Dev API is available using the MySQL X Protocol, listening by default on port 33060.

If you are interested in learning more about MySQL Document Store, please refer to these presentations: [1], [2], [3].

OCI MySQL HeatWave Database Service

The MySQL DBaaS on OCI is the only one providing access to the X Protocol. This means you can create applications that don't use any SQL to add, delete, modify and retrieve JSON documents stored in the DB System.

Let’s add some documents to our DB System. We will use the restaurants collection used in various MongoDB examples.

We first connect to our DB System (MySQL HeatWave instance in OCI) using MySQL Shell and the X Protocol (default). You can use a VPN, a Compute Instance, a bastion host…
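
For example, with MySQL Shell an X Protocol connection looks like this (the user and IP below are placeholders for your own DB System):

$ mysqlsh mysqlx://admin@10.0.1.249:33060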

Next, we create a new schema and we import all the JSON documents using the importJson() utility:

JS > session.createSchema('docstore')
<Schema:docstore>
JS > \u docstore
Default schema `docstore` accessible through db.
JS > util.importJson('restaurants.json', {convertBsonOid: true})
Importing from file "restaurants.json" to collection 
`docstore`.`restaurants` in MySQL Server at 10.0.1.249:33060

.. 25359.. 25359
Processed 15.60 MB in 25359 documents in 0.9856 sec (25.36K documents/s)
Total successfully imported documents 25359 (25.36K documents/s)

Done ! It’s fast and easy.

CRUD

It’s time to test the import and read the data we have imported.

Let’s start by just displaying the first document and verify it’s indeed a valid JSON document:

JS > db.restaurants.find().limit(1)
{
    "_id": "000065796b9c0000000000000001",
    "name": "Howard-Reyes",
    "grades": [
        {
            "date": {
                "$date": "2023-03-12T00:00:00.000+0000"
            },
            "grade": "B",
            "score": 5
        }
    ],
    "address": {
        "coord": [
            8.7301765,
            52.705775
        ],
        "street": "Ward Branch",
        "zipcode": "22737",
        "building": "74523"
    },
    "borough": "West Brandimouth",
    "cuisine": "Hawaiian"
}
1 document in set (0.0016 sec)

Let’s display 10 restaurants and their type of cuisine that are in Brooklyn:

JS > db.restaurants.find("borough = 'Brooklyn'").fields("name", "cuisine").limit(10)
{
    "name": "Wendy'S",
    "cuisine": "Hamburgers"
}
{
    "name": "Riviera Caterer",
    "cuisine": "American"
}
{
    "name": "Wilken'S Fine Food",
    "cuisine": "Delicatessen"
}
{
    "name": "Regina Caterers",
    "cuisine": "American"
}
{
    "name": "Taste The Tropics Ice Cream",
    "cuisine": "Ice Cream, Gelato, Yogurt, Ices"
}
{
    "name": "C & C Catering Service",
    "cuisine": "American"
}
{
    "name": "May May Kitchen",
    "cuisine": "Chinese"
}
{
    "name": "Seuda Foods",
    "cuisine": "Jewish/Kosher"
}
{
    "name": "Carvel Ice Cream",
    "cuisine": "Ice Cream, Gelato, Yogurt, Ices"
}
{
    "name": "Nordic Delicacies",
    "cuisine": "Delicatessen"
}
10 documents in set (0.3392 sec)

Adding Documents

We can also generate JSON documents without a single line of SQL using the X Dev API.

In this example, we use Python (mysql-connector-python3-8.2.0-1.el8.x86_64) and we add a document using the add() method of a collection (see the col.add(doc).execute() call near the end of the script):

#!/usr/bin/python3.8
import mysqlx
import sys
from random import choice, randrange
from faker import Faker


def connect():
    # Open an X Protocol session to the DB System (port 33060)
    session = mysqlx.get_session(
        {
            "host": "db instance IP",
            "port": 33060,
            "user": "login",
            "password": "password",
            "ssl-mode": "REQUIRED",
        }
    )
    return session


def gen_cuisine():
    # Pick a random cuisine for the generated restaurant
    cuisine_list = [
        "Belgian",
        "Italian"
    ]
    return choice(cuisine_list)


session = connect()
db = session.get_schema("docstore")
col = db.get_collection("restaurants")
fake = Faker()

print("Generating new documents.", end="", flush=True)

# Number of documents to generate (optional first command line argument)
total = 1000
if len(sys.argv) > 1:
    if sys.argv[1]:
        total = int(sys.argv[1])

for _ in range(total):
    # Build a fake restaurant document
    doc = {}
    doc["name"] = fake.company()
    address = {}
    address["street"] = fake.street_name()
    address["building"] = fake.building_number()
    address["zipcode"] = fake.postcode()
    doc["borough"] = fake.city()
    doc["cuisine"] = gen_cuisine()
    coord = []
    coord.append(float(fake.latitude()))
    coord.append(float(fake.longitude()))
    address["coord"] = coord
    doc["address"] = address
    # Add between 0 and 4 random grades
    grades = []
    for _ in range(randrange(5)):
        grade = {}
        grade_date = {}
        date = fake.date_time_this_decade()
        grade_date["$date"] = date.strftime("%Y-%m-%dT00:00:00.000+0000")
        grade_note = choice(["A", "B", "C"])
        grade_score = randrange(20)
        grade["date"] = grade_date
        grade["grade"] = grade_note
        grade["score"] = grade_score
        grades.append(grade)
    doc["grades"] = grades
    # Insert the document into the collection, no SQL involved
    col.add(doc).execute()
    if total > 100000 and not _ % 1000:
        print(".", end="", flush=True)
    else:
        print(".", end="", flush=True)

print("\nDone ! {} documents generated.".format(total))

You can find on GitHub all the required files: https://github.com/lefred/restaurants-mysql-ds

In case you need large JSON documents, you can increase the mysqlx_max_allowed_packet setting in your DB System's configuration:

This will allow you to store JSON documents of up to 1GB !
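
You can check the value currently in effect from any client session (on a DB System the variable itself is typically managed through the configuration):

SQL > show global variables like 'mysqlx_max_allowed_packet';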

HeatWave Accelerator

Now that we have all our documents stored in a MySQL HeatWave instance on OCI, could we also benefit from a HeatWave Cluster to boost our queries ?

Since the release of 8.2.0-u1 in OCI, MySQL HeatWave also supports JSON datatype in the HeatWave Cluster. See Accelerating JSON Query Processing using MySQL HeatWave.

When using MySQL Document Store, to be able to load a collection to the HeatWave Cluster, we need to modify how the Primary Key is stored: by default a VARBINARY is used, which is not supported by the HeatWave secondary engine.

If you run the HeatWave cluster nodes estimation you will get the following message:

Error comment: UNABLE TO LOAD TABLE WHEN PRIMARY KEY COLUMN(S) CANNOT BE LOADED
Columns to be loaded:1 (100% VARLEN-encoded columns)
Estimated number of rows:24082

This is the statement to perform the Primary Key change:

SQL> alter table restaurants modify _id char(28) GENERATED ALWAYS AS
     (json_unquote(json_extract(`doc`,_utf8mb4'$._id'))) STORED NOT NULL;

As soon as we have enabled the HeatWave Cluster, we can load our collection into it:

But before loading it, let's try a SQL query (yes, you can run SQL queries on the JSON documents inside a collection) to see if there is some improvement:

SQL > select doc->>"$.borough" borough, count(*) tot,
             max(length(doc->>"$.name")) longest_name, 
             min(length(doc->>"$.name")) shortest_name,
             round(avg(length(doc->>"$.name")),2) avg_name_length 
      from restaurants 
      where doc->>"$.cuisine" collate utf8mb4_0900_ai_ci like 'belgian' 
      group by borough order by tot desc limit 10;
+---------------+-----+--------------+---------------+-----------------+
| borough       | tot | longest_name | shortest_name | avg_name_length |
+---------------+-----+--------------+---------------+-----------------+
| Lake Michael  |  17 |           28 |             9 |           18.59 |
| Port Michael  |  11 |           29 |             8 |           19.91 |
| East Michael  |  11 |           25 |             8 |           13.64 |
| South Michael |  11 |           27 |            11 |           19.64 |
| South David   |  11 |           30 |             8 |           13.82 |
| Port Matthew  |  11 |           27 |            10 |           18.45 |
| Michaelmouth  |  10 |           25 |             9 |           15.50 |
| Port Jennifer |  10 |           29 |             7 |           18.50 |
| West Michael  |  10 |           26 |            10 |           21.10 |
| Lake James    |  10 |           27 |             9 |           15.70 |
+---------------+-----+--------------+---------------+-----------------+
10 rows in set (0.5191 sec)

This took a bit more than half a second (0.5191 sec) to parse 962,520 JSON documents.

Now we load our data into our HeatWave Cluster:

SQL > call sys.heatwave_load(JSON_ARRAY('docstore'), NULL);
+------------------------------------------+
| INITIALIZING HEATWAVE AUTO PARALLEL LOAD |
+------------------------------------------+
| Version: 2.41                            |
|                                          |
| Load Mode: normal                        |
| Load Policy: disable_unsupported_columns |
| Output Mode: normal                      |
|                                          |
+------------------------------------------+
6 rows in set (19.47 sec)
...
+------------------------------------------+
| LOADING TABLE                            |
+------------------------------------------+
| TABLE (2 of 2): `docstore`.`restaurants` |
| Commands executed successfully: 2 of 2   |
| Warnings encountered: 0                  |
| Table loaded successfully!               |
|   Total columns loaded: 2                |
|   Table loaded using 32 thread(s)        |
|   Elapsed time: 19.47 s                  |
|                                          |
+------------------------------------------+
8 rows in set (19.47 sec)
+----------------------------------------------------------------+
| LOAD SUMMARY                                                   |
+----------------------------------------------------------------+
|                                                                |
| SCHEMA           TABLES       TABLES      COLUMNS         LOAD |
| NAME             LOADED       FAILED       LOADED     DURATION |
| ------           ------       ------      -------     -------- |
| `docstore`            1            0            2      19.47 s |
|                                                                |
+----------------------------------------------------------------+
6 rows in set (19.47 sec)

Let’s now run the query again:

SQL > select doc->>"$.borough" borough, count(*) tot,
             max(length(doc->>"$.name")) longest_name, 
             min(length(doc->>"$.name")) shortest_name,
             round(avg(length(doc->>"$.name")),2) avg_name_length 
      from restaurants 
      where doc->>"$.cuisine" collate utf8mb4_0900_ai_ci like 'belgian' 
      group by borough order by tot desc limit 10;
+---------------+-----+--------------+---------------+-----------------+
| borough       | tot | longest_name | shortest_name | avg_name_length |
+---------------+-----+--------------+---------------+-----------------+
| Lake Michael  |  17 |           28 |             9 |           18.59 |
| South David   |  11 |           30 |             8 |           13.82 |
| Port Michael  |  11 |           29 |             8 |           19.91 |
| South Michael |  11 |           27 |            11 |           19.64 |
| Port Matthew  |  11 |           27 |            10 |           18.45 |
| East Michael  |  11 |           25 |             8 |           13.64 |
| Port Jennifer |  10 |           29 |             7 |           18.50 |
| West Michael  |  10 |           26 |            10 |           21.10 |
| Lake James    |  10 |           27 |             9 |           15.70 |
| Michaelmouth  |  10 |           25 |             9 |           15.50 |
+---------------+-----+--------------+---------------+-----------------+
10 rows in set (0.1017 sec)

It's faster ! But it was already very fast. I even had to reduce the value of secondary_engine_cost_threshold because the query cost was just below it and MySQL HeatWave was not offloading the query to the accelerator.
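
For reference, secondary_engine_cost_threshold is a regular session variable, so you can inspect it and lower it for testing (the value below is only an example):

SQL > show variables like 'secondary_engine_cost_threshold';
SQL > set session secondary_engine_cost_threshold = 100;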

But with even more data you can really benefit from the acceleration.

For example, with 2,162,520 documents we go from 1.1713 seconds to 0.1922 seconds when using the HeatWave Cluster.

Limitations

All is well in the best of all worlds! Almost! When it comes to using MySQL Document Store with the MySQL HeatWave Database Service in OCI, there are no limitations. However, if your intention is to enable the HeatWave Cluster (Accelerator), certain restrictions should be taken into consideration.

A JSON document cannot be larger than 65532 bytes. As the default character set is utf8mb4, the maximum size of a JSON document is then around 16K characters.

Currently, the JSON_TABLE() function is not supported.

Conclusion

MySQL Document Store is highly valued by developers and is available in MySQL Community Edition and, when opting for a hosted and managed MySQL service in the cloud, only in MySQL HeatWave.

With the latest update, it's also possible to benefit from the HeatWave Cluster Accelerator when using MySQL Document Store on OCI, which makes it a very good choice for those dealing with a large amount of JSON documents, as in IoT systems for example.

Enjoy using MySQL without a single line of SQL… even in the cloud !

Every time Jonathan Holslag once again gets to invoice the VRT because they parade him on De Afspraak or Terzake, I ask myself: with what actual expertise is this man speaking?

Almost every statement of his is ideologically pro NATO, pro US, anti China and (of course) anti Russia. The complete Atlanticist.

I can entirely follow it when we condemn Frank Creyelman for providing aid and services to Chinese intelligence agencies.

But where the idea suddenly comes from that we must also link that fact to what Yves Leterme does escapes me completely.

If we must accuse Yves of anything, it is that he did not respect the separation of powers by influencing the Fortis judge. That did have the appropriate consequences.

But espionage for China?

I actually find it unworthy of the title of professor at the VUB to utter such false accusations, just to be able to invoice the VRT for a bit of money.

Dear Mr. Holslag: do you actually have anything to say that is based on facts? Do you have evidence?

I expect that from someone who is allowed to call himself a professor. And I also expect a resignation when someone with such a title does not meet that expectation.

In other words, dear professor Holslag: prove that Leterme is a Chinese spy. Or resign. False accusations are, by the way, punishable as slander and defamation.

Do we really want professors who commit criminal offences?

December 14, 2023

Time can be cruel for games. Some games don’t age well. The advance in computational and graphical power is painfully clear. But time can also be a filter to separate great from mediocre games. If a game survives for decades, there is a big chance it’s a great game. Games are released to run on very specific hardware and operating systems. Older games do often not run out of the box on modern computers.

December 11, 2023

Firewatch is a single person adventure that I enjoyed from the beginning to (almost the) end. Certainly recommended. My only point of criticism is the rather abrupt ending. The game is short (around 5 hours), but the story is by no means rushed. Your life, as the protagonist, is in shambles. What better way to get your thoughts straight than wandering all day in nature while spotting wildfires? As a firewatcher in a very remote area, your only lifeline to the outside world is a warm and friendly voice through the walkie-talkie.

December 10, 2023

A Short Hike is a small indie game. And by small, well, I mean small: you’ll finish the story in around 1 hour and half. You can spend a few additional hours to 100% the game, but there isn’t much incentive to do so once you finish it. That said, the game is a great example of a cozy game. The artwork is beautiful, the mechanics are good and the quests are easy.

December 09, 2023

Is a game released in 2017 still retrogaming? It is if you played it on a Wii U, the abandoned last-generation console of Nintendo. Yes, it was also released on the Nintendo Switch, as exactly the same game with a very small bump in resolution (from 720p to 900p). Originally it was planned as a Wii U exclusive, but it took Nintendo a few extra years to get it right. And they did.

December 02, 2023

Misfortune Ramirez Hernandez, a cute 8-year-old girl, loves to tell the world she is “a little Miss”. Little Misfortune is more than a wordplay on the story of the child. It’s a lovely, funny and sad little adventure game. All at the same time. It’s about a child’s sense of wonder and about the loss of innocence. Although short (I finished it around 3 hours), it’s rewarding. No need to make it longer just for the sake of it.
A fun, kind of realistic tennis game. Best for playing in Quick Tournament mode (the history mode is silly). Not easy, but certainly worth it. Handhelds like the rp3+ can upscale it to about 3x the original resolution. My old 2016-Celeron NUC with Batocera can upscale it to 1080p without any issues. Have a look at “How to play retrogames?” if you don’t know how to play retrogames.
This is a beautiful game with interesting puzzles and… a story. Another perfect example of SNES-level graphics on a small handheld. The game takes a little investment in the beginning because it starts slowly, certainly if you're not familiar with the action-adventure genre. The quests will become clearer and the speed will pick up. You will be stuck more than once (actually, lots of times). You'll want to finish it in one go, but this open-world game will keep you occupied for way, way longer than you expected.
This game is probably my favorite in the Metroid Series. It’s a perfect example of SNES-level graphics on a small handheld. This is the most story-driven installment of all Metroid 2D games. Be ready to be lost for hours and experience a huge adrenaline boost once you finally figure out what you need to do or when you finish that impossible boss. Oh, and in this Metroid there are situations where there is no way you can win, and you have to run.
The 2D Metroid Series offers a perfect mix of platformer action, adventure and intelligent puzzles. While preceded by Metroid (1986) on the NES and Metroid II: Return of Samus (1991) on the Game Boy, this is the game that gave the series its distinct look and identity. While Metroidvania is a thing and inspired many great games for decades to come, it is Metroid, and not Castlevania, that immerses the player in an eerie, dreadful, immense and most of all lonely universe.
This game returns to the top-down perspective of the first Zelda while making the game graphically stunning (the NES graphics ages less well compared to the SNES/GBA sequels). Even today, more than 30 years later, it’s pretty much the pattern to follow for pixel-art games. This game is long and hard, but at the same time extremely rewarding. It’s a must-play classic. Have a look at “How to play retrogames?
Dr. Mario is often bundled with its more famous predecessor Tetris. The puzzle game is certainly a worthy successor. While there are versions of this game on more recent consoles, the straightforwardness of the Game Boy is something the more recent ports lack. The game is fast, the music fantastic. Simple and a lot of addictive fun. Have a look at “How to play retrogames?” if you don’t know how to play retrogames.
Tetris was bundled with the European and North-American release of the new Game Boy handheld. It is to this day in the top 10 of best-selling Nintendo games of all time. There are many older versions of Tetris, including the original release on the Soviet Elektronika 60 and later versions on Western and Eastern arcade machines around the world (most of the time, arcade releases were far superior to home console ports).
This game is the first Nintendo game for a handheld console. No pressure. It’s probably the most quirky Mario game, but in a good way. Mostly line art and 2 Gradius-like flying levels. It’s a short game and rather difficult at the end, but a lot of fun. A worthy adaptation of the Super Mario Bros. formula for the weaker handheld. Have a look at “How to play retrogames?” if you don’t know how to play retrogames.
This is the game that started the almost 40-years franchise that expanded to animated TV-series and movies. It’s fun, it’s challenging and the soundtrack is recognizable everywhere. The original version is still the best version. Later versions on other consoles often crop the image in order to adapt it to different screen geometries, e.g. the NES Classics release on the Game Boy Advanced. Still the king! Have a look at “How to play retrogames?

November 30, 2023

What a disaster this is, the DIY framework laptop.

The box...

...contains more boxes.


The parts are few...


...so the laptop is assembled in less than a minute?!

Unpacking the thing took longer than assembling it!

Reading this webpage took longer than assembling it!

Installing the DDR5 memory took 1 second: click and done.
Installing the nvme took 3 seconds: unscrew one screw, click and screw.
Installing the bezel was far more challenging, took almost 30 seconds! (*)
Attaching the keyboard: about three seconds?
Installing the four modules: 1 second per module (or less?)

I'm not that good at irony, but I was hoping for at least 20 minutes of fun assembling this thing... and only got one minute. :-/

(*) This brought back fond memories of assembling tower PCs in the nineties: ISA/PCI slots, click; CD-ROM ATAPI, click; hard disk and IDE jumpers, easy. But closing the tower case... well, that was the hard part!

Installing an OS

The guide says to be patient when booting (powering on) the laptop for the first time; it may take several minutes.

I timed it, it took 57 seconds to get to this screen:

And now there is this; it has been a long time since I saw a default XFCE, Firefox and terminal.


Let the fun begin!

November 26, 2023

In my previous blog post, we installed GitLab-CE and did some post configuration. In this blog post, we’ll continue to create user accounts and set up SSH to the git repository.

In the next blog posts, we will add code to GitLab and set up GitLab runners on different operating systems.

Users

Update root password

An initial root user with a password was created during the installation. The initial root password is saved - or was saved, as GitLab deletes the initial password after 24 hours - in /etc/gitlab/initial_root_password.

To update the root password, log in to GitLab with your web browser, click on your "avatar" icon and click on [ Edit profile ].

Select [ Password ] to update the root password.

You need to re-login after the password is updated.

Creating users

In this section, we'll set up an additional administrator user; I usually try to have a backup admin account. It isn't recommended to use an admin account directly, so we'll also set up a regular user that we can use for our daily development tasks.

The full documentation about GitLab users is available at: https://docs.gitlab.com/ee/user/profile/account/create_accounts.html#create-users-in-admin-area

Admin user

In the admin area select users on the left-hand side.

In the New user window, fill in the Account Name / Username and Email Address. Please note that you can't use admin, as this is a reserved user name in GitLab. At the Access level, select Administrator.

If everything goes well, you or the user will receive an email to reset the password.

Regular user

Repeat the same steps to create a normal user and keep the Access level to “Regular”.

GitLab SSH access

To access your git repositories using ssh you’ll need to upload your ssh public key. I use hardware tokens ( smartcard-hsm ) to protect my private keys.

You can find more information on how to use SmartCard-HSM with ssh in a previous blog post: https://stafwag.github.io/blog/blog/2015/12/05/protecting-your-ssh-keys-with-smartcard-hsm/

You can find information on how to generate a ssh key pair for GitLab at: https://docs.gitlab.com/ee/user/ssh.html

If you don’t mind having your keypair on your filesystem you can use the steps below ( but I advise you to look at the better options ).

Generate a ssh key pair

Execute ssh-keygen to generate a ssh key-pair.

I recommend using a passphrase so the private key is encrypted; this prevents an attacker from copying it as plain text from the filesystem. To avoid having to type in the passphrase during development, you can use a ssh-agent so you don't need to type in your password each time you push a commit to your git repository (see below).

The unencrypted key is still stored in memory if you use an ssh-agent unless you store your private key on a hardware token (HSM).

Please note that we use a non-default ssh key location in the command below (~/.ssh/testgitlab.key); we'll update the ssh client config to use this private key.

[staf@vicky ~]$ ssh-keygen -f ~/.ssh/testgitlab.key -t ed25519 -C "test gitlab key"
Generating public/private ed25519 key pair.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/staf/.ssh/testgitlab.key
Your public key has been saved in /home/staf/.ssh/testgitlab.key.pub
The key fingerprint is:
SHA256:m4TpKmOjvwkoY2H3arG3tTLTOgYeA03BJoACD86Gkr0 test gitlab key
The key's randomart image is:
+--[ED25519 256]--+
|B ...            |
|** +             |
|=+B              |
|o. o   o         |
| oE.  o S        |
|o o=.. . o       |
|=.. *.o.o        |
|oo==.O...        |
|.+=*+oB.         |
+----[SHA256]-----+
[staf@vicky ~]$ 

Upload the public key to GitLab

Edit your profile.

Go to SSH Keys and select [ Add new key ].

[staf@vicky ~]$ cat ~/.ssh/testgitlab.key.pub 
ssh-ed25519 <snip> test gitlab key
[staf@vicky ~]$ 

Add the public key to GitLab. If you don’t want to set an Expiration date, clear the field.

And click on [ Add key ]

When you upload your public key to the GitLab GUI, it will update ~/.ssh/authorized_keys for the git user. The git user was created during the installation.

Let’s quickly review it.

Log on to your GitLab server.

[staf@vicky ~]$ ssh staf@gitlab.stafnet.local 
Linux gitlab 6.1.0-13-arm64 #1 SMP Debian 6.1.55-1 (2023-09-29) aarch64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Sat Nov 25 08:02:53 2023 from 192.168.1.10
staf@gitlab:~$ 

Get details about the git user.

staf@gitlab:~$ cat /etc/passwd | grep -i git
gitlab-www:x:999:996::/var/opt/gitlab/nginx:/bin/false
git:x:996:995::/var/opt/gitlab:/bin/sh
gitlab-redis:x:995:994::/var/opt/gitlab/redis:/bin/false
gitlab-psql:x:994:993::/var/opt/gitlab/postgresql:/bin/sh
gitlab-prometheus:x:993:992::/var/opt/gitlab/prometheus:/bin/sh
staf@gitlab:~$ 

Display the ~/.ssh/authorized_keys contents.

staf@gitlab:~$ sudo cat /var/opt/gitlab/.ssh/authorized_keys
[sudo] password for staf: 
command="/opt/gitlab/embedded/service/gitlab-shell/bin/gitlab-shell key-1",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty <snip>
command="/opt/gitlab/embedded/service/gitlab-shell/bin/gitlab-shell key-2",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty <snip>
command="/opt/gitlab/embedded/service/gitlab-shell/bin/gitlab-shell key-3",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty <snip>
staf@gitlab:~$ 

Test connection

Check the connection with the command below. Please note that we use the git user. GitLab will use the matching public key to determine (map to) the real GitLab user.

[staf@vicky ~]$ ssh -T git@gitlab.stafnet.local -i ~/.ssh/testgitlab.key 
Enter passphrase for key '/home/staf/.ssh/testgitlab.key': 
Welcome to GitLab, @stafwag!
[staf@vicky ~]$ 

The -T option disables pseudo-terminal allocation (pty). Without it you’ll get a warning.

Reconfigure your SSH client

We’ll update the ssh config to use our private key.

Open your ssh config with your favourite editor.

[staf@vicky ~]$ vi ~/.ssh/config

Add a section for our GitLab host and set the IdentityFile set to our private key.

Host gitlab.stafnet.local
  PreferredAuthentications publickey
  IdentityFile ~/.ssh/testgitlab.key

Test the SSH access.

[staf@vicky ~]$ ssh -T git@gitlab.stafnet.local
Welcome to GitLab, @stafwag!
[staf@vicky ~]$ 

Use a ssh-agent

We’ll add our private key to the ssh-agent to avoid the need to type in our password each time we perform a git action.

Most modern Unix desktop systems ( GNOME, KDE, Xfce ) will set up a ssh-agent automatically.

Some GNU/Linux distributions will even add the key automatically when you use it. My FreeBSD desktop (Xfce) only starts the ssh-agent.

When your Unix desktop doesn’t set up a ssh-agent you’ll need to start it.

When a ssh-agent is configured the SSH_AGENT_PID and the SSH_AUTH_SOCK environment variables are set.

Check if there is already a ssh-agent configured.

staf@fedora39:~$ set | grep SSH
SSH_CLIENT='192.168.122.1 49618 22'
SSH_CONNECTION='192.168.122.1 49618 192.168.122.96 22'
SSH_TTY=/dev/pts/2
staf@fedora39:~$ 

The easiest way to start a ssh-agent and set up the environment variables is to execute the command eval $(ssh-agent).

staf@fedora39:~$ eval $(ssh-agent)
Agent pid 3542
staf@fedora39:~$ 

This starts the ssh-agent; eval executes the ssh-agent output and sets the required environment variables.

staf@fedora39:~$ set | grep SSH
SSH_AGENT_PID=3542
SSH_AUTH_SOCK=/tmp/ssh-XXXXXXcQhGER/agent.3541
SSH_CLIENT='192.168.122.1 60044 22'
SSH_CONNECTION='192.168.122.1 60044 192.168.122.96 22'
SSH_TTY=/dev/pts/2
staf@fedora39:~$ 

Add the private key to the ssh-agent.

staf@fedora39:~$ ssh-add /home/staf/.ssh/testgitlab.key
Enter passphrase for /home/staf/.ssh/testgitlab.key: 
Identity added: /home/staf/.ssh/testgitlab.key (test gitlab key)
staf@fedora39:~$ 
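
To double-check, you can list the keys currently loaded in the agent (the fingerprint output is omitted here):

staf@fedora39:~$ ssh-add -l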

Test the connection.

staf@fedora39:~$ ssh -T git@gitlab.stafnet.local
Welcome to GitLab, @stafwag!
staf@fedora39:~$ 

With everything set up, we're ready to start using GitLab. In a further blog post, we will start with adding code to GitLab, add runners and create artifacts.

Have fun!

Links

November 22, 2023

Today my rooster August(us) was not standing well on his legs. Because winter is coming and, more importantly, because my wife is pregnant, I decided that this time we wouldn't try to save the animal…

The last time one of our chickens (Livia, our white chicken) fell ill, we took her inside (somewhere in the basement where the temperature is constant) to save her from the other chickens (who were also pecking at her, because that's what healthy chickens do to a sick one) and gave her the yolks of the other chickens' eggs for extra strength.

It was a mess the whole time. Outside, the chickens take care of themselves. But keeping one inside the house, you kind of have to do all of that for the chicken. Chickens shit a lot. The result isn't very hygienic. I was not planning to go through that ordeal with our rooster August.

Anyway. I had already bought a restraining cone for chicken slaughter a few months ago. A sharp knife I had in the kitchen. August was very relaxed and not very defensive because, well, he was feeling sick and all that. It was not hard to put him head first into the cone. I also petted him and calmed him before all this.

That’s all very bad news. But the good news is that this new year’s eve we will be doing Coq au vin!

Afterwards we plucked the feathers in warm water, then opened up the body and took the intestines out (being careful not to cut open the gallbladder and bladder, to make sure we didn't spoil any meat).

Meanwhile Livia, Julia and the newest of the flock, Gwendolyn (that name because she's an Aarschot breed of chicken; I always use political figures for my chickens' names), are still alive and happily laying eggs for us.

It's more or less decided that Julia must go too, because our New Year's Eve family gathering will count five attendees and we need enough chicken meat. Plus, the nice people of Hagelandsneerhof organize the Neerhofdieren show every year around the end of January at O-Green in Rillaar. I plan to start a new flock of chickens and buy an entire new group there.

Poor chickens.

They had a good life. I don’t know yet what I’ll do with Gwendolyn and Livia. Especially Livia is very special: she’s always the first to arrive when we bring food. Fiercely fighting for it with Gwendolyn.

Poor August. He was so strong and defensive for his chickens. He cut me several times during one of his attacks (before I learned how to tame him).

But I couldn't let him suffer in the agony of sickness.

November 19, 2023

We now invite proposals for presentations. FOSDEM offers open source and free software developers a place to meet, share ideas and collaborate. Renowned for being highly developer-oriented, the event brings together some 8000+ geeks from all over the world. The twenty-fourth edition will take place on Saturday 3rd and Sunday 4th February 2024 at the usual location, ULB Campus Solbosch in Brussels.

Developer Rooms

For more details about the Developer Rooms, please refer to the Calls for Papers at https://fosdem.org/2024/news/2023-11-08-devrooms-announced.

Main Tracks

Main track presentations cover topics of interest to a significant part of our audience that do not…