Planet Grep

Planet'ing Belgian FLOSS people

Planet Grep is maintained by Wouter Verhelst. All times are in UTC.

October 24, 2021

This post is meant to encourage me to read a bit more (paper books). By the way, I thought I was reading four books simultaneously, but when I put them next to each other it turned out to be seven.

From left to right (author - title (year), pages read-total pages):

Daniele Benedettelli - Creating Cool Mindstorms NXT Robots (2008) 24-575
Leo Tolstoj - Oorlog en Vrede (1869, NL translation 1973) 115-462
Dirk De Wachter - De kunst van het ongelukkig zijn (2019) 35-101
Allen/Fonagy/Bateman - Mentaliseren (2008/2019 edition) 30-368
LEGO and Philosophy (2017) 56-226
Charlie Mackesy - The Boy, the Mole, the Fox and the Horse (2019)
Michael Collins - Carrying the Fire (1974) - finished

Personal goal: finish at least three more of these before 2022.

I just finished Michael Collins' Carrying the Fire, and it took me five weeks, which I consider a bit too long for a book of roughly 470 pages. It was a very good book, though. If you're into space travel, I would definitely recommend it. It also shows how vastly different the Sixties were from today: of the fourteen astronauts selected in 1963, four died during training. Two of the previous group of nine also died. Such numbers would be unacceptable in 2021, even for 'dangerous' jobs.

The Daniele Benedettelli book is about programming Finite State Machines using LEGO robots. I don't know much about programming, but this looks like fun. The thing is, I need to build some LEGO robots (like this one) to continue with the book.
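A finite state machine of the kind the book teaches can be sketched in a few lines. This toy example uses Python rather than the NXT environment, and the states and events are made up purely for illustration:

```python
# Transition table for a hypothetical line-following robot:
# (current state, event) -> next state.
TRANSITIONS = {
    ("searching", "line_found"): "following",
    ("following", "line_lost"): "searching",
    ("following", "obstacle"): "avoiding",
    ("avoiding", "clear"): "searching",
}

def step(state, event):
    # Stay in the current state if no transition is defined for this event.
    return TRANSITIONS.get((state, event), state)

state = "searching"
for event in ["line_found", "obstacle", "clear"]:
    state = step(state, event)
print(state)  # -> searching
```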

And I probably need to start War and Peace again from page 1, because I have forgotten many of the characters.

Some other books that I read over the past three years are:

Celestin-Westreich/Celestin - Observeren en Rapporteren
Dick Swaab - Ons creatieve brein
Dirk De Wachter - Borderline Times
Dirk De Wachter - De wereld van De Wachter
Etienne Vermeersch - Over God
Etienne Vermeersch - Provencaalse gesprekken
Jan Van de Craats - Basisboek wiskunde
Jude Woodward - The US vs China
Paul Verhaeghe - Autoriteit
Paul Verhaeghe - Identiteit
Randall Munroe - Thing Explainer
Randall Munroe - What If
Rebecca Smethurst - Space: 10 Things You Should Know
Robert Bly - De Wildeman
Terry Goodkind - Law of Nines
Terry Goodkind - Severed Souls
Terry Goodkind - The first Confessor
Terry Goodkind - The Omen Machine
Terry Goodkind - The Third Kingdom
Terry Goodkind - Warheart
Thomas d'Ansembourg - Stop met aardig zijn

Mostly non-fiction, apparently. I really enjoyed 'Borderline Times', both books by Paul Verhaeghe, and Dick Swaab's book. I couldn't really get into Robert Bly or Thomas d'Ansembourg (though collaborative communication gives me a lot of insight into people).

October 21, 2021

FOSDEM 2022 will take place on Saturday 5 and Sunday 6 February 2022. After several long and passionate debates, we have decided to make it an online event. There are good arguments in favor of physical, hybrid, and online events alike. We do not wish to rehash all of them in public, so please understand if we do not engage in public debates on their relative merits. This was not an easy decision, but at least it's a decision. Similar to CCC, we would prefer something else, but it…

October 20, 2021

For the past few years, I've examined's contribution data to understand how the Drupal project works. Who develops Drupal? How diverse is the Drupal community? How much of Drupal's maintenance and innovation is sponsored? Where do sponsorships come from?

The report might be of interest even if you don't use Drupal. It provides insights into the inner workings of one of the largest Open Source projects in the world.

This year's report shows that:

  • Compared to last year, we have fewer contributions and fewer contributors. The slowdown is consistent across organizations, countries, project types, and more. I believe this is the result of COVID-19, of where we are in the Drupal Super Cycle, and of many Drupal shops being too busy growing.
  • Despite a slowdown, it's amazing to see that just in the last year, Drupal welcomed more than 7,000 individual contributors and over 1,100 corporate contributors.
  • Two-thirds of all contributions are sponsored, but volunteer contributions remain important to Drupal's success.
  • Drupal's maintenance and innovation depends mostly on smaller Drupal agencies and Acquia. We don't see many contributions from hosting companies, multi-platform digital agencies, system integrators, or end users.
  • Drupal's contributors have become more diverse, but are still not diverse enough.

For comparison, you can also look at the 2016 report, 2017 report, 2018 report, 2019 report, and the 2020 report.


What data did I analyze?

I looked at all issues marked "closed" or "fixed" in the 12-month period from July 1, 2020 to June 30, 2021. This is across issues in Drupal Core and all contributed projects, including all major versions of Drupal.

What are issues?

Each issue tracks an idea, feature request, bug report, task, or more. It's similar to "issues" in GitHub or "tickets" in Jira. See the issue queue for the list of all issues.

What are credits?

In the spring of 2015, I proposed some ideas for how to give credit to Drupal contributors. A year later, added the ability for contributors to attribute their work to an organization or customer sponsor, or to mark it as the result of volunteer efforts.

Example issue credit on
A screenshot of an issue comment on You can see that jamadar worked on this patch as a volunteer, but also as part of his day job working for TATA Consultancy Services on behalf of their customer, Pfizer.'s credit system is unique and groundbreaking within the Open Source community. It provides unprecedented insights into the inner workings of a large Open Source project. There are a few limitations to this approach, which I'll address at the end of this report.

How is the Drupal community doing?

In the 12-month period between July 1, 2020 and June 30, 2021,'s credit system received contributions from 7,420 different individuals and 1,186 different organizations. We saw an 11% decline in individual contributors, and a 2% decline in organizational contributors.

Contributions by individuals vs organizations

For this report's time period, 23,882 issues were marked "closed" or "fixed", a 23% decline from the 2019-2020 period. This averages out to 65 issues marked "closed" or "fixed" each day.

In total, the Drupal community worked on 3,779 different projects this year compared to 4,195 projects in the 2019-2020 period — a 10% year-over-year decline.

Metric 2019 - 2020 2020 - 2021 Delta
Number of individual contributors 8,303 7,420 -11%
Number of organizational contributors 1,216 1,186 -2%
Number of issues "fixed" or "closed" 31,153 23,882 -23%
Number of projects worked on 4,195 3,779 -10%

Understanding the slowdown in contribution

Individual contributors slowed down

To understand the slowdown, I looked at the behavior of the top 1,000 contributors:

  • The top 1,000 individual contributors are responsible for 65% of all contributions. The remaining 6,420 individuals account for the remaining 35%. Overall, Drupal follows a long tail model.
  • In the last year, 77 of the top 1,000 individual contributors stopped contributing to Drupal, 671 contributed less, and 252 contributed more.

A 7.7% annual attrition rate in the top 1,000 contributors is very low. It means that the average contributor in the top 1,000 is active for 13 years. In other words, Drupal's top 1,000 contributors are extremely loyal — we should be grateful for their contributions and continued involvement in the Drupal project.

While we can't compare Open Source projects like Drupal to commercial companies, it might be useful to know that most commercial organizations are very happy with an attrition rate of 15% or less. At that rate, an employee stays with their employer for roughly 6.7 years on average. Nowadays, a lot of people don't stay with their employer for that long. Put that way, an attrition rate of 7.7% is very good!
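The tenure figures above follow from a simple steady-state rule of thumb: average tenure is roughly the reciprocal of the annual attrition rate. A quick sketch (the function is my own framing, not part of the report's methodology):

```python
def average_tenure_years(annual_attrition_rate):
    """Steady-state approximation: mean tenure ~ 1 / attrition rate."""
    return 1 / annual_attrition_rate

# Drupal's top 1,000 contributors: 7.7% attrition -> about 13 years active.
print(round(average_tenure_years(0.077)))    # 13
# A commercial organization with 15% attrition -> roughly 6.7 years of tenure.
print(round(average_tenure_years(0.15), 1))  # 6.7
```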

The big takeaway is that the top individual and organizational contributors aren't leaving Drupal. They just became less active in 2020-2021.

Organizational contributors also slowed down

Next, I looked at the behavior of the top 250 organizations:

  • The top 250 organizational contributors are responsible for 82% of all contributions. The other 936 organizations account for the remaining 18%.
  • In the last year, 8 organizations (3%) stopped contributing, 168 (67%) contributed less, and 74 (30%) contributed more.
  • Five of the 8 organizations that stopped contributing were end users; they most likely switched their website away from Drupal. The remaining 3 were digital agencies. The end user attrition rate in the top 250 was 2%, while the digital agency attrition rate was 0.4%.

The top Drupal agencies remain very committed to Drupal. While many agencies contributed less, very few agencies stopped contributing to Drupal altogether.

Why are individuals and organizations contributing less?

As part of my research, I reached out to some of the top contributing Drupal agencies. The main reason why they are contributing less is that they are too busy growing:

  • We grew 33% so far in 2021. We have grown our contribution as well, but there has been a shift from code contributions to non-code contributions. We've contributed less code because Drupal has all the features we need to deliver amazing digital experiences, and has become really stable and robust. There has been less code to contribute. — Baddý Sonja Breidert, CEO of 1xINTERNET, Germany
  • We have grown 35% in the last year — from around 65 employees to 90. — Nick Veenhof, CTO of DropSolid, Belgium
  • Customer investment in digital has accelerated by several years the past 12 months. We grew our Drupal practice by 35% in the past year. — Paul Johnson, Drupal Director at CTI Digital, UK
  • We grew 27% in revenue last year. We expect to continue on that growth trajectory. Our only concern is shortage of Drupal talent. — Janne Kalliola, CEO of Exove, Finland
  • We grew 40% over the last year. This has been driven by an increased demand for large Drupal projects on tight deadlines. With more time pressures from clients and changing personal commitments, it’s been more difficult for people to find the time to contribute. But also, more of our contribution shifted from to GitHub, and GitHub doesn't use the credit system. — Stella Power, Managing Director of Annertech, Ireland
  • We experienced unexpected sales growth during COVID. Thanks to Drupal Commerce, we grew 95% in 2020 and 25% year to date. In addition, two of our leading contributors pursued other opportunities. As new team members get onboarded and the workload stabilizes, I'm hopeful we see our overall contributions increase again in 2022. — Ryan Szrama, CEO of Centarro, United States

It's great to see so many Drupal agencies doing well.

Other than being too busy with client work, the following secondary reasons were provided:

  • Drupal is a stable and mature software project. Drupal has all the features we need to deliver ambitious digital experiences. Furthermore, Drupal has never been this stable and robust; we don't have many bug fixes to contribute, either.
  • There is a shortage of Drupal talent; the people we hire don't know how to contribute yet.
  • COVID eliminated in-person events and code sprints. In-person events inspired our employees to contribute and collaborate. Without in-person events, it's hard to instill employees with a passion to contribute.
  • It's more difficult to teach new employees how to contribute when everyone is remote.
  • People want a vision for Drupal that they can rally behind. We have already achieved the vision: Drupal is for ambitious digital experiences. People want to know: what is next?
  • The tools and processes to contribute are becoming more complex; contribution has become more difficult and less desirable.
  • We are getting more efficient at managing major Drupal releases. Rector automates more and more of the upgrade work. When we work smarter, contribution drops.

There is no doubt that COVID has accelerated a lot of digital transformation projects, but it has also slowed down contribution. Parents are busy home-schooling their children, people have Zoom-fatigue, some families may have lost income, etc. COVID added both stress and extra work to people's lives. For many, this made contribution more difficult or less possible.

Drupal Super Cycle

Drupal agencies provided many valid reasons for why contribution is down. In addition to those, I believe a Drupal Super Cycle might exist. The Drupal Super Cycle is a new concept that I have not talked about before. In fact, this is just a theory — and only time will tell if it is valid.

The Drupal Super Cycle is a recognition that Drupal's development cycle ebbs and flows between a "busy period" and "quiet period" depending on when the next major release takes place. There is a "busy period" before a major release, followed by a "quiet period" after each major release.

Major Drupal releases only happen every 2 or 3 years. When a major release is close, contributors work on making their projects compatible. This requires extra development work, such as adopting new APIs, subsystems, libraries, and more. Once projects are compatible, the work often shifts from active development to maintenance work.

A visual representation of the Drupal Super Cycle; contribution accelerates just before a major release and slows down after.
A slide from my DrupalCon Europe 2021 keynote where I explain the Drupal Super Cycle theory.

The last major Drupal release was Drupal 9, released in June of 2020. Last year's report analyzed contribution activity between July 1, 2019 and June 30, 2020. This period includes the 11-month period leading up to the Drupal 9 release, the Drupal 9 release itself, and 1 month after the Drupal 9 release. It's the "busy period" of the Super Cycle because the Drupal community is getting thousands of contributed modules ready for Drupal 9.

This year's report analyzes contribution data starting 1 month after the Drupal 9 release. There was no major Drupal release this year, and we are still 9 to 14 months away from Drupal 10, currently targeted for the summer of 2022. We are in the "quiet period" of the Super Cycle.

If the Drupal Super Cycle concept is valid, we should see increased activity in next year's report, assuming we remain on track for a Drupal 10 release in June of 2022. Time will tell!

What is the community working on?

Contribution credits decreased across all project types, except for Drupal Core, where they increased.

A graph showing the year over year growth of contributions per project type: only contributions to core grew

Core contributions saw a 7% year-over-year increase in credits, while work on contributed projects (modules, themes, and distributions) is down compared to last year.

Who are Drupal's top individual contributors?

The top 30 individual contributors between July 1, 2020 and June 30, 2021 are:

A graph showing the top 30 individual contributors ranked by the quantity of their contributions.
A graph showing the top 30 individual contributors ranked by the impact of their contributions.

For the weighted ranking, I weighed each credit based on the adoption of the project the credit is attributed to. For example, each contribution credit to Drupal Core is given a weight of 10, because Drupal Core has about 1 million active installations. Credits to the Webform module, which has over 450,000 installations, get a weight of 4.5. And credits to Drupal's Commerce project get 0.5 points, as it is installed on around 50,000 sites.

The weighting algorithm also makes adjustments for Drupal's strategic initiatives. Strategic initiatives get a weight of 10, the highest possible score, regardless of whether they are being developed in Drupal Core's Git repository or in a sandbox on

The idea is that these weights capture the end user impact of each contribution, but also act as a proxy for the effort required to get a change committed. Getting a change accepted in Drupal Core is both more difficult and more impactful than getting a change accepted to a much smaller, contributed project.
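As a sketch, the adoption-based weighting described above could look like the following (the function name and the exact cap formula are my own framing; this is not's literal implementation):

```python
def credit_weight(active_installs, strategic_initiative=False):
    """Approximate per-credit weight: ~1 point per 100,000 active installs,
    capped at 10; strategic initiatives always get the maximum weight."""
    if strategic_initiative:
        return 10.0
    return min(10.0, active_installs / 100_000)

print(credit_weight(1_000_000))  # Drupal Core -> 10.0
print(credit_weight(450_000))    # Webform     -> 4.5
print(credit_weight(50_000))     # Commerce    -> 0.5
```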

This weighting is far from perfect, but so is the unweighted view. For code contributions, the weighted chart may be more accurate than a purely unweighted approach. I included both charts:

No matter how you look at the data, all of these individuals put an incredible amount of time and effort into Drupal.

It's important to recognize that most of the top contributors are sponsored by an organization. We value the organizations that sponsor these remarkable individuals. Without their support, it could be more challenging for these individuals to contribute.

How much of the work is sponsored?

When people contribute to Drupal, they can tag their contribution as a "volunteer contribution" or a "sponsored contribution". Contributions can be marked both volunteer and sponsored at the same time (shown in jamadar's screenshot near the top of this post). This could be the case when a contributor does paid work for a customer, in addition to using unpaid time to add extra functionality or polish.

For those credits with attribution details, 16% were "purely volunteer" (7,034 credits). This is in stark contrast to the 68% that were "purely sponsored" (29,240 credits). Put simply, roughly two-thirds of all contributions are "purely sponsored". Even so, volunteer contribution remains very important to Drupal.
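Because a credit can carry both flags at once, tallying the "purely volunteer" and "purely sponsored" buckets amounts to something like this (a toy sketch; the field layout is hypothetical, not's actual data model):

```python
from collections import Counter

def bucket(volunteer, sponsored):
    """Place one credit in exactly one attribution bucket."""
    if volunteer and sponsored:
        return "volunteer and sponsored"
    if volunteer:
        return "purely volunteer"
    if sponsored:
        return "purely sponsored"
    return "no attribution"

# (volunteer, sponsored) flags for a handful of made-up credits.
credits = [(True, False), (False, True), (True, True), (False, True)]
tally = Counter(bucket(v, s) for v, s in credits)
print(tally["purely sponsored"])  # 2
```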

A graph showing how many of the contributions are volunteered vs sponsored.

Volunteers contribute across all areas of the project. A lot of volunteer time and energy goes towards non-product related contributions such as event organization, mentoring, and more. Non-code contributions like these are very valuable, yet they are under-recognized in many Open Source communities.

Contributions by project type

Who are Drupal's top organizational contributors?

Similar to the individual contributors, I've ranked organizations by both "unweighted contributions" and "weighted contributions". Unweighted scores are based solely on volume of contributions, while weighted scores also try to take into account both the effort and impact of each contribution.

A graph showing the top 30 organizational contributors ranked by the quantity of their contributions.
A graph showing the top 30 organizational contributors ranked by the impact of their contributions.

If you are an end user looking for a company to work with, these are some of the companies I'd work with first. Not only do they know Drupal best, but they also help improve your investment in Drupal. If you are a Drupal developer looking for work, these are some of the companies I'd apply to first.

A variety of different types of companies are active in Drupal's ecosystem:

Traditional Drupal businesses: Small-to-medium-sized professional services companies that primarily make money using Drupal. They typically employ fewer than 100 people. Because they specialize in Drupal, many of these companies contribute frequently and are a huge part of our community. Examples are Third and Grove, OpenSense Labs, and Srijan.
Digital marketing agencies: Larger full-service agencies with marketing-led practices using a variety of tools, typically including Drupal, Adobe Experience Manager, Sitecore, and WordPress. Many of these larger agencies employ thousands of people. Examples are Wunderman Thompson, Possible, and Mirum.
System integrators: Larger companies that specialize in bringing different technologies together into one solution. Example system integrators are Accenture, TATA Consultancy Services, EPAM Systems, and CI&T.
Hosting companies: Examples are Acquia, Pantheon, and, but also Rackspace or Bluehost.
End users: Examples are the European Commission or Pfizer.

A few observations:

  • Most of the sponsors in the top 30 are traditional Drupal businesses with fewer than 100 employees. With the exception of Acquia, Drupal's maintenance and innovation largely depends on these small Drupal businesses.
  • The larger, multi-platform digital marketing agencies are barely contributing to Drupal. Only 1 digital marketing agency shows up in the top 30: Intracto with 410 credits. Hardly any appear in the entire list of contributing organizations. I'm frustrated that we have not yet found the right way to communicate the value of contribution to these companies. We need to incentivize these firms to contribute with the same level of commitment that we see from traditional Drupal businesses.
  • The only system integrator in the top 30 is CI&T with 1,177 credits. CI&T is a smaller system integrator with approximately 5,200 employees. We see various system integrators outside of the top 30, including EPAM Systems (138 credits), TATA Consultancy Services (109 credits), Publicis Sapient (60 credits), Capgemini (40 credits), Globant (8 credits), Accenture (2 credits), etc.
  • Various hosting companies make a lot of money with Drupal, yet only Acquia appears in the top 30, with 1,263 credits. The contribution gap between Acquia and other hosting companies remains very large. Pantheon earned 71 credits, compared to 122 last year. earned 8 credits, compared to 23 in the last period. In general, there is a persistent problem with hosting companies not contributing back.
  • We only saw 1 end user in the top 30 this year: Thunder (815 credits). Many end users contribute though: European Commission (152 credits), Pfizer (147 credits), bio.logis (111 credits), Johnson & Johnson (93 credits), University of British Columbia (105 credits), Georgia Institute of Technology (75 credits), United States Department of Veterans Affairs (51 credits), NBCUniversal (45 credits), Princeton University (43 credits), Estée Lauder (38 credits), University of Texas at Austin (22 credits), and many more.
A graph showing that Acquia is by far the number one contributing hosting company.
A graph showing that CI&T is by far the number one contributing system integrator.

I often recommend that end users mandate contribution from their partners. Pfizer, for example, only works with agencies that contribute back to Drupal. The State of Georgia started doing the same; they made Open Source contribution a vendor selection criterion. If more end users took this stance, it could have a big impact on Drupal: we'd see many more digital agencies, hosting companies, and system integrators contributing.

While we should encourage more organizations to sponsor Drupal contributions, we should also understand and respect that some organizations can give more than others — and that some might not be able to give back at all. Our goal is not to foster an environment that demands what and how others should give back. Instead, we need to help foster an environment worthy of contribution. This is clearly laid out in Drupal's Values and Principles.

How diverse is Drupal?

Supporting diversity and inclusion is essential to the health and success of Drupal. The people who work on Drupal should reflect the diversity of people who use the web.

I looked at both the gender and geographic diversity of contributors.

Gender diversity

While Drupal is slowly becoming more diverse, less than 9% of the recorded contributions were made by contributors who do not identify as men. The gender imbalance in Drupal remains profound. We need to continue fostering diversity and inclusion in our community.

A graph showing contributions by gender: 67% of the contributions come from people who identify as male.

A few years ago I wrote a post about the privilege of free time in Open Source. I made the case that Open Source is not a meritocracy. Not everyone has equal amounts of free time to contribute. For example, research shows that women still spend more than twice as much time as men on unpaid domestic work, such as housework or childcare. This makes it more difficult for women to contribute to Open Source on an unpaid, volunteer basis. Organizations capable of giving back should consider financially sponsoring individuals from underrepresented groups to contribute to Open Source.

A graph that shows that compared to males, female contributors do more sponsored work, and less volunteer work.
Compared to men, women do more sponsored work, and less volunteer work. We believe this is because men have the privilege of more free time.

Free time being a privilege is just one of the reasons why Open Source projects suffer from a lack of diversity.

The gender diversity chart above shows a growing number of individuals who no longer share their gender identity on This is because a couple of years ago, the gender field on profiles was deprecated in favor of a "Big 8/Big 10" demographics field.

Today, over 100,000 individuals have filled out the new "Big 8/Big 10" demographics field. The new demographics field allows for more axes of representation, but is also somewhat non-specific within each axis. Here are the results:

A graph showing different axes of diversity in Drupal

Diversity in leadership recently introduced the ability for contributors to identify which contributor roles they fulfill. The people who hold these key contribution roles can be thought of as the leaders of different aspects of our community, whether they are local community leaders, event organizers, project maintainers, etc. As more users fill out this data, we can use it to build a picture of the key contributor roles in our community. Perhaps most importantly, we can look at the diversity of the individuals who hold these roles. In next year's report, we will provide a focused picture of diversity in these leadership positions.

Geographic diversity

We saw individual contributors from 6 continents and 121 countries. Consistent with the trends described above, most countries contributed less compared to a year earlier. Here are the top countries for 2020-2021:

 A graph showing the top 20 contributing countries in 2021.
The top 20 countries from which contributions originate. The data is compiled by aggregating the countries of all individual contributors behind each issue. Note that the geographical location of contributors doesn't always correspond with the origin of their sponsorship. Wim Leers, for example, works from Belgium, but his funding comes from Acquia, which has the majority of its customers in North America. Wim's contributions count towards Belgium as that is his country of residence.
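The aggregation described in the caption can be sketched as follows; the records and names are hypothetical, and each credit counts toward the contributor's country of residence:

```python
from collections import Counter

# Hypothetical issue records; each contributor's country comes from their profile.
issues = [
    {"contributors": [{"name": "wim", "country": "Belgium"},
                      {"name": "alice", "country": "United States"}]},
    {"contributors": [{"name": "wim", "country": "Belgium"}]},
]

# One credit per contributor per issue, tallied by country.
credits_per_country = Counter(
    person["country"]
    for issue in issues
    for person in issue["contributors"]
)
print(credits_per_country.most_common(1))  # [('Belgium', 2)]
```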

Europe contributes more than North America. However, contribution from Europe continues to decline, while all other continents have become more active contributors.

A graph that shows most contributions in 2021 come from Europe and North America.

Asia, South America, and Africa remain big opportunities for Drupal; their combined population accounts for 6.3 billion out of 7.5 billion people in the world.

Limitations of the credit system

It is important to note a few of the current limitations of's credit system:

  • The credit system doesn't capture all code contributions. Parts of Drupal are developed on GitHub rather than, and contributions on GitHub usually aren't credited on For example, a lot of the work on the Automatic Updates initiative is happening on GitHub instead of, and companies like Acquia and Pantheon don't get credit for that work.
  • The credit system is not used by everyone. Because using the credit system is optional, many contributors don't use it. For example, not all event organizers and speakers capture their work in the credit system, even though they could. As a result, contributions often have incomplete or no contribution credits. Where possible, we should capture credits automatically. For example, translation efforts on are not currently captured in the credit system, but could be automatically.
  • The credit system doesn't accurately value complexity and quality. One person might have worked several weeks for just 1 credit, while another person might receive a credit for 10 minutes of work. Each year, we see a few individuals and organizations trying to game the credit system. In this post, I used a basic weighting system based on project adoption. In the future, we should consider refining that by looking at issue priority, patch size, number of reviews, etc. This could help incentivize people to work on larger and more important problems, and save smaller issues, such as coding standards improvements, for new contributor sprints.

Because of these limitations, the actual number of contributions and contributors could be much higher than what we report.


Conclusion

While we have fewer contributions and fewer contributors compared to last year, it is not something to worry about. We can attribute the slowdown to various factors, such as COVID-19, agency growth, and the Drupal Super Cycle.

Our data confirms that Drupal is a vibrant community full of contributors who are constantly evolving and improving the software. It's amazing to see that just in the last year, Drupal welcomed more than 7,000 individual contributors and over 1,100 corporate contributors.

To grow and sustain Drupal, we should support those that contribute to Drupal and find ways to get those that are not contributing involved in our community. We are working on several new ways to make it easier for new contributors to get started with Drupal, which I covered in my latest DrupalCon keynote. Improving diversity within Drupal is critical, and we should welcome any suggestions that encourage participation from a broader range of individuals and organizations.

Special thanks to Tim Lehnen, CTO at the Drupal Association, for supporting me during my research.

October 19, 2021

Transparent encryption is relatively easy to implement, but if you don't understand what it actually means or why you are implementing it, you will probably assume that it prevents the data from being accessed by unauthorized users. Nothing could be further from the truth.

Listing the threats to protect against

Let's first list the threats you want to protect against. It helps if the organization also scores these threats for likelihood and impact, so that you can optimize and prioritize the measures appropriately.

  • Data leakage through theft or loss of storage media
  • Data leakage through unauthorized data access (OS level)
  • Data leakage through unauthorized data access (middleware/database level)
  • Data leakage through application vulnerability (including injection attacks)
  • Loss of confidentiality through data-in-transit interception
  • Loss of confidentiality through local privilege escalation

While all of the "data leakage" threats also involve loss of confidentiality, and any loss of confidentiality can also result in data leakage, I made the distinction in naming because the data intercepted through the latter threats is generally not as 'bulky' as in the others.

To visualize the threats, consider an application that has a database as its backend, with the application hosted on a different system than the database. In the diagram, the blue color indicates an application-specific focus. This does not mean it is no longer infrastructure oriented, but rather that it can't be implemented transparently without the application's support.

Application and database interaction

There are eight roles listed (well, technically seven roles, but let's keep it simple and treat "physical access" as a role too), ranging from the application user to physical access:

  • The application user interacts with the application itself, for instance from a browser to the web application.
  • The application administrator also interacts with the application, but has more privileges. This user might also have access to the system on which the application itself resides (but that isn't modelled further here).
  • The network poweruser is a user that has access to the network traffic between the client and the application, as well as to the network traffic between the application and the database. Depending on their privileges, these powerusers can be administrators on systems that reside in the same network.
  • The database / middleware user is a role that has access to the application data in the database directly (so not (only) through the application). This can commonly be a supporting function in the organization.
  • The database / middleware administrator is the administrator of the database engine (or other middleware component that is used).
  • The system administrator is the administrator for the server on which the database is hosted.
  • The system user is an unprivileged user that has access to the server on which the database is hosted.
  • The physical access role covers anyone with physical access to the server and its storage.

Further, while the example is easiest to understand with a database system, be aware that there exist many other middleware services that manage data (like queueing systems), and the same threats and measures apply to them as well.

Transparent encryption is a physical medium data protection measure

Transparent encryption, such as through LUKS (with DM-Crypt) on Linux, will encrypt the data on the disks, while still presenting the data unencrypted to the users. All users. Its purpose hence is not to prevent unauthorized users from accessing the data directly, but to prevent the storage media from exposing the data if the media are leaked or lost.
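As a sketch of how transparent this is to the running system (device names, key file location, and mount point here are assumptions for illustration, not from the original post): a LUKS volume is typically unlocked at boot via /etc/crypttab and then mounted via /etc/fstab, after which every process sees a plain filesystem.

# /etc/crypttab - map the LUKS partition /dev/sdb1 to /dev/mapper/cryptdata
# using a key file (could also prompt for a passphrase at boot)
cryptdata  /dev/sdb1  /etc/keys/data.key  luks

# /etc/fstab - mount the decrypted mapping like any other filesystem
/dev/mapper/cryptdata  /srv/data  ext4  defaults  0  2

From this point on, nothing on the system distinguishes /srv/data from an unencrypted filesystem; only the at-rest bits on /dev/sdb1 are protected.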

Transparent Disk Encryption

In the diagram, you notice that the transparent disk encryption only takes effect between the server and its storage. Hence, the only 'inappropriate' access that it mitigates is physical access to the server storage. Note that physical access to the server itself is still an important attack vector that isn't completely mitigated here - attackers with physical access to servers will not have too hard a time finding an entry point to the system. Advanced attackers might even be able to capture the data from memory without being detected.

Transparent disk encryption is very sensible when dealing with removable media (like USB sticks), especially if they contain any (possibly) confidential data and the method for transparent encryption is supported on all systems where you are going to use the removable media. In larger enterprises, it also makes sense when multiple teams or even companies have physical access and could attempt to maliciously access the systems.

For server disks or SAN storage for instance, this has to be balanced against the downsides of the encryption. You can do disk encryption from the storage array for instance, but this might impact the array's capability for deduplication and compression. If your data centers are highly secured, and you do not allow the storage media to leave the premises without being properly wiped or destroyed, then such transparent encryption in my opinion has little value.

Of course, when you have systems hosted at third-party locations, the risk that media are removed or stolen is higher, especially if those locations are accessed by many others and your own space isn't further protected physically. So while a company-controlled data center with tight access requirements, policies, and controls that no media leave the premises could reasonably decide not to apply transparent disk encryption, anyone using a public cloud service or a non-private colocation facility should assess encryption capabilities at the disk level (and higher).

Furthermore, a properly configured database system will not expose its data to unauthorized users to start with, so the system user role should not have access to the data. But once you have local access to a system, there is always the threat that a privilege escalation bug allows the (previously lower-privileged) user to access protected files.

Transparent database encryption isn't that much better

Some database technologies (or middleware more generally) offer transparent encryption themselves. In this case, the actual database files on the system are encrypted by the database engine, but the database users still see the data as if it were unencrypted.

Transparent Database Encryption

Here again, it is important to know what you are protecting yourself from. Transparent database/middleware encryption does prevent the non-middleware administrators from directly viewing the data through the files. However, system administrators generally have the means to become the database (or middleware) administrator, so while the threat is not direct, it is still indirectly there.

The threat of privilege escalation on the system level is partially mitigated. While a full system compromise will lead to the system user getting system administrator privileges, partial compromise (such as receiving access to the data files, but not to the encryption key itself, or not being able to impersonate users but just access data) will be mitigated by the transparent database encryption.

It is important to see here that the threats related to physical access are also mostly mitigated by the transparent database encryption, with the exception that database-only encryption might result in the encryption key being leaked if it resides on the system storage.

Most of the threats, however, are still not mitigated: network interception (if the connection doesn't use a properly configured TLS channel), admin access, database user access, and application admins and users (through application vulnerabilities) can still get access to all that data. The only threat these measures address is data loss through physical access.

Database or middleware supported, application-driven encryption is somewhat better

Some database technologies support out-of-the-box data encryption through the appropriate stored procedures or similar. In this case, the application itself is designed to use these encryption methods from the database (or middleware), and often holds the main encryption key itself (rather than in the database).
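As an illustration (PostgreSQL with the pgcrypto extension; the table and column names are hypothetical), the application passes the key as a bind parameter with each query, so the database stores only ciphertext and never persists the key:

```sql
-- Assumes: CREATE EXTENSION pgcrypto;  name_enc is a bytea column.
-- :key is supplied by the application and is never stored in the database.
INSERT INTO customers (id, name_enc)
VALUES (1, pgp_sym_encrypt('Alice', :key));

SELECT pgp_sym_decrypt(name_enc, :key) AS name
FROM customers
WHERE id = 1;
```

Administrators who can read the table files see only ciphertext; they would have to capture the key in transit (or alter the functions) to get at the data.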

Database or middleware supported data encryption

While this prevents some of the attack vectors (for instance, some attacks against the application will not result in getting a context that is able to decrypt the data) and mitigates the attack vectors related to direct database user access, there are still plenty of issues here.

System administrators and database administrators are still able to control the encryption/decryption process. Sure, it becomes harder and requires more thought and expertise (like modifying the stored procedures to also store the key or the data in a different table for them to access), but it remains possible.

Because of this added attack complexity, this is the first measure that starts to meet reasonable expectations. And because the database or middleware is still responsible for the encryption/decryption part, it can still use its knowledge of the data for things like performance tuning.

Application-managed data encryption is a highly appreciated measure

With application-managed data encryption, the application itself will encrypt and decrypt the data even before it is sent over to the database or middleware.

Application-managed data encryption

With this measure, many of the threats are mitigated. Even network interception is partially prevented, as interception can now only obtain data between the client and the application, not between the application and the database. Also, all roles that are not application-related no longer have access to the data.

Personally, I think that application-managed data encryption is always preferred over the database- or middleware-supported encryption methods. Not only does it remove many threats, it is also much more portable, as you do not need a database or middleware that supports it (and thus have to include logic for that in the application).
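The flow can be sketched as follows. This is a toy: the one-time pad below is a stand-in for a real authenticated cipher such as AES-GCM from a proper crypto library, and the column names are made up.

```python
import secrets

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    # Toy cipher: XOR with an equally long random key (a one-time pad).
    # A real application would use an AEAD cipher from a crypto library.
    assert len(key) == len(plaintext)
    return bytes(k ^ p for k, p in zip(key, plaintext))

decrypt = encrypt  # XOR is its own inverse

# The application encrypts before the value ever reaches the database.
name = b"Alice"
key = secrets.token_bytes(len(name))  # held by the application, not the DB
ciphertext = encrypt(key, name)

# The database (and every role below the application) sees only ciphertext.
row = {"id": 1, "name_enc": ciphertext}

# Only the application, which holds the key, can recover the plaintext.
assert decrypt(key, row["name_enc"]) == b"Alice"
```

The essential property is that the key never leaves the application tier, so neither the database administrator nor the system administrator of the database host can decrypt the column.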

Of course, applications will need to ensure that they can still use the functionalities of the database and middleware appropriately. If you store names in the database in encrypted form, it is no longer possible to do a SELECT on their content.

Client-managed data encryption

The highest level of protection against the threats listed, but of course also the most impactful and challenging to implement, is to use client-managed data encryption.

Client-managed data encryption

A web application might for instance have a (properly designed) encryption method brought to the browser (e.g. using JavaScript), allowing the end user's sensitive data to be encrypted even before it is transmitted over the network.

In that case, none of the attack vectors will be able to obtain the data. Of course, there are plenty of other attack vectors (protecting web applications is an art by itself), but for those we covered, client-managed encryption does tick many of the boxes.

However, client-managed data encryption is also very complex to do securely while still fully supporting the users. Most applications that employ it already focus on sensitive material (like password managers) and use end-user-provided information to generate the encryption keys. You need to be able to deal with stale versions (old JavaScript libraries), a multitude of browsers (if it is browser-based), vulnerabilities within the browsers themselves and the web application, etc.

Network encryption

Network encryption (as in the use of TLS-encrypted communications) only addresses the confidentiality and integrity of the communication, in our example against the network poweruser who might be intercepting traffic.

Network encryption

While the majority of other threats are still applicable, I do want to point out that network encryption is an important measure against other threats. For instance, with network encryption, attackers cannot easily inject code or data into existing flows. In the case of the client-managed data encryption approach, for instance, the use of network encryption is paramount, as otherwise an 'in the middle' attacker can simply strip the client-side encryption code as it is transmitted.


I hope that this article provides better insight into when transparent encryption is sensible, and when it is not. With the above assessment, it should be obvious that transparent (and thus without any application support) encryption methods do not cover all the threats out there, and your company likely already has other measures covering the threats that transparent encryption does address.

Full overview

The above image shows all the different encryption levels and where in the application, database and system interactions they are situated.

Feedback? Comments? Don't hesitate to drop me an email, or join the discussion on Twitter.

October 17, 2021


In previous blog posts, I described how to set up stubby as a DNS-over-TLS resolver. I used stubby on my laptop(s) and unbound on my internal network.

I migrated to unbound last year and created a docker container for it. Unbound is a popular DNS resolver; it's less well known that you can also use it as an authoritative DNS server.

This work was based on Debian Buster. I migrated the container to Debian Bullseye and reorganized it a bit to make it easier to store the zones configuration outside the container, for example in a ConfigMap or persistent volume on Kubernetes.

Version 2.0.0 is available at

Version 2.0.0:


  • Updated the base image to debian:bullseye.
  • Updated to be able to run outside the container.
  • Removed the zones.conf generation from the entrypoint
  • Start the container as the unbound user
  • Updated to logging.conf
  • Set the pidfile /tmp/
  • Added remote-control.conf
  • Updated the documentation


Dockerfile to run unbound inside a docker container. The unbound daemon will run as the unbound user. The uid/gid is mapped to 5000153.


Clone the git repo

$ git clone
$ cd docker-stafwag-unbound



The default DNS port is set to 5353; this port is mapped with the docker command to the default port 53 (see below). If you want to use another port, you can edit etc/unbound/unbound.conf.d/interface.conf.
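For reference, a minimal interface.conf along those lines might look like this (the listen address shown is an assumption; check the actual file in the repository):

server:
  port: 5353

The interface: option tells unbound which addresses to listen on, and is the container's "all interfaces" address; the host-side port mapping is what exposes it as port 53.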

scripts/ helper script

The helper script can help you create the zones.conf configuration file. It's executed during the container build and creates zones.conf from the data files in etc/unbound/zones.

If you want to use a docker volume or ConfigMaps/persistent volumes on Kubernetes, you can use this script to generate the zones.conf from a zones data directory. The script has the following arguments:

  • -f Default: /etc/unbound/unbound.conf.d/zones.conf The zones.conf file to create
  • -d Default: /etc/unbound/zones/ The zones data source files
  • -p Default: the realpath of zone files
  • -s Skip chown/chmod

Use unbound as an authoritative DNS server

To use unbound as an authoritative DNS server - a DNS server that hosts DNS zones - add your zone files to etc/unbound/zones/.

During the creation of the image scripts/ is executed to create the zones configuration file.

Alternatively, you can also use a docker volume to mount /etc/unbound/zones/ to your zone files, and a volume mount for the zones.conf configuration file.

You can use subdirectories. The zone file needs to have $ORIGIN set to your zone origin.

Use DNS-over-TLS

The default configuration uses quad9 to forward the DNS queries over TLS. If you want to use another vendor or you want to use the root DNS servers directly, you can remove this file.
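A typical quad9 forward-zone configuration for DNS-over-TLS looks roughly like this (the exact file shipped in the image may differ; the CA bundle path is the Debian default):

server:
  tls-cert-bundle: /etc/ssl/certs/ca-certificates.crt

forward-zone:
  name: "."
  forward-tls-upstream: yes

The @853 suffix selects the DNS-over-TLS port and the # suffix names the host to verify in the upstream's TLS certificate.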

Build the image

$ docker build -t stafwag/unbound . 

To use a different BASE_IMAGE, you can use the --build-arg BASE_IMAGE=your_base_image option.

$ docker build --build-arg BASE_IMAGE=stafwag/debian:bullseye -t stafwag/unbound .


Recursive DNS server with DNS-over-TLS


$ docker run -d --rm --name myunbound -p -p stafwag/unbound


$ dig @

Authoritative DNS server

If you want to use unbound as an authoritative DNS server, you can use the steps below.

Create a directory with your zone files:

[staf@vicky ~]$ mkdir -p ~/docker/volumes/unbound/zones/stafnet
[staf@vicky ~]$ 
[staf@vicky stafnet]$ cd ~/docker/volumes/unbound/zones/stafnet
[staf@vicky ~]$ 

Create the zone files

Zone files

$TTL  86400 ; 24 hours
$ORIGIN stafnet.local.
@  1D  IN  SOA @  root (
            20200322001 ; serial
            3H ; refresh
            15 ; retry
            1w ; expire
            3h ) ; minimum
@  1D  IN  NS @ 

stafmail IN A

$TTL    86400 ;
@       IN      SOA     stafnet.local. root.localhost.  (
                        20200322001; Serial
                        3h      ; Refresh
                        15      ; Retry
                        1w      ; Expire
                        3h )    ; Minimum
        IN      NS      localhost.
10      IN      PTR     stafmail.

Make sure that the volume directory and zone files have the correct permissions.

$ sudo chmod 750 ~/docker/volumes/unbound/zones/stafnet/
$ sudo chmod 640 ~/docker/volumes/unbound/zones/stafnet/*
$ sudo chown -R root:5000153 ~/docker/volumes/unbound/

Create the zones.conf configuration file.

[staf@vicky stafnet]$ cd ~/github/stafwag/docker-stafwag-unbound/
[staf@vicky docker-stafwag-unbound]$ 

The script will execute a chown and chmod on the generated zones.conf file and is executed with sudo for this reason.

[staf@vicky docker-stafwag-unbound]$ sudo scripts/ -f ~/docker/volumes/unbound/zones.conf -d ~/docker/volumes/unbound/zones/stafnet -p /etc/unbound/zones
Processing: /home/staf/docker/volumes/unbound/zones/stafnet/
Processing: /home/staf/docker/volumes/unbound/zones/stafnet/
[staf@vicky docker-stafwag-unbound]$ 

Verify the generated zones.conf

[staf@vicky docker-stafwag-unbound]$ sudo cat ~/docker/volumes/unbound/zones.conf
auth-zone:
  name: stafnet.local
  zonefile: /etc/unbound/zones/

auth-zone:
  name: 1.168.192.IN-ADDR.ARPA
  zonefile: /etc/unbound/zones/

[staf@vicky docker-stafwag-unbound]$ 

Run the container

$ docker run --rm --name myunbound -v ~/docker/volumes/unbound/zones/stafnet:/etc/unbound/zones/ -v ~/docker/volumes/unbound/zones.conf:/etc/unbound/unbound.conf.d/zones.conf -p -p stafwag/unbound


[staf@vicky ~]$ dig @ soa stafnet.local

; <<>> DiG 9.16.1 <<>> @ soa stafnet.local
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 37184
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

; EDNS: version: 0, flags:; udp: 4096
;stafnet.local.     IN  SOA

stafnet.local.    86400 IN  SOA stafnet.local. root.stafnet.local. 3020452817 10800 15 604800 10800

;; Query time: 0 msec
;; WHEN: Sun Mar 22 19:41:09 CET 2020
;; MSG SIZE  rcvd: 83

[staf@vicky ~]$ 


October 13, 2021

Update: Found it, fixed it. We can computer, after all :) We seem to be losing some email sent to our mailing lists. If you send anything important, please check the list archive to make certain it has arrived.

Last week, Drupalists around the world gathered virtually for DrupalCon Europe 2021.

In good tradition, I delivered my State of Drupal keynote. You can watch the video of my keynote, download my slides (156 MB), or read the brief summary below.

I talked about end-of-life schedules for various Drupal versions, delivered some exciting updates on Drupal 10 progress, and covered the health of the Drupal community in terms of contributor dynamics. Last but not least, I talked about how we are attracting new users and contributors by making it much easier to contribute to Drupal.

Drupal 7 and Drupal 8 end-of-life

If you are using Drupal 7 or Drupal 8, time is of the essence to upgrade to Drupal 9. Drupal 7 end-of-life is scheduled for November 2022.

Drupal 8's end-of-life is more pressing, as it is scheduled for November 2nd, 2021 (i.e. in less than a month). If you are wondering why Drupal 8 is end-of-life before Drupal 7, that is because we changed how we develop Drupal in 2016. These changes have been really great for Drupal. They've made it much easier to upgrade to the latest version without friction.

As a community, we've spent thousands of hours building tools and automations to make migrating to Drupal 9 as simple as possible.

Drupal 10 timeline

Next, I gave an update on Drupal 10 timelines. Timing-wise, our preferred option would be to ship Drupal 10 in June 2022. That date hinges on how much work we can get done in the next few months.

Drupal and timelines

Drupal core strategic initiatives

After these timelines, I walked through the six strategic initiatives for Drupal core. We've made really great progress on almost all of them. To see our progress in action, I invited key contributors to present video updates.

A slide with progress bars for each of the 6 initiatives; 3 of them are over 80% complete.

Project Browser

You may recall that I introduced the Project Browser initiative in my April 2021 State of Drupal presentation. The idea is to make it easy for site builders to find and install modules right from their Drupal site, much like an app store on a smartphone. The goal of this initiative is to help more evaluators and site builders fall in love with Drupal.

Today, just six months later, we have a working prototype! Take a look at the demo video:

Decoupled Menus

Drupal is an excellent headless CMS with support for REST, JSON:API and GraphQL.

As a next step in our evolution, we want to expand the number of web service endpoints Drupal offers, and build a large repository of web components and JavaScript framework integrations.

With that big goal in mind, we launched the Decoupled Menus initiative about one year ago. The goal was to create a small web component that could ship quickly and solve a common use case. We focused on one component so we could take all the learnings from that one component to improve our development infrastructure and policies to help us create many more web service end points and JavaScript components.

I talked about the various improvements we made to support the development and management of more JavaScript components. I also showed that we've now shipped Drupal menu components for React, Svelte and more. Take a look at the video below to see where we're at today:

Our focus on inviting more JavaScript developers to the Drupal community is a transformative step. Why? Headless momentum is growing fast, largely driven by the growth of JavaScript frameworks. Growing right along with it is the trend of composability, or the use of independent, API-first micro-services. Building more web service endpoints and JavaScript components extends Drupal's leadership in both headless development and composability. This will continue to make Drupal one of the most powerful and flexible tools for developers.

Easy Out of the Box

The goal of this initiative is to have Layout Builder, Media, and Claro added to the Standard Profile. That means these features would be enabled by default for any new Drupal user.

Unfortunately, we have not made a lot of progress on this initiative. In my presentation, I talked about how I'd like to find a way for us to get it done by Drupal 10. My recommendation is that we reduce the scope of work that is required to get them into Standard Profile.

Automatic Updates

The Automatic Updates initiative's goal is to make it easier to update Drupal sites. Vulnerabilities in software, if left unchecked, can lead to security problems. Automatic updates are an important step toward helping Drupal users keep their sites secure.

The initiative made excellent progress. For the very first time, I was able to show a working development version:

Drupal 10 Readiness

The Drupal 10 Readiness initiative is focused on upgrading the third-party components that Drupal depends on. This initiative has been a lot of work, but we are largely on track.

A slide from the DriesNote saying that the Drupal 10 upgrade work is 300% more automated than Drupal 9.

The most exciting part? The upgrade to Drupal 10 will be easy thanks to careful management of deprecated code and continued investment in Rector. As it stands, upgrading modules from Drupal 9 to Drupal 10 can be almost entirely automated, which is a big 300% improvement compared to the Drupal 8 to Drupal 9 upgrade.

New front end theme

We are nearly at the finish line for our new front end theme, Olivero. In the past few months, a lot of effort has gone into ensuring that Olivero is fully accessible, consistent with our commitment to accessibility.

Olivero already received a glowing review from the National Federation of the Blind (USA):

Olivero is very well done and low-vision accessible. We are not finding any issues with contrast, focus, or scaling, the forms are very well done, and the content is easy to find and navigate.

Something to be really proud of!

The health of Drupal's contribution dynamics

Next, I took a look at Drupal's contribution data. These metrics show that contributions are down. At first I panicked when I saw this data, but then I realized that there are some good explanations for this trend. I also believe this trend could be temporary.

Contribution metrics

To learn more about why this was happening, I looked at the attrition rate of Drupal's contributors — the percentage of individuals and organizations who stopped contributing within the last year. I compared this data to industry averages for software and services companies.

Slide with data that shows Drupal's top contributors are very loyal

While typical attrition for software and services companies is considered "good" at 15%, Drupal's attrition rate for its Top 1,000 contributors is only 7.7%. The attrition rate for Drupal agencies in the Top 250 organizations is only 1.2%.

I was very encouraged by this data. It shows that we have a very strong, loyal and resilient community of contributors. While many of our top contributors are contributing less (see the full recording for more data), almost none of them are leaving Drupal.

There are a number of reasons for the slowdown in contribution:

  • The COVID-19 pandemic has made contribution more difficult and/or less desirable.
  • We are in the slow period of the "Drupal Super Cycle" — after every major release, work shifts from active development to maintenance.
  • Anecdotally, many Drupal agencies have told me they have less time to contribute because they are growing so fast (see quotes in image below). That is great news for Drupal adoption.
  • Drupal is a stable and mature software project. Drupal has nearly all the features organizations need to deliver state-of-the-art digital experiences. Because of Drupal's maturity, there are simply fewer bug fixes and feature improvements to contribute.
  • Rector-automations have led to less contribution. It's good to work smarter, not harder.

I'll expand on this more in my upcoming Who sponsors Drupal development post.

Slide with quotes from Drupal agencies CEOs stating that they are growing fast

The magic of contribution

I wrapped up my presentation by talking about some of the things that we are doing to make it easier to adopt Drupal. I highlighted DrupalPod and Simplytest as two examples of amazing community-driven innovations.

A slide promoting DrupalPod and Simplytest

After people adopt Drupal, we need to make it easier for them to become contributors. To make contribution easier, Drupal has started adopting GitLab in place of our home-grown development tools. Many developers outside the Drupal ecosystem are accustomed to using tools like GitLab. Allowing them to use tools with which they are already familiar is an important step to attracting new contributors. Check out this video to get the latest update on our GitLab effort:

Thank you

To wrap up, I'd like to thank all of the people and organizations who have contributed to Drupal since the last DriesNote. It's pretty amazing to see the momentum on our core initiatives! As always, your contributions are inspiring to me!

Thank you for the many contributions

October 11, 2021

FOSDEM 2022 will take place on Saturday 5 and Sunday 6 February 2022. The exact format is yet to be decided. As every year, we started planning for real in August. As is evident from our lack of updates since then, it's a bit harder this year. There are a lot of strong opinions about what the best, or least bad, FOSDEM 2022 could look like. Finding consensus is harder than we would like it to be. We will do a better job of keeping you informed going forward. Apologies; we're also burned out and just want…

October 08, 2021

Chers parents,

Nous sommes le 8 octobre 2051. J’ai aujourd’hui 30 ans et un peu de recul sur mon enfance, mon éducation et le monde dans lequel j’ai grandi.

Ma génération est confrontée à une problématique sans précédent dans l’histoire de l’humanité : devoir gérer les déchets de la génération précédente.

Jusqu’aux années 1970, la planète se régénérait naturellement. Les déchets humains étaient absorbés et recyclés spontanément. À partir de votre génération, ce ne fut plus le cas. Vous fûtes la toute première génération de l’histoire à produire et consommer plus que ce que la terre ne le permettait.

Vous nous laissez sur les bras l’excédent de déchets.

Le pire, c’est que vous le saviez.

Quand je me réfère aux archives et à mes souvenirs de prime jeunesse, votre époque n’était guère accueillante. Vous aviez des voitures consommant de l’énergie fossile et des fumeurs au cœur des villes ! Aujourd’hui, la voiture électrique ne sert que pour se déplacer entre les centres citadins. Elles sont strictement interdites dans les zones urbaines où tout se fait à pied, à vélo, à trottinette ou en taxi-tram autonome. Malgré tout, notre air est moins respirable que le vôtre !

Merci d’avoir œuvré à cette transformation. Peut-être était-ce le minimum à faire pour que nous survivions. Car si vous avez agi, souvent avec beaucoup de bonne volonté, c’était rarement dans le bon sens.

Comme cette manie que vous aviez de vouloir économiser l’électricité. J’ai du mal à croire que, même à votre époque, l’électricité n’était pas abondante et peu polluante pour les individus. Si j’en crois les archives, les années 2020 voyaient de réguliers pics de surproduction d’électricité dus aux panneaux solaires et vous démanteliez des centrales nucléaires parfaitement fonctionnelles. Vous perdiez vraiment votre temps à vous convaincre de mettre des ampoules économiques si polluantes à produire ? Un peu comme le coup de faire pipi dans la douche ou de ne pas imprimer les emails. Vous pensiez sérieusement que nous allions vous remercier pour cela ?

Vous semblez avoir dépensé tellement d’énergie et de temps pour tenter, parfois vainement, d’économiser 10% de votre consommation privée de ce qui n’était de toute façon qu’une goutte d’eau face à l’industrie. Vous culpabilisiez les individus alors que votre consommation personnelle représentait le quart de l’électricité consommée globalement (dont le tiers uniquement pour le chauffage). Même si vous aviez arrêté de consommer complètement de l’électricité à titre individuel, cela n’aurait eu qu’un impact imperceptible pour nous.

Par contre, vous nous laissez sur le dos des gigatonnes de déchets de ces appareils dont plus personne ne voulait, car ils consommaient un peu trop. Chaque année, culpabilisés par le marketing, vous vous équipiez d’une nouvelle génération d’appareils qui consommaient « moins », de vêtements « fair trade », de gourdes prétendument recyclables et de vaisselle en bambou. Le tout ayant fait le tour du monde pour rester brièvement dans vos armoires avant de combler les décharges sur lesquelles nous vivons désormais.

Vous semblez vous être évertué à acheter le plus de gadgets inutiles possibles, mais en vous rassurant, car, cette année, la fabrication du gadget en question avait émis 10% de CO2 en moins que celui de l’année précédente et que l’emballage était « presque entièrement recyclable  ». Ses composants avaient fait trois fois le tour du globe, mais, rassurez-vous, deux arbres avaient été plantés. Aujourd’hui encore, nous avons du mal à comprendre comment vous aviez matériellement le temps de faire autant d’achats. Il semblerait que vous deviez passer plus de temps à faire « du shopping » et à remplir vos armoires qu’à réellement utiliser vos achats. Armoires pleines à craquer que nous devons vider les jours qui suivent votre décès, moitié pleurant votre perte, moitié râlant sur votre propension à tout garder.

Consommer des gadgets était peut-être la seule façon que vous pouviez imaginer pour poursuivre la lubie de votre génération : créer des emplois. Toujours plus d’emplois. Une partie de ces emplois consistaient d’ailleurs explicitement à vous convaincre d’acheter plus. Comment avez-vous moralement pu accomplir ces tâches explicitement morbides ? Parce que c’était votre travail, certainement. L’histoire démontre que les pires exactions furent commises par des gens dont « c’était le travail ». Pousser les autres à consommer fait désormais partie de ces crimes historiques contre l’humanité. Utiliser le prétexte écologique pour consommer encore plus ne fait qu’aggraver la culpabilité de ceux qui furent impliqués.

Pendant 40 ans, vous avez eu comme politique de créer autant d’emplois que possible, emplois dont le rôle premier était de transformer les ressources en déchets. Pendant 40 ans, vous vous êtes démenés pour remplir le plus vite possible votre poubelle planétaire : nous, l’an 2050.

Nous, vos enfants, sommes votre poubelle. Ce pays lointain qui vous semblait abstrait, nous vivons dedans.

It took our generation to decide that anyone selling a good or packaging that is neither immediately consumable nor naturally degradable is obliged to buy their products back at half price, whatever their condition. And so each part, each component, travels back up the chain. In the end, the producer is responsible for disposal and forced to manage its own impact.

Of course, there was enormous disruption in logistics services, which suddenly had to work in both directions. Industry adapted by trying to develop products that would last as long as possible and by favouring repairability and ease of disassembly. Suddenly, that was a selling point. Marketing didn't take long to change its tune and try to convince you that renting, even on very long terms, was freedom compared with owning. Repair created economic activity that you would perhaps call jobs. Paradoxically, natural economic activity developed the day we stopped trying to create it artificially. The day we accepted that it must be possible to live without a job. In this way we hope to become, once again, a generation that produces no more waste than the planet can absorb. Whether as CO2, microparticles or heavy metals.

Global warming and forest fires are not helping, but we have good hope of getting there.

Still, even if we do get there, we are stuck dealing with your 50 years of waste. They are not about to disappear, those cheap plastic toys bought to quiet the youngest child in the shop, or the super-revolutionary phone that became a has-been paperweight two years later. Not to mention that the price of their manufacture and transport accompanies every breath we take in air laden with CO2.

Every breath reminds us that you existed. Makes us wonder: why did you not act? Why did we have to wait until we had buried or retired you before we could do anything?

And then some of us tell me that their parents smoked. That it was normal to smoke in the street near children, or even inside houses and cars.

So your generation spent money with the sole purpose of destroying its own health, destroying its own children's health, while polluting the atmosphere and the water? You financed a flourishing industry whose one and only objective was the destruction of the health of its customers, of its customers' children, of its customers' entourage, and of nature? It is now estimated that nearly 1% of the excess CO2 in the atmosphere is due to the tobacco industry. We could have done without it.

On the other hand, we must admit, we have plenty of photos and historical documents proving that you were activists, that you signed petitions and that you "marched for the climate". While smoking cigarettes.

It has become a running joke whenever we talk about you. The generation of eco-smokers. The image has become famous as an illustration of that mix of useless, lazy collective goodwill, that propensity to guilt-trip individuals over trifles, to carry out symbolic collective actions with no stakes, and to look away from genuinely morbid behaviour.

You shouted "Put saving the planet first!". To which the politicians replied "Absolutely! Put the economy and saving the planet first!". Then, throats slightly hoarse, everyone went home satisfied. Before organising a big participatory workshop on "transcendental meditation and dry toilets" where you passed around a joint of industrial tobacco mixed with organic weed from the shared vegetable garden.

Our generation is permissive. In many parts of the world, recreational drug use is legal or tolerated. On the other hand, any emission of toxic particles is strictly forbidden in public places. It really wasn't hard to put in place, and the only reason we can see for you not doing it is that you didn't want to.

Despite your speeches, you had absolutely no desire to build a better world for us. You only had to ask yourselves the question: "do I want my children to smoke?" Even among die-hard smokers, I think very few would have said yes. "Do I want my children to bear the ecological weight of the twenty mobile phones on which I spent, in total, a year's salary? Of thousands of kilometres of diesel and five company cars?" You only had to ask the question. Banning cigarettes in public spaces would have been a very simple way of showing that you thought about us a little.

But you did not think about us. You never thought about us. You just wanted to ease your conscience while changing strictly nothing about your habits, even the stupidest ones. In your defence, you did not inherit an easy situation from your own parents either: that generation which, after its post-May-68 hangover, grabbed all the wealth and kept it, voting Reagan/Thatcher and stretching out its life expectancy. Without ever making room for you.

When we discuss it among ourselves, we think that, in the end, we are lucky to be here. We have to manage your rubbish, but you could, for the same price, have annihilated us. You treated us like virgin territory, a faraway country to conquer and strip of its resources at any cost. A country that was yours by right, since the natives offered no active resistance.

What's done is done. We are left with the hard task of not doing the same, and of trying to offer a better world to our children. Not by pretending to think of them to ease our conscience, but by trying to think as they will. By treating them as a friendly country to be respected, a partner. No longer as a bottomless dustbin.

Signed: your future

Author's note: the idea of treating the future as a country with which to maintain international relations was inspired by Vinay Gupta, whom I met at the European Parliament in 2017. Vinay later published a very interesting analysis in which he suggests viewing all our actions through the filter of the future we are preparing for the children of this planet.

Although neither of these two inspirations was conscious while I was writing this text, on rereading they strike me as unmistakable.

Photo by Simon Hurry on Unsplash

Forget social media for a moment and subscribe by email or RSS (max 2 posts a week and nothing else). Latest book: Printeurs, a cyberpunk thriller. To support the author, give and share his books.

This text is published under the CC-By BE licence.

October 04, 2021

Security vendors are touting the benefits of "zero trust" as the new way to approach security and security-conscious architecture. But while some principles within the zero trust mindset have emerged over the last dozen years, most of the content in zero trust discussions is tied to age-old security propositions.

October 01, 2021

Cover Image - Tackling
"The problem isn't that Johnny can't read. The problem isn't even that Johnny can't think. The problem is that Johnny doesn't know what thinking is; he confuses it with feeling."
– Thomas Sowell

I'm not one to miss an important milestone, so let me draw your attention to a shift in norms taking place in the Ruby open source community: tolerance of differing views is no longer expected.

This ought to be a remarkable change: previously, a common refrain was that "in order to be tolerant, we cannot tolerate intolerance." This was the rationale for excluding certain people, under the guise of inclusivity. Well, that line of reasoning is now on its way out, and intolerance is now openly advocated for, with lots of heart emoji to boot.


The Anatomy of Man - Da Vinci (1513)

Code of Misconduct

The source for this is a series of changes to the Ruby Code of Conduct, which subtly tweak the language. The stated rationale is to "remove abuse enabling language."

There are a few specific shifts to notice here:

  • Objections no longer have to be based on reasonable concerns.
  • All that matters is that someone could consider something to be harassing behavior.
  • Behavior is now mainly unacceptable if it targets protected classes.
  • Tolerance of opposing views is removed entirely as expected conduct.

Also noticeable is that this is done through multiple small changes, each stacking on top of the previous one over a few days, a perfect illustration of "boiling the frog."

This ought to set off alarm bells. If concerns no longer have to be reasonable, then completely unreasonable complaints will have to be taken seriously. If opposing views are no longer welcome, then casting doubt on accusations of abuse is also misconduct. If only protected classes are singled out as worthy of protection, then it creates a grey area of traits which are acceptable to use as weapons to bully people.

It shouldn't take much imagination to see how these changes can actually enable abuse, if you know how emotional blackmail works: it's when an abuser makes other people responsible for managing the abuser's feelings, which are unstable and not grounded in mutual respect and obligation. If Alice's behavior causes Bob to be upset, Bob castigates Alice as an offender. If Bob's behavior causes Alice to be upset, then Alice is making Bob feel unsafe, and it's still Alice's fault, who needs to make amends.

A good example is how the social interaction style of people with autism can be trivially recast as deliberate insensitivity. Cancelled Googler James Damore made exactly this point in The Neurodiversity Case for Free Speech. This is also excellently illustrated in Splain it to Me which highlights how one person's gift of information can almost always be recast as an attempt to embarrass another as ignorant.

For all this to seem sensible, the people involved have to have enormous blinders on, suffering from the phenomenon that Sowell so aptly described: the focus isn't on thinking out a set of effective and consistent rules, but rather on letting the feelings do the driving, letting the most volatile members dominate over everyone else. Quite possibly they themselves have one or more emotional abusers in their lives, who have trained them to see such asymmetry as normal. "Heads I win, tails you lose" is a recipe for gaslighting, after all.

The Ruby community is of course free to decide what constitutes acceptable behavior. But there is little evidence of widespread support for such a change. On HackerNews, the change in policy was widely criticized. Discussion on the proposals themselves was locked within a day for being "too heated," despite involving only a handful of people. This moderator action itself seems an example of the new policy, letting feelings dominate over reality: after proposing a controversial change, maintainers plug their ears because they do not wish to hear opposing views, even before those views are uttered in full.

A man kneeling and placing a laurel branch upon a pile of burning books

Marco Dente (ca. 1515-1527)

Harassment Policy

Way back in 2013, something similar happened at the PyCon conference in the notorious DongleGate incident. After overhearing a joke between two men seated in the audience, activist Adria Richards decided to take the offenders' picture and post it on Twitter. She was widely praised in media for doing so, and it resulted in the loss of the jokester's job.

What was crucial to notice, and what many people missed, was that "harassing photography" was explicitly against the conference's anti-harassment policy. By any reasonable interpretation of the rules, Richards was the harasser, wielding social media as a weapon of intimidation. She should've been sanctioned and told in no uncertain terms that such behavior was not welcome.

Of course, that did not happen. Citing concerns about women in tech, she appealed exactly to those "protected classes" to justify her behavior. She cast herself in the role of defender of women, while engaging in an unquestionable attack.

It's easy to show that this was not motivated by fairness or equality: had the joke been made by a woman instead, Richards wouldn't have been able to make the same argument. The accusation of sexism seemed to derive from the sexual innuendo in the joke, an assumed male-only trait. Indeed, the only reason it worked was because of her own sexism: she assumed that when one man makes a joke, he is an avatar of oppression by men in the entire industry. She treated him differently because of his sex, so her accusation of sexism was a cover for her own.

Even more ridiculous was that her actual job was "Developer Relations." She was supposedly tasked with improving relations with and between developers, but did the exact opposite, creating a scandal that would resonate for years. What it really showed was that she was volatile and a liability for any company that would hire her in this role.

Somehow, this all went unnoticed. Nobody involved seemed to actually think it through. The entire story ran purely on hurt feelings, narrating the entire experience from one person's subjective point of view. This is now a common thread in many environments that are supposed to be professional: the people in charge have no idea how to keep their own members in check, and allow them to hijack everyone's resources and time for grievances and external drama.

As a rare counter-example, consider crypto exchange Coinbase. They explicitly went against the grain a year ago by announcing they were a mission-focused company that would concentrate its efforts on its actual core competence. Today, things are looking much brighter for them, as the negative response and doom-saying in the media turned out to be entirely irrelevant. On the inside, the reaction was mostly positive. The employees who left in anger were eventually replaced with a group of equally diverse people.

The School of Athens

The School of Athens - Raphael (1508)


Professionalism seems to be a concept that is very poorly understood. In the direct sense, it's a set of policies and strategies that allow people with wildly different interests to come together and get productive work done regardless.

In a world where many people wish to bring "their entire selves to work," this can't happen. If it's more important to keep everyone's feelings in check, and less important to actually deliver results, then there's no room for fixing mistakes. It creates an environment where pointing out problems is considered an unwelcome insensitivity, to which the response is to gang up on the messenger and shoot them for being abusive.

The most common strategy is simply to shame people into silence. If that doesn't work, their objections are censored out of sight, and then reframed as bigotry if anyone asks. The narrative machine will spin up again, using emotionally charged terms such as "harassment" and "sexism."

The idea of "victim blaming" is particularly pernicious here: any time someone invokes it, without knowing all the details, they must have pre-assumed they know who is the victim and who is the offender. This is where the concept of "protected classes" comes into play again.

While it's supposed to mean that we cannot discriminate e.g. on the basis of sex, what it means in practice is that one assumes automatically that men are the offenders and that women are being victimized. Even if it's the other way around. Indeed, such a model is the cornerstone of intersectionality, a social theory which teaches that on every demographic axis, one can identify exclusive categories of oppressors and the oppressed. White oppresses black, straight oppresses gay, cis oppresses trans, and so on.

If you engage such bigoteers in debate, the experience is pretty much like talking to a brick wall. You are not speaking to someone who is interested in being correct, merely in remaining on the right side. This seems to be the axiom from which they start, and a core part of their self-image. If you insist on peeling off the fallacies and mistakes in reasoning, you only invoke more ire. Your line of reasoning is upsetting to them, and therefore, you are a bigot who needs to leave, or be forcefully expelled. In the name of tolerance, for the sake of diversity and inclusion, they flatten the actual complexities of life and become utterly intolerant and exclusionary.

It's no coincidence that these cultural flare ups first came to a head in environments like open source, where results speak the loudest. Or in STEM and video games, where merit reigns supreme. When faced with widespread competence, the incompetent resort to lesser weapons and begin to undermine social norms, to try and mend the gap between their self-image and what they are actually able to do.

* * *

Personally, I'm quite optimistic, because the game is now clearly visible. In their zeal for ideological purity, activists have blown straight past their own end zone. When they tell you they are no longer interested in tolerance, you should believe them. It represents a complete abandonment of the principles that allowed liberal society to grow and flourish.

That means tolerance now again belongs to the adults in the room, who are able to separate fact from fiction, and feelings from actual principled conviction. We can only hope these children finally learn.

When wireless devices act up, it is often hard to pinpoint the exact cause. Wireshark is a popular open source packet sniffer that runs on Windows, Linux and macOS. It has become a standard tool in every network administrator's toolbox. You can use it to examine both Wi-Fi and Ethernet traffic, for example to analyse network problems.

With an nRF52840 Dongle from Nordic Semiconductor and a plug-in for Wireshark, you can also pluck Bluetooth Low Energy, Zigbee and Thread traffic out of the air. For PCM I described the whole procedure for sniffing BLE and Zigbee.


The firmware is also compatible with April Brother's April USB Dongle 52840. Its external antenna makes a big difference in range compared with the PCB antenna of the nRF52840 Dongle.

Nordic Semiconductor offers the firmware for the nRF52840 Dongle and the accompanying Wireshark plug-ins here:

I use the nRF Sniffer for Bluetooth LE all the time to debug BLE devices. Wireshark's display filters make it easy to filter on specific types of BLE packets. For example, this filters on iBeacon packets:

(btcommon.eir_ad.entry.company_id == 0x004c) && (btcommon.eir_ad.entry.data[:2] == 02:15)

That limits the displayed packets to those with manufacturer-specific data from company ID 0x004c (Apple) and whose first two bytes equal 0x0215. 1

But how do you arrive at that display filter? To filter on manufacturer-specific data with company ID 0x004c in Wireshark, simply click the Company ID field in the packet details pane of an iBeacon packet, right-click, then choose Apply as Filter and then Selected. That adds a display filter for all packets with the selected value for the company ID.

The extra filter on the first two bytes is a bit more work if you don't know the syntax. Just select the entire Data field in the packet details pane of an iBeacon packet, right-click, then choose Apply as Filter and then ... and Selected. That adds this filter as an extra requirement on top of the filter already in place. But now you are filtering on all packets with exactly the same data as the selected packet: btcommon.eir_ad.entry.data == 02:15:18:ee:15:16:01:6b:4b:ec:ad:96:bc:b9:6d:16:6e:97:00:00:00:00:d8

To filter on just the first two bytes, append [:2] to the data field and compare it with the bytes 02:15.


Why 0x0215? You'll find this in the iBeacon specification. Apple uses a TLV (type-length-value) format in its manufacturer-specific data. The 0x02 stands for the iBeacon type, and the 0x15 is the length of the data that follows (21 in decimal: 16 bytes for the UUID, 2 for the major, 2 for the minor and 1 for the measured power).
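The TLV layout just described is easy to verify in code. Here is a minimal Python sketch of a parser (the function name and result layout are my own invention, not part of any BLE library), applied to the advertisement bytes from the filter example earlier:

```python
def parse_ibeacon(mfg_data: bytes):
    """Parse Apple manufacturer-specific data as an iBeacon frame.

    mfg_data is the payload following company ID 0x004c:
    type (0x02), length (0x15), a 16-byte UUID, a 2-byte major,
    a 2-byte minor and a signed 1-byte measured power.
    """
    if len(mfg_data) < 23 or mfg_data[0] != 0x02 or mfg_data[1] != 0x15:
        return None  # not an iBeacon frame
    return {
        "uuid": mfg_data[2:18].hex(),
        "major": int.from_bytes(mfg_data[18:20], "big"),
        "minor": int.from_bytes(mfg_data[20:22], "big"),
        "measured_power": int.from_bytes(mfg_data[22:23], "big", signed=True),
    }

# The data bytes from the display filter example:
frame = bytes.fromhex("021518ee1516016b4becad96bcb96d166e9700000000d8")
beacon = parse_ibeacon(frame)
# measured power 0xd8, read as a signed byte, is -40 dBm
```

The guard on the first two bytes is exactly what the `[:2] == 02:15` display filter checks in Wireshark.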

September 30, 2021

For the past two years I’ve been working on something less visible but no less important.

Since DrupalCon Amsterdam 2019 (an actual in-person conference — sounds surreal in 2021, doesn’t it?!) I’ve been working on Acquia Migrate Accelerate, or “AMA” for short. In a few days, another DrupalCon Europe is starting … so perfect timing for a recap! :D


Drupal 8 comes with an awesome migration system built in, originating in the Drupal 7 Migrate module. It standardized many migration best practices. But it still required a huge time investment to learn it.

Of course, there’s the “Migrate Drupal UI” (migrate_drupal_ui) module in Drupal core. But that does not allow for granular migrations. It allows for a one-shot migration: you see which things will be migrated and which won’t. You can click a button and hope for the best. It only works for the very simplest of sites. It is impressively minimal in terms of the code it needs, but unfortunately it also means one pretty much needs to be a expert in migrations to use it successfully.

It will be of little help as soon as you run into important data you want to migrate for which no migration path exists.

See Mauricio Dinarte’s excellent “31 days of Drupal migrations”. In those 31 blog posts, you’ll get to know and appreciate the migration system (I sure did!). Unfortunately, that still won’t fully prepare you: you’ll need to decipher/reverse engineer the intricacies of how the data gets stored in Drupal 7 with its entities, revisions and fields — and with each field type having its own intricacies — and map that to the Drupal 9 equivalents.

And how does one migrate straight from Drupal 7 with its more fragmented ecosystem? 1

For example: media handling. There are easily a dozen approaches possible in Drupal 7. Each in use on tens of thousands of sites. In Drupal 8 & 9, everything has standardized on Media and Media Library. But how do you get your Drupal 7 site’s content in there?

Another example: location data. location was very popular but is now dead. geofield was equally popular and is still alive. geolocation was less popular then but is more popular now. addressfield was popular; address is its successor. None of the Drupal 9 modules offer Location’s feature set. How do you migrate this data?


The goal for AMA (the vision of and especially!) is to empower the non-technical user to be able to perform migrations. A UI that is aimed at the site builder POV: one should be able to select which content types (also vocabularies, menus, et cetera) make sense to migrate, and then not have to bother with technical details such as “migration plugins” or YAML files.

Acquia Migrate Accelerate:

For example, AMA shows just “Page” in the UI. Under the hood (and you can see this in the UI too, but it’s just not prominent), that corresponds to the following migration plugin definitions:

  • d7_node_type:page
  • d7_field_instance:node:page
  • d7_field_formatter_settings:node:page
  • d7_field_instance_widget_settings:node:page
  • d7_node_complete:page
  • d7_url_alias:node:page
  • node_translation_menu_links:node:page

In other words: the supporting configuration for nodes of the page bundle (the first four), then all actual entity/field data (d7_node_complete:page), followed by URL aliases and menu links referencing pages.

However, to be able to do this, we need many more migrations in Drupal core to be derived: view modes, fields, formatters and widgets should all have an entity type + bundle-specific derivative. That’d allow each bundle to be migrated individually, which enables the site builder to check that their pages and everything related to them have been correctly migrated before moving on to the next data concept to migrate. So far we’ve not yet been able to convince the migration system maintainers of the value of this. 2
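To make the naming pattern concrete: for the node/page case listed earlier, the per-bundle plugin IDs can be composed mechanically from a base plugin ID, an entity type and a bundle. This is a hypothetical illustration in Python, not an actual Drupal or AMA API:

```python
# Hypothetical sketch of how per-bundle migration plugin IDs compose from
# a base plugin ID, an entity type and a bundle (node/page case only);
# this helper does not exist in Drupal, it just mirrors the naming pattern.
def derive_migration_ids(entity_type: str, bundle: str) -> list[str]:
    config_plugins = [
        "d7_field_instance",
        "d7_field_formatter_settings",
        "d7_field_instance_widget_settings",
    ]
    ids = [f"d7_{entity_type}_type:{bundle}"]  # the bundle's own config
    ids += [f"{p}:{entity_type}:{bundle}" for p in config_plugins]
    ids.append(f"d7_{entity_type}_complete:{bundle}")  # entity/field data
    ids.append(f"d7_url_alias:{entity_type}:{bundle}")
    return ids
```

Calling `derive_migration_ids("node", "page")` yields the IDs from the list earlier, minus the menu-link migration, which follows its own naming scheme.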

(AMA does many more things, but that’s not in scope of this blog post.)

Closed & Open

Acquia understandably wants its own customers, and not its competitors’, to be able to use AMA. Like all Drupal modules, the AMA module is GPL 2+ licensed. Only the React UI is closed source. The automated recommendations engine is closed source. Obviously the code to spin up AMA environments in Acquia Cloud is closed source 3.

But … all of the work that goes into making migrations reliable is open source. At the time of writing, we have ~100 unique patches that are being applied, 39 of which to Drupal core! While I was writing this, got committed, plus a few were committed recently but did not yet ship in a 9.2.x point release, so soon that number will be lower :)

An overview

In the past 20 months we’ve hardened many migrations, and created new ones from scratch. Thousands of Drupal sites have already benefited — many more than there are Acquia customers.

The highlights:

Overall, 29 Drupal core patches 4 and 18 Drupal contrib patches have been committed! Plus another 36 core patches 5 and 34 contrib patches are successfully being used, and will hopefully land in the near future. (Not counting commits to the migration modules we now (co-)maintain.) Many dozens of migration paths from Drupal 7 have been stabilized, especially by

A comprehensive overview (all patches are uncommitted unless stated otherwise):

D7D9 for all

We aim to continue to do to the work to get patches committed: address feedback, add test coverage, and so on. We want to help everyone migrate from Drupal 7 to 9!


These many hardened migrations are thanks to the titanic work of:

If you found this interesting, check out Gabe’s write-up of the application architecture that powers the awesome React-based UI that Peter built.

  1. Some would say richer↩︎

  2. It also implicitly reveals one of the most fundamental assumptions in the migration system: that only the final state after running all migrations matters. For developers who know both Drupal 7 and 9’s data models really well, this may be fine. But for a non-expert, it’d be simpler if they were able to migrate the entities of each entity type+bundle and then inspect the results, not to mention that it’d take less time to gain some confidence in the migration! For example, first the “tags” taxonomy terms, then the “image” media items, then the “blog post” content items. Verifying the correct migration of each of those clusters of data is conceptually simpler. Site builders, if you want this, please leave a comment in↩︎

  3. Acquia Cloud handles the creation of an ephemeral Drupal 9 migration environment, with a Drupal 9 site automatically generated, with all equivalent D9 modules pre-composer-required, and all modules with a vetted migration path pre-installed. For Acquia the value is obvious: its customers are more likely to successfully migrate to Drupal 9 sooner, and the customer is more likely to stay a customer. We’ve currently got over 100 customers using AMA. ↩︎

  4. Committed core patches: #3096676, #2814953 (this one has the majority of the work done by the community!), #3122056, #3126063, #3126063, #2834958 (from those first 14), #3152789, #3151980, #3151993, #3153791, #2925899, #3165944, #3176394, #3178966, #3187320, #3187415, #3187418, #3187463, #3189463, #3187263 (from 2020), #3190815, #3190818, #3191490, #3097312, #3212539, #3213616, #3224620, #3227549, #3085192 (from 2021). ↩︎

  5. We started out with 14 core patches. Of those, #3115073, #3122649, #3096972, #3108302, #3097336, #3115938, #3123775 still remain. Other core patches we’ve contributed in 2020 that are not yet committed: #2845340, #3151979, #3051251, #3154156, #3156083, #3156730, #3156733, #3165813, #3166930, #3167267, #3186449, #3187334, #3187419, #3187474, #3187616. And those in 2021: #2859314, #3200949, #3204343, #3198732, #3204212, #3202462, #3118262, #3213636, #3218294, #3219078, #3219140, #3226744, #3227361, #3227660 ↩︎

  6. Added October 1, 2021. ↩︎ ↩︎

September 28, 2021

Not that long ago, a vulnerability was found in Microsoft Azure Cosmos DB, a NoSQL SaaS database within the Microsoft Azure cloud. The vulnerability, dubbed ChaosDB by the Wiz Research Team, exploits a flaw or misconfiguration in the Jupyter Notebook feature within Cosmos DB. It allowed an attacker to gain access to other customers' Cosmos DB credentials. Not long thereafter, a second vulnerability, dubbed OMIGOD, showed that cloud security is not as simple as some vendors would like you to believe.

These vulnerabilities are a good example of how scale is a cloud threat. Companies that do not have enough experience with public cloud might not account for this in their threat models.

September 27, 2021

SReview, the video review and transcode tool that I originally wrote for FOSDEM 2017 but which has since been used for debconfs and minidebconfs as well, has long had a sizeable component for inspecting media files with ffprobe, and generating ffmpeg command lines to convert media files from one format to another.

This component, SReview::Video (plus a number of supporting modules), is really not tied very much to the SReview webinterface or the transcoding backend. That is, the webinterface and the transcoding backend obviously use the ffmpeg handling library, but they don't provide any services that SReview::Video could not live without. It did use the configuration API that I wrote for SReview, but disentangling that turned out to be very easy.
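The module itself is Perl, and its API is not reproduced here. Purely to illustrate the kind of work such a media-handling layer does (inspect a file with ffprobe, then build an ffmpeg command line for a conversion), here is a rough Python sketch with made-up helper names:

```python
import json
import subprocess


def probe(path: str) -> dict:
    """Inspect a media file with ffprobe and return its metadata as a dict."""
    out = subprocess.run(
        ["ffprobe", "-v", "quiet", "-print_format", "json",
         "-show_format", "-show_streams", path],
        check=True, capture_output=True).stdout
    return json.loads(out)


def transcode_command(src: str, dst: str,
                      vcodec: str = "libx264",
                      acodec: str = "aac") -> list[str]:
    """Build (but do not run) an ffmpeg command line for a simple transcode."""
    return ["ffmpeg", "-loglevel", "error", "-i", src,
            "-c:v", vcodec, "-c:a", acodec, "-y", dst]
```

A library like Media::Convert wraps this pattern behind a proper object API, so callers never assemble ffmpeg argument lists by hand.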

As I think SReview::Video is actually an easy to use, flexible API, I decided to refactor it into Media::Convert, and have just uploaded the latter to CPAN itself.

The intent is to refactor the SReview webinterface and transcoding backend so that they will also use Media::Convert instead of SReview::Video in the near future -- otherwise I would end up maintaining everything twice, and then what's the point. This hasn't happened yet, but it will soon (this shouldn't be too difficult after all).

Unfortunately Media::Convert doesn't currently install cleanly from CPAN, since I made it depend on Alien::ffmpeg which currently doesn't work (I'm in communication with the Alien::ffmpeg maintainer in order to get that resolved), so if you want to try it out you'll have to do a few steps manually.

I'll upload it to Debian soon, too.

September 24, 2021

I published the following diary on “Keep an Eye on Your Users Mobile Devices (Simple Inventory)“:

Today, smartphones are everywhere and have become our best friends for many tasks. Probably your users already access their corporate mailbox via a mobile device. If it’s not yet the case, you probably have many requests to implement this. There are two ways to achieve this: you provide corporate devices to all users. From a risk perspective, it’s the best solution: you select the models and control them. But it’s very expensive and people don’t like to carry two devices (a personal and a corporate one). Fortunately, if you use a Microsoft Exchange platform, there are ways to authorize personal devices to access corporate emails with a software component called ActiveSync. ActiveSync allows deploying basic security policies like forcing the device to be locked with a password, forcing a minimum password length, etc. However, it’s not a real MDM (“Mobile Device Management”)… [Read more]

The post [SANS ISC] Keep an Eye on Your Users Mobile Devices (Simple Inventory) appeared first on /dev/random.

September 23, 2021

I published the following diary on “Excel Recipe: Some VBA Code with a Touch of Excel4 Macro“:

Microsoft Excel supports two types of macros. The legacy format is known as “Excel4 macro” and the new one (but already used for a while) is based on VBA. We have already covered both formats in many diaries. Yesterday, I spotted an interesting sample that implements… both!

The malicious file was delivered through a classic phishing email and is called “Document_195004540-Copy.xls” (SHA256:4f4e67dccb3dfc213fac91d34d53d83be9b9f97c0b75fbbce8a6d24f26549e14). The file is unknown on VT at this time. It looks like a classic trap… [Read more]

The post [SANS ISC] Excel Recipe: Some VBA Code with a Touch of Excel4 Macro appeared first on /dev/random.

September 20, 2021

You can now preorder the audiobook version of Printeurs and give your opinion on which voice to choose. Which gets me thinking about voice, noise, marketing, crowdfunding and floods…

My novel Printeurs is finding its voice and will soon be produced as an audiobook. A format I'm not familiar with at all (I'm a visual reader), but whose result I look forward to hearing. I'm curious, too, about what heavy audiobook listeners think makes a "good" audiobook. What do you like? What should we watch out for? And what makes you stop listening every single time?

To finance this venture, my publisher has set up a crowdfunding campaign during which you can preorder the audio version of Printeurs. You will even be able to give your opinion on a shortlist of voices. I'm really curious to read what fans of the format think.

Preorder the audio version of Printeurs:
Vote for your favourite voice (link for backers only):
Technical notes on the audio adaptation:

Voice

Voice is a peculiar medium. When we speak, charisma and intonation often matter more than the content itself. Inconsistencies are smoothed over by the rhythm. One example among many: I was recently interviewed by Valentin Demé for the Cryptoast podcast, talking about monopolies and the blockchain.

For an hour, I talk and let my ideas wander. Ideas far less formed than what I usually write: intuitions, explorations. Judging by the reactions, what I say seems interesting. But keep in mind that, except for a fully prepared speech (a lecture, for example), spoken information is far more haphazard and should always be taken with a grain of salt. Paradoxically, the voice is more convincing even though it is less rigorous. We learn and think through books; we are persuaded by speeches. Politics is a matter of voice. Science is a matter of writing.

Ploum on Cryptoast:

The crowdfunding in question

This crowdfunding campaign is not only about Printeurs. It is above all a campaign covering all the new titles in the Ludomire SFFF collection, notably the four-volume paper edition of Thierry Crouzet's One Minute. One Minute is an SF novel that takes place during… one single minute, as the title says. Each of its 365 chapters lasts… one minute. I greatly enjoyed the Wattpad version and I look forward to reading this fully reworked edition.

More advertising for a crowdfunding campaign? As enthusiastic as I am about the content, I do understand your weariness.

The Printeurs crowdfunding campaign left me with a rather bitter memory. True, it was an incredible success (thanks to you, my readers), but I felt like I was endlessly spamming my networks. Producing the very noise I fight so hard against. I came out of it drained, and so did those who follow me. The problem, as my publisher pointed out, is that spam… works!

These campaigns are now far more numerous. You have to stand out, to professionalize. In short, marketing becomes essential again, whereas, in my mind, one of the original goals of crowdfunding was to skip that step. Ironically, the marketing now focuses not on the product itself but on promoting… the funding campaign! A method that is supposed to bring creator and consumer closer together paradoxically drives them apart.

It's a question that Lionel, my publisher, is asking himself as well. How do you get known and funded without sinking into spam? Thierry himself confided to me that he has no desire whatsoever to promote the campaign tied to his novel's publication.

The Ludomire 2021 campaign:
Crouzet on One Minute:
Thoughts on crowdfunding:

Pay-what-you-want pricing?

The problem is not limited to crowdfunding. Pay-what-you-want pricing ("prix libre") is affected too. A few years ago I was one of the francophone pioneers of pay-what-you-want on the web, with a provocatively titled post: "Ce blog est payant !" ("This blog costs money!"). Clearly the concept has become widely popular, to the point of having its own Wikipedia page.

A little too popular, perhaps. Pay-what-you-want is now everywhere and, as if by magic, has consolidated onto a few centralized platforms. Alias writes precisely about his misgivings regarding Tipeee, a platform I also left.

There is an undeniable fatigue among the public: we are solicited all the time to fund every imaginable project, from revolutionary connected knitting needles to flower pots along the streets of our neighbourhood. Beyond the money itself, you have to juggle the various platforms and amounts, recurring or not. I also get the feeling that it is always the same people who contribute to everything, and not necessarily the wealthiest.

From this I have derived a sort of general law of the Internet: every good idea is either not popular enough to be broadly useful, or so popular that it turns into a bad idea. Social networks and mobility are the most striking illustrations. Is pay-what-you-want pricing going down the same road?

Are the alternatives we build appealing only because they are, precisely, alternatives? Doesn't success inevitably bring inexorable excess? I'm thinking, for example, of the minimalist Gemini network I've told you about.

Pay-what-you-want pricing on Wikipedia:
Alias leaves Tipeee:
The Tipeee drama (gemini link): gemini://

The suspended book and the floods

Faced with all this, I decided to remove every appeal for donations from my blog and to encourage buying books instead. I find books to be among the most symbolic objects of humanity. A book is never useless. It can sleep for years, even centuries, on a shelf before being reborn to light up a day or a life. The book, including in electronic form, is the gift par excellence: a world to discover, an object to pass on, intellectual explorations to share, in the present and the future.

Buy my books:

A paper book knows only two dangers: fire and water. Sadly, the latter is what struck my country this summer. While I was not personally affected, my town (Ottignies) was, and above all the region my wife, my parents and my ancestors come from (the Vesdre valley).

If you lost your library in the floods, or know someone who did, send me a note and I will get a copy of Printeurs to you. I also have several books from the Ludomire collection that I will gladly send to libraries trying to rebuild. Don't hesitate to get in touch, or to act as a go-between for people it might bring a small smile to. That is always welcome in this difficult period of reconstruction, when life, like the Vesdre, seems to have resumed its normal course. Except for those who lost everything, who live in the damp, who are fed by the Red Cross, and whose hearts clench with dread at every somewhat heavy rain.

I still have a few copies of "Les aventures d'Aristide, le lapin cosmonaute". They are normally for sale, but I will happily give them to families with children (ideally aged 5 to 9) who are short of books, whether because of the floods or for reasons that are none of my business.

Send me an email saying which book you would like (or both) at the address suspendu at

Happy reading and happy listening!

Forget social networks for a moment and subscribe by email or RSS (at most 2 posts per week and nothing else). Latest book: Printeurs, a cyberpunk thriller. To support the author, give and share his books.

This text is published under the CC-By BE license.

September 19, 2021


I wrote a few articles:

On my blog: how to use cloud images with cloud-init in a “non-cloud” environment.

I finally took the time to create an Ansible role for it. You’ll find it below.

Virt_install_vm 1.0.0 is available at:

Have fun!

Ansible Role: virt_install_vm

An Ansible role to install a libvirt virtual machine with virt-install and cloud-init. It is “designed” to be flexible.

An example template is provided to set up a Debian system.


The role is a wrapper around the following roles:

Install the required roles with

$ ansible-galaxy install -r requirements.yml

This will install the latest default branch releases.

Or follow the installation instruction for each role on Ansible Galaxy.
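The actual list of required roles did not survive in this copy of the post; judging from the variables used below (qemu_img, cloud_localds), they live in the same stafwag namespace. A requirements.yml might therefore look like this, with the role names being an assumption rather than taken from the post:

```yaml
# Hypothetical requirements.yml; the stafwag.* role names are inferred
# from the variable namespaces used later in the post.
- src: stafwag.qemu_img
- src: stafwag.cloud_localds
```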

Supported GNU/Linux Distributions

It should work on most GNU/Linux distributions. cloud-localds is required; it was available on CentOS/RedHat 7 but not on RedHat 8, so you’ll need to install it manually to use this role on CentOS/RedHat 8.

  • Archlinux
  • Debian
  • Centos 7
  • RedHat 7
  • Ubuntu

Role Variables and templates


See the documentation of the roles in the Requirements section.

  • virt_install_vm: “namespace”

    • skip_if_deployed: boolean default: false.

                            When true:
                              Skip role if the VM is already deployed. The role will exit successfully.
                            When false:
                              The role will exit with an error if the VM is already deployed.


  • templates/simple_debian: Example template to create a Debian virtual machine.

This template uses cloud_localds.cloudinfo to configure the cloud-init user-data.

See the Usage section for an example.
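As a sketch of how the skip_if_deployed switch described above could be set, assuming the role reads it from the virt_install_vm namespace:

```yaml
# Illustrative vars snippet: exit successfully instead of failing
# when the virtual machine already exists.
virt_install_vm:
  skip_if_deployed: true
```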


Create a virtual machine template

This is a file with the role variables to set up a virtual machine, holding all the settings the virtual machines have in common. In this example, vm.hostname and vm.ip_address can be configured for each virtual machine.

  • debian_vm_template.yml:

    qemu_img:
      dest: "/var/lib/libvirt/images/{{ vm.hostname }}.qcow2"
      format: qcow2
      src: /Downloads/isos/debian/cloud/debian-10-generic-amd64.qcow2
      size: "50G"
      owner: root
      group: kvm
      mode: 660
    cloud_localds:
      dest: "/var/lib/libvirt/images/{{ vm.hostname }}_cloudinit.iso"
      config_template: "templates/simple_debian/debian.j2"
      network_config_template: "templates/simple_debian/debian_netconfig.j2"
      cloudinfo:
        users:
          - name: ansible
            passwd: ""
            ssh_authorized_keys:
              - ""
        disable_cloud_init: true
    virt_install_vm:
      wait: 0
      name: "{{ vm.hostname }}"
      os_type: Linux
      os_variant: debian10
      network: network:default
      graphics: spice
      disks:
        - "/var/lib/libvirt/images/{{ vm.hostname }}.qcow2,device=disk"
        - "/var/lib/libvirt/images/{{ vm.hostname }}_cloudinit.iso,device=cdrom"

Playbook to set up a virtual machine:

- name: Install tstdebian2
  hosts: kvmhost
  become: true
  pre_tasks:
    - name: Load the vm template
      include_vars: debian_vm_template.yml
    - name: display qemu_img
      debug:
        msg:
          - "qemu_img: "
  roles:
    - stafwag.virt_install_vm

September 17, 2021

I published the following diary on “Malicious Calendar Subscriptions Are Back?“:

Did this threat really disappear? This isn’t a brand new technique to deliver malicious content to mobile devices but it seems that attackers started new waves of spam campaigns based on malicious calendar subscriptions. Being a dad, you can imagine that I always performed security awareness with my daughters. Since they use computers and the Internet, my message was always the same: “Don’t be afraid to ask me, there are no stupid questions or shame if you think you did something wrong”… [Read more]

The post [SANS ISC] Malicious Calendar Subscriptions Are Back? appeared first on /dev/random.

September 15, 2021

The VRT radio streams, such as this one (, have recently taken to playing ads when you start them. Not always, but regularly.

Just now (21:00 and a few seconds) I wanted to listen to the news, but an ad started playing. I reloaded the stream and the ad started again (while the news was already underway, so I missed it).

A few days ago I had the same thing with the Radio 1 stream: ads every time you start the stream. This is annoying.

So it's time to adapt the script and mute the first 60 seconds of the stream. The cron job that plays the news can then start a minute earlier.


UPDATE 2021-09-15: this works


export HOME=/var/www

# kill any mplayer instance that is still running
pkill mplayer

# start the stream muted, with a slave-mode command pipe at /var/www/master
mplayer -volume 0 -slave -input file=/var/www/master &

# wait until the ads are over
sleep 60

# unmute: set the volume to 100 (the trailing 1 means "absolute")
echo volume 100 1 > /var/www/master
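The cron change mentioned above could then look something like this; the script path and schedule are illustrative, not from the post:

```shell
# Hypothetical crontab entry: start the stream script (muted for its
# first 60 seconds) at minute 59, one minute before the news.
59 * * * * /var/www/playstream.sh
```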

Naming conventions. Picking the right naming convention is easy if you are all by yourself, but hard when you need to agree upon the conventions in a larger group. Everybody has an opinion on naming conventions, and once you decide on it, you do expect everybody to follow through on it.

Let's consider why naming conventions are (not) important and consider a few examples to help in creating a good naming convention yourself.

My laptop is a 2011 MacBook Air. I’m not a huge Apple fan, it’s just that at the time it had the most interesting hardware features compared to similar laptops. And it’s quite sturdy, so that’s nice.

Over the years I have experimented with installing Linux alongside the OS X operating system, but in the end I settled on installing my favorite Linux tools inside OS X using Homebrew, because having two different operating systems on one laptop was Too Much Effort™. In recent times Apple has decided, in its infinite wisdom (no sarcasm at all *cough*), that it will no longer provide operating system upgrades for older hardware. Okay, then. Lately the laptop had become slow as molasses anyway, so I decided to replace OS X entirely with Ubuntu. No more half measures! I chose 20.04 LTS for the laptop because reasons. 🙂

The laptop was really slow…

According to the Ubuntu Community Help Wiki, all hardware should be supported, except Thunderbolt. I don’t use anything Thunderbolt, so that’s OK for me. The installation was pretty straightforward: I just created a bootable USB stick and powered on the Mac with the Option/Alt (⌥) key pressed. Choose EFI Boot in the Startup Manager, and from there on it’s all a typical Ubuntu installation.

Startup Manager

I did not bother with any of the customizations described on the Ubuntu Wiki, because everything worked straight out of the box, and besides, the wiki is terribly outdated anyway.

The end result? I now have a laptop that feels snappy again, and that still gets updates for the operating system and the installed applications. And it’s my familiar Linux. What’s next? I’m thinking about using Ansible to configure the laptop.

To finish, I want to show you my sticker collection on the laptop. There’s still room for a lot more!

sticker collection on my laptop. Photo copyright: me.

The post Installing Ubuntu 20.04 LTS on 2011 MacBook Air appeared first on

September 10, 2021

Cover Image

Cultural Assimilation, Theory vs Practice

The other day, I read the following, shared 22,000+ times on social media:

"Broken English is a sign of courage and intelligence, and it would be nice if more people remembered that when interacting with immigrants and refugees."

This resonates with me, as I spent 10 years living on the other side of the world. Eventually I lost my accent in English, which took conscious effort and practice. These days I live in a majority French city and neighborhood, as a native Dutch speaker. When I need to call a plumber, I first have to go look up the words for "drainage pipe." When my barber asks me what kind of cut I want, it mostly involves gesturing and "short".

This is why I am baffled by the follow-up, by the same person:

"Thanks to everyone commenting on the use of 'broken' to describe language. You're right. It is problematic. I'll use 'beginner' from now on."

It's not difficult to imagine the pile-on that must've happened for the author to add this note. What is difficult to imagine is that anyone who raised the objection has actually ever thought about it.



Consider what this situation looks like to an actual foreigner who is learning English and trying to speak it. While being ostensibly lauded for their courage, they are simultaneously shown that the English language is a minefield where an expression as plain as "broken English" is considered a faux pas, enough to warrant a public correction and apology.

To stay in people's good graces, you must speak English not as the dictionary teaches you, but according to the whims and fashions of a highly volatile and easily triggered mass. They effectively demand you speak a particular dialect, one which mostly matches the sensibilities of the wealthier, urban parts of coastal America. This is an incredibly provincial perspective.

The objection relies purely on the perception that "broken" is a word with a negative connotation. It ignores the obvious fact that people who speak a language poorly do so in a broken way: they speak with interruptions, struggling to find words, and will likely say things they don't quite mean. The dialect demands that you pretend this isn't so, by never mentioning it directly.

But in order to recognize the courage and intelligence of someone speaking a foreign language, you must be able to see past such connotations. You must ignore the apparent subtleties of the words, and try to deduce the intended meaning of the message. Therefore, the entire sentiment is self-defeating. It fell on such deaf ears that even the author seemingly missed the point. One must conclude that they don't actually interact with foreigners much, at least not ones who speak broken English.

The sentiment is a good example of what is often called a luxury belief: a conviction that doesn't serve the less fortunate or abled people it claims to support. Often the opposite. It merely helps privileged, upper-class people feel better about themselves, by demonstrating to everyone how sophisticated they are. That is, people who will never interact with immigrants or refugees unless they are already well integrated and wealthy enough.

By labeling it as "beginner English," they effectively demand an affirmation that the way a foreigner speaks is only temporary, that it will get better over time. But I can tell you, this isn't done out of charity. Because I have experienced the transition from speaking like a foreigner to speaking like one of them. People treat you and your ideas differently. In some ways, they cut you less slack. In other ways, it's only then that they finally start to take you seriously.

Let me illustrate this with an example that sophisticates will surely be allergic to. One time, while at a bar, when I still had my accent, I attempted to colloquially use a particular word. That word is "nigga." With an "a" at the end. In response, there was a proverbial record scratch, and my companions patiently and carefully explained to me that that was a word that polite people do not use.

No shit, Sherlock. You live on a continent that exports metric tons of gangsta rap. We can all hear and see it. It's really not difficult to understand the particular rules. Bitch, did I stutter?

Even though I had plenty of awareness of the linguistic sensitivities they were beholden to, in that moment, they treated me like an idiot, while playing the role of a more sophisticated adult. They saw themselves as empathetic and concerned, but actually demonstrated they didn't take me fully seriously. Not like one of them at all.

If you want people's unconditional respect, here's what did work for me: you go toe-to-toe with someone's alcoholic wine aunt at a party, as she tries to degrade you and your friend, who is the host. You effortlessly spit back fire in her own tongue and get the crowd on your side. Then you casually let them know you're not even one of them, not one bit. Jawdrops guaranteed.

This is what peak assimilation actually looks like.

Ethnic food

The Ethnic Aisle

In a similar vein, consider the following, from NYT Food:

"Why do American grocery stores still have an ethnic aisle?"

The writer laments the existence of segregated foods in stores, and questions their utility. "Ethnic food" is a meaningless term, we are told, because everyone has an ethnicity. Such aisles even personify a legacy of white supremacy and colonialism. They are an anachronism which must be dismantled and eliminated wholesale, though it "may not be easy or even all that popular."

We do get other perspectives: shop owners simply put products where their customers are most likely to go look for them. Small brands tend to receive obscure placement, while larger brands get mixed in with the other foods, which is just how business goes. The ethnic aisle can also signal that the products are the undiluted original, rather than a version adapted to local palates. Some native shoppers explicitly go there to discover new ingredients or flavors, and find it convenient.

What's more, the point about colonialism seems to be entirely undercut by the mention of "American aisles" in other countries, containing e.g. peanut butter, BBQ sauce and boxed cake mix. It cannot be colonialism on "our" part both when "we" import "their" products and when "they" import "ours". That's just called trade.

Along the way, the article namedrops the exotic ingredients and foreign brands that apparently should just be mixed in with the rest: cassava flour, pomegranate molasses, dal makhani, jollof rice seasoning, and so on. We are introduced to a whole cast of business owners "of color," with foreign-sounding names. We are told about the "desire for more nuanced storytelling," including two sisters who bypassed stores entirely by selling online, while mocking ethnic aisles on TikTok. Which we all know is the most nuanced of places.

I find the whole thing preposterous. In order to even consider the premise, you already have to live in an incredibly diverse, cosmopolitan city. You need to have convenient access to products imported from around the world. This is an enormous luxury, enabled by global peace and prosperity, as well as long-haul and just-in-time logistics. There, you can open an app on your phone and have top-notch world cuisine delivered to your doorstep in half an hour.

For comparison, my parents are in their 70s and they first ate spaghetti as teenagers. Also, most people here still have no clue what to do with fish sauce other than throw it away as soon as possible, lest you spill any. This is fine. The expectation that every cuisine is equally commoditized in your local corner store is a huge sign of privilege, which reveals how provincial the premise truly is. It ignores that there are wide ranging differences between countries in what is standard in a grocery store, and what people know how to make at home.

Even chips flavors can differ wildly from country to country, from the very same multinational brands. Did you know paprika chips are the most common thing in some places, and not a hipster food?

paprika chips by lays

Crucially, in a different time, you could come up with the same complaints. In the past it would be about foods we now consider ordinary. In the future it would be about things we've never even heard of. While the story is presented as a current issue for the current times, there is nothing to actually support this.

To me, this ignorance is a feature, not a bug. The point of the article is apparently to waffle aimlessly while namedropping a lot of things the reader likely hasn't heard of. The main selling point is novelty, which paints the author and their audience as being particularly in-the-know. It lets them feel they are sophisticated because of the foods they cook and eat, as well as the people they know and the businesses they frequent. If you're not in this loop, you're supposed to feel unsophisticated and behind the times.

It's no coincidence that this is published in the New York Times. New Yorkers have a well-earned reputation for being oblivious about life outside their bubble: the city offers the sense that you can have access to anything, but its attention is almost always turned inwards. It's not hard to imagine why, given the astronomical cost of living: surely it must be worth it! And yes, I have in fact spent a fair amount of time there, working. It couldn't just be that life elsewhere is cheaper, safer, cleaner and friendlier. That you can reach an airport in less than 2 hours during rush hour. On a comfortable, modern train. Which doesn't look and smell like an ashtray that hasn't been emptied out since 1975.

But I digress.

"Ethnic aisles are meaningless because everyone has an ethnicity" is revealed to be a meaningless thought. It smacks headfirst into the reality of the food business, which is a lesson the article seems determined not to learn. When "diversity" turns out to mean that people are actually diverse, have different needs and wants, and don't all share the same point of view, they just think diversity is wrong, or at least, outmoded, a "necessary evil." Even if they have no real basis of comparison.

graffiti near school in New York

Negative Progress

I think both stories capture an underlying social affliction, which is about progress and progressivism.

The basic premise of progressivism is seemingly one of optimism: we aim to make the future better than today. But the way it often works is by painting the present as fundamentally flawed, and the past as irredeemable. The purpose of adopting progressive beliefs is then to escape these flaws yourself, at least temporarily. You make them other people's fault by calling for change, even demanding it.

What is particularly noticeable is that perceived infractions are often in defense of people who aren't actually present at all. The person making the complaint doesn't suffer any particular injury or slight, but others might, and this is enough to condemn in the name of progress. "If an [X] person saw that, they'd be upset, so how dare you?" In the story of "broken English," the original message doesn't actually refer to a specific person or incident. It's just a general thing we are supposed to collectively do. That the follow-up completely contradicts the premise, well, that apparently doesn't matter. In the case of the ethnic aisle, the contradictory evidence is only reluctantly acknowledged, and you get the impression they had hoped to write a very different story.

This too is a provincial belief masquerading as sophistication. It mashes together groups of people as if they all share the exact same beliefs, hang-ups and sensitivities. Even if individuals are all saying different things, there is an assumed archetype that overrules it all, and tells you what people really think and feel, or should feel.

To do this, you have to see entire groups as an "other," as people that are fundamentally less diverse, self-aware and curious than the group you're in. That they need you to stand up for them, that they can't do it themselves. It means that "inclusion" is often not about including other groups, but about dividing your own group, so you can exclude people from it. The "diversity" it seeks reeks of blandness and commodification.

In the short term it's a zero-sum game of mining status out of each other, but in the long run everyone loses, because it lets the most unimaginative, unworldly people set the agenda. The sense of sophistication that comes out of this is imaginary: it relies on imagining fault where there is none, and playing meaningless word games. It's not about what you say, but how you say it, and the rules change constantly. Better keep up.

Usually this is associated with a profound ignorance about the actual past. This too is a status-mining move, only against people who are long gone and can't defend themselves. Given how much harsher life was, with deadly diseases, war and famine regular occurrences, our ancestors had to be far smarter, stronger and more self-sufficient just to survive. They weren't less sophisticated, they came up with all the sophisticated things in the first place.

When it comes to the more recent past, you get the impression many people still think 1970 was 30, not 51 years ago. The idea that everyone was irredeemably sexist, racist and homophobic barely X years ago just doesn't hold up. Real friendships and relationships have always been able to transcend larger social matters. Vice versa, the idea that one day, everyone will be completely tolerant flies in the face of evidence and human nature. Especially the people who loudly say how tolerant they are: there are plenty of skeletons in those closets, you can be sure of that.

* * *

There's a Dutch expression that applies here: claiming to have invented hot water. To American readers, I gotta tell you: it really isn't hard to figure out that America is a society stratified by race, or exactly how. I figured that out the first time I visited in 2001. I hadn't even left the airport in Philadelphia when it occurred to me that every janitor I had seen was both black and morbidly obese. Completely unrelated, McDonald's was selling $1 cheeseburgers.

Later in the day, a black security guard had trouble reading an old-timey handwritten European passport. Is cursive racist? Or is American literacy abysmal because of fundamental problems in how school funding is tied to property taxes? You know this isn't a thing elsewhere, right?

In the 20 years since then, nothing substantial has improved on this front. Quite the opposite: many American schools and universities have abandoned their mission of teaching, in favor of pushing a particular worldview on their students, which leaves them ill-equipped to deal with the real world.

Ironically this has created a wave of actual American colonialism, transplanting the ideology of intersectionality onto other Western countries where it doesn't apply. Each country has their own long history of ethnic strife, with entirely different categories. The aristocrats who ruled my ancestors didn't even let them get educated in our own language. That was a right people had to fight for in the late 1960s. You want to tell me which words I should capitalize and which I shouldn't? Take a hike.

Not a year ago, someone trying to receive health care here in Dutch was called racist for it, by a French speaker. It should be obvious the person who did so was 100% projecting. I suspect insecurity: Dutch speakers are commonly multi-lingual, but French speakers are not. When you are surrounded by people who can speak your language, when you don't speak a word of theirs, the moron is you, but the ego likes to say otherwise. So you pretend yours is the sophisticated side.

All it takes to pierce this bubble is to actually put the platitudes and principles to the test. No wonder people are so terrified.

September 09, 2021

It's been long overdue, but Planet Grep now does the https dance (i.e., if you try to use an unencrypted connection, it will redirect you to https). Thank you letsencrypt!

I hadn't previously done this because some blogs that we carry might link to http-only images; but really, that shouldn't matter, and we can make Planet Grep itself be a https site even if some of the content is http-only.


September 08, 2021

That interdependence we try to forget in order to conceal the essential contribution of idleness and open-ended reflection.

In 2014, while I was talking a lot about pay-what-you-want pricing, I received a large payment from a reader. This reader thanked me because the ideas I was describing inspired him for his project of an online chess site. Six years later, one of my students chose that very software as the free software to present for his exam: Lichess. He described to me Lichess's free development model, its donation system and its pay-what-you-want pricing. Lichess is one of the largest chess sites in the world and is frequented by grandmasters such as Magnus Carlsen.

Beyond the immense pride of knowing that some of the seeds I sowed have contributed to magnificent forests, this anecdote above all illustrates a very important point that Randian ideology tries at all costs to conceal: success is not the property of an individual. An individual is never productive alone; nobody is "self-made", despite the image we like to give of billionaires. If Jeff Bezos's parents hadn't given him $300,000 while making him promise to find a real job once the $300,000 was spent, there would be no Amazon today. Each of us uses roads, means of communication, hospitals and schools, and benefits from intellectual exchanges provided by the community. The ideology of intellectual property and patents makes us believe there is a single inventor, a solitary genius who deserves to reap the fruit of his efforts. That is completely false. We depend on one another, and our successes are essentially opportunities, seized or not, that the community offers us.

Patents, moreover, are a gigantic intellectual scam. I experienced this myself in a fairly old article that got quite a lot of attention without ever meeting a rebuttal.

Patents which, incidentally, serve only the interests of the rich and powerful. Amazon, for example, has developed a technique for spotting what sells well on its site in order to copy it and make its own version. Even when there are patents. Because nobody has the resources to take on Amazon over a patent dispute.

Patents are a scam built on an entirely fictional concept: that of the solitary inventor. A fiction that denies the very idea of social interdependence.

A social interdependence whose essential contribution to individual productivity was illustrated by a geneticist, William Muir, who decided to select the hens that laid the most eggs in order to create a hyper-productive "super henhouse". The result was catastrophic. The hens that laid the most eggs within a henhouse were in fact the most aggressive ones, which prevented the others from laying. The super henhouse became a slaughterhouse that produced almost no eggs and in which most of the hens died!

The conclusion is simple: even the hens that lay few eggs play an essential role in the overall productivity of the community. The best henhouse is not made up of the best layers, quite the contrary.

Thanks to my readers' testimonies, I can state that my blog posts have an influence on the society I belong to. An influence I consider essentially positive, even very positive, by my own criteria. Lichess is a spectacular example, but I receive much more intimate emails along the same lines, and they touch me deeply (even if I have decided to stop replying to them systematically). So I can say that I am useful, at my own humble scale.

Over the course of my career, I cannot find a single example where my salaried work ever had the slightest impact or where my usefulness was demonstrated. Worse: I do not see a single positive impact from the entire companies I worked for. Being very optimistic, I can say that we improved the profitability of some of our clients. But that is not really a positive societal impact. And that gain is in any case drowned in a shambles of obscure projects and administrative procedures. For ten years, I was paid in super henhouses, in companies that are themselves in competition. For a result that was either nil or harmful to humanity and the planet, since it increased overall consumption.

By contrast, I can directly see the impact of the projects I contributed to without pay, notably free software projects. The developer Mike Williamson came to the same conclusion.

If you look up my name on Wikipedia, you will land on the page of a project to which I devoted several years of sleep without earning a single cent.

Basic income

Perhaps that is why basic income seems so essential to me. In 2013, I tried to convince you that basic income was a good idea and to get you to sign the petition to force the European institutions to study the question. Alas, the required number of signatures was not reached.

Eight years later, a new petition has just been launched. If you are a European citizen, I strongly invite you to sign it. It is very easy and very official. You have to enter your personal data, but not your email address. A minimum number of signatures must be obtained in every European country. Feel free to share it with your international contacts.

Observables

When someone talks to you about an individual's productivity or the merit of rich people, remember the story of the henhouses.

But for hens it's easy: you just count the eggs laid. The problem with modern capitalism is that we get the metrics wrong all the time. And if you use a bad metric, you optimize the whole system to produce bad results.

I have written at length about this paradigm of metrics, which I call "observables". I keep circling the same theme: we measure productivity in hours worked (since the average employee doesn't lay eggs), so we create hours of work, so jobs exist to fill as many hours as possible. This is what I call the principle of maximum inefficiency. In the end, we spend eight hours a day trying to burn the planet so that, once out of the office, we can afford organic vegetables and feel like we're saving that same planet.

Besides hours worked, there are other absurd metrics, such as clicks, page views and the like. The metrics of people who do marketing: make as much noise as possible! The marketing department is a bit like a super henhouse where all the loudest roosters have been put together. And then we're surprised not to get a single egg. But plenty of noise.

Absurd metrics have a direct impact on your life. For instance if you use Microsoft Teams at work. Because from now on, your manager will be able to get statistics on your Teams usage. The hyper-focused programmer who turned Teams off to code a great feature will quickly get fired over bad statistics. And your privacy? It doesn't fit into the super henhouse's plans!

Since nobody has time to think anymore (there being no metrics on the subject, and thinking actually hurting other metrics), the future belongs to those who manage to maximize the metrics. Or better: to those who manage to make people believe they are responsible for maximized metrics. Changing jobs regularly lets you never really expose your incompetence while climbing a rank at each step, increasing your salary until you become a very well-paid senior manager in a world where the metrics get blurrier and blurrier. Competence is replaced by the appearance of competence, which is essentially self-confidence and political opportunism. This echoes the thesis Daniel Drezner develops in "The Ideas Industry": simple, pre-chewed, easy-to-appropriate ideas (think TED) are winning out over deeper, subtler analyses. It is also an observation made by Cal Newport in "A World Without Email", where he denounces the buzzing-hive mentality of every modern company.

Are you an entrepreneur or a freelancer? Same thing: you maximize your clients' absurd metrics. If you're lucky enough to have clients! Otherwise, you spend your time optimizing the metrics that Facebook, Google Analytics or Amazon offer you, all while feeling like you're working on your own project. There's even an entire profession devoted to optimizing a single metric offered by Google: SEO.

A few years ago, merely voicing this idea led professionals in that industry to organize so that a search on my name would return insults of their own making. This anecdote illustrates the problem with absurd metrics well: it is impossible to make the absurdity of a metric understood by those who pay to optimize that metric and by those who built their careers on the same metric. The slightest questioning generates a completely disproportionate, religious violence.

Religion and violence

Identity-based withdrawal, religiosity and most conservative opinions are generated by anxiety and the feeling of not understanding. This is not a political analysis but a neurological one. Deactivating a few neurons in the brain is enough for the anxiety to suddenly no longer be linked to this withdrawal. Since we can't deactivate those neurons in everyone, there remains a solution that has already proven itself: education, which makes it possible to understand and thus to be less anxious.

Religion is only a pretext anyway. Religious interpretations are not the cause of violence or withdrawal; on the contrary, they are its symptom, its excuse.

The headless henhouse!

By religiously using the wrong metrics, we are turning the planet into a kind of super henhouse where foolishness and stupidity are optimized. That is, incidentally, the very definition of faith: believing without asking questions, without seeking to understand. Faith is stupidity elevated to the rank of virtue. The invasion of the Capitol by Trump's supporters was its supreme illustration: not-very-bright people, having faith that one of them had a plan and that they were going to follow it. Except there was no plan; that invasion was a "meme", just as Q is: a simple idea thrown onto social networks that built up a self-importance through rumor and virtual word of mouth. Besides, once inside the Capitol, nobody knew what to do. They sat in the chairs to feel important, took selfies, and tried to find juicy conspiracies, within seconds, in the hundreds of pages of legislative documents that are probably available on the government's website. When your political culture is fed mainly by Netflix action series, the revolution quickly finds its limits.

As Cory Doctorow points out very well, memes and fake news are not reality; they are the expression of a fantasy. Internet memes are not created to describe reality, but to try to bend reality to our desires.

But no need to go that far. Well before Trump, Belgium had known the concept of the "meme politician" with MP Laurent Louis. An MP so absurd that I had joked, in a satirical article, that he was nothing but a hoax. An article which, incidentally, resulted in Laurent Louis himself posting his birth certificate on social networks to prove that he existed. That failure to perceive irony struck me particularly.

Like Trump, Laurent Louis eventually found a niche and followers. Enough to stir up some chaos, not enough to avoid disappearing into oblivion as a footnote illustrating the weaknesses of a political system far too optimized to reward marketing and stupidity. But I'm lapsing into pleonasm.

Escaping the henhouse

I buy a short-story collection by Valery Bonneau. I lend it to my mother before even reading it. She tells me I absolutely must read the first story, "Putain de cafetière". I dive in. I fall off my chair laughing. Honestly, the bit with the American fridge with a PIN code still cracks me up.

Enjoy it! (In the paper version, it's even more delectable!)

Craving a vitamin-packed novel? Need to escape the endless lockdowns and curfews? Printeurs by Ploum is made for you!

It's not me saying so; it's a review I never tire of rereading:

By the way, if you have read Printeurs, feel free to leave a review on Senscritique and Babelio. I hate Senscritique, but I haven't yet found a sustainable alternative.

Another Firefox add-on that saves my life, and for which I took out a pay-what-you-want premium subscription:

No more fiddling with cookie settings. The add-on automatically refuses them to the maximum extent possible. It's perfect and indispensable.

That says a lot about the state of today's web. When you see how many protections you need just to "read" the content of web pages without frying your brain and without being spied on from all sides, you better understand the appeal of a protocol like Gemini, designed from the ground up to be as unextensible as possible!

Comics recommendation

After the magnificent "L'Autre Monde" and "Mary la Noire", I am discovering a new facet of Florence Magnin's universe: "L'héritage d'Émilie".

I discovered Magnin by chance, in my favorite bookshop. L'Autre Monde caught my attention. The artwork was magnificent, but of a peculiar naivety. I wasn't sure I liked it. I didn't just like it; I was literally sucked in. That blend of naivety and an adult universe, of fantasy at once quaint and incredibly modern. L'héritage d'Émilie is no exception. In fact, it even transcends the other two, mixing the Paris of the Roaring Twenties with the Celtic legends of Ireland, all in a work of pastoral fantasy that suddenly slides into intergalactic space opera. Yes, it's completely incredible. And yes, I love it.

Photo by Artem Beliaikin on Unsplash

Forget social networks for a moment and subscribe by email or RSS (at most two posts per week and nothing else). Latest book: Printeurs, a cyberpunk thriller. To support the author, give and share his books.

This text is published under the CC-By BE license.

I'm excited to announce that Acquia has signed a definitive agreement to acquire Widen, a digital asset management (DAM) and product information management (PIM) company.

The Acquia and Widen logos shown next to each other

It's not hard to understand how Widen fits Acquia's strategy. Our goal is to build the best Digital Experience Platform (DXP). Content is at the heart of any digital experience. By adding a DAM and PIM to our platform, our customers will be able to create better content, more easily. That will result in better customer experiences. Plain and simple.

Widen is for organizations with larger marketing teams managing one or more brands. These teams create thousands of "digital assets": images, videos, PDFs and much more. Those digital assets are used on websites, mobile applications, in-store displays, presentations, etc. Managing thousands of files, plus all the workflows to support them, is difficult without the help of a DAM.

For commerce purposes, marketers need to correlate product images with product information like pricing, sizing, or product specifications. To do so, Widen offers a PIM. Widen built their PIM on top of their DAM — an approach that is both clever and unique. Widen's PIM can ingest product content from proprietary systems, master data management (MDM) platforms, data lakes, and more. From there, marketers can aggregate, synthesize, and syndicate product content across digital channels.

In short, organizations need a lot of content to do business. And online commerce can't exist without product information. It's why we are so excited about Widen, and the ability to add a DAM and PIM to our product portfolio.

Because content is at the heart of any digital experience, we will build deep integrations between Widen and Acquia's DXP. So in addition to acquiring Widen, we are making a large investment in growing Widen's engineering team. That investment will go towards extending the existing Widen module for Drupal, and creating integrations with Acquia's products: Acquia Site Studio, Acquia Campaign Studio (Mautic), Acquia Personalization, and more. Digital asset management will be a core building block of our DXP.

Needless to say, we will continue to support and invest in Widen working with other Content Management Systems and Digital Experience Platforms. We are building an open DXP; one of our key principles is that customers should be able to integrate with their preferred technologies, and that might not always be ours. By growing the engineering team, we can focus on building Drupal and Acquia integrations without disrupting the existing roadmap and customer commitments.

A few other facts that might be of interest:

So last but not least, I'd like to welcome all of Widen's customers and employees to the Acquia family. I'm excited to see what we will accomplish together.

September 07, 2021

In this last post on the infrastructure domain, I cover the fifth and final viewpoint that is important for an infrastructure domain representation, and that is the location view. As mentioned in previous posts, the viewpoints I think are most representative of the infrastructure domain are:

Like the component view, the location view is a layered approach. While I initially wanted to call it the network view, "location" is a broader term that matches the content better. It's still not a perfect name, but the name matters less than the content, doesn't it?

September 04, 2021

I had been a happy user of the Nokia 6.1 I bought three and a half years ago, but with battery life slowly declining and both major OS updates and security updates having stopped, it was time to find a replacement.

Although the tech reporters and vloggers were underwhelmed by the screen (no OLED or AMOLED, only a 60Hz refresh rate) and the CPU (the SM4350 Snapdragon 480 is considered too slow), I chose the Nokia X20 because of:

  • bare-bones Android One experience
  • 3 years OS & security updates
  • 3 years warranty
  • 128GB storage & 8GB RAM
  • 5G

And you know what? This old man has no issues whatsoever with the screen or CPU. The only downside: the eco-friendly back cover is pretty ugly. But all in all it's pretty good hardware for a very reasonable price (€339 incl. VAT), so if all goes well I won't bother you with boring smartphone news until 2024? :-)

September 02, 2021

I published the following diary on “Attackers Will Always Abuse Major Events in our Lifes“:

All major events in our daily life are potential sources of revenue for attackers. When elections or major sports events are organized, attackers will surf on these waves and try to make some profit or collect interesting data (credentials). It’s the same with major meteorological phenomena. The hurricane “Ida” was the second most intense hurricane to hit the state of Louisiana on record, only behind “Katrina”… [Read more]

The post [SANS ISC] Attackers Will Always Abuse Major Events in our Lifes appeared first on /dev/random.

September 01, 2021

Blogging sometimes feels like talking to an imaginary friend. It's an interesting comparison because it could help me write more regularly. For example: I can picture myself going to dinner with my imaginary friend. Once we sit down, what would we talk about? What would I share?

I'd share that I've been doing well the past year.

Work is going well. I'm fortunate to help lead at a growing software company. We continue to hit record sales quarter after quarter, and hired more than 250 new employees in 2021 alone. Keeping up with all the work can be challenging but I continue to have fun and learn a lot, which is the most important part.

Most days I work from home. Working from home consists of 8 hours of Zoom meetings, followed by email, presentation and planning work. I finish most work days energized and drained at the same time.

Over the course of two years, I've created a home office setup that is more comfortable, more ergonomic, and more productive than my desk at the office. I invested in an ergonomic chair, standing desk, camera setup, a second screen, and even a third screen. Possibly an interesting topic for a future blog post.

Despite having a great home office setup, I'd like to work more from interesting locations. I'm writing this blog post from an island on Lake Winnipesaukee in New Hampshire where we have a management offsite. Working from an island is as awesome as it sounds. The new hybrid work arrangement provides that extra flexibility.

A chair with a view of Lake Winnipesaukee
Overlooking Lake Winnipesaukee in New Hampshire. Coffee and laptop for morning blogging.

When not working, I've been enjoying the summer in Boston. We moved from the suburbs to the city this year, and have been busy exploring our new neighborhood. We love it!

I've been very happy with our decision to move to the city, except for one thing: tennis. I love playing tennis with a coach, and that has been nearly impossible in the city. As a result I haven't played tennis for months — the lack of workout routine has been really bothering me. Because I love racket sports the most, I started to explore if there are good squash, pickleball or table tennis options in downtown Boston. Recommendations welcome!

Last but not least, we spent some time at Cape Cod this summer, and traveled to Iceland for a weekend. I'll tie off this blog post with a few photos of those trips.

An American flag waving in the light of the moon
A red moon over the water in Cape Cod.
Eating dinner outside overlooking the ocean
Dinner at Cape Cod.
A marshmallow over a campfire
S'mores on the beach.
Three people walking alongside dried lava
Hiking alongside the lava from the Gerlingadalur volcano in Iceland. The volcano was active, hence the smoke coming from the lava.

In my previous post, I started with the five different views that would support a good view of what infrastructure would be. I believe these views (component, location, process, service, and zoning) cover the breadth of the domain. The post also described the component view a bit more and linked to previous posts I made (one for services, another for zoning).

The one I want to tackle here is the most elaborate and the most enterprise-ish of the five, and it is always a balancing act (as an architect) to decide how much time and effort to put into it, while hoping the processes are standardized in a sufficiently flexible manner that you don't need to cover everything again and again in each project.

So, let's talk about processes...

August 30, 2021

As I was unclipping my feet from the pedals after my great crossing of the Massif Central by mountain bike in the company of Thierry Crouzet, my phone showed me an email with a title at once obvious and incomprehensible, unimaginable: "Roudou has left us".

With the Internet appeared a new form of social relationship, a new form of interaction, even, I dare say it, of friendship. A friendship with people with whom you discover intellectual affinities, but whom you will rarely or never see. A friendship all the same. A friendship that can lead to complicity, to the creation of shared projects. A friendship that surpasses many of the flesh-and-blood relationships that proximity imposes on us daily.

Jean-Marc Delforge, Roudou to his friends, was for me one of those long-haul friendships. A reader of my blog for years, a free software user and an amateur illustrator, he sent me the very first Printeurs fan art and would later sign the cover of the first Printeurs EPUB.

Through our discussions, we created together the webcomic "Les startupeurs", whose scripts I kept piling up until, sadly, Roudou could no longer find the time to draw them. Characters of somewhat disillusioned employees (one of them being my parody, according to Roudou), dreaming of creating their startup and addicted to the coffee machine (Roudou's invention!).

We had a blast with those ideas, trying our hand at political cartoons, sharing, discussing and discovering a shared passion for mountain biking.

Because Roudou was more than a mountain-biking enthusiast. He was a leader, a trail-maker and the founder of the VTTnet forum. In his wake, it was impossible not to pedal.

In 2015, he invited me to join him, along with my godson Loïc, for three days of intensive mountain biking in the company of the forum's members.

Roudou, his daughter Noémie, my godson Loïc and the other VTTnet maniacs in 2015

By the greatest of coincidences, Loïc and I passed through the region again in early July on a bikepacking trip. When Roudou found out, he immediately sent me a message to say we had just missed each other. While Loïc and I were lounging on the shore of the Lac de l'Eau d'Heure, he was probably out boating on it. He laughed reading the route we had taken, telling me he could have guided us, that he lived very close by.

I felt sad at the thought of having missed such an opportunity to ride together. I promised we would do the trip again the following year. That it would be really great to meet up on a bike (even if, for health reasons he didn't want to detail, Roudou's mountain bike had gone electric).

To a slightly accusing message asking how I dared come ride in his region without letting him know, I replied that I had been convinced he lived much further west.

Roudou's reply came right back: "My wife also often tells me I'm way too far out west."

That was the last message I received from him. On July 16, I set off on 1,000 km of largely disconnected mountain biking, promising myself I would go ride with Roudou the next summer.

But while I was pedaling far from everything, death took him by surprise, forever interrupting our conversation thread and plunging the startupeurs, the mountain bikers, his wife, his daughters and his friends into infinite sadness.

I will miss Roudou. I will miss his sketches and the humorous photos he sent in reaction to my blog posts and my books. I will miss the startupeurs, even though they were in hibernation (I don't even have a copy of that shared work, perhaps now lost). When I dive into the sequel to Printeurs, I know the characters will spare a thought for Roudou, the reader who gave them form under his graphics tablet.

I will always carry the regret of having forgotten to let him know, of having wasted that last opportunity before he left to pedal a little further west. A little too far west…

Farewell to the artist, farewell Roudou! We will keep following your tracks while thinking of you.

Forget social networks for a moment and subscribe by email or RSS (at most two posts per week and nothing else). Latest book: Printeurs, a cyberpunk thriller. To support the author, give and share his books.

This text is published under the CC-By BE license.

I published the following diary on “Cryptocurrency Clipboard Swapper Delivered With Love“:

Be careful if you’re a user of cryptocurrencies. My goal is not to re-open a debate about them and their associated financial risks. No, I’m talking here about technical risk. Wallet addresses are long strings of characters that are pretty impossible to use manually. It means that you’ll use your clipboard to copy/paste your wallets to perform payments. But some malware monitors your clipboard for “interesting data” (like wallet addresses) and tries to replace it with another one. If you perform a payment operation, it means that you will transfer some BTC or XMR to the wrong wallet, owned by the attacker… [Read more]

The post [SANS ISC] Cryptocurrency Clipboard Swapper Delivered With Love appeared first on /dev/random.

August 28, 2021

Cover Image

Making code reusable is not an art, it's a job

Extensibility of software is a weird phenomenon, very poorly understood in the software industry. This might seem strange to say, as you are reading this in a web browser, on an operating system, desktop or mobile. They are by all accounts, quite extensible and built out of reusable, shared components, right?

But all these areas are suffering enormous stagnation. Microsoft threw in the towel on browsers. Mozilla fired its engineers. Operating systems have largely calcified around a decades-old feature set, and are just putting up fortifications. The last big shift here was Apple's version of the mobile web and app store, which ended browser plug-ins like Flash or Java in one stroke.

Most users are now silo'd inside an officially approved feature set. Except for Linux, which is still figuring out how audio should work. To be fair, so is the web. There's WebAssembly on the horizon, but the main thing it will have access to is a creaky DOM and an assortment of poorly conceived I/O APIs.

It sure seems like the plan was to have software work much like interchangeable legos. Only it didn't happen at all, not as far as end-users are concerned. Worse, the HTTP-ification of everything has largely killed off the cow paths we used to have. Data sits locked behind proprietary APIs. Interchange doesn't really happen unless there is a business case for it. The default of open has been replaced with a default of closed.

This death of widespread extensibility ought to seem profoundly weird, or at least, ungrappled with.

We used to collect file types like Pokémon. What happened? If you dig into this, you work your way through types, but then things quickly get existential: how can a piece of code do anything useful with data it does not understand? And if two programs can interpret and process the same data the same way, aren't they just the same code written twice?

Most importantly: does this actually tell us anything useful about how to design software?

Mallard ducks
The Birds of America, John James Audubon (1827)


Let's start with a simpler question.

If I want a system to be extensible, I want to replace a part with something more specialized, more suitable to my needs. This should happen via the substitution principle: if it looks like a duck, walks like a duck and quacks like a duck, it's a duck, no matter which kind. You can have any number of sub-species of ducks, and they can do things together, including making weird new little ducks.

So, consider:

If I have a valid piece of code that uses the type Animal, I should be able to replace Animal with the subtype Duck, Pig or Cow and still have valid code.

True or False? I suspect your answer will depend on whether you've mainly written in an object-oriented or functional style. It may seem entirely obvious, or not at all.

This farm analogy is the usual intro to inheritance: Animal is the supertype. When we call .say(), the duck quacks, but the cow moos. The details are abstracted away and encapsulated. Easy. We teach inheritance and interfaces to novices this way, because knowing what sounds your objects make is very important in day-to-day coding.

But, seriously, this obscures a pretty important distinction. Understanding it is crucial to making extensible software. Because the statement is False.

So, the farmer goes to feed the animals:

type GetAnimal  = () => Animal;
type FeedAnimal = (animal: Animal) => void;

How does substitution apply here? Well, it's fine to get ducks when you were expecting animals. Because anything you can do to an Animal should also work on a Duck. So the function () => Duck can stand in for an () => Animal.

But what about the actions? If I want to feed the ducks breadcrumbs, I might use a function feedBread which is a Duck => void. But I can't feed that same bread to the cat and I cannot pass feedBread to the farmer who expects an Animal => void. He might try to call it on the wrong Animal.

This means the allowable substitution here is reversed depending on use:

  • A function that provides a Duck also provides an Animal.
  • A function that needs an Animal will also accept a Duck.

But it doesn't work in the other direction. It seems pretty obvious when you put it this way. In terms of types:

  • Any function () => Duck is a valid substitute for () => Animal.
  • Any function Animal => void is a valid substitute for Duck => void.

It's not about using a type T, it's about whether you are providing it or consuming it. The crucial distinction is whether it appears after or before the =>. This is why you can't always replace Animal with Duck in just any code.

This means that if you have a function of a type T => T, then T appears on both sides of =>, which means neither substitution is allowed. You cannot replace the function with an S => S made out of a subtype or supertype S, not in general. It would either fail on unexpected input, or produce unexpected output.
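Both directions can be checked in a few lines of TypeScript (class and function names here are illustrative; the rejected assignments assume the compiler's --strictFunctionTypes flag):

```typescript
class Animal { say(): string { return "..."; } }
class Duck extends Animal {
  say(): string { return "quack"; }
  swim(): void {}
}

// Covariant return: a Duck-provider is a valid Animal-provider.
const getDuck: () => Duck = () => new Duck();
const getAnimal: () => Animal = getDuck; // OK

// Contravariant parameter: an Animal-consumer is a valid Duck-consumer.
const feedAnimal: (a: Animal) => string = (a) => a.say();
const feedDuck: (d: Duck) => string = feedAnimal; // OK

// Neither works in reverse; with --strictFunctionTypes these are rejected:
// const getDuck2: () => Duck = () => new Animal();
// const feedAnimal2: (a: Animal) => string = (d: Duck) => { d.swim(); return ""; };
```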

This shouldn't be remarkable at all among people who code in typed languages. It's only worth noting because intros to OO inheritance don't teach you this, suggesting the answer is True. We use the awkward words covariant and contravariant to describe the two directions, and remembering which is which is hard.

I find this quite strange. How is it people only notice one at first?

let duck: Duck = new Duck();
let animal: Animal = duck;

class Duck extends Animal {
  method() {
    // ...
  }
}
Here's one explanation. First, you can think of ordinary values as being returned from an implicit getter () => value. This is your default mental model, even if you never really thought about it.

Second, it's OO's fault. When you override a method in a subclass, you are replacing a function (this: Animal, ...) => with a function (this: Duck, ...) =>. According to the rules of variance, this is not allowed, because it's supposed to be the other way around. To call it on an Animal, you must invoke animal.say() via dynamic dispatch, which the language has built-in.

Every non-static method of class T will have this: T as a hidden argument, so this constrains the kinds of substitutions you're allowed to describe using class methods. When both kinds of variance collide, you are pinned at one level of abstraction and detail: there, T must be invariant.

This is very important for understanding extensibility, because the common way to say "neither co- nor contravariant" is actually just "vendor lock-in".


The Mirage of Extensibility

The goal of extensibility is generally threefold:

  • Read from arbitrary sources of data
  • Perform arbitrary operations on that data
  • Write to arbitrary sinks of data

Consider something like ImageMagick or ffmpeg. It operates on a very concrete data type: one or more images (± audio). These can be loaded and saved in a variety of different formats. You can apply arbitrary filters as a processing pipeline, configurable from the command line. These tools are swiss army knives which seem to offer real extensibility.

type Input<T> = () => T;
type Process<T> = (t: T) => T;
type Output<T> = (t: T) => void;
functions of one or two Ts

Formally, you decode your input into some shared representation T. This forms the glue between your processing blocks. Then it can be sent back to any output to be encoded.

It's crucial here that Process has the same input and output type, as it enables composition of operations like lego. If it was Process<A, B> instead, you would only be able to chain certain combinations (A → B, B → C, C → D, ...). We want to have a closed, universal system where any valid T produces a new valid T.

Of course you can also define operators like (T, T) => T. This leads to a closed algebra, where every op always works on any two Ts. For the sake of brevity, operators are implied below. In practice, most blocks are also configurable, which means it's an options => T => T.
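The closure property is what makes the lego-like chaining work: any list of T => T blocks collapses into a single T => T. A sketch with a toy pixel-array T (the `pipe` helper and filter names are made up for illustration):

```typescript
type Process<T> = (t: T) => T;

// Because every block is T => T, any sequence of blocks composes into one block.
const pipe = <T>(...steps: Process<T>[]): Process<T> =>
  (t) => steps.reduce((acc, step) => step(acc), t);

type Image = number[]; // toy image: a flat array of 8-bit pixel values

const invert: Process<Image> = (img) => img.map((p) => 255 - p);

// A configurable block is options => T => T:
const brighten = (amount: number): Process<Image> =>
  (img) => img.map((p) => Math.min(255, p + amount));

const pipeline = pipe(invert, brighten(10));
```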

This seems perfectly extensible, and a textbook model for all sorts of real systems. But is it really? Reality says otherwise, because it's engineering, not science.

Consider a PNG: it's not just an image, it's a series of data blocks which describe an image, color information, physical size, and so on. To faithfully read a PNG and write it out again requires you to understand and process the file at this level. Therefore any composition of a PNGInput with a PNGOutput where T is just pixels is insufficient: it would throw away all the metadata, producing an incomplete file.

Now add in JPEG: same kind of data, very different compression. There are also multiple competing metadata formats (JFIF, EXIF, ...). So reading and writing a JPEG faithfully requires you to understand a whole new data layout, and store multiple kinds of new fields.

This means a swiss-army-knife's T is really some kind of wrapper in practice. It holds both data and metadata. The expectation is that operations on T will preserve that metadata, so it can be reattached to the output. But how do you do that in practice? Only the actual raw image data is compatible between PNG and JPEG, yet you must be able to input and output either.

meta = {
  png?: {...},
  jpeg?: {...},
}

If you just keep the original metadata in a struct like this, then a Process<T> interested in metadata has to be aware of all the possible image formats that can be read, and try them all. This means it's not really extensible: adding a new format means updating all the affected Process blocks. Otherwise Input<T> and Process<T> don't compose in a useful sense.
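In type terms, every metadata-aware block grows a branch per format. The field names below are illustrative (pHYs is PNG's physical-size chunk; JFIF carries a pixel density), but the shape of the problem is the point:

```typescript
type Meta = {
  png?: { pHYs?: { x: number; y: number } };
  jpeg?: { jfifDensity?: { x: number; y: number } };
  // every newly supported input format adds another optional branch here
};

// A Process that merely wants physical size must try every known format,
// and must be updated whenever a new Input<T> is added:
function physicalSize(meta: Meta): { x: number; y: number } | undefined {
  return meta.png?.pHYs ?? meta.jpeg?.jfifDensity;
}
```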

meta = {
  color: {...},
  physical: {...},
  geo: {...},
}

If you instead harmonize all the metadata into a single, unified schema, then this means new Input<T> and Output<T> blocks are limited to metadata that's already been anticipated. This is definitely not extensible, because you cannot support any new concepts faithfully.

If you rummage around inside ImageMagick you will in fact encounter this. PNG and JPEG's unique flags and quirks are natively supported.

meta = {
  color: {...},
  physical: {...},
  geo: {...},
  x-png?: {...},
  x-jpeg?: {...},
}

One solution is to do both. You declare a standard schema upfront, with common conventions that can be relied on by anyone. But you also provide the ability to extend it with custom data, so that specific pairs of Input/Process/Output can coordinate. This is what HTTP and e-mail headers do with their X- prefixes.

meta = {
  img?: {
    physical?: {...},
    color?: {...},
  },
  fmt?: {
    png?: {...},
    jfif?: {...},
    exif?: {...},
  },
}

The problem is that there is no universal reason why something should be standard or not. Standard is the common set of functionality "we" are aware of today. Non-standard is what's unanticipated. This is entirely haphazard. For example, instead of an x-jpeg, it's probably better to define an x-exif because Exif tags are themselves reusable things. But why stop there?

Mistakes stick and best practices change, so the only way to have a contingency plan in place is for it to already exist in the previous version. For example, through judicious use of granular, optional namespaces.

The purpose is to be able to make controlled changes later that won't mess with most people's stuff. Some breakage will still occur. The structure provides a shared convention for anticipated needs, paving the cow paths. Safe extension is the default, but if you do need to restructure, you have to pick a new namespace. Conversion is still an issue, but at least it is clearly legible and interpretable which parts of the schema are being used.

One of the smartest things you can do ahead of time is to not version your entire format as v1 or v2. Rather, remember the version for any namespace you're using, like a manifest. It allows you to define migration not as a transaction on an entire database or dataset at once, but rather as an idempotent process that can be interrupted and resumed. It also provides an opportunity to define a reverse migration that is practically reusable by other people.
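A sketch of such a per-namespace manifest, with a deliberately naive migration runner (all names hypothetical):

```typescript
type Doc = {
  versions: Record<string, number>; // per-namespace, e.g. { img: 1, "fmt.png": 2 }
  data: Record<string, unknown>;
};

type Migration = {
  ns: string;
  from: number;
  to: number;
  up: (data: Doc["data"]) => Doc["data"];
};

// Idempotent: a namespace not exactly at `from` is skipped, so the run can
// be interrupted and resumed without a whole-dataset transaction. A reverse
// migration is just another Migration with from/to swapped and a `down` fn.
function migrate(doc: Doc, migrations: Migration[]): Doc {
  let d = doc;
  for (const m of migrations) {
    if ((d.versions[m.ns] ?? 0) !== m.from) continue;
    d = { versions: { ...d.versions, [m.ns]: m.to }, data: m.up(d.data) };
  }
  return d;
}
```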

This is how you do it if you plan ahead. So naturally this is not how most people do it.


N+1 Standards

X- fields and headers are the norm and have a habit of becoming defacto standards. When they do, you find it's too late to clean it up into a new standard. People try anyway, like with X-Forwarded-For vs Forwarded. Or -webkit-transform vs transform. New software must continue to accept old input. It must also produce output compatible with old software. This means old software never needs to be updated, which means new software can never ditch its legacy code.

Let's look at this story through a typed lens.

What happens is, someone turns an Animal => Animal into a Duck => Duck without telling anyone else, by adding an X- field. This is fine, because Animal ignores unknown metadata, and X- fields default to none. Hence every Animal really is a valid Duck, even though Duck specializes Animal.

Slowly more people replace their Animal => Animal type with Duck => Duck. Which means ducks are becoming the new defacto Animal. But then someone decides it needs to be a Chicken => Chicken instead, and that chickens are the new Animal. Not everyone is on board with that.

So you need to continue to support the old Duck and the new Chicken on the input side. You also need to output something that passes as both Duck and Chicken, that is, a ChickenDuck. Your signature becomes:

(Duck | Chicken | ChickenDuck) => ChickenDuck

This is not what you wanted at all, because it always lays two eggs at the same time, one with an X and one without. This is also a metaphor for IPv4 vs IPv6.
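In type terms (field names invented to fit the metaphor):

```typescript
type Duck = { "x-eggs": number };   // the old defacto field
type Chicken = { eggs: number };    // the "standardized" replacement
type ChickenDuck = Duck & Chicken;  // lays two eggs at the same time

// Accept everything old and new, emit something that passes as both:
function process(input: Duck | Chicken | ChickenDuck): ChickenDuck {
  const eggs = "eggs" in input ? input.eggs : input["x-eggs"];
  return { eggs, "x-eggs": eggs }; // must produce both fields, forever
}
```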

If you have one standard, and you make a new standard, now you have 3 standards: the old one, the theoretical new one, and the actual new one.

replacing one T=>T in a chain with S=>S

Invariance pops up again. When you have a system built around signature T => T, you cannot simply slot in a S => S of a super or sub S. Most Input and Output in the wild still only produces and consumes T. You have to slot in an actual T => S somewhere, and figure out what S => T means.

Furthermore, for this to do something useful in a real pipeline, T already has to be able to carry all the information that S needs. And S => T cannot strip it out again. The key is to circumvent invariance: neither type is really a subtype or supertype of the other. They are just different views and interpretations of the same underlying data, which must already be extensible enough.

Backwards compatibility is then the art of changing a process Animal => Animal into a Duck => Duck while avoiding a debate about what specifically constitutes quacking. If you remove an X- prefix to make it "standard", this stops being true. The moment you have two different sources of the same information, now you have to decide whose job it is to resolve two into one, and one back into two.

This is particularly sensitive for X-Forwarded-For, because it literally means "the network metadata is wrong, the correct origin IP is ..." This must come from a trusted source like a reverse proxy. It's the last place you want to create compatibility concerns.

sideways birds

If you think about it, this means you can never be sure about any Animal either: how can you know there isn't an essential piece of hidden metadata traveling along that you are ignoring, which changes the meaning significantly?

Consider what happened when mobile phones started producing JPEGs with portrait vs landscape orientation metadata. Pictures were randomly sideways and upside down. You couldn't even rely on something as basic as the image width actually being, you know, the width. How many devs would anticipate that?

The only reason this wasn't a bigger problem is because for 99% of use-cases, you can just apply the rotation once upfront and then forget about it. That is, you can make a function RotatableImage => Image aka a Duck => Animal. This is an S => T that doesn't lose any information anyone cares about. This is the rare exception, only done occasionally, as a treat.

If you instead need to upgrade a whole image and display pipeline to support, say, high-dynamic range or P3 color, that's a different matter entirely. It will never truly be 100% done everywhere, we all know that. But should it be? It's another ChickenDuck scenario, because now some code wants images to stay simple, 8-bit and sRGB, while other code wants something else. Are you going to force each side to deal with the existence of the other, in every situation? Or will you keep the simplest case simple?

A plain old 2D array of pixels is not sufficient for T in the general case, but it is too useful on its own to simply throw it out. So you shouldn't make an AbstractImage which specializes into a SimpleImage and an HDRImage and a P3Image, because that means your SimpleImage isn't simple anymore. You should instead make an ImageView with metadata, which still contains a plain Image with only raw pixels. That is, a SimpleImage is just an ImageView<Image, NoColorProfile>. That way, there is still a regular Image on the inside. Code that provides or needs an Image does not need to change.
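A sketch of that shape (type names hypothetical): metadata wraps the plain image instead of replacing it, so Image-only code never changes.

```typescript
type Image = Uint8Array; // plain raw pixels, nothing else

type ImageView<I, M> = { image: I; meta: M };
type NoColorProfile = { colorProfile?: undefined };
type SimpleImage = ImageView<Image, NoColorProfile>;

// Existing code that provides or needs a plain Image is untouched:
function brighten(img: Image): Image {
  return img.map((p) => Math.min(255, p + 16));
}

// Metadata-aware code runs the inner Image through and keeps the meta:
function brightenView(v: SimpleImage): SimpleImage {
  return { ...v, image: brighten(v.image) };
}
```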

It's important to realize these are things you can only figure out if you have a solid idea of how people actually work with images in practice. Like knowing that we can just all agree to "bake in" a rotation instead of rewriting a lot of code. Architecting from the inside is not sufficient, you must position yourself as an external user of what you build, someone who also has a full-time job.

If you want a piece of software to be extensible, that means the software will become somebody else's development dependency. This puts huge constraints on its design and how it can change. You might say there is no such thing as an extensible schema, only an unfinished schema, because every new version is really a fork. But this doesn't quite capture it, and it's not quite so absolute in practice.

Colonel Problem

Interoperability is easy in pairs. You can model this as an Input<T> "A" connecting to an Output<T> "B". This does not need to cover every possible T, it can be a reduced subset R of T. For example, two apps exchange grayscale images (R) as color PNGs (T). Every R is also a T, but not every T is an R. This means:

  • Any function () => grayscaleImage is a valid substitute for () => colorImage.
  • Any function (colorImage) => void is a valid substitute for (grayscaleImage) => void.

This helps A, which is an => R pretending to be a => T. But B still needs to be an actual T =>, even if it only wants to be an R =>. Turning an R => into a T => is doable as long as you have a way to identify the R parts of any T, and ignore the rest. If you know your images are grayscale, just use any of the RGB channels. Therefore, working with R by way of T is easy if both sides are in on it. If only one side is in on it, it's either scraping or SEO.
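Sketched with toy pixel types (the single-channel projection assumes both sides agreed the data is grayscale, i.e. r, g, and b are equal):

```typescript
type ColorPixel = { r: number; g: number; b: number }; // T
type GrayPixel = number;                               // R, a subset of T

// A pretends its grayscale output is color: every R is a T.
const encode = (gray: GrayPixel): ColorPixel => ({ r: gray, g: gray, b: gray });

// B wants to be a GrayPixel =>, but must be an actual ColorPixel =>.
// Since both sides are in on it, any channel identifies the R inside the T:
const decode = (p: ColorPixel): GrayPixel => p.r;

const processGray = (g: GrayPixel): GrayPixel => 255 - g;
const processColor = (p: ColorPixel): GrayPixel => processGray(decode(p));
```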

But neither applies to arbitrary processing blocks T => T that need to mutually interoperate. If A throws away some of the data it received before sending it to B, and then B throws away other parts before sending it back to A, little will be left. For reliable operation, either A → B → A or B → A → B ought to be a clean round-trip. Ideally, both. Just try to tell a user you preserve <100% of their data every time they do something.

photoshop layer panel

Consider interoperating with e.g. Adobe Photoshop. A Photoshop file isn't just an image, it's a collection of image layers, vector shapes, external resources and filters. These are combined into a layer tree, which specifies how the graphics ought to be combined. This can involve arbitrary nesting, with each layer having unique blend modes and overlaid effects. Photoshop's core here acts like a kernel in the OS sense, providing a base data model and surrounding services. It's responsible for maintaining the mixed raster/vector workspace of the layered image. The associated "user space" is the drawing tools and inspectors.

Being mutually compatible with Photoshop means being a PSD => PSD back-end, which is equivalent to re-implementing all the "kernel space" concepts. Changing a single parameter or pixel requires re-composing the entire layer stack, so you must build a kernel or engine that can do all the same things.

Also, let's be honest here. The average contemporary dev eyes legacy desktop software with some suspicion. Sure, it's old and creaky, and its toolbars are incredibly out of fashion. But it gets the job done, and comes with decades of accumulated, deep customizability. The entrenched competition is stiff.

This reflects what I call the Kernel Problem. If you have a processing kernel revolving around an arbitrary T => T block, then the input T must be more like a program than data. It's not just data and metadata, it's also instructions. This means there is only one correct way to interpret them, aside from differences in fidelity or performance. If you have two such kernels which are fully interoperable in either direction, then they must share the same logic on the inside, at least up to equivalence.

If you are trying to match an existing kernel T => T's features in your S => S, your S must be at least as expressive as their original T. To do more, every T must also be a valid S. You must be the Animal to their Duck, not a Duck to their Animal, which makes this sort of like reverse inheritance: you adopt all their code but can then only add non-conflicting changes, so as to still allow for real substitution. A concrete illustration is what the "Windows Subsystem for Linux" actually means in practice: put a kernel inside the kernel, or reimplement it 1-to-1. It's also how browsers evolved over time, by adding, not subtracting.

Therefore, I would argue an "extensible kernel" is in the broad sense an oxymoron, like an "extensible foundation" of a building. The foundation is the part that is supposed to support everything else. Its purpose is to enable vertical extension, not horizontal.

If you expand a foundation without building anything on it, it's generally considered a waste of space. If you try to change a foundation underneath people, they rightly get upset. The work isn't done until the building actually stands. If you keep adding on to the same mega-building, maintenance and renewal become impossible. The proper solution for that is called a town or a city.

Naturally kernels can have plug-ins too, so you can wonder if that's actually a "motte user-land" or not. What's important is to notice the dual function. A kernel should enable and support things, by sticking to the essentials and being solid. At the same time, it needs to also ship with a useful toolset working with a single type T that behaves extensibly: it must support arbitrary programs with access to processes, devices, etc.

If extensibility + time = kitchen sink bloat, how do you counter entropy?

motte and bailey

You must anticipate, by designing even your core T itself to be decomposable and hence opt-in à la carte. A true extensible kernel is therefore really a decomposable kernel, or perhaps a kernel generator, which in the limit becomes a microkernel. This applies whether you are talking about Photoshop or Linux. You must build it so that it revolves around an A & B & C & ..., so that both A => A and B => B can work directly on an ABC and implicitly preserve the ABC-ness of the result. If all you need to care about is A or B, you can use them directly in a reduced version of the system. If you use an AB, only its pertinent aspects should be present.
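In TypeScript terms, intersection types plus a generic bound give exactly this shape: a block written against A alone works on an ABC and implicitly preserves its B-ness and C-ness (aspect names invented):

```typescript
type A = { a: number };
type B = { b: string };
type C = { c: boolean };
type ABC = A & B & C;

// Written against A only; the generic bound keeps the rest of T intact.
const stepA = <T extends A>(t: T): T => ({ ...t, a: t.a + 1 });
const stepB = <T extends B>(t: T): T => ({ ...t, b: t.b.toUpperCase() });

const abc: ABC = { a: 1, b: "hi", c: true };
const out: ABC = stepB(stepA(abc)); // { a: 2, b: "HI", c: true }
```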

Entity-Component Systems are a common way to do this. But they too have a kernel problem: opting in to a component means adopting a particular system that operates on that type of component. Such systems also have dependency chains, which have to be set up in the right order for the whole to behave right. It is not really A & B but A<B> or B<A> in practice. So in order for two different implementations of A or B to be mutually compatible, they again have to be equivalent. Otherwise you can't replace a Process<T> without replacing all the associated input, or getting unusably different output.

The main effect of à la carte architecture is that it never seems like a good idea to force anyone else to turn their Duck into a Chicken, by adopting all your components. You should instead try to agree on a shared Animal<T>. Any ChickenDuck that you do invent will have a limited action radius. Because other people can decide for themselves whether they truly need to deal with chickens on their own time.

None of this is new, I'm just recapturing old wisdom. It frankly seems weird to use programming terminology to describe this problem, when the one place it is not a big deal is inside a single, comfy programming environment. We do in fact freely import modules à la carte when we code, because our type T is the single environment of our language run-time.

But it's not so rosy. The cautionary tale of Python 2 vs 3: if you mess with the internals and standard lib, it's a different language, no matter how you do it. You still have a kernel everyone depends on and it can take over a decade to migrate a software ecosystem.

Everyone has also experienced the limits of modularity, in the form of overly wrapped APIs and libraries, which add more problems than they solve. In practice, everyone on a team must still agree on one master, built incrementally, where all the types and behavior is negotiated and agreed upon. This is either a formal spec, or a defacto one. If it is refactored, that's just a fork everyone agrees to run with. Again, it's not so much extensible, just perpetually unfinished.

À la carte architecture is clearly necessary but not sufficient on its own. Because there is one more thing that people tend to overlook when designing a schema for data: how a normal person will actually edit the data inside.


Oil and Water

Engineering trumps theoretical models, which is why the description of PSD above deliberately omitted one point.

It turns out, if all you want to do is display a PSD, you don't need to reimplement Photoshop's semantics. Each .PSD contains a pre-baked version of the image, so that you don't need to interpret it. A .PSD is really two file formats in one, a layered PSD and something like a raw PNG. It is not a ChickenDuck but a DuckAnimal. They planned ahead so that the Photoshop format can still work if all you want to be is MSPaint => MSPaint. For example, if you're a printer.

This might lead you to wonder.

Given that PNG is itself extensible, you can imagine a PNG-PSD that does the same thing as a PSD. It contains an ordinary image, with all the Photoshop specific data embedded in a separate PSD section. Wouldn't that be better? Now any app that can read PNG can read PSD, and can preserve the PSD-ness. Except, no. If anyone blindly edits the PNG part of the PNG-PSD, while preserving the PSD data, they produce a file where both are out of sync. What you see now depends on which app reads it. PNG-PSDs would be landmines in a mixed ecosystem.

It's unavoidable: if some of the data in a schema is derived from other data in it, the whole cannot be correctly edited by a "dumb", domain-agnostic editor, because of the Kernel Problem. This is why "single source of truth" should always be the end-goal.

A fully extensible format is mainly just kicking the can down the road, saving all the problems for later. It suggests a bit of a self-serving lie: "Extensibility is for other people." It is a successful business recipe, but a poor engineering strategy. It results in small plug-ins, which are not first class, and not fundamentally changing any baked in assumptions.

But the question isn't whether plug-ins are good or bad. The question is whether you actually want to lock your users of tomorrow into how your code works today. You really don't, not unless you've got something battle-hardened already.


If you do see an extensible system working in the wild on the Input, Process and Output side, that means it's got at least one defacto standard driving it. Either different Inputs and Outputs have agreed to convert to and from the same intermediate language... or different middle blocks have agreed to harmonize data and instructions the same way.

This must either flatten over format-specific nuances, or be continually forked to support every new concept being used. Likely this is a body of practices that has mostly grown organically around the task at hand. Given enough time, you can draw a boundary around a "kernel" and a "user land" anywhere. To make this easier, a run-time can help do auto-conversion between versions or types. But somebody still has to be doing it.

This describes exactly what happened with web browsers. They cloned each other's new features, while web developers added workarounds for the missing pieces. Not to make it work differently, but to keep it all working exactly the same. Eventually people got fed up and just adopted a React-like.

That is, you never really apply extensibility on all three fronts at the same time. It doesn't make sense: arbitrary code can't work usefully on arbitrary data. The input and output need to have some guarantees about the process, or vice versa.

Putting data inside a free-form key/value map doesn't change things much. It's barely an improvement over having an unknownData byte[] mix-in on each native type. It only pays off if you actually adopt a decomposable model and stick with it. That way the data is not unknown, but always provides a serviceable view on its own. Arguably this is the killer feature of a dynamic language. The benefit of "extensible data" is mainly "fully introspectable without recompilation."

You need a well-defined single type T that sets the ground rules for both data and code, which means T must be a common language. It must be able to work equally well as an A, a B, and a C, needs that must have been anticipated. Yet it should be built such that you can just use a D of your own, without inconvenient dependency. The key quality to aim for is not creativity but discipline.

If you can truly substitute a type with something else everywhere, it can't be arbitrarily extended or altered, it must retain the exact same interface. In the real world, that means it must actually do the same thing, only marginally better or in a different context. A tool like ffmpeg only exists because we invented a bajillion different ways to encode the same video, and the only solution is to make one thing that supports everything. It's the Unicode of video.

If you extend something into a new type, it's not actually a substitute, it's a fork trying to displace the old standard. As soon as it's used, it creates a data set that follows a new standard. Even when you build your own parsers and/or serializers, you are inventing a spec of your own. Somebody else can do the same thing to you, and that somebody might just be you 6 months from now. Being a programmer means being an archivist-general for the data your code generates.

* * *

If you actually think about it, extensibility and substitution are opposites in the design space. You must not extend, you must decompose, if you wish to retain the option of substituting it with something simpler yet equivalent for your needs. Because the other direction is one-way only, only ever adding complexity, which can only be manually refactored out again.

If someone is trying to sell you on something "extensible," look closely. Is it actually à la carte? Does it come with a reasonable set of proven practices on how to use it? If not, they are selling you a fairy tale, and possibly themselves too. They haven't actually made it reusable yet: if two different people started using it to solve the same new problem, they would not end up with compatible solutions. You will have 4 standards: the original, the two new variants, and the attempt to resolve all 3.

Usually it is circumstance, hierarchy and timing that decides who adapts their data and their code to whom, instead of careful consideration and iteration. Conway's law reigns, and most software is shaped like the communication structure of the people who built it. "Patch," "minor" or "major" release is just the difference between "Pretty please?", "Right?" and "I wasn't asking."

We can do a lot better. But the mindset it requires at this point is not extensibility. The job at hand is salvage.

August 27, 2021

IT architects try to use views and viewpoints to convey the target architecture to the various stakeholders. Each stakeholder has their own interests in the architecture and wants to see their requirements fulfilled. A core role of the architect is to understand these requirements and make sure the requirements are met, and to balance all the different requirements.

Architecture languages or meta-models often put significant focus on these views. Archimate has a large annex on Example Viewpoints just for this purpose. However, unless the organization is widely accustomed to enterprise architecture views, it is unlikely that the views themselves are the final product: being able to translate those views into pretty slides and presentations is still an important task for architects when they need to present their findings to non-architecture roles.

August 23, 2021

In the 15 years that I've been blogging, it's never been this quiet on my blog.

Blogging is an important part of me. It's how I crystallize ideas, reflect, and engage with thousands of people around the world. Blogging encourages me to do research; it improves my understanding of different topics. Blogging sometimes forces me to take sides; it helps me find myself.

I miss blogging. Unfortunately, I've lost my blogging routine.

At first, COVID-19 was to blame for that. I'd write many of my blog posts on the train to work. My train ride was one hour each way and that gave me plenty of time to write. Once in the office, there is zero time for blogging. COVID-19 interrupted my blogging routine and took away my protected writing time.

Then earlier this year, we moved from the suburbs of Boston to the city. Renovating our new condo, selling our old condo, and moving homes consumed much of my personal time — blogging time included. And now that we live in the city, I no longer commute by train.

Admittedly, I've also felt blocked. I've been waiting to blog until I had something interesting to say, but nothing seemed interesting enough.

So I'm eager to find a new blogging routine. I'm a fan of routines. Routines add productivity and consistency to my life. Without a good blogging routine, I'm worried about the future of this blog.

To get back into a blogging routine, I made two decisions: (1) to target one blog post per week and (2) to balance my inner critic. I will no longer wait for something interesting to come along (as this blog post illustrates).

When you break out of any habit, it can be hard to get back into it. To get back into a routine, it's better to write something regularly than to write nothing at all. These seem like achievable goals and I'm hopeful they get me blogging more frequently again.

August 20, 2021

I published the following diary on “Waiting for the C2 to Show Up”:

Keep this in mind: “Patience is key”. Sometimes when you are working on a malware sample, you depend on online resources. I’m working on a classic case: a Powershell script decodes then injects a shellcode into a process. There are plenty of tools that help you to have a good idea of a shellcode behavior (like scdbg)… [Read more]

The post [SANS ISC] Waiting for the C2 to Show Up appeared first on /dev/random.

August 19, 2021

Dear IoT manufacturers,

Yes, I admit: I like your products and my Geekness does not help! I like to play with them. While some are “gadgets” that end up in a drawer amongst cables and connectors, others are really useful and I use them daily. You can probably imagine that, when I receive a new device, it’s not connected “in the wild” right after unboxing. Wait… You do?

First, I read the provided documentation (you know the famous “RTFM”), I google for some forum or blog articles to see if other people already played with it. Finally, it is connected to a dedicated network without a full Internet connection (Read: “most of the egress traffic being blocked”). And then, the first problems arise… The initial setup fails, I need to restore to factory settings and try again. Basically, the process is:

Connect > Setup > Fail > Check firewall logs > Open ports > Reset > Try again > Drink more coffee

This discovery process is so annoying! Why should your customers have to perform something like “network reverse engineering” just to understand how your product works?
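That “network reverse engineering” usually boils down to staring at firewall logs. A rough sketch of how one can summarize what a freshly unboxed device tries to reach (the log prefix, field layout and addresses are assumptions based on a typical iptables LOG rule, not values from a real device):

```shell
# Summarize the blocked egress attempts of a single device from iptables LOG lines.
# Assumes lines like: "... SRC=192.168.20.15 DST=34.250.1.2 ... PROTO=TCP ... DPT=8883"
summarize_egress() {
  # $1 = IP address of the IoT device; log lines are read from stdin
  grep "SRC=$1" \
    | sed -n 's/.*DST=\([0-9.]*\).*PROTO=\([A-Z]*\).*DPT=\([0-9]*\).*/\2 \3 \1/p' \
    | sort | uniq -c | sort -rn
}
# Typical (hypothetical) usage: journalctl -k | summarize_egress 192.168.20.15
```

Each output line is a count, protocol, destination port and destination IP — exactly the list of ports and hardcoded addresses I wish vendors would document themselves.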

I’ve a message for you:

Please be transparent!

Why don’t you provide all the documentation for a safe setup? For example:

  • Required TCP/UDP port(s)
  • Hardcoded IP addresses/hostnames (yes, not everybody is using as a DNS resolver or as NTP server!)
  • Specific services (VPN) and their purpose(s)
  • Avoid obscure protocols and stick to standard ones
  • Sometimes, even the device MAC address is not written on the box. To get the MAC, you need to boot it a first time.

Also avoid advice like:

If it does not work, disable the security feature <x>!

Why don’t you allow critical settings like DNS, NTP, etc. to be customised, or allow the use of a proxy?

Why don’t you explain why this suspicious VPN connection is required? What data is exchanged? What do you do with it? (Ok, I’m dreaming…)

I see you coming… You are right on one point: such very technical pieces of information can’t be understood by many of your customers, and your business model requires addressing the largest customer base. I don’t ask you to include this information in the documentation provided with the device, but why not keep the technical specs online for those who need to review them? The right balance must be found between usability and security, and it’s up to me, your customer, to decide where to slide the cursor. If disabling an option means a sexy feature won’t be available, ok, fair enough: it’s my choice.

No, your FAQ does not contain relevant information for me. “Have you tried turning it off and on again” is not the right answer!

Thank you!

The post Public Message to IoT Manufacturers appeared first on /dev/random.

August 16, 2021

August 08, 2021

pi cluster

Last year I got a raspberry pi 4 to play with and installed Manjaro on it.

The main reason I went with Manjaro was that the Arch Linux ARM image/tgz for the Raspberry Pi 4 was still 32-bit, unless you built your own kernel.

But I started to like Manjaro Linux; it provided a stable base with regular updates. This year I upgraded my setup with 2 additional Raspberry Pi 4s to provide clustering for my k3s (Kubernetes) setup. I used virtual machines on the Raspberry Pis to host the k3s nodes, partly because I want to use the Pis for other tasks, and virtual machines make it easier to split the resources. It’s also an “abstraction layer” if you want to combine the cluster with other ARM64 systems in the future.

I always (try to) use full disk encryption; when you have multiple nodes it’s important to be able to unlock the encryption remotely.

Install Manjaro on an encrypted filesystem

Manjaro will run an install script after the RPi is booted to complete the installation.

We have two options:

  • Boot the Pi from the standard non-encrypted image and move the installation to an encrypted filesystem afterwards.
  • Extract the installation image and move the content to an encrypted filesystem.

You’ll find my journey through the second option below. The setup is mainly the same as last year, but with support for unlocking the encryption over ssh.

The host system to extract/install the image is an x86_64 system running Archlinux.

Download and copy

Download and verify the Manjaro image from:
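Once downloaded, it is worth checking the image against its published checksum before writing anything; a small sketch (the `.sha256` file name is an assumption based on Manjaro’s usual release layout):

```shell
# Verify a downloaded image against its accompanying SHA-256 checksum file.
verify_image() {
  # $1 = checksum file, e.g. Manjaro-ARM-xfce-rpi4-20.06.img.sha256
  sha256sum -c "$1"
}
# Hypothetical usage:
#   verify_image Manjaro-ARM-xfce-rpi4-20.06.img.sha256
```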

Copy the image to keep the original intact.

[root@vicky manjaro]# cp Manjaro-ARM-xfce-rpi4-20.06.img image

Create tarball

Verify the image

Verify the image layout with fdisk -l.

[root@vicky manjaro]# fdisk -l image
Disk image: 4.69 GiB, 5017436160 bytes, 9799680 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x090a113e

Device     Boot  Start     End Sectors   Size Id Type
image1           62500  500000  437501 213.6M  c W95 FAT32 (LBA)
image2          500001 9799679 9299679   4.4G 83 Linux
[root@vicky manjaro]# 

We’ll use kpartx to map the partitions in the image so we can mount them. kpartx is part of the multipath-tools.

Map the partitions in the image with kpartx: the “-a” option adds the image, and “-v” makes it verbose so we can see where the partitions are mapped.

[root@vicky manjaro]# kpartx -av image
add map loop1p1 (254:10): 0 437501 linear 7:1 62500
add map loop1p2 (254:11): 0 9299679 linear 7:1 500001
[root@vicky manjaro]#

Create the destination directory.

[root@vicky manjaro]# mkdir /mnt/chroot

Mount the partitions.

[root@vicky manjaro]# mount /dev/mapper/loop1p2 /mnt/chroot
[root@vicky manjaro]# mount /dev/mapper/loop1p1 /mnt/chroot/boot
[root@vicky manjaro]#

Create the tarball.

[root@vicky manjaro]# cd /mnt/chroot/
[root@vicky chroot]# tar czvpf /home/staf/Downloads/isos/manjaro/Manjaro-ARM-xfce-rpi4-21.07.tgz .


[root@vicky ~]# umount /mnt/chroot/boot 
[root@vicky ~]# umount /mnt/chroot
[root@vicky ~]# cd /home/staf/Downloads/isos/manjaro/
[root@vicky manjaro]# kpartx -d image
loop deleted : /dev/loop1
[root@vicky manjaro]# 

Partition and create filesystems


Partition your hard disk; delete all existing partitions first if there are any.

I’ll create 3 partitions on my hard disk:

  • a boot partition of 500MB (type c, ‘W95 FAT32 (LBA)’)
  • a root partition of 50G
  • the rest
[root@vicky ~]# fdisk /dev/sdh

Welcome to fdisk (util-linux 2.35.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0x49887ce7.

Command (m for help): n
Partition type
   p   primary (0 primary, 0 extended, 4 free)
   e   extended (container for logical partitions)
Select (default p): p
Partition number (1-4, default 1): 
First sector (2048-976773167, default 2048): 
Last sector, +/-sectors or +/-size{K,M,G,T,P} (2048-976773167, default 976773167): +500M

Created a new partition 1 of type 'Linux' and of size 500 MiB.

Command (m for help): n
Partition type
   p   primary (1 primary, 0 extended, 3 free)
   e   extended (container for logical partitions)
Select (default p): p
Partition number (2-4, default 2): 2
First sector (1026048-976773167, default 1026048): 
Last sector, +/-sectors or +/-size{K,M,G,T,P} (1026048-976773167, default 976773167): +50G

Created a new partition 2 of type 'Linux' and of size 50 GiB.

Command (m for help): n
Partition type
   p   primary (2 primary, 0 extended, 2 free)
   e   extended (container for logical partitions)
Select (default p): p
Partition number (3,4, default 3): 
First sector (105883648-976773167, default 105883648): 
Last sector, +/-sectors or +/-size{K,M,G,T,P} (105883648-976773167, default 976773167): 

Created a new partition 3 of type 'Linux' and of size 415.3 GiB.

Command (m for help): t
Partition number (1-3, default 3): 1
Hex code (type L to list all codes): c

Changed type of partition 'Linux' to 'W95 FAT32 (LBA)'.

The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
Command (m for help):  
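The interactive session above can also be scripted with sfdisk; a sketch of the same layout (the device would be /dev/sdh in my case — here a sparse scratch file stands in for it, since writing a partition table is destructive):

```shell
# Same layout as the fdisk session: 500M 'W95 FAT32 (LBA)' boot, 50G Linux root,
# and the rest as a third Linux partition.
DISK=$(mktemp)         # stand-in for /dev/sdh; a sparse file is enough for a dry run
truncate -s 60G "$DISK"
sfdisk "$DISK" <<'EOF'
label: dos
,500M,c
,50G,83
,,83
EOF
sfdisk -l "$DISK"      # review the resulting table before doing this for real
```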

Create the boot file system

The Raspberry Pi uses a FAT filesystem for the boot partition.

[root@vicky tmp]# mkfs.vfat /dev/sdh1
mkfs.fat 4.2 (2021-01-31)
[root@vicky tmp]# 

Create the root filesystem

Overwrite the root partition with random data (double-check the device name first!).

[root@vicky tmp]# dd if=/dev/urandom of=/dev/sdg2 bs=4096 status=progress
53644914688 bytes (54 GB, 50 GiB) copied, 682 s, 78.7 MB/s 
dd: error writing '/dev/sdg2': No space left on device
13107201+0 records in
13107200+0 records out
53687091200 bytes (54 GB, 50 GiB) copied, 687.409 s, 78.1 MB/s
[root@vicky tmp]# 

Encrypt the root filesystem


I booted the RPi4 from an sdcard to verify the encryption speed by running cryptsetup benchmark.

[root@minerva ~]# cryptsetup benchmark
# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1       398395 iterations per second for 256-bit key
PBKDF2-sha256     641723 iterations per second for 256-bit key
PBKDF2-sha512     501231 iterations per second for 256-bit key
PBKDF2-ripemd160  330156 iterations per second for 256-bit key
PBKDF2-whirlpool  124356 iterations per second for 256-bit key
argon2i       4 iterations, 319214 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
argon2id      4 iterations, 321984 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
#     Algorithm |       Key |      Encryption |      Decryption
        aes-cbc        128b        23.8 MiB/s        77.7 MiB/s
    serpent-cbc        128b               N/A               N/A
    twofish-cbc        128b        55.8 MiB/s        56.2 MiB/s
        aes-cbc        256b        17.4 MiB/s        58.9 MiB/s
    serpent-cbc        256b               N/A               N/A
    twofish-cbc        256b        55.8 MiB/s        56.1 MiB/s
        aes-xts        256b        85.0 MiB/s        74.9 MiB/s
    serpent-xts        256b               N/A               N/A
    twofish-xts        256b        61.1 MiB/s        60.4 MiB/s
        aes-xts        512b        65.4 MiB/s        57.4 MiB/s
    serpent-xts        512b               N/A               N/A
    twofish-xts        512b        61.3 MiB/s        60.3 MiB/s
[root@minerva ~]# 
Create the LUKS volume

The aes-xts cipher seems to have the best performance on the RPI4.

[root@vicky ~]# cryptsetup luksFormat --cipher aes-xts-plain64 --key-size 256 --hash sha256 --use-random /dev/sdh2

This will overwrite data on /dev/sdh2 irrevocably.

Are you sure? (Type 'yes' in capital letters): YES
Enter passphrase for /dev/sdh2: 
Verify passphrase: 
WARNING: Locking directory /run/cryptsetup is missing!
[root@vicky ~]# 
Open the LUKS volume
[root@vicky ~]# cryptsetup luksOpen /dev/sdh2 cryptroot
Enter passphrase for /dev/sdh2: 
[root@vicky ~]# 

Create the root filesystem

[root@vicky tmp]# mkfs.ext4 /dev/mapper/cryptroot
mke2fs 1.46.3 (27-Jul-2021)
Creating filesystem with 13103104 4k blocks and 3276800 inodes
Filesystem UUID: 65c19eb5-d650-4d8a-8335-ef792604009d
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
	4096000, 7962624, 11239424

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (65536 blocks): done
Writing superblocks and filesystem accounting information: done   

[root@vicky tmp]# 

Mount and extract

Mount the root filesystem.

[root@vicky ~]# mount /dev/mapper/cryptroot /mnt/chroot
[root@vicky ~]# mkdir -p /mnt/chroot/boot
[root@vicky ~]# mount /dev/sdh1 /mnt/chroot/boot
[root@vicky ~]# 

And extract the tarball.

[root@vicky manjaro]# cd /home/staf/Downloads/isos/manjaro/
[root@vicky manjaro]# tar xpzvf Manjaro-ARM-xfce-rpi4-21.07.tgz -C /mnt/chroot/
[root@vicky manjaro]# sync


To continue the setup we need to boot or chroot into the operating system. It is possible to run ARM64 code on an x86_64 system with qemu, which will emulate an ARM64 CPU.

Install qemu-arm-static

Install the qemu-arm-static package. It is not in the main Arch Linux distribution, but it’s available in the AUR.

[staf@vicky ~]$ yay -S qemu-arm-static 

copy qemu-arm-static

Copy the static qemu binary into the chroot (for an aarch64 chroot the binary actually used is qemu-aarch64-static).

[root@vicky manjaro]# cp /usr/bin/qemu-arm-static /mnt/chroot/usr/bin/
[root@vicky manjaro]# 

mount proc & co

To be able to run programs in the chroot we need the proc, sys and dev filesystems mapped into the chroot.

For DNS resolution to work you need to mount /run into the chroot and start the systemd-resolved.service.

[root@vicky ~]# mount -t proc none /mnt/chroot/proc
[root@vicky ~]# mount -t sysfs none /mnt/chroot/sys
[root@vicky ~]# mount -o bind /dev /mnt/chroot/dev
[root@vicky ~]# mount -o bind /dev/pts /mnt/chroot/dev/pts
[root@vicky ~]# mount -o bind /run /mnt/chroot/run/
[root@vicky ~]# 


Start the systemd-resolved.service on your host system. This is required to have DNS available in your chroot during the installation.

Alternatively, you can use a proxy during the installation.

[root@vicky ~]# systemctl start systemd-resolved.service 


Chroot into the ARM64 installation.

LANG=C chroot /mnt/chroot/

Set the PATH.

[root@vicky /]# export PATH=/sbin:/bin:/usr/sbin:/usr/bin

And verify that we are running aarch64.

[root@vicky /]# uname -a 
Linux vicky 5.12.19-hardened1-1-hardened #1 SMP PREEMPT Tue, 20 Jul 2021 17:48:41 +0000 aarch64 GNU/Linux
[root@vicky /]# 

Update and install vi

Update the public keyring

[root@vicky /]# pacman-key --init
gpg: /etc/pacman.d/gnupg/trustdb.gpg: trustdb created
gpg: no ultimately trusted keys found
gpg: starting migration from earlier GnuPG versions
gpg: porting secret keys from '/etc/pacman.d/gnupg/secring.gpg' to gpg-agent
gpg: migration succeeded
==> Generating pacman master key. This may take some time.
gpg: Generating pacman keyring master key...
gpg: key DC12547C06A24CDD marked as ultimately trusted
gpg: directory '/etc/pacman.d/gnupg/openpgp-revocs.d' created
gpg: revocation certificate stored as '/etc/pacman.d/gnupg/openpgp-revocs.d/55620B2ED4DE18F5923A2451DC12547C06A24CDD.rev'
gpg: Done
==> Updating trust database...
gpg: marginals needed: 3  completes needed: 1  trust model: pgp
gpg: depth: 0  valid:   1  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 1u
[root@vicky /]# 
[root@vicky /]# pacman-key --refresh-keys
[root@vicky /]#
[root@vicky etc]# pacman -Qs keyring
local/archlinux-keyring 20210616-1
    Arch Linux PGP keyring
local/archlinuxarm-keyring 20140119-1
    Arch Linux ARM PGP keyring
local/manjaro-arm-keyring 20200210-1
    Manjaro-Arm PGP keyring
local/manjaro-keyring 20201216-1
    Manjaro PGP keyring
[root@vicky etc]# 
[root@vicky etc]# pacman-key --populate archlinux manjaro archlinuxarm manjaro-arm

Update all packages to the latest version.

[root@vicky etc]# pacman -Syyu
:: Synchronizing package databases...
 core                                   237.4 KiB   485 KiB/s 00:00 [#####################################] 100%
 extra                                    2.4 MiB  2.67 MiB/s 00:01 [#####################################] 100%
 community                                6.0 MiB  2.74 MiB/s 00:02 [#####################################] 100%
:: Some packages should be upgraded first...
resolving dependencies...
looking for conflicting packages...

Packages (4) archlinux-keyring-20210802-1  manjaro-arm-keyring-20210731-1  manjaro-keyring-20210622-1

Total Installed Size:  1.52 MiB
Net Upgrade Size:      0.01 MiB

:: Proceed with installation? [Y/n] 

We need an editor.

[root@vicky /]# pacman -S vi
resolving dependencies...
looking for conflicting packages...

Packages (1) vi-1:070224-4

Total Download Size:   0.15 MiB
Total Installed Size:  0.37 MiB

:: Proceed with installation? [Y/n] y
:: Retrieving packages...
 vi-1:070224-4-aarch64                         157.4 KiB  2.56 MiB/s 00:00 [##########################################] 100%
(1/1) checking keys in keyring                                             [##########################################] 100%
(1/1) checking package integrity                                           [##########################################] 100%
(1/1) loading package files                                                [##########################################] 100%
(1/1) checking for file conflicts                                          [##########################################] 100%
(1/1) checking available disk space                                        [##########################################] 100%
:: Processing package changes...
(1/1) installing vi                                                        [##########################################] 100%
Optional dependencies for vi
    s-nail: used by the preserve command for notification
:: Running post-transaction hooks...
(1/1) Arming ConditionNeedsUpdate...
[root@vicky /]# 

Unlock the encryption remotely

When you have multiple systems it’s handy to be able to unlock the encryption remotely.

Install the required mkinitcpio packages

[root@vicky /]# pacman -S mkinitcpio-utils mkinitcpio-netconf mkinitcpio-dropbear



Add netconf dropbear encryptssh to HOOKS before filesystems in /etc/mkinitcpio.conf.

Don’t include the encrypt hook, as this will cause the boot image to try to unlock the encryption twice, and your system will fail to boot.

[root@vicky /]#  vi /etc/mkinitcpio.conf
HOOKS=(base udev plymouth autodetect modconf block netconf dropbear encryptssh filesystems keyboard fsck)
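The same edit as a one-liner, handy when scripting this across several Pis (a sketch; it assumes the stock HOOKS line still contains `filesystems` as shipped by Manjaro):

```shell
# Splice the remote-unlock hooks in front of 'filesystems' in the HOOKS array.
# 'encrypt' is deliberately not added: encryptssh takes over that role.
add_unlock_hooks() {
  sed 's/\(HOOKS=([^)]*\)filesystems/\1netconf dropbear encryptssh filesystems/'
}
# Usage inside the chroot (hypothetical temp file):
#   add_unlock_hooks < /etc/mkinitcpio.conf > /tmp/mkinitcpio.conf \
#     && mv /tmp/mkinitcpio.conf /etc/mkinitcpio.conf
```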

Create the boot image

[root@vicky /]# ls -l /etc/mkinitcpio.d/
total 4
-rw-r--r-- 1 root root 246 Jun 11 11:06 linux-rpi4.preset
[root@vicky /]# 
[root@vicky /]# mkinitcpio -p linux-rpi4
==> Building image from preset: /etc/mkinitcpio.d/linux-rpi4.preset: 'default'
  -> -k 5.10.52-1-MANJARO-ARM -c /etc/mkinitcpio.conf -g /boot/initramfs-linux.img
==> Starting build: 5.10.52-1-MANJARO-ARM
  -> Running build hook: [base]
  -> Running build hook: [udev]
  -> Running build hook: [plymouth]
  -> Running build hook: [autodetect]
  -> Running build hook: [modconf]
  -> Running build hook: [block]
  -> Running build hook: [netconf]
  -> Running build hook: [dropbear]
There is no root key in /etc/dropbear/root_key existent; exit
  -> Running build hook: [encryptssh]
  -> Running build hook: [filesystems]
  -> Running build hook: [keyboard]
  -> Running build hook: [fsck]
==> Generating module dependencies
==> Creating gzip-compressed initcpio image: /boot/initramfs-linux.img
==> Image generation successful
[root@vicky /]# 

We need to have the ssh host keys created before we can finish the configuration, so we’ll complete the remote-unlock setup after the first boot of the Raspberry Pi.

update boot settings…

Get the UUID for the boot and the root partition.

[root@vicky boot]# ls -l /dev/disk/by-uuid/ | grep -i sdh
lrwxrwxrwx 1 root root 12 Jul  8 11:42 xxxx-xxxx -> ../../sdh1
lrwxrwxrwx 1 root root 12 Jul  8 12:44 xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx -> ../../sdh2
[root@vicky boot]# 

The Raspberry Pi uses cmdline.txt to specify the boot options.

[root@vicky ~]# cd /boot
[root@vicky boot]# 
[root@vicky boot]# cp cmdline.txt cmdline.txt_org
[root@vicky boot]# 

Remove splash, add plymouth.enable=0, set console=tty1, and add the cryptdevice and root parameters pointing to the encrypted volume.

cryptdevice=/dev/disk/by-uuid/43c2d714-9af6-4d60-91d2-49d61e93bf3e:cryptroot root=/dev/mapper/cryptroot rw rootwait console=serial0,115200 console=tty1 selinux=0 plymouth.enable=0 quiet plymouth.ignore-serial-consoles smsc95xx.turbo_mode=N dwc_otg.lpm_enable=0 kgdboc=serial0,115200 elevator=noop usbhid.mousepoll=8 snd-bcm2835.enable_compat_alsa=0 audit=0
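Those edits can also be scripted; a sketch (the UUID is the example from this post — substitute your own, and run it against the backup copy first):

```shell
# Rewrite a stock cmdline.txt: drop 'splash', point root at the LUKS mapping,
# prepend the cryptdevice parameter and disable plymouth.
fix_cmdline() {
  sed -e 's/ splash//' \
      -e 's|root=[^ ]*|root=/dev/mapper/cryptroot|' \
      -e 's|^|cryptdevice=/dev/disk/by-uuid/43c2d714-9af6-4d60-91d2-49d61e93bf3e:cryptroot |' \
      -e 's|$| plymouth.enable=0|'
}
# Hypothetical usage:
#   fix_cmdline < /boot/cmdline.txt_org > /boot/cmdline.txt
```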


Update /etc/fstab; only the boot partition needs an entry, as the root filesystem is set up from the kernel command line.

[root@vicky etc]# cp fstab fstab_org
[root@vicky etc]# vi fstab
[root@vicky etc]# 
# Static information about the filesystems.
# See fstab(5) for details.

# <file system> <dir> <type> <options> <dump> <pass>
UUID=xxxx-xxxx  /boot   vfat    defaults        0       0

Finish your setup

Set the root password.

[root@vicky etc]# passwd

Set the timezone.

[root@vicky etc]# ln -s /usr/share/zoneinfo/Europe/Brussels /etc/localtime

Generate the required locales.

[root@vicky etc]# vi /etc/locale.gen 
[root@vicky etc]# locale-gen

Set the hostname.

[root@vicky etc]# vi /etc/hostname

clean up

Exit chroot

[root@vicky etc]# 
[root@vicky ~]# uname -a
Linux vicky 5.12.19-hardened1-1-hardened #1 SMP PREEMPT Tue, 20 Jul 2021 17:48:41 +0000 x86_64 GNU/Linux
[root@vicky ~]# 

Make sure that there are no processes still running from the chroot.

[root@vicky ~]# ps aux | grep -i qemu
root       29568  0.0  0.0 151256 15900 pts/4    Sl   08:42   0:00 /usr/bin/qemu-aarch64-static /bin/bash -i
root       46057  0.1  0.0 152940 13544 pts/4    Sl+  08:50   0:05 /usr/bin/qemu-aarch64-static /usr/bin/ping
root      151414  0.0  0.0   7072  2336 pts/5    S+   09:52   0:00 grep -i qemu

Kill the processes from the chroot.

[root@vicky ~]# kill 29568 46057
[root@vicky ~]# 

Umount the chroot filesystems.

[root@vicky ~]# umount -R /mnt/chroot
[root@vicky ~]# 

Close the LUKS volume…

[root@vicky ~]# cryptsetup luksClose cryptroot
[root@vicky ~]# sync
[root@vicky ~]# 


Connect the USB disk to the Raspberry Pi and power it on. If you are lucky, the Pi will boot from the USB device and ask you for the passphrase to decrypt the root filesystem.

Configure unlock with ssh

Configure dropbear

Set the root key

Copy your public ssh key to /etc/dropbear/root_key

[root@staf-pi002 dropbear]# cat > root_key
[root@staf-pi002 dropbear]# chmod 600 root_key 
[root@staf-pi002 dropbear]# 

Convert the ssh rsa host key to pem

Dropbear can only handle the PEM format. We’ll need to convert the host key to the PEM format.

[root@staf-pi002 dropbear]# ssh-keygen -m PEM -p -f /etc/ssh/ssh_host_rsa_key
Key has comment 'root@stafwag-pi002'
Enter new passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved with the new passphrase.
[root@staf-pi002 dropbear]# 

Recreate the boot image.

[root@staf-pi002 dropbear]# mkinitcpio -p linux-rpi4
==> Building image from preset: /etc/mkinitcpio.d/linux-rpi4.preset: 'default'
  -> -k 5.10.52-1-MANJARO-ARM -c /etc/mkinitcpio.conf -g /boot/initramfs-linux.img
==> Starting build: 5.10.52-1-MANJARO-ARM
  -> Running build hook: [base]
  -> Running build hook: [udev]
  -> Running build hook: [plymouth]
  -> Running build hook: [autodetect]
  -> Running build hook: [modconf]
  -> Running build hook: [block]
  -> Running build hook: [netconf]
  -> Running build hook: [dropbear]
Key is a ssh-rsa key
Wrote key to '/etc/dropbear/dropbear_rsa_host_key'
Error: Error parsing OpenSSH key
Error reading key from '/etc/ssh/ssh_host_dsa_key'
Error: Unsupported OpenSSH key type
Error reading key from '/etc/ssh/ssh_host_ecdsa_key'
dropbear_rsa_host_key : sha1!! 2a:a4:d5:a0:00:ce:1e:9f:88:84:72:f2:03:ce:ac:4a:27:11:da:09
  -> Running build hook: [encryptssh]
  -> Running build hook: [filesystems]
  -> Running build hook: [keyboard]
  -> Running build hook: [fsck]
==> Generating module dependencies
==> Creating gzip-compressed initcpio image: /boot/initramfs-linux.img
==> Image generation successful
[root@staf-pi002 dropbear]# 

Configure a static ip address

Update /boot/cmdline.txt with your IP configuration.

cryptdevice=/dev/disk/by-uuid/43c2d714-9af6-4d60-91d2-49d61e93bf3e:cryptroot root=/dev/mapper/cryptroot rw rootwait console=serial0,115200 console=tty1 selinux=0 plymouth.enable=0 quiet plymouth.ignore-serial-consoles smsc95xx.turbo_mode=N dwc_otg.lpm_enable=0 kgdboc=serial0,115200 elevator=noop usbhid.mousepoll=8 snd-bcm2835.enable_compat_alsa=0 audit=0
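For completeness: the netconf hook reads the kernel’s standard ip= parameter. A hedged example of what gets appended to cmdline.txt (the addresses, hostname and interface name below are placeholders, not values from my actual setup):

```shell
# Appended to the cmdline: a static address for the initramfs, so dropbear is
# reachable before the root filesystem is unlocked.
# Format: ip=<client-ip>:<server-ip>:<gateway>:<netmask>:<hostname>:<interface>:<autoconf>
ip=192.168.1.50::192.168.1.1:255.255.255.0:staf-pi002:eth0:none
```

After the reboot you should then be able to ssh to that address as root and type the LUKS passphrase at the prompt dropbear presents.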

Reboot and test.

Have fun!


August 06, 2021

I published the following diary on “Malicious Microsoft Word Remains A Key Infection Vector“:

Despite Microsoft’s attempts to make its Office suite more secure and disable many automatic features, despite the fact that users are warned that suspicious documents should not be opened, malicious Word documents remain a key infection vector today. One of our readers (thanks Joel!) shared a sample that he received and, unfortunately, opened on his computer. The document was delivered to him via a spoofed email (sent by a known contact)… [Read more]

The post [SANS ISC] Malicious Microsoft Word Remains A Key Infection Vector appeared first on /dev/random.

August 04, 2021

Occasionally (in about 4% of the messages I receive) I get a job offer for somewhere in another country.

This is a list of places outside of Belgium where people are apparently interested in having me. 😀

  • India (Hyderabad)
  • Germany (Stuttgart, Wiesbaden)
  • United Kingdom (London)
  • France (Paris)
  • Italy (Turin)
  • Spain (Madrid)
  • Netherlands (Amsterdam, Rotterdam, The Hague, Eindhoven, Almere, Arnhem, Deventer, Delft)
  • Sweden (Stockholm)
  • Austria (Graz)
  • Switzerland (Zurich)
  • Norway (Stavanger)
  • Luxembourg (Luxembourg City)

I have never considered moving permanently to another country for work, and I wouldn’t feel comfortable moving to a country where I don’t speak the language. Even if the company language is English, I would still need to communicate with people in everyday life, for example when going to the shop. So from the list above, only France and the Netherlands would remain.

Besides the language, there is still the matter of being cut off from the people who matter to me. Yes there is the internet, and during the pandemic there was virtually no other way to stay in touch, but still… it’s not the same. I already have some friends in the Netherlands, so (hypothetically) I would feel less alone there. But there are still plenty of interesting local companies to work for, so no thanks for now.

Have you ever been invited to work abroad? If yes, what was your motivation for doing so? What were your experiences? Feel free to share in the comments!

The post Working abroad? appeared first on

August 03, 2021

After reading a few hundred emails from recruiters, I see a couple of trends popping up. I’m being contacted for job offers that really aren’t relevant or interesting for me. Some of them may be attributed to automatic keyword scanning. But still. If possible, I would kindly ask everyone not to contact me for any of the following:

  • Freelance: I have never done freelance before. Working freelance means that I would first have to start all the paperwork to become self-employed, and at this moment I’m not interested in doing all that. Maybe that could change in the faraway future, but at this point in my life I prefer permanent positions.
  • C/C++ embedded development: At one of my previous jobs, I did testing on the embedded software of a smart printer. Testing. Not development. I have never written a single line of C or C++ in my life. I would probably be able to read and understand other people’s code, but I’m sure that there are plenty of people who are really fluent in C/C++.
  • Drupal development: A long, long time ago, I made and maintained a few small Drupal sites. I have also been to one or two Drupal Dev Days in the early 2000s. I think I still have a T-shirt somewhere. But in all that time, I only did Drupal admin; I never went into the nitty-gritty PHP to write custom Drupal code. And I’m pretty sure that my Drupal skills are quite rusty by now.
  • Node.js development: Oh dear. I did a few tiny Node.js projects: some “glue code”, some rapid prototyping. Nothing fancy, nothing production quality, never more than 100 lines of code. Let’s not do that.
  • SharePoint development: With the eternal words of William Shakespeare:

Fie on’t! ah fie! ’tis an unweeded garden,
That grows to seed; things rank and gross in nature
Possess it merely. That it should come to this!

Hamlet, Act I, Scene ii

  • Quality Control Operator: This is typically a case of blindly searching for keywords and not verifying the results. I have worked as a Software Quality Engineer, so if you search only for “quality”, you’ll end up with jobs where you do actual physical inspection of physical products. Rule of thumb: if I can’t test it with an Assert-statement in some kind of programming language, then it’s probably not the kind of “quality” that I’m looking for.
  • Production / “blue collar jobs”: Yeah well let’s not do that at all, shall we? With all due respect for the people who do this type of work, and some of it is really essential work, but I don’t think that this would ever make me happy.
  • First line tech support: Been there, done that, got the battle scars. Never again, thank you very much.

Benefits for not contacting me for any of these: you don’t waste time chasing a dead-end lead, and I can spend more time on reading and reacting to job offers that actually are relevant, interesting and even exciting. Everybody happy! 🙂

The post Thanks, but no thanks appeared first on

July 30, 2021

The public cloud is a different beast than an on-premise environment, and that also reflects itself in how we (should) look at the processes that actively steer infrastructure design and architecture. Among these are business continuity, severe incident handling, and the hopefully-never-needed disaster recovery. When building up procedures for handling disasters (DRP = Disaster Recovery Plan or Disaster Recovery Planning), it is important to keep in mind what these are about.

I published the following diary on “Infected With a .reg File“:

Yesterday, I reported a piece of malware that uses to fetch its next stage. Today, I spotted another file that is also interesting: a Windows Registry file (with a “.reg” extension). Such files are text files created by exporting values from the Registry (export), but they can also be used to add or change values in the Registry (import). Being text files, they don’t look suspicious… [Read more]

The post [SANS ISC] Infected With a .reg File appeared first on /dev/random.

July 29, 2021

I published the following diary on “Malicious Content Delivered Through“:, also known as the “way back machine” is a very popular Internet site that allows you to travel back in time and browse old versions of a website (like the ISC website). It works like regular search engines and continuously crawls the internet via bots. But there is another way to store content on You may create an account and upload some content by yourself… [Read more]

The post [SANS ISC] Malicious Content Delivered Through appeared first on /dev/random.

July 27, 2021

Autoptimize 2.9 was released earlier today. It features:

  • New: per page/ post Autoptimize settings so one can disable specific optimizations (needs to be enabled on the main settings page under “Misc Options”).
  • New: “defer inline JS” as sub-option of “do not aggregate but defer” allowing to defer (almost) all JS
  • Improvement: Image optimization now automatically switches between AVIF & WebP & Jpeg even if lazyload is not active (AVIF has to be explicitly enabled).
  • Improvement: re-ordering of “JavaScript optimization” settings
  • Misc. other minor fixes, see the GitHub commit log

This release coincides with my father’s 76th birthday, who continues to be a big inspiration to me. He’s a mechanical engineer who after retirement focused his technical insights, experience and never-ending inquisitiveness on fountain pen design and prototyping, inventing a new bulkfiller mechanism in the process. Search the web for Fountainbel to find out more about him (or read this older blogpost I wrote in Dutch). Love you pops!

July 19, 2021

It’s been a long time since I last looked for a job myself. At job[-1] (7 years) and job[-2] (2 years), the employers contacted me while I was already working somewhere else, and at job[-3] I worked for 5 years, so all added up, that makes more than 14 years since I last did anything like this.

Job sites

I started by creating or updating a profile on a couple of job sites:

There are a couple more job sites that I know of but haven’t done anything with. Please leave a comment if you think any of them offer benefits over those listed above.

  • Viadeo (mostly French, so probably less useful)
  • Xing (I think they are mostly German-based)
  • StepStone
  • Facebook Job Search (I can’t imagine that any employer on Facebook Job Search wouldn’t also be on LinkedIn, but maybe I’ll try it to see if the search works better there)

I have also updated my CV and I’ve put it online:

A torrent of messages

But then — I think — I made a mistake. The weather was nice, I wanted to be outdoors, trying to unwind a bit from the unusual times of the past months, and I disconnected.

Meanwhile the messages started pouring in, via email, LinkedIn (messages and connection requests), and occasionally a phone call from an unknown number. First just a few, then dozens, and just a few weeks later, already a couple of hundred. Oops.

The thing is, while I was technically available, I wasn’t yet mentally available. I still had to disconnect from the previous job, where I worked for more than 7 years, and I needed to think about what I really want to do next. Should I do something similar as before, because I already have the experience? Or should I try to find something that truly sparks joy? More on that later.


Anyway, I had to come up with some strategies to deal with this high volume of communication. First of all, to keep from going completely crazy, I defined a schedule, because otherwise I’d be responding to messages 24/7. There are other important activities too, like actively browsing through the job listings on various sites, keeping up to date with current technology, reaching out to my network, maintaining a social media presence (like this blog), or, you know, being social, having hobbies, and life in general.

One thing I noticed right away in many messages is that people ask me for a CV — even though my LinkedIn profile is current. But I get it. A separate document doesn’t confine me to the format of one specific website, and it helps me to emphasize what I think is important. So I made sure that my CV is available at an easy-to-reach URL:

Then I made two short template messages, one in Dutch and one in English, to thank people for contacting me, tell them where they can find my CV, and — for the LinkedIn people — give them my email address. That’s because I find it easier to track conversations in my mailbox. I can also give labels and flags to conversations, to help me identify the interesting ones.


On LinkedIn, it went like this:

  • Read message.
  • Copy contact details to a spreadsheet.
  • Copy/paste the Dutch or English template message, so that they have my CV and email address.
  • If their message was really interesting(*), add an additional message that I’ll get back to them, and close the conversation. That’ll move it to the top of the message queue.
  • If their message wasn’t interesting or unclear, archive the conversation. If they come back after reading my CV, they’ll either end up in my mailbox, or if they use LinkedIn again, they’ll pop back up at the top of the message queue. But I don’t want to worry about the kind of recruiters that are just “fishing”.

This way I reduced my LinkedIn messages from about 150 to about 20. That’s 20 job offers that I want to give a second, more detailed look. Wow. And that’s just LinkedIn.

(*) What makes a message interesting?

  • It’s relevant.
  • The job isn’t too far to commute.
  • They clearly read my LinkedIn profile.
  • There is a detailed job description.
  • My gut feeling.


Email is another huge source of messages. Fortunately Gmail gives me some tools there to help me. One of the first things I had to do, was to clean out my mailbox. Seriously. It was a dumpster fire. My Inbox had thousands (!) of unread emails. I used rules, filters, deleted emails (I think I deleted more than 100 000 emails), archived emails, and unsubscribed from many, many newsletters that had accumulated over the years. I am now at the point where there are currently 3 emails in my Primary Inbox, all 3 of them actionable items that I expect to finish in the next two weeks, and then those emails will be archived too.

Then, for any recent(ish) email about job offers, I labeled them as “jobhunt” and moved them to the Updates Inbox. That’s the Inbox that Gmail already used automatically for most of these emails, so that was convenient. (For those who don’t know: Gmail has 5 inboxes: Primary, Social, Promotions, Updates and Forums.) At this moment, there are 326 emails labeled “jobhunt”. I’m sure that there will be some overlap with LinkedIn, but still. That’s a lot.

I’ll be using Gmail’s stars, “Important” flag, and archive to classify emails. Again, just like with LinkedIn, if an email isn’t really interesting at first glance, it’ll go to the archive after I’ve sent them a short default message.


I get it. Really, I do. For some of you, talking on the phone comes naturally, you do it all the time, and it’s your preferred way of communication. For you it’s the fastest way to do your job.

But for me it’s a tough one. I wouldn’t say that I have outright phone phobia, but phone really is my least favorite communication channel. I need some time to charge myself up for a planned phone call, and afterwards I need some time to process it. Even if it is just writing down some notes about what was discussed and looking up some stuff.

It also has to do with how I process information. Speech goes in one direction, always forward, and always at the same speed. You can’t rewind speech. But that’s not how my brain works. I want to read something again and again, or skip a paragraph, or first jump to a conclusion and then jump back to see how we got there. Sometimes my thoughts go faster than I can express them, and putting them in writing helps me to see the gaps.

Calls out of the blue? I prefer to avoid those. Really. Especially the ones where people just want to get to know me. In the time it takes for me to do one such phone call (and I do take them seriously), I’m able to process several emails. So I very much prefer to focus first on contacts who have something concrete and actionable.

As mentioned above, I record contact information in a spreadsheet. I then import that information into Google Contacts, so that when someone calls me, I see their name on the screen of my phone, and not just a number. That also helps me to decide to pick up the phone or let it go to voicemail. I will get back to those that go to voicemail, but it’ll just be at my own pace.
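That spreadsheet-to-contacts step can be sketched in a few lines of Python: turn a simple contact list into a CSV that Google Contacts can import. The column names used here ("Name", "Organization 1 - Name", "Phone 1 - Type", "Phone 1 - Value") are my best guess at Google's CSV import template, so check them against an export of your own contacts before relying on them; the recruiter data is of course made up.

```python
import csv
import io

# Hypothetical contact rows, as they might sit in the tracking spreadsheet.
recruiters = [
    {"name": "Jane Doe", "company": "Acme Recruiting", "phone": "+32470000000"},
]

# Write them out using (assumed) Google Contacts CSV import column names.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["Name", "Organization 1 - Name",
                                         "Phone 1 - Type", "Phone 1 - Value"])
writer.writeheader()
for r in recruiters:
    writer.writerow({"Name": r["name"],
                     "Organization 1 - Name": r["company"],
                     "Phone 1 - Type": "Work",
                     "Phone 1 - Value": r["phone"]})

print(out.getvalue())
```

Importing the resulting file via Google Contacts’ “Import” function is what makes the caller’s name show up on the phone instead of a bare number.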

Social media presence

I’m starting to put myself a bit more out there, by engaging in conversations on LinkedIn. I have also picked up blogging again, and I’m sharing links to my posts on LinkedIn, Facebook and Twitter. Besides my Facebook profile, I also have a Facebook page, but I’m not using that fanatically, because for myself at this point I don’t see Facebook as a professional tool.

On Twitter I have two accounts: @amedee and @AmedeeVanGasse. The former is mostly for personal stuff, and is mostly in Dutch. The latter is one that I created to tweet at tech conferences, but we all know how many tech conferences there were in the last 1.5 years… 🙂 Most tweets there will be in English.


I feel like this has become a very long blog post. Maybe too long, I don’t know. Maybe I should have split it up into several parts? But to me it felt like one story I had to tell.

If any of you social media gurus out there have some opinions to share, that’s what the comment box below is for. 🙂

The post So, how is the jobhunt going? appeared first on

In my job as domain architect for "infrastructure", I often come across stakeholders who have no common understanding of what infrastructure means in an enterprise architecture. Ever since, I have been trying to figure out a way to explain it easily — to find a common, generic view of what infrastructure entails. If successful, I could use this common view to provide context for the many, many IT projects that are going on.

July 15, 2021

I have been writing about all sorts of technical topics for more than twenty years, and I have plenty of other interests besides. Yet it was only a few years ago that, in a flash of insight, I suddenly saw the common thread running through all those topics: decentralization.

  • I believe in the power of open source software, because it makes its users producers as well as consumers.

  • I am a strong advocate of self-hosted software, so that you don’t depend on the cloud services of big companies.

  • Ten years ago I was already writing about Bitcoin, and I find blockchains and other forms of distributed ledgers fascinating technology for executing transactions without central control.

  • In my fight for more privacy, what matters most to me is that people have control over their own data, which is why I’m a big fan of technologies such as Nextcloud, Solid and end-to-end encryption.

  • I think it’s important to be able to build and program things yourself, and I’m happy to be part of the DIY/maker movement that is democratizing this with the Raspberry Pi, ESP32, Arduino, 3D printers, fab labs and so on.

  • I believe households should be as self-sufficient as possible, not only for energy but also for food.

More Equal Animals - The Subtle Art of True Democracy (Source: Dan Larimer)

So when I discovered Dan Larimer’s (free to download) book More Equal Animals - The Subtle Art of True Democracy at the beginning of this year, I read it almost in one sitting. On nearly every page I found insights woven into the common thread of my interests.

In essence, a blockchain is a technique for reaching consensus without central control. In Bitcoin that means consensus about transactions, but according to Larimer the principles of blockchains can also be applied to society itself. In his view, a successful society implements a process that leads to consensus as much as possible.
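A toy sketch makes the core idea concrete: each block stores the hash of the previous block, so the entire history commits to itself and tampering with any past entry is detectable without a central authority. This is a minimal illustration of the principle only, not Bitcoin’s actual data structures or consensus protocol.

```python
import hashlib
import json

# Minimal hash-chained block: each block commits to its predecessor's hash.
def make_block(prev_hash, transactions):
    block = {"prev": prev_hash, "tx": transactions}
    payload = json.dumps({"prev": block["prev"], "tx": block["tx"]},
                         sort_keys=True).encode()
    block["hash"] = hashlib.sha256(payload).hexdigest()
    return block

genesis = make_block("0" * 64, ["alice pays bob 1"])
second = make_block(genesis["hash"], ["bob pays carol 1"])

# Rewriting the first block changes its hash, which no longer matches
# the pointer stored in the second block: the chain exposes the edit.
tampered = make_block("0" * 64, ["alice pays bob 100"])
print(tampered["hash"] == second["prev"])  # False
```

What real blockchains add on top of this chaining is a rule for deciding, without a central party, whose version of the chain everyone accepts; that coordination rule is what Larimer generalizes to society.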

Democracy, seen this way, is a process for settling disputes between multiple parties, and that is exactly what blockchains are good at. True democracy, according to Larimer, is therefore about coordinating with other people while retaining your personal autonomy and power. The ideal form of society then consists of decentralized, autonomous communities at every level.

For PC-Active I wrote at greater length about the vision Larimer describes in his book. If topics such as personal autonomy, decentralized governance, an antifragile society and the dangers of moral hazard are close to your heart, then More Equal Animals comes highly recommended.

July 12, 2021

Quick public service announcement: Autoptimize 2.9 is almost ready to be released, but given the planned release of WordPress 5.8 (July 20th) and the risk of support requests mixing up WordPress core update issues with Autoptimize update issues, Autoptimize 2.9 will probably be released one week after WordPress 5.8, so on or around Tuesday July 27th.

If you’re eager to use 2.9 (with better image optimization, improved JS optimization and per page/post Autoptimize settings), you can of course download the beta here immediately.

July 10, 2021

I have changed the title of this blog to ‘Paepe Thoon‘, because he is my favorite inhabitant of Leuven.

An account of three days of bikepacking, not always between the raindrops, through Hainaut, the north of France and the Namur region

I owe my initial training and my bikepacking philosophy to Thierry Crouzet, author of the book “Une initiation au bikepacking” (in which I make a brief appearance): ride self-supported but as light as possible, avoid roads at all costs, and prefer adventure and discovery over performance or distance.

A diligent student of Thierry’s, I now turn teacher to initiate my godson Loïc. An amusing anecdote: the age difference between Thierry and me is the same as between me and Loïc. The teaching propagates from generation to generation.

After several trips to camping shops and one very big 112 km preparation ride, we set a date for our first bikepacking trip, on a route I drew up to cross the province of Hainaut from north to south, cut through France in the Givet region and then climb back up through the Namur region.

Day 1: wild Hainaut, 103 km, 1,250 m of climbing

We meet on Friday morning on the RAVeL in Genappe. I am late: I know this route so well that I was convinced it was 10 km long. My bike computer already reads 15 km when I find Loïc champing at the bit.

Just enough time for him to show me his bikepacking setup (he has notably traded the Camelbak on his back for a bottle-carrier belt) and off we go. Barely out of the streets of Genappe, we are confronted with trails that have just been through two months of almost constant rain. That means enormous puddles and more-than-abundant vegetation. On my usual trails I had already watched paths close up completely within three or four days of good weather after weeks of rain.

On all sides we are surrounded by brambles and nettles. My arms become a veritable dictionary of the different kinds of stings and lacerations: the sharp ones, the clawing ones, the ones that blister, the ones that itch, the ones that bleed. Loïc cracks up hearing me yell, because I am one of those who yell before it hurts, a hoarse cry halfway between a banzai and a howl of pain. Loïc prefers to save his energy and suffers in silence.

Skirting the puddles sometimes turns acrobatic and, less agile than Loïc, I slip on a small spur of mud and land with both feet and my backside in an enormous pool of muck.

The sun helps us laugh off the sock-wringing under the amused eye of Loïc’s camera. I don’t know it yet, but water will be the central theme of our epic.

We finally pass Fleurus and cross the outskirts of Charleroi via Chatelineau and Châtelet. Through uninviting streets winding between blank façades, we follow the track as it dives under a motorway bridge, leads us between two houses and suddenly drops us onto magnificent trails through the fields. As if the inhabitants were determined to hide the beauty of their region from city dwellers and motorists.

After some fairly flat kilometres, the climbing suddenly makes itself felt. We reach the woods of Loverval and continue through the wooded country around Nalinnes. While the scenery is far from breathtaking, the track is a real pleasure, green and physical, and it drops us into the lovely village of Thy-le-Château.

We stop for a sandwich at a butcher’s shop. The butcher tells us he rides all over the region on an electric mountain bike and is curious to know which app we use for our routes. He writes the name “Komoot” on a piece of paper, then takes offence when I explain that we take turns placing our orders so that someone is always near the bikes.

“Nobody steals in Thy-le-Château!” he assures us with conviction. The sandwich is delicious and we press on through steep climbs and descents flooded with puddles or torrents. The difficult sections follow one another, and I make the mistake of murmuring that I dream of one flat kilometre on a main road.

I have barely finished my prayer when my evil genie grants it. Arriving at the foot of Walcourt, a strange village clinging to a steep hill, the track suggests we follow 500 m of a national road. But it turns out to be incredibly dangerous, a veritable motorway! To avoid it, we would have to climb back up the whole slope we have just descended and make a detour of several kilometres. Loïc suggests riding alongside the road, behind the guardrail. “Worth a try!” he says.

That way we stay several metres from the vehicles, protected by the barrier. However, this strip is overrun with brambles, nettles and litter thrown out by motorists. The 500 m amid the roar of trucks and cars at full speed are gruelling. I, who am sometimes woken by the motorway more than 3 km from my home, reflect that we completely underestimate the noise pollution of road traffic.

That ordeal behind us, we attack the last hill before reaching the Lacs de l’Eau d’Heure, our declared objective for our first break.

Just before the Plate Taille dam, we turn off towards a walking area around the lake. We hide in a small grove where, despite the signs forbidding it, I pull on a swimsuit to enjoy delicious 19°C water. On the opposite shore I point out the spot where Loïc did his first scuba dive with me.

Bib shorts back on, I climb into the saddle and we set off again. The track leads us along small paths running beside the dam road. We arrive at the car park of the diving spot, where we are supposed to rejoin the road, only to find a closed gate between us and it. We continue somewhat at random through the woods before stumbling upon the village of Cerfontaine.

We now leave civilization behind. Several kilometres of rugged trails await us. Loïc spots a wild boar. I see several deer. The region is wild. Two things worry Loïc: the risk of a thunderstorm, and the question of finding something to eat. Right, chief?

Fortunately, we come out at Mariembourg, where a welcoming terrace awaits us in the centre of the village. We eat lulled by the shouts of a few villagers warming up for the evening’s football match with the help of plenty of beer cans.

We study the track, the main occupation of a bikepacker on a terrace. I had planned a zigzag near Couvin to go and discover the “Fondry des Chiens” canyon. Given the late hour, I suggest cutting through the Dourbes nature reserve instead.

We have barely left Mariembourg when Loïc recognizes the station. We are on the terrain where Roudou took us during a memorable VTTnet weekend in 2015.

The Dourbes nature reserve is anything but flat. A mountain biker’s treat. A little less of one with nearly 100 km in the legs. That is part of bikepacking: calling something a treat that makes you curse in the moment.

We reach the banks of the Viroin. The track sends us up towards the château de Haute-Roche, a veritable eagle’s nest that looks inaccessible. The slope is so steep that we have to climb with one hand while pulling the bikes with the other. Loïc comes back to help me with the last few metres.

The ruins of the medieval tower rise before us. After that effort, Loïc decides he has earned the view. He goes around the tower by a narrow path that even requires a metre of climbing on the medieval wall. I hesitate to follow him, then let his enthusiasm win me over.

Loïc has discovered a terrace that majestically overlooks the valley. Behind us the tower, in front of us the void and the view. It is magnificent.

Loïc suddenly has an idea: “What if we pitched the tents here?”

I hesitate. We are on private property. The drop is not far away. The pegs might not hold in the thin soil of the terrace. But I see Loïc’s eyes sparkle. I suggest planting one peg as a test to see whether it is feasible. Loïc proposes a way of arranging the two tents on the terrace so as to be as far as possible from the edge. We end up going back to the bikes and unhooking all the bags to carry them to our terrace. Then the bikes themselves have to come along the same path. It is acrobatic, but we manage, and we are treated to a sublime sunset as we pitch our tents.

I use a little water from my Camelbak to improvise a quick shower, presenting my backside to the whole valley. A view for a view, a landscape for a landscape.

From the valley, faint cries inform us that the Belgians are losing the football match. We turn in just as the many scout camps dotted around the valley decide to launch into songs that are closer to continuous shouting. From the sound of his air mattress I can tell that Loïc is tossing and turning, unable to sleep.

Day 2: the French bush, 80 km, 1,500 m of climbing

The supporters and the scouts have barely finished their racket when the valley’s roosters take over. It is not yet 7 a.m. when I emerge from my tent. Loïc has slept very badly and is stunned by the damp dripping down inside his tent. I had hoped the altitude would protect us from the humidity of the Viroin; all the same, everything is soaked. My Camelbak, badly closed, has emptied itself into my frame bag which, being perfectly waterproof, presents me with the first bicycle with an indoor swimming pool, the height of luxury.

Fortunately, the weather is reasonably good. I had warned Loïc to allow a good hour for packing up, especially the first time. Having to pass the bikes back the other way along the tower complicates the task a little more. We practice the “no trace” philosophy, and Loïc even takes the opportunity to pick up some old cans. In the end it takes us more than an hour and a half to finally be ready to pedal. We cross the woods, descend along a road where we help a slightly lost Flemish scout get his bearings, then tackle the short but superb climb of the canons of Vierves, a climb we had already done in 2015 with Roudou and his gang without my retaining the slightest memory of it. In our thoughts, Loïc and I send our regards and memories to the friends of VTTnet.

The track then runs alongside the road on a steep singletrack before leading us to Treignes, where we have lunch in the car park of a Louis Delhaize. I notice that the track makes a big detour to avoid 3 km of road, climbing an enormous hump only to come back down a little further on, in France. As the road carries little traffic, I suggest taking it to save time. The future would prove that choice very wise.

Once in France, I steer us back towards the track. We make a fine climb towards the Roman fort of Mont Vireux. Since the fort itself sits at the end of a long dead end, we decide not to visit it and descend straight to Vireux, where we cross the Meuse.

We climb up through the town. I stop at the last house before the forest to stock up on water from some absolutely charming locals, a little disappointed that they cannot do more for me than simply hand me water.

We now leave civilization to plunge into the plateaus south of Givet. The forest tracks are magnificent and climb constantly. A few signs indicate private property. We nevertheless pass a 4×4 whose driver gives us a friendly wave, which reassures me that the track is public. But around a bend a large house appears, absurd in such a remote spot. The track skirts it and brings us to a somewhat rickety gate. I tell myself that we are on the house’s land and need to get off it. So we pass through the gate, taking care to close it behind us, and continue a splendid, very physical climb.

Around a turn, I come upon a sounder of wild boar. Several adults are protecting some fifteen piglets. The adults hesitate when they see me coming. One faces me, then changes its mind and leads the whole troop off into the forest, where I watch them scatter. Loïc arrives a little later and we press on, only to run into a herd of another kind: humans. A patriarch seems to be showing the estate to a few adults and a flock of children gathered around a pickup. He stops us with an authoritative air and asks what we are doing on this private property.

I explain, in all sincerity, my mistake at the gate and the GPS track. He accepts my explanations with good grace and tries to point us to a path that would suit us. I promise to try to mark the path as private on Komoot (without reflecting that it is actually on OpenStreetMap that it needs to be marked, and that I have not yet managed to do so). Finally, he points us to the nearest gate out of the estate, which turns out to be exactly the path our track indicates. We pass the sounder of boars and piglets again.

We climb over the gate, noting the immensity of the private property we have crossed, and are finally on a public path that runs along a plateau before diving towards the hollow separating us from the Pointe de Givet, a point we must then climb via a singletrack far too wet and too greasy for my tires. I am reduced to pushing my bike while watching Loïc scamper up like a mountain goat. Throughout the trip, the muddiest descents and climbs often border on being small mountain torrents. A new discipline is born: bikepack-canyoning.

The summit welcomes us with vast plains of tall grasses where the path seems to lose itself. The track descends into a gorge that is supposed to come out in the eastern outskirts of Givet. But the area has recently been clear-cut. We descend among the corpses of trunks and branches in a landscape of sylvan apocalypse. The cleared zone stops dead at an impassable wall of brambles and bushes. The road is only 200 m away according to the GPS, but those 200 m seem uncrossable. We climb laboriously back up through the woods to try to find a way around.

Loïc remarks that the landscape looks like an African savannah. We ride blind. Now and then the ghost of a path seems to point us in a direction. We regain the shelter of a few trees before emerging onto a vast meadow of very tall grasses, weeds and flowers. Since we are much too far west, I suggest heading east. A slight opening in a thicket lets us slip down a wooded slope, which I descend on my backside and Loïc on his pedals. The foot of this steep descent drops us into a gigantic wheat field. Wheat as far as the eye can see, and no path, no opening. We resign ourselves to crossing it along tractor tracks so as not to ruin the crops. The tracks let us cross the width of the field before veering off to the west, where the edge of the field is not even visible.

Through a particularly dense hawthorn hedge, we glimpse a second meadow. With many howls of pain and rage, we push our bikes through the hedge, then follow the same passage ourselves. From the grazing meadow, it becomes easy to regain a path running behind the gardens of a few houses.

After several hours of struggle and very few kilometres covered, we finally regain civilization. Loïc has just been baptized into an essential element of bikepacking: the improvised bearing (otherwise known as “We are completely lost!”).

You might think that with GPS and modern mapping, getting lost has become impossible. But the changing, living reality of nature does not sit well with the fixity of a map. The bikepacker’s state of mind quickly slides from “Find the most appealing way to the destination” to “Find a way to the destination” to “Find a passable way”, and ends at “My kingdom for anything at all that simply lets me get through”. After arduous passages through brambles or hawthorns, after hurtling down particularly steep slopes, the idea of turning back is no longer even conceivable. You have to fight to move forward, to survive.

An aphorism comes spontaneously to my lips: “The adventure begins when you want it to stop.”

We thus enter Givet from the west, even though I had planned to avoid the town. We are hungry, we are tired, and we have covered barely twenty kilometres. Loïc struggles to grasp how much time we have lost.

On a slightly seedy little square where a few meagre fairground attractions are being set up, we wolf down a sandwich. For me, one I have just bought; for Loïc, a particularly tasty one, bought that morning in Belgium, which has survived the whole adventure strapped to his bike, even coming loose before a violent descent, taking Loïc’s jacket with it and requiring him to climb back up the slope to retrieve his belongings.

Outside Givet, nature reclaims its rights. Muddy climbs alternate with singletracks overrun by puddles. We find Belgium again at the corner of a field. After a few typically Namur villages (the architectural differences between Hainaut, French and Namur hamlets leap out at me), we string together a veritable roller coaster along the banks of the Lesse.

Just as I find an excellent rhythm, an unplanned break imposes itself: the spot captivates me with the slightly unreal beauty of a small waterfall. I stop and treat myself to a foot bath while Loïc takes photos. Once out of the gorges of the Lesse, we stop to take stock.

J’avais prévu un itinéraire initial de 330km, mais, Loïc devant être absolument rentré le 4 au soir, j’ai également concocté un itinéraire de secours de 270km pour le cas où nous aurions du retard. Les itinéraires divergeaient un peu après le retour en Belgique, l’un faisant une boucle par Rochefort, l’autre revenant en droite ligne vers Waterloo.

Par le plus grand des hasards, je constate que je me suis arrêté littéralement au point de divergence. Étant donné le temps perdu le matin, il me semble beaucoup plus sage de prendre l’itinéraire court, au grand dam de Loïc, très motivé, mais très conscient de la deadline.

Le seul problème est que mon itinéraire court ne passe par aucune ville digne de ce nom avant le lendemain, que je n’ai repéré aucun camping. Loïc me demande d’une petite voix inquiète si on va devoir se coucher le ventre vide. Parce qu’il y’aura aussi la question de trouver à manger. Hein chef ?

Je propose d’aviser un peu plus loin. Sur le chemin, quelques moutons échappés de leur enclos me regardent méchamment. Le mâle dominant commence même à gratter du sabot. Je leur crie dessus en fonçant, ils s’écartent.

Arrivés à un croisement, nous consultons les restaurants disponibles dans les quelques villages aux alentours. Un détour par Ciney me semble la seule solution pour s’assurer un restaurant ouvert. Nous sommes au milieu de nos hésitations lorsqu’un vététiste en plein effort s’arrête à notre hauteur. Tout en épongeant la sueur qui l’inonde, il nous propose son aide. Sa connaissance du lieu est bienvenue : il nous conseille d’aller à Spontin pour être sûrs d’avoir à manger puis d’aller dans un super camping au bord du Bocq. Par le plus grand des hasards, il est justement en train de flécher un parcours VTT qui passe tout prêt.

We thank him and start following his instructions and his arrows. A rather picturesque little detour that takes us through singletrack that is fairly technical in places. It's hilly, and the day is starting to take its toll. Psychologically, the idea of being almost there makes these 15 km particularly gruelling. After a long descent we come out at a crossroads in the middle of Spontin, a crossroads graced, miracle of miracles, with a restaurant terrace. We sit down without hesitation. I order a panna cotta as a starter.

Loïc is very worried about not getting a spot at the campsite recommended by our fellow rider. The phone goes unanswered. Worse, that campsite is about ten kilometres away, in a hollow we would have to climb out of in the morning. As we eat, I spot a sign at the crossroads behind Loïc pointing to a campsite only 2 km away. A glance at the map tells me it's a few hundred metres from our track. I call, and the manager tells me there's no problem finding us a pitch.

After the meal, we jump on our bikes to climb those last 2 kilometres, the campsite being up on a rise. Reassured by the certainty of having a place to sleep, and on a full stomach, Loïc drops me completely on the climb. His enthusiasm doubles when he recognizes the campsite. It's a constant of this tour: while I try to show him new things, he keeps recognizing places and landscapes from one occasion or another. Sometimes even with me.

The camping pitch is magnificent: airy, quiet, with a superb view. On the other hand, the showers are scalding with no way to adjust the temperature, and the toilets are of the squat variety, with neither seat nor paper. I'd still rather shit in the woods, but the shower does me good.

The night is punctuated by spells of rain. I cross my fingers for it to be dry when it's time to pack up the tent. I have never yet packed up my gear in the rain.

Day 3: the aquanamurois, 97km, 1200m of climbing

At 7:30, I start shaking Loïc's tent. I call him. Not a sound. I try again, louder. I shake his awning. I hope he's still alive. On my fifth attempt, a faint grunt answers me: "Gnnn…"

Loïc slept like a baby. He emerges. We pack up peacefully under bright sunshine and let the tents dry.

The big worry of the day is the threat of thunderstorms. So far, we have literally slipped between the drops, riding a few hours ahead of the big storms, always in sunny spells.

Under a sun that quickly turns fierce, we escape into a series of small, overgrown singletracks before starting the climb out of Crupet. We scale a magnificent path at more than 15%. On the left, the view over the valley is absolutely breathtaking, with myriads of blue flowers in the foreground. Unless it's the gradient that's taking our breath away. Hikers cheer us on; I'm unable to answer. The summit appears at the edge of a scout camp. After a few hundred metres on the road, a sign for a medieval church catches my attention. This time, I'm the one who recognizes the place! We are 1 km from where I got married. I drag Loïc on a quick out-and-back to send a souvenir photo to my wife.

From there on, I know the area from having ridden it many times. After crossing some fields, we plunge into the forests of the Namur region, their trails ravaged by the storms and torrents of mud. In the village of Sart-Bernard, I stop a local to ask whether there's a shop or a bakery nearby. From his answer, I gather I might as well have asked for a 15-screen cinema, an amusement park and a business centre.

So we head deeper into the forest, zigzagging between private paths, to finally come out at Dave. An unfortunately unavoidable kilometre of main road lets us cross the Meuse on a lock, just as it starts filling to let a boat through. We continue along the river to enjoy a crêpe in Wépion. The sky clouds over, but stays dry.

The crêpe devoured, it's time to climb out of the Meuse valley. My track goes up a climb I've already had the pleasure of sampling: the Fonds des Chênes. Never too steep or technical, the climb is nevertheless very long and stiffens towards the end, just as you think you're leaving the woods and entering a residential neighbourhood.

I reach the top as the first drops start to fall. I barely have time to put on my jacket before the deluge is upon us. Sheltering under a tree, I wait for Loïc who, as I'll learn later, lost a lot of time by carrying straight on into a private property.

From that moment on, we ride through relentless sheets of rain. Through the woods, we drop down to Malonne, where we climb past the cemetery via hairpins worthy of an Alpine pass. The track literally crosses the cemetery between the graves. Loïc is taken aback. I reply that at least we're not disturbing anyone. Then comes the descent towards Seneffe before following the Sambre.

On our preparation day, we came through here in the other direction. We're on familiar ground; the exploratory side of bikepacking fades, giving way to the psychological ache of the return leg. Given the rain, I'm happy to be heading home. I don't dare imagine pitching a tent in the rain, or pulling on soaked clothes the next morning.

We no longer even try to avoid the puddles, which have in any case turned into unavoidable swamps. We ride metre after metre with water up to the hubs, every pedal stroke scooping water into our shoes like a noria.

Loïc has told me several times that rain motivates him. In the rain, he pedals better. I have indeed noticed that he copes rather badly with heat, whereas for me nothing is as delectable as climbing a pass in blazing sunshine.

His explanations prove true. Loïc charges ahead, climbs. I have more and more trouble keeping up. The water wears me down, and my new saddle is torturing my backside. We cross Spy, the plains of Ligny, probably just as flooded as in 1815, and the Rigenée golf course. The track goes through the Pigeolet wood, but I remember having been blocked at the Cocriamont castle on a previous adventure. I impose a U-turn and we reach Sart-Dames-Avelines by road.

As we arrive in Genappe, the rain, which had already eased, stops completely. We take advantage of it to have a last drink on a terrace before saying goodbye. We feel like we're home.

I still have 15 km to go, though; 15 km mostly on the Ravel. My shoes are almost dry, so optimism is the order of the day.

That's without counting on the Ravel being flooded in places and crossed by mudslides. Some houses have barricaded themselves behind sandbags. Uprooted trees make the way difficult. Crossing a puddle I believed to be wide but shallow, the Ravel being in theory essentially flat, I sink in up to the hub. I am covered, along with my bike and my bags, in thick, sticky, greasy yellow mud.

It was written that I could not arrive dry in Louvain-la-Neuve…

280 km, nearly 4,000 m of climbing, and a memorable experience. I'm delighted to have condensed into 3 days all the experiences of a bikepacking trip: wild camping, hours lost pushing the bike through trackless scrub, discouragement followed by hope, unplanned stops and delightful terraces.

Now that Loïc has tasted the joys of "extreme" bikepacking, I have only one wish: to head out again and explore other regions. I have a special attraction to the Fagnes… On the other hand, this experience of the rain has made me give up the dream of crossing Scotland by bikepacking.

With a Grande Traversée du Massif Central (GTMC to its friends) on the horizon with Thierry, two worries remain: my backside still hurts as much as ever (maybe I should take the plunge and go full-suspension) and I don't feel psychologically equipped to face a bivouac in the rain.

But then, doesn't the adventure begin the moment you want it to stop?

Forget social media for a moment and subscribe by email or RSS (at most 2 posts a week and nothing else). Latest book: Printeurs, a cyberpunk thriller. To support the author, give and share his books.

This text is published under the CC-By BE licence.

July 08, 2021

I published the following diary on “Using Sudo with Python For More Security Controls“:

I’m a big fan of the Sudo command. This tool, available on every UNIX flavor, allows system administrators to grant certain users/groups access to certain commands, run as root or another user. This is performed with a lot of granularity in the access rights and logging/reporting features. I’ve been using it for many years and I’m still learning great stuff about it. Yesterday, at the Pass-The-Salt conference, Peter Czanik presented a great feature of Sudo (available since version 1.9): the ability to extend its features using Python modules… [Read more]

The post [SANS ISC] Using Sudo with Python For More Security Controls appeared first on /dev/random.

As I mentioned in An IT services overview, I try to keep track of the architecture and designs of IT services and solutions in a way that helps me stay in touch with the various services and solutions out there. Just as system administrators try to strike a balance between working on documentation (often considered a chore) and using a structure simple and standard enough for the organization to benefit from, architects should keep track of architecturally relevant information as well.

So in this post, I'm going to explain a bit more on how I approach documenting service and solution insights for architectural relevance.

July 07, 2021

I did not write any wrap-up for a while because we are all stuck at home and most conference organizers still decided to cancel live events (though that seems to be changing towards the end of 2021, with some nice events already scheduled). For the second time, Pass-The-Salt was converted to a virtual event. I like this small event with a great atmosphere. This edition in a few numbers: they received 27 proposals and selected 16 of them, presented spread across three half-days. Let’s review what was presented.

The first day started with Eloi Benoist-Vanderbeken, who talked about jailbreak detection mechanisms in iOS and how to bypass them. iOS is a closed operating system and Apple really takes this seriously. Indeed, when a device has been “jailbroken”, all security measures are bypassed and the user gains full control of the device. Some apps treat this as a major problem and try to detect it. Think about banking applications but also games. To understand how the jailbreak detection is implemented, it’s mandatory to reverse the app to read the code. Also, Apple restricts a lot of operations: only signed code can be executed, sideloading is not allowed (remember the Epic Games problem?) and apps may only use public APIs. One of the best tools to help debug an app is Frida. How to debug an iOS app? Without a jailbreak in place, Frida is injected using ptrace, but the app must be repackaged, with a lot of side effects. With a jailbreak already in place, Frida can just attach to the process. As a case study, a banking app was taken as an example. When the app is launched on a jailbroken device, it just crashes. A first set of checks did not reveal anything special; digging deeper, the app tried to trigger an invalid entry error. Some Frida usage examples were reviewed, based on hooks. Frida can hook syscalls to get information about the parameters and/or return values, or completely replace the syscall with another function. It can also execute a function before any syscall. This is very interesting to follow what’s happening (example: keeping an eye on decrypted data). Another technique is the classic usage of breakpoints. Some techniques to detect jailbreaks: accessing specific files (with open(), utime(), stat(), …), trying to detect the presence of a debugger, checking the parent PID, checking whether the root filesystem is writable.

The second presentation was the one of Esther Onfroy. She’s an Android expert, hacktivist and speaker. The title of Esther’s presentation was “Pithus: let’s open the Android pandora’s box”. Pithus is a project that focuses on analyzing Android applications (APK). Why a new project? According to Esther, threat intelligence should not be the property of private companies. For some, the price of a subscription is crazy high, hence the idea to develop an open-source, self-hostable tool based on free software. Pithus is able to check domains, certificates, the application timeline, perform behavioral analysis, and it is linked to 3rd-party services like MalwareBazaar. One specific function was presented in more depth: CFG (“Control Flow Graph”) dissection using AndroGuard. This is an interesting feature to search for shared code.

Benoit Forgette presented “Hook as you want it“. Android is very popular in the mobile landscape. It provides a standard bootloader, an API for hardware, an ARM TrustZone, a high-level language (easy to write apps) and IPC. He reviewed the Android permission enforcement mechanism and the use of the AST (abstract syntax tree). A hook is a programming trick that helps to intercept system calls; once again, Frida is the tool used here.

The next talk was “PatrowlHears and Survival tips for prioritizing threats” by Nicolas Mattiocco. It started with a quick overview of Patrowl, with the motto “demotivate attackers”. In vulnerability management, process is key: “How to manage vulnerabilities?” It’s not only about running a tool at regular intervals.

One of the problems is the changing landscape, a lack of skilled people, budget pressure, … Automation is also a key point. And when running the tool more often, you get more findings and more alerts. How to improve this? Nicolas then explained how to prioritize findings. Some ideas: check the CVSS score. Are we vulnerable? Is it exposed to the Internet? Is it a critical asset? Is there a functional exploit? A patch or workaround? etc… Most important: does it have a nice logo? 🙂 To help in this process, PatrowlHears was introduced. It provides a scalable, free and open-source solution for orchestrating security operations and providing threat intelligence feeds. PatrowlHears is an advanced, real-time vulnerability intelligence platform covering CVEs, exploits and threat news. It’s also a feed aggregator (Packetstorm, Tenable Nessus DB, Metasploit, Exploit-DB, ZDI, …). A great tool to check out if you’re dealing with vulnerability management.

To wrap up the first day, we had a pretty funny presentation by Michael Hamm. The Luxembourg CERT always produces nice content and it was, once again, the case. Michael presented three live demos (yes, live) of USB device alterations from a forensics point of view. The first one was about changing the content of a USB disk mounted in “read-only” mode. Yes, it’s possible; that’s why using a physical write-blocker is always highly recommended. The second one was a kind of DoS attack: by modifying only a few bytes on the filesystem, a Linux system will, once the modified disk is inserted, mount it up to 250 times 😉 Finally, the last demo was even more impressive: by altering the MBR (Master Boot Record) and creating a “polyglot boot sector”, Michael demonstrated how different files can be seen by different operating systems. Mounting the drive on Linux, some files will be listed; on Windows, other files will be listed.

Day two started with “Fedora CoreOS, a container focused OS to securely deploy and run applications” by Timothée Ravier. Intro to CoreOS: automatic updates, automatic provisioning (Ignition is used to provision nodes on first boot), immutable infrastructure and, finally, heavy use of containers. What about security? “Software has bugs”, so reduce the OS footprint (and thereby the attack surface). Next, use safer languages: Go & Rust (memory-safe languages). They use rpm-ostree to update the system (think of it as a “git for the operating system”), with read-only filesystems and a clear split across important directories (/usr, /etc, /var). Everything runs in containers (podman, docker). Confinement with SELinux. Demos! The first one was to run a Matrix server. More about CoreOS here.

The second talk was provided by Clément Oudot, a regular PTS speaker: “Hosting Identity in the Cloud with free softwares“. After a nice live song, Clément did a quick introduction/recap about IAM (“Identity and Access Management”) and a review of the market: big players are expensive – pay per user – and complex to install/manage.

So what about FOSS? A lot of tools exist, but none covers the complete set of IAM features. FusionIAM is a project that proposes a unified platform based on different tools: OpenLDAP, LDAP ToolBox, FusionDirectory, LDAP Synchronization Connector, and LemonLDAP::NG. Feature-wise, it provides white pages, an access manager, a sync connector, a service desk, a directory server and a directory manager. A great solution to investigate if you need to deploy IAM in a cloud environment.

The next presentation was “Biscuit: pubkey signed token with offline attenuation and Datalog authz policies” by Geoffroy Couprie. Geoffroy works for a hosting company and often faces authentication issues (for example with APIs). A classic way to authenticate users is a JWT (JSON Web Token). JWTs are usually signed using public-key cryptography and contain a lot of data. They are not perfect, so Geoffroy developed a new authentication and authorization token called Biscuit. You can find more information about this project here (code) and here.

Ange Albertini presented “Generating Weird Files“. Ange is known to play a lot with file formats. He already presented several pieces of research where he demonstrated how files can be built in a way that, depending on the program used to open them, they will present different types of content. Today, Ange presented his tool called Mitra. The open-source tool takes 2 files, checks their types, and generates possible polyglots.

What are the strategies used by Mitra?

  • Concatenation: easy because some file formats do not need to start at offset 0
  • Cavities (filling empty spaces)
  • Parasite (abusing comments in metadata)
  • Zipper

Ange performed some demos. It was funny to see a reference to a single file that is valid in up to 190 different file formats! The conclusion? These techniques could be prevented by creating new file formats that start at offset 0 with magic bytes.
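
As a toy illustration of the concatenation strategy (my own sketch, not Mitra's code; the two "formats" and their magic bytes are invented for the example), a format parsed from the start of a file and a format parsed from the end can happily share the same bytes. Real-world pairs such as JPEG+ZIP work the same way, since ZIP readers locate the archive by scanning backwards for the end-of-central-directory record:

```javascript
// Hypothetical format A: valid iff the file STARTS with its magic.
// Hypothetical format B: valid iff the file ENDS with its magic.
const MAGIC_A = Buffer.from("AAAA");
const MAGIC_B = Buffer.from("BBBB");

const parseA = (buf) => buf.subarray(0, 4).equals(MAGIC_A);
const parseB = (buf) => buf.subarray(buf.length - 4).equals(MAGIC_B);

// Two well-formed single-format files...
const fileA = Buffer.concat([MAGIC_A, Buffer.from("payload A")]);
const fileB = Buffer.concat([Buffer.from("payload B"), MAGIC_B]);

// ...concatenated are valid in both formats at once: a polyglot.
const polyglot = Buffer.concat([fileA, fileB]);
console.log(parseA(polyglot), parseB(polyglot)); // true true
```

Each parser only looks at "its" end of the file, so neither notices the other format's payload.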

The next time slot was assigned to … me! I presented “Home-Made Distributed Blocklist“. More to come soon on this blog!

We wrapped up the afternoon with a very interesting talk by Peter Czanik: “Security alerting made easy using Python“. Peter is also a regular contributor to the conference and leads open-source products like syslog-ng and Sudo. Even though I’m an old Sudo user (probably for more than 15 years), I’m always learning new stuff about this wonderful tool, and it was again the case. Peter explained how to interconnect Sudo and Python. Sudo is a modular tool that accepts plugins, in C but also in Python, to extend its features and controls. You can filter data, alert, sanitize logs, etc. Peter reviewed some examples of integration.

Day 3 started with some Kubernetes security stuff: “ATT&CKing Kubernetes: A technical deep dive into the new ATT&CK for Containers” by Magno Logan. Magno is the maintainer of the following GitHub project: awesome-k8s-security. He then talked about MITRE ATT&CK and, more precisely, the matrix dedicated to containers (link). This matrix has existed for a while but was recently updated with plenty of new content (thanks to the contributions of volunteers).

The second part of the talk was a series of attack scenarios based on a vulnerable Drupal CMS instance. Magno presented many techniques from the MITRE matrix. The scripts shown in the demos were collected from honeypots or found on C2 servers.

Then we switched to the wonderful world of shellcodes with Yashdeep Saini and Harpreet Singh, who presented “Revisiting the Art of Encoder-Fu for novel shellcode obfuscation techniques“. Back to the roots with a review of the differences between x86, x64 and ARM registers and instructions (comparisons, shifting, control flow, stack, data movement and arithmetic operations).

They reviewed how encoders affect the shellcode (XOR, Shikata_ga_nai, etc.) and showed comparative graphs based on the distribution of instructions.

Other observations: some encoder types add layers and branches (mode control, call changes); others add transformations (more data movements). Why encode? The shellcode can be transformed to fit the transports supported by the target application, bad characters can be replaced, and more obfuscation can be added.

“In Search of Lost Time: A Review of JavaScript Timers in Browsers” by Thomas Rokicki. Timing attacks exploit timing differences to infer secrets from the sandbox. Attacks can be classified in 4 classes: hardware-contention-based attacks (e.g. rowhammer.js), transient execution attacks (Spectre), attacks based on system resources (keystroke attacks, memory deduplication attacks) and attacks based on browser resources (history sniffing or fingerprinting). This is a topic that always remains difficult for me to follow… I just kept this in mind as a recap of the talk: “If you are concerned by timing attacks, use the Tor browser; they are the strictest on timer countermeasures.”

Then Nils Amiet and Tommaso Gagliardoni presented “ORAMFS: Achieving Storage-Agnostic Privacy“. The important word here is “privacy”, or how to store data on a 3rd-party service in a safe way. The classic control is to encrypt the complete set of data. But even if the data can’t be read, more info can still leak: the size of the data, which database fields change, etc… Think about an employee database. An example with the good old TrueCrypt software: create a TC volume, encrypted, then mounted by the user. A cool feature was the “hidden volume”, but it can be detected even without the passphrase.

Keep in mind that “encryption alone does not hide access patterns“. That’s why oramfs (Oblivious Random Access Machines) has been developed. Storage-agnostic, written in Rust, with resizing support and multiple encryption ciphers, the tool was demonstrated live. Nice demo!

Finally, to wrap up day 3, Damien Cauquil presented “Meet Piotr, a firmware emulation tool for trainers and researchers“. Damien gives a lot of IoT trainings and was looking for a nice way to set up labs. Indeed, even more so with the pandemic, preparing labs has been challenging. For example, some IoT devices might be discontinued, or their firmware upgraded to a non-vulnerable version. Another requirement was the size of the lab (no gigabytes of data to download). That’s why Piotr was developed. It is based on Qemu, Firmadyne and ARM-X. After describing the architecture (how those components are interconnected), Damien performed some demos and booted/exploited some IoT devices. It looks like a great solution for researchers too. If interested, the code is available here.

And that’s over for this virtual edition of Pass-The-Salt! Let’s hope that the next one will be in-person. Thanks to the organisers for keeping the event up’n’running! All slides are available online as well as videos.

The post Pass-The-Salt 2021 Virtual Wrap-Up appeared first on /dev/random.

I published the following diary on “Python DLL Injection Check“:

There are many security tools that inject DLLs into processes running on a Windows system. The classic examples are anti-virus products. They like to inject plenty of code that, combined with API hooking, implements security checks. If DLLs are injected into processes, they can be detected, and such detection is a common anti-debugging or evasion technique implemented by many malware samples. If you’re interested in such techniques, they are covered in the FOR610 training. The detection relies on a specific API call, GetModuleFileName()… [Read more]

The post [SANS ISC] Python DLL Injection Check appeared first on /dev/random.

July 05, 2021

Axl staring at the Boston skyline, waiting for the Fourth of July fireworks to start. It had been years since he saw the fireworks so it was extra special, despite the fog and rain.


It is actually pretty easy to build a mediocre headless React today, i.e. an implementation of React that isn't hooked directly into anything else.

react-reconciler is an official package that lets you hook up React to anything already. That's how both React-DOM and React-Native share a run-time.

Most third-party libraries that use it (like react-three-fiber) follow the same approach. They are basically fully wrapped affairs: each notable Three.js object (mesh, geometry, material, light, ...) will tend to have a matching node in the React tree. Three.js has its own scene tree, like the browser has a DOM, so react-reconciler will sync up the two trees one-to-one.

The libraries need to do this, because the target is a retained in-memory model. It must be mutated in-place, and then re-drawn. But what would it look like to target an imperative API directly, like say 2D Canvas?

You can't just call an imperative API directly in a React component, because the idea of React is to enable minimal updates. There is no guarantee every component that uses your imperative API will actually be re-run as part of an update. So you still need a light-weight reconciler.

Implementing your own back-end to the reconciler is a bit of work, but entirely doable. You build a simple JS DOM, and hook React into that. It doesn't even need to support any of the fancy React features, or legacy web cruft: you can stub it out with no-ops. Then you can make up any <native /> tags you like, with any JS value as a property, and have React reconcile them.

Then if you want to turn something imperative into something declarative, you can render elements with an ordinary render prop like this:

<element render={(context) => {
  context.fillStyle = "blue";
}} />

This code doesn't run immediately, it just captures all the necessary information from the surrounding scope, allowing somebody else to call it. The reconciler will gather these multiple "native" elements into a shallow tree. They can then be traversed and run, to form a little ad-hoc program. In other words, it's an Effect-like model again, just with all the effects neatly arranged and reconciled ahead of time. Compared to a traditional retained library, it's a lot more lightweight. It can re-paint without having to re-render any Components in React.
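
A minimal sketch of that idea (the names `mount`, `repaint` and the fake canvas context are made up for the example, not a real React or react-reconciler API): the reconciler's only job is to keep a shallow list of native elements, and repainting traverses that list and calls each `render` prop with the live context, without re-rendering any components:

```javascript
// The "reconciled tree": a shallow list of native elements whose
// render props are closures captured from component scope.
const tree = [];
const mount = (props) => { tree.push(props); };

// Repaint: traverse the gathered effects and run them against the
// imperative target. No component re-render is involved.
const repaint = (context) => {
  for (const el of tree) el.render(context);
};

// What two <element render={...} /> nodes would contribute:
mount({ render: (ctx) => { ctx.fillStyle = "blue"; ctx.ops.push("fill"); } });
mount({ render: (ctx) => { ctx.ops.push("stroke"); } });

// A stand-in for a 2D canvas context, just recording what happened.
const fakeCanvasContext = { fillStyle: null, ops: [] };
repaint(fakeCanvasContext);
console.log(fakeCanvasContext.fillStyle, fakeCanvasContext.ops);
```

Calling `repaint` again with a fresh context re-runs the little ad-hoc program without touching React at all, which is the whole point.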

You can also add synthetic events like in React-DOM. These can be forwarded with conveniences like event.stopPropagation() replicated.

I've used this with great success before. Unfortunately I can't show the results here—maybe in the future—but I do have something else that should demonstrate the same value proposition.

React works hard to synchronize its own tree with a DOM-like tree, but it's just a subset of the tree it already has. If you remove that second tree, what's left? Does that one tree still do something useful by itself?

I wagered that it would and built a version of it. It's pretty much just a straight up re-implementation of React's core pattern, from the ground up. It has some minor tweaks and a lot of omissions, but all the basics of hook-driven React are there. More importantly, it has one extra superpower: it's designed to let you easily collect lambdas. It's still an experiment, but the parts that are there seem to work fine already. It also has tests.

Yeet Reduce

As we saw, a reconciler derives all its interesting properties from its one-way data flow. It makes it so that the tree of mounted components is also the full data dependency graph.

So it seems like a supremely bad idea to break it by introducing arbitrary flow the other way. Nevertheless, it seems clear that we have two very interesting flavors just asking to be combined: expanding a tree downstream to produce nodes in a resumable way, and yielding values back upstream in order to aggregate them.

Previously I observed that trying to use a lambda in a live DFG is equivalent to potentially creating new outputs out of thin air. Changing part of a graph means it may end up having different outputs than before. The trick is then to put the data sinks higher up in the tree, instead of at the leaves. This can be done by overlaying a memoized map-reducer which is only allowed to pass things back in a stateless way.

yeet reduce

The resulting data flow graph is not in fact a two-way tree, which would be a no-no: it would have a cycle between every parent and child. Instead it is a DFG consisting of two independent copies of the same tree, one forwards, one backwards, glued together. Though in reality, the second half is incomplete, as it only needs to include edges and nodes leading back to a reducer.

chain of fibers in the forwards direction turns down and back to yield values in the backwards direction

Thus we can memoize both the normal forward pass of generating nodes and their sinks, as well as the reverse pass of yielding values back to them. It's two passes of DFG, one expanding, one contracting. It amplifies input in the first half by generating more and more nodes. But it will simultaneously install reducers as the second half to gather and compress it back into a collection or a summary.

When we memoize a call in the forward direction, we will also memoize the yield in the other direction. Similarly, when we bust a cache on the near side, we also bust the paired cache on the far side, and keep busting all the way to the end. That's why it's called Yeet Reduce. Well that and yield is a reserved keyword.
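
A toy model of those paired caches (a sketch of my reading of the mechanism, not the actual implementation): each fiber memoizes its forward result and its backward yield, and busting the forward cache also busts the yield side, forcing the reducer to re-aggregate:

```javascript
// A fiber with a forward cache (its render) and a backward cache
// (the value it yields to the reducer). `dirty` busts both.
const makeFiber = (compute) => ({
  compute, forward: undefined, yielded: undefined, dirty: true,
});

const reduceAll = (fibers, reducer) => {
  for (const f of fibers) {
    if (f.dirty) {               // forward pass: recompute if busted
      f.forward = f.compute();
      f.yielded = undefined;     // bust the paired backward cache too
      f.dirty = false;
    }
    if (f.yielded === undefined) f.yielded = f.forward; // backward pass
  }
  return fibers.map((f) => f.yielded).reduce(reducer);
};

let calls = 0;
const fibers = [1, 2, 3].map((n) => makeFiber(() => { calls++; return n; }));
const sum1 = reduceAll(fibers, (a, b) => a + b); // all 3 computed: 6

fibers[1].compute = () => { calls++; return 20; };
fibers[1].dirty = true;                          // bust just one fiber
const sum2 = reduceAll(fibers, (a, b) => a + b); // only 1 recompute: 24
console.log(sum1, sum2, calls); // 6 24 4
```

The second reduction re-runs only the busted fiber; the other two values come straight out of their backward caches.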

when 'yeeting' a value, you throw it into a bin on the far side and knock everything down after it

What's also not obvious is that this process can be repeated: after a reduction pass is complete, we can mount a new fiber that receives the result as input. As such, the data flow graph is not a single expansion and contraction, but rather, many of them, separated by a so-called data fence.

This style of coding is mainly suited for use near the top of an application's data dependency graph, or in a shallow sub-tree, where the number of nodes in play is typically a few dozen. When you have tons of tiny objects instead, you want to rely on data-level parallelism rather than mounting each item individually.

Horse JS

I used to think a generalized solution for memoized data flow would be something crazy and mathematical. The papers I read certainly suggested so, pushing towards the equivalent of automatic differentiation of any code. It would just work. It would not require me to explicitly call memo on and in every single Component. It should not impose weird rules banning control flow. It would certainly not work well with non-reactive code. And so on.

There seemed to be an unbridgeable gap between a DFG and a stack machine. This meant that visual, graph-based coding tools would always be inferior in their ability to elegantly capture Turing-complete programs.

Neither seems to be the case. For one, having to memoize things by hand doesn't feel wrong in the long run. A minimal recomputation doesn't necessarily mean a recomputation that is actually small and fast. It feels correct to make it legible exactly how often things will change in your code, as a substitute for the horrible state transitions of old. Caching isn't always a net plus either, so fully memoized code would just be glacial for real use cases. That's just how the memory vs CPU trade-off falls these days.

That said, declaring dependencies by hand is annoying. You need linter rules for it because even experienced engineers occasionally miss a dep. Making a transpiler do it or adding it into the language seems like a good idea, at least if you could still override it. I also find <JSX> syntax is only convenient for quickly nesting static <Components> inside other <Components>. Normal JS {object} syntax is often more concise, at least when the keys match the names. Once you put a render prop in there, JSX quickly starts looking like Lisp with a hangover.

When your Components are just resources and effects instead of widgets, it feels entirely wrong that you can't just write something like:

live (arg) => {
  let [service, store] = mount [
    use(Service)(arg),
    use(Store)(arg),
  ];
  // ...
}

Without any JSX or effect-like wrappers. Here, mount would act somewhat like a reactive version of the classic new operator, with a built-in yield, except for fiber-mounted Components instead of classes.

I also have to admit to being sloppy here. The reason you can think of a React component as an Effect is because its ultimate goal is to create e.g. an HTML DOM. Whatever code you run exists, in theory, mostly to generate that DOM. If you take away that purpose, suddenly you have to be a lot more conscious of whether a piece of code can actually be skipped or not, even if it has all the same inputs as last time.

This isn't actually as simple as merely checking if a piece of code is side-effect free: when you use declarative patterns to interact with stateful code, like a transaction, it is still entirely contextual whether that transaction needs to be repeated, or would be idempotent and can be skipped. That's the downside of trying to graft statelessness onto legacy tech, which also requires some mutable water in your immutable wine.

I did look into writing a Babel parser for a JS/TS dialect, but it turns out the insides are crazy and it takes three days just to make it correctly parse live / mount with the exact same rules as async / await. That's because it's a chain of 8 classes, each monkey patching the previous one's methods, creating a flow that's impractical to trace step by step. Tower of Babel indeed. It's the perfect example to underscore this entire article series with.

It also bothers me that each React hook call is actually pretty bad from a garbage collection point of view:

const memoized = useMemo(() => slow(foo), [foo]);

This will allocate both a new dependency array [foo] and a new closure () => slow(foo), even if nothing has changed and the closure is never called. This is unavoidable if you want this to remain a one-liner JS API. An impractical workaround would be to split useMemo up and inline its parts, avoiding all GC:

// One useMemo() call, split into hypothetical parts
let memoized;
memoized = useMemoSameDependencies(foo)
  ? useMemoValue()
  : useMemoStore(slow(foo));

But a language with a built-in reconciler could actually be quite efficient on the assembly level. Dependencies could e.g. be stored and checked in a double buffered arrangement, alternating the read and write side.
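As a rough illustration of that double-buffered arrangement (all names here are hypothetical, and a real implementation would sit below the allocation layer so that no per-render arrays exist at all):

```typescript
// Sketch of double-buffered dependency storage: two flat arrays swap
// roles each render, so steady-state re-renders allocate nothing new.
class DepsBuffer {
  private a: any[] = [];
  private b: any[] = [];
  private read = true; // which side holds last render's deps

  // Compare this render's deps against the read side while
  // writing them into the write side.
  changed(index: number, deps: any[]): boolean {
    const prev = this.read ? this.a : this.b;
    const next = this.read ? this.b : this.a;
    let dirty = false;
    for (let i = 0; i < deps.length; i++) {
      if (!Object.is(prev[index + i], deps[i])) dirty = true;
      next[index + i] = deps[i];
    }
    return dirty;
  }

  // Flip read/write sides at the end of a render pass.
  swap() { this.read = !this.read; }
}

const buffer = new DepsBuffer();
console.log(buffer.changed(0, [1, 2])); // true: nothing stored yet
buffer.swap();
console.log(buffer.changed(0, [1, 2])); // false: same deps as last render
```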

I will say this: React has done an amazing job. It got popular because its Virtual DOM finally made HTML sane to work with again. But what it actually was in the long run, was a Trojan horse for Lisp-like thinking and a move towards Effects.


So, headless React works pretty much exactly as described, except without the generators, because JS generators are stateful and not rewindable/resumable. For now I have to write my code in the promise.then(…) style instead of using a proper yield.
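The statefulness is easy to see: a generator can only be advanced, never rewound, so a runtime that wants to re-run part of a Component body from the top cannot reuse one.

```typescript
// JS generators advance destructively: there is no way to seek back
// and replay an earlier yield, which rules them out for a runtime
// that re-evaluates Component bodies from the top.
function* gen() { yield 1; yield 2; }

const g = gen();
console.log(g.next().value); // 1
console.log(g.next().value); // 2
console.log(g.next().done);  // true: exhausted, no rewind possible
```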

I tried to validate it by using WebGPU as a test case, building out a basic set of composable components. First I hid the uglier parts of the WebGPU API inside some pure wrappers (the makeFoo(...) calls below) for conciseness. Then I implemented a blinking cube like this:

export const Cube: LiveComponent<CubeProps> = memo((fiber) => (props) => {
  const {
    device, colorStates, depthStencilState,
    defs, uniforms, compileGLSL,
  } = props;

  // Blink state, flips every second
  const [blink, setBlink] = useState(0);
  useResource((dispose) => {
    const timer = setInterval(() => {
      setBlink(b => 1 - b);
    }, 1000);
    dispose(() => clearInterval(timer));
  });

  // Cube vertex data
  const cube = useOne(makeCube);
  const vertexBuffers = useMemo(() =>
    makeVertexBuffers(device, cube.vertices), [device]);

  // Rendering pipeline
  const pipeline = useMemo(() => {
    const pipelineDesc: GPURenderPipelineDescriptor = {
      primitive: {
        topology: "triangle-list",
        cullMode: "back",
      },
      vertex: makeShaderStage(
        makeShader(compileGLSL(vertexShader, 'vertex')),
        {buffers: cube.attributes},
      ),
      fragment: makeShaderStage(
        makeShader(compileGLSL(fragmentShader, 'fragment')),
        {targets: colorStates},
      ),
      depthStencil: depthStencilState,
    };
    return device.createRenderPipeline(pipelineDesc);
  }, [device, colorStates, depthStencilState]);

  // Uniforms
  const [uniformBuffer, uniformPipe, uniformBindGroup] = useMemo(() => {
    const uniformPipe = makeUniforms(defs);
    const uniformBuffer = makeUniformBuffer(device, uniformPipe.data);
    const entries = makeUniformBindings([{resource: {buffer: uniformBuffer}}]);
    const uniformBindGroup = device.createBindGroup({
      layout: pipeline.getBindGroupLayout(0),
      entries,
    });
    return ([uniformBuffer, uniformPipe, uniformBindGroup]
         as [GPUBuffer, UniformDefinition, GPUBindGroup]);
  }, [device, defs, pipeline]);

  // Return a lambda back to parent(s)
  return yeet((passEncoder: GPURenderPassEncoder) => {
    // Draw call
    uploadBuffer(device, uniformBuffer, uniformPipe.data);

    passEncoder.setPipeline(pipeline);
    passEncoder.setBindGroup(0, uniformBindGroup);
    passEncoder.setVertexBuffer(0, vertexBuffers[0]);
    passEncoder.draw(cube.count, 1, 0, 0);
  });
});

This is one top-level function, with zero control flow, and a few hooks. The cube has state (blink), which it decides to change on a timer. Here, useResource is like a synchronous useEffect which the runtime will manage for us. It's not pure, but it is very convenient.

All the external dependencies are hooked up, using the React-like useMemo hook and its mutant little brother useOne (for 0 or 1 dependency). This means that if the WebGPU device were to change, every variable that depends on it will be re-created on the next render. The parts that do not (e.g. the raw cube data) will be reused.

This by itself is remarkable to me: to be able to granularly bust caches like this deep inside a program, written in purely imperative JS, that nevertheless is almost a pure declaration of intent. When you write code like this, you focus purely on construction, not on mutation. It also lets you use an imperative API directly, which is why I refer to this as "No API": the only wrappers are those which you want to add yourself.

Notice the part at the end: I'm not actually yeeting a real draw command. I'm just yeeting a lambda that will insert a draw command into a vanilla passEncoder from the WebGPU API. It's these lambdas which are reduced together in this sub-tree. These can then just be run in tree order to produce the associated render pass.
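A minimal sketch of this reduction, with hypothetical `TreeNode`/`Command` types standing in for Live's actual fiber structures: each node may contribute one lambda, the reduction gathers them in tree order, and the parent runs them all against a single encoder.

```typescript
// Yeet-reduce in miniature: gather yeeted lambdas in tree order,
// then replay them against one shared "encoder".
type Command<T> = (encoder: T) => void;
type TreeNode<T> = { yeeted?: Command<T>; children: TreeNode<T>[] };

const reduce = <T>(node: TreeNode<T>, out: Command<T>[] = []): Command<T>[] => {
  if (node.yeeted) out.push(node.yeeted);
  for (const child of node.children) reduce(child, out);
  return out;
};

// A fake pass encoder that just records what was "drawn".
const log: string[] = [];
const tree: TreeNode<string[]> = {
  children: [
    { yeeted: (l) => l.push('cube'), children: [] },
    { yeeted: (l) => l.push('mesh'), children: [] },
  ],
};
for (const cmd of reduce(tree)) cmd(log);
console.log(log); // ['cube', 'mesh']
```

Because the lambdas are values, the runtime can cache them per node and only re-reduce the sub-trees that actually changed.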

What's more, the only part of the entire draw call that actually changes regularly is the GPU uniform values. This is why uniforms is not an immutable object, but rather an immutable reference with mutable registers inside. In React-speak it's a ref, aka a pointer. This means that if only the camera moves, the Cube component does not need to be re-evaluated. No lambda is re-yeeted, and nothing is re-reduced. The same code from before keeps working.
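The ref pattern is worth spelling out, because the whole trick rests on it: identity stays stable while contents mutate, so a per-slot dependency check sees nothing. A minimal sketch (the `Ref` type and `time` field are illustrative, not the article's actual types):

```typescript
// An immutable reference with mutable registers inside:
// dependency checks compare identity, so mutating `current`
// never triggers a re-render, yet new data still flows through.
type Ref<T> = { current: T };

const uniforms: Ref<{ time: number }> = { current: { time: 0 } };
const before = uniforms;

uniforms.current = { time: 1.5 }; // camera moved: registers updated

console.log(Object.is(before, uniforms)); // true: deps look unchanged
console.log(uniforms.current.time);       // 1.5
```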

Therefore the entirety of Cube() is wrapped in a memo(...). It memoizes the entire Component in one go, using all the values in props as the dependencies. If none of them changed, there is no need to do anything, because it cannot have any effect, by construction. The run-time takes advantage of this by not re-evaluating any children of a successfully memoized node, unless its internal state changed.

The very top of the (reactive) part is:

export const App: LiveComponent<AppProps> = () => (props) => {
  const {canvas, device, adapter, compileGLSL} = props;

  return use(AutoCanvas)({
    canvas, device, adapter,
    render: (renderContext: CanvasRenderingContextGPU) => {

      const {
        width, height, gpuContext,
        colorStates, colorAttachments,
        depthStencilState, depthStencilAttachment,
      } = renderContext;

      return use(OrbitControls)({
        render: (radius: number, phi: number, theta: number) =>

          use(OrbitCamera)({
            canvas, width, height,
            radius, phi, theta,
            render: (defs: UniformAttribute[], uniforms: ViewUniforms) =>

              use(Draw)({
                device, gpuContext, colorAttachments,
                children: [

                  use(Pass)({
                    device, colorAttachments, depthStencilAttachment,
                    children: [

                      use(Cube)({device, colorStates, depthStencilState, compileGLSL, defs, uniforms}),

                    ],
                  }),

                ],
              }),
          }),
      });
    },
  });
};
This is a poor man's JSX, but it's also not actually terrible. It may not look like much, but pretty much everyone who's coded any GL, Vulkan, etc. has written a variation of this.

This tree composes things that are completely heterogeneous: a canvas auto-sizer, interactive controls, camera uniforms, frame buffer attachments, and more, into one neat, declarative structure. This is quite normal in React-land these days. The example above is static to keep things simple, but it doesn't need to be, that's the point.

The nicest part is that unlike in a traditional GPU renderer, it is trivial for it to know exactly when to re-paint the image or not. Even the mutable uniforms come from a Live component, whose effects are tracked and reconciled: OrbitCamera takes mutable values and produces an immutable container, ViewUniforms.

You get perfect battery-efficient sparse updates for free. It's actually more work to get it to render at a constant 60 fps, because for that you need the ability to independently re-evaluate a subtree during a requestAnimationFrame(). I had to explicitly add that to the run-time. It's around 1100 lines now, which I'm happy with.

Save The Environment

If it still seems annoying to have to pass variables like device into everything, there's the usual solution: context providers, aka environments, which act as invisible skip links across the tree:

export const GPUDeviceContext = makeContext();

export const App: LiveComponent<AppProps> = () => (props) => {
  const {canvas, device, adapter, compileGLSL} = props;

  return provide(GPUDeviceContext, device,
    use(AutoCanvas)({ /*...*/ })
  );
};

export const Cube: LiveComponent<CubeProps> = memo((fiber) => (props) => {
  const device = useContext(GPUDeviceContext);
  /* ... */
});

You also don't need to pass one variable at a time, you can pass arbitrary structs.
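One way such a provider can be sketched (hypothetical `makeContext`/`provide`/`useContext` signatures; Live's actual API passes children rather than a render callback) is as a value stack per context, pushed while a subtree renders:

```typescript
// Context providers as per-context value stacks: `provide` pushes a
// value for the duration of a subtree render; `useContext` reads the
// innermost one. All names here are illustrative.
const makeContext = <T>() => ({ stack: [] as T[] });

const provide = <T, R>(ctx: { stack: T[] }, value: T, render: () => R): R => {
  ctx.stack.push(value);
  try { return render(); }
  finally { ctx.stack.pop(); }
};

const useContext = <T>(ctx: { stack: T[] }): T | undefined =>
  ctx.stack[ctx.stack.length - 1];

const DeviceContext = makeContext<string>();
const result = provide(DeviceContext, 'gpu-device', () =>
  useContext(DeviceContext));
console.log(result); // 'gpu-device'
```

Because the value can be any object, passing a whole struct of GPU state through one context is no harder than passing a single device.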

In this situation it is trickier for the run-time to track changes, because you may need to skip past a memo(…) parent that didn't change. But doable.

Yeet-reduce is also a generalization of the chunking and clustering processes of a modern compute-driven renderer. That's where I got it from anyway. Once you move that out, and make it a native op on the run-time, magic seems to happen.

This is remarkable to me because it shows you how you can wrap, componentize and memoize a completely foreign, non-reactive API, while making it sing and dance. You don't actually have to wrap and mount a <WebGPUThingComponent> for every WebGPUThing that exists, which is the popular thing to do. You don't need to do O(N) work to control the behavior of N foreign concepts. You just wrap the things that make your code more readable. The main thing something like React provides is a universal power tool for turning things off and on again: expansion, memoization and reconciliation of effects. Now you no longer need to import React and pretend to be playing DOM-jot either.

The only parts of the WebGPU API that I needed to build components for to pull this off, were the parts I actually wanted to compose things with. This glue is so minimal it may as well not be there: each of AutoSize, Canvas, Cube, Draw, OrbitCamera, OrbitControls and Pass is 1 reactive function with some hooks inside, most of them half a screen.

I do make use of some non-reactive WebGPU glue, e.g. to define and fill binary arrays with structured attributes. Those parts are unremarkable, but you gotta do it.

If I now generalize my Cube to a generic Mesh, I have the basic foundation of a fully declarative and incremental WebGPU toolkit, without any OO. The core components look the same as the ones you'd actually build for yourself on the outside. Its only selling point is a supernatural ability to get out of your way, which it learnt mainly from React. It doesn't do anything else. It's great when used to construct the outside of your program, i.e. the part that loads resources, creates workloads, spawns kernels, and so on. You can use yeet-reduce on the inside to collect lambdas for the more finicky stuff, and then hand the rest of the work off to traditional optimized code or a GPU. It doesn't need to solve all your problems, or even know what they are.

I should probably reiterate: this is not a substitute for typical data-level parallelism, where all your data is of the exact same type. Instead it's meant for composing highly heterogeneous things. You will still want to call out to more optimized code inside to do the heavy lifting. It's just a lot more straightforward to route.

For some reason, it is incredibly difficult to get this across. Yet algorithmically there is nothing here that hasn't been tried before. The main trick is just engineering these things from the point of view of the person who actually has to use it: give them the same tools you'd use on the inside. Don't force them to go through an interface if there doesn't need to be one.

The same can be said for React and Live, naturally. If you want to get nerdy about it, the reconciler can itself be modeled as a live effect. Its actions can themselves become regular nodes in the tree. If there were an actual dialect with a real live keyword, and WeakMaps on steroids, that would probably be doable. In the current implementation, it would just slow things down.

Throughout this series, I've used Javascript syntax as a lingua franca. Some might think it's insane to stretch the language to this point, when more powerful languages exist where effects fit more natively into the syntax and the runtime. I think it's better to provide stepping stones to actually get from here to there first.

I know that once you have gone through the trouble of building O(N²) lines of code for something, and they work, the prospect of rewriting all of them can seem totally insane. It probably won't be as optimized on the micro-level, which in some domains does actually still matter, even in this day and age. But how big is that N? It may actually be completely worth it, and it may not take remotely as long as you think.

As for me, all I had to do was completely change the way I structure all my code, and now I can finally start making proper diagrams.

Source code on GitLab.