Planet Grep

Planet'ing Belgian FLOSS people

Planet Grep is maintained by Wouter Verhelst. All times are in UTC.

September 26, 2022

Let’s continue exploring the MySQL Database resource when deploying on Oracle Cloud Infrastructure (OCI) with Terraform.

Last week, we saw how to create custom configurations and define user variables. Today we will see how we can define a backup policy and a maintenance window.

Backup Policy

In the oci_mysql_mysql_db_system resource, we will add a new section called backup_policy like this:

backup_policy {
  is_enabled        = "true"
  retention_in_days = "3"
  window_start_time = "01:00-00:00"
  freeform_tags     = { "backup_defined_by" = "Terraform" }
  pitr_policy {
    is_enabled = "true"
  }
}

This piece of code (which you can see in a working Terraform architecture sample) enables backups, sets the retention period to 3 days, defines the backup window start time, and sets a freeform tag.

Inside this block, we can also include a nested pitr_policy section to enable or disable point-in-time recovery for our MySQL instance.

All the settings are documented in Terraform’s registry: oci_mysql_mysql_backup.

This is what we can see in Oracle Cloud’s console

Maintenance Window

In the oci_mysql_mysql_db_system resource, we can also add a section to specify the maintenance window of the instance.

During maintenance, Oracle patches the underlying operating system, updates the MySQL Server to new update versions (-u*), and maintains any underlying hardware. For more details, check the manual.

This is how we define the maintenance window in our Terraform code:

maintenance {
  window_start_time = "sun 01:00"
}

In Oracle Cloud’s console, we can easily see it:

Conclusion

Once again, you can see that it’s very easy to deploy instances of MySQL DBaaS in OCI using Terraform.

We started with the deployment of an instance in my first articles on the MySQL Database Service and Terraform; now we have gone into a little more detail, and we can see that there are multiple aspects of the service that we can easily define using the OCI API.

You can find the full code example of the MySQL Database Service resource here.

An alphabetical list of software that I really like.

Apache: Not as popular as it used to be, but I know it and so still use it

apt/aptitude/apt-get: The origin of all 'apps' and the most user-friendly and secure way to manage software

Audacity: Or is that too controversial now with their telemetry?

bash: The best interface to my computers

convert: From ImageMagick, for batch converting images

Civilization IV: Yes 4, the only software that is not (yet?) open source

cron: It wakes me up every day (using mpv and bash)

Debian: I run it on almost everything

ffmpeg: to convert video formats

find: because I tend to forget where I put stuff :)

Firefox: because an alternative browser is crucial

FreeCAD: Amazing CAD software, and I like math

gcc: Because coding in C is fun!

gimp: Easy image manipulation

git: How did I ever live without it?

gramps: To keep track of my ancestors

grep: Often very useful.

Homebrew: To make a MacBook usable

Inkscape: Because vector graphics are essential

Linux: in the background of course, like so many other tools that I am rarely aware of

make: For software that is not part of Debian

mariadb(mysql): So easy to build a database

mpv: For audio and video (and for their API)

Openshot: To create Lego NXT or Prusa videos

PrusaSlicer: A 3D printer opens a whole new world of possibilities 

Python: I hate this language, but it is used so much and coding is still fun

Return YouTube Dislike: How else is YouTube usable?!

rsync: For backups

Singlefile: because web pages tend to disappear forever 

ssh: Multitool for anything remote

Thunderbird: Reading my mail since 1996 (as Netscape Mail)

tmux: used daily, configured to a quiet and relaxing layout

Video DownloadHelper: because videos tend to disappear forever

vim: By far the best editor for text, code, config files and more.

wine: To play that one game on Debian (though I wish its 32-bit had as much memory as 32-bit Windows XP)

Wireshark: The best tool to learn about networking

XFCE: My GUI for almost 20 years

xfce-screenshooter: Because screenshots are useful

 

Inspiration: HN.

September 25, 2022

For one of the projects I'm currently involved in, we want to have a better view on sustainability within IT and see what we (IT) can contribute in light of the sustainability strategy of the company. For IT infrastructure, one would think that selecting more power-efficient infrastructure is the way to go, as well as selecting products whose manufacturing process takes special attention to sustainability.

There are other areas to consider as well, though. Reusability of IT infrastructure and optimal resource consumption are at least two other areas that deserve plenty of attention. But let's start at the manufacturing process...

Certifications for products and companies

Eco certifications are a good start in the selection process. By selecting products with the right certification, companies can initiate their sustainable IT strategy with a good start. Such certifications look at the product and manufacturing, and see if they use proper materials, create products that can have extended lifetimes in the circular (reuse) economy, ensure the manufacturing processes use renewable energy and do not have harmful emissions, safeguard clean water, etc.

In the preliminary phase I am in right now, I do not know yet which certifications make the most sense to pursue and request. Sustainability is becoming big business, so plenty of certifications exist as well. From a cursory search, I'd reckon that the following certifications are worth more time:

  • EcoVadis provides business sustainability ratings that not only cover the ecological aspect, but also social and ethical performance.
  • ISO 14001 covers environmental management, looking at organizations' processes and systematic improvements contributing to sustainability.
  • Carbon Neutral focuses on transparency in measurements and disclosure of emissions, and on how the company is progressing in its strategy to reduce the impact on the environment.
  • TCO Certified attempts to address all stages of a manufacturing process, from material selection over social responsibility and hazardous substances up to electronic waste and circular economy.
  • Energy Star focuses on energy efficiency, and tries to use standardized methods for scoring appliances (including computers and servers).

Power efficiency

A second obvious part is on power efficiency. Especially in data center environments, which is the area that I'm interested in, power efficiency also influences the data center's capability of providing sufficient cooling to the servers and appliances. Roughly speaking, a 500 Watt server generates twice as much heat as a 250 Watt server. Now, that's oversimplifying, but for calculating heat dissipation in a data center, the maximum power of infrastructure is generally used for the calculations.

Now, we could start looking for servers with lower power consumption. But a 250 Watt server is most likely going to be less powerful (computing-wise) than a 500 Watt server. Hence, power efficiency should be considered in line with the purpose of the server, and thus also the workloads that it would have to process.

We can use benchmarks, like SPEC's CPU 2017 or SPEC's Cloud IaaS 2018 benchmarks, to compare the performance of systems. Knowing the server's performance for given workloads and its power consumption allows architects to optimize the infrastructure.

Heat management (and more) in the data center

The environmental controls are a large consumer of power in a data center environment, with the cooling systems taking a big chunk out of the total power consumption. Optimizing the heat management in the data center has a significant impact on the power consumption. Such optimizations are not solely about reducing the electricity bill, but also about reusing the latent heat for other purposes. For instance, data center heat can be used to heat up nearby buildings.

A working group of the European Commission, the European Energy Efficiency Platform (E3P), publishes an annual set of best practices in the EU Code of Conduct on Data Center Energy Efficiency, which covers areas such as airflow design patterns, operating temperature and humidity ranges, power management features in servers and appliances, infrastructure design aspects (like virtualization and appropriate, but not over-engineered, redundancy), etc.

This practice goes well beyond heat management alone (and is worth a complete read), covering the complete data center offering. Combining these practices with other areas of data center design (such as redundancy levels, covered by data center tiering) allows companies that are looking at new data centers to overhaul their infrastructure and be much better prepared for sustainable IT.

Circular ecosystem

Another part that often comes up in sustainability measures is how reusable the infrastructure components are after their "first life". Infrastructure systems, which are frequently renewed after 4 to 5 years of activity, can be resold rather than destroyed. The same can be said for individual components.

Companies that deal with sensitive data regularly employ "Do Not Return" clauses in the purchases of storage devices. Disks are not returned if they are faulty, or just swapped for higher density disks. Instead, they are routinely destroyed to make sure no data leakage occurs.

Instead of destroying otherwise perfect disks (or disks that still have reusable components), companies could either opt for degaussing (which still renders the disk unusable, but has better recyclability than destroyed disks) or data wiping (generally through certified methods that guarantee the data cannot be retrieved).

Extended lifecycle

Systems often keep working perfectly beyond their 4 to 5 year lifespans. Still, these systems are, process-wise, automatically renewed to get more efficient and powerful systems in place. But that might not always be necessary - beyond even the circular ecosystem remarks above (where such systems could be resold), these systems can even get an extended lifecycle within the company.

If there is no need for a more powerful system, and the efficiency of the system is still high (or the efficiency can be improved through minor updates), companies can seek out ways to prolong the use of the systems. In previous projects, I advised that big data nodes can perfectly remain inside the cluster after their regular lifetime, as the platform software (Hadoop) can easily cope with failures if those would occur.

Systems can also be used to host non-production environments or support lab environments. Or they can be refurbished to ensure maximal efficiency while still being used in production. Microsoft for instance has a program called Microsoft Circular Centers which aims at a zero-waste sustainability within the data center, through reuse, repurpose and recycling.

Right-sizing the infrastructure

Right-sizing means selecting and designing infrastructure to deal with the workload, but not more. Having a set of systems at full capacity is better than having twice as many systems at half capacity, as the latter leads to power inefficiencies.

Accomplishing right-sizing isn't as easy as selecting the right server for a particular workload. Workload is distributed, and systems are virtualized. Virtualization allows for much better right-sizing, as you can distribute workload more optimally.

Companies with large amounts of systems can more efficiently distribute workload across their systems, making it easier to have a good consumption pattern. Smaller companies will notice that they need to design for the burst and maximum usage, whereas the average usage is far, far below that threshold.

Using cloud resources can help to deal with bursts and higher demand, while still having resources on-premise to deal with the regular workload. Such hybrid designs, however, can be complex, so make sure to address this with the right profiles (yes, I'm making a stand for architects here ;-)

Standardizing your infrastructure also makes this easier to accomplish. If the vast majority of servers are of the same architecture, and you standardize on as few operating systems, programming languages and so on as possible, you can more easily distribute workload than when these systems have different architectures and purposes.

Automated workload and power management

Large environments will regularly have servers and infrastructure that are not continuously used at near full capacity. Workloads frequently follow a certain curve, such as higher demand during the day and lower at night. Larger platforms use this curve to schedule appropriate workload (like running heavy batch workload at night while keeping the systems available for operational workload during the day) so that the resources are used more optimally.

By addressing workload management and aligning power management, companies can improve their power usage by reducing active systems when there are less resource needs. This can be done gradually, such as putting CPUs in lower power modes (CPU power takes roughly 30% of a system's total power usage), but can expand to complete hosts being put in idle state.

We can even make designs where servers are shut down when unused. While this is frequently frowned upon, citing possible impact on hardware failures as well as reduced reactivity to sudden workload demand, proper shutdown techniques do offer significant power savings (as per a research article titled Quantifying the Impact of Shutdown Techniques for Energy-Efficient Data Centers).

Conclusion

Sustainability within IT focuses on several improvements and requirements. Certification helps in finding and addressing these, but it is not critical to any company's strategy: companies can address sustainability just as well without certification, given proper attention and design.

Feedback? Comments? Don't hesitate to drop me an email, or join the discussion on Twitter.

September 22, 2022

With Terraform OCI provider 4.90.0 (released August 24th, 2022), we have added the possibility to specify the value of editable configuration variables directly from Terraform.

These variables must be defined in a resource of type oci_mysql_mysql_configuration like this:

resource "oci_mysql_mysql_configuration" "mds_mysql_configuration" {
	#Required
	compartment_id = var.compartment_ocid
    shape_name   = var.mysql_shape

	#Optional
	description = "MDS configuration created by terraform"
	display_name = "MDS terraform configuration"
	parent_configuration_id = data.oci_mysql_mysql_configurations.mds_mysql_configurations.configurations[0].id
	variables {
		#Optional
		max_connections = "501"
        binlog_expire_logs_seconds = "7200"
	}
}

We can see from the code above that we have changed the value of two variables: max_connections and binlog_expire_logs_seconds.

We can also notice that a parent configuration id must be provided. It can be selected from the existing configurations (default or custom). This is how you can define the parent configuration:

data "oci_mysql_mysql_configurations" "mds_mysql_configurations" {
  compartment_id = var.compartment_ocid

  #Optional
  state        = "ACTIVE"
  shape_name   = var.mysql_shape
}

And finally, the MySQL configuration resource we created must be applied to the MySQL DB system resource we want to deploy (line 6 in the code below):

resource "oci_mysql_mysql_db_system" "MDSinstance" {
    admin_password = var.admin_password
    admin_username = var.admin_username
    availability_domain = var.availability_domain
    compartment_id = var.compartment_ocid
    configuration_id = oci_mysql_mysql_configuration.mds_mysql_configuration.id
    shape_name = var.mysql_shape
    subnet_id = var.subnet_id
    data_storage_size_in_gb = var.mysql_data_storage_in_gb
    display_name = var.display_name

    count = var.existing_mds_instance_id == "" ? 1 : 0

    is_highly_available = var.deploy_ha
}

When the Terraform environment is deployed, in OCI’s console we can see that the MySQL DB system is using the new configuration we created:

If we verify the value of the variables in this configuration we can see that it contains the default values for all the variables but the two we defined in our Terraform file:
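
Alternatively, once connected to the deployed DB system (for example with MySQL Shell), a quick SQL check like the following (a minimal sketch, not part of the Terraform code) confirms the two values we set:

SHOW GLOBAL VARIABLES
  WHERE Variable_name IN ('max_connections', 'binlog_expire_logs_seconds');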

Many people were requesting this feature, and it is now available!

Don’t forget that not all variables can be modified in MySQL HeatWave Database Service. Only those called User Variables can be modified. You can find the list of those User Variables on this page.

A full example to deploy a MySQL Database Service on OCI is available on this GitHub repo.

Don’t forget that if you want to connect to your MySQL DB instance as a developer, the easiest way is to use MySQL Shell for Visual Studio Code:

As usual, enjoy MySQL and happy deployment on OCI with Terraform!

September 21, 2022

Over the 15+ years of my development career, I have learned several things that significantly increase my effectiveness. In this post, I share those learnings with you.

The blog post mentions and links to many valuable concepts that you can explore further as you see fit. It has three sections:

  • Generic Advice — Important context and motivation for the technical advice
  • Technical Advice — The main course
  • Recommended Reading — Links to high-quality books and blogs that are great for getting started

Generic Advice for Juniors

1. Code is not the Point

As developers, we like writing code. Most of us want to be given a nice unambiguous task. A fun technical puzzle to solve without paying attention to the rest of the world.

Code, however, is just a means to an end. Put reasonable effort into making sure that you are solving the right problem. To quote Peter Drucker: “There is nothing so useless as doing efficiently that which should not be done at all.” Gather feedback early and often, typically by continuous delivery to real users. Be Agile.

Software development is expensive, with the vast majority of the effort of real-world projects typically going into maintenance. Combine this with the goal being user/business outcomes, and the best code is often no code. To quote Bill Gates: “Measuring programming progress by lines of code is like measuring aircraft building progress by weight.”

See also: YAGNI, KISS, The Last Responsible Moment.

2. Software Design Matters

During the first 5 years of my development career, I thought that software design was for software architects or other people with special roles. I was focused on “getting things done”, and saw software design, and practices such as writing tests, as a distraction at best. My code worked, and I was getting a lot of things done. Or so I thought.

Then I read Clean Code, by Robert C. Martin. This book motivates caring about software design and contains examples and many technical heuristics. The most conceptual takeaway is the saying that “the only way to go fast is to go well“. In other words, if you make a mess, it will slow you down. See also: TradableQualityHypothesis, DesignStaminaHypothesis

Learning how to write well-designed clean code of course takes time and effort. And when you start, you will be slower and make mistakes. Simple is not Easy.

3. Use the BEST Practices

Writing tests tends to be beneficial. There are exceptions, but most of the time, it makes a lot of sense to write automated tests. Writing tests is an example of a best practice.

If you are new to writing tests, just follow the best practice and write tests for everything. When starting, blindly following the best practice will be better than following your own underdeveloped judgment. Over time you will learn how to write tests effectively, and be able to tell the difference between situations where you have messed up and situations where writing a test is not worth it. You will also start to understand the value tests bring on a more visceral level, by having experienced the decrease in debugging sessions and the worry-free refactoring enabled by your tests. After developing your judgment, you will be able to transcend the best practice.

This advice applies to best practices in any area that you are a junior in. Automated tests are just an example.

One big gotcha is that it is not easy to tell the difference between a sensible best practice and something nonsensical or even counterproductive. This is made more complicated by most existing code being a mess, and by most developers, including “experienced” and “senior” ones, not knowing software design basics. This makes having a good mentor extremely valuable. Barring that, one piece of advice based on my own experiences is to be wary of best practices specific to the community of your language or framework. Look for evergreen advice that has been around for decades.

Technical Advice for Juniors

Our focus will be on technical topics. Many other areas are important, such as health, happiness, career, and soft skills. Knowing how to avoid a technical pitfall won’t help you if you are sleep deprived and working on the wrong problem for a toxic boss that underpays you.

4. Write Tests

Write automated tests. Perhaps write tests before the code, such as via Test Driven Development (TDD). This makes it easy to verify your code is correct in a repeatable manner, thus saving you from much manual retesting and from debugging sessions.

You think test-first is difficult? Try debug-after.

Perhaps even more importantly, tests give you the safety net to refactor your code. And continuous refactoring is needed to keep your code clean. Without a reliable test suite, it is all the more likely that your code will rot.

Writing tests is difficult if the design of your code is poor, such as when using inheritance for code reuse, or when using static functions. If on the other hand, you have SOLID classes, with no global dependencies, then writing nice tests is not so difficult.

Test design matters because poorly written tests will slow you down. Avoid binding your tests to implementation details of the code under test or to the structure of the system. Avoid overusing Mocks and write better Test Doubles.

5. Do Not Use Inheritance For Code Reuse

This is one of those best practices that bring to mind the “Use the Best Practices” section. My advice: do not use inheritance for code reuse at all when you are starting out. It is rarely the right call and can do a lot of harm. Favor composition over inheritance.

6. Write Object-Oriented code

Write SOLID code that is not STUPID. There is so much value in understanding these principles and anti-patterns.

Actually create objects. Classes with only static methods are not OO. Try to avoid using static code altogether.

See also: my defense of SOLID.

7. Write Functional Code

(Functional programming is not to be confused with imperative structural programming.)

This point is not about fully switching to a functional language. You can benefit from using a functional style in your OO language. Minimize state, especially mutable state, and do one thing in your functions. See also: functional core, imperative shell.

8. Use Informed Duplication

Copy-pasting big chunks of code to multiple places is almost always unwise. Any self-respecting developer soon learns this and starts to follow some form of Don’t Repeat Yourself (DRY). Unfortunately, the well-intended pursuit of DRY often leads to overengineering and accidental complexity. This is where the counterpart of DRY comes in: Write Everything Twice (WET). The idea behind WET is to only deduplicate on the third occurrence of duplication.

For a more in-depth look at the costs and benefits of deduplication, see The Fallacy of DRY.

9. Types, Names and Comments

Try to write self-documenting code and avoid comments.

Every time you write a comment, you should grimace and feel the failure of your ability of expression. — Robert C. Martin

Comments are dangerous because they can lie. The code can change without the comment being updated. New code can be added right under the comment. The comment might have been wrong or inaccurate in the first place. When this happens, the comment not only becomes useless, it becomes misleading.

To write self-documenting code:

  • Do one thing in your functions
    • By doing a single thing in a function, you can give it a clear name
    • Feel the need to explain what different sections of your function do by adding comments? Instead, extract each section into its own well-named function.
    • “Extract till you drop”: if you can meaningfully extract a function, then you probably should. Don’t be afraid of small functions.
    • Command Query Separation
    • Similar to the Single Responsibility Principle for classes (The S in SOLID)
  • Minimize state
  • Use types. Combined with a test suite that executes the code, you can rely on the types telling the truth.
  • Avoid mixed types. Avoid parameters or return values that can be an integer, a boolean, or a string. This naturally happens if you write focused functions that only do one thing.
  • Write tests. A well-written and comprehensive test suite shows you how the production code can be used, and how it behaves.

Clean Code by Robert C. Martin has some good rules of thumb about naming and comments.

Recommended Reading for Juniors

Books

Blogs

See also: Recommended Reading for Developers by Jeff Atwood

Bonus links

  • Tell Don’t Ask — Encapsulation best practice
  • Law of Demeter — Coupling best practice
  • Domain Driven Design — A sizeable toolbox. More advanced, good to first learn the basics.
  • Object Calisthenics — Rules that restrict what you can do in programming. Nice for learning how to do things differently
  • Pair Programming — A great way to learn from each other
  • Code katas — Simple programming, great for practicing a specific technique or skill such as Test Driven Development

The post Advice for junior developers appeared first on Entropy Wins.

September 15, 2022

This is a warning: only do this at home (instead of the usual “don’t do this at home”), on your local home network that is behind a firewall and/or gateway.

Edit: Philip Paeps sent me a few suggestions to restrict things even more. I have adapted this blog item to mention all of them.

Unfortunately there are companies that make and made printers and then don’t provide firmware upgrades for them.

You end up with a scanner/printer that works perfectly fine, except you can’t find any SMTP servers without TLS anymore, because, rightfully so, more or less every e-mail provider has turned off plain-text SMTP.

Now your printer cannot send the scanned documents over E-mail anymore. Firmware upgrade of your scanner/printer? What if there is none?

What to do? Run an unencrypted Postfix smtpd on some scrap computer on your local network, which relays your mails with its SMTP client to a TLS-enabled SMTP server.

apt-get install postfix # and select Internet site

I didn’t want fake security by having authentication on the smtpd side, as there will be no encryption between the printer and our local Postfix. If somebody listens on your local network, they would not only have the PDFs that you scanned with your scanner/printer, they would also have those authentication tokens.

I suppose you can add authentication, but then at least don’t be silly and use usernames and passwords that you use somewhere else. Note that some really old scanners/printers also can’t do SMTP with authentication.

I used these relay_restrictions for smtpd. We will place the printer’s IP address within mynetworks, so we will relay for it through permit_mynetworks.

smtpd_relay_restrictions = permit_mynetworks reject_unauth_destination

The mydestination should be as restricted as possible so we’ll just empty it (no local delivery at all):

mydestination =
local_recipient_maps =
local_transport = error:local delivery is disabled

You can now also remove the local delivery agent from master.cf if you want. We restrict the mynetworks of course too. The scanner/printer is with DHCP on 192.168.0.0/24, so add that:

mydestination =
mynetworks = 127.0.0.0/8 192.168.0.0/24

Even better is to have your scanner/printer on a fixed IP address and then use that one IP address:

mynetworks = 127.0.0.0/8 192.168.0.11

In fact, when the scrap computer has a fixed IP address then you can further restrict things by using inet_interfaces of course:

inet_interfaces = 192.168.0.14, 127.0.0.1

And now we configure the relayhost, we will relay to our TLS enabled SMTP(s) server:

relayhost = [smtps.server.org]:587
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_security_options = noanonymous
smtp_tls_security_level = encrypt
header_size_limit = 4096000
message_size_limit = 524288000

Now make a file /etc/postfix/sasl_passwd to add the username / password. After that of course run postmap /etc/postfix/sasl_passwd

[smtps.server.org]:587 username:password

On the printer, we now configure the scrap computer’s local network IP address (or DHCP hostname) as the SMTP server.

Other possible security considerations are to place the printer and the scrap computer on a VLAN, or to connect them with a crossed UTP cable. But if you are going to do those things, then you also know how to do it yourself. Such things do make sense: your easily hackable LED lamps and even more easily hackable Smart TV don’t need to talk to your printer nor to this scrap computer. VLAN everything!

September 14, 2022

It has been long overdue, but I’m finally almost ready to release Autoptimize Pro. The first version of AO Pro (an add-on to be installed on top of Autoptimize) will have image optimization, CDN, fully automated critical CSS and a number of “boosters” to improve performance even more, with more features in the backlog. If you’re interested in beta-testing fill out the form below and based on...

Source

September 13, 2022

FOSDEM 2023 will take place on Saturday 4th and Sunday 5th of February 2023. We are aiming for a physical conference. Update: We initially said "2022", which was a typo, of course. Sorry about that. While there are still a lot of open questions, we are aiming for a physical FOSDEM 2023. It's both exciting and scary. We will open the Call for Devrooms etc. soon. One note: With the online format, we were able to host more devrooms than ever. In the physical world, we will be constrained by the amount and size of the rooms ULB is able to provide.

September 09, 2022

More and more people are using UUIDs to identify records in their database.

As you already know, for MySQL’s storage engine (InnoDB) the primary key is very important! (for performance, memory and disk space).

See the following links:

Problems

There are 2 major problems with having a UUID as Primary Key in InnoDB:

  1. generally they are random and cause the clustered index to be rebalanced
  2. they are included in each secondary index (consuming disk and memory)

Let’s have a look at this example:

MySQL > CREATE TABLE my_table ( 
       uuid VARCHAR(36) DEFAULT (UUID()) PRIMARY KEY, 
       name VARCHAR(20), beers int unsigned);
...

MySQL > SELECT * FROM my_table;
+--------------------------------------+---------+-------+
| uuid                                 | name    | beers |
+--------------------------------------+---------+-------+
| 17cd1188-1fa0-11ed-ba36-c8cb9e32df8e | Kenny   |     0 |
| 17cd12e2-1fa0-11ed-ba36-c8cb9e32df8e | lefred  |     1 |
| 478368a0-1fa0-11ed-ba36-c8cb9e32df8e | Scott   |     1 |
| 47836a77-1fa0-11ed-ba36-c8cb9e32df8e | Lenka   |     0 |
+--------------------------------------+---------+-------+

Now, let’s insert 2 new records:

MySQL > INSERT INTO my_table (name, beers) VALUES ("Luis",1), ("Miguel",5);

We can check the content of the table:

MySQL > SELECT * FROM my_table;
+--------------------------------------+---------+-------+
| uuid                                 | name    | beers |
+--------------------------------------+---------+-------+
| 17cd1188-1fa0-11ed-ba36-c8cb9e32df8e | Kenny   |     0 |
| 17cd12e2-1fa0-11ed-ba36-c8cb9e32df8e | lefred  |     1 |
| 36f1ce9a-1fa1-11ed-ba36-c8cb9e32df8e | Luis    |     1 |  <--
| 36f1d158-1fa1-11ed-ba36-c8cb9e32df8e | Miguel  |     5 |  <--
| 478368a0-1fa0-11ed-ba36-c8cb9e32df8e | Scott   |     1 |
| 47836a77-1fa0-11ed-ba36-c8cb9e32df8e | Lenka   |     0 |
+--------------------------------------+---------+-------+

We can see that the two new records were not inserted at the end of the table but in the middle. InnoDB had to move two old records to be able to insert the two new ones before them. On such a small table (all records are on the same page) that doesn’t cause any problem, but imagine this table is 1TB large!
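
If you want to observe this page-splitting effect on a bigger table, InnoDB’s metrics can help. Here is a minimal sketch, assuming the index_page_splits counter from information_schema.INNODB_METRICS (disabled by default):

-- enable the page split counter (illustrative; pick the counters you need)
SET GLOBAL innodb_monitor_enable = 'index_page_splits';

-- ... run the random-UUID inserts, then check how often InnoDB had to split pages
SELECT NAME, `COUNT`
  FROM information_schema.INNODB_METRICS
 WHERE NAME = 'index_page_splits';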

Additionally, if we keep the VARCHAR datatype for our uuid, the primary key could take 146 bytes per row (some utf8 characters can take up to 4 bytes, plus the 2 bytes marking the end of the VARCHAR):

MySQL > EXPLAIN SELECT * FROM my_table WHERE 
        uuid='36f1d158-1fa1-11ed-ba36-c8cb9e32df8e'\G
 *************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: my_table
   partitions: NULL
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 146        <--
          ref: const
         rows: 1
     filtered: 100
        Extra: NULL

Solutions

Of course there are some best practices that MySQL users can follow to avoid those problems:

  1. use a smaller datatype to store their UUIDs: BINARY(16)
  2. store the UUIDs sequentially: use UUID_TO_BIN(..., swap_flag)
    • The time-low and time-high parts (the first and third groups of hexadecimal digits, respectively) are swapped.

Let’s see this in action with the following example:

MySQL > CREATE TABLE my_table2 ( 
           uuid BINARY(16) DEFAULT (UUID_TO_BIN(UUID(), 1)) PRIMARY KEY, 
           name VARCHAR(20), beers int unsigned);

MySQL > SELECT * FROM my_table2;
+------------------------------------+--------+-------+
| uuid                               | name   | beers |
+------------------------------------+--------+-------+
| 0x11ED1F9F633ECB6CBA36C8CB9E32DF8E | Kenny  |     0 |
| 0x11ED1F9F633ECD6FBA36C8CB9E32DF8E | lefred |     1 |
+------------------------------------+--------+-------+

As the UUID is now binary, we need to decode it using the function BIN_TO_UUID() and not forget the swap flag:

MySQL > SELECT BIN_TO_UUID(uuid,1), name, beers FROM my_table2;
+--------------------------------------+--------+-------+
| BIN_TO_UUID(uuid,1)                  | name   | beers |
+--------------------------------------+--------+-------+
| 633ecb6c-1f9f-11ed-ba36-c8cb9e32df8e | Kenny  |     0 |
| 633ecd6f-1f9f-11ed-ba36-c8cb9e32df8e | lefred |     1 |
+--------------------------------------+--------+-------+

And now we can verify that when we add new entries they are added to the end of the table:

MySQL > INSERT INTO my_table2 (name, beers) VALUES ("Scott",1), ("Lenka",5); 

MySQL > SELECT * FROM my_table2;
+------------------------------------+---------+-------+
| uuid                               | name    | beers |
+------------------------------------+---------+-------+
| 0x11ED1F9F633ECB6CBA36C8CB9E32DF8E | Kenny   |     0 |
| 0x11ED1F9F633ECD6FBA36C8CB9E32DF8E | lefred  |     1 |
| 0x11ED1FA537C57361BA36C8CB9E32DF8E | Scott   |     1 |  <--
| 0x11ED1FA537C5752DBA36C8CB9E32DF8E | Lenka   |     5 |  <--
+------------------------------------+---------+-------+

and we can of course decode the UUID and see that without the swap flag, InnoDB would have to rebalance the clustered index:

MySQL > SELECT BIN_TO_UUID(uuid,1), name, beers FROM my_table2;
+--------------------------------------+---------+-------+
| BIN_TO_UUID(uuid,1)                  | name    | beers |
+--------------------------------------+---------+-------+
| 633ecb6c-1f9f-11ed-ba36-c8cb9e32df8e | Kenny   |     0 |
| 633ecd6f-1f9f-11ed-ba36-c8cb9e32df8e | lefred  |     1 |
| 37c57361-1fa5-11ed-ba36-c8cb9e32df8e | Scott   |     1 |  <--
| 37c5752d-1fa5-11ed-ba36-c8cb9e32df8e | Lenka   |     5 |  <--
+--------------------------------------+---------+-------+

And of course, now the size of the primary key is smaller and fixed to 16 bytes. Only those 16 bytes are added to all secondary indexes:

MySQL > EXPLAIN SELECT * FROM my_table2 
        WHERE uuid=UUID_TO_BIN("37c5752d-1fa5-11ed-ba36-c8cb9e32df8e",1)\G
 *************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: my_table2
   partitions: NULL
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 16        <---
          ref: const
         rows: 1
     filtered: 100
        Extra: NULL

UUID v1

MySQL generates UUID v1 as described in RFC4122.

  • UUID v1 : is a universally unique identifier that is generated using a timestamp and the MAC address of the computer on which it was generated.
  • UUID v4 : is a universally unique identifier that is generated using random numbers.

With UUID v4, it’s not possible to generate any sequential output, and this is why those random UUIDs should never be used as Primary Keys with InnoDB.

UUID v4

Some developers keep asking about UUID v4 and how to generate them for MySQL. Browsing the Internet, you can find several stored functions trying to achieve this.

This one, found on StackOverflow, is maybe my favorite:

CREATE FUNCTION uuid_v4s()
    RETURNS CHAR(36)
BEGIN
    -- 1st and 2nd blocks are made of 6 random bytes
    SET @h1 = HEX(RANDOM_BYTES(4));
    SET @h2 = HEX(RANDOM_BYTES(2));

    -- 3rd block will start with a 4 indicating the version, remaining is random
    SET @h3 = SUBSTR(HEX(RANDOM_BYTES(2)), 2, 3);

    -- 4th block first nibble can only be 8, 9 A or B, remaining is random
    SET @h4 = CONCAT(HEX(FLOOR(ASCII(RANDOM_BYTES(1)) / 64)+8),
                SUBSTR(HEX(RANDOM_BYTES(2)), 2, 3));

    -- 5th block is made of 6 random bytes
    SET @h5 = HEX(RANDOM_BYTES(6));

    -- Build the complete UUID
    RETURN LOWER(CONCAT(
        @h1, '-', @h2, '-4', @h3, '-', @h4, '-', @h5
    ));
END

Unfortunately this function cannot be used as default expression for a column.
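
To illustrate (a hedged sketch re-using the uuid_v4s() function above; the table name is hypothetical and the exact error text may vary slightly):

CREATE TABLE my_table_v4 (
    uuid CHAR(36) DEFAULT (uuid_v4s()) PRIMARY KEY,
    name VARCHAR(20)
);
-- fails with something like:
-- ERROR 3770 (HY000): Default value expression of column 'uuid'
--                     contains a disallowed function: uuid_v4s.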

I also wrote a component using boost’s uuid library: https://github.com/lefred/mysql-component-uuid_v4

But this new function is also not usable as default value expression.

MySQL error code MY-003770 (ER_DEFAULT_VAL_GENERATED_NAMED_FUNCTION_IS_NOT_ALLOWED): Default value expression of column '%s' contains a disallowed function: %s.

This means that every new record needs to provide the uuid column… this is not too complicated anyway.

Let’s see an example:

MySQL > install component "file://component_uuid_v4";

MySQL > select uuid_v4() ;
+--------------------------------------+
| uuid_v4()                            |
+--------------------------------------+
| 9944272b-e3f9-4778-9c54-818f0baa87da |
+--------------------------------------+
1 row in set (0.0002 sec)

Now we will create a new table, but as recommended, we won’t use the uuid as Primary Key! We will use a new feature of MySQL 8.0.30: GIPK Mode!

GIPK stands for Generated Invisible Primary Key, check the manual for more info.

MySQL > SET sql_generate_invisible_primary_key=1;
 
MySQL > CREATE TABLE my_table3 (   
            uuid BINARY(16) NOT NULL UNIQUE,
            name VARCHAR(20), beers INT UNSIGNED);

MySQL > SHOW CREATE TABLE my_table3\G
*************************** 1. row ***************************
       Table: my_table3
Create Table: CREATE TABLE `my_table3` (
  `my_row_id` bigint unsigned NOT NULL AUTO_INCREMENT /*!80023 INVISIBLE */,
  `uuid` binary(16) NOT NULL,
  `name` varchar(20) DEFAULT NULL,
  `beers` int unsigned DEFAULT NULL,
  PRIMARY KEY (`my_row_id`),
  UNIQUE KEY `uuid` (`uuid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

Now let’s insert some records and see if they are inserted sequentially and if the UUIDs’ values are completely random:

MySQL > INSERT INTO my_table3 (uuid, name, beers) 
        VALUES (UUID_TO_BIN(uuid_v4()),'Kenny', 3),
               (UUID_TO_BIN(uuid_v4()), 'lefred', 1);

MySQL > SELECT * FROM my_table3;
+------------------------------------+--------+-------+
| uuid                               | name   | beers |
+------------------------------------+--------+-------+
| 0x5A28E5482CDF4B3D89A298ECA3F3703B | Kenny  |     3 |
| 0x94662BF4DC2F469489D868820B7B31E5 | lefred |     1 |
+------------------------------------+--------+-------+

MySQL > SELECT BIN_TO_UUID(uuid), name, beers FROM my_table3;
+--------------------------------------+--------+-------+
| bin_to_uuid(uuid)                    | name   | beers |
+--------------------------------------+--------+-------+
| 5a28e548-2cdf-4b3d-89a2-98eca3f3703b | Kenny  |     3 |
| 94662bf4-dc2f-4694-89d8-68820b7b31e5 | lefred |     1 |
+--------------------------------------+--------+-------+

So far, so good.. let’s add some more records:

MySQL > INSERT INTO my_table3 (uuid, name, beers) 
        VALUES (UUID_TO_BIN(uuid_v4()),'Scott', 10), 
               (UUID_TO_BIN(uuid_v4()), 'Lenka', 0);

MySQL > SELECT BIN_TO_UUID(uuid), name, beers FROM my_table3;
+--------------------------------------+--------+-------+
| bin_to_uuid(uuid)                    | name   | beers |
+--------------------------------------+--------+-------+
| 5a28e548-2cdf-4b3d-89a2-98eca3f3703b | Kenny  |     3 |
| 94662bf4-dc2f-4694-89d8-68820b7b31e5 | lefred |     1 |
| 615fae32-d6c8-439c-9520-5d3c8bfa934b | Scott  |    10 |
| 80a01a29-489b-419d-bca1-05a756ad9d9d | Lenka  |     0 |
+--------------------------------------+--------+-------+

We can see that indeed the UUIDs are completely random and yet added sequentially to the table. The reason for that optimal sequential insertion is that the invisible Primary Key is an auto_increment.

It’s possible to also display it on demand:

MySQL > SELECT my_row_id, BIN_TO_UUID(uuid), name, beers FROM my_table3;
+-----------+--------------------------------------+--------+-------+
| my_row_id | bin_to_uuid(uuid)                    | name   | beers |
+-----------+--------------------------------------+--------+-------+
|         1 | 5a28e548-2cdf-4b3d-89a2-98eca3f3703b | Kenny  |     3 |
|         2 | 94662bf4-dc2f-4694-89d8-68820b7b31e5 | lefred |     1 |
|         3 | 615fae32-d6c8-439c-9520-5d3c8bfa934b | Scott  |    10 |
|         4 | 80a01a29-489b-419d-bca1-05a756ad9d9d | Lenka  |     0 |
+-----------+--------------------------------------+--------+-------+

Conclusion

In summary, if you want to use UUIDs in MySQL, it’s recommended to use UUID v1, the ones generated by MySQL, and to store them as binary using the swap flag.

If for some reason you need UUID v4, it is recommended to let MySQL and InnoDB handle the primary key by enabling GIPK mode.

Enjoy MySQL !

September 08, 2022

You know how much I praise and like MySQL Shell, but if, like me, old habits die hard for you too, I advise you to create these aliases in your ~/.bashrc (or ~/.bash_aliases) file to force yourself to use MySQL Shell even for a small statement check:

alias mysql="mysqlsh --sql mysql://localhost"
alias mysqlx="mysqlsh --js mysqlx://localhost"

Of course you can specify the user you want; by default it uses the same user as the system one.

For example, if this is your test machine and you want to always use the root account (even if not recommended), you can specify it like this by modifying the URI:

alias mysql="mysqlsh --sql mysql://root@localhost"

Example:

So now, when using mysql, MySQL Shell is launched and connects directly to localhost in SQL mode using the classic protocol.

With the mysqlx command, MySQL Shell is started and connects to localhost using the X Protocol, and my session is now in JavaScript mode:

Two easy aliases to stop using the old mysql client in my terminal.

September 06, 2022

Recently, somebody asked me how he can find the long running transactions in MySQL.

I already have one MySQL Shell plugin that allows you to find the current transactions sorted by time. The plugin allows you to also get the details about the desired transaction. See check.getRunningStatements().

Let’s see how we can easily find those long transactions that can be a nightmare for the DBAs (see the MySQL History List Length post).

SELECT thr.processlist_id AS mysql_thread_id,
       concat(PROCESSLIST_USER,'@',PROCESSLIST_HOST) User,
       Command,
       FORMAT_PICO_TIME(trx.timer_wait) AS trx_duration,
       current_statement as `latest_statement`
  FROM performance_schema.events_transactions_current trx
  INNER JOIN performance_schema.threads thr USING (thread_id)
  LEFT JOIN sys.processlist p ON p.thd_id=thread_id
 WHERE thr.processlist_id IS NOT NULL 
   AND PROCESSLIST_USER IS NOT NULL 
   AND trx.state = 'ACTIVE'
 GROUP BY thread_id, timer_wait 
 ORDER BY TIMER_WAIT DESC LIMIT 10;

We can see that currently we have an active transaction that has been open for more than 43 minutes and is doing… nothing, it seems.

The sleeping ones are those that cause the most issues in practice, as they might be interactive sessions that have been forgotten and that will stay alive for a long, long time by default (8 hours, interactive_timeout).
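
As a hedged, illustrative follow-up: once such a forgotten session is identified, you can check the timeout settings and, if needed, terminate the offending connection using the mysql_thread_id returned by the query above (the id below is just a placeholder):

-- idle-session timeouts, in seconds (28800 = 8 hours by default)
SHOW GLOBAL VARIABLES LIKE '%interactive_timeout%';

-- kill the forgotten session (replace 12345 with the real mysql_thread_id)
KILL 12345;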

It’s also possible to list the statements that were performed in this transaction (limited to 10 by default, performance_schema_events_statements_history_size) if the instrumentation is enabled:

UPDATE performance_schema.setup_consumers 
       SET enabled = 'yes' 
  WHERE name LIKE 'events_statements_history_long' 
     OR name LIKE 'events_transactions_history_long';

Now that this is enabled, we can see the history for all new transactions using the following statement:

SELECT DATE_SUB(now(), INTERVAL (
         SELECT variable_value 
           FROM performance_schema.global_status 
           WHERE variable_name='UPTIME')-TIMER_START*10e-13 second) `start_time`,
       SQL_TEXT 
  FROM performance_schema.events_statements_history  
 WHERE nesting_event_id=(
               SELECT EVENT_ID 
                 FROM performance_schema.events_transactions_current t   
                 LEFT JOIN sys.processlist p ON p.thd_id=t.thread_id  
                 WHERE conn_id=<VALUE OF mysql_thread_id COLUMN>) 
 ORDER BY event_id;

Let’s try it:

As you can see, we can have a list of the previous statements that were executed in this long transaction.

Once again, Performance_Schema contains all what we need.

Enjoy MySQL and avoid long transactions !

After more than half a year of watching in shock how our European leaders produced one piece of announcement politics after another without doing anything substantial, the political class in the European quarter in Brussels finally seems to have woken up.

That Russia would impose counter-sanctions after we literally took away their national reserves was damn predictable, and from mid-February onwards it was a rock-hard certainty.

That the Russian gas would eventually be cut off was crystal clear. How much gas that was, was crystal clear. That the entire European economy would then grind to a halt was crystal clear.

We all find it quite amusing to laugh a bit at Germany right now. But very soon your job, which depends on German industry (think about that twice), will disappear. The laughing will be over while you get to pay gigantically high energy bills.

All clear and obvious. Crystal clear.

But what did the boys and girls in the European quarter in Brussels do about it? Absolutely nothing.

Announcement politics. Being populist. Vilifying dissidents: if you were against sanctions, you were for Putin. As if this is only about Putin and not the entire elite in the Kremlin. Some were even threatened. Yes, really. WE are sliding towards fascism and censorship. Very clear.

But if you put your economy at war with Russia, you should also be prepared. We are not, and we were not. Not at all.

The populist announcement politics of the past year will ensure that our European economy simply gets driven into the ground.

We could have devised intelligent sanctions. But that was not tough enough. We could have negotiated with Russia. But then you were vilified.

Or is that not true? Then look at Macron, who tried, and was vilified by the loudmouths every time.

But see how Erdogan does do it, and has put his Turkey firmly on the map with it. How the whole Western European world literally just does not matter in the Ukraine conflict. But Turkey very much does.

Our Western leaders are a bunch of populist Twitter accounts, but without an army of their own. So they are nothing. Nothing that matters.

We have no strategy. We have completely unlearned that. We muddle along on populism and shouting the loudest. I mean, tweeting.

And that, that does not work in wartime.

August 30, 2022

A notorious ex-DD decided to post garbage on his site in which he links my name to the suicide of Frans Pop, and mentions that my GPG key is currently disabled in the Debian keyring, along with some manufactured screenshots of the Debian NM site that allegedly show I'm no longer a DD. I'm not going to link to the post -- he deserves to be ridiculed, not given attention.

Just to set the record straight, however:

Frans Pop was my friend. I never treated him with anything but respect. I do not know why he chose to take his own life, but I grieved for him for a long time. It saddens me that Mr. Notorious believes it a good idea to drag Frans' name through the mud like this, but then, one can hardly expect anything else from him by this point.

Although his post is mostly garbage, there is one bit of information that is correct, and that is that my GPG key is currently no longer in the Debian keyring. Nothing sinister is going on here, however; the simple fact of the matter is that I misplaced my OpenPGP key card, which means there is a (very very slight) chance that a malicious actor (like, perhaps, Mr. Notorious) would get access to my GPG key and abuse that to upload packages to Debian. Obviously we can't have that -- certainly not from him -- so for that reason, I asked the Debian keyring maintainers to please disable my key in the Debian keyring.

I've ordered new cards; as soon as they arrive I'll generate a new key and perform the necessary steps to get my new key into the Debian keyring again. Given that shipping key cards to South Africa takes a while, this has taken longer than I would have initially hoped, but I'm hoping at this point that by about halfway September this hurdle will have been taken, meaning, I will be able to exercise my rights as a Debian Developer again.

As for Mr. Notorious, one can only hope he will get the psychiatric help he very obviously needs, sooner rather than later, because right now he appears to be more like a goat yelling in the desert.

Ah well.

August 29, 2022

Since 8.0.30, you have the possibility to modify the InnoDB Redo Log Capacity online. An undersized Redo Log Capacity is problematic and leads to performance issues.

However, it’s not recommended to oversize the Redo Log either. Redo Log files consume disk space and increase the recovery time in case of a restart (innodb_fast_shutdown=1) or a sudden crash. And it also slows down shutdown when innodb_fast_shutdown=0.

This means that now, you don’t need to restart MySQL if you want to increase or decrease the size of the InnoDB Redo Logs files. In fact, we don’t talk anymore about file size but about capacity ! The DBA doesn’t need to specify any file size and/or amount of files for Redo Logs anymore as it was the case previously.

New Redo Log Architecture

The new redo log size is defined by one single variable: innodb_redo_log_capacity (in bytes). The default is 100MB.

For example, you can change the Redo Log capacity to 200MB by using the following SQL statement:

set global innodb_redo_log_capacity=200*1024*1024;
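
To verify that the change took effect, a quick check like this can help (a minimal sketch; Innodb_redo_log_capacity_resized reports the capacity after the last completed resize operation):

SELECT @@global.innodb_redo_log_capacity;
SHOW GLOBAL STATUS LIKE 'Innodb_redo_log_capacity_resized';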

InnoDB creates 32 redo log files in MySQL’s datadir inside a new dedicated folder: #innodb_redo.

Inside that directory, you will be able to find two types of files:

  • #ib_redoXXX (where XXX is the file_id, a sequence number): those are the active redo log files
  • #ib_redoXXX_tmp: those are spare redo log files
example of InnoDB redo log folder’s content

InnoDB tries to maintain approximately 32 files here, so that it doesn’t need to wait long before one of them becomes no longer needed as it would if you had just 2 big files. This way it can reclaim them one by one when you want to resize them.

This is how it could be represented:

  • checkpoint_lsn (Innodb_redo_log_checkpoint_lsn) : an LSN point up to which all changes to the pages are guaranteed to have already been written and fsynced back to tablespace files – basically, the still needed portion of redo log starts here.
  • current_lsn (Innodb_redo_log_current_lsn) : the last written position in the redo log. That write could still be buffered inside MySQL processes buffer.
  • flushed_to_disk_lsn (Innodb_redo_log_flushed_to_disk_lsn) : the last position in the redo log that InnoDB has flushed to disk.

Now when we reach the end of the 31st file (90%), the log files governor will perform some cleanup and some active files that are not needed anymore will become the new spare ones:

When the background thread is not able to remove a log file from the left to put it to the right, user transactions will get stuck waiting for REDO buffers to be written to disk. DBAs get a warning in the error log notifying them to increase the InnoDB Redo Log Capacity:

 [Warning] [MY-013865] [InnoDB] Redo log writer is waiting for a new redo log file.
 Consider increasing innodb_redo_log_capacity.

Checkpointing

Let’s recap what the role of the Redo Log is. As you may already know, each time data is changed in InnoDB, the page(s) containing the data is (are) modified in memory (in the InnoDB Buffer Pool). The page(s) is (are) marked as dirty. In case of a sudden crash, we cannot lose all those changes… but the data in memory is gone! This is why the diff data of the pages is also written (and by default flushed to disk) in the redo logs. The data in those logs will be read only in case of InnoDB Recovery. During that process the modified pages will be reconstructed with the modified data.

InnoDB flushes those dirty pages from the Buffer Pool (memory) to the table spaces (disk) in small batches, step by step. This operation is called Fuzzy Checkpointing.

Once the pages are written to the data files on disk (InnoDB tablespaces), the corresponding entries in the Redo Log are not required anymore. The position up to which InnoDB has written the data to the disk is the value of Innodb_redo_log_checkpoint_lsn.

InnoDB Checkpointing is adaptive. This means that considering the checkpoint age (log_lsn_checkpoint_age) InnoDB will decide to flush less or more aggressively.

The manual has a section about adaptive flushing.

For info, log_lsn_checkpoint_age and Innodb_redo_log_logical_size are almost equivalent:
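
One way to compare them side by side (a minimal sketch; the log_lsn_checkpoint_age counter may first need to be enabled with innodb_monitor_enable):

SELECT
  (SELECT `COUNT` FROM information_schema.INNODB_METRICS
    WHERE NAME = 'log_lsn_checkpoint_age') AS log_lsn_checkpoint_age,
  (SELECT VARIABLE_VALUE FROM performance_schema.global_status
    WHERE VARIABLE_NAME = 'Innodb_redo_log_logical_size') AS redo_log_logical_size;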

Back to the lsn checkpoint age and the redo log capacity thresholds:

  • soft limit for logical capacity: to avoid deadlocks, InnoDB doesn’t let the user transactions use up the whole innodb_redo_log_capacity – instead it keeps them below the soft logical capacity, which is roughly 30/32 of it. When this limitation is exceeded, all user threads are paused and a message is sent to the error_log
  • hard limit for logical capacity: this limitation is never exceeded. If space isn’t reclaimed after a 1 second wait when the limit is reached, as much of the log as possible is written, or InnoDB crashes!
  • async flush point (log_max_modified_age_async): writes are allowed but page flushing will be gradually increased to reach the next threshold. This will lead to a drop in performance. In the code, the async flush point can be called adaptive_flush_min_age. This is 7/8 of the soft logical capacity.
    However, in practice, it seems that the adaptive flushing already starts at innodb_adaptive_flushing_lwm (by default 10% of soft logical capacity), and reaches maximum allowed IO capacity already at 82% of the async flush point.
  • sync flush point (log_max_modified_age_sync): at this point the checkpointer will request page cleaners to flush as many dirty pages as needed to get the checkpoint age below this threshold, and will wait for it synchronously. Terrible performance. This is also called adaptive_flush_max_age. This is 15/16 of the soft logical capacity.
  • aggressive_checkpoint_min_age: this represents 31/32 of the soft logical capacity. When this point is reached, InnoDB has already asked the page cleaners to flush dirty pages from the Buffer Pool at full speed. The checkpointer will not sleep for 1 second between attempts to update the checkpoint lsn; instead it will request a sync checkpoint as often as possible and will also update the checkpoint_lsn value in the redo log header as soon as possible afterwards. This is done to be able to reclaim the space faster. As we are already at top speed, this doesn’t add any more pressure to the page cleaners.

Instrumentation

The new Redo Log is instrumented in Performance_Schema in the table innodb_redo_log_files:

This means there are 5 active redo log files and 27 (32-5) spare ones (_tmp):

All the files are also instrumented in Performance_Schema‘s file instance tables (file_instances and file_summary_by_instance):

As usual, there are status variables and InnoDB Metrics entries providing information about the “flushpointing” operations:

global status variables

innodb metrics
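For reference, both can also be queried directly from the client; a minimal example (the exact list of variables and metrics you see depends on your version and configuration):

SHOW GLOBAL STATUS LIKE 'Innodb_redo_log%';

SELECT name, subsystem, count, status
  FROM information_schema.INNODB_METRICS
 WHERE name LIKE 'log_%';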

The Snake Metaphor

The new Redo Log Architecture can be seen as a snake (the useful redo log information that is still needed to ensure the Durability of InnoDB) lying across cages (the redo log files). Those cages are consecutively connected with each other so that the snake can keep moving forward. When the snake reaches the end of the penultimate cage on the right, InnoDB takes the no longer needed cages from the left and places them on the right. The number of cages is always 32 (except in very special circumstances).

The snake can grow longer or shrink. When InnoDB flushes dirty pages from the Buffer Pool to the data files, the corresponding redo log information is no longer needed and the snake’s tail (checkpoint_lsn) also moves to the right. The cages remaining on the left of the snake’s tail contain the snake’s old skin.

Some Examples

Let’s have a look at some output and see how this can be illustrated.

Example 1 – no load

In the example above, we can see that the current_lsn, the checkpoint_lsn and the flushed_to_disk_lsn have the same value. They are all in the last active log (id 10844). In fact, there are no transactions at the moment.

The snake is at its minimal size (Innodb_redo_log_logical_size): 512 bytes. The checkpoint age is 0:

Example 2 – load

We see here that the snake is bigger, and still has plenty of room. If needed (when the snake reaches 90% of the penultimate cage), the cages containing the snake’s old skin will be re-used as new spare cages.

Redo Log Consumers

As I already explained, InnoDB Redo Logs are only read (consumed) during the InnoDB recovery process. But that is not the whole truth. As explained in this blog post, some other processes can also read the redo log files.

This is the list of potential “consumers”:

  1. InnoDB Recovery.
  2. a thread that used the UDF innodb_redo_log_consumer_register(). This is used by MEB (MySQL Enterprise Backup) for example. As this can cause considerable performance issues when badly used, the BACKUP_ADMIN privilege is required.
  3. Clone’s archiver.

Those processes can slow down the re-use of redo log files (their conversion to spare redo log files), meaning that the left-most cage cannot be moved to the right because it is still in use.

These consumers must provide InnoDB with their progress by sharing their current lsn.

In case of InnoDB Recovery (1), we know the oldest needed lsn by looking at the current checkpoint lsn. That means the checkpointer thread is one of the consumers, and it is always registered.

For (2), the consumer needs to use the UDF innodb_redo_log_consumer_advance(lsn) to move the cursor forward manually. The session needs to be ended to de-register the consumer.

And finally, when clone (3) is in progress, we register clone’s consumer and clone knows how and when to advance its current lsn (using an API).
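A minimal sketch of (2), using only the function names mentioned above (the lsn value is purely illustrative and the exact invocation may differ):

-- in a session holding the BACKUP_ADMIN privilege;
-- as long as the consumer is registered, redo log files cannot be reclaimed past its lsn
SELECT innodb_redo_log_consumer_register();

-- tell InnoDB how far this consumer has read (the lsn below is just an example)
SELECT innodb_redo_log_consumer_advance(19253356035);

-- ending the session de-registers the consumer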

Some Useful Queries

These are some interesting queries with their output. Most of them use the InnoDB Metrics table.

To enable all you need to run the following SQL statement:

SET GLOBAL innodb_monitor_enable = all;

You can also specify those you want to enable. You must do it one by one like this:

SET GLOBAL innodb_monitor_enable= 'log_lsn_checkpoint_age';

Overview of the redo log active files

SELECT file_id, start_lsn, end_lsn, 
       if(is_full=1,'100%',
          concat(round((((
               select VARIABLE_VALUE 
                 from performance_schema.global_status 
                where VARIABLE_NAME='Innodb_redo_log_current_lsn'
               )-start_lsn)/(end_lsn-start_lsn)*100),2),'%')) full,
          concat(format_bytes(size_in_bytes)," / " ,
          format_bytes(@@innodb_redo_log_capacity) ) file_size, 
      (select VARIABLE_VALUE from performance_schema.global_status 
        where VARIABLE_NAME='Innodb_redo_log_checkpoint_lsn') checkpoint_lsn,
      (select VARIABLE_VALUE from performance_schema.global_status 
        where VARIABLE_NAME='Innodb_redo_log_current_lsn') current_lsn, 
      (select VARIABLE_VALUE from performance_schema.global_status 
        where VARIABLE_NAME='Innodb_redo_log_flushed_to_disk_lsn') flushed_to_disk_lsn,
      (select count from information_schema.INNODB_METRICS 
        where name like 'log_lsn_checkpoint_age') checkpoint_age 
FROM performance_schema.innodb_redo_log_files;

Getting the Checkpoint Age

The checkpoint age is the difference between the current LSN and the LSN of the last checkpoint. As explained before, the checkpoint age should be almost equivalent to the value of the InnoDB Redo Log Logical Size.

This is the query I use to display it (we already saw the query earlier):

select concat(count, " (", format_bytes(count), ")") log_lsn_checkpoint_age,
       concat(variable_value, " (", format_bytes(variable_value),")")
             innodb_redo_log_logical_size 
 from information_schema.innodb_metrics 
 join performance_schema.global_status 
where variable_name like 'innodb_redo_log_logical_size' 
      and name like 'log_lsn_checkpoint_age';

Please pay attention: the relevant InnoDB Metrics must be enabled to get a coherent result.

Without having the metrics enabled, you could end up with strange invalid results like this:
+------------------------+------------------------------+
| log_lsn_checkpoint_age | innodb_redo_log_logical_size |
+------------------------+------------------------------+
| 5976089 (5.70 MiB)     | 512 ( 512 bytes)             |
+------------------------+------------------------------+

This is just incorrect, log_lsn_checkpoint_age should be 0 ! You can verify with the following query that the required metric is enabled:

SELECT status FROM INFORMATION_SCHEMA.INNODB_METRICS
WHERE NAME like 'log_lsn_checkpoint_age';
+----------+
| status   |
+----------+
| disabled |
+----------+

Calculating the optimal InnoDB Redo Log Capacity

During peak traffic time, you can get an estimate of the required Redo Log Capacity by running the queries below (they can be pasted as one single line):

select VARIABLE_VALUE from performance_schema.global_status
 where VARIABLE_NAME='Innodb_redo_log_current_lsn' into @a;
select sleep(60) into @garb;
select VARIABLE_VALUE from performance_schema.global_status
 where VARIABLE_NAME='Innodb_redo_log_current_lsn' into @b;
select format_bytes(abs(@a - @b)) per_min, format_bytes(abs(@a - @b)*60) per_hour;

The rule of thumb is to make the Redo Log Capacity large enough to hold at least 1 hour of logs: if it is too small, InnoDB is forced to flush dirty pages too often; if it is too large, you waste disk space and recovery takes longer.
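The capacity can then be adjusted online; the numbers below are purely illustrative, not a recommendation:

-- if the query above reported, say, ~500 MiB of redo per hour,
-- 1 GiB would leave some headroom:
SET GLOBAL innodb_redo_log_capacity = 1 * 1024 * 1024 * 1024;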

Redo Log usage view

When the appropriate InnoDB Metrics are enabled, it’s also possible to get an overview of the Redo Log’s usage and see where we are in relation to the soft and hard redo log logical capacity:

select concat(variable_value, " (", 
              format_bytes(variable_value),")") innodb_redo_log_logical_size,
       concat(round(count*8/7), " (", 
              format_bytes(round(count*8/7)), ")") soft_logical_capacity,
       concat(round(@@innodb_redo_log_capacity*29.8/32), " (",
              format_bytes(round(@@innodb_redo_log_capacity*29.8/32)) ,")") hard_logical_capacity,
       concat(@@innodb_redo_log_capacity, " (",
              format_bytes(@@innodb_redo_log_capacity) ,")") redo_log_capacity,              
       concat(round(variable_value / (count*8/7)*100,2), "%") logical_used,
       concat(round(variable_value / (@@innodb_redo_log_capacity*29.8/32)*100,2), "%") hard_used  
  from performance_schema.global_status 
  join information_schema.innodb_metrics 
 where variable_name like 'innodb_redo_log_logical_size' 
   and name like 'log_max_modified_age_async';

A graph a day keeps the doctor away !

As I always say, trending your database can help you understand the workload and prevent issues.

I’ve updated my MySQL Shell plugins that collect and generate graphs for MySQL with the new MySQL 8.0.30 redo log. I explained in a previous post how to use it.

Here is some output from these new graphs:

If we check the graph above, we can see the different threshold limits, and we can see that we never had to trigger anything special. This graph alone is not enough to tell whether we hit an issue. On the other hand, had we crossed even one of the horizontal lines, that would have been problematic.

This graph shows the difference between the current_lsn and the lsn_flushed_to_disk. If the difference constantly stays high, it could mean that the sync operation to disk is too slow.

However, we can see something strange on the graph (pointed at by the red arrow). The difference stayed the same for a long time. This means there was data in the MySQL process’s buffers that could not be sent to the redo log files on disk. The next graph gives us the reason for this:

We can see that twice we reached 32 active InnoDB Redo Log files. This means no spares at all. This should not be possible… unless something else is also consuming the Redo Log. In fact, in another session, I had registered a consumer and didn’t advance the lsn.

We can see that for the same period, there was no new lsn, nothing could be written.

InnoDB was of course not able to flush any new pages, as the checkpointing operations (ib_pg_flush_co & ib_log_checkpt) were blocked waiting for the redo log fsync.

In sysbench’s output, we can see that no inserts are made (I’m using oltp_insert.lua):

qps: 0.00 (r/w/o: 0.00/0.00/0.00)

And in MySQL’s processlist we can see plenty of insert queries in the waiting for handler commit state.

Pay attention: if you stay in that state for 600 seconds, mysqld will intentionally crash!

Conclusion

This new InnoDB Redo Log architecture seems to scale very well, and the ability to resize it dynamically online really makes a difference!

As usual, enjoy MySQL !

Thank you very much to Kuba Lopuszanski from the InnoDB Team for helping me and reviewing this article.

Here’s a scenario: ten years ago, you renovated your bathroom. As a shower, you chose a walk-in shower with Belgian blue stone, which is a porous limestone.

That’s why you have to be careful with products like Antikal: they react with the lime in your limestone. In other words, you have to rinse away the excess Antikal quickly.

But you’re a computer geek, and because your mind is already on some piece of software you programmed, you sometimes forget this.

Thus, a drop of Antikal remains on your blue stone.

A few hours later, that drop became a white stain. Which now really seems to be inside your stone.

Unfortunately, it really is. No matter what you try, it cannot be scrubbed off. Unless you let a professional come and polish a layer off of your stone (which costs about 800 euros, by the way).

Unfortunately, not much later the stains come back. Because the stains are deep in your stone and the reaction ‘continues’. That was 800 euros for a few weeks’ peace of mind.

Now what?

Of course you buy a whole gang of HG products such as HG blauwesteen reiniger, HG grafsteenreiniger and HG natuursteen kleurvlekkenverwijderaar.

You spray it on and let it soak in for half an hour; the stains are now indeed a little better. But not really.

What actually worked better was the Mr. Proper Magic sponge. That sanded the top layer of that white stain away a bit.

But what worked best after that was a coat of Moeller Stone Care HMK P333 Hardsteenolie.

Your Belgian Blue stone will become a Belgian dark blue stone. But the stains (which I had first rubbed out with the Mr. Proper Magic sponge) are less visible and seem to belong to the stone’s natural life.

Then, after two full days letting the stuff get impregnated, a sticky top layer appeared. I had probably used a little too much of that stuff.

But another heavy treatment with the whole gang of HG products, such as HG blauwesteen reiniger, HG grafsteenreiniger and HG natuursteen kleurvlekkenverwijderaar (all at the same time and soak for half an hour) followed by rubbing hard with an iron sponge, and the sticky top layer is gone.

The stone is actually quite beautiful now. Dark blue.

August 26, 2022

The IT world is littered with frameworks, best practices, reference architectures and more. In an ever-lasting attempt to standardize IT, we often get lost in too many standards or specifications. For consultants, this is a gold-mine, as they jump in to support companies - for a fee, naturally - in adopting one or more of these frameworks or specifications.

While having references and specifications isn't a bad thing, there are always pros and cons.

August 22, 2022

Sometimes, it's useful to get a notification that a command has finished doing something you were waiting for:

make my-large-program && notify-send "compile finished" "success" || notify-send "compile finished" "failure"

This will send a notification message with the title "compile finished", and a body of "success" or "failure" depending on whether the command completed successfully, and allows you to minimize (or otherwise hide) the terminal window while you do something else, which can be a very useful thing to do.

It works great when you're running something on your own machine, but what if you're running it remotely?

There might be something easy to do, but I whipped up a bit of Perl instead:

#!/usr/bin/perl -w

use strict;
use warnings;

use Glib::Object::Introspection;
Glib::Object::Introspection->setup(
    basename => "Notify",
    version => "0.7",
    package => "Gtk3::Notify",
);

use Mojolicious::Lite -signatures;

Gtk3::Notify->init();

get '/notify' => sub ($c) {
    my $msg = $c->param("message");
    if(!defined($msg)) {
        $msg = "message";
    }
    my $title = $c->param("title");
    if(!defined($title)) {
        $title = "title";
    }
    app->log->debug("Sending notification '$msg' with title '$title'");
    my $n = Gtk3::Notify::Notification->new($title, $msg, "");
    $n->show;
    $c->render(text => "OK");
};

app->start;

This requires the packages libglib-object-introspection-perl, gir1.2-notify-0.7, and libmojolicious-perl to be installed, and can then be started like so:

./remote-notify daemon -l http://0.0.0.0:3000/

(assuming you did what I did and saved the above as "remote-notify")

Once you've done that, you can just curl a notification message to yourself:

curl 'http://localhost:3000/notify?title=test&message=test+body'

Doing this via localhost is rather silly (much better to use notify-send for that), but it becomes much more interesting if you run this on your laptop and trigger it from a remote system.
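One way to reach it from the remote machine (this is my own suggestion, not part of the original setup; the hostname is a placeholder) is a reverse SSH tunnel, so that localhost:3000 on the remote system points back at the daemon on your laptop:

# on your laptop: forward the remote's port 3000 back to the local daemon
ssh -R 3000:localhost:3000 user@remotehost

# then, on the remote system:
make my-large-program \
  && curl 'http://localhost:3000/notify?title=compile+finished&message=success' \
  || curl 'http://localhost:3000/notify?title=compile+finished&message=failure'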

An obvious TODO would be to add in some form of security, but that's left as an exercise to the reader...

August 21, 2022

Due to circumstances I have been able to read a bit less, but here are another three books:



De Hel van Deurne-Noord is a collection of testimonies from footballers, coaches and reporters about their experiences at the Bosuil stadium. It dates from 2016 or so, I think, so it is mostly dreaming of a great Royal Antwerp FC. I don’t think anyone back then saw Naingolan or Alderweireld as Antwerp players.

The book reads very smoothly; each testimony is only a page or two, sometimes three. Ideal for bus and train rides.


Ons Antwerpen was a pleasant surprise! This is one of the best books I have ever read. It dates from around 1931, so the authors have long since died, otherwise I would certainly have thanked them.

In this book you walk around the Antwerp of 90 years ago. It mentions, for example, that there are plans to create a Rubenshuis near the Meir/Wapperstraat. The square that is now called ‘Wapper’ did not yet exist. There is also no Astridplein (it was then called Statieplein), but they had just had more than ten thousand ships in Antwerp in a single year.

I intend to walk around with this book some time, because literally every street they visit is mentioned, down to the house numbers. So the route should be entirely traceable.


Starten met C was my introduction to programming in C (until then I only knew Basic, Assembler and PASCAL, ah and COBOL). This was our course booklet in Leuven in 1991, I think... and it made me a big fan of C.

I am reading it again now and enjoying it a lot. So still a fan of C, although unfortunately I am not such a good programmer at the moment. Now that I have almost finished it, I will quickly fix up my odio application before anyone can see that farce :)



August 19, 2022

Yesterday I had the day off, so I decided to finally upgrade my Thinkpad X13 to Ubuntu 22.04. Jammy Jellyfish (as the release is nicknamed) is nice, but the new default display server, Wayland, blocks Shutter and other non-native screenshot apps from taking screenshots. This interferes with my support workflow, as I use Shutter not only to take the screenshot but also to edit and upload it to imgur.

Source

August 16, 2022

During my summer holidays, we released MySQL 8.0.30.

MySQL 8.0.30 was released on July 26th 2022. If you are using 8.0.29, it’s highly recommended to upgrade as there was a bug corrupting InnoDB files in a rare and specific case.

This new release contains several contributions from our awesome Community and, on behalf of the entire MySQL Team, as usual, I would like to thank all our contributors!

This new release contains patches from Facebook/Meta, Marcelo Altmann (Percona), Mengchu Shi, Zhou Xinjing (Tencent), Yuxiang Jiang (Tencent), Namrata Bhave, Hongyuan Li, Alexey Kopytov, Wei Zhao, Piotr Jurkiewicz, Dennis Gao, Zheng Lai, Bin Wang, Hao Wu (Tencent), Weijie Wu and Rahul Malik (Percona).

Once again, thank you all for your great contributions and to the companies you are working for.

Here is the list of the above contributions and related bugs:

InnoDB

  • #103211 – The innodb_doublewrite system variable, which enables or disables the doublewrite buffer, has two new settings, DETECT_ONLY and DETECT_AND_RECOVER. With the DETECT_ONLY setting, database page content is not written to the doublewrite buffer, and recovery does not use the doublewrite buffer to fix incomplete page writes. This lightweight setting is intended for detecting incomplete page writes only. The DETECT_AND_RECOVER setting is equivalent to the existing ON setting – Facebook
  • #107613 – A TRUNCATE TABLE operation failed to remove data dictionary entries for columns that were dropped using ALGORITHM=INSTANT – Marcelo Altmann from Percona
  • #106687 (private) – mysqld may crash due to a debug assertion failure – Mengchu Shi
  • #100512 – Enabling the adaptive hash index (AHI) on a high-concurrency instance caused temporary AHI search latch contention while the hash index was being built – Zhou Xinjing from CDB Team at Tencent
  • #81814 – InnoDB adaptive hash index uses a bad partitioning algorithm for the real world – Alexey Kopytov
  • #102733 (private) – Server crash in specific mtr test cases launched in parallel – Zheng Lai (base fixed version: 5.7.38)
  • #106630 – Missing declaration of srv_log_writer_timeout – Hao Wu from Tencent
  • #107207 – Assertion Failure buf0buf.cc:4318:page_size.equals_to(space_page_size) – Rahul Malik from Percona

Clients / Connectors

  • #96369 – A new mysqldump option --mysqld-long-query-time lets you set a custom value of the long_query_time system variable for mysqldump’s session. Use the new option to increase the elapsed time allowed for mysqldump’s queries before they are written to the slow query log file, in order to avoid unnecessary logging – Facebook
  • #92260 – Connector/Python – With caching_sha2_password SSL should not be required on Unix socket – Piotr Jurkiewicz
  • #106779 – Connector/J – Fix typo in PropertyDefinitions – Weijie Wu

Performance Schema

  • #106939 – Under certain circumstances TRUNCATE performance_schema.accounts caused duplicated counts in global_status. This occurred if some hosts were not instrumented. For example, if performance_schema_hosts_size was set to a low value – Yuxiang Jiang from Tencent

Compiling

  • #107081 – On RHEL 7.x, fetching the CPU cache line size returned 0 on s390x RHEL 7.x which caused rpl_commit_order_queue and integrals_lockfree_queue to fail – Namrata Bhave
  • #106824 – Added a cycle timer for the s390x architecture – Namrata Bhave
  • #106555 – Fixed an assert definition in SetOsLimitMaxOpenFiles – Hongyuan Li

XA Transactions

  • #87560 – XA PREPARE log order error in replication and binlog recovery – Wei Zhao
  • #98616 – XA PREPARE/XA COMMIT/XA ROLLBACK lost if mysql crash just after binlog flush – Dennis Gao

Replication

  • #101056 – Ack_receiver may lost the slave semi-sync ack due to net timeout – Dennis Gao

Logging

  • #106590 – Fix spell of log messages in mysql_safe – Bin Wang (base fixed version: 5.7.38)

As usual, those contributions cover many different sections of MySQL. If you have patches and you also want to be part of the MySQL Contributors, it’s easy, you can send Pull Requests from MySQL’s GitHub repositories or send your patches on Bugs MySQL (signing the Oracle Contributor Agreement is required).

And don’t forget, that if you want to extend MySQL, the best way is to create a component. Now there is a complete series to learn how to start using the MySQL Component Infrastructure (part 1) !

Thank you again to all our contributors !

August 12, 2022

I run Debian on my laptop (obviously); but occasionally, for $DAYJOB, I have some work to do on Windows. In order to do so, I have had a Windows 10 VM in my libvirt configuration that I can use.

A while ago, Microsoft issued Windows 11. I recently found out that all the components for running Windows 11 inside a libvirt VM are available, and so I set out to upgrade my VM from Windows 10 to Windows 11. This wasn't as easy as I thought, so here's a bit of a writeup of all the things I ran against, and how I fixed them.

Windows 11 has a number of hardware requirements that aren't necessary for Windows 10. There are a number of them, but the most important three are:

  • Secure Boot is required (Windows 10 would still boot on a machine without Secure Boot, although buying hardware without at least support for that hasn't been possible for several years now)
  • A v2.0 TPM module (Windows 10 didn't need any TPM)
  • A modern enough processor.

So let's see about all three.

A modern enough processor

If your processor isn't modern enough to run Windows 11, then you can probably forget about it (unless you want to use qemu JIT compilation -- I dunno, probably not going to work, and also not worth it if it were). If it is, all you need is the "host-passthrough" setting in libvirt, which I've been using for a long time now. Since my laptop is less than two months old, that's not a problem for me.

A TPM 2.0 module

My Windows 10 VM did not have a TPM configured, because it wasn't needed. Luckily, a quick web search told me that enabling that is not hard. All you need to do is:

  • Install the swtpm and swtpm-tools packages
  • Add the TPM module, by adding the following XML snippet to your VM configuration:

    <devices>
      <tpm model='tpm-tis'>
        <backend type='emulator' version='2.0'/>
      </tpm>
    </devices>
    

    Alternatively, if you prefer the graphical interface, click on the "Add hardware" button in the VM properties, choose the TPM, set it to Emulated, model TIS, and set its version to 2.0.

You're done!

Well, with this part, anyway. Read on.

Secure boot

Here is where it gets interesting.

My Windows 10 VM was old enough that it was configured for the older i440fx chipset. This one is limited to PCI and IDE, unlike the more modern q35 chipset (which supports PCIe and SATA, and does not support IDE nor SATA in IDE mode).

There is a UEFI/Secure Boot-capable BIOS for qemu, but it apparently requires the q35 chipset.

Fun fact (which I found out the hard way): Windows stores, somewhere, where its boot partition is. If you change the hard drive controller from an IDE one to a SATA one, you will get a BSOD at startup. In order to fix that, you need a recovery drive. To create the virtual USB disk, go to the VM properties, click "Add hardware", choose "Storage", choose the USB bus, and then under "Advanced options", select the "Removable" option, so it shows up as a USB stick in the VM. Note: this takes a while to do (took about an hour on my system), and your virtual USB drive needs to be 16G or larger (I used the libvirt default of 20G).

There is no possibility, using the buttons in the virt-manager GUI, to convert the machine from i440fx to q35. However, that doesn't mean it's not possible to do so. I found that the easiest way is to use the direct XML editing capabilities in the virt-manager interface; if you edit the XML in an editor it will produce error messages if something doesn't look right and tell you to go and fix it, whereas the virt-manager GUI will actually fix things itself in some cases (and will produce helpful error messages if not).

What I did was:

  • Take backups of everything. No, really. If you fuck up, you'll have to start from scratch. I'm not responsible if you do.
  • Go to the Edit->Preferences option in the VM manager, then on the "General" tab, choose "Enable XML editing"
  • Open the Windows VM properties, and in the "Overview" section, go to the "XML" tab.
  • Change the value of the machine attribute of the domain.os.type element, so that it says pc-q35-7.0.
  • Search for the domain.devices.controller element that has pci in its type attribute and pci-root in its model one, and set the model attribute to pcie-root instead.
  • Find all domain.devices.disk.target elements, setting their dev=hdX to dev=sdX, and bus="ide" to bus="sata" (see the illustrative snippet after this list)
  • Find the USB controller (domain.devices.controller with type="usb") and set its model to qemu-xhci. You may also want to add ports="15" if you didn't have that yet.
  • Perhaps also add a few PCIe root ports:

    <controller type="pci" index="1" model="pcie-root-port"/>
    <controller type="pci" index="2" model="pcie-root-port"/>
    <controller type="pci" index="3" model="pcie-root-port"/>
    

I figured out most of this by starting the process for creating a new VM, on the last page of the wizard that pops up selecting the "Modify configuration before installation" option, going to the "XML" tab on the "Overview" section of the new window that shows up, and then comparing that against what my current VM had.

Also, it took me a while to get this right, so I might have forgotten something. If virt-manager gives you an error when you hit the Apply button, compare notes against the VM that you're in the process of creating, and copy/paste things from there to the old VM to make the errors go away. As long as you don't remove configuration that is critical for things to start, this shouldn't break matters permanently (but hey, use your backups if you do break -- you have backups, right?)

OK, cool, so now we have a Windows VM that is... unable to boot. Remember what I said about Windows storing where the controller is? Yeah, there you go. Boot from the virtual USB disk that you created above, and select the "Fix the boot" option in the menu. That will fix it.

Ha ha, only kidding. Of course it doesn't.

I honestly can't tell you everything that I fiddled with, but I think the bit that eventually fixed it was where I chose "safe mode", which caused the system to do a hickup, a regular reboot, and then suddenly everything was working again. Meh.

Don't throw the virtual USB disk away yet, you'll still need it.

Anyway, once you have it booting again, you will now have a machine that theoretically supports Secure Boot, but you're still running off an MBR partition. I found a procedure on how to convert things from MBR to GPT that was written almost 10 years ago, but surprisingly it still works, except for the bit where the procedure suggests you use diskmgmt.msc (for one thing, that was renamed; and for another, it can't touch the partition table of the system disk either).

The last step in that procedure says to restart your computer!, which is fine, except at this point you obviously need to switch over to the TianoCore firmware, otherwise you're trying to read a UEFI boot configuration on a system that only supports MBR booting, which obviously won't work. In order to do that, you need to add a loader element to the domain.os element of your libvirt configuration:

<loader readonly="yes" type="pflash">/usr/share/OVMF/OVMF_CODE_4M.ms.fd</loader>

When you do this, you'll note that virt-manager automatically adds an nvram element. That's fine, let it.

I figured this out by looking at the documentation for enabling Secure Boot in a VM on the Debian wiki, and using the same trick as for how to switch chipsets that I explained above.

Okay, yay, so now secure boot is enabled, and we can install Windows 11! All good? Well, almost.

I found that once I enabled secure boot, my display reverted to a 1024x768 screen. This turned out to be because I was using older unsigned drivers, and since we're using Secure Boot, that's no longer allowed, which means Windows reverts to the default VGA driver, and that only supports the 1024x768 resolution. Yeah, I know. The solution is to download the virtio-win ISO from one of the links in the virtio-win github project, connect it to the VM, go to Device manager, select the display controller, click the "Update driver" button, tell the system that you have the driver on your computer, browse to the CD-ROM drive, tick the "include subdirectories" option, and then tell Windows to do its thing. While there, it might be good to do the same thing for unrecognized devices in the device manager, if any.

So, all I have to do next is to get used to the completely different user interface of Windows 11. Sigh.

Oh, and to rename the "w10" VM to "w11", or some such. Maybe.

August 10, 2022

Judging by the poster and the trailer, L’année du requin looks like the most traditional of summer comedies, a sort of cross between "Les gendarmes de Saint-Tropez à la pêche au requin" and "Les bronzés au camping 3".

Fortunately, reading the reviews had tipped me off. L’année du requin is not yet another franchouillard comedy in the sub-Splendid style, to the great pleasure or great dismay of the commentators. The gags from the trailer are all strung together in the film’s first minutes. As expected, the gendarme Maja, Marina Foïs, gets a bucket of water and a comic jibe from her colleague Blaise, Jean-Pascal Zadi. Laughter quickly stifled by the cutting reply of a Marina Foïs who lights up the screen as a gendarme worn out by a rather dull career in a town whose specialty is planting your backside in the sand and watching the sea: "It’s no fun getting a bucket of water thrown at you while on duty." Embarrassed smiles from her teammates and from the audience.

Le ton est donnĂ©. Le prĂ©texte comĂ©die n’était qu’un attrape-nigaud. Si le film regorge de pĂ©pites humoristiques, celles-ci se font discrĂštes, sans insistance (comme le coup de la garde-robe de Maja, entraperçue une seconde en arriĂšre-plan). LĂ  n’est pas le propos.

The point? It isn’t in the story either, which is simple not to say simplistic: a shark haunts the coast of the seaside resort of La Pointe and, on the eve of retirement, the maritime gendarme Maja decides to take it on herself.

Pas de comĂ©die dĂ©sopilante ? Pas d’histoire ? Mais quel est l’intĂ©rĂȘt alors ?

Quite simply in the incredible array of humans that the Boukherma brothers’ camera goes looking for. Each character is finely chiselled, the camera lingering over physical flaws, wrinkles, faces that are puffy, tired and aged, but also smiling and full of personality. Unlike the Dardenne brothers, the image does not try to serve an ultra social realism. It is rather about honouring, even heroifying, these ordinary humans. As a counterpoint to these anti-superheroes, the film offers a young mayor, smooth and without character or the slightest sense of decision (Loïc Richard). Parachuted in from Paris, he takes refuge, symbol of this ever-present class struggle, behind an anti-covid visor. Parisians who are both hated by the locals and necessary, because they keep the economy running.

** Commercial break **

Acteur bordelais, LoĂŻc Richard est rĂ©putĂ© pour son travail de la voix. J’ai eu l’occasion de collaborer avec lui lorsqu’il a enregistrĂ© la version audiolivre de mon roman Printeurs, disponible sur toutes les plateformes d’audiobook. Autant il joue Ă  merveille le personnage fade et lisse dans le film, autant il peut prendre des intonations sombres et inquiĂ©tantes dans sa lecture de Printeurs. Je ne pouvais quand mĂȘme pas rater de placer cette anecdote 😉

=> https://voolume.fr/catalogue/sf-et-fantasy/printeurs/

** End of the intermission, please return to your seats **

Dans la premiĂšre partie du film, Maja part Ă  la chasse aux requins et tout se passe, Ă  la grande surprise du spectateur, un peu trop facilement. La gendarme devient, malgrĂ© elle, une hĂ©roĂŻne des rĂ©seaux sociaux. Mais au plus rapide est la montĂ©e, au plus dure est la chute. Au premier incident, qui n’est clairement pas le fait de Maja, elle devient la bĂȘte noire. HarcelĂ©e, elle en vient Ă  paniquer dans une courte, mais puissante scĂšne de rĂȘve. Le propos est clair : le vĂ©ritable requin est l’humain, alimentĂ© par les rĂ©seaux sociaux et par les mĂ©dias, symbolisĂ© par une omniprĂ©sente radio rĂ©actionnaire qui attise les haines sous un vernis pseudohumoristique. Sous des dehors de petits paradis balnĂ©aires, la haine et la rancƓur sont tenaces. Sous la plage, les pavĂ©s. L’éden est amer.

From the dream sequence onward, the film gradually loses any semblance of realism and the humour becomes increasingly rare. The codes are reversed: where the humour was filmed realistically, the action and anguish scenes are delivered through the camera of an absurd comedy, the paradoxical climax being Blaise’s impromptu rodeo and the surreal awakening of a Maja who had nevertheless drowned a few minutes earlier. Everything gives the impression that Maja has continued her dream, that the fight against the shark goes on in her unconscious.

Étrange et dĂ©stabilisant, le film fonctionne entre autres grĂące Ă  un travail trĂšs particulier du cadre et de la couleur. Chaque plan rĂ©sulte d’une recherche qui porte le propos, l’émotion. Lorsqu’elle est sur son ordinateur, Maja est baignĂ©e d’une lumiĂšre froide alors que son mari, Ă  l’arriĂšre-plan, reprĂ©sente la douceur chaleureuse du foyer. « Tu devrais arrĂȘter Twitter », lance-t-il machinalement en partant dans la nature alors qu’elle reste enfermĂ©e devant son smartphone. Lors des confrontations entre les Ă©poux, la camĂ©ra se dĂ©centre souvent, donnant une perspective, un retrait, mais une intensitĂ© aux Ă©changes.

Le titre lui-mĂȘme porte une critique sociale trĂšs actuelle : « L’annĂ©e passĂ©e c’était le covid, cette annĂ©e le requin. Ce sera quoi l’annĂ©e prochaine ? ». Le requin est le pur produit d’un rĂ©chauffement climatique entrainant des catastrophes face auxquelles tant les politiciens, les Ă©cologistes et les rĂ©actionnaires sont impuissants. Chacun ne cherchant finalement qu’à se dĂ©douaner de toute responsabilitĂ©. Comme le dit le maire : « Ça va encore ĂȘtre la faute de la mairie ! ».

Without making a point of it, the film demonstrates the success and the necessity of decades of feminist struggle. The main character is a woman who has devoted herself to her career with the support of a self-effacing and very kind husband (Kad Merad, incredibly human as the pot-bellied husband). Her assistant Eugénie is a woman (Christine Gautier). Yet, once again, no emphasis is placed on the subject. The characters’ sex matters little, the relationships being, at every level, purely based on their character. No seduction, no love story other than a long-standing marriage between Maja and her husband, no sentimentality. All of it with deeply realistic human interactions (in situations that obviously are much less so).

L’annĂ©e du requin n’est certes pas le film de la dĂ©cennie, la faute probablement Ă  un scĂ©nario un peu simplet, il offre nĂ©anmoins une expĂ©rience cinĂ©matographique originale, nouvelle. Les frĂšres Boukherma nous gratifiant d’un nouveau genre : celui de la parodie sĂ©rieuse qui ne se prend pas la tĂȘte. Fourmillant de trouvailles (la radio, la voix off particuliĂšrement originale), le film mĂȘle plaisir, clins d’Ɠil aux cinĂ©philes, critique sociale et cadre original, le tout servi par des acteurs dont les talents sont particuliĂšrement bien exploitĂ©s.

What more could you ask for?

A moral? The film ends precisely on a kind moral, not too trite and perfectly appropriate: "There are two types of heroes. Those who want to save the world and those who want to save the ones they love."

Si l’annĂ©e du requin ne sauve pas ni ne rĂ©volutionne le monde, il saura offrir quelques heures de plaisir Ă  ceux qui cherchent des saveurs nouvelles sans se prendre la tĂȘte et qui aiment ce cynisme un peu grinçant qui ne s’inscrit dans aucune case prĂ©cise. Il m’a clairement donnĂ© envie de dĂ©couvrir Teddy, le premier film de ce jeune tandem de rĂ©alisateurs jumeaux. Et si aprĂšs le loup-garou et le requin, ils dĂ©cident de s’attaquer Ă  la science-fiction, je suis volontaire pour leur pondre un scĂ©nario.

Receive the posts by email or by RSS. Max 2 posts per week, nothing else. Email address never shared and permanently deleted upon unsubscription. Latest published book: Printeurs, a cyberpunk thriller. To support the author, read, give and share books.

This text is published under the CC-By BE license.

August 07, 2022

Si ma dĂ©connexion totale a Ă©tĂ© un Ă©chec, si j’ai repris des connexions intempestives, mon usage de l’ordinateur a cependant Ă©tĂ© profondĂ©ment modifiĂ©. Il est, par dĂ©faut, non connectĂ©. Je suis conscient de chaque connexion. Et je ne regarde mes emails qu’une fois, parfois deux par jour. Ce dernier changement ayant grandement facilitĂ© grĂące Ă  une action que j’ai commencĂ© il y a prĂšs de trois ans : supprimer mes comptes en ligne.

Over the past three years, I have actively deleted more than 600 accounts on various platforms. Every time I receive an email from a platform on which I have an unused account, I go through the steps, sometimes long and tedious, to delete it. Over those three years, many platforms whose very existence I had forgotten resurfaced.

Le travail a Ă©tĂ© de trĂšs longue haleine, mais commence Ă  porter ses fruits et m’enseigne Ă©normĂ©ment sur cette marque de viande en boĂźte transformĂ©e en nom commun par des humoristes anglais dĂ©guisĂ©s en Vikings : le spam.

The different types of spam

J’ai identifiĂ© trois sortes de spams : le random-spam, l’expected-spam et le white-collar-spam.

Random-spam is pure spam in the oldest tradition of the term. Emails sent to millions of people without any logic, to sell you viagra, to convince you to install spyware, to help a Nigerian prince recover millions, or to pay a ransom in bitcoin because you were supposedly filmed pleasuring yourself in front of a porn site. Once your address is public, there is nothing you can do against this type of spam except try to filter it. It is completely illegal. That is in fact its primary characteristic: it is not tied to any obvious legal entity. You cannot complain or unsubscribe. While random-spam was historically a real plague, I am surprised to find that on my most spammed address, an address published everywhere for fifteen years and present in a slew of public databases, I receive on average one random-spam every two or three days (it is automatically detected as such and placed in my spam folder, false negatives being very rare). Half of these spams concern cryptocurrencies. I conclude that on a relatively recent and not very public address, you will receive very few of these spams.

Expected-spam is exactly the opposite: it is spam sent by platforms or services you signed up for of your own free will. Notifications, satisfaction surveys, newsletters or other announcements of new features. The particularity is that you can unsubscribe, even if it is often only very temporary (as with Facebook or Linkedin, which go out of their way to create new newsletters or email categories to remind you of their existence). In the end, it is very simple to get rid of this spam: permanently delete your account on the service. In theory. Because some keep sending you messages that you can no longer unsubscribe from, since you no longer have an account. A threat of a GDPR complaint is usually enough to fix that computer "bug". It is therefore possible to reduce expected-spam to zero (unless it comes from your employer. Companies complain about employees’ lack of productivity, yet pay people to bludgeon them with completely useless internal newsletters; go figure).

Vient ensuite la troisiĂšme catĂ©gorie : le white-collar-spam. Le white-collar-spam est en fait du spam qui se donne des fausses impressions de lĂ©galitĂ©. Ce sont des entreprises qui ont achetĂ© vos donnĂ©es et qui vous contactent comme si vous Ă©tiez inscrits chez eux. Un lien de dĂ©sinscription est gĂ©nĂ©ralement toujours disponible. Mais plutĂŽt que de me dĂ©sinscrire simplement, je contacte chacune des entreprises et demande d’oĂč elles tiennent mes donnĂ©es, les menaçant de poursuite RGPD. J’ai ainsi dĂ©couvert que l’immense majoritĂ© des white-collar-spam proviennent, en francophonie, d’un ou deux fournisseurs. Ces fournisseurs sont absolument peu scrupuleux sur la maniĂšre dont ils collectent les donnĂ©es. Ce n’est pas Ă©tonnant : leur mĂ©tier est littĂ©ralement d’emmerder les utilisateurs d’emails. Leurs clients sont les entreprises, les organisations non gouvernementales et les services publics. Ils classent les emails en catĂ©gories et vendent ces bases de donnĂ©es pour une durĂ©e limitĂ©e. Ce dernier point est important, car un an aprĂšs avoir Ă©tĂ© en contact avec l’un de ces spammeurs-lĂ©gaux-professionnels et avoir clairement fait comprendre que mes donnĂ©es ne pouvaient plus ĂȘtre vendues, j’ai reçu du spam d’un de leur client. Il s’est avĂ©rĂ© que le client, un service public français Ă  vocation culturelle, avait rĂ©utilisĂ© une base de donnĂ©es achetĂ©e deux ans auparavant, ce qui Ă©tait interdit par son contrat.

J’ai donnĂ© le nom « white-collar-spam », car ce spam n’est guĂšre diffĂ©rent du random-spam illĂ©gal si ce n’est qu’il est accompli par des sociĂ©tĂ©s ayant pignon sur rue trĂšs fiĂšres de leur mĂ©tier de spammeur. Au lieu de lutter contre le spam, nous en avons fait une activitĂ© honorable et rĂ©munĂ©ratrice !

Beyond these few professional spam players, a large quantity of white-collar-spam comes indirectly from Linkedin. Indeed, some tools allow marketing professionals (the new name for spammers) to harvest the email addresses, even hidden ones, of their Linkedin contacts. If you have one of these very numerous spammers among your contacts on that network, you are done for. The simplest solution: delete your Linkedin account and leave the spammers among themselves (the primary function of that network). Simply deleting my Linkedin account halved, within a few weeks, the amount of spam I received.

La majoritĂ© du spam que je reçois aujourd’hui est donc ce white-collar-spam qui est plus ou moins lĂ©gal et complĂštement immoral.

Une idĂ©e m’est venue pour le combattre trĂšs simplement : interdire la revente d’une donnĂ©e identifiante sans l’accord de la personne concernĂ©e. Simple comme tout : si une sociĂ©tĂ© souhaite vendre des donnĂ©es, elle doit en demander l’autorisation Ă  chaque transaction. Cette rĂšgle s’appliquerait Ă©galement en cas de rachat d’une sociĂ©tĂ© par une autre ou en cas de transfert d’une entitĂ© juridique Ă  une autre. Il semble en effet Ă©vident que l’on peut partager ses donnĂ©es avec une entitĂ©, mais ne pas vouloir le faire avec une autre. La sociĂ©tĂ© machin sur laquelle vous avez un compte se fait racheter par truc ? Vous devez marquer votre accord sans quoi vos donnĂ©es seront effacĂ©es aprĂšs un dĂ©lai de quelques mois. Simple Ă  implĂ©menter, simple Ă  surveiller, simple Ă  lĂ©gifĂ©rer.

Which means that if we have spam, it is because we want it. Like cigarettes or industrial pollution, spam is one of those nuisances we complain about without really daring to fight, because we are convinced there is a valid reason for it to exist, because we have grown used to it, and because some have become so rich from it that they can influence political and media power. Worse: we even somewhat admire those who make their living this way, and are ready to work for them if a juicy offer comes along.

The unsuspected benefits of deleting accounts

The most radical solution, and one that works wonderfully, remains deleting all your accounts. It is a long-haul process: I discovered more than 600 accounts as I dug through my password manager and through the accounts linked to my Google, Facebook and LinkedIn accounts. Every time I think I have covered them all, completely forgotten accounts resurface in my mailbox when they change their terms of use.

Deleting an account you no longer use is a painful process: managing to log back in, finding the deletion procedure, which is often hidden away and artificially complex (not always). But it is even harder when it is an account you use or think you might reuse. The hardest is when a history exists, a history often adorned with a score: number of friends, points, karma, rewards, badges… After Facebook and Twitter, Reddit and Quora were probably the hardest accounts to delete. I realised that I took an absurd pride in my karma and my scores even though I was never a heavy user of those platforms.

Mention spĂ©ciale tout de mĂȘme Ă  ces sites qui ont prĂ©tendu avoir effacĂ© mes donnĂ©es sans rĂ©ellement le faire. Dans le cas d’une chaine de restaurants de sushi, le gestionnaire s’est contentĂ© de rajouter « deleted_ » devant mon adresse email. Ce fut encore pire pour un grand site immobilier belge. Plus d’un an aprĂšs la suppression totale de mes donnĂ©es, le site s’est soudain mis Ă  m’envoyer journaliĂšrement le rĂ©sultat d’une recherche que j’avais enregistrĂ©e une dĂ©cennie auparavant. Sans possibilitĂ© de dĂ©sactiver, mon compte Ă©tant officiellement supprimĂ©. Il a fallu plusieurs semaines d’échanges par email pour rĂ©soudre le problĂšme et obtenir un semblant d’explication : un trĂšs vieux backup aurait Ă©tĂ© utilisĂ© pour restaurer certaines bases de donnĂ©es. Je vous laisse juge de la crĂ©dibilitĂ© d’une telle raison.

From all my stories, I have learned one general truth: the vast majority of services are in reality incapable of deleting your data, whether through malice or incompetence. Any data entered on a site must be considered permanently compromised and potentially public. While I very often gave the benefit of the doubt, attributing errors or difficulties to incompetence, I was several times confronted with what could only be blatant, shameless lies. A large majority of web services asking for your data are therefore either incompetent or deeply dishonest. Or both. The exception comes from small artisanal services, generally developed by a very small team. In every case of that kind, the deletion was done quickly, cleanly and sometimes with a kind personalised word. Proof that deletion is not a technically insurmountable act.

Contrairement Ă  l’abstinence ou au blocage d’accĂšs Ă  ces sites, la suppression du compte a eu chez moi un impact absolument incroyable. Du jour au lendemain, j’ai arrĂȘtĂ© de penser Ă  ce qui se passait sur ces plateformes. Du jour au lendemain, j’ai arrĂȘtĂ© de penser Ă  ce qui pourrait avoir du succĂšs sur ces plateformes. J’ai arrĂȘtĂ© de penser pour ces plateformes. J’ai arrĂȘtĂ© de me plier Ă  leurs rĂšgles, de travailler inconsciemment pour elles. J’ai arrĂȘtĂ© d’avoir envie de les consulter. Et lorsque me vient l’envie d’y poster ou d’y rĂ©pondre, le fait de devoir recrĂ©er un compte pour l’occasion est assez pour m’arrĂȘter dans mon Ă©lan et me faire remarquer que j’ai mieux Ă  faire. Lorsqu’une plateforme est soudain vraiment nĂ©cessaire, je recrĂ©e un compte, si possible avec une adresse jetable et le supprime aprĂšs emploi. Une fois le rĂ©flexe pris, ce n’est plus tellement contraignant.

Platforms and activism

N’ayant pas supprimĂ© mon compte Mastodon, par simple soutien idĂ©ologique au projet, je me retrouve mĂ©caniquement Ă  explorer cette plateforme. Plateforme elle-mĂȘme complĂštement biaisĂ©e (si je devais la considĂ©rer comme reprĂ©sentative de la France, MĂ©lenchon aurait dĂ» devenir prĂ©sident avec prĂšs de 95% des voix, le reste Ă©tant essentiellement des abstentions).

In activism, there are two schools of thought. The first claims you have to go and reach people where they are. Campaigning for free software on Facebook, for example. The second holds that you must first be faithful to your own values, your own convictions.

Je suis dĂ©sormais convaincu de la seconde approche. Je pense avoir soutenu la premiĂšre approche pendant des annĂ©es entre autres pour justifier ma quĂȘte Ă©gotique sur les rĂ©seaux propriĂ©taires, pour rĂ©soudre mon conflit interne. Car, quelle que soit l’intention derriĂšre un message, son impact sera toujours contrĂŽlĂ© par la plateforme sur laquelle il est postĂ©. Le simple fait d’utiliser une plateforme nous dĂ©forme et nous conforme Ă  ladite plateforme.

Je pense Ă©galement qu’il ne faut pas aller « chercher les gens lĂ  oĂč ils sont ». Ne pas crier pour tenter de couvrir le bruit ambiant. Il faut au contraire construire des espaces de calme, des espaces personnels et faire confiance aux humains pour les trouver lorsqu’ils en ont besoin. Le simple fait d’avoir un compte sur une plateforme justifie pour tous vos contacts le fait de rester sur cette plateforme. Le premier qui quitte la plateforme s’exclut du groupe. Le second force le groupe Ă  se poser des questions. Le troisiĂšme implique que « le groupe » n’est tout simplement plus sur cette plateforme, que celle-ci est devenue inutile dans le cadre du groupe.

No speech convinces as much as setting an example. Doing rather than saying. Being rather than convincing. Living your own choices, your own personality, and respecting those who make different ones while accepting that this may distance us from each other.

Yes, by deleting my accounts I missed social opportunities. But either I was not aware of them, which spared my mental energy, or it had the effect of making those around me realise that they could no longer fully rely on Facebook or Whatsapp. In every case, the cost/benefit ratio turned out to be disproportionately in favour of deletion.

À chaque compte effacĂ©, j’ai eu le sentiment qu’on m’enlevait un poids des Ă©paules. Je me sentais revivre. Certes, je perdais une « audience potentielle », mais j’y gagnais en libertĂ©, en plaisir d’écrire sur mon blog, sur mon gemlog voire sur ma machine Ă  Ă©crire plutĂŽt que de sans cesse rĂ©agir, rĂ©pondre, ĂȘtre en rĂ©action (au sens le plus NitzchĂ©en du terme).

Si j’ai replongĂ© dans la connexion intermittente, un progrĂšs Ă©norme s’est fait : la connexion m’ennuie de plus en plus. Le nombre de plateformes sur lesquelles lire du contenu s’est Ă  ce point restreint que j’en fais trĂšs vite le tour. J’ai Ă©galement compris que mon addiction n’est pas uniquement due Ă  la connexion, elle est Ă©galement technophile. J’aime ĂȘtre sur mon ordinateur, devant mon Ă©cran. Je tente de trouver des excuses pour garder les mains sur le clavier, pour mettre Ă  jour un logiciel, configurer mon environnement, amĂ©liorer mes processus, dĂ©couvrir, coder. Bref, « chipoter ».

La dĂ©couverte de cette composante de mon addiction m’a convaincu de faire entrer ma dĂ©connexion dans une nouvelle phase. Celle de la matĂ©rialitĂ©.

Receive the posts by email or by RSS. Max 2 posts per week, nothing else. Email address never shared and permanently deleted upon unsubscription. Latest published book: Printeurs, a cyberpunk thriller. To support the author, read, give and share books.

This text is published under the CC-By BE license.

August 02, 2022

As always, we as the EU or Europe are at a disadvantage because we have no military answer whatsoever to what is going on geostrategically.

For Ukraine we can do little to nothing, because we have no answer at all to the question 'where will the gas come from, then?'

This question is ridiculously simple, and yet not a single EU politician can answer it.

Yet the answer was within reach: work could have been done on alternative energy sources. But the ladies and gentlemen of EU politics did not consider that necessary. Superfluous. And so on.

In other words, they are utterly incompetent. I am effectively writing them off completely. Because they should have been able to formulate an answer to that. The idiots who are still sitting there cannot. That is why they are losers and why we ought to dismiss them from their positions. Unfortunately they are also populists, and so their dismissal will take many years (see Brexit).

Towards Taiwan, the EU politicians are once again doing their ridiculous best to mean anything at all. But they do not mean anything. They do nothing that matters.

Because they have collectively decided not to have an EU army.

That is why they are unimportant. Insignificant. Incompetent. Unimportant.

August 01, 2022

I wanted to outline the development and deployment workflow I use on dri.es, my personal website.

My site uses Drupal (obviously) and runs on Acquia Cloud (of course), but a lot of this is a best practice for any web application.

I manage my website's code and configuration in Git. Each time I commit a change to my Git repository, I go through the following steps:

  1. I create a staging environment to test my code before deploying it to production. It's a complete staging environment: not just PHP, MySQL and Nginx, but also Varnish, Memcache, etc.
  2. I check out my Git repository. My Git repository hosts my custom files only. It's a best practice not to commit Drupal core or third-party Drupal modules to your Git repository.
  3. I run PHP Code Sniffer to make sure my code conforms to my coding style rules. I specify my coding style rules in phpcs.xml and use phpcs to make sure my code adheres to them. If not, phpcbf tries to fix my code automatically. I like my code tidy.
  4. I run PHPStan, a static code analysis tool for PHP, that scans my code base for bugs. It will find dead code, type casting problems, incorrect function arguments, missing type hints, unknown function calls, and much more. PHPStan is a fantastic tool.
  5. I run PHP Unit, a PHP testing framework, to make sure my unit tests pass.
  6. I run phpcs-security-audit, a static code analysis tool for PHP. It scans my PHP code for security vulnerabilities and security weaknesses.
  7. I run ESLint, a static code analysis tool for JavaScript. It scans my JavaScript code for security vulnerabilities and weaknesses.
  8. I run nodejs-scan to find insecure code patterns in my Node.js applications. I don't use Node.js at the moment though.
  9. I also run Semgrep, a static code analysis tool for a variety of programming languages.
  10. I run Rector to make sure I don't use deprecated Drupal code. When I do, Rector will try to programmatically update any deprecated code that it finds.
  11. As my Git repository only has custom files, I use Composer to download and install the latest version of Drupal and all third-party modules and components.
  12. I run drush pm:security. Drush is a Drupal-specific tool, and the pm:security option verifies that I have no insecure dependencies installed.

This all might sound like a lot of work to set up, and it can be. For Acquia customers and partners, Acquia Code Studio automates all the steps above. Acquia Code Studio is a fully managed CI/CD platform based on GitLab, with specific steps optimized for Drupal. In 20+ years of working on Drupal, it's my best webops workflow yet. It couldn't be easier.

A screenshot of Acquia Code Studio showing the automated tests feature.

Acquia Code Studio also takes care of automating dependency updates. Code Studio regularly checks if Drupal or any of its dependencies have a new release available. If there is a new release, it will run all the steps above. When all of the above tools pass, Acquia Code Studio can deploy new code to production with one click of a button.

A screenshot of Acquia Code Studio showing the automated update feature.

I love it!

July 30, 2022

Many Bluetooth Low Energy (BLE) devices broadcast data using manufacturer-specific data or service data in their advertisements. The data format is usually either defined in a specification or has to be reverse-engineered.

If you want to decode the binary data format into usable chunks of data from various data types in your own Python program, I find the Construct library quite an accessible solution. And Bleak is my favorite BLE library in Python, so first install Bleak and Construct:

pip3 install bleak construct

As an example, let's see how you could decode iBeacon advertisements with Bleak and Construct in Python.

The iBeacon specification

The iBeacon specification, published by Apple, is officially called Proximity Beacon. The idea is to have Bluetooth beacons advertise their presence in order to calculate their approximate distance. You can find the iBeacon specification online.

The specification lists the format of the iBeacon advertising packet. This always consists of two advertising data structures: flags (of length 2) and manufacturer-specific data (of length 26). That’s why an iBeacon advertising packet is always 30 bytes long (1 + 2 + 1 + 26). Here's the structure of the complete packet:

/images/proximity-beacon-advertising-packet.png

We're specifically interested in the second data structure with type 0xff, which signifies that it's manufacturer-specific data. The first two bytes of these manufacturer-specific data are always the company ID. To know which company ID is linked to which company, consult the list of all registered company identifiers. Normally the company ID is the ID of the manufacturer of the device. However, Apple allows other manufacturers to use its company ID for iBeacon devices if they agree to the license.

Note

The company ID is a field of two bytes that are sent as a little-endian value. If you look at an iBeacon packet capture in Wireshark, the bytes on the air are 4c 00. However, Apple's real company ID is 00 4c, or 76 in decimal.
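If you want to double-check that byte order, a quick interactive Python session (not part of the original article) shows how the on-air bytes 4c 00 map to the decimal and hexadecimal company ID:

>>> company_id = int.from_bytes(b"\x4c\x00", byteorder="little")
>>> company_id
76
>>> hex(company_id)
'0x4c'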

If you want to know more about the meaning of the iBeacon packet's fields, consult the document Getting Started with iBeacon published by Apple.

Decoding iBeacon advertisements

Now that you know the format, let's see how to scan for iBeacon advertisements and decode them:

ibeacon_scanner/ibeacon_scanner.py (Source)

"""Scan for iBeacons.

Copyright (c) 2022 Koen Vervloesem

SPDX-License-Identifier: MIT
"""
import asyncio
from uuid import UUID

from construct import Array, Byte, Const, Int8sl, Int16ub, Struct
from construct.core import ConstError

from bleak import BleakScanner
from bleak.backends.device import BLEDevice
from bleak.backends.scanner import AdvertisementData

ibeacon_format = Struct(
    "type_length" / Const(b"\x02\x15"),
    "uuid" / Array(16, Byte),
    "major" / Int16ub,
    "minor" / Int16ub,
    "power" / Int8sl,
)


def device_found(
    device: BLEDevice, advertisement_data: AdvertisementData
):
    """Decode iBeacon."""
    try:
        apple_data = advertisement_data.manufacturer_data[0x004C]
        ibeacon = ibeacon_format.parse(apple_data)
        uuid = UUID(bytes=bytes(ibeacon.uuid))
        print(f"UUID     : {uuid}")
        print(f"Major    : {ibeacon.major}")
        print(f"Minor    : {ibeacon.minor}")
        print(f"TX power : {ibeacon.power} dBm")
        print(f"RSSI     : {device.rssi} dBm")
        print(47 * "-")
    except KeyError:
        # Apple company ID (0x004c) not found
        pass
    except ConstError:
        # No iBeacon (type 0x02 and length 0x15)
        pass


async def main():
    """Scan for devices."""
    scanner = BleakScanner()
    scanner.register_detection_callback(device_found)

    while True:
        await scanner.start()
        await asyncio.sleep(1.0)
        await scanner.stop()


asyncio.run(main())

First it defines a Struct object from the Construct library, and calls it ibeacon_format. A Struct is a collection of ordered and usually named fields. [1] Each field in itself is an instance of a Construct class. This is how you define the data type of bytes in an iBeacon data structure. In this case the fields are:

  • Const(b"\x02\x15"): a constant value of two bytes, because these are always fixed for an iBeacon data structure.

  • Array(16, Byte): an array of 16 bytes that define the UUID.

  • Int16ub for both the major and minor numbers, which are both unsigned big-endian 16-bit integers.

  • Int8sl for the measured power, which is a signed 8-bit integer.

Now when the device_found function receives manufacturer-specific data from Apple, it can easily parse it. It just calls the parse function on the ibeacon_format object, with the bytes of the manufacturer-specific data as its argument. The result is an object of the class construct.lib.containers.Container, with the fields that are defined in the ibeacon_format struct. That's why you can just refer to the fields like ibeacon.major, ibeacon.minor and ibeacon.power.

However, ibeacon.uuid returns a construct.lib.containers.ListContainer object, which is printed as a list of separate numbers. To print it like a UUID, first convert it to bytes and then create a UUID object from these bytes.

Note

This Python program doesn't explicitly check for the company ID and the first two bytes in the manufacturer-specific data. The code just assumes it receives iBeacon data and catches exceptions if this assumption proves false: the KeyError exception happens if there's no 0x004c key in the manufacturer_data dictionary, and the ConstError exception happens if the first two bytes of the data don't equal the constant b"\x02\x15". This common Python coding style is called EAFP (easier to ask for forgiveness than permission), and in many cases it makes the code easier to follow. The other style, testing all conditions beforehand, is called LBYL (look before you leap).
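For comparison, here is a rough sketch (not from the original article) of what an LBYL variant of the same callback could look like, reusing the ibeacon_format struct and the imports defined above:

def device_found_lbyl(
    device: BLEDevice, advertisement_data: AdvertisementData
):
    """Decode iBeacon, LBYL style: test the preconditions before parsing."""
    apple_data = advertisement_data.manufacturer_data.get(0x004C)
    if apple_data is None:
        return  # no Apple manufacturer-specific data in this advertisement
    if not apple_data.startswith(b"\x02\x15"):
        return  # not an iBeacon data structure (wrong type/length bytes)
    ibeacon = ibeacon_format.parse(apple_data)
    print(f"UUID     : {UUID(bytes=bytes(ibeacon.uuid))}")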

If you run this program, it will scan continuously for iBeacons and show their information:

$ python3 ibeacon_scanner.py
UUID     : fda50693-a4e2-4fb1-afcf-c6eb07647825
Major    : 1
Minor    : 2
TX power : -40 dBm
RSSI     : -80 dBm
-----------------------------------------------
UUID     : d1338ace-002d-44af-88d1-e57c12484966
Major    : 1
Minor    : 39904
TX power : -59 dBm
RSSI     : -98 dBm
-----------------------------------------------

This will keep scanning indefinitely. Just press Ctrl+c to stop the program.

[1]

A Struct object in Construct is comparable to a struct in the C programming language.

Long time no Radiohead here, so let's fix that, shall we? Here's Thom Yorke solo in Zermatt (Switzerland) playing songs from Radiohead, his solo records and his new band (The Smile). If anything, this is a testament to what a great songwriter the man is! Also remarkable: he seems so much more at ease on stage now, maybe having accepted the spotlights that sometimes seemed too much for him to cope with.

Source

July 29, 2022

Setup of a HifiBerry AMP2... on a Raspberry Pi 2.

First attempt was with Volumio, as advised by a friend. Well, that works, but I personally find the interface a horror, and I seem to lose control of the Pi since Volumio is a full OS that seems to be accessible only through a web interface. No thanks.

Using Raspberry Pi OS:

- download Raspberry Pi OS lite (command line is fine)

- extract the image

- dd the image to the sd-card

dd if=/home/paul/2022-04-04-raspios-bullseye-armhf-lite.img of=/dev/sdb bs=1M

- mount it to enable ssh

touch /boot/ssh

- I also had to set a password for the pi user, since 'raspberry' was not accepted?

- Boot the Pi (the HifiBerry is still attached)

- ssh into the Pi 

apt update
apt upgrade
apt install vim
vi /boot/config.txt
#dtparam=audio=on
#dtoverlay=vc4-kms-v3d

# added by paul 2022-07-29 for HifiBerry AMP2
dtoverlay=hifiberry-dacplus
force_eeprom_read=0


Comment out the first two lines, add the last two. Check here for other HifiBerries.

Now, before using mplayer or something, LOWER THE VOLUME! Use a really low value, and gradually go up while playing music since the default is extremely loud.

amixer -c 0 sset Digital "20%"

Thanks for listening :)

July 25, 2022

Start our own European army, with each European country contributing its own expertise to the group.

Make agreements with Russia about Europe's energy supply.

Create a new security pact with Russia so that there will be as few conflicts in Europe as possible.

Project power from Europe, with the European army. We must relearn what it means to engage in geostrategy. We must dare to deploy that European army to achieve our strategic goals. We should not be shy about making clear to the world that we have such strategic goals.

End the conflict in Ukraine, because it serves neither us (Europeans) nor the Russians. We are both harmed by this conflict, and we both benefit from ending it.

Dare to talk about Europe and not only about the European Union.

July 23, 2022

Almost 2 decades ago, Planet Debian was created using the "planetplanet" RSS aggregator. A short while later, I created Planet Grep using the same software.

Over the years, the blog aggregator landscape has changed a bit. First of all, planetplanet was abandoned, forked into Planet Venus, and then abandoned again. Second, the world of blogging (aka the "blogosphere") has largely faded, and the more modern world uses things like "Social Networks", etc., making blogs less relevant these days.

A blog aggregator community site is still useful, however, and so I've never taken Planet Grep down, even though over the years the number of blogs carried on Planet Grep has been shrinking. For almost 20 years, I've just run Planet Grep on my personal server, upgrading its Debian release from whichever was the most recent stable release in 2005 to buster, never encountering any problems.

That all changed when I did the upgrade to Debian bullseye, however. Planet Venus is a Python 2 application, which was never updated to Python 3. Since Debian bullseye drops support for much of Python 2, focusing only on Python 3 (in accordance with python upstream's policy on the matter), that means I have had to run Planet Venus from inside a VM for a while now, which works as a short-term solution but not as a long-term one.

Although there are other implementations of blog aggregation software out there, I wanted to stick with something (mostly) similar. Additionally, I have been wanting to add functionality to it to also pull stuff from Social Networks, where possible (and legal, since some of these have... scary Terms Of Use documents).

So, as of today, Planet Grep is no longer powered by Planet Venus, but instead by PtLink. Rather than Python, it was written in Perl (a language with which I am more familiar), and there are plans for me to extend things in ways that have little to do with blog aggregation anymore...

There are a few other Planets out there that also use Planet Venus at this point -- Planet Debian and Planet FSFE are two that I'm currently already aware of, but I'm sure there might be more, too.

At this point, PtLink is not yet on feature parity with Planet Venus -- as shown by the fact that it can't yet build either Planet Debian or Planet FSFE successfully. But I'm not stopping my development here, and hopefully I'll have something that successfully builds both of those soon, too.

As a side note, PtLink is not intended to be bug compatible with Planet Venus. For one example, the configuration for Planet Grep contains an entry for Frederic Descamps, but somehow Planet Venus failed to fetch his feed. With the switch to PtLink, that seems fixed, and now some entries from Frederic seem to appear. I'm not going to be "fixing" that feature... but of course there might be other issues that will appear. If that's the case, let me know.

If you're reading this post through Planet Grep, consider this a public service announcement for the possibility (hopefully a remote one) of minor issues.

What does one do on a free Saturday afternoon? Upgrading Linux machines of course! I thought upgrading my Ubuntu 20.04 LTS laptop to Ubuntu 22.04 LTS would be a routine task I could keep running in the background, but... computer said no.

The first hurdle was starting the upgrade:

$ sudo do-release-upgrade
Checking for a new Ubuntu release
There is no development version of an LTS available.
To upgrade to the latest non-LTS development release
set Prompt=normal in /etc/update-manager/release-upgrades.

I was puzzled: although Ubuntu 22.04 was released three months ago, Ubuntu 20.04 doesn't detect this as a new release. It took some digging around to discover that apparently upgrades from one LTS release to the next one are only available after the first point release. So in this case, Ubuntu 20.04 will not detect a newer version until Ubuntu 22.04.1 is released, which is scheduled for August 4.

Luckily you're still able to upgrade; you just have to ask for the development version:

$ sudo do-release-upgrade -d

Ok, so after this first hurdle I thought that would be the most exciting part of the upgrade, but I was wrong. I'm not sure what the problem was, but after all packages had been downloaded and while the upgrade process was in the middle of applying the package upgrades, the screen turned grey and showed the message "Something has gone wrong. Please logout and try again." I had to restart Ubuntu and it didn't even reach the login screen, just showing me a grey screen.

This all looked familiar. [1] I encountered exactly the same problem two years ago while upgrading Ubuntu 19.10 to Ubuntu 20.04 LTS. So luckily I could take the blog article I wrote then as a guideline for the recovery process. However, the fix was slightly more complex this time. These are my notes on fixing it.

First reboot your computer and start Ubuntu in recovery mode:

  • Hold the Shift key while booting the PC.

  • In the GRUB boot menu that appears, choose the advanced options and then recovery mode.

  • In the recovery menu that appears, enable networking first and then choose the option to open a root shell.

Because the installation had been aborted, I tried fixing a broken install:

# apt --fix-broken install

While last time this fixed the issue, I now encountered an error about the Firefox package. Ubuntu decided to switch Firefox to a snap package, and apparently apt wasn't able to install the new deb package that installs the snap. As a temporary workaround, I removed Firefox and ran the apt command again:

# apt remove firefox
# apt --fix-broken install

This still resulted in the same error, so my next idea was to prevent the upgrade process from installing Firefox. So I created the apt preference file /etc/apt/preferences.d/firefox-no-snap.pref with the following configuration:

Package: firefox*
Pin: release o=Ubuntu*
Pin-Priority: -1

Then I tried to fix the install again:

# apt --fix-broken install

This worked! No complaints about the Firefox package this time.

To be sure I didn't miss any package configuration, I configured all unpacked but not yet configured packages:

# dpkg --configure -a

This returned silently, so no issues there.

Then I continued the upgrade:

# apt upgrade

And this now went smoothly. So after the upgrade succeeded, I had Ubuntu 22.04 on my system:

# lsb_release -a
No LSB modules are available.
Distributor ID:   Ubuntu
Description:  Ubuntu 22.04 LTS
Release:  22.04
Codename: jammy

I rebooted:

# reboot

And then I could login again and was welcomed by the new jellyfish desktop background:

/images/ubuntu-22.04-desktop.png

Then I reinstalled Firefox as a snap: [2]

$ snap install firefox

And finally I had a working Ubuntu laptop again.

Note

If you don't remember exactly what you did in recovery mode to fix your upgrade and you have already rebooted, just run sudo su and then history. It will show you the commands you ran as root.

[1]

This article about fixing an upgrade to Ubuntu 20.04 is by far the most popular one on my blog, and I still get some emails sometimes from people thanking me that I saved them the trouble of finding out how to fix their broken upgrade.

[2]

At this moment I was just too lazy to find out how to install Firefox from the Mozilla team's PPA. I know that there are some issues with it, but I'll just have to see whether the snap works for me.

July 22, 2022

There are not enough movies that really floor me. Everything, Everywhere, All at Once did. Go see it!

Source

July 20, 2022

Acquia recently turned 15. Over the past 15 years, I've heard the name "Acquia" pronounced in all sorts of different ways.

It was Jay Batson, my co-founder, who came up with the name "Acquia". Please blame Jay for the difficult name. ;)

When it came time to pick a name for our company, Jay insisted that the name start with the letter A. I remember questioning the value of that. In a world where people search for things, who looks things up in alphabetical order?, I asked.

In the end, Jay was right. I learned a great many things over the past 15 years, including how common alphabetical listings still are. You'd be amazed how often Acquia ranks number one in such listings. It gave us a small edge.

For more background on how the name Acquia came to be, and some other Acquia trivia, check out Jay's blog post "ah-kwe-eh".

I published the following diary on isc.sans.edu: “Malicious Python Script Behaving Like a Rubber Ducky“:

Last week it was SANSFIRE in Washington, where I presented a SANS@Night talk about malicious Python scripts in Windows environments. I'm still looking for more fresh meat and, yesterday, I found another interesting one.

Do you remember the Rubber Ducky? Pentesters like this kind of gadget. I still have one, as well as others with WiFi capabilities. The idea behind these USB keys is to deliver a payload by simulating a keyboard. When you connect them to a computer, they are detected as an HID ("Human Interface Device"). The payload will be "injected" as if the user pressed all the keys one by one
 [Read more]

The post [SANS ISC] Malicious Python Script Behaving Like a Rubber Ducky appeared first on /dev/random.

Usually, I receive a lot of emails, and sometimes I read them on my phone and then
 I forget about them
 (shame on me).

On my Linux desktop, I used Get Things Gnome for a long time, but due to the declining appeal of the project and the end of the extension for Thunderbird, I found it less and less useful.

I was then looking for a solution that would make my todo list accessible from everywhere and that I could manage myself, not hosted somewhere.

I found a very nice, fast and practical project that was easy to deploy and uses MySQL as its backend: myTinyTodo.

However, I was missing a way to easily create a new task from an email (especially on my phone).

This is why I decided to write a script that would do exactly what I was looking for and integrate it with myTinyTodo.

mail2todo was born !

This script reads the emails from a dedicated IMAP account and populates the MySQL database used by myTinyTodo.

I really like how tags were handled in Get Things Gnome, so I used the same behavior:

This script does exactly what I needed!

The requirements are simple:

The code can be improved, but currently it does what I need and this is why I decided to share it. You can also see how easy it is to work with the mysqlx module, even when handling only SQL queries.
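As a rough illustration of that last point (this is not the actual mail2todo code; the connection URI, schema, table and column names are made up), inserting a task over the X Protocol with the mysqlx module could look something like this:

import mysqlx

# Connection parameters are placeholders; the X Protocol listens on port 33060 by default.
session = mysqlx.get_session("mysqlx://todo_user:secret@127.0.0.1:33060")

# Hypothetical myTinyTodo-style schema and table, just to show plain SQL over mysqlx.
# Real code should of course escape or bind the values instead of inlining them.
session.sql(
    "INSERT INTO mytinytodo.todos (list_id, title, tags) "
    "VALUES (1, 'Reply to that email I read on my phone', '@mail')"
).execute()

session.close()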

Enjoy MySQL, Python, the X Protocol and myTinyTodo!

July 19, 2022

Autoptimize 3.1 was just released with some new features and some fixes/improvements:

  ‱ new: HTML sub-option: “minify inline CSS/JS” (off by default).
  ‱ new: Misc option: permanently allow the “do not run compatibility logic” flag to be removed (which was set for users upgrading from AO 2.9.* to AO 3.0.*, as the assumption was things were working anyway).
  ‱ security: improvements to the critical CSS...

Source

July 17, 2022

XPO Space

Definitely worth a visit if you are interested in spaceflight; there are several fascinating objects at full scale, which makes quite a different impression. The history of spaceflight is presented well, but unfortunately the rest of the XPO only covers the United States, Russia and a bit of ESA. Not a word about Indian, Japanese or Chinese spaceflight of the last ten to twenty years.

There is also a wrong scale listed for a Saturn V (1:144 instead of 1:72), and they give a new meaning to a 'day' (on Earth a day is 24 hours; according to the XPO a day on the Moon is 14 Earth days, but that should be 28, since the night is also an integral part of the day).


Maagdenhuis Antwerpen

The Maagdenhuis is on the one hand a standard museum with paintings (they have Pieter Paul Rubens, Jacob Jordaens and Antoon van Dyck!) and old artefacts; on the other hand, this museum clearly shows how the spirit of the times can change.

The wooden Clara is also here (sadly the restaurant with the same name no longer exists).


Mayer van den Bergh Antwerpen

Mayer van den Bergh is world famous to anyone who has read 'De Dulle Griet' by Suske en Wiske. The painting of the same name by Pieter Breugel has been restored since I last saw it, and yes, it looks wonderful today.

Besides the paintings and the (sometimes very old) artefacts, I find the rooms themselves worth seeing here.


Rockoxhuis Antwerpen

The Snijders&Rockoxhuis, to give it its full name, is fascinating if you like old paintings. You get an iPad here to read information about everything on display (or hanging), and I find that much better than an audio guide because I much prefer reading to listening (or watching).

The famous proverbs painting by Pieter Breugel hangs here. There is now a touchscreen next to this painting that gives away all the proverbs... but perhaps it is more fun to discover a few yourself that are still in use today.


July 15, 2022

A while ago I bought an external webcam with better image quality than the one built into my laptop. However, when I wanted to use it on my Linux system, I faced an unexpected problem. Not all programs or web sites allowed me to choose which webcam to use. And even worse: the ones that didn't give me the choice automatically chose the first available webcam device, /dev/video0, which is of course the internal webcam.

Luckily there's a solution for everything in Linux. I just had to find a way to disable the internal webcam. My idea was that the external webcam would then become the first available webcam device and this would then be chosen automatically.

So I first looked at the product information of all connected USB devices:

$ for device in $(ls /sys/bus/usb/devices/*/product); do echo $device;cat $device;done
/sys/bus/usb/devices/1-1/product
HD Pro Webcam C920
/sys/bus/usb/devices/1-7/product
Chicony USB2.0 Camera
/sys/bus/usb/devices/usb1/product
xHCI Host Controller
/sys/bus/usb/devices/usb2/product
xHCI Host Controller

As you see, the first two devices are webcams. The HD Pro Webcam C920 is the external one, while the Chicony USB2.0 Camera is the internal one. I wanted to disable the latter. The file with the product information for this webcam is /sys/bus/usb/devices/1-7/product, and I needed the code 1-7 in its path. This means that the device is connected on USB bus 1 port 7.

With this information I could send a command to the USB driver to unbind the port:

$ echo '1-7' | sudo tee /sys/bus/usb/drivers/usb/unbind

After this, the internal webcam isn't found anymore by software or web sites. If I connect the external webcam after this command, it gets assigned /dev/video0 as the device file.

Re-enabling the internal webcam is easy too:

$ echo '1-7' | sudo tee /sys/bus/usb/drivers/usb/bind

This is the same command as the previous one, but with bind instead of unbind in the path.

To make this easier to remember, I created a small shell script, webcam.sh:

#!/bin/sh

device="1-7"
status=$1

case $status in
    enable) driver_command="bind";;
    disable) driver_command="unbind";;
    *) exit 1;;
esac

echo $device | sudo tee /sys/bus/usb/drivers/usb/$driver_command

After making it executable with chmod +x webcam.sh, I could just run webcam.sh disable before connecting the external webcam every time I wanted to use it. And after disconnecting the external webcam, I could always re-enable the internal webcam with webcam.sh enable, but I never bothered with it.

I used the script for a while like this, until I realized I could even run this script automatically every time I connected or disconnected the external webcam, thanks to a udev rule.

So I added the following udev rule to /etc/udev/rules.d/99-disable-internal-webcam.rules:

SUBSYSTEM=="usb", ACTION=="add", ENV{DEVTYPE}=="usb_device", ENV{PRODUCT}=="46d/8e5/c", RUN+="/home/koan/webcam.sh disable"
SUBSYSTEM=="usb", ACTION=="remove", ENV{DEVTYPE}=="usb_device", ENV{PRODUCT}=="46d/8e5/c", RUN+="/home/koan/webcam.sh enable"

I found the correct value for ENV{PRODUCT} in the output of udevadm monitor --kernel --property --subsystem-match=usb while connecting or disconnecting the external webcam.

So now I never have to bother with disabling, enabling or choosing a webcam device. If my external webcam isn't connected, all software chooses the internal webcam. As soon as I connect the external webcam, the software chooses this one. And as soon as I disconnect the external webcam, the software chooses the internal webcam again.

July 12, 2022

If you write C applications that need to connect to MySQL, you can use the MySQL C API aka libmysqlclient. The MySQL C API replaces the outdated MySQL-Connector-C.

If you want to use MySQL 8.0 as a Document Store with the X Protocol, then you need to use MySQL Connector/C++ 8.0.

Some have asked how to compile only the MySQL C API.

Compiling only libmysqlclient

As the FAQ stipulates, it's not possible to build only the library. However, as mentioned in the documentation, it's possible to reduce the amount of compiled products with some cmake options.

You still need to get the full source tree (from GitHub for example) and bypass the compilation of the server.

~ $ mkdir workspace
~ $ cd workspace
~workspace $ git clone https://github.com/mysql/mysql-server.git
~workspace $ mkdir mysql-server-bin-debug
~workspace $ cd mysql-server-bin-debug
~workspace/mysql-server-bin-debug $ cmake -DWITHOUT_SERVER=ON \
                                    -DDOWNLOAD_BOOST=1 \
                                    -DWITH_BOOST=../mysql-server/downloads/ \
                                    -DWITH_UNIT_TESTS=OFF ../mysql-server
~workspace/mysql-server-bin-debug $ make -j 8                        

On my laptop, this command took 2m24sec.

Now you have built the client command-line tools and also libmysqlclient:

Conclusion

Following the process outlined in this post, it took me 2min24sec to build the client command-line tools and libmysqlclient. You are required to download the complete source and to build some extra command-line tools, but the process is fast.

Update !

Joro added extra information to this bug including how to bypass boost. I recommend reading those comments especially if you are building this on Windows.

July 11, 2022

Wow! But I can’t seem to find who’s playing that bass; is it all keyboard, even in the beginning where you hear the snares of an upright?

Source

July 07, 2022

Last month I moved from Merelbeke to Ghent. I registered my new address on the government website, and last week I was invited to update my eID with my new address.

I made an appointment with one of the administrative centers of the city. The entire process took less than 5 minutes, and at the end I got a welcome gift: a box with a lot of information about the city services.

It’s been a while since I last did an unboxing video. The audio is in Dutch; maybe, if I’m not too lazy (and only if people ask for it in the comments), I’ll provide subtitles.

Unboxing the Ghent box 🎁

July 06, 2022

Conferences are back! After Botconf in April, it’s Pass-The-Salt that is organized this week in Lille, France. After the two-year break, the formula did not change: same location, free, presentations around security and free software! And, most importantly, the same atmosphere.

The first day started in the afternoon and talks were grouped by topic. The first topic was cryptography, a hard way to start the conference if, like me, you don’t like it. But the talks were interesting anyway! The first one was “Mattermost End-to-End Encryption plugin” by Adrien Guinet & AngĂšle Bossuat. Mattermost is a very popular chat service (like Slack); a free version is available and the community around it creates a lot of plugins. Encryption is in place between the client and the server, but there was a lack of E2EE or “End-to-End Encryption” (so that even administrators of the server can’t see the messages and files exchanged). The plugin was not easy to implement due to Mattermost limitations: notifications, attachments, modification of sent messages, etc. Adrien & AngĂšle explained how they implemented it, how they solved some challenges, and ended with a quick demo. The plugin is available here if interested.

Then, Ludovic Dubost came on stage to present “CryptPad: a zero knowledge collaboration platform”. This project is already six years old and, even if I’ve heard about it, I never used it. CryptPad tries to solve the privacy issues with data (in the cloud, for example). Today, in most cases, it’s all about trust: the situation between the two parties has been widely accepted, and we sign contracts instead. CryptPad helps you to work on many documents and share them with your peers. Its key points are encrypted docs that can be edited in real time, E2EE, and key management with secure and private key sharing. Server owners have access to nothing (they can’t even recover data). Some features:

  • Create application files (docs, presentations, pads, kanban, surveys, code, forms, 

  • Crypt drive
  • Sharing functions
  • Work in teams

A good summary could be “Get people out of Google Docs”. Another good point: it requires few resources. More information on cryptpad.org.

Then, another tool was presented by Yves Rutschle: “Dataflow tabular charts — a presentation tool for security architects”. The tool is called dtc.pl, yes, written in Perl! Its purpose is to create SVG files representing drawings of complex infrastructures or systems, generated from descriptions in a text file. Example:

Human:
-> Human (6)
void / void / void / void / void / Human Content
Reader laptop:
...

The tool is available here.

The second round of sessions focused on operating systems security. MickaĂ«l SalaĂŒn presented “Sandboxing your application with Landlock, illustration with the p7zip case”. Landlock is not new but not known by many people. It’s enabled by default on many systems like Ubuntu 22.04 LTS. It’s a software sandboxing system that allows restricting what applications can access. It must be implemented by developers and is based on only three syscalls! MickaĂ«l explained the system (how to enable it, create rules and populate them) and then applied it to a popular utility, 7zip, to add sandboxing capabilities.

The last presentation of the day was “Building operating systems optimized for containers, from IoT to desktops and servers” by TimothĂ©e Ravier. The idea behind the talk was to explain how operating systems can be improved by using techniques like containers to reduce the attack surface. Indeed, many OSes are deployed today with plenty of tools that, because software has bugs, increase the attack surface for attackers. The idea is to reduce this amount of software to the bare minimum and deploy it with other techniques that make it easier to patch.

The second day started with a series of presentations around networks. Yves Rutschle came back a second time with “sslh — an applicative-level protocol multiplexer“. Yves started this project years ago because he’s a big fan of console mail clients like Mutt and wanted to have access to an SSH server from anywhere. In 2000, when firewalls started to filter outgoing traffic, he developed the first version of sslh. The idea is to let sslh listen on a port (TCP/443) and, depending on the first bytes of the session, redirect the incoming connection to Apache, the SSH server, etc. With time, it expanded, was rewritten in C, runs as a daemon, etc. Today, the tool is integrated into many Linux distributions. It also supports more protocols like OpenVPN and XMPP. The tool is available on Yves’s github repo but also as a package in your favorite Linux distribution.

The next slot was assigned to Eric Leblond who, as usual, came to speak about the Suricata ecosystem: “Write faster Suricata signatures easier with Suricata Language Server“. This time, nothing related to the tool itself; he focused on a recurrent and annoying task: writing Suricata rules. The classic process looks like: write a rule, test it, fine-tune it, test it, 
 Very complex signatures can be hard to write. Because he received the question “How do I write Suricata signatures?” multiple times, he developed a solution based on the Language Server Protocol: “SLS” or “Suricata Language Server”. From your preferred text editor (a lot of them are supported), you get help, auto-completion and verification directly while you write Suricata rules. Really interesting if you write a lot of them!

The next one was “Building on top of Scapy: what could possibly go wrong?” by Claire Vacherot, a pen tester active in the ICS field. She explained their need for specific tools. They created a framework (BOF – “Boiboite Opener Framework”) but faced a lot of problems and tried to find an alternative with Scapy. Scapy was not immediately effective, so they decided to keep both and use BOF as a frontend for Scapy. Best conclusion: don’t just use tools, learn their power and make the most of them!

After the welcome coffee break, we switched to other talks. The next one was “Use of Machine and Deep Learning on RF Signals” by SĂ©bastien Dudek. RF signals are used everywhere and a lot of researchers look at them to assess the many risks related to them (jamming, eavesdropping, replay, injection, 
), but the very first challenge is to get a good idea of the signals. What are we capturing? SĂ©bastien explained briefly how to capture signals, from very expensive devices to “gadgets” combined with free software. His research was to use machine learning and deep learning to help identify the discovered signals. As I don’t have experience in these domains (neither RF nor ML), it was a bit complex to follow, but the idea seems interesting. I just bookmarked this site, which helps to identify RF signal patterns: sigidiki.com.

The keynote was given by Ivan Kwiatkowski: “Ethics in cyberwar times”. It was a great talk and an eye-opener for many of us (well, I hope). The common thread was threat intelligence: how it is generated, by whom, and what companies do with it (and why they buy it!). For Ivan, “TI resellers are intelligence brokers”. Also, can there be neutral, apolitical intelligence? This is a huge debate! Everybody uses TI today: APT groups conduct attacks, TI companies write reports, and APT groups buy these reports to improve their attacks. What could go wrong?

After the lunch break, we started a series about reversing and binary files.

“Abusing archive-based file formats” by Ange Albertini. Once again, Ange covered his techniques and tools to abuse file formats and create “malicious” documents. I recommend having a look at his GitHub repo.

Then, Damien Cauquil presented “Binbloom reloaded”. The first version of this tool was published in 2020 by Guillaume Heilles. Damien stumbled upon an unknown firmware
 His first reflex was to load it into Ghidra, but it was not easy to guess the base address to start the analysis. He tried other tools like basefind.py without luck. He reviewed these tools and explained how they search for base addresses. Finally, he decided to dive into binbloom and improve it. The tool is available here.

Then another tool was presented by Jose E. Marchesi: “GNU poke, the extensible editor for structured binary data“. Honestly, I had never heard of this hex editor. There are plenty of them in the open source ecosystem, but Jose explained their weaknesses. Poke looks extremely powerful for extracting specific information and converting it to update the file, but I was a bit lost in the details. The tool looks complex at first sight!

To wrap up the schedule, Romain Thomas presented something pretty cool: “The Poor Man’s Obfuscator“. The idea of his research was to transform an ELF file (Linux executable) or a Mach-O file (macOS executable) to make it obfuscated and hard to debug using classic debuggers/disassemblers. Of course, the modified binaries have to remain executable from an OS point of view. Several techniques were covered, like creating a lot of exports with random or confusing names. Another technique was to alter the sections of the file. Really interesting techniques that seem very powerful. With one sample, he was able to crash Ghidra!

The day finished with a series of rump sessions and the social event in the center of Lille. Day three started with talks related to blue teams. The first one was “Sudo logs for Blue Teamers” by Peter Czanik. Peter is an advocate for Sudo & Syslog-NG, two useful tools for your Linux distributions. Both are integrated smoothly because Sudo logs are parsed automatically by Syslog-NG. The IO Logs API is a nice feature that many people don’t know: you can interact with the user session, e.g. use a few lines of Python code to terminate the session if some text is detected. Peter covered the new features, like the possibility to execute a chroot with Sudo. Recap of the new versions: more logging capabilities, and tracking and intercepting of sub-commands.

The next presentation focused on another tool: “DFIR-IRIS – collaborative incident response platform”, presented by ThĂ©o Letailleur and Paul Amicelli. As incident handlers, they came with problems to solve: track elements during investigations, share pieces of information between analysts, and handle repetitive tasks. Many solutions existed: TheHive, FIR, Catalyst, DRIFTrack, Aurora. They decided to start their own tool: DFIR-IRIS. It provides the following components:

  ‱ Python web application (API, modules)
  ‱ automation (enrichment, reports)
  ‱ collaboration (notes, sharing, tasks)
  ‱ tracking (IOCs, assets, timeline, evidence)

I was surprised by the number of features and the power of the tool. I’m not using it at this time (I use TheHive), but it deserves to be tested! More information here.

The next speaker was Solal Jabob, another regular PTS speaker! This time, he presented “TAPIR: Trustable Artifact Parser for Incident Response“. Based on what he demonstrated, the amount of work is simply crazy. It’s a complete framework that you can use to collect artefacts and perform triage, export, 
 It is based on a library (TAP) used to parse data from files, disk images, 
, and plugins are then used to extract metadata. The first tool presented is Bin2JSON, then TAP-Query. TAPIR is a client/server solution with a REST API and multi-user capabilities, usable from the command line or a web interface. A Python library is also available (TAPyR).

Just before the lunch break, I presented “Improve your Malware Recipes with Cyberchef“. I explained how powerful this tool can be for performing malware analysis.

After the break, the second half of the day was dedicated to pentesting / red teaming presentations. First, Antoine Cervoise presented “MobSF for penetration testers“. MobSF is a free open-source security scanner for mobile applications. He demonstrated how he can find interesting information to be used in his pentest projects. Hugo Vincent presented “Finding Java deserialization gadgets with CodeQL”. Java deserialization remains a common issue in many applications in 2022; it’s still present in the OWASP Top 10. Hugo explained the principle behind this attack, then demonstrated, with the help of CodeQL (a static code analyzer), how he can find vulnerabilities. Pierre Milioni presented “Dissecting NTLM EPA & building a MitM proxy“. NTLM is an old protocol but still used to authenticate users on web applications. Microsoft expanded it to “EPA”, for “Extended Protection for Authentication”. This makes some tools useless because they don’t support this protocol extension. Pierre developed a MitM proxy called Proxy-Ez that helps to use these tools. Finally, MahĂ© Tardy presented “kdigger: A Context Discovery Tool for Kubernetes Penetration Testing“. kdigger is a tool that helps to perform penetration tests in the context of a Kubernetes environment. The demo was pretty nice!

After two years “online”, it was nice to be back in Lille to meet people in real life. We were approximately 100 attendees, a great number that leaves time to exchange with many people! The talks have been recorded and are already online, slides as well.

The post Pass-The-Salt 2022 Wrap-Up appeared first on /dev/random.

July 02, 2022

A wise person recently told me:

"Release early, release often!"

So here goes... I know a teeny-weeny bit of Python and have recently learned to enjoy FreeCAD. This morning I discovered that FreeCAD can be scripted using Python, so here's my first attempt. Any hints are welcome.


The script creates a (very flat) cube, attaches four smaller flat cubes that serve as tenons, and then cuts out four small cubes that act as mortises. It ends by creating four simple copies of this piece, forming a four-piece puzzle.
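The script itself isn't shown in the post, but as a rough idea of what such a FreeCAD Python script could look like, here is a minimal sketch with made-up dimensions and only one tenon and one mortise instead of four of each:

# Minimal FreeCAD scripting sketch; run it in FreeCAD's Python console or with freecadcmd.
import FreeCAD as App
import Part

doc = App.newDocument("Puzzle")

# A very flat base cube (100 x 100 x 3 mm).
base = Part.makeBox(100, 100, 3)

# A smaller flat cube fused onto the +X edge acts as a tenon...
tenon = Part.makeBox(10, 20, 3, App.Vector(100, 40, 0))
piece = base.fuse(tenon)

# ...and a small cube cut out of the -X edge acts as a mortise.
mortise = Part.makeBox(10, 20, 3, App.Vector(0, 40, 0))
piece = piece.cut(mortise)

Part.show(piece)
doc.recompute()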

The next step is to automate the inclusion of an SVG file on the surface of these pieces.

June 29, 2022

The dam has burst. Under the pressure of the raging waters, I reconnected; I was flooded.

The initial cause was organizing several trips. Nowadays, organizing a trip means spending hours online doing research, finding opportunities, comparing offers and availability, then booking, waiting for emails, confirming reservations. At the final confirmation of a flight, for example, I had the unpleasant surprise of discovering that luggage was not allowed. But it was on the same flight the next day. I had to shift the whole schedule, go back to the accommodation bookings, and so on.

When you spend your day online, flitting between websites, answering an email from time to time, this kind of exercise fits naturally into the day. But when, like me, you time how long you spend online, the energy devoted to organizing a trip is frightening. Besides the time spent exploring possibilities, actively searching and filling in forms, there is also the waiting time for confirmations and the dozens of emails to decipher, most of which are nothing but administrative quibbles or, already, advertisements you have to unsubscribe from.

All of this, of course, has to be synchronized with the other participants of said trips.

Between two account creations and two confirmation emails, waiting for the urgent reply of one of the participants, my brain does not have the capacity to concentrate. It waits. And while it waits, there are conveniently dozens, hundreds, thousands of entertaining little slices of information. The results of a cycling race. The elections in France. Fascinating subjects. Even worrying, in the case of the latter. But a worrying subject is only more fascinating. I watch the rise of the far right with morbid interest, the way one watches a horror film: powerless and unable to tear myself away from the screen.

In my case, travelling was the cause of my reconnection. But it could have been something else. Like the problems I had with my former bank, which now forces the use of an Android application that resells my private data, so that it can close its branches and lay off staff.

What do banks and travel companies have in common? The disappearance of customer service. The disappearance of an essential profession that consisted of listening to the customer and then trying to turn their wishes into administrative acts. Now the customer stands alone before the administrative machine. They have to fill in the forms, know everything, understand everything by themselves. And scold themselves for the slightest mistake, because nobody will check on their behalf.

But while the service no longer exists, the fiction of the service still does. Marketing departments bombard you with emails, paper mail and unwanted phone calls, to make you sign up for or buy yet another product you will never be able to get rid of. The aggression is permanent. Political power is incapable of acting, for several reasons.

The first is that it does not want to act, politicians being the first to want to swamp people with their advertising. Public administrations, staffed with private-sector specialists whose organizational merits were so highly praised, end up
 doing advertising. It is absurd and inexorable. Why do the railways put so much effort into promoting, through laughable advertisements, complicated systems of incomprehensible subscriptions? Couldn't that budget be used to pay railway workers better?

The second point is deeper. Public authorities pride themselves on distinguishing between 'good' marketing and dishonest scams. The problem is that the distinction is purely arbitrary. Both seek to exploit some weakness to extract money.

Why, for example, do you have to explicitly put a sticker on your letterbox to keep it from filling up with shrink-wrapped advertising? The opposite would be more logical: only allow advertising when it has been explicitly requested.

Why is the GDPR so decried when it tries to bring order to the way private data is used? Because it was, deliberately, made incredibly complex. It would suffice to write into law that any personal data may only be used with explicit consent valid for one year, and that this consent is not transferable. That would mean that any resale of data forces the buyer to ask the people concerned for their consent, and to renew that consent every year. Simple and effective.

At its core, the role of public authority is to protect citizens, to enforce that ever-moving boundary between the freedom of the individual and respect for others. But when public authority claims to become profitable and to act as an economic rather than a political actor, its action becomes grotesque.

Like when the state pulls out all the stops to prevent counterfeit cigarettes, trying to argue that counterfeit cigarettes are
 dangerous to your health. Forgetting that 'legal' cigarettes are responsible for more deaths than COVID (a third of them non-smokers), for serious destruction of the environment, and for the emission of more than 1% of the CO2 produced annually.

Several times a week, my phone rings with an attempt to extort money from me by one technique or another. Yet I am on a do-not-call list. For every unexpected call, I file a complaint on the government website and, when possible, with the calling company. This earned me an exchange with an investigator working at the largest Belgian telephone operator. Thanks to him, I understood how the law makes it difficult to fight this type of scam under the pretext of defending 'legal' telemarketing.

It always comes back to the same problem: optimizing the economy means maximizing economic exchanges, whatever they are. Maximizing marketing, however intrusive, absurd and damaging it may be. Exploiting human weaknesses to extract as much money as possible, to generate as much consumption, and therefore pollution, as possible.

Pollution of the environment, pollution of the air, permanent mental pollution are only facets of one and the same cause: the political maximization of economic exchanges. Until it kills us.

We buy plastic bottles filled with morbid sugars to consume while waiting for the umpteenth message that will make our smartphone ring. A message that, most of the time, will push us to consume, or will justify the money we receive every month so that we can keep consuming. Without a message, we are reduced to compulsively refreshing the screen, hoping for a new piece of news, something juicy. Anything. The death of a television host from our childhood, for instance, an excuse to sit through several hours of videos posted on YouTube.

The fact that I partly relapsed shows me how much being connected is a drug. An addiction carefully maintained, a permanent danger for addicts like me.

Every connection is a rush. It is a well-deserved burst of pleasure, an intellectual rest. I can compulsively consume, click without thinking. The mere act of using the mouse, of having multiple tabs open on images or videos, slows the mind down while giving a false sense of control, of power.

The problem has long affected the professional world as well. As Cal Newport recounts in his book 'A World Without Email', most jobs now boil down to answering emails and phone calls, all while attending meetings. The exchange is permanent and has been made much worse by the appearance of workplace messaging tools like Slack.

The professional world no longer has the leisure to think. Decisions are made without perspective and accepted on the mere charisma of a manager. That is no accident. Thinking is dangerous. Thinking calls things into question. Thinking makes you a pariah.

The elections in France gave me an appetite for politics, for debate. So I read 'Son Excellence EugĂšne Rougon' by Zola. On paper. I started thinking again. I found the motivation to take up the fight again. A fight against my addiction. A fight against the whole society that surrounds me. A fight against myself.


When performing a physical backup on systems that are heavily used, it can happen that the backup speed cannot keep up with the redo log generation. This happens when the backup storage is slower than the redo log storage media, and it can lead to inconsistencies in the generated backup.

MySQL Enterprise Backup (aka MEB), and probably Percona XtraBackup, benefit from the possibility of sequentially writing redo log records to an archive file in addition to the redo log files.

This feature was introduced in MySQL 8.0.17.

How to enable it?

To enable this feature, two settings are necessary:

  • set globally a directory where those archiving logs can be stored
  • start the archiving process in a session by calling a dedicated function

The global variable is innodb_redo_log_archive_dirs.

This variable must contain labelled directories where the archived redo logs can be stored. The format is a semicolon-separated string like this:

innodb_redo_log_archive_dirs='label1:/backups1;label2:/backups2'

The system user running mysqld must have access to those directories, and they should not be accessible to all users.

The redo log archiving is started using the function innodb_redo_log_archive_start() and stopped using innodb_redo_log_archive_stop(). Only users with the INNODB_REDO_LOG_ARCHIVE privilege can call those functions.

It’s important to notice that the MySQL session that activates redo log archiving must remain open for the duration of the archiving. You must deactivate redo log archiving in the same session. If the session is terminated before the redo log archiving is explicitly deactivated, the server deactivates redo log archiving implicitly and removes the redo log archive file.

Let’s see how to enable it:

$ sudo mkdir -p /var/lib/mysql-redo-archive/backup1
$ sudo chown mysql. -R /var/lib/mysql-redo-archive
$ sudo chmod -R 700 /var/lib/mysql-redo-archive/

In fact, it's ready to work but not yet enabled; it only really becomes enabled when a session, usually the one initializing the backup, invokes innodb_redo_log_archive_start():
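For illustration only (this is not from the post; the credentials and the 'backup1' label and sub-directory are assumptions matching the directories created above), a minimal Python sketch using the mysqlx module shows the important point that a single session must stay open for the whole duration of the backup:

import mysqlx

# Placeholders: adapt user, password and host; the X Protocol listens on 33060 by default.
session = mysqlx.get_session("mysqlx://root:secret@127.0.0.1:33060")

# The 'backup1' label is assumed to be defined in innodb_redo_log_archive_dirs, e.g.
#   SET PERSIST innodb_redo_log_archive_dirs='backup1:/var/lib/mysql-redo-archive';
session.sql("SELECT innodb_redo_log_archive_start('backup1', 'backup1')").execute()

# ... run the physical backup here, keeping this session open the whole time ...

# Stop archiving from the *same* session; if the session dies first,
# the server deactivates archiving and removes the archive file.
session.sql("SELECT innodb_redo_log_archive_stop()").execute()
session.close()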

Is it enabled?

How can we see that the redo log archiving is active?

We can check if MySQL is using a redo log archive file using the following query:

select * from performance_schema.file_instances
   where event_name like '%::redo_log_archive_file'\G

If there is an entry, this means that the redo log archive process is enabled or has been enabled and stopped successfully using the dedicated function:

So this is not enough to be sure that the redo log archiving is active. But we can also check whether the thread is active using this query:

select thread_id, name, type from threads 
   where name like '%redo_log_archive%';

If a row is returned, it means that the redo log archiving is enabled and active:

Error Messages

Here are some common error messages related to Redo Log Archiving:

ERROR: 3850 (HY000): Redo log archiving failed: Session terminated with active redo log archiving - stopped redo log archiving and deleted the file. This error happens when you try to stop the redo log archiving from another session and the session that started it was terminated.

ERROR: 3851 (HY000): Redo log archiving has not been started by this session. This is when the session that started the process is still open and you try to stop the redo log archiving from another session.

ERROR: 3848 (HY000): Redo log archiving has been started on '/var/lib/mysql-redo-archive/backup2/archive.17f6a975-e2b4-11ec-b714-c8cb9e32df8e.000001.log' - Call innodb_redo_log_archive_stop() first: this happens when you try to start the archiving process and there is already one active.

ERROR: 3842 (HY000): Label 'backup2' not found in server variable 'innodb_redo_log_archive_dirs': this is when you try to start the redo log archiving and you are using a label which is not defined in innodb_redo_log_archive_dirs.

ERROR: 3846 (HY000): Redo log archive directory '/var/lib/mysql-redo-archive/backup2' is accessible to all OS users: this is when the directory is accessible by other users too. Only the user running mysqld should have access to it.

ERROR: 3844 (HY000): Redo log archive directory '/var/lib/mysql-redo-archive/backup3' does not exist or is not a directory: this is a very common error; it happens when the subdirectory does not exist in the directory defined by the corresponding label in innodb_redo_log_archive_dirs. In this example, backup3 was not created in /var/lib/mysql-redo-archive.

ERROR: 3847 (HY000): Cannot create redo log archive file '/var/lib/mysql-redo-archive/backup3/archive.17f6a975-e2b4-11ec-b714-c8cb9e32df8e.000001.log' (OS errno: 13 - Permission denied): this one is simple to understand: the directory and sub-directory exist but don't belong to the user running mysqld (usually mysql).

Callable Functions

There are several functions related to Redo Log Archiving; we already used two of them to start and stop the process. As of MySQL 8.0.29, the functions referenced in this post are:

  • innodb_redo_log_archive_start()
  • innodb_redo_log_archive_stop()
  • innodb_redo_log_archive_flush()
  • innodb_redo_log_sharp_checkpoint()

The last two functions are used by MySQL Enterprise Backup (MEB), are not documented in MySQL Server's manual, and there is no reason to use them as a normal user.

innodb_redo_log_archive_flush is used to flush the redo log archive queue.

innodb_redo_log_sharp_checkpoint makes a checkpoint by calling log_make_latest_checkpoint(*log_sys).

Conclusion

Even if it is not widely used yet, this feature is essential for heavy write workloads when the backup storage doesn't have the same capabilities as the production storage and cannot keep up with the write speed.

When enabled by the DBA, MySQL Enterprise Backup will use it automatically. To know if the redo log archiving process was started and is still active, the DBA can check the performance_schema.threads table.

June 22, 2022

I published the following diary on isc.sans.edu: “Malicious PowerShell Targeting Cryptocurrency Browser Extensions“:

While hunting, I found an interesting PowerShell script. After a quick check, my first conclusion was that it is again a simple info stealer. After reading the code more carefully, the conclusion was different: It targets crypto-currency browser apps or extensions. The script has a very low score on VT: 1/53… [Read more]

The post [SANS ISC] Malicious PowerShell Targeting Cryptocurrency Browser Extensions appeared first on /dev/random.

June 16, 2022

I published the following diary on isc.sans.edu: “Houdini is Back Delivered Through a JavaScript Dropper“:

Houdini is a very old RAT that was discovered years ago. The first mention I found back is from 2013! Houdini is a simple remote access tool written in Visual Basic Script. The script is not very interesting because it is non-obfuscated and has just been adapted to use a new C2 server… [Read more]

The post [SANS ISC] Houdini is Back Delivered Through a JavaScript Dropper appeared first on /dev/random.

June 14, 2022

If you want to receive Bluetooth Low Energy sensor measurements, there's a new project you can use: Theengs Gateway. It uses Theengs Decoder, an efficient, portable and lightweight C++ library for BLE payload decoding, and it publishes the decoded data as MQTT messages. It already supports 40 BLE devices, including RuuviTags, iBeacons, and various Xiaomi devices.

Recently Mihai Ambrosie created a Theengs Gateway add-on for Home Assistant, so you can install it easily. The installation process goes like this:

  • Click on Settings / Add-ons in Home Assistant and then Add-on Store at the bottom right. Click on the three dots at the top right and then Repositories.

  • Enter the url https://github.com/mihsu81/addon-theengsgw and click on Add. Click on Close after the repository has been added.

  • Click on TheengsGateway in the list of add-ons and then Install.

After the installation is complete, open the Configuration tab of the add-on and enter the host and port of your MQTT broker and optionally a username and password. 1 You can also change some parameters such as the base of the MQTT topics, the scan duration and the time between scans, and a filter for devices that you don't want to be discovered by Home Assistant because they're too numerous. 2

/images/theengs-gateway-addon-configuration.png

Click on Save to save the configuration and then click on Start in the Info tab to start the add-on. After this, all BLE devices that Theengs Gateway detects are automatically discovered by Home Assistant, and you can find them in Settings / Devices & Services. Look at the Devices and Entities tabs:

/images/theengs-gateway-addon-ruuvitag.png
1. If you don't have an MQTT broker yet, install the Mosquitto add-on in Home Assistant.

2. By default, iBeacons, Google/Apple Exposure Notifications (GAEN), and Microsoft Advertising Beacons (advertised by Windows devices) are filtered.

June 13, 2022

Hello

I have over 10,000 photos on my website. All these photos are managed by a custom Drupal module. I wrote the first version of that module over 15 years ago, and continue to work on it from time to time. Like this weekend, when I added a new feature.

Digital photos have EXIF data embedded in them. EXIF data includes information such as camera model, lens, aperture, shutter speed, focal length, ISO, and much more.

My module now extracts the EXIF data from my photos and stores it in a database. Having all my EXIF metadata in a database allows me to analyze my photography history.
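
The post doesn't show the table structure, but here is a minimal sketch of what such an images table could look like, based only on the columns used in the queries below (the names and types are assumptions):

CREATE TABLE images (
  id     INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  date   DATETIME,      -- when the photo was taken, from EXIF
  camera VARCHAR(255),  -- camera model, from EXIF
  lens   VARCHAR(255)   -- lens model, from EXIF
);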

For example, over the years, I've owned 11 different cameras and 10 different lenses:

SELECT COUNT(DISTINCT(camera)) AS count FROM images; 
+-------+
| count |
+-------+
|    11 |
+-------+

SELECT COUNT(DISTINCT(lens)) AS count FROM images;  
+-------+
| count |
+-------+
|    10 |
+-------+

Here is a SQL query that shows all cameras I have owned in the last 22 years, and the timeframe I used them for.

SELECT camera, MIN(DATE(date)) AS first, MAX(DATE(date)) AS last, TIMESTAMPDIFF(YEAR, MIN(date), MAX(date)) AS years FROM images GROUP BY camera ORDER BY first; 
+---------------------+------------+------------+-------+
| camera              | first      | last       | years |
+---------------------+------------+------------+-------+
| Sony Cybershot      | 2000-01-01 | 2003-08-01 |     3 |
| Nikon Coolpix 885   | 2001-11-13 | 2004-04-11 |     2 |
| Nikon D70           | 2004-04-03 | 2006-11-19 |     2 |
| Nikon D200          | 2006-12-31 | 2012-06-17 |     5 |
| Panasonic Lumix GF1 | 2011-10-11 | 2014-10-26 |     3 |
| Nikon D4            | 2012-07-01 | 2018-08-26 |     6 |
| Sony Alpha 7 II     | 2015-02-25 | 2019-01-09 |     3 |
| DJI Mavic Pro       | 2017-07-23 | 2019-01-18 |     1 |
| Nikon D850          | 2019-03-16 | 2021-04-24 |     2 |
| Nikon Z 7           | 2019-04-07 | 2021-08-31 |     2 |
| Leica M10-R         | 2021-11-18 | 2022-06-09 |     0 |
+---------------------+------------+------------+-------+

Finally, here is a chart that visualizes my camera history:

[Chart: my cameras and when I used them]
The timeframe I used each camera for. The white numbers on the blue bars represent the number of photos I published on my website.

A few takeaways:

  • I used my Nikon D4 for 6 years and my Nikon D200 for 5 years. On average, I use a camera for 3.3 years (a query sketch follows this list).
  • I should dust off my drone (DJI Mavic Pro), as I haven't used it since early 2019.
  • In 2019, I bought a Nikon D850 and a Nikon Z 7. I liked the Nikon Z 7 better, and didn't use my Nikon D850 much.
  • Since the end of 2021, I've been exclusively using my Leica.
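
As a side note, here is a hedged sketch of how that 3.3-year average could be reproduced from the same images table, assuming the camera still in use (the Leica M10-R) is excluded and day-level precision is used instead of TIMESTAMPDIFF:

SELECT ROUND(AVG(years_used), 1) AS avg_years
FROM (
    SELECT camera, DATEDIFF(MAX(date), MIN(date)) / 365.25 AS years_used
    FROM images
    GROUP BY camera
) AS per_camera
WHERE camera <> 'Leica M10-R';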

I shared this with my family but they weren't impressed. Blank stares ensued, and the conversation took a quick turn. While I could go on and share more statistics, I'll take a hint from my family, and stop here.

June 12, 2022

This week my new book has been published, Develop your own Bluetooth Low Energy Applications for Raspberry Pi, ESP32 and nRF52 with Python, Arduino and Zephyr.

Bluetooth Low Energy (BLE) is one of the most accessible wireless communication standards. You don't need any expensive equipment to develop BLE devices such as wireless sensor boards, proximity beacons, or heart rate monitors. All you need is a computer or a Raspberry Pi, an ESP32 microcontroller board, or a development board with a Nordic Semiconductor nRF5 (or an equivalent BLE SoC from another manufacturer).

On the software side, BLE is similarly accessible. Many development platforms, most of them open source, offer an API (application programming interface) to assist you in developing your own BLE applications. This book shows you the ropes of Bluetooth Low Energy programming with Python and the Bleak library on a Raspberry Pi or PC, with C++ and NimBLE-Arduino on Espressif's ESP32 development boards, and with C on one of the development boards supported by the Zephyr real-time operating system, such as Nordic Semiconductor's nRF52 boards.

While Bluetooth Low Energy is a complex technology with a comprehensive specification, getting started with the basics is relatively easy. This book takes a practical approach to BLE programming to make the technology even more approachable. With a minimal amount of theory, you'll develop code right from the start. After you've completed this book, you'll know enough to create your own BLE applications.

What is Bluetooth Low Energy?

Bluetooth is a wireless communication standard in the 2.4 GHz Industrial, Scientific, and Medical (ISM) frequency band. These days, if you hear about Bluetooth support in a product, this almost always is Bluetooth Low Energy (BLE). It's a radical departure from the original Bluetooth standard, which is now called Classic Bluetooth.

Bluetooth Low Energy and Classic Bluetooth are actually different protocols. Classic Bluetooth is essentially a wireless version of the traditional serial connection. If you want to print a document, transfer a file or stream audio, you want this to happen as fast as possible. Therefore, the focus of development in Classic Bluetooth was on attaining faster and faster speeds with every new version.

However, Classic Bluetooth wasn't a good fit for devices with low power consumption, for instance those powered by batteries. That's why Nokia adapted the Bluetooth standard to enable it to work in low-power scenarios. In 2006, they released their resulting technology onto the market, dubbed Wibree.

The Bluetooth Special Interest Group (SIG), the organization that maintains the Bluetooth specifications, showed interest in this new development. After consulting with Nokia, they decided to adopt Wibree as part of Bluetooth 4.0, with a new name, Bluetooth Low Energy. Classic Bluetooth remained available for high-throughput applications.

Note

In practice, many chipsets support both Classic Bluetooth and Low Energy, especially in laptops and smartphones.

Layered architecture

The Bluetooth Core Specification is more than 3200 pages long. And this is only the core specification; there are many supplemental documents for BLE. However, BLE has a layered architecture. Many end-user applications only use the upper layers, so you don't need to know the details of the architecture's lower layers.

/images/ble-stack.png

The BLE architecture consists of three main blocks: controller, host, and application.

Controller

This has the lower-level layers: the Physical Layer (PHY), Link Layer (LL) and Direct Test Mode (DTM). These are the layers where the Bluetooth radio does its work. The controller communicates with the outside world using the antenna, in a frequency band around 2.4 GHz. It communicates with the host using a standardized interface between the two blocks: the Host Controller Interface (HCI). 1

Host

This is the block with which the end user or application developer comes in contact. The Logical Link Control and Adaptation Protocol (L2CAP) defines channels and signaling commands. On top of it, the Security Manager Protocol (SMP) handles secure connections (with authentication and encryption), and the Attribute Protocol (ATT) defines how to expose and access data as attributes. The Generic Attribute Profile (GATT) 2 builds on the Attribute Protocol to define how to discover services and their characteristics and how to read and write their values. The upper layer of the Host block is the Generic Access Profile (GAP), which defines how devices can discover other devices and connect, pair, and bond to them. The host communicates with the controller using its part of the host controller interface, and applications communicate with the host depending on the APIs exposed by the operating system.

Application

This layer builds on top of the Generic Attribute Profile to implement application-specific characteristics, services, and profiles. A characteristic defines a specific type of data, such as an Alert Level. A service defines a set of characteristics and their behaviors, such as the Link Loss Service. A profile is a specification that describes how two or more devices with one or more services communicate with each other. An example is the Proximity profile, which has two roles: Proximity Monitor and Proximity Reporter.

The three blocks don't have to run on the same processor. In fact, there are three common configurations --- one single-chip and two dual-chip:

Single-chip (SoC)

Controller, host and application code run on the same chip. The host and controller communicate through function calls and queues in the chip's RAM. Most simple devices such as BLE sensors use this configuration; it keeps the cost down. Some smartphones also use this configuration if they have a SoC with Bluetooth built in.

Dual-chip over HCI

A dual-chip solution with application and host on one chip, and the controller on another chip, communicates over HCI. Because HCI is a standardized interface, it lets you combine different platforms. For instance, on a Raspberry Pi, the Wi-Fi and BLE chip implements a BLE controller. If you connect a BLE dongle to an older Raspberry Pi, this dongle also implements a BLE controller. 3 BlueZ, the Raspberry Pi Linux kernel's Bluetooth stack, implements a BLE host. So BlueZ communicates with the BLE controller in the built-in BLE chip or the BLE dongle. In the former case, the HCI uses SDIO, and in the latter, UART over USB. 4 Many smartphones and tablets also use the dual-chip over HCI configuration, with a powerful processor running the host and a Bluetooth chip running the controller.

Dual-chip with connectivity device

Another dual-chip solution is one with the application running on one chip and the host and controller on another chip. The latter is then called the connectivity device because it adds BLE connectivity to the other device. This approach is useful if you have an existing hardware device that you want to extend with BLE connectivity. Because there's no standardized interface in this case, the communication between the application processor and the connectivity device needs to make use of a proprietary protocol implemented by the connectivity device.

A three-chip solution with controller, host, and application each running on its own chip is also possible. However, because of the associated cost, this is typically only done for development systems.

How to communicate with BLE devices?

Bluetooth Low Energy has two ways to communicate between devices: with and without a connection.

Without a connection

Without a connection means that the device just broadcasts information in an advertisement. Every BLE device in the neighborhood is able to receive this information.

/images/ble-broadcaster-observers.png

Some examples of BLE devices broadcasting data are:

Proximity beacons

These devices, often following Apple's iBeacon standard, broadcast their ID. Receivers calculate their approximate distance to the beacons based on the advertisement's Received Signal Strength Indicator (RSSI).

Sensors

Many temperature and humidity sensors broadcast their sensor values. Most devices do this in an unencrypted fashion, but some of them encrypt the data to prevent it being read by every device in the neighborhood.

Mobile phones

After the COVID-19 pandemic started in 2020, Google and Apple collaborated on the Exposure Notifications standard for contact tracing. As part of this technology, Android phones and iPhones broadcast unique (but anonymous) numbers. Other phones can pick up these numbers and use them later to warn users that they have been in contact with someone who is known to have had COVID-19.

With a connection

The other way to communicate between BLE devices is with a connection. One device (the client) scans for BLE advertisements to find the device it wants to connect to. Then, optionally, it may do an active scan to ask the device (the server) which services are offered.

After the client connects to the server, the client can use the server's services. Each BLE service is a container of specific data from the server. You can read this data, or (with some services) write a value to the server.

/images/ble-peripheral-central.png

Some examples of BLE devices using a connection are:

Fitness trackers

Your smartphone can connect to a fitness tracker and read your heart rate, the tracker's battery level, and other measurements.

Sensors

Some environmental sensors let you read their sensor values over a BLE connection.

Proximity reporters

These devices sound an alert when their connection to another device is lost.

Advantages of BLE

Low power consumption

As its name implies, Bluetooth Low Energy is optimized for low-power applications. Its whole architecture is designed to reduce power consumption. For instance, setting up a connection, reading or writing data, and disconnecting happens in a couple of milliseconds. The radio is often the most energy-consuming part of a device. Therefore, the idea is to turn on the Bluetooth radio, create a connection, read or write data, disconnect, and turn off the radio again until the next time the device has to communicate.

This way, a well-designed BLE temperature sensor is able to work on a coin cell for ten years or more. You can use the same approach with other wireless technologies, such as Wi-Fi, but they require more power and more time to set up a connection.

Ubiquitous

BLE radio chips are ubiquitous. You can find them in smartphones, tablets, and laptops. This means that all those devices can talk to your BLE sensors or lightbulbs. Most manufacturers create mobile apps to control their BLE devices.

You can also find BLE radios in many single-board computers, such as the Raspberry Pi, and in popular microcontroller platforms such as the ESP32. 5 This makes it quite easy for you to create your own gateways for BLE devices. And, platforms such as the Nordic Semiconductor nRF5 series of microcontrollers with BLE radio even make it possible to create your own battery-powered BLE devices.

Low cost

There's no cost to access the official BLE specifications. Moreover, BLE chips are cheap, and the available development boards (based on an nRF5 or ESP32) and Raspberry Pis are quite affordable. This means you can just start with BLE programming at minimal cost.

Disadvantages of BLE

Short range

BLE has a short range (for most devices, less than 10 meters) compared to other wireless networks, such as Zigbee, Z-Wave, and Thread. It's not a coincidence that these competitors all have a mesh architecture, in which devices can forward their neighbors' messages in order to improve range. Low-power wide area networks (LPWANs), such as LoRaWAN, Sigfox, and NB-IoT, have even longer ranges.

In 2017, the Bluetooth SIG added Bluetooth Mesh, a mesh protocol. This builds upon BLE's physical and link layers with a whole new stack above them. However, Bluetooth Mesh isn't as well-established as the core BLE protocol, at least not for home use.

Limited speed

The BLE radio has a limited transmission speed. For Bluetooth 4.2 and earlier, this is 1 Mbps, while for Bluetooth 5 and later, this can be up to 2 Mbps. This makes BLE unsuitable for high-bandwidth applications.

You need a gateway

Wi-Fi devices have their own IP addresses, so you can communicate with them directly from other IP-based devices, and they're integrated in your LAN (local area network). Bluetooth doesn't have this: to integrate your BLE devices with other network devices, you need a gateway. This device has to translate Bluetooth packets to IP-based protocols such as MQTT (Message Queuing Telemetry Transport). That's why many BLE device manufacturers have smartphone apps that function as device gateways. 6

Platforms used in this book

This book focuses on Bluetooth Low Energy programming on three platforms:

BLE platforms used in this book

Programming language   Library          Software platform       Hardware platform
Python                 Bleak            Windows, Linux, macOS   Raspberry Pi or PC
C++                    NimBLE-Arduino   Arduino framework       ESP32
C                      / 7              Zephyr                  nRF52

These choices were made in order to demonstrate a wide range of applications compatible with many software and hardware platforms.

Python/Bleak (Raspberry Pi, PC)

Python is an easy-to-use programming language that works on all major operating systems. There are a lot of Python Bluetooth Low Energy libraries, but many of them support only a single operating system. Bleak, which stands for Bluetooth Low Energy platform Agnostic Klient, is a welcome exception. It supports:

  • Windows 10, version 16299 (Fall Creators Update) or higher

  • Linux distributions with BlueZ 5.43 or higher (also on a Raspberry Pi)

  • OS X 10.11 (El Capitan) or macOS 10.12+

/images/rpi4.jpg

Bleak is a GATT client: it's able to connect to BLE devices that act as GATT servers. It supports reading, writing, and getting notifications from GATT servers, and it's also able to discover BLE devices and read advertising data broadcast by them.

Bleak doesn't implement a GATT server. In practice this isn't a big limitation. GATT servers are typically implemented on constrained devices, so, for this purpose, the ESP32 and nRF52 hardware platforms are a better match. 8

C++/NimBLE-Arduino (ESP32)

If you're looking at microcontrollers, the Arduino framework has become quite popular, not only for the original Arduino boards, which didn't have BLE functionality, but also on ESP32 development boards, which do.

/images/esp32-pico-kit-v4.1.jpg

Programming for the Arduino framework is done in a variant of C++, but the framework and many Arduino libraries hide much of C++'s complexity. Even if you only know some C (which is much less complex than C++), you'll be able to use the Arduino framework.

One of the more popular BLE libraries for Arduino on the ESP32 is NimBLE-Arduino. It's a fork of NimBLE, which is part of the Apache Mynewt real-time operating system. With NimBLE-Arduino, you can easily create your own GATT server or client.

C/Zephyr (nRF52)

For even more constrained devices, typically battery-powered, you need a specialized real-time operating system (RTOS). This book uses the Zephyr Project on nRF52840-based devices from Nordic Semiconductor. Zephyr has a completely open-source Bluetooth Low Energy stack.

/images/nrf52840-dongle.png

Zephyr's BLE stack is highly configurable. You can build Zephyr firmware for three configuration types:

Combined build

Builds the BLE controller, BLE host, and your application for a one-chip configuration.

Host build

Builds the BLE host and your application, along with an HCI driver to let your device communicate with an external BLE controller on another chip. 9

Controller build

Builds the BLE controller with an HCI driver to let your device communicate with an external BLE host on another chip.

With some basic knowledge of C, you can create your own BLE devices with Zephyr, such as BLE beacons, sensor boards, and proximity reporters. Zephyr has extensive documentation of its Bluetooth API, as well as a lot of ready-to-use examples that you can build upon.

June 11, 2022

  • Receive an ODT file (OpenDocument Text Document).
  • Everyone: opens the file with either LibreOffice or even Microsoft Office nowadays, apparently.
  • Me: uses Pandoc and LaTeX to convert the file to PDF and read it in Evince because I don’t have LibreOffice installed and I’m too lazy to upload the document to Google Docs.

I needed to review an addendum to a rental contract. (I moved! I’ll write about that later.) The addendum was sent to me in ODT format. At the time, my desktop pc was still packed in a box. On my laptop (a 2011 MacBook Air with Ubuntu 20.04) I only have the most essential software installed, which for me doesn’t include an office suite. I could install LibreOffice, but why make it easy if I can also do it the hard way? 😀

I do have Evince installed, which is a lightweight PDF viewer. To convert ODT to PDF I’m using Pandoc, which is a Swiss army knife for converting document formats. For PDF it needs the help of LaTeX, a document preparation system for typesetting.

First I installed the required software:

$ sudo apt install pandoc texlive texlive-latex-extra
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  libapache-pom-java libcommons-logging-java libcommons-parent-java libfontbox-java libpdfbox-java preview-latex-style texlive-base texlive-binaries
  texlive-fonts-recommended texlive-latex-base texlive-latex-recommended texlive-pictures texlive-plain-generic tipa
Suggested packages:
  libavalon-framework-java libcommons-logging-java-doc libexcalibur-logkit-java liblog4j1.2-java texlive-xetex texlive-luatex pandoc-citeproc
  context wkhtmltopdf librsvg2-bin groff ghc php python r-base-core libjs-mathjax node-katex perl-tk xzdec texlive-fonts-recommended-doc
  texlive-latex-base-doc python3-pygments icc-profiles libfile-which-perl libspreadsheet-parseexcel-perl texlive-latex-extra-doc
  texlive-latex-recommended-doc texlive-pstricks dot2tex prerex ruby-tcltk | libtcltk-ruby texlive-pictures-doc vprerex
The following NEW packages will be installed:
  libapache-pom-java libcommons-logging-java libcommons-parent-java libfontbox-java libpdfbox-java pandoc preview-latex-style texlive texlive-base
  texlive-binaries texlive-fonts-recommended texlive-latex-base texlive-latex-extra texlive-latex-recommended texlive-pictures texlive-plain-generic
  tipa
0 upgraded, 17 newly installed, 0 to remove and 1 not upgraded.
Need to get 116 MB of archives.
After this operation, 448 MB of additional disk space will be used.
Do you want to continue? [Y/n]

Just to compare, installing LibreOffice Writer would actually use less disk space. Pandoc is a lot faster though.

$ sudo apt install libreoffice-writer
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  libabw-0.1-1 libboost-date-time1.71.0 libboost-filesystem1.71.0 libboost-iostreams1.71.0 libboost-locale1.71.0 libclucene-contribs1v5
  libclucene-core1v5 libcmis-0.5-5v5 libe-book-0.1-1 libeot0 libepubgen-0.1-1 libetonyek-0.1-1 libexttextcat-2.0-0 libexttextcat-data libgpgmepp6
  libjuh-java libjurt-java liblangtag-common liblangtag1 libmhash2 libmwaw-0.3-3 libmythes-1.2-0 libneon27-gnutls libodfgen-0.1-1 liborcus-0.15-0
  libraptor2-0 librasqal3 librdf0 libreoffice-base-core libreoffice-common libreoffice-core libreoffice-math libreoffice-style-colibre
  libreoffice-style-tango librevenge-0.0-0 libridl-java libuno-cppu3 libuno-cppuhelpergcc3-3 libuno-purpenvhelpergcc3-3 libuno-sal3
  libuno-salhelpergcc3-3 libunoloader-java libwpd-0.10-10 libwpg-0.3-3 libwps-0.4-4 libxmlsec1 libxmlsec1-nss libyajl2 python3-uno uno-libs-private
  ure
Suggested packages:
  raptor2-utils rasqal-utils librdf-storage-postgresql librdf-storage-mysql librdf-storage-sqlite librdf-storage-virtuoso redland-utils
  libreoffice-base gstreamer1.0-plugins-bad tango-icon-theme fonts-crosextra-caladea fonts-crosextra-carlito libreoffice-java-common
The following NEW packages will be installed:
  libabw-0.1-1 libboost-date-time1.71.0 libboost-filesystem1.71.0 libboost-iostreams1.71.0 libboost-locale1.71.0 libclucene-contribs1v5
  libclucene-core1v5 libcmis-0.5-5v5 libe-book-0.1-1 libeot0 libepubgen-0.1-1 libetonyek-0.1-1 libexttextcat-2.0-0 libexttextcat-data libgpgmepp6
  libjuh-java libjurt-java liblangtag-common liblangtag1 libmhash2 libmwaw-0.3-3 libmythes-1.2-0 libneon27-gnutls libodfgen-0.1-1 liborcus-0.15-0
  libraptor2-0 librasqal3 librdf0 libreoffice-base-core libreoffice-common libreoffice-core libreoffice-math libreoffice-style-colibre
  libreoffice-style-tango libreoffice-writer librevenge-0.0-0 libridl-java libuno-cppu3 libuno-cppuhelpergcc3-3 libuno-purpenvhelpergcc3-3
  libuno-sal3 libuno-salhelpergcc3-3 libunoloader-java libwpd-0.10-10 libwpg-0.3-3 libwps-0.4-4 libxmlsec1 libxmlsec1-nss libyajl2 python3-uno
  uno-libs-private ure
0 upgraded, 52 newly installed, 0 to remove and 1 not upgraded.
Need to get 78,5 MB of archives.
After this operation, 283 MB of additional disk space will be used.
Do you want to continue? [Y/n] n
Abort.

Next, converting the file. It’s possible to tell Pandoc which file formats to use with the -f (from) and -t (to) switches, but it can usually guess correctly based on the file extensions.

$ time pandoc 2022-06-house-contract-adendum.odt -o 2022-06-house-contract-adendum.pdf

real	0m0,519s
user	0m0,475s
sys	0m0,059s

It took only half a second to convert the file. Opening LibreOffice takes a bit more time on this old laptop.

You can see the PDF document properties with pdfinfo:

$ pdfinfo 2022-06-house-contract-adendum.pdf 
Title:          
Subject:        
Keywords:       
Author:         
Creator:        LaTeX with hyperref
Producer:       pdfTeX-1.40.20
CreationDate:   Sat Jun 11 23:32:30 2022 CEST
ModDate:        Sat Jun 11 23:32:30 2022 CEST
Tagged:         no
UserProperties: no
Suspects:       no
Form:           none
JavaScript:     no
Pages:          2
Encrypted:      no
Page size:      612 x 792 pts (letter)
Page rot:       0
File size:      64904 bytes
Optimized:      no
PDF version:    1.5

I don’t want it in letter format, I want A4:

$ time pandoc -V papersize:a4 -o 2022-06-house-contract-adendum.pdf 2022-06-house-contract-adendum.odt

real	0m0,520s
user	0m0,469s
sys	0m0,060s
$ pdfinfo 2022-06-house-contract-adendum.pdf 
Title:          
Subject:        
Keywords:       
Author:         
Creator:        LaTeX with hyperref
Producer:       pdfTeX-1.40.20
CreationDate:   Sat Jun 11 23:40:16 2022 CEST
ModDate:        Sat Jun 11 23:40:16 2022 CEST
Tagged:         no
UserProperties: no
Suspects:       no
Form:           none
JavaScript:     no
Pages:          2
Encrypted:      no
Page size:      595.276 x 841.89 pts (A4)
Page rot:       0
File size:      64935 bytes
Optimized:      no
PDF version:    1.5

Then I could open the file with evince 2022-06-house-contract-adendum.pdf.

And yes, I know that addendum is with double d. 🙂

June 07, 2022

Lyte was just listed as a top video player on WP Glob...

Source

June 03, 2022

I published the following diary on isc.sans.edu: “Sandbox Evasion… With Just a Filename!“:

Today, many sandbox solutions are available and deployed by most organizations to detonate malicious files and analyze their behavior. The main problem with some sandboxes is the filename used to submit the sample. The file can be named like “sample.exe”, “suspicious.exe”, “<SHA256>.tmp” or “malware.tmp”… [Read more]

The post [SANS ISC] Sandbox Evasion… With Just a Filename! appeared first on /dev/random.

June 01, 2022

When watching Eurosong a couple of weeks ago -while having a tapas-like dinner for hours and having great fun- one song stood out for me but I did not really enjoy the arrangements. But then a couple of days ago I heard “Saudade, saudade” (by Maro) in a complete live version and I was almost crying. So Eurosong not only is good for an evening of kitsch, fun and food, now and again it also lets the...

Source

May 31, 2022

We spent a week on a power catamaran exploring the British Virgin Islands (BVI). The purpose of the trip was twofold: to experience the BVI, and to come home with a boating license.

The BVI are somewhat hard to get to. It's part of what makes them so stunningly unspoiled. Getting to the BVI required flying from Boston to San Juan, and then from San Juan to Tortola. The flight to Tortola involved a very small plane, which was an experience in itself.

[Photo: an airline worker stuffing luggage in the wing of a small plane]
The plane to Tortola was so small that the luggage had to go in the wings. The pilot also sized each passenger up and decided where we would sit to evenly distribute the weight.

On the first day we met our captain (and instructor) and charted a course to our first destination: Oil Nut Bay on the island of Virgin Gorda. It's where we got introduced to the BVI's signature drink: the "painkiller".

[Photo: four different rum cocktails lined up next to each other]
Painkillers and rum punches, the most popular drinks in the BVI. Every bar and restaurant serves them.

[Photo: Vanessa watching the sunset from an infinity pool while having a cocktail]
Vanessa enjoying a cocktail from the pool at Nova on Oil Nut Bay. Our boat is in the background, tied to a mooring buoy.

We spent the next six days island hopping around the BVI. Each day, we'd arrive at a different near-deserted Caribbean island. We'd tie our boat to a mooring ball, and jump off the boat to swim or paddle board.

[Photo: a woman standing on a paddle board, surrounded by boats on mooring balls]
Taking a stand-up paddle board out in a bay.

After our swim, we'd take our dinghy to shore to explore the island of the day. On shore, there is little to nothing except maybe a few beach bars and restaurants scattered around. In the evening, we'd either cook dinner on the boat, or enjoy the rare restaurant or bar on the island.

[Photo: dinghies tied up to a dock]
Taking our dinghy to a restaurant on shore.

Each island has perfect soft sand and pristine turquoise water. But each island also has something unique to offer: Cooper Island had a rum bar with a selection of 300 rums; Jost Van Dyke had an amazing beach bar (the famous Soggy Dollar, which invented the Painkiller); Trellis Bay has a great sushi restaurant; and The Baths are a unique national park and a must-see attraction.

[Photo: a hand on top of a nautical chart, next to a course plotter]
Navigation planning using a nautical chart, course plotter, and brass dividers.

Every day, we had classroom-style learning and practiced driving the boat. This included navigation planning, docking exercises, man-overboard maneuvers, anchoring, and more. At the end of the week, I took the theoretical and practical exams, and passed! Next time, I'll be able to charter my own boat.

May 21, 2022

At work, as with many other companies, we're actively investing in new platforms, including container platforms and public cloud. We use Kubernetes-based container platforms both on-premise and in the cloud, but we are also very adamant that the container platforms should only be used for application workloads that are correctly designed for cloud-native deployments: we do not want to see vendors packaging full operating systems in a container and then shouting that they are now container-ready.

May 20, 2022

I have a new laptop. The new one is a Dell Latitude 5521, whereas the old one was a Dell Latitude 5590.

As both the old and the new laptops are owned by the people who pay my paycheck, I'm supposed to copy all my data off the old laptop and then return it to the IT department.

A simple way of doing this (and what I'd usually use) is to just rsync the home directory (and other relevant locations) to the new machine. However, for various reasons I didn't want to do that this time around; for one, my home directory on the old laptop is a bit of a mess, and a new laptop is an ideal moment in time to clean that up. If I were to just rsync over the new home directory, then, well.

So instead, I'm creating a tar ball. The first attempt was quite slow:

tar cvpzf wouter@new-laptop:old-laptop.tar.gz /home /var /etc

The problem here is that the default compression algorithm, gzip, is quite slow, especially if you use the default non-parallel implementation.

So we tried something else:

tar cvpf wouter@new-laptop:old-laptop.tar.gz -Ipigz /home /var /etc

Better, but not quite great yet. The old laptop now has bursts of maxing out CPU, but it doesn't even come close to maxing out the gigabit network cable between the two.

Tar can compress using the LZ4 algorithm. That algorithm doesn't compress very well, but it's the best choice if "speed" is the most important consideration. So I could do that:

tar cvpf wouter@new-laptop:old-laptop.tar.gz -Ilz4 /home /var /etc

The trouble with that, however, is that the tarball will then be quite big.

So why not use the CPU power of the new laptop?

tar cvpf - /home /var /etc | ssh new-laptop "pigz > old-laptop.tar.gz"

Yeah, that's much faster. Except, now the network speed becomes the limiting factor. We can do better.

tar cvpf - -Ilz4 /home /var /etc | ssh new-laptop "lz4 -d | pigz > old-laptop.tar.gz"

This uses about 70% of the link speed, just over one core on the old laptop, and 60% of CPU time on the new laptop.

After also adding a bit of --exclude="*cache*", to avoid files we don't care about, things go quite quickly now: somewhere between 200 and 250G (uncompressed) was transferred into a 74G file, in 20 minutes. My first attempt hadn't even done 10G after an hour!

I published the following diary on isc.sans.edu: “A ‘Zip Bomb’ to Bypass Security Controls & Sandboxes“:

Yesterday, I analyzed a malicious archive for a customer. It was delivered to the mailbox of a user who, hopefully, was security-aware and reported it. The payload passed through the different security layers based on big players on the market!

The file is a zip archive (SHA256:97f205b8b000922006c32c9f805206c752b0a7d6280b6bcfe8b60d52f3a1bb5f) and has a score of 6/58 on VT. The archive contains an ISO file that, once mounted, discloses a classic PE file. But let’s have a look at the file… [Read more]

The post [SANS ISC] A ‘Zip Bomb’ to Bypass Security Controls & Sandboxes appeared first on /dev/random.

May 19, 2022

When I signed for the publication of Printeurs, my publisher let me know that he expected a certain investment from authors in the promotion of their books. I asked what he meant by that, and he mentioned taking part in book fairs, signing sessions, that kind of thing. Fairs that were among the first victims of COVID in 2020 and 2021.

In 2022, it is time to make up for lost time and honour my promise. I will therefore be at the Imaginales in Épinal, the most famous speculative fiction fair, from Friday afternoon 20 May until Sunday afternoon 22 May.

I have no idea what I am supposed to do there.

Although I have very often heard about the Imaginales, I have no idea what this kind of event looks like, nor what is expected of an author there. It is a new experience for me, and I am quite curious about it.

If you are in the area, don't hesitate to come and say a quick hello and ask me questions about the book coming out this year. Look for a typewriter wedged between a pile of "Printeurs", my cyberpunk novel, and a pile of "Aristide, le lapin cosmonaute", my children's book, of which only a handful of copies remain. The guy behind the machine going "clack clack clack ding" is me! Don't feel obliged to buy a book. What? My publisher (the guy behind me, with a whip in his hand and a € symbol instead of pupils) tells me that yes, it is mandatory.

In practical terms, I will be at the PVH édition stand, in a small white tent facing the fountain in the middle of the park, on Saturday between 11am and 7pm and, to be confirmed, on Sunday morning in the same place, next to a video game demo and a 3D printer.

I admit I hesitated for a long time before posting this message, but I told myself that if some of you are in the area, it would be a shame to miss each other. It is always a pleasure for me to meet readers of my books or of my blog. Some of you have been following and supporting me for nearly 15 years, and there is something about meeting in the flesh, even very briefly, that dozens of emails can never bring.

For those who won't be in Épinal, I hope it is only a matter of time before another opportunity comes up (between us, I secretly hope to be able to propose a talk at the Utopiales in Nantes, which would give me a good excuse to go there).

Receive the posts by email or by RSS. Max 2 posts per week, nothing else. Email address never shared and permanently deleted upon unsubscription. Latest book published: Printeurs, a cyberpunk thriller. To support the author, read, give and share books.

This text is published under the CC-By BE licence.