Planet Grep

Planet'ing Belgian FLOSS people

Planet Grep is maintained by Wouter Verhelst. All times are in UTC.

May 28, 2020

The reason software isn't better is because it takes a lifetime to understand how much of a mess we've made of things, and by the time you get there, you will have contributed significantly to the problem.
Two software developers pairing up on a Rails app

The fastest code is code that doesn't need to run, and the best code is code you don't need to write. This is rather obvious. Less obvious is how to get there, or who to ask. Every coder has their favored framework or language, their favored patterns and practices. Advice on what to do is easy to find. More rare is what not to do. They'll often say "don't use X, because of Y," but that's not so much advice as it is a specific criticism.

The topic interests me because significant feats of software engineering often don't seem to revolve around new ways of doing things. Rather, they involve old ways of not doing things. Constraining your options as a software developer often enables you to reach higher than if you hadn't.

Many of these lessons are hard learned, and in retrospect often come from having tried to push an approach further than it merited. Some days much of software feels like this, as if computing has already been pushing our human faculties well past the collective red line. Hence I find the best software advice is often not about code at all. If it's about anything, it's about data, and how you organize it throughout its lifecycle. That is the real currency of the coder's world.

Usually data is the ugly duckling, relegated to the role of an unlabeled arrow on a diagram. The main star is all the code that we will write, which we draw boxes around. But I prefer to give data top billing, both here and in general.

One-way Data Flow

In UI, there's the concept of one-way data flow, popularized by the now omnipresent React. One-way data flow is all about what it isn't, namely not two-way. This translates into benefits for the developer, who can reason more simply about their code. Unlike traditional Model-View-Controller architectures, React is sold as being just the View.

Expert readers however will note that the original trinity of Model-View-Controller does all flow one way, in theory. Its View receives changes from the Model and updates itself. The View never talks back to the model, it only operates through the Controller.

model view controller

The reason it's often two-way in practice is because there are lots of M's, V's and C's which all need to communicate and synchronize in some unspecified way:

model view controller - data flow

The source of truth is some kind of nebulous Ur-Model, and each widget in the UI is tied to a specific part of it. Each widget has its own local model, which has to bi-directionally sync up to it. Children go through their parent to reach up to the top.

When you flatten this, it starts to look more like this:

model view controller - 2-way data flow

Between an original model and a final view must sit a series of additional "Model Controllers" whose job it is to pass data down and to the right, and vice versa. Changes can be made in either direction, and there is no single source of truth. If both sides change at the same time, you don't know which is correct without more information. This is what makes it two-way.

model view controller - one-way stateless data flow

The innovation in one-way UI isn't exactly to remove the Controller, but to centralize it and call it a Reducer. It also tends to be stateless, in that it replaces the entire Model for every change, rather than updating it in place.

This makes all the intermediate arrows one-way, restoring the original idea behind MVC. But unlike most MVC, it uses a stateless function f: model => views to derive all the Views from the Ur-Model in one go. There are no permanent Views that are created and then set up to listen to an associated Model. Instead Views are pure data, re-derived for every change, at least conceptually.
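A toy sketch of that shape, reduced to its essentials (the names and types are mine, not React's actual API):

```typescript
// One-way flow: state only changes via the reducer; views are derived data.
type Model = { count: number; label: string };
type Action = { type: "increment" } | { type: "rename"; label: string };

// The centralized, stateless "controller": old model in, new model out.
function reduce(model: Model, action: Action): Model {
  switch (action.type) {
    case "increment": return { ...model, count: model.count + 1 };
    case "rename":    return { ...model, label: action.label };
  }
}

// f: model => views — the entire view is re-derived from the model.
function render(model: Model): string {
  return `<button>${model.label}: ${model.count}</button>`;
}

let model: Model = { count: 0, label: "Clicks" };
model = reduce(model, { type: "increment" });
console.log(render(model)); // "<button>Clicks: 1</button>"
```

Note that `render` holds no state of its own: throw away the output, re-run it, and you get the same views back.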

In practice there is an actual trick to making this fast, namely incrementalism and the React Reconciler. You don't re-run everything, but you can pretend you do. A child is guaranteed to be called again if a parent has changed. But only after giving that parent, and its parents, a chance to react first.

Even if the Views are a complex nested tree, the data flow is entirely one way except at the one point where it loops back to the start. If done right, you can often shrink the controller/reducer to such a degree that it may as well not be there.

Much of the effort in developing UI is not in the widgets but in the logic around them, so this can save a lot of time. Typical MVC instead tends to spread synchronization concerns all over the place as the UI develops, somewhat like a slow but steadily growing cancer.

The solution seems to be to forbid a child from calling or changing the state of its parent directly. Many common patterns in old UI code become impossible and must be replaced with alternatives. Parents do often pass down callbacks to children to achieve the same thing by another name. But this is a cleaner split, because the child component doesn't know who it's calling. The parent can decide to pass-through or decorate a callback given to it by its parent, and this enables all sorts of fun composition patterns with little to no boilerplate.

You don't actually need to have one absolute Ur-Model. Rather the idea is separation of concerns along lines of where the data comes from and what it is going to be used for, all to ensure that change only flows in one direction.

The benefits are numerous because of what it enables: when you don't mutate state bidirectionally, your UI tree is also a data-dependency graph. This can be used to update the UI for you, requiring you to only declare what you want the end result to be. You don't need to orchestrate specific changes to and fro, which means a lot of state machines disappear from your code. Key here is the ability to efficiently check for changes, which is usually done using immutable data.
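With immutable updates, that change check collapses to a reference comparison, sketched here with plain objects:

```typescript
type Model = { todos: string[]; user: { name: string } };

// An update replaces only the parts that changed; the rest is shared as-is.
function addTodo(model: Model, todo: string): Model {
  return { ...model, todos: [...model.todos, todo] };
}

const before: Model = { todos: ["a"], user: { name: "alice" } };
const after = addTodo(before, "b");

// Change detection is a cheap identity check, not a deep diff:
console.log(after.todos === before.todos); // false — re-derive the todo views
console.log(after.user === before.user);   // true  — the user widget is untouched
```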

The merit of this approach is most obvious once you've successfully built a complex UI with it. The discipline it enforces leads to more elegant and robust solutions, because it doesn't let you wire things up lazily. You must instead take the long way around, and design a source of truth in accordance with all its intended derivatives. This forces but also enables you to see the bigger picture. Suddenly features that seemed insurmountably complicated, because they cross-cut too many concerns, can just fall out naturally. The experience is very similar to Immediate Mode UI, only with the ability to decouple more and do async.

If you don't do this, you end up with the typical Object-Oriented system. Every object can be both an actor and can be mutually acted upon. It is normal and encouraged to create two-way interactions with them and link them into cycles. The resulting architecture diagrams will be full of unspecified bidirectional arrows that are difficult to trace, which obscure the actual flows being realized.

Unless they represent a reliable syncing protocol, bidirectional arrows are wishful thinking.

Immutable Data

Almost all data in a computer is stored on a mutable medium, be it a drive or RAM. As such, most introductions to immutable data will preface it by saying that it's kinda weird. Because once you create a piece of data, you never update it. You only make a new, altered copy. This seems like a waste of perfectly good storage, volatile or not, and contradicts every programming tutorial.

Because of this it is mandatory to say that you can reduce the impact of it with data sharing. This produces a supposedly unintuitive copy-on-write system.
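A minimal illustration of that data sharing: a persistent list where each new version shares the old one's tail instead of copying it:

```typescript
// A persistent list: "updating" prepends a new cell; the old data is untouched.
type Cell = { value: string; next: Cell | null };

const archive: Cell = { value: "draft-1", next: null };
const v2: Cell = { value: "draft-2", next: archive }; // copy-on-write

console.log(v2.next === archive); // true — nothing was copied, only shared
console.log(archive.value);       // "draft-1" — the paper trail survives
```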

But there's a perfect parallel, and that's the pre-digital office. Back then, most information was kept on paper that was written, typed or printed. If a document had to be updated, it had to be amended or redone from scratch. Aside from very minor annotations or in-place corrections, changes were not possible. When you did redo a document, the old copy was either archived, or thrown away.

data sharing - copy on write

The perfectly mutable medium of computer memory is a blip, geologically speaking. It's easy to think it only has upsides, because it lets us recover freely from mistakes. Or so we think. But the same needs that gave us real life bureaucracy re-appear in digital form. Only it's much harder to re-introduce what came naturally offline.

Instead of thinking of mutable data as the default, I prefer to think of it as data that destroys its own paper trail. It shreds any evidence of the change and adjusts the scene of the crime so the past never happened. All edits are applied atomically, with zero allowances for delay, consideration, error or ambiguity. This transactional view of interacting with data is certainly appealing to systems administrators and high-performance fetishists, but it is a poor match for how people work with data in real life. We enter and update it incrementally, make adjustments and mistakes, and need to keep the drafts safe too. We need to sync between devices and across a night of sleep.

banksy self-shredding painting

Girl With Balloon aka The Self-shredding Painting (Banksy)

Storing your main project in a bunch of silicon that loses its state as soon as you turn off the power is inadvisable. This is why we have automated backups. Apple's Time Machine for instance turns your computer into a semi-immutable data store on a human time scale, garbage collected behind the scenes and after the fact. Past revisions of files are retained for as long as is practical, provided the app supports revision control. It even works without the backup drive actually hooked up, as it maintains a local cache of the most recent edits as space permits.

It's a significant feat of engineering, supported by a clever reinterpretation of what "free disk space" actually means. It allows you to Think Different™ about how data works on your computer. It doesn't just give you the peace of mind of short-term OS-wide undo. It means you can still go fish a crumpled piece of data out of the trash long after throwing banana peels and coke cans on top. And you can do it inline, inside the app you're using, using a UI that is only slightly over the top for what it does.

That is what immutable data gets you as an end-user, and it's the result of deciding not to mutate everything in place as if empty disk space is a precious commodity. The benefits can be enormous, for example that synchronization problems get turned into fetching problems. This is called a Git.

It's so good most developers would riot if they were forced to work without it, but almost none grant their own creations the same abilities.

Linus Torvalds

Git repositories are of course notorious for only growing bigger, never shrinking, but that is a long-standing bug if we're really honest. It seems pretty utopian to want a seamless universe of data, perfectly normalized by key in perpetuity, whether mutable or immutable. Falsehoods programmers believe about X is never wrong on a long enough time-scale, and you will need affordances to cushion that inevitable blow sooner or later.

One of those falsehoods is that when you link a piece of data from somewhere else, you always wish to keep that link live instead of snapshotting it, better known as Database Normalization. Given that screenshots of screenshots are now the most common type of picture on the web, aside from cats, we all know that's a lie. Old bills don't actually self-update after you move house. In fact if you squint hard "Print to PDF" looks a lot like compiling source code into a binary for normies, used for much the same reasons.

The analogy to a piece of paper is poignant to me, because you certainly feel it when you try to actually live off SaaS software meant to replicate business processes. Working with spreadsheets and PDFs on my own desktop is easier and faster than trying to use an average business solution designed for that purpose in the current year. Because they built a tool for what they thought people do, instead of what we actually do.

These apps often have immutability, but they use it wrong: they prevent you from changing something as a matter of policy, letting workflow concerns take precedence over an executive override. If e.g. law requires a paper trail, past versions can be archived. But they should let you continue to edit as much as you damn well want, saving in the background if appropriate. The exceptions that get this right can probably be counted on one hand.

Business processes are meant to enable business, not constrain it. Requiring that you only ever have one version of everything at any time does exactly that. Immutability with history is often a better solution, though not a miracle cure. Doing it well requires expert skill in drawing boundaries between your immutable blobs. It also creates a garbage problem and it won't be as fast as mutable in the short term. But in the long term it just might save someone a rewrite. It's rarely pretty when real world constraints collide with an ivory tower that had too many false assumptions baked into it.

Rolls containing Acts of Parliament in the Parliamentary Archives at Victoria Tower, Palace of Westminster

Parliamentary Archives at Victoria Tower – Palace of Westminster

Pointerless Data

Data structures in a systems language like C will usually refer to each other using memory pointers: these are raw 64-bit addresses pointing into the local machine's memory, obscured by virtualization. They reference memory pages that are allocated, with their specific numeric value meaningless and unpredictable.

This has a curious consequence: the most common form of working with data on a computer is one of the least useful encodings of that data imaginable. It cannot be used as-is on any other machine, or even the same machine later, unless loaded at exactly the same memory offset in the exact same environment.

Almost anything else, even in an obscure format, would have more general utility. Serializing and deserializing binary data is hence a major thing, which includes having to "fix" all the pointers, a problem that has generated at least 573 kiloyaks worth of shaving. This is strange because the solution is literally just adding or subtracting a number from a bunch of other numbers over and over.

Okay that's a lie. But what's true is that every pointer p in a linked data structure is really a base + i, with a base address that was determined once and won't change. Using pointers in your data structure means you sprinkle base + invisibly around your code and your data. You bake this value into countless repeated memory cells, which you then have to subtract later if you want to use their contents for outside purposes.

Due to dynamic memory allocation the base can vary for different parts of your linked data structure. You have to assume it's different per pointer, and manually collate and defragment all the individual parts to serialize something.

Pointers are popular because they are easy, they let you forget where exactly in memory your data sits. This is also their downside: not only have you encoded your data in the least repeatable form possible, but you put it where you don't have permission to search through all of it, add to it, or reorganize it. malloc doesn't set you free, it binds you.

But that's a design choice. If you work inside one contiguous memory space, you can replace pointers with just the relative offset i. The resulting data can be snapshotted as a whole and written to disk. In addition to pointerless, certain data structures can even be made offsetless.

For example, a flattened binary tree where the index of a node in a list determines its position in the tree, row by row. Children are found at 2*i and 2*i + 1. This can be e.g. used on GPUs and allows for very efficient traversal and updates. It's also CPU-cache friendly. This doesn't work well for arbitrary graphs, but is still a useful trick to have in your toolbox. In specific settings, pointerless or offsetless data structures can have significant benefits. The fact that it lets you treat data like data again, and just cargo it around wholesale without concern about the minutiae, enables a bunch of other options around it.
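A sketch of such a flattened tree, using 1-based indexing so the stated 2*i / 2*i + 1 formula holds (slot 0 is padding):

```typescript
// A complete binary tree flattened into an array, row by row:
//                1
//              /   \
//             2     3
//            / \   / \
//           4   5 6   7
const tree = [0, 1, 2, 3, 4, 5, 6, 7]; // index 0 unused

const left   = (i: number) => 2 * i;
const right  = (i: number) => 2 * i + 1;
const parent = (i: number) => Math.floor(i / 2);

// Traversal is pure index arithmetic — no pointers to fix up — and the
// whole structure serializes as-is:
console.log(tree[left(2)], tree[right(2)]); // 4 5
console.log(tree[parent(7)]);               // 3
console.log(JSON.stringify(tree));          // "[0,1,2,3,4,5,6,7]"
```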

Binary Tree - Flattened

It's not a silver bullet because going pointerless can just shift the problem around in the real world. Your relative offsets can still have the same issue as before, because your actual problem was wrangling the data-graph itself. That is, all the bookkeeping of dependent changes when you edit, delete or reallocate. Unless you can tolerate arbitrary memory fragmentation and bloating, it's going to be a big hassle to make it all work well.

Something else is going on beyond just pointers. See, most data structures aren't really data structures at all. They're acceleration structures for data. They accelerate storage, querying and manipulation of data that was already shaped in a certain way.

The contents of a linked list are the same as that of a linear array, and they serialize to the exact same result. A linked list is just an array that has been atomized, tagged and sprayed across an undefined memory space when it was built or loaded.

Because of performance, we tend to use our acceleration structures as a stand-in for the original data, and manipulate that. But it's important to realize this is programmer laziness: it's only justified if all the code that needs to use that data has the same needs. For example, if one piece of code does insertions, but another needs random access, then neither an array nor linked list would win, and you need something else.

We can try to come up with ever-cleverer data structures to accommodate every imaginable use, and this is called a Postgres. It leads to a ritual called a Schema Design Meeting where a group of people with differently shaped pegs decide what shape the hole should be. Often you end up with a too-generic model that doesn't hold anything particularly well. All you needed was 1 linked list and 1 array containing the exact same data, and a function to convert one to the other, that you use maybe once or twice.

When a developer is having trouble maintaining consistency while coding data manipulations, that's usually because they're actually trying to update something that is both a source of truth and output derived from it, at the same time in the same place. Most of the time this is entirely avoidable. When you do need to do it, it is important to be aware that's what that is.

My advice is to not look for the perfect data structure which kills all birds with one stone, because this is called a Lisp and few people use it. Rather, accept the true meaning of diversity in software: you will have to wrangle different and incompatible approaches, transforming your data depending on context. You will need to rely on well-constructed adaptors that exist to allow one part to forget about most of the rest of the universe. It is best to become good at this and embrace it where you can.

As for handing your data to others, there is already a solution for that. They're called file formats, and they're a thing we used to have. Software used to be able to read many of them, and you could just combine any two tools that had the same ones. Without having to pay a subscription fee for the privilege, or use a bespoke one-time-use convertor. Obviously this was crazy.

These days we prefer to link our data and code using URLs, which is much better because web pages can change invisibly underneath you without any warning. You also can't get the old version back even if you liked it more or really needed it, because browsers have chronic amnesia. Unfortunately it upsets publishers and other copyright holders if anyone tries to change that, so we don't try.

squeak / smalltalk


Suspend and Resume

When you do have snapshottable data structures that can be copied in and out of memory wholesale, it leads to another question: can entire programs be made to work this way? Could they be suspended and resumed mid-operation, even transplanted or copied to another machine? Imagine if instead of a screenshot, a tester could send a process snapshot that can actually be resumed and inspected by a developer. Why did it ever only 'work on my machine'?

Obviously virtual machines exist, and so does wholesale-VM debugging. But on the process level, it's generally a non-starter, because sockets and files and drivers mess it up. External resources won't be tracked while suspended and will likely end up in an invalid state on resume. VMs have well-defined boundaries and well-defined hardware to emulate, whereas operating systems are a complete wild west.

It's worth considering the worth of a paper trail here too. If I suspend a program while a socket is open, and then resume it, what does this actually mean? If it was a one-time request, like an HTTP GET or PUT, I will probably want to retry that request, if at all still relevant. Maybe I prefer to drop it as unimportant and make a newer, different request. If it was an ongoing connection like a WebSocket, I will want to re-establish it. Which is to say, if you told a network layer the reason for opening a socket, maybe it could safely abort and resume sockets for you, subject to one of several policies, and network programming could actually become pleasant.

Files can receive a similar treatment, to deal with the situation where they may have changed, been deleted, moved, etc. Knowing why a file was opened or being written to is required to do this right, and depends on the specific task being accomplished. Here too macOS deserves a shout-out, for being clever enough to realize that if a user moves a file, any application editing that file should switch to the new location as well.

Systems-level programmers tend to orchestrate such things by hand when needed, but the data flow in many cases is quite unidirectional. If a process, or a part of a process, could resume and reconnect with its resources according to prior declared intent, it would make a lot of state machines disappear.

It's not a coincidence this post started with React. Even those aware of it still don't quite realize React is not actually a thing to make web apps. It is an incremental job scheduler, for recursively expanding a tree in an asynchronous and rewindable fashion. It just happens to be built for SGML-like trees, and contains a bunch of legacy fixes for browsers. The pattern can be applied to many areas that are not UI and not web. If it sounds daunting to consider approaching resources this way, consider that people thought exactly the same about async I/O until someone made that pleasant enough.

However, doing this properly will probably require going back further than you think. For example, when you re-establish a socket, should you repeat and confirm the DNS lookup that gave you the IP in the first place? Maybe the user moved locations between suspending and resuming, so you want to reconnect to the nearest data center. Maybe there is no longer a need for the socket because the user went offline.

All of this is contextual, defined by policies informed by the real world. This class of software behavior is properly called etiquette. Like its real world counterpart it is extremely messy because it involves anticipating needs. Usually we only get it approximately right through a series of ad-hoc hacks to patch the worst annoyances. But it is eminently felt when you get such edge cases to work in a generic and reproducible fashion.

Mainly it requires treating policies as first class citizens in your designs and code. This can also lead you to perceive types in code in a different way. A common view is that a type constrains any code that refers to it. That is, types ensure your code only applies valid operations on the represented values. When types represent policies though, the perspective changes because such a type's purpose is not to constrain the code using it. Rather, it provides specific guarantees about the rules of the universe in which that code will be run.
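A small TypeScript sketch of that distinction (names invented): the policy type doesn't restrict which values the code may touch; it states the rules of the universe the code will be run in:

```typescript
// A policy as a first-class value with a type. Code that accepts a
// RetryPolicy isn't constrained by it — it's *guaranteed* how many times
// it will be given a chance to run.
type RetryPolicy = { attempts: number };

function runWithPolicy<T>(op: () => T, policy: RetryPolicy): T {
  let lastError: unknown;
  for (let i = 0; i < policy.attempts; i++) {
    try { return op(); } catch (e) { lastError = e; }
  }
  throw lastError;
}

// An operation that fails twice before succeeding:
let calls = 0;
const flaky = () => { if (++calls < 3) throw new Error("transient"); return "ok"; };
console.log(runWithPolicy(flaky, { attempts: 5 })); // "ok", on the third attempt
```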

This to me is the key to developer happiness. As opposed to, say, making tools to automate the refactoring of terrible code and make it bearable, but only just.

The key to end-user happiness is to make tools that enable an equivalent level of affordance and flexibility compared to what the developer needed while developing it.

* * *

When you look at code from a data-centric view, a lot of things start to look like stale or inconsistent data problems. I don't like using the word "cache" for this because it focuses on the negative, the absence of fresh input. The real issue is data dependencies, which are connections that must be maintained in order to present a cohesive view and cohesive behavior, derived from a changing input model. Which is still the most practical way of using a computer.

Most caching strategies, including 99% of those in HTTP, are entirely wrong. They fall into the give-up-and-pray category, where they assume the problem is intractable and don't try something that could actually work in all cases. Which, stating the obvious, is what you should actually aim for.

Often the real problem is that the architect's view of the problem is a tangled mess of boxes and arrows that point all over the place, with loopbacks and reversals, which makes it near-impossible to anticipate and cover all the applicable scenarios.

If there is one major thread running through this, it's that many currently accepted sane defaults really shouldn't be. In a world of terabyte laptops and gigabyte GPUs they look suspiciously like premature optimization. Many common assumptions deserve to be re-examined, at least if we want to adapt tools from the Offline Age to a networked age. We really don't need a glossier version of a Microsoft Office 95 wizard with a less useful file system.

We do need optimized code in our critical paths, but developer time is worth more than CPU time most everywhere else. Most of all, we need the ambition to build complete tools and the humility to grant our users access on an equal footing, instead of hoarding the goods.

The argument against these practices is usually that they lead to bloat and inefficiency. Which is definitely true. Yet even though our industry has not adopted them much at all, the software already comes out orders of magnitude bigger and slower than before. Would it really be worse?

If you tested your blog’s performance on Google PageSpeed Insights yesterday and do so again today, you might be in for a surprise with a lower score even if not one byte (letter) got changed on your site. The reason: Google updated PageSpeed Insights to Lighthouse 6, which changes the KPI’s (the lab data metrics) that are reported, adds new opportunities and recommendations and changes the way the total score is calculated.

So it all starts with the changed KPI’s in the lab metrics; whereas up until yesterday First Contentful Paint, Speed Index, Time to Interactive, First Meaningful Paint, First CPU Idle and First Input Delay were measured, the last 3 are no longer shown, having been replaced by:

  • Largest Contentful Paint marks the point when the page’s main content has likely loaded; this can generally be improved by removing render-blocking resources (JS/CSS), optimizing images, …
  • Total Blocking Time quantifies how non-interactive a page is while loading; this is mainly impacted by JavaScript (local and 3rd party) blocking the main thread, so improving it generally means ensuring there is less JS to execute
  • Cumulative Layout Shift, which measures unexpected layout shifts

The total score is calculated based on all 6 metrics, but the weight of the 3 “old” ones (FCP, SI, TTI) is significantly lowered (from 80% to 45%) and the new LCP & TBT account for a whopping 50% of your score (CLS is only 5%).
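Written out, the new weighting looks something like this; the per-metric percentages are my reading of the Lighthouse 6 scoring docs, so treat them as approximate:

```typescript
// Approximate Lighthouse 6 weights, as percentages of the total score:
const weights: Record<string, number> = {
  FCP: 15, SI: 15, TTI: 15,  // the three "old" metrics: 45% combined
  LCP: 25, TBT: 25,          // the two new heavyweights: 50% combined
  CLS: 5,                    // layout shift: 5%
};

// Total score = weighted average of the per-metric scores (each 0..100).
function totalScore(metricScores: Record<string, number>): number {
  let sum = 0;
  for (const [metric, w] of Object.entries(weights)) {
    sum += w * (metricScores[metric] ?? 0);
  }
  return sum / 100;
}

// Perfect everywhere except the two new heavy metrics — you lose half your score:
console.log(totalScore({ FCP: 100, SI: 100, TTI: 100, LCP: 0, TBT: 0, CLS: 100 })); // 50
```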

Lastly, one very interesting opportunity and two recommendations I noticed:

  • GPSI already listed unused CSS, but now adds unused JS to that list, which will prove equally hard to control in WordPress, as JS, like CSS, is added by almost every plugin. Obviously if you’re using Autoptimize this will flag the Autoptimized JS; disable Autoptimize for the test by adding ?ao_noptimize=1 to the URL to see what original JS is unused.
  • GPSI now warns about using document.write and about the impact of passive listeners on scrolling performance which can lead to Google complaining about … Google :-)

Summary: Google Pagespeed Insights changed a lot and it forces performance-aware users to stay on their toes. Especially sites with lots of (3rd party) JavaScript might want to reconsider some of the tools used.

I published the following diary on “Flashback on CVE-2019-19781“:

First of all, did you know that the Flame malware turned 8 years old today? Happy Birthday! The discovery of this famous malware was announced on May 28th, 2012. The malware was used for targeted cyber espionage activities in the Middle East area. This malware was probably developed by a nation-state organization. It infected a limited number of hosts (~1000 computers), making it a targeted attack… [Read more]

[The post [SANS ISC] Flashback on CVE-2019-19781 has been first published on /dev/random]

May 25, 2020

I recently learned that quite a few (old) root certificates are going to expire, and many websites still send those along in the TLS handshake.

May 23, 2020

I published the following diary on “AgentTesla Delivered via a Malicious PowerPoint Add-In“:

Attackers are always trying to find new ways to deliver malicious code to their victims. Microsoft Word and Excel are documents that can be easily weaponized by adding malicious VBA macros. Today, they are one of the most common techniques to compromise a computer. Especially because Microsoft implemented automatically executed macros when the document is opened. In Word, the macro must be named AutoOpen(). In Excel, the name must be Workbook_Open(). However, PowerPoint does not support this kind of macro. Really? Not in the same way as Word and Excel do… [Read more]

[The post [SANS ISC] AgentTesla Delivered via a Malicious PowerPoint Add-In has been first published on /dev/random]

May 21, 2020

I published the following diary on “Malware Triage with FLOSS: API Calls Based Behavior“:

Malware triage is a key component of your hunting process. When you collect suspicious files from multiple sources, you need a tool to automatically process them to extract useful information. To achieve this task, I’m using FAME which means “FAME Automates Malware Evaluation”. This framework is very nice due to the architecture based on plugins that you can enable upon your needs. Here is an overview of my configuration… [Read more]

[The post [SANS ISC] Malware Triage with FLOSS: API Calls Based Behavior has been first published on /dev/random]

May 19, 2020

Recently I had to work with one of my colleagues (David) on something that was new to me: OpenShift. I had never really looked at OpenShift but knew the basic concepts, at least on OKD 3.x.

With 4.x, OCP is completely different: instead of deploying a "normal" Linux distro (like CentOS in our case), it now uses RHCOS (so CoreOS) as its foundation. The goal of this blog post is not to dive into all the technical steps required to deploy/bootstrap the OpenShift cluster, but to discuss one particular 'issue' I found annoying while deploying: how to disable DHCP on the CoreOS-provisioned nodes.

To cut a long story short, you can read the basic steps needed to deploy OpenShift on bare metal in the official doc.

Have you read it? Good, now we can move forward :)

After we had configured our install-config.yaml (with our needed values) and generated the manifests with openshift-install create manifests --dir=/path/, we thought it would just be a matter of deploying with the ignition files built by the openshift-install create ignition-configs --dir=/path step (see the above doc for all details).

It's true that we ended up with some ignition files like:

  • bootstrap.ign
  • worker.ign
  • master.ign

Those ignition files are (more or less) like traditional kickstart files that let you automate the RHCOS deployment on bare-metal. The other part is really easy: it's just a matter (with Ansible in our case) of configuring the tftp boot arguments and calling an ad-hoc task to remotely force a physical reinstall of the machine (through IPMI).

So we first kicked off the bootstrap node (an ephemeral node used as a temporary master, from which the real masters forming the etcd cluster get their initial config), but then we realized that, while RHCOS was installed and responding on the fixed IP we set through pxeboot kernel parameters (and correctly applied after the reboot), each RHCOS node was also trying by default to activate all NICs present on the machine.

That suddenly got "interesting", as we don't fully control the network where those machines are, and each physical node has 4 NICs, all in the same VLAN, in which we also have a small DHCP range for other deployments. Do you see the problem with etcd members in the same subnet having multiple IP addresses? Yeah, it wasn't working, as we saw some requests coming from the DHCP-configured interfaces instead of the first properly configured NIC on each system.

The "good" thing is that you can still ssh into each deployed RHCOS node (even if that's not advised) to troubleshoot this. We discovered that RHCOS still uses NetworkManager, but that its default is to enable all NICs with DHCP if nothing else is declared, which is exactly what we needed to disable.

After some research and help from Colin Walters, we were pointed to this bug report for coreos

With the traditional "CentOS Linux" sysadmin mindset, I thought: "good, we can just automate this with Ansible, ssh'ing into each provisioned RHCOS node to disable it", but there had to be a cleverer way to deal with this, as it was also impacting our initial bootstrap and master nodes (so no way to get the cluster up).

That's when we found this: customizing deployments with Day-0 config; here is a simple example for Chrony.

That's how I understood the concept of MachineConfig and how it's supposed to work for a provisioned cluster, but also for the bootstrap process. So let's use that information to create what we need and start a fresh deploy.

Assuming that we want to create our manifests in:

openshift-install create manifests --dir=/<path>/

And now that we have manifests, let's inject our machine configs. You'll see that because it's YAML all over the place, injecting YAML in YAML would be "interesting", so the concept here is to inject file content as a base64-encoded string, everywhere.

Let's suppose that we want the file /etc/NetworkManager/conf.d/disabledhcp.conf with the following content on each provisioned node (master and worker), to tell NetworkManager not to default to auto/DHCP:

[main]
no-auto-default=*

Let's first encode it to base64:

cat << EOF | base64
[main]
no-auto-default=*
EOF

Our base64 value is W21haW5dCm5vLWF1dG8tZGVmYXVsdD0qCg==
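As a quick sanity check (assuming GNU coreutils' base64 is available), decoding the string gives back exactly the NetworkManager snippet:

```shell
# Decode the payload to double-check what will land in the file
echo 'W21haW5dCm5vLWF1dG8tZGVmYXVsdD0qCg==' | base64 -d
# prints:
# [main]
# no-auto-default=*
```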

So now that we have the content, let's create manifests that will create that file automatically at provisioning time:

pushd <path>
# To ensure that provisioned masters will try to become masters as soon as they are installed
sed -i 's/mastersSchedulable: true/mastersSchedulable: false/g' manifests/cluster-scheduler-02-config.yml

pushd openshift
for variant in master worker; do
cat << EOF > ./99_openshift-machineconfig_99-${variant}-nm-nodhcp.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: ${variant}
  name: nm-${variant}-nodhcp
spec:
  config:
    ignition:
      config: {}
      security:
        tls: {}
      timeouts: {}
      version: 2.2.0
    networkd: {}
    passwd: {}
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,W21haW5dCm5vLWF1dG8tZGVmYXVsdD0qCg==
          verification: {}
        filesystem: root
        mode: 0644
        path: /etc/NetworkManager/conf.d/disabledhcp.conf
  osImageURL: ""
EOF
done
popd
popd


I think this snippet is pretty straightforward, and you can see in the source field how we "inject" the content of the file itself (the base64 value we got in the previous step).

Now that we have added our customizations, we can just proceed with the openshift-install create ignition-configs --dir=/<path> command again, retrieve our .ign files, and call Ansible again to redeploy the nodes. This time they were deployed correctly, with only the IP coming from the Ansible inventory and no other NIC on DHCP.

Now that it works, deploying/adding more worker nodes to the OCP cluster is just a matter of calling Ansible, and physical nodes are deployed in a matter of ~5 minutes (as RHCOS just extracts its own archive on disk and reboots).

I don't know if I'll have to take multiple deep dives into OpenShift in the future, but at least I learned multiple things. And yes: you always learn more when you have to deploy something for the first time and it doesn't work straight away... so while you try to learn the basics from the official doc, you also have to find other resources/docs elsewhere :-)

Hope that it can help people in the same situation when having to deploy OpenShift on-premises/bare-metal.

May 15, 2020

As someone who has spent his entire career in Open Source, I've been closely following how Open Source is being used to fight the COVID-19 global pandemic.

I recently moderated a panel discussion on how Open Source is being used with regards to the coronavirus crisis. Our panel included: Jim Webber (Chief Scientist at Neo4J), Ali Ghodsi (CEO at Databricks), Dan Eiref (Senior Director of Product management at Markforged) and Debbie Theobold (CEO at Vecna Robotics). Below are some of the key takeaways from our discussion. They show how Open Source is a force for good in these uncertain times.

Open Source enables knowledge sharing

Providing accurate information related to COVID-19 is an essential public service. Neo4J worked with data scientists and researchers to create CovidGraph. It is an Open Source graph database that brings together information on COVID-19 from different sources.

Jim Webber from Neo4J explained, "The power of graph data [distributed via an open source management system] is that it can pull together disparate datasets from medical practitioners, public health officials and other scientific publications into one central view. People can then make connections between all facts. This is useful when looking for future long-term solutions." CovidGraph helped institutions like the Canadian government integrate data from multiple departments and facilities.

Databricks CEO Ali Ghodsi also spoke to his company's efforts to democratize data and artificial intelligence. Their mission is to help data teams solve the world's toughest problems. Databricks created Glow, an Open Source toolkit built on Apache Spark that enables large-scale genomic analysis. Glow helps scientists understand the development and spread of the COVID-19 virus. Databricks made their datasets available for free. Using Glow's machine learning tools, scientists are creating predictive models that track the spread of COVID-19.

Amid the positive progress we're seeing from this open approach to data, some considerations were raised about governments' responsibilities with the data they collect. Maintaining public trust is always a huge concern. Still, as Ali said, "The need for data is paramount. This isn't a matter of using data to sell ads; it's a matter of using data to save lives."

Open Source makes resources accessible on a global scale

It's been amazing to watch how Open Source propels innovation in times of great need. Dan Eiref from 3D printer company Markforged spoke to how his company responded to the call to assist in the pandemic. Markforged Open Sourced the design for face masks and nasal swabs. They also partnered with doctors to create a protective face shield and distributed personal protective equipment (PPE) to more than 500 hospitals.

"Almost immediately we got demand from more than 10,000 users to replicate this design in their own communities, as well as requests to duplicate the mask on non-Markforged printers. We decided to Open Source the print files so anyone could have access to these protections," said Eiref.

The advantage of Open Source is that it can quickly produce and distribute solutions to the people who need them most. Debbie Theobold, CEO of Vecna Robotics, shared how her company helped tackle the shortage of ventilators. Since COVID-19 began, medical manufacturers have struggled to provide enough ventilators, which can cost upwards of $40,000. Vecna Robotics partnered with the Massachusetts Institute of Technology (MIT) to develop an Open Source ventilator design called Ventiv, a low-cost alternative for emergency ventilation. "The rapid response from people to come together and offer solutions demonstrates the altruistic pull of the Open Source model to make a difference," said Theobold.

Of course, there are still challenges for Open Source in the medical field. In the United States, all equipment requires FDA certification. The FDA isn't used to Open Source, and Open Source isn't used to dealing with FDA certification either. Fortunately, the FDA has adjusted its process to help make these designs available more quickly.

Open Source accelerates digital transformations

A major question on everyone's mind was how technology will affect our society post-pandemic. It's already clear that long-term trends like online commerce, video conferencing, streaming services, cloud adoption and even Open Source are all being accelerated as a result of COVID-19. Many organizations need to innovate faster in order to survive. Responding to long-term trends by slowly adjusting traditional offerings is often "too little, too late".

For example, Debbie Theobold of Vecna Robotics brought up how healthcare organizations can see greater success by embracing websites and mobile applications. "These efforts for better, patient-managed experiences that were going to happen eventually are happening right now. We've launched our mobile app and embraced things like online pre-registration. Companies that were relying on in-person interactions are now struggling to catch up. We've seen that technology-driven interactions are a necessity to keeping patient relationships," she said.

At Acquia, we've known for years that offering great digital experiences is a requirement for organizations looking to stay ahead.

In every crisis, Open Source has empowered organizations to do more with less. It's great to see this play out again. Open Source teams have rallied to help and come up with some pretty incredible solutions when times are tough.


So I got my Xiaomi M365 e-scooter a few months ago, and it quickly started to show quite some disadvantages. The most annoying was the weak motor: going up long hills quickly forced me to step off as the e-scooter came to a grinding halt. The autonomy was low, which required a daily charging session of 4 hours. Another issue was the bulky form factor, which made transportation on the train a bit cumbersome. And last but not least: an e-scooter still looks like a child's toy. I know I'm a grown-up child, but that doesn't mean I want to shout it out to everyone.

In the meantime, I came across some information on monowheels: single-wheeled devices with pedals on the side. It looks quite daunting to ride one, but when I received my Inmotion V10, I was immediately sold. This kind of device is really revolutionary: powerful motor, great range and good looks. It is compact enough to easily take on public transport, and it has a maximum speed of 40 km/h.

It did take me quite a few days to learn to ride the thing: only after a week of daily half-hour practice sessions did things finally 'click' inside my head, and a week later I found myself confident enough to ride in traffic. A steep learning curve indeed, but if you persist, the reward is immense: riding this thing feels like flying!

I ran into this error when doing a very large MySQL import from a dumpfile.

May 14, 2020

Annoyingly, the date command differs vastly between Linux & BSD systems. Mac, being based on BSD, inherits the BSD version of that date command.
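As a small illustration of the difference (the commands below are an example of my own, not from the original post): GNU date does date arithmetic with -d, while BSD/macOS date uses -v adjustments instead.

```shell
# GNU date (Linux): relative date math via -d
date -u -d '2020-05-14 +1 day' '+%Y-%m-%d'   # prints 2020-05-15

# The BSD/macOS spelling of the same computation (fails on GNU date):
#   date -u -j -v+1d -f '%Y-%m-%d' '2020-05-14' '+%Y-%m-%d'
```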

May 11, 2020

Blue hearts

I'm excited to announce that the Drupal Association has reached its 60-day fundraising goal of $500,000. We also reached it in record time; in just over 30 days instead of the planned 60!

It has been really inspiring to see how the community rallied to help. With this behind us, we can look forward to the planned launch of Drupal 9 on June 3rd and our first virtual DrupalCon in July.

I'd like to thank all of the individuals and organizations who contributed to the #DrupalCares fundraising campaign. The Drupal community is stronger than ever! Thank you!

May 10, 2020

In a few hours, the Bitcoin network will experience its third “halving”. So what is it and how does it work under the hood?

May 08, 2020

I published the following diary on “Using Nmap As a Lightweight Vulnerability Scanner“:

Yesterday, Bojan wrote a nice diary about the power of the Nmap scripting language (based on LUA). The well-known port scanner can be extended with plenty of scripts that are launched depending on the detected ports. When I read Bojan’s diary, it reminded me of an old article that I wrote on my blog a long time ago. The idea was to use Nmap as a lightweight vulnerability scanner. Nmap has a scan type that tries to determine the service/version information running behind an open port (enabled with the ‘-sV’ flag). Based on this information, the script looks for interesting CVE in a flat database. Unfortunately, the script was developed by a third-party developer and was never integrated into the official list of scripts… [Read more]

[The post [SANS ISC] Using Nmap As a Lightweight Vulnerability Scanner has been first published on /dev/random]

May 06, 2020

I published the following diary on “Keeping an Eye on Malicious Files Life Time“:

We know that today’s malware campaigns are based on fresh files. Each piece of malware has a unique hash and it makes the detection based on lists of hashes not very useful these days. But can we spot some malicious files coming on stage regularly or, suddenly, just popping up from nowhere… [Read more]

[The post [SANS ISC] Keeping an Eye on Malicious Files Life Time has been first published on /dev/random]

May 05, 2020

These instructions can be followed to create a 2-out-of-3 multisignature address on the EOS blockchain (or any derivative thereof).

May 03, 2020

A quick reminder to myself that the Developer Console in Chrome or Firefox is useful to mass-select a bunch of checkboxes, if the site doesn’t have a “select all” option (which, really, it should).
I had a use case where I wanted to be notified whenever a particular string occurred in a log file.
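One common sketch of such a watcher is tail piped into grep (my own illustration, not necessarily the original post's solution; a finite printf stands in for tail -F here so the example terminates, and in real use you'd swap the echo for notify-send, mail, or similar):

```shell
# Real usage would be:
#   tail -F /var/log/app.log | grep --line-buffered 'ERROR' | while ...
# Finite stand-in input so the pipeline terminates:
printf 'ok\nERROR: disk full\nok\n' \
  | grep --line-buffered 'ERROR' \
  | while read -r line; do
      echo "ALERT: $line"        # replace with your notification command
    done
# prints: ALERT: ERROR: disk full
```

The --line-buffered flag matters with tail -F: without it, grep buffers its output and notifications arrive late.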

May 02, 2020


When you want to store your GnuPG private key(s) on a smartcard, you have a few options, like the Yubikey, GnuPG-compatible NitroKey cards, or the OpenPGP card. The advantage of these cards is that they support GnuPG directly. The disadvantage is that they can only store one or a few keys.

Another option is the SmartCard-HSM (the NitroKey HSM is based on the SmartCard-HSM and should be compatible). The newer versions support 4k RSA encryption keys and can store up to 19 of them; the older version is limited to 2k RSA keys. I still have the older version. The advantage is that you can store multiple keys on the card. To use it for GPG encryption you’ll need to set up a gpg-agent with gnupg-pkcs11-scd.



I use 3 smartcards to store my keys; these SmartCard-HSMs were created with Device Key Encryption Key (DKEK) keys. See my previous blog posts on how to set up a SmartCard-HSM with Device Key Encryption Keys:

I create the public/private key pair on an air-gapped system running Kali Linux live and copy the key to the other smartcards. See my previous blog posts on how to do this; I’ll only show how to create the keypair in this blog post.

Setup gpg

Create the keypair.

kali@kali:~$ pkcs11-tool --module --keypairgen --key-type rsa:2048 --label gpg.intern.stafnet.local --login
Using slot 0 with a present token (0x0)
Key pair generated:
Private Key Object; RSA 
  label:      gpg.intern.stafnet.local
  ID:         47490caa5589d5b95e2067c5bc49b03711b854da
  Usage:      decrypt, sign, unwrap
  Access:     none
Public Key Object; RSA 2048 bits
  label:      gpg.intern.stafnet.local
  ID:         47490caa5589d5b95e2067c5bc49b03711b854da
  Usage:      encrypt, verify, wrap
  Access:     none

Create and upload the certificate

Create a self-signed certificate

Create a self-signed certificate based on the key pair.

$ openssl req -x509 -engine pkcs11 -keyform engine -new -key 47490caa5589d5b95e2067c5bc49b03711b854da -sha256 -out cert.pem -subj "/CN=gpg.intern.stafnet.local"

Convert to DER

The certificate is created in PEM format; to be able to upload it to the smartcard we need it in DER format (we could also have created the certificate directly in DER format with -outform der).

$ openssl x509 -outform der -in cert.pem -out cert.der
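To convince yourself the conversion is lossless, you can round-trip a throwaway certificate (this demo uses a plain software key, not the smartcard, purely for illustration; all file names here are made up):

```shell
# Generate a throwaway self-signed cert with a software key
openssl req -x509 -newkey rsa:2048 -nodes -keyout /tmp/demo-key.pem \
  -out /tmp/demo-cert.pem -days 1 -subj "/CN=demo" 2>/dev/null

# PEM -> DER -> PEM
openssl x509 -outform der -in /tmp/demo-cert.pem -out /tmp/demo-cert.der
openssl x509 -inform der -in /tmp/demo-cert.der -out /tmp/demo-cert2.pem

# The round-tripped PEM is byte-identical to the original
cmp -s /tmp/demo-cert.pem /tmp/demo-cert2.pem && echo "round-trip OK"
```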

Upload the certificate to the smartcard(s)

$ pkcs11-tool --module /usr/lib64/ -l --write-object cert.der --type cert --id 47490caa5589d5b95e2067c5bc49b03711b854da --label "gpg.intern.stafnet.local"
Using slot 0 with a present token (0x0)
Logging in to "UserPIN (SmartCard-HSM)".
Please enter User PIN: 
Created certificate:
Certificate Object; type = X.509 cert
  label:      gpg.intern.stafnet.local
  subject:    DN: CN=gpg.intern.stafnet.local
  ID:         47490caa5589d5b95e2067c5bc49b03711b854da

Setup the gpg-agent

Install gnupg-pkcs11-scd from your GNU/Linux distribution’s package manager.

Configure gnupg-agent

$ cat ~/.gnupg/gpg-agent.conf
scdaemon-program /usr/bin/gnupg-pkcs11-scd
pinentry-program /usr/bin/pinentry
$ cat ~/.gnupg/gnupg-pkcs11-scd.conf
providers smartcardhsm
provider-smartcardhsm-library /usr/lib64/

Reload the agent

gpg-agent --server gpg-connect-agent << EOF
RELOADAGENT
EOF


$ gpg --card-status
Application ID ...: D2760001240111503131171B486F1111
Version ..........: 11.50
Manufacturer .....: unknown
Serial number ....: 171B486F
Name of cardholder: [not set]
Language prefs ...: [not set]
Sex ..............: unspecified
URL of public key : [not set]
Login data .......: [not set]
Signature PIN ....: forced
Key attributes ...: 1R 1R 1R
Max. PIN lengths .: 0 0 0
PIN retry counter : 0 0 0
Signature counter : 0
Signature key ....: [none]
Encryption key....: [none]
Authentication key: [none]
General key info..: [none]

Get the GPG KEY-FRIEDNLY string

$ gpg-agent --server gpg-connect-agent << EOF
SCD LEARN
EOF
OK Pleased to meet you
gnupg-pkcs11-scd[26682.2406156096]: Listening to socket '/tmp/gnupg-pkcs11-scd.NeQexh/agent.S'
gnupg-pkcs11-scd[26682.2406156096]: accepting connection
gnupg-pkcs11-scd[26682]: chan_0 -> OK PKCS#11 smart-card server for GnuPG ready
gnupg-pkcs11-scd[26682.2406156096]: processing connection
gnupg-pkcs11-scd[26682]: chan_0 <- GETINFO socket_name
gnupg-pkcs11-scd[26682]: chan_0 -> D /tmp/gnupg-pkcs11-scd.NeQexh/agent.S
gnupg-pkcs11-scd[26682]: chan_0 -> OK
gnupg-pkcs11-scd[26682]: chan_0 <- LEARN
gnupg-pkcs11-scd[26682]: chan_0 -> S SERIALNO D2760001240111503131171B486F1111
gnupg-pkcs11-scd[26682]: chan_0 -> S APPTYPE PKCS11
S SERIALNO D2760001240111503131171B486F1111
gnupg-pkcs11-scd[26682]: chan_0 -> S KEY-FRIEDNLY 5780C7B3D0186C21C8C4503DDA7641FC71FD9B54 /CN=gpg.intern.stafnet.local on UserPIN (SmartCard-HSM)
gnupg-pkcs11-scd[26682]: chan_0 -> S CERTINFO 101 www\x2ECardContact\x2Ede/PKCS\x2315\x20emulated/DECM0102330/UserPIN\x20\x28SmartCard\x2DHSM\x29/47490CAA5589D5B95E2067C5BC49B03711B854DA
gnupg-pkcs11-scd[26682]: chan_0 -> S KEYPAIRINFO 5780C7B3D0186C21C8C4503DDA7641FC71FD9B54 www\x2ECardContact\x2Ede/PKCS\x2315\x20emulated/DECM0102330/UserPIN\x20\x28SmartCard\x2DHSM\x29/47490CAA5589D5B95E2067C5BC49B03711B854DA
gnupg-pkcs11-scd[26682]: chan_0 -> OK
S KEY-FRIEDNLY 5780C7B3D0186C21C8C4503DDA7641FC71FD9B54 /CN=gpg.intern.stafnet.local on UserPIN (SmartCard-HSM)
S CERTINFO 101 www\x2ECardContact\x2Ede/PKCS\x2315\x20emulated/DECM0102330/UserPIN\x20\x28SmartCard\x2DHSM\x29/47490CAA5589D5B95E2067C5BC49B03711B854DA
S KEYPAIRINFO 5780C7B3D0186C21C8C4503DDA7641FC71FD9B54 www\x2ECardContact\x2Ede/PKCS\x2315\x20emulated/DECM0102330/UserPIN\x20\x28SmartCard\x2DHSM\x29/47490CAA5589D5B95E2067C5BC49B03711B854DA
gnupg-pkcs11-scd[26682]: chan_0 <- RESTART
gnupg-pkcs11-scd[26682]: chan_0 -> OK
gnupg-pkcs11-scd[26682]: chan_0 <- [eof]
gnupg-pkcs11-scd[26682.2406156096]: post-processing connection
gnupg-pkcs11-scd[26682.2406156096]: accepting connection
gnupg-pkcs11-scd[26682.2406156096]: cleanup connection
gnupg-pkcs11-scd[26682.2406156096]: Terminating
gnupg-pkcs11-scd[26682.2369189632]: Thread command terminate
gnupg-pkcs11-scd[26682.2369189632]: Cleaning up threads

Import the key into GPG

$ gpg --expert --full-generate-key
gpg (GnuPG) 2.2.19; Copyright (C) 2019 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Please select what kind of key you want:
   (1) RSA and RSA (default)
   (2) DSA and Elgamal
   (3) DSA (sign only)
   (4) RSA (sign only)
   (7) DSA (set your own capabilities)
   (8) RSA (set your own capabilities)
   (9) ECC and ECC
  (10) ECC (sign only)
  (11) ECC (set your own capabilities)
  (13) Existing key
  (14) Existing key from card
Your selection? 13

Use the hex string from the KEY-FRIEDNLY line (5780C7B3D0186C21C8C4503DDA7641FC71FD9B54) as the keygrip.


List your key

$ gpg --list-keys
pub   rsa2048 2020-05-02 [SCE]
uid           [ultimate] gpg.intern.stafnet.local (signing key) <>


Create a test file.

$ echo "I'm boe." > /tmp/boe



Sign the test file.

$ gpg --sign /tmp/boe

Enter your pin code.

│ Please enter the PIN (PIN required for token 'SmartCard-HSM (UserPIN)' (try 0))  │
│ to unlock the card                                                               │
│                                                                                  │
│ PIN ____________________________________________________________________________ │
│                                                                                  │
│            <OK>                                                <Cancel>          │   


$ gpg --verify /tmp/boe.gpg
gpg: Signature made Sat 02 May 2020 12:16:48 PM CEST
gpg: Good signature from "gpg.intern.stafnet.local (signing key) <>" [ultimate]

Have fun…


April 30, 2020

When you need to quickly investigate a suspicious computer located thousands of kilometers away, or during a pandemic like the one we are facing these days, it can be critical to gain remote access to the computer, if only to perform basic investigations. Also, if the attacker did a clever job, he or she could be monitoring the processes running on the target. In that case, you should avoid using classic remote management tools like VNC, TeamViewer, etc.

The following computer is running a LanDesk process which indicates that it can be controlled remotely:


Also, if the suspicious computer is potentially under the control of the attacker, it could be interesting to not ring a bell by using classic tools. Today, web conferencing tools are very popular. Why not use them to gain remote access to start your investigations?

Via Zoom (but the feature is available in other tools too), any participant in a web conference can share his/her screen but also transfer control (mouse & keyboard) to a specific participant:


Now, you can download your favourite tools (event collector, memory dumper, etc.)… This technique has many advantages:

  • No need to reconfigure a firewall to allow incoming connections
  • There are chances that the web conferencing tool is already installed
  • From a forensic point of view, this has a small footprint: no new login events on the computer, no changes applied to the investigated computer.
  • You gain the same rights as the connected user (which can already be ‘administrator’ rights in some (bad) cases).

Back to Zoom: the free subscription limits the conference duration to 40 minutes, but that’s enough to launch some tasks on the remote computer. If the meeting is aborted, just start a new one. Everything you launched will keep running…

[The post Web Conferencing Tools Used for Forensic Investigations has been first published on /dev/random]

I published the following diary on “Collecting IOCs from IMAP Folder“:

I’ve plenty of subscriptions to “cyber security” mailing lists that generate a lot of traffic. Even if we try to get rid of emails, that’s a fact: email remains a key communication channel. Some mailing list posts contain interesting indicators of compromise. So, I searched for a nice way to extract them in an automated way (and to correlate them with other data). I did not find a ready-to-use solution that matched my requirements… [Read more]

[The post [SANS ISC] Collecting IOCs from IMAP Folder has been first published on /dev/random]

April 29, 2020

In this blog post, we will set up a CA with a SmartCard-HSM on Kali.

When you create an internal certificate authority for internal services, it’s important to protect the private key. When somebody with bad intentions gets access to the private key(s) of the signing certificate authority, it can be used to issue new certificates, which would enable man-in-the-middle attacks.



I use 3 smartcards; these SmartCard-HSMs were created with Device Key Encryption Key (DKEK) keys. This makes it possible to securely copy the private key to another smartcard. The backup and the Device Key Encryption Keys are stored on an encrypted USB stick. This USB stick is copied 3 times.

See my previous blog post on how to set up the SmartCard-HSM cards with Device Key Encryption Keys.

Air-gapped system to generate the private key

We create the private key on an air-gapped system. I use Kali Linux live as the operating system of this air-gapped system. Kali Linux live is a nice GNU/Linux distribution, normally used for pentesting, but it comes with the tools required to generate the private key and copy it to the backup smartcards (opensc, openssl).

CA host

The CA authority will run Centos 8 GNU/Linux.

Create the CA public/private key pair

Create the key pair

kali@kali:~$ pkcs11-tool --module --keypairgen --key-type rsa:2048 --label ca.intern.stafnet.local --login
Using slot 0 with a present token (0x0)
Key pair generated:
Private Key Object; RSA 
  label:      ca.intern.stafnet.local
  ID:         853222fd3b35a4fdf0346d05d9bbc86baa9be6ba
  Usage:      decrypt, sign, unwrap
  Access:     none
Public Key Object; RSA 2048 bits
  label:      ca.intern.stafnet.local
  ID:         853222fd3b35a4fdf0346d05d9bbc86baa9be6ba
  Usage:      encrypt, verify, wrap
  Access:     none


$ pkcs15-tool -D
Using reader with a card: Cherry GmbH SmartTerminal ST-2xxx [Vendor Interface] (******) 00 00
PKCS#15 Card [SmartCard-HSM]:
        Version        : 0
        Serial number  : ******
        Manufacturer ID:
        Flags          : 
Private RSA Key [ca.intern.stafnet.local]
        Object Flags   : [0x03], private, modifiable
        Usage          : [0x2E], decrypt, sign, signRecover, unwrap
        Access Flags   : [0x1D], sensitive, alwaysSensitive, neverExtract, local
        ModLength      : 2048
        Key ref        : 6 (0x06)
        Native         : yes
        Auth ID        : 01
        ID             : 853222fd3b35a4fdf0346d05d9bbc86baa9be6ba
        MD:guid        : e6d4cec1-0f7e-5517-f08c-de2ff317a475
Public RSA Key [ca.intern.stafnet.local]
        Object Flags   : [0x00]
        Usage          : [0x51], encrypt, wrap, verify
        Access Flags   : [0x02], extract
        ModLength      : 2048
        Key ref        : 0 (0x00)
        Native         : no
        ID             : 853222fd3b35a4fdf0346d05d9bbc86baa9be6ba
        DirectValue    : <present>


Mount the encrypted USB device

Find the encrypted USB devices to store the key backup.

root@kali:~# lsblk -o NAME,SIZE,VENDOR,SUBSYSTEMS | grep -i usb
sda    3.9G Imation  block:scsi:usb:pci

Mount the device.

root@kali:~# lsblk -o NAME,SIZE,VENDOR,SUBSYSTEMS | grep -i usb
sda    3.9G Imation  block:scsi:usb:pci
root@kali:~# cryptsetup luksOpen /dev/sda boe
Enter passphrase for /dev/sda: 
root@kali:~# mount /dev/mapper/boe /mnt

Backup the key pair

It’s always a good idea not to make the file world-readable, so we set the umask to 077.

kali@kali:/mnt/hsm$ umask 077
kali@kali:/mnt/hsm$ sc-hsm-tool --wrap-key ca.intern.stafnet.local --key-reference 6
Using reader with a card: Cherry GmbH SmartTerminal ST-2xxx [Vendor Interface] (21121745111568) 00 00
Enter User PIN : 
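As an aside, the effect of umask 077 is easy to verify on any Linux box (the demo file name below is arbitrary):

```shell
umask 077
rm -f /tmp/umask-demo            # start clean so touch creates the file
touch /tmp/umask-demo            # new file: mode 666 & ~077 = 600
stat -c '%a' /tmp/umask-demo     # prints 600
```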

Store the key pair to the other smartcards

kali@kali:/mnt/hsm$ sc-hsm-tool --unwrap-key ca.intern.stafnet.local --key-reference 6
Using reader with a card: Cherry GmbH SmartTerminal ST-2xxx [Vendor Interface] (21121745111568) 00 00
Wrapped key contains:
  Key blob
  Private Key Description (PRKD)
Enter User PIN : 

Key successfully imported


kali@kali:/mnt/hsm$ pkcs15-tool -D
Using reader with a card: Cherry GmbH SmartTerminal ST-2xxx [Vendor Interface] (*****) 00 00
PKCS#15 Card [SmartCard-HSM]:
        Version        : 0
        Serial number  : *****
        Manufacturer ID:
        Flags          : 
Private RSA Key [ca.intern.stafnet.local]
        Object Flags   : [0x03], private, modifiable
        Usage          : [0x2E], decrypt, sign, signRecover, unwrap
        Access Flags   : [0x1D], sensitive, alwaysSensitive, neverExtract, local
        ModLength      : 2048
        Key ref        : 6 (0x06)
        Native         : yes
        Auth ID        : 01
        ID             : 853222fd3b35a4fdf0346d05d9bbc86baa9be6ba
Public RSA Key [ca.intern.stafnet.local]
        Object Flags   : [0x00]
        Usage          : [0x51], encrypt, wrap, verify
        Access Flags   : [0x02], extract
        ModLength      : 2048
        Key ref        : 0 (0x00)
        Native         : no
        ID             : 853222fd3b35a4fdf0346d05d9bbc86baa9be6ba
        DirectValue    : <present>


CA Authority

My CA runs on a CentOS 8 GNU/Linux host. Most public CAs have a “root CA certificate” and an “intermediate CA certificate”. The root CA certificate is only used to sign intermediate certificates; the intermediate certificate is used to sign client certificates. I’ll only use a single-certificate setup. Some people will already find this overkill for a home setup :-)

Create the CA directory

Create the base directory for our new ca.

bash-4.4$ mkdir -p ~/ca/ca.intern.stafnet.local
bash-4.4$ cd ~/ca/ca.intern.stafnet.local

Create the sub directories.

bash-4.4$ mkdir certs crl newcerts private csr
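Note that openssl ca also expects a certificate database and a serial file in this directory (they are referenced later in [ CA_default ]); a commonly used initialization, which I'm assuming here since the post doesn't show it, would be:

```shell
cd "$(mktemp -d)"            # stand-in for ~/ca/ca.intern.stafnet.local
mkdir certs crl newcerts private csr
touch index.txt              # empty certificate database ($dir/index.txt)
echo 1000 > serial           # first serial number to issue ($dir/serial)
```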


Copy the default openssl.cnf

-bash-4.4$ cp /etc/pki/tls/openssl.cnf .
-bash-4.4$ vi openssl.cnf

CA section

The [ ca ] section is the starting point for openssl ca; default_ca is set to [ CA_default ].

[ ca ]
default_ca  = CA_default    # The default ca section


In the [ CA_default ] section, update dir to the path of your CA.

x509_extensions is set to usr_cert. This defines the attributes that are applied when a new certificate is issued.

[ CA_default ]

dir   = /home/staf/ca/ca.intern.stafnet.local    # Where everything is kept
certs   = $dir/certs    # Where the issued certs are kept
crl_dir   = $dir/crl    # Where the issued crl are kept
database  = $dir/index.txt  # database index file.
#unique_subject = no      # Set to 'no' to allow creation of
          # several certs with same subject.
new_certs_dir = $dir/newcerts   # default place for new certs.

certificate = $dir/cacert.pem   # The CA certificate
serial    = $dir/serial     # The current serial number
crlnumber = $dir/crlnumber  # the current crl number
          # must be commented out to leave a V1 CRL
crl   = $dir/crl.pem    # The current CRL
private_key = $dir/private/cakey.pem # The private key

x509_extensions = usr_cert    # The extensions to add to the cert

# Comment out the following two lines for the "traditional"
# (and highly broken) format.
name_opt  = ca_default    # Subject Name options
cert_opt  = ca_default    # Certificate field options

# Extension copying option: use with caution.
# copy_extensions = copy

# Extensions to add to a CRL. Note: Netscape communicator chokes on V2 CRLs
# so this is commented out by default to leave a V1 CRL.
# crlnumber must also be commented out to leave a V1 CRL.
# crl_extensions  = crl_ext

default_days  = 365     # how long to certify for
default_crl_days= 30      # how long before next CRL
default_md  = sha256    # use SHA-256 by default
preserve  = no      # keep passed DN ordering

# A few difference way of specifying how similar the request should look
# For type CA, the listed attributes must be the same, and the optional
# and supplied fields are just that :-)
policy    = policy_match

# For the CA policy
[ policy_match ]
countryName   = match
stateOrProvinceName = match
organizationName  = match
organizationalUnitName  = optional
commonName    = supplied
emailAddress    = optional


x509_extensions is set to usr_cert. This section defines the attributes that are applied when a new certificate is issued. Update attributes such as nsCaRevocationUrl if you want to use a CRL.

[ usr_cert ]

# These extensions are added when 'ca' signs a request.

# This goes against PKIX guidelines but some CAs do it and some software
# requires this to avoid interpreting an end user certificate as a CA.


# Here are some examples of the usage of nsCertType. If it is omitted
# the certificate can be used for anything *except* object signing.

# This is OK for an SSL server.
# nsCertType      = server

# For an object signing certificate this would be used.
# nsCertType = objsign

# For normal client use this is typical
# nsCertType = client, email

# and for everything including object signing:
# nsCertType = client, email, objsign

# This is typical in keyUsage for a client certificate.
# keyUsage = nonRepudiation, digitalSignature, keyEncipherment

# This will be displayed in Netscape's comment listbox.
nsComment     = "OpenSSL Generated Certificate"

# PKIX recommendations harmless if included in all certificates.

# This stuff is for subjectAltName and issuerAltname.
# Import the email address.
# subjectAltName=email:copy
# An alternative to produce certificates that aren't
# deprecated according to PKIX.
# subjectAltName=email:move

# Copy subject details
# issuerAltName=issuer:copy

nsCaRevocationUrl   = http://ca.intern.stafnet.local/crl.pem

# This is required for TSA certificates.
# extendedKeyUsage = critical,timeStamping

req section

The [ req ] section holds the settings for certificate signing requests. Update default_bits to an RSA key size of 4096. distinguished_name is set to req_distinguished_name; this defines the default values used when you create a CA signing request.

[ req ]
default_bits    = 4096
default_md    = sha256
default_keyfile   = privkey.pem
distinguished_name  = req_distinguished_name
attributes    = req_attributes
x509_extensions = v3_ca # The extensions to add to the self signed cert

# Passwords for private keys if not present they will be prompted for
# input_password = secret
# output_password = secret

# This sets a mask for permitted string types. There are several options.
# default: PrintableString, T61String, BMPString.
# pkix   : PrintableString, BMPString (PKIX recommendation before 2004)
# utf8only: only UTF8Strings (PKIX recommendation after 2004).
# nombstr : PrintableString, T61String (no BMPStrings or UTF8Strings).
# MASK:XXXX a literal mask value.
# WARNING: ancient versions of Netscape crash on BMPStrings or UTF8Strings.
string_mask = utf8only

# req_extensions = v3_req # The extensions to add to a certificate request


[ req_distinguished_name ] defines the default values for a CA request. Update the settings with your country, organization, etc.

[ req_distinguished_name ]
countryName     = Country Name (2 letter code)
countryName_default   = BE
countryName_min     = 2
countryName_max     = 2

stateOrProvinceName   = State or Province Name (full name)
stateOrProvinceName_default = Antwerp

localityName      = Locality Name (eg, city)
localityName_default    = Antwerp

0.organizationName    = Organization Name (eg, company)
0.organizationName_default  = stafnet.local

# we can do this but it is not needed normally :-)
#1.organizationName   = Second Organization Name (eg, company)
#1.organizationName_default = World Wide Web Pty Ltd

organizationalUnitName    = Organizational Unit Name (eg, section)
organizationalUnitName_default  = intern.stafnet.local

commonName      = Common Name (eg, your name or your server\'s hostname)
commonName_max      = 64

emailAddress      = Email Address
emailAddress_max    = 64

# SET-ex3     = SET extension number 3

[ req_attributes ]
challengePassword   = A challenge password
challengePassword_min   = 4
challengePassword_max   = 20

unstructuredName    = An optional company name

Create the CA certificate

Get the keypair id

bash-4.4$ pkcs15-tool -D
Private RSA Key [ca.intern.stafnet.local]
	Object Flags   : [0x3], private, modifiable
	Usage          : [0x2E], decrypt, sign, signRecover, unwrap
	Access Flags   : [0x1D], sensitive, alwaysSensitive, neverExtract, local
	ModLength      : 2048
	Key ref        : 6 (0x6)
	Native         : yes
	Auth ID        : 01
	ID             : 853222fd3b35a4fdf0346d05d9bbc86baa9be6ba
	MD:guid        : 03580e77-ebb8-48f3-cebb-bae4bc9ff34a

Create the CA certificate

Use pkcs15-tool -D to find the ID of the keypair. You’ll find this ID twice, as it belongs to a public/private keypair.

-bash-4.4$ openssl req -config openssl.cnf -engine pkcs11 -new -x509 -days 1095 -keyform engine -key 853222fd3b35a4fdf0346d05d9bbc86baa9be6ba -out cacert.pem
engine "pkcs11" set.
Enter PKCS#11 token PIN for UserPIN (SmartCard-HSM):
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
Country Name (2 letter code) [BE]:
State or Province Name (full name) [Antwerp]:
Locality Name (eg, city) [Antwerp]:
Organization Name (eg, company) [stafnet.local]:
Organizational Unit Name (eg, section) [intern.stafnet.local]:
Common Name (eg, your name or your server's hostname) []:ca.intern.stafnet.local
Email Address []:
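To sanity-check the new CA certificate, openssl x509 can print its subject, issuer and validity window. The sketch below uses a throwaway software key instead of the HSM keypair so it can be tried anywhere; all file names and the subject are illustrative, not part of the setup above.

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d) && cd "$tmp"
# Throwaway software key standing in for the HSM keypair (illustration only)
openssl genrsa -out cakey.pem 2048
# Self-signed CA certificate with a subject like the one entered above
openssl req -new -x509 -days 1095 -key cakey.pem -out cacert.pem \
  -subj "/C=BE/ST=Antwerp/L=Antwerp/O=stafnet.local/OU=intern.stafnet.local/CN=ca.intern.stafnet.local"
# Inspect subject, issuer and validity window
openssl x509 -in cacert.pem -noout -subject -issuer -dates
```

In the real CA directory the same inspection is just openssl x509 -in cacert.pem -noout -subject -dates.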

Create a client certificate

Create the private key

Create the private key for the client certificate. When you specify -aes256, the key will be encrypted with a passphrase.

-bash-4.4$ openssl genrsa -aes256 -out private/client001.key 4096
Generating RSA private key, 4096 bit long modulus (2 primes)
e is 65537 (0x010001)
Enter pass phrase for private/client001.key:
Verifying - Enter pass phrase for private/client001.key:
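When scripting this, the interactive prompts can be avoided with -passout/-passin. A minimal sketch in a scratch directory; the passphrase and file name are illustrative:

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d) && cd "$tmp"
# Same key generation as above, but with the passphrase supplied non-interactively
openssl genrsa -aes256 -passout pass:demo1234 -out client001.key 4096
# Sanity-check the encrypted key; should print "RSA key ok"
openssl rsa -in client001.key -passin pass:demo1234 -noout -check
```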

Create the certificate signing request

bash-4.4$ openssl req -config ./openssl.cnf -key private/client001.key -out csr/client001.csr -new -nodes
Enter pass phrase for ca/private/client001.key:
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
Country Name (2 letter code) [BE]:
State or Province Name (full name) [Antwerp]:
Locality Name (eg, city) [Antwerp]:
Organization Name (eg, company) [stafnet.local]:
Organizational Unit Name (eg, section) [intern.stafnet.local]:
Common Name (eg, your name or your server's hostname) []:testcert.intern.stafnet.local
Email Address []:

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
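Before signing, the CSR can be inspected and its self-signature verified with openssl req. A self-contained sketch using a software key and an illustrative subject:

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d) && cd "$tmp"
# Throwaway key and CSR mirroring the request above
openssl genrsa -out client001.key 2048
openssl req -new -key client001.key -out client001.csr \
  -subj "/C=BE/ST=Antwerp/O=stafnet.local/OU=intern.stafnet.local/CN=testcert.intern.stafnet.local"
# Check the self-signature on the request and print its subject
openssl req -in client001.csr -noout -verify -subject
```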

Create the index, serial and crlnumber

Create the index file; this will hold the list (table of contents) of issued and revoked certificates. serial holds the serial number for certificates; the number increases each time a certificate is issued. crlnumber holds the serial number for the certificate revocation list; the number increases each time a certificate is revoked.

-bash-4.4$ touch index.txt
-bash-4.4$ echo 01 > serial
-bash-4.4$ echo 01 > crlnumber

Sign the test cert

-bash-4.4$ openssl ca -config ./openssl.cnf -engine pkcs11 -keyform engine -keyfile 853222fd3b35a4fdf0346d05d9bbc86baa9be6ba -cert cacert.pem -out certs/client001.crt -infiles csr/client001.csr
engine "pkcs11" set.
Using configuration from ./openssl.cnf
Enter PKCS#11 token PIN for UserPIN (SmartCard-HSM):
Check that the request matches the signature
Signature ok
Certificate Details:
        Serial Number: 1 (0x1)
            Not Before: Apr 28 17:30:10 2020 GMT
            Not After : Apr 28 17:30:10 2021 GMT
            countryName               = BE
            stateOrProvinceName       = Antwerp
            organizationName          = stafnet.local
            organizationalUnitName    = intern.stafnet.local
            commonName                = testcert.intern.stafnet.local
        X509v3 extensions:
            X509v3 Basic Constraints: 
            Netscape Comment: 
                OpenSSL Generated Certificate
            X509v3 Subject Key Identifier: 
            X509v3 Authority Key Identifier: 

            Netscape CA Revocation Url: 
Certificate is to be certified until Apr 28 17:30:10 2021 GMT (365 days)
Sign the certificate? [y/n]:y

1 out of 1 certificate requests certified, commit? [y/n]y
Write out database with 1 new entries
Data Base Updated
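Once signed, the certificate should chain back to the CA, which openssl verify confirms. The sketch below builds a throwaway CA and client certificate entirely in software (openssl x509 -req stands in for the openssl ca run above; all names are illustrative):

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d) && cd "$tmp"
# Throwaway CA (software key, no HSM) -- stand-in for cacert.pem above
openssl genrsa -out cakey.pem 2048
openssl req -new -x509 -days 1095 -key cakey.pem -out cacert.pem \
  -subj "/CN=ca.intern.stafnet.local"
# Client key + CSR, then sign the request with the CA key
openssl genrsa -out client001.key 2048
openssl req -new -key client001.key -out client001.csr \
  -subj "/CN=testcert.intern.stafnet.local"
openssl x509 -req -in client001.csr -CA cacert.pem -CAkey cakey.pem \
  -CAcreateserial -days 365 -out client001.crt
# Verify that the issued certificate chains back to the CA
openssl verify -CAfile cacert.pem client001.crt
```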



Create the crl

-bash-4.4$ openssl ca -config openssl.cnf -engine pkcs11 -keyform engine -keyfile 853222fd3b35a4fdf0346d05d9bbc86baa9be6ba -gencrl -out crl/ca.intern.stafnet.local.crl
engine "pkcs11" set.
Using configuration from openssl.cnf
Enter PKCS#11 token PIN for UserPIN (SmartCard-HSM):


-bash-4.4$ openssl crl -in crl/ca.intern.stafnet.local.crl -text -noout
Certificate Revocation List (CRL):
        Version 2 (0x1)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: C = BE, ST = Antwerp, L = Antwerp, O = stafnet.local, OU = intern.stafnet.local, CN = ca.intern.stafnet.local
        Last Update: Apr 28 17:58:28 2020 GMT
        Next Update: May 28 17:58:28 2020 GMT
        CRL extensions:
            X509v3 CRL Number: 
No Revoked Certificates.
    Signature Algorithm: sha256WithRSAEncryption


-bash-4.4$ openssl ca -config openssl.cnf -engine pkcs11 -keyform engine -keyfile 853222fd3b35a4fdf0346d05d9bbc86baa9be6ba -revoke certs/client001.crt
engine "pkcs11" set.
Using configuration from openssl.cnf
Enter PKCS#11 token PIN for UserPIN (SmartCard-HSM):
Revoking Certificate 01.
Data Base Updated

Recreate the crl

-bash-4.4$ openssl ca -config openssl.cnf -engine pkcs11 -keyform engine -keyfile 853222fd3b35a4fdf0346d05d9bbc86baa9be6ba -gencrl -out crl/ca.intern.stafnet.local.crl
engine "pkcs11" set.
Using configuration from openssl.cnf
Enter PKCS#11 token PIN for UserPIN (SmartCard-HSM):


-bash-4.4$ openssl crl -in crl/ca.intern.stafnet.local.crl -text -noout
Certificate Revocation List (CRL):
        Version 2 (0x1)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: C = BE, ST = Antwerp, L = Antwerp, O = stafnet.local, OU = intern.stafnet.local, CN = ca.intern.stafnet.local
        Last Update: Apr 28 18:06:49 2020 GMT
        Next Update: May 28 18:06:49 2020 GMT
        CRL extensions:
            X509v3 CRL Number: 
Revoked Certificates:
    Serial Number: 01
        Revocation Date: Apr 28 18:01:25 2020 GMT
    Signature Algorithm: sha256WithRSAEncryption
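The full revoke-and-check cycle can be rehearsed end to end with a throwaway software CA and a minimal configuration. No HSM is involved and every name below is illustrative; the minimal config only stands in for the full openssl.cnf discussed above.

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d) && cd "$tmp"
mkdir newcerts
touch index.txt
echo 01 > serial
echo 01 > crlnumber
# Minimal openssl ca configuration (stand-in for the full openssl.cnf)
cat > mini.cnf <<'EOF'
[ ca ]
default_ca = CA_default
[ CA_default ]
dir = .
database = $dir/index.txt
new_certs_dir = $dir/newcerts
serial = $dir/serial
crlnumber = $dir/crlnumber
certificate = $dir/cacert.pem
private_key = $dir/cakey.pem
default_md = sha256
default_days = 365
default_crl_days = 30
policy = policy_any
x509_extensions = usr_cert
[ policy_any ]
commonName = supplied
[ usr_cert ]
basicConstraints = CA:FALSE
[ v3_ca ]
basicConstraints = critical,CA:true
subjectKeyIdentifier = hash
[ req ]
distinguished_name = dn
[ dn ]
EOF
# Throwaway CA and client certificate
openssl genrsa -out cakey.pem 2048
openssl req -new -x509 -key cakey.pem -out cacert.pem \
  -subj "/CN=ca.test" -config mini.cnf -extensions v3_ca
openssl genrsa -out client.key 2048
openssl req -new -key client.key -out client.csr \
  -subj "/CN=client.test" -config mini.cnf
openssl ca -batch -config mini.cnf -in client.csr -out client.crt
# Revoke it and regenerate the CRL, as in the transcript above
openssl ca -config mini.cnf -revoke client.crt
openssl ca -config mini.cnf -gencrl -out crl.pem
# Verify against CA cert + CRL; this should report the cert as revoked
cat cacert.pem crl.pem > chain.pem
openssl verify -crl_check -CAfile chain.pem client.crt || true
```

The last command should fail with a “certificate revoked” error, which is exactly what a CRL-aware client would report.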


Every Sunday, I send out a newsletter called cron.weekly to over 10,000 subscribers. In this post, I’ll do a deep-dive into how those mails get delivered to subscribers.
This post will show you how to use Bitbucket Pipelines to build and test your Laravel Project in a docker container.

April 28, 2020

In my DrupalCon Amsterdam keynote last October, I shared my thoughts on what I believe is important for the future of Drupal.

Today it is your turn.

After every major release of Drupal, we distribute a survey to get the community's feedback on what to focus on for future releases of Drupal.

The last time we conducted such a survey was four years ago, after the release of Drupal 8. The results were the basis for defining Drupal 8 product initiatives, organized on the Drupal 8 mountain that I've been using for the past few years.

A sample of the last product survey showing that we should focus on content authors
An example result from the 2016 Drupal product survey. The result shows that in 2016 we decided to focus on "content authors" as the most important persona. Since that survey, we improved Drupal's authoring workflows, media management, layout building, and more.

In a similar way, this new survey will help lay the groundwork for the strategic direction of the Drupal project over the next 3–5 years.

👉 Take the 2020 Drupal product survey here. The survey takes about 15 minutes to complete.

We'd like to hear from everyone who cares about Drupal: content managers, site owners, site builders, module developers, front-end developers, people selling Drupal, and anyone else who touches the Drupal project in any capacity. Whether you are a Drupal expert or just getting started with Drupal, every voice counts!

I will be presenting the results at the next DrupalCon, as well as sharing the results on my blog. DrupalCon North America was originally planned to be in Minneapolis at the end of May, but because of COVID-19, it moved to a global, virtual DrupalCon in July. Join us at DrupalCon, or if you can't, consider subscribing to my blog to get the results via email. I look forward to hearing from you about how we should create the future of Drupal together.

To create the survey, I worked with Brian Prue, Gábor Hojtsy, Angela Byron and Matthew Grasmick. Initial versions were improved based on test runs with over a dozen people in the Drupal community. Thank you to all of them for their feedback along the way.

April 27, 2020

I published the following diary on “Powershell Payload Stored in a PSCredential Object“:

An interesting obfuscation technique to store a malicious payload in a PowerShell script: In a PSCredential object! The PSCredential class can be used to manage credentials in a centralized way. Just have a look at this example. First, let’s encrypt our strong password… [Read more]

[The post [SANS ISC] Powershell Payload Stored in a PSCredential Object has been first published on /dev/random]

Introduction: Some members of the cloud-native computing world have recently been trying to address the areas where Kubernetes falls short when running in resource-constrained environments.

April 26, 2020

Here’s a quick example of setting up a custom 404 landing page if you use Caddy V2 to serve static sites, like this blog.
Here’s what I’ve been doing for most of my life, when it comes to writing crontab entries:
Deploying a Linux server is really easy these days thanks to the abundance of cloud server providers like Linode, Digital Ocean etc.

April 24, 2020

I published the following diary on “Malicious Excel With a Strong Obfuscation and Sandbox Evasion“:

For a few weeks, we have seen a bunch of Excel documents spread in the wild with Macro V4. But VBA macros remain a classic way to drop the next stage of the attack on the victim’s computer. The attacker has many ways to fetch the next stage. He can download it from a compromised server, a public service, or any other service that allows sharing content. The problem is, in this case, that it generates more noise via new network flows and the attack depends on the reactivity of the other party to clean up the malicious content. If this happens, the macro won’t be able to fetch the data and the infection will fail. The other approach is to store the payload in the document metadata, in the document itself or appended to it… [Read more]

[The post [SANS ISC] Malicious Excel With a Strong Obfuscation and Sandbox Evasion has been first published on /dev/random]

April 23, 2020

lookat 2.0.1

“lookat” (or “bekijk” in Dutch) is a program to view text files and manual pages. It is designed to be more user-friendly than conventional text viewers such as less, and it supports colored man pages.

Lookat 2.0.1 is the latest stable release of Lookat/Bekijk.


  • BUGFIX: corrected screen refresh code to handle non-UTF-8 terminals correctly.
  • BUGFIX: ensure that menus are initialized before using them.
  • BUGFIX: corrected type menu handling.
  • BUGFIX: failed to open type-enabled extensions from the command line.

Lookat 2.0.1 is available at:

Have fun

Versterking nodig voor je organisatie? Het zijn solden!

Mijn partner is een administratieve superkracht:
  • vier jaar brede ervaring
  • viertalig (Nederlands C1, Engels C1, Lets C2, Russisch C1)
  • communicatief ijzersterk
  • heel vlot achter een computerscherm
  • leert pijlsnel
Je kan op haar beroep doen, ofwel via telewerk, ofwel in regio Leuven/ Zaventem/Brussel/Mechelen.

Voor lezers uit de openbare sector:
  • buitenlands bachelordiploma erkend door NARIC Vlaanderen
  • taalcertificaat B2 (KULeuven)

April 21, 2020

Hundreds of blue hearts

Last week, Vanessa and I pledged to match $100,000 in individual contributions to the Drupal Association.

Then today, 29 organizations in the Drupal community pledged another $100,000 to match Vanessa's and my pledge!

For every $1 you donate, the Drupal Association now gets $3. 🚀

So far 700 people have contributed roughly $72,000 to #DrupalCares. Because both matching gifts apply to all individual donations or memberships from the start of the campaign, we're up to $216,000 (3 x $72,000).

It's been heartwarming to see that so many have stepped up to provide support.

The various creative responses have also been amazing. One of my favorite examples is Gábor Hojtsy's: Gábor will donate €9 for every Drupal module that gets updated to Drupal 9. Best of all, his €9 donation will also be matched 2-for-1. For every module that gets updated in the next two weeks, €27 could get fundraised for the Drupal Association. Since Gábor announced his challenge 5 days ago, 65 modules have been updated already. It raises money for the Drupal Association and it helps the Drupal community prepare for the Drupal 9 release later this year. Such a clever idea and what a difference it can make!

Thank you again to all of the individuals and organizations who have graciously donated to the Drupal Association so far. Every contribution matters and means so much! We hope those of you reading this will join us in donating or launching creative campaigns.

A window of a shop with a 'shop local' sign.

In the small town where I live, some of the local businesses have "shop local" signs on their windows, and even out on the curbs. They are reminders to support local businesses instead of going to large supermarkets and department stores.

There are three important things that we know for a fact:

  1. When we support local businesses, we know that they are investing a portion of their profits back into our communities — benefits flow to local employees, the local baker, the local farmers, etc.
  2. We also know that shoppers need sustained public education of why to "shop local", hence the many signs.
  3. Lastly, we know that this education pays off: by making a small change in their shopping habits, shoppers can make a huge impact on their local economies.

All of this applies to Open Source: we need sustained public education to encourage end users to "shop local" and to support those organizations and individuals that contribute to Open Source. This has a real, meaningful impact on Open Source sustainability and growth.

The Open Source dividend

A sign that reads 'Open Source dividends'

Open Source ecosystems have a built-in dividend system. The concept is straightforward:

  • Every commercial project that is awarded to an Open Source Maker (an organization that contributes back) results in contributions to the underlying Open Source project as a whole. Some companies give back more than 5% of their revenues by contributing "time, talent or treasure".
  • In contrast, every commercial project that goes to an Open Source Taker (an organization that doesn’t contribute back) results in little to no contribution or benefit back to the community, or a 0% dividend.

Open Source projects can grow faster when you funnel commercial work to Makers, and away from Takers. It's an Open Source dividend system. By promoting those who give back, we maximize the total amount of contribution, to the benefit of the Open Source community at large.

End users of Open Source software can help maximize the Open Source dividend by working with implementation partners that give back to Open Source. If more end users of Open Source took this stance, it would have a massive impact on Open Source sustainability and innovation.

Large end users could even mandate Open Source contributions from their vendors. We have large-scale, institutionalized examples of this in the Drupal community: organizations like Pfizer and the State of Georgia made Open Source contribution a required vendor selection criteria and only work with agencies that contribute back to Drupal. (Sources: Pfizer, State of Georgia)

Last week, Acquia announced that it is buying advertising space for top Drupal contributors. It is the equivalent of putting "shop local" signs in the windows. This is just one of many ways we can promote Makers.

All of this follows the recommendations in my blog post Balancing Makers and Takers to scale and sustain Open Source. I’m excited that we are experimenting with ways to improve Open Source sustainability.

Earlier this month, GitLab B.V.'s package signing key expired, requiring them to rotate their key. This means that anyone who uses one of their packages needs to jump through a number of manual hoops to update their apt key configuration, which is an annoying manual process that also requires people to download random files from the Internet -- something extrepo was written to prevent. At least they're served over https, but still.

I didn't notice until today, but I just updated the extrepo metadata to carry the new key. That means that if you enable one of the GitLab repositories through extrepo enable, you will get the new key rather than the old one. On top of that, if you had already enabled the repository through extrepo, all that is needed for you right now to pull in the new key is to run extrepo update.

While I do apologise for the late update, hopefully this should make some people's lives a bit easier.

And if GitLab B.V. reads this: please send me an MR to the repository next time, so that we can get the process done in time ;-)

I spent the last week or so building Docker images and a set of YAML files that allows one to run SReview, my 99%-automated video review and transcode system, inside minikube, a program that sets up a mini Kubernetes cluster inside a VM for development purposes.

I wish the above paragraph would say "inside Kubernetes", but alas, unless your Kubernetes implementation has a ReadWriteMany volume that can be used from multiple nodes, this is not quite the case yet. In order to fix that, I am working on adding an abstraction layer that will transparently download files from an S3-compatible object store; but until that is ready, this work is not yet useful for large installations.

But that's fine! If you want to run SReview for a small conference, you can do so with minikube. It won't have the redundancy and reliability that a proper Kubernetes setup provides, but then you don't really need that for a conference of a few days.

Here's what you do:

  • Download minikube (see the link above)
  • Run minikube start, and wait for it to finish
  • Run minikube addon enable ingress
  • Clone the SReview git repository
  • From the toplevel of that repository, run perl -I lib scripts/sreview-config -a dump|sensible-pager to see an overview of the available configuration options.
  • Edit the file dockerfiles/kube/master.yaml to add your configuration variables, following the instructions near the top
  • Once the file is configured to your liking, run kubectl apply -f master.yaml -f storage-minikube.yaml
  • Add to /etc/hosts, and have it point to the output of minikube ip.
  • Create preroll and postroll templates, and download them to minikube in the location that the example config file suggests. Hint: minikube ssh has wget.
  • Store your raw recorded assets under /mnt/vda1/inputdata, using the format you specified for the $inputglob and $parse_re configuration values.
  • Profit!

This doesn't explain how to add a schedule to the database. My next big project (which probably won't happen until after the next FOSDEM) is to add a more advanced administrator's interface, so that you can just log in and add things from there. For now, though, you have to run kubectl port-forward svc/sreview-database 5432, and then use psql to localhost to issue SQL commands. Yes, that sucks.

Having said that, if you're interested in trying this out, give it a go. Feedback welcome!

(many thanks to the people on the #debian-devel IRC channel for helping me understand how Kubernetes is supposed to work -- wouldn't have worked nearly as nice without them)

It’s good to be reminded of the fact that the internet is, in fact, a pretty hostile place.
I want to try something new for this blog: guest writing. If you have an idea for a Linux/Server/Dev guide or a strong (but interesting) opinion on technology and its implications, I would love to hear from you.

April 20, 2020

Whenever I want to give people career advice, I inevitably start saying the things that worked for me.

April 18, 2020

The popular web conference platform Zoom has been in the eye of the storm for a few weeks. With the COVID-19 pandemic, more and more people are working from home and the demand for web conference tools has been growing. Vulnerabilities have been discovered in the Zoom client and, because many meetings were not properly secured, a new type of attack was also detected: Zoom bombing.

Also keep in mind that Zoom, by design, allows attendees to exchange information via a chat session. Another vulnerability that was discovered leaked Windows credentials via UNC links. Recently, it appeared that 500K Zoom users‘ credentials were for sale on the dark web. Bad days for Zoom!

To protect against these issues, many sites started to explain how to harden your Zoom meetings, for example by defining a password to join all meetings:

Click to enlarge

Good, you changed your default settings and you’re now safe. Are you sure? When you schedule a Zoom meeting, chances are that you will send invites to attendees. Meeting requests are sent in the standard “ICS” format, compatible with most calendar applications. These files are attached to the invitation or can be downloaded via URLs. Consider them as regular files and, today, files are made to be processed, scanned, indexed, etc…

I created a YARA rule on VT to search for such files. Calendar invitations have not only a common extension but a common name: ‘invite.ics’. Here is the simple rule:

rule calendar_zoom
{
    strings:
        $s1 = "Join Zoom Meeting"
    condition:
        any of ($s*) and new_file and file_name == "invite.ics"
}

I added the rule yesterday and have already collected 92 invitations. Detecting files in real time is very nice because chances are that the meeting will take place in the future or is recurring! And all the details are available in the ICS files:

DESCRIPTION:xxx is inviting you to a scheduled Zoom meeting.\n\n\n\n
Join Zoom Meeting\n\n\n\n\n\n
Meeting ID: xxx xxx xxxx\n\nPassword: xxxxxx\n\n\n\n

We have the meeting URL, ID and… the password! We also have more interesting details like the organizer’s details:

ORGANIZER;CN=xxx xxx:mailto:xxx@xxx

When the meeting is scheduled:


But also attendees:

CN=xxx xxx
CN=xxx xxx

I found email addresses with very, very interesting domain names from big companies and international organizations! This information is perfect to perform a social engineering attack. If an attendee receives an email from a Zoom event organizer with, for example, a malicious attachment, guess what he/she will do?
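Extracting those fields from an invite.ics is a one-liner per field. A sketch with a fabricated invite, since real ones contain sensitive data; all values below are fake:

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d) && cd "$tmp"
# Fabricated invite.ics with the fields discussed above (values are fake)
cat > invite.ics <<'EOF'
BEGIN:VCALENDAR
ORGANIZER;CN=John Doe:mailto:john@example.com
DESCRIPTION:John Doe is inviting you to a scheduled Zoom meeting.\n
 Join Zoom Meeting\nMeeting ID: 123 456 7890\nPassword: s3cret\n
END:VCALENDAR
EOF
# Pull out the organizer, meeting ID and password
grep -Eo "ORGANIZER;CN=[^:]+" invite.ics
grep -Eo "Meeting ID: [0-9 ]+" invite.ics
grep -Eo "Password: [[:alnum:]]+" invite.ics
```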

I also performed a retro hunt on VirusTotal and found 6000+ invite.ics files with Zoom meeting details! They are less relevant because most of them are outdated.

What about Google? Can we also find invitations indexed by the search engine? I tried some Google searches:

“Join Zoom Meeting” filetype:txt : 137 results

“Join Zoom Meeting” inurl:ics : 74 results

I found in my Zoom account settings that it’s possible to disable the generation of passwords for a single-click meeting:

Click to enlarge

The generated URL will be: ‘’ with just the meeting ID. But the password is still sent in the invitation. It could be interesting to have an option to *NOT* send the password via this channel but via an alternative one…

Based on the files that I checked, 50% of the meetings do not have a password set…

[The post Finding Zoom Meeting Details in the Wild has been first published on /dev/random]

April 17, 2020

I published the following diary on “Weaponized RTF Document Generator & Mailer in PowerShell“:

Another piece of malicious PowerShell script that I found while hunting. Like many malicious activities these days, it is related to the COVID-19 pandemic. Its purpose is simple: it checks if Outlook is used by the victim and, if so, it generates a malicious RTF document that is spread to all contacts extracted from Outlook. Let’s have a look at it… [Read more]

[The post [SANS ISC] Weaponized RTF Document Generator & Mailer in PowerShell has been first published on /dev/random]

April 16, 2020

Autoptimize 2.7 is in the final stages and the beta version can as of now be downloaded from

Main new features:

So, who wants to test? Feedback in the comments please! :-)

Today, Acquia announced how it will support Drupal and the Drupal Association in these challenging times. In his blog post, Mike Sullivan, Acquia's CEO, lists Acquia's many contributions to Drupal.

One contribution that I'm very excited about is the last one on Mike's list: Advertising for top contributors. Acquia will buy advertising on, and use said advertising to highlight the 10 Acquia partners that contribute to Drupal the most.

These partners will have the option to advertise their expertise and use cases on industry pages or feature pages for the remainder of 2020. Given that sees two million unique visitors and 15 million page views per month, it should drive more awareness and new business to these valued Drupal contributors.

It's a win-win-win-win: the Drupal Association gets advertising revenue, Drupal's top contributors get more business, and Drupal can grow faster. All of this is obviously good for Acquia as well. It's a nice way to give back.

Promoting the companies that contribute to Drupal is always a good idea, but in an economic downturn, it's even more important. I'm proud that Acquia continues to support Drupal in a big way, and I'm excited that Acquia is experimenting with ways to promote some of Drupal's top contributors.

I’m working on some new versions of PHP packages, and for that I’m working off a temporary branch.

April 15, 2020

Over the last 3-4 years, I’ve been learning more about the inner workings of Bitcoin and all the tech & finance surrounding it.

April 14, 2020

Dozens of blue hearts

Vanessa and I have been talking about how we can help contribute to the Drupal Association, and by extension the Drupal community, in these uncertain times. We want to show our support given Drupal's positive impact on our lives.

We have decided to donate a minimum of $44,000 to match all 485 individual donations so far. We will also continue to match new donations up to $100,000.

In order for your donation to be matched, it needs to be an individual donation or an individual membership, not an organizational contribution, and it needs to be donated by April 30, 2020.

Together, we can provide $200,000 to #DrupalCares. Our total fundraising goal is $500,000 so $200,000 coming from individuals would be an incredible start, and will help us raise the remaining $300,000.

We hope you join us in donating.

Be kind to others and help where you can! 💙

April 11, 2020

If you read this post through Planet Debian, then you may already know this through "Johnathan"'s post on the subject: on the 29th of February this year, Tammy and I got married. Yes. Really. No, I didn't expect this to happen myself about five years ago, but here we are.

Tammy and I met four years ago at DebConf16 in Cape Town, South Africa, where she was a local organizer. If you were at dc16, you may remember the beautifully designed conference stationery, T-shirts, and bag; this was all her work. In addition, the opening and closing credits on the videos of that conference were designed by her.

As it happens, that's how we met. I've been a member of the DebConf video team since about 2010, when I first volunteered to handle a few cameras. In 2015, since someone had to do it, I installed Carl's veyepar on Debian's on-site servers, configured it, and ran it. I've been in charge of the post-processing infrastructure -- first using veyepar, later using my own SReview -- ever since. So when I went to the DebConf organizers in 2016 to ask for preroll and postroll templates for the videos, they pointed me to Tammy; and we haven't quite stopped talking since.

I can speak from experience now when saying that a long-distance relationship is difficult. My previous place of residence, Mechelen, is about 9600km away from Cape Town, and so just seeing Tammy required about a half month's worth of pay -- not something I always have to spare. But after a number of back-and-forth visits and a lot of paperwork, I have now been living in Cape Town for just over a year. Obviously, this has simplified things a lot.

The only thing left for me to do now is to train my brain to stop thinking there's something stuck on my left ring finger. It's really meant to be there, after all...

April 10, 2020

I published the following diary on “PowerShell Sample Extracting Payload From SSL“:

Another diary, another technique to fetch a malicious payload and execute it on the victim host. I spotted this piece of PowerShell code this morning while reviewing my hunting results. It implements a very interesting technique. As usual, all the code snippets below have been beautified. First, it implements a function to reverse obfuscated strings… [Read more]

[The post [SANS ISC] PowerShell Sample Extracting Payload From SSL has been first published on /dev/random]

I just migrated this webserver to Caddy 2 and, with it, enabled HTTP/3 support. This post will give a short explanation of how you can do that.
Last week, the first Release Candidate of Caddy 2 saw the light of day. I don’t usually like to run production environments on beta software, but for Caddy I wanted to make an exception.
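For reference, a minimal sketch of a Caddyfile that serves a site with HTTP/3 enabled; the hostname and document root are hypothetical, and `experimental_http3` is the global option used by early Caddy 2 releases:

```caddyfile
# Sketch only: hostname and root are placeholders.
{
	# Opt in to the (then experimental) HTTP/3 listener.
	experimental_http3
}
	root * /var/www/
	file_server
}
```

Caddy still provisions TLS certificates automatically, which HTTP/3 requires since QUIC only runs over TLS.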

April 09, 2020

One of my sources of threat intelligence is a bunch of honeypots that I’m operating here and there. They are facing the wild Internet and, as you can imagine, they get hit by many “attackers”. But are they really bad people? Of course, the Internet is full of bots tracking vulnerabilities 24 hours a day and 7 days a week. But many IP addresses that hit my honeypots are servers or devices that have just been compromised.

To get an idea of who’s behind these IP addresses, why not try to get a screenshot of the web services running on common ports (mainly 80, 81, 8080, 8088, 443, 8443, 8844, etc.)? Based on a list of 60K+ IP addresses, I used Nmap to collect screenshots of detected web services. Nmap has a nice script to automate this task.

Once the huge number of screenshots was generated, I looked for a way to display them nicely: a big “patchwork” of all the images. Here is the resulting image:

The final picture is quite big:

  • Contains 20,599 screenshots
  • 223 MB
  • 60,000 x 41,100 pixels

It can be zoomed in and out to browse the screenshots for fun:

To generate this patchwork image, I used a few lines of Python code together with the Pillow library. Images have been resized and “white” images removed (because the Nmap script does not work well with technologies like Flash).

I won’t share the full picture because it contains sensitive information like file listings, admin interfaces and much more. Based on the screenshots, many IP addresses run open interfaces, interfaces with default credentials, or unconfigured applications (like CMSes). Some screenshots reveal corporate organizations. This is probably due to NAT in place for egress traffic generated by compromised internal hosts. As you can imagine, some interfaces reveal critical services. Some examples?

If you’re interested in generating this kind of big image, the script I wrote is available on my GitHub repo. Don’t blame me for the quality of the code 😉

[The post Hey Scanners, Say “Cheese!” has been first published on /dev/random]

April 03, 2020

Bad guys are always trying to use “exotic” file extensions to deliver their malicious payloads. While common dangerous extensions are often blocked by mail security gateways, there are plenty of less common ones. These days, with the COVID19 pandemic, we are facing a peak of phishing and scams trying to lure victims. I spotted one that uses such an exotic extension: “DAA”.

“DAA” stands for “Direct-Access-Archive” and is a file format developed by Power Software for its toolbox PowerISO. This is not a brand new way to distribute malware; my friend Didier Stevens already wrote an Internet Storm Center diary about this file format. A DAA file can normally only be processed by PowerISO. This greatly restricts the number of potential victims because, today, Microsoft Windows is able to handle ISO files natively. So, how to handle a suspicious DAA file?

Fortunately, PowerISO has a command-line tool available for free (and statically compiled!). It helps to extract the content of DAA files. Let’s do it in a Docker container to avoid messing with your base OS…

xavier : /Volumes/MalwareZoo/20200401 $ ls Covid-19.001.daa
xavier : /Volumes/MalwareZoo/20200401 $ docker run -it --rm -v $(pwd):/data ubuntu bash
root@0c027d353187:/# cd /data
root@0c027d353187:/data# wget -q -O -|tar xzvf -
root@0c027d353187:/data# chmod a+x poweriso
root@0c027d353187:/data# ./poweriso extract Covid-19.001.daa / -od . 

PowerISO   Copyright(C) 2004-2008 PowerISO Computing, Inc
           Type poweriso -? for help

Extracting to ./Covid-19.001.exe ...   100%

root@0c027d353187:/data# file Covid-19.001.exe
Covid-19.001.exe: PE32 executable (GUI) Intel 80386, for MS Windows

Now you have the PE file and can go further with the analysis…

As you can see in the Copyright message, the tool is old (2008) but it works pretty well and deserves to be added to your personal reverse-engineering arsenal!

[The post Handling Malware Delivered Into .daa Files has been first published on /dev/random]

I published the following diary on “Obfuscated with a Simple 0x0A“:

With the current Coronavirus pandemic, we continue to see more and more malicious activity around this topic. Today, we got a report from a reader who found a nice malicious Word document that is part of a Coronavirus phishing campaign. I don’t know how the URL was distributed (probably via email) but the landing page is a fake White House-themed page. So, probably targeting US citizens… [Read more]

[The post [SANS ISC] Obfuscated with a Simple 0x0A has been first published on /dev/random]

March 29, 2020

March 27, 2020

When cached HTML links to deleted Autoptimized CSS/JS, the page is badly broken… but no more: a new (experimental) option in AO 2.7 uses fallback CSS/JS, which I just committed to the beta branch on GitHub.

For this purpose, Autoptimize hooks into template_redirect and will redirect to fallback Autoptimized CSS/JS if a request for an autoptimized file 404’s.

For cases where 404’s are not handled by WordPress but by Apache, AO adds an ErrorDocument directive in the .htaccess file redirecting to wp-content/autoptimize_404_handler.php. Users on NGINX or MS IIS or … might have to configure their webserver to redirect to wp-content/autoptimize_404_handler.php themselves, but those are smart cookies anyway, no?
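For NGINX, a minimal sketch of such a redirect could look like this (the cache path assumes Autoptimize’s default location; adjust it to your setup):

```nginx
# Sketch only: path assumes Autoptimize's default cache directory.
# Serve the fallback handler when an autoptimized file has been deleted.
location ~* /wp-content/cache/autoptimize/.*\.(css|js)$ {
    try_files $uri /wp-content/autoptimize_404_handler.php;
}
```

try_files serves the cached file if it still exists and only falls back to the 404 handler otherwise.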

If you want to test, you can download Autoptimize 2.7 beta here and replace 2.6 with it.

I published the following diary on “Malicious JavaScript Dropping Payload in the Registry“:

When we speak about “fileless” malware, it means that the malware does not use the standard filesystem to store temporary files or payloads. But they need to write data somewhere in the system for persistence or during the infection phase. If the filesystem is not used, the classic way to store data is to use the registry. Here is an example of a malicious JavaScript code that uses a temporary registry key to drop its payload (but it also drops files in a classic way)… [Read more]

[The post [SANS ISC] Malicious JavaScript Dropping Payload in the Registry has been first published on /dev/random]

March 26, 2020

I published the following diary on “Very Large Sample as Evasion Technique?“:

Security controls have a major requirement: they can’t (or at least they try to not) interfere with normal operations of the protected system. It is known that antivirus products do not scan very large files (or just the first x bytes) for performance reasons. Can we consider a very big file as a technique to bypass security controls? Yesterday, while hunting, I spotted a very interesting malware sample. The malicious PE file was delivered via multiple stages but the final dropped file was large… very large… [Read more]

[The post [SANS ISC] Very Large Sample as Evasion Technique? has been first published on /dev/random]

March 25, 2020

A blue heart

Today, I'm asking for your financial support for the Drupal Association. As we all know, we are living in unprecedented times, and the Drupal Association needs our help. With DrupalCon being postponed or potentially canceled, there will be a significant financial impact on our beloved non-profit.

Over the past twenty years, the Drupal project has weathered many storms, including financial crises. Every time, Drupal has come out stronger. As I wrote last week, I'm confident that Drupal and Open Source will weather the current storm as well.

While the future of Drupal and Open Source is not in doubt, the picture is not as clear for the Drupal Association.

Thirteen years ago, six years after I started Drupal, the Drupal Association was formed. As an Open Source non-profit, the Drupal Association's mission was to help grow and sustain the Drupal community. It still has that same mission today. The Drupal Association plays a critical role in Drupal's success: it manages, hosts Open Source collaboration tools, and brings the community together at events around the world.

The Drupal Association's biggest challenge in the current crisis is to figure out what to do about DrupalCon Minneapolis. The Coronavirus pandemic has caused the Drupal Association to postpone or perhaps even cancel DrupalCon Minneapolis.

With over 3,000 attendees, DrupalCon is not only the Drupal community's main event — it's also the most important financial lever to support the Drupal Association and the staff, services, and infrastructure they provide to the Drupal project. Despite efforts to diversify its revenue model, the Drupal Association remains highly dependent on DrupalCon.

No matter what happens with DrupalCon, there will be a significant financial impact on the Drupal Association. The Drupal Association is now in a position where it needs to find between $400,000 and $1.1 million USD, depending on whether we postpone or cancel the event.

In these trying times, the best of Drupal's worldwide community is already shining through. Some organizations and individuals proactively informed the Drupal Association that it could keep their sponsorship dollars or ticket price whether or not DrupalCon North America happens this year: Lullabot, Centarro, FFW,, Amazee Group and Contegix have come forward to pledge that they will not request a refund of their DrupalCon Minneapolis sponsorship, even if the event is cancelled. Acquia, my company, has joined this campaign as well, and will not request a refund of its DrupalCon sponsorship either.

These are great examples of forward-thinking leadership and action, and they are what makes our community so special. Not only do these long-time Drupal Association sponsors understand that the entire Drupal project benefits from the resources the Drupal Association provides for us — they also anticipated the financial needs the Drupal Association is working hard to understand, model and mitigate.

In order to preserve the Drupal Association, not just DrupalCon, more financial help is needed:

  • Consider making a donation to the Drupal Association.
  • Other DrupalCon sponsors can consider this year's sponsorship as a contribution and not seek a refund should the event be cancelled, postponed or changed.
  • Individuals can consider becoming a member, increasing their membership level, or submitting an additional donation.

I encourage everyone in the Drupal community, including our large enterprise users, to come together and find creative ways to help the Drupal Association and each other. All contributions are highly valued.

The Drupal Association is not alone. This pandemic has wreaked havoc not only on other technology conferences, but on many organizations' fundamental ability to host conferences at all moving forward.

I want to thank all donors, contributors, volunteers, the Drupal Association staff, and the Drupal Association Board of Directors for helping us work through this. It takes commitment, leadership and courage to weather any storm, especially a storm of the current magnitude. Thank you!