On International Women’s Day

Yet another international women’s day is here. This is marked, world over, with events in various forms. From honouring women and recognizing their contribution at all levels, to acknowledging that there exists an opportunity gap, that we need to actively work towards reducing.
Most of us will be reminded of this day with the forwards that we receive on various social and messaging platforms. And the day, and its significance is forgotten soon after.

There sure has been progress with women being given more opportunities, but a lot more needs to be done. While we need to address issues of Gender Pay Gap[Australia specific data] and Equity and Equality. We need to talk about basic rights, female foeticide and sexual harassment, which are much more common than we think. We need to talk about these issues among ourselves and also in any public platform we have the opportunity to. We need to make people aware.

Today isn’t the day to sit back and relax, instead today is when we look forward and work towards an inclusive and equal world. We should ensure that this day and every other day isn’t a form of tokenism. Progress depends on what we do each and every day after all the hype has died down on a randomly chosen day in March.

#BeBoldForChange

 

Cogent – verification made easy

At Trustworthy Systems, we are one of the few places that understand software verification. We’ve built the worlds only verified, general purpose micro-kernel. And many people in the group are core contributors to Isabelle/HOL and CakeML. And, as a group, we not only believe in software verification, we understand what it involves, and also realise that the returns we get from verified software don’t come cheap. With that in mind, we started working on Cogent, with the aim of reducing the cost of verified systems. For those who are keen to read about how we achieve that, I suggest reading the publications section here.

What is Cogent?

Cogent is a restricted, polymorphic, higher-order and purely functional language with linear types and without the need for a trusted runtime or garbage collector.

In the context of Cogent, the adjectives that I used above, mean the following:

  • restricted: Cogent omits certain features that one might come to expect from a programming language. For instance, Cogent doesn’t support general recursion.
  • polymorphic: This is similar to templates in C++ or generics in Rust.
  • higher-order: Cogent uses software components like functions, modules or objects as values.
  • purely functional: Cogent forbids changing state, mutable data. Evaluation is done similar to a mathematical function.
  • linear types: ‘Objects'(although there is no such concept in Cogent, I’m using it to very broadly define instances/variable of a particular type) can be used exactly once. Rust does something similar, I think.
  • trusted runtime: Cogent compiler generates C code, so the trusted component is limited to the C runtime system.
  • garbage collector: Since Cogent compiler generates C code, we would manage memory manually, like we do in any C program.

Being very new to functional programming, language theory and type systems, my level of understanding of the theory behind the earlier mentioned concepts, is still pretty shallow. For an authoritative and in-depth explanation of these concepts, please refer to the Cogent publication list and Liam O’Connor‘s(one of the compiler writers, the other being Zilin Chen) blog, where he occasionally writes about the compiler and type system.

As mentioned earlier, one of the key ideas behind developing Cogent is to reduce the cost of verifying software. Since we are a systems engineering research group, our focus is particularly on reducing the cost of verifing system software. And that brings us to the next key question.

What can Cogent be used for?

I want to respond to the question with the answer, “For anything”, but I would be lying. If you intend to write a program in Cogent that prints a  “Hello, World”, I can tell you that it isn’t possible without using C code, since we don’t have IO support in Cogent. Infact things that are simple to write in other programming languages, loops for example, have very complicated usage patterns in Cogent.

I think, for any compiler to be good enough, it takes a long time and a lot of man-years(Look at gcc, llvm/clang, rustc and various other mature compilers). In comparison, Cogent is still in its infancy and at our peak, we had about 4 compiler developers(We have 2 people working on the compiler now). Having said that, the Cogent compiler does let us write verifiable Linux file system drivers, and it does some parts of it well. Infact we have implemented 3 filesystems(at various levels of completion) in Cogent – bilbyfs, ext2 and vfat.  All of the Cogent source is available on Github.

Our eventual goal is to have complete implementations of these(and more) filesystems along with other parts of the system software stack, viz., networking code, drivers and the like. Currently the implementations we have in Cogent have varying degrees of performance costs(based on the usage, as highlighted in the Cogent paper), but we are working towards making sure that we don’t lose out on performance,  although we aren’t there yet.

Why?

I’ve already explained why we are working on Cogent, and what our goals are. And if I have piqued your interest enough, I suggest that you got read some of the papers in the publication list.

What I want to mention is why I’m writing about this now. Two reasons. One, I think writing about it will give me a lot more clarity about what I’ve personally done so far, and what I want to do with the project going forward. And two, I want to be able to talk about Cogent from an engineers perspective. When I started working on this project a year ago, I was overwhelmed. Having mostly used imperative languages for as long as I remember and having primarily done systems programming, everything looked daunting. It was a steep learning curve trying to understand functional programming, Haskell, software verification. And after having spent a year trying to implement ext2, f2fs and contributing a little to the compiler, which is written in Haskell(I forgot to mention that earlier), I’ve started to appreciate these ideas a lot more. I have barely scratched the surface, but I’m very enthusiastically digging deeper.

My idea is to write several posts about using Cogent and this post is meant to be a curtain-raiser for the posts(time permitting) that follow.

One last thing. Liam gave a talk titled, Refinement through Restraint: Bringing Down the Cost of Verification  at ICFP 2016, the video of which is available here. The 26 minute talk is both entertaining and informative and it is worth a watch.

gdm doesn’t start after Arch Linux update(Feb 2017).

I update my workstation once a week, and I typically do it before I turn the machine off, so that I have an updated system the next time I’m using it. Being on a bleeding edge distro like Arch, I occassionally have trouble with the update, like it happened this morning. (Actually it is amazing how infrequent problems with upgrades are, it speaks volumes about the Arch Linux development community, hats off to you guys).

This morning when I turned my machine on, after the boot sequence, I was just shown a blank screen instead of the usual GDM login prompt. The logs (Xorg logs and system logs) showed nothing wrong. And restarting gdm didn’t help either. I was clueless and I switched to using lightDM. LightDM gave me the login screen. I still wasn’t sure what the problem was. I thought it could be some broken Wayland dependency. But when I tried runing the HipChat Linux client, I noticed the following errors on the terminal:

libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast

And then it dawned on me, that this could be a driver error and I actually might be missing the GL driver for my NVIDIA Quadro 600 Card, so I ran:

$ sudo pacman -S nvidia-libgl

and restarted gdm, and it worked. The HipChat client runs too. This problem most likely occurs when using the nVidia binary drivers, like I do. I’m not sure if  the Nouveau drivers have the same issue. Or if

seL4 now runs on Pi3

Very recently, we(the Trustworthy Systems group) merged Raspberry Pi 3 support for seL4. This excellent blog post by Kofi already highlights why this is great news. Personally, it is exciting because, a Raspberry Pi 3 is a lot more accessible than all the other boards we’ve previously ran our kernel on. I own one too, and it is absolutely amazing that I can run, muck around and put my Raspberry Pi 3 to some good use. I’ve been playing around with seL4 on my Pi3. The things that I’m currently trying to do are:

  • Get Haskell runtime on seL4. Some might say this is an exercise in futility, but this will help me get Cogent build on seL4, and then I can start building verified systems software(currently file systems) on top of seL4. Ofcourse, this involves a lot more than just the filesystem implementation(need a VFS layer, block cache, drivers etc.). It might be a long shot, but hey, I’m trying!
  • The OS that I’ve been trying to build, forever(Who hasn’t tried to build one, really?). With seL4, the kernel is taken care of, letting us build all the fun stuff on top(Students of the AOS class at UNSW do exactly this).

Good times!

Printing bits of a byte in C

Printing the bit representation of a byte is a very simple and interesting problem. There are numerous ways this can be done in many different programming languages out there. In C, one would do something like this for signed integers:


void print_bits(int num)
{
size_t bits = sizeof(num) * 8;
int max = 1 << (bits – 1);
int i;
for (i = 0; i < bits; i++) {
printf("%u", (num & max) ? 1 : 0);
num = num << 1;
}
printf("\n");
}

What exactly is this function doing?

Most of the code is pretty obvious. The key portions of the code are in lines 3 and 9.

Line 3, calculates the maximum possible positive value for the given type. In the example above, the given type is an int, and on my machine running GNU/Linux(x86_64),  the number of bits in an int is 32 and its maximum possible positive value for 2147483648.

The binary representation(shown in 4 chunks of 8 bits) of 1 on a 32 bit machine is:

00000000 00000000 00000000 00000001

which when left-shifted 31 times(line 3 in the above code), becomes(notice how the digit 1 has moved from being the LSB to being the MSB):

10000000 00000000 00000000 00000000

whose decimal value is 2147483648.

Having got the max value, we then bitwise-and(&) it with the give number(represented by num) and check the result(line 8). The bitwise-and will yield a number greater than 0, if there is a 1 in the MSB and will yield a 0 otherwise. Based on the result of this operation, we either print a 1 or a 0.

In each iteration, we left-shift  the given number by 1 bit(line 9). We continue to iterate until we’ve traversed all bits.

We can modify the function print_bits() to print the output in a much more readable fashion:


#include <stdint.h>
#include <limits.h>
#define numbits(x) (sizeof(x) * 8)
#define max_signed_value_of_type(t) \
(INTMAX_MAX >> (numbits(intmax_t) – numbits(t)))
void print_bits(int num)
{
size_t bits = numbits(num);
int max = max_signed_value_of_type(num) + 1;
int i;
for (i = 0; i < bits; i++) {
if (i && (i % 8) == 0)
printf(" ");
printf("%u", !!(num & max));
num = num << 1;
}
printf("\n");
}

We can also implement a function to count the number of 1’s and 0’s in the binary representation of a given number:


#include <stdint.h>
#include <limits.h>
#define numbits(x) (sizeof(x) * 8)
#define max_signed_value_of_type(t) \
(INTMAX_MAX >> (numbits(intmax_t) – numbits(t)))
void count_bits(unsigned int num)
{
size_t bits = numbits(num);
unsigned int max = max_signed_value_of_type(num) + 1;
int i;
unsigned int t = num;
int zeroes = 0, ones = 0;
for (i = 0; i < bits; i++) {
if (!!(num & max))
ones++;
else
zeroes++;
num = num << 1;
}
printf("%u has %d 0's and %d 1's in its binary representation.\n", t, zeroes, ones);
}

view raw

count_bits.c

hosted with ❤ by GitHub

The count_bits() function above for input number 2 which in binary is 00000000 00000000 00000000 00000010 we get an output:

2 has 31 0’s and 1 1’s in its binary representation.

 

This post is a result of a discussion with my nephew who recently took to programming and I was trying to explain bits and bytes to him. Although C is probably not the right language when introducing a high-schooler to programming, I thought it is the language that expresses bits and bytes well, and also, it is the language that I know the best.

Battered RAM!

My trusted Thinkpad X220, started behaving erratically with random segfaults. Initially the segfaults very sparse, happening occasionally. But it wasn’t long before it was completely unusable.

IMG_20141026_184825I initially suspected an update of the OS might have something to do with it. But I soon realised that it was a memory issue. The image on the left shows my laptop running the Memtest86+, with a *lot* of reds.

Work has been really busy and I haven’t had the time to get a replacement RAM module. I have been without my laptop for a month now, Intend to fix my laptop this week, and while I’m on it, planning to upgrade to SSD as well. After some research, I have decided to buy a Samsung 840 EVO SSD (500GB) and a Corsair 8G RAM module.

Upgrading to an SSD means that I need to reinstall. And that means that I will be without a functional laptop for a couple of weeks more.

XXXIV!

Last week, I completed 34 years of existence on this wonderful planet. And I wanted to run 34K to celebrate. Although I set out to run the full 34K, the weather turned from bad to worse. What started of as an windy and not-so-cloudy morning transformed itself into a wet and rainy one. It was a really heavy downpour and I had to cut my run short to 20K (20.7KM to be precise in 1:55’11.5). But I thoroughly enjoyed every bit of the run from Portsea to Fort Napean and back.

Click for larger image

Portsea to Fort Napean and back.

 

It was a beautiful path all the way to Fort Napean. And I would definitely recommend it to anyone who loves a good run. I did manage to run 34K a few days later though and it was the first time that I ran beyond 20K alone.

Hopefully this year turns out to be better than the last.

Updates on QNAP Finder for Linux

I have made a few fixes to the qnap-finder, and pushed those changes to qnap-finder repository on github. Visually, the only change that is seen is the access URL:

$ ./qnap-finder
1)
Hostname    : cher
IP Address  : 10.0.0.59
Type        : NAS(TS-410)TS-419ITS-410
URL         : https://10.0.0.59/cgi-bin/login.html

If run with ‘-h’ or ‘–help’, qnap-finder will now display a list of all available options.

$ ./qnap-finder -h
qnap-finder v0.1

Usage: qnap-finder [options]

options include:
–help|-h          This help text
–detail|-d        Query for detailed information.(default is brief)
–verbose|-v       Verbose debug
–version|-V       Prints current version

More updates, when I have them.

Arch Linux kernel soft lockup issues!

Earlier this morning, my laptop (a Lenovo Thinkpad X220) refused to wake up from suspend. And afer a hard reset, my machine refused to boot with the following messages on the console:


BUG: soft lockup = CPU#1 stuck for 22s! [which:604]
hda_codec: rates == 0 (ni=0x10, val=0x0, ovrd=1)
hda_codec: cannot attach PCM stream 0 for codec #0
INFO: rcu_preempt detected stalls on CPUs/tasks: { 0}

Rebooting the machine again didn’t help,It was the same error again, but this time around it was systemd and systemd-udev causing the lockup. Pretty clueless and with no access to the interwebs I was at least hoping to get a stack trace of some sort. There were none.

My first (and the most obvious) suspect was the kernel and drivers. I had a quick look at my grub.cfg, part of which is shown here:


menuentry 'Arch Linux Linux, with Linux core repo kernel' --class arch --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-core repo kernel-true-7cb6bd41-b198-4d1d-8ae4-fc8f169abb00' {
load_video
set gfxpayload=keep
insmod gzio
insmod part_gpt
insmod ext2
set root='hd0,gpt3'
if [ x$feature_platform_search_hint = xy ]; then
search --no-floppy --fs-uuid --set=root --hint-bios=hd0,gpt3 --hint-efi=hd0,gpt3 --hint-baremetal=ahci0,gpt3  7cb6bd41-b198-4d1d-8ae4-fc8f169abb00
else
search --no-floppy --fs-uuid --set=root 7cb6bd41-b198-4d1d-8ae4-fc8f169abb00
fi
echo    'Loading Linux core repo kernel ...'
linux    /boot/vmlinuz-linux root=UUID=7cb6bd41-b198-4d1d-8ae4-fc8f169abb00 ro root=/dev/sda3 ro fastboot splash=silent quiet threadirqs add_efi_memmap pcie_aspm=force i915.i915_enable_rc6=7 i915.i915_enable_fbc=1 i915.lvds_downclock=1 rootfstype=ext4 init=/usr/lib/systemd/systemd
echo    'Loading initial ramdisk ...'
initrd    /boot/initramfs-linux.img
}

I decided to cut down all the cruft and see if I can get the machine up and running. Switching to the grub command-line, i did the following(remember TAB completion is your friend):


insmod part_gpt
insmod ext2
set root=(hd0,gpt3)
linux /boot/vmlinuz-linux root=/dev/sda3 rootfstype=ext4
initrd /boot/initramfs-linux.img
boot

Thankfully, That worked!

With my machine back in action. The next step was to look through the kernel docs. And I found this about soft lockups:

A ‘softlockup’ is defined as a bug that causes the kernel to loop in
kernel mode for more than 20 seconds (see “Implementation” below for
details), without giving other tasks a chance to run.

More about it here. There is even a command line parameter, softlockup_panic (search for it in kernel parameters doc) that will help generate the much needed stack trace.

This problem is clearly system specific. Some combination of the kernel alongwith drivers and/or firmware messes around with usage of the cpu. In my case it is one of the following parameters:

* threadirqs
* add_efi_memmap
* pcie_aspm=force
* i915 and family

I had a personal committment to get to, so didn’t probe this issue any further, will get to it during the weekend. For now, I thought I should share it, in case someone else might find this information helpful and also has some pointers on what caused the problem.

I am currently running 3.13.2-1-ARCH kernel and xf86-video-intel 2.99.910-1 drivers. The machine has an Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz and the following PCI hardware:

00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
00:16.0 Communication controller: Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1 (rev 04)
00:16.3 Serial controller: Intel Corporation 6 Series/C200 Series Chipset Family KT Controller (rev 04)
00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04)
00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b4)
00:1c.1 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 2 (rev b4)
00:1c.3 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 4 (rev b4)
00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 5 (rev b4)
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation QM67 Express Chipset Family LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller (rev 04)
00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 04)
03:00.0 Network controller: Intel Corporation Centrino Advanced-N 6205 [Taylor Peak] (rev 34)
0d:00.0 System peripheral: Ricoh Co Ltd MMC/SD Host Controller (rev 07)

QNAP finder for Linux

I have a QNAP Turbo NAS TS410. And under a DHCP setup that is available where I live, it is difficult to figure out the IP address of my NAS box is every time I restart, since the router isn’t doing a great job with the DHCP leases. The QNAP Finder for Linux that QNAP provides has strict dependencies on a specific gtk+ version. The source isn’t available, making it impossible to use on my Arch Linux machine.

It got me wondering  though, what the app actually does. I installed the Windows version of it on a borrowed laptop and using wireshark, I figured a bunch of things, all of which I’ve documented here. With that information, I wrote a small command-line utility that will essentially do the job for me.  I call it qnap-finder and it is available on github.

A quick preview of what the tool shows when run:


$ ./qnap-finder
1)
Hostname    : blah
IP Address  : 10.0.0.30
Type           : NAS(TS-410)TS-419ITS-410

This is just the basic detail. There is a lot more information available, like the number of hard disks, the version of the firmware which I still haven’t got to parsing and displaying, will get that done soon. Hopefully someone else will find this useful. And it is quite possible that I haven’t done something right, so please feel free to contribute/fork or report bugs.

* Disclaimer
============
This program is a work in progress and isn't guaranteed to work with all QNAP devices.
Although this is meant to be a replacement for the Windows/Mac versions of QNAP finder,
it is not guaranteed to work that way.