Unity Editor on Linux and the high CPU usage problem

Linux is not for gaming

Die-hard Linux fans have often had to revel in (or struggle with) the idea that their OS of choice isn't the number one target for game development. This is partly due to the nature of FOSS and how it is supported, and partly due to the perception of Linux as being for nerds and servers.

But this is changing, slowly. If a Linux user is willing to put aside their distaste for proprietary software and take a pragmatic approach, there are tools available that allow people to not dirty their hands with other operating systems. Indeed, the full source code of Unreal Engine 4 is available on GitHub, albeit under Epic's own licence rather than an open source one.

Unity was arguably the first game engine to reach the mainstream, with its slick editor, lively asset store, and large community. Users of the software have been asking for a Linux build of the editor for years. The Windows and OS X versions could build Linux games, but on Linux the editor could only be run in Wine, with varying levels of success. Last year, however, the Unity developers released a (very) beta build of the editor, and there was much rejoicing. People understood that it was very much a side project, probably full of bugs, and perhaps not ready for the big time, but it was a welcome start.

The bug that wouldn't die

But reports started appearing on the forums that for some people the editor was running a CPU core at 100% while idle, and only running the game in the editor calmed the CPU down. Several updates to the editor have been released and still there are reports of high CPU usage. Not everybody was suffering from the problem, but a significant number, including me, had their CPU fans running constantly, which made it less than fun. The Unity developers seem silent on the matter, possibly because not many resources are being thrown at it.

Various solutions have been posted by users, mostly scripts that limit the CPU for that particular process, but that's a bit drastic. So I got to thinking, what is it about running even an empty scene that reduces the CPU usage? I also noticed swinging the camera around in the editor had the same effect, and when I stopped moving the camera the CPU pegged again. But it wasn't until I looked in the About dialog box to find which version I was running, that I realised what the problem might be.

A fix less hacky

You see, the About box has a scrolling list of the many contributors to Unity, and when it gets to the bottom of the list it loops around, and having this box open reduced the CPU usage to normal. The editor was doing something. Not much, but not idling either. And as it is a modeless window, it can be minimised and it still reduces CPU usage. Granted, if you tab away from and then back to the main editor window it restores the About box, but it's better than nothing.

Why?

This is pure speculation on my part, but what I think is happening is that somewhere inside the editor there is an event loop, a function that waits for events such as mouse movements, keyboard input, or redraw requests. If it gets an event, it broadcasts that to any part of the application that is listening so it can be handled. Because of the arbitrary nature of these events, event loops are usually single-threaded to avoid reentrancy problems and so on. Much of the time, there are no events coming in, and so the event loop should sleep until it is called again to check for incoming events. This sleep function instructs the OS to do other stuff with the time it was allocated. But if the sleep function isn't called, the event loop will spin as fast as it can taking up all the time allocated to it, and I think it is this that is happening to the Unity editor. Somewhere in the code either the sleep function is wrapped in a condition that is false when there are no events, or the sleep function itself is failing for some reason.  If I am on the right lines, my money is on the former.
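To make the speculation concrete, here is a small Python sketch of the two behaviours. It is not Unity's code, and the names run_loop and block are made up for illustration; it only shows how a loop that never sleeps wakes up vastly more often than one that blocks while waiting for the next event.

```python
import queue
import time


def run_loop(events, duration, block):
    """Service `events` for `duration` seconds; return (handled, wakeups).

    With block=True the loop sleeps inside the queue until an event arrives
    or the deadline passes. With block=False it polls and retries at once,
    which is the suspected busy-spin behaviour.
    """
    deadline = time.monotonic() + duration
    handled = 0
    wakeups = 0
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        wakeups += 1
        try:
            if block:
                events.get(timeout=remaining)  # yields the CPU while waiting
            else:
                events.get_nowait()  # returns immediately, event or not
            handled += 1
        except queue.Empty:
            pass  # nothing to do; go round again
    return handled, wakeups


if __name__ == "__main__":
    for block in (True, False):
        q = queue.Queue()
        q.put("mouse-move")
        print("block" if block else "spin", run_loop(q, 0.05, block))
```

Both variants handle the same single event, but the spinning one goes round the loop thousands of times in the same period, which is exactly what a pegged CPU core looks like from the outside.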

A slightly better workaround

The Unity editor is extensible with custom controls and plugins, and people have come up with some fantastic tools. But what if someone created a simple widget that redrew itself every frame? Regardless of user input, and drawing nothing more than the window decorations, it should generate enough events to keep the event loop quiet. So that's what I did:

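using UnityEngine;
using UnityEditor;

// Editor window that repaints itself constantly so the editor never sits idle.
[ExecuteInEditMode]
public class Idler : EditorWindow {
    // Adds an "Idler" entry to the Window menu.
    [MenuItem("Window/Idler")]
    public static void ShowWindow() {
        GetWindow(typeof(Idler));
    }

    // Called many times per second while the window is visible.
    void Update() {
        Repaint();
    }
}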

Put this script somewhere in your project, activate it from the Window menu, and dock it somewhere out of the way. It does have to be visible, so you can't put it on a tab bar with other windows. It also has a minimum size, so it won't disappear completely. I tried using the OnInspectorUpdate() function which is called less often than Update(), but that didn't make much difference because there is still plenty of time where the event loop is idle.

The preferred fix

Of course, the best way of solving the problem is for the Unity developers to actually fix the code. I appreciate that support for a Linux build is practically non-existent because it's really quite a niche market and they don't want to, or aren't able to, provide the resources necessary. But just having someone spend an afternoon looking for the bug isn't too much to ask, is it?

FPGAs for Linux software engineers - Part II

In Part I, I went through the fairly painless process of installing and setting up my first FPGA development board. In this post, I'll be describing my journey through Verilog and paradigm shifts needed to go from software development to silicon development.

My first FPGA

The Terasic DE0-Nano development board has 8 green LEDs on it, and the tutorial project that comes in the user manual is a simple counter that speeds up when you press one of the buttons on the board, and slows down again when it is released. But I'm not going to describe the tutorial. Instead, I wanted to get stuck in to something beyond blinkenlights.

RS232 serial interface

The RS232 serial specification is venerable - it was introduced in 1962 according to Wikipedia - and covers a lot more than just the actual serial data protocol: there are defined voltages you must use, other communication pins for deciding which device is talking, and all manner of fun. But in these days of Arduinos and FTDI chips, most of that has been discarded, and the part of the spec that deals with the bits of data coming down the wire is pretty much all that's left. For my purposes that's absolutely fine: I want a device that takes a serial stream from a microcontroller, does something with the data, and spits something out again. The first problem, however, is getting the data into the FPGA...
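For reference, here is a rough Python sketch of what "the bits of data coming down the wire" usually look like. It assumes the common 8N1 framing (one start bit, eight data bits sent least significant bit first, no parity, one stop bit) with the line idling high; the post doesn't say which framing is used, and the function names are made up for illustration.

```python
def frame_byte(value):
    """Return the 10 line levels used to send one byte with 8N1 framing."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    bits = [0]  # start bit pulls the idle-high line low
    bits += [(value >> i) & 1 for i in range(8)]  # data, LSB first
    bits.append(1)  # stop bit returns the line to idle
    return bits


def unframe_byte(bits):
    """Recover the byte from 10 sampled line levels, checking the framing."""
    if len(bits) != 10 or bits[0] != 0 or bits[9] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(bits[1:9]))


if __name__ == "__main__":
    print(frame_byte(ord("A")))
```

A receiver has to do the second half of this in hardware: spot the falling edge of the start bit, sample each following bit at the agreed baud rate, and check that the stop bit is where it should be.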

Concurrency nightmares

As a software engineer, I have wrestled with multithreading and all the joys of race conditions that are almost inevitable. Imagine this on a massive scale; that's what developing for an FPGA seems like. At the highest level you have "modules", which have a set of inputs, usually at least a clock and a reset, and at least one output. So far so good. Within each module there are essentially two types of processing: combinational logic (note this is not combinatorial, as you will see written quite often) and sequential logic. Both of these refer to assigning some value, but they happen at different stages of execution. In Verilog, sequential logic happens within an always block:

always @ (posedge clk) begin
  a <= b + 1;
  c <= a;
end

This says: on the rising edge of the clock pulse, assign the value of b + 1 to a, and a to c. But despite it being called sequential, this doesn't happen sequentially as I would know it. Both right-hand sides are evaluated using the values from before the clock edge, and the assignments then take effect together, at some point after the block has executed. So c will be set to the original value of a and not to b + 1. Of course, by the next clock cycle a will have been updated to b + 1, and so c is always one cycle behind a. Note that this kind of nonblocking assignment uses the <= operator.
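For software people, the behaviour of that block can be mimicked in a few lines of Python. This is only a mental model, not how a simulator or synthesiser actually works, and the clock_edge function is made up for illustration: every right-hand side is read from the old state first, and only then are the new values committed.

```python
def clock_edge(state):
    """Model one rising clock edge of: a <= b + 1; c <= a;"""
    # Step 1: evaluate every right-hand side using the values before the edge.
    next_a = state["b"] + 1
    next_c = state["a"]
    # Step 2: commit all the results together.
    return {**state, "a": next_a, "c": next_c}


state = {"a": 0, "b": 5, "c": 0}
history = []
for _ in range(3):
    state = clock_edge(state)
    history.append((state["a"], state["c"]))

print(history)  # [(6, 0), (6, 6), (6, 6)]
```

After the first edge a is 6 but c still holds the old value of a; c only catches up on the second edge, one cycle behind.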

Combinational logic happens at a different part of the execution pipeline, and is handled immediately. Often this is written outside of an always block (apart from when it's another type of always block... yeah, told you this was odd). So a simple case might be:

assign led = x & y;

This assigns the result of a bitwise AND of x and y to led. As soon as the state of either x or y changes, led will be updated. Note that here the = operator is used; combined with the assign keyword this is known as a continuous assignment. The assign keyword is there to really drive home what you want to do because, you know, Verilog.

A second kind of combinational logic can occur in the different kind of always block:

always @ (x, y) begin
  foo = x & y;
end

The first thing to note is that there is no posedge (or its counterpart negedge), so this block effectively gets executed whenever x or y changes. Inside the block the = operator is used, which here denotes a blocking assignment, but the assign keyword is absent.

Verilog allows you to mix and match combinational logic and sequential logic in the same always block, but the general consensus seems to be: avoid it at all costs! Although Verilog allows it, the tool that generates the actual FPGA configuration, the synthesiser, will have a harder time figuring out what you meant and may get it wrong. This is because both Verilog and its nemesis VHDL were designed for simulating hardware, and a simulation running on a naturally sequential CPU may not behave the same way as the hardware the synthesiser produces.

Next...

I will get onto the actual design of my RS232 receiver, and how the hardware equivalent of printf() debugging is to use an oscilloscope.

FPGAs for Linux software engineers - Part I

For a while now, I have fancied playing around with FPGAs; I mean, how hard can it be? I had watched a number of YouTube videos, most notably Ben Heck and EEVBlog's fantastic offerings, and they both spoke highly of the Terasic DE0-Nano. It was a bit more expensive than some eBay offerings, but I wanted something I knew would work out of the box. Maybe my next one will be a cheaper one now I have a vague idea what I'm doing.

The Terasic DE0-Nano has an Altera Cyclone IV FPGA, a few LEDs, switches, an accelerometer, a bunch of RAM and more GPIO than you can shake a stick at. Its form-factor is such that it could work very nicely for an actual project, rather than some of the other boards out there that have all the bells and whistles already attached whether you use them or not. 

Software

My initial thought was that the development tools would all be Windows applications, and I would have to set up a VM and all the fun that means for USB devices and so on. But I was pleasantly surprised to find that Altera's Quartus II software was available for Linux as well - it turns out it's a Java application, but one that is for the most part very nice indeed. Granted, the download was an eye-watering 5GB, but much of that was taken up with things I didn't need to install such as profiles for different FPGAs. And after scratching the surface of the tools that make up Quartus II, I can begin to see why it is a hefty install.

One issue I should note is that each manufacturer's FPGAs are entirely incompatible, and can only be programmed by that company's tools (Project IceStorm is an open source attempt at reverse-engineering Lattice's iCE40 FPGAs), so make sure you meet the system requirements before investing, and be prepared for a bit of vendor lock-in.

The DE0-Nano user manual has a couple of tutorials to get you started, and the CD has a handful of demo configurations to show off things like the accelerometer. The first tutorial is called my_first_fpga, which I thought a reasonable starting point, so I dived in. There were a couple of issues. Firstly, the manual covers an older version of Quartus, and the MegaWizard Plug-in Manager (yes, that's an actual thing) has since been moved to a different part of the UI. It is now the relatively disappointingly named IP Catalog, which by default sits in a dock window on the far right of the main UI and took me ages to notice. Secondly, once you create an IP component (no idea what it stands for - Intellectual Property I guess), make sure the configuration window that appears is resized to remove scrollbars. Otherwise the first interaction with the GUI will cause a CPU core to peg and the application to freeze. Other than that, the tutorial was reasonably clear and worked with the minimum of head-scratching.

Hardware

Altera devices need a programmer called a USB Blaster which seems to mostly just be an FTDI chip to handle USB to JTAG communications, and cheap Chinese knockoffs can be found all over eBay. But the DE0-Nano has it on board, so it's a simple case of using the supplied micro USB cable (which comes in one of those snazzy but annoyingly weighty spring-loaded roll-up things) and plugging it straight in. You will have to set up a udev rule to give users permission, e.g. in /etc/udev/rules.d/40-usbblaster.rules:

SUBSYSTEM=="usb", ATTRS{idVendor}=="09fb", ATTRS{idProduct}=="6001", GROUP="plugdev", MODE="0666", SYMLINK+="usbblaster"

Once that is done, it all works mostly smoothly... I found that the programming tool detects the USB Blaster but doesn't seem to like it if it is in certain USB ports on my machine. I've yet to get to the bottom of it, but it might be worth experimenting if it doesn't work first time.
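If the programmer isn't detected, one thing worth checking is whether the kernel can see the device on that port at all. The following Python sketch looks through sysfs for a device with the same vendor and product IDs as the udev rule above. The function name is made up for illustration, and it assumes the usual /sys/bus/usb/devices layout, where each device directory holds idVendor and idProduct files.

```python
import os


def find_usb_devices(vendor_id, product_id, sysfs_root="/sys/bus/usb/devices"):
    """Return the sysfs names of USB devices matching the given hex IDs."""
    wanted = [vendor_id.lower(), product_id.lower()]
    matches = []
    if not os.path.isdir(sysfs_root):
        return matches
    for name in sorted(os.listdir(sysfs_root)):
        device_dir = os.path.join(sysfs_root, name)
        ids = []
        for attr in ("idVendor", "idProduct"):
            try:
                with open(os.path.join(device_dir, attr)) as handle:
                    ids.append(handle.read().strip().lower())
            except OSError:
                # Interface entries such as "1-1:1.0" have no ID files.
                break
        if ids == wanted:
            matches.append(name)
    return matches


if __name__ == "__main__":
    # IDs taken from the udev rule above.
    print(find_usb_devices("09fb", "6001"))
```

An empty list means the device isn't being enumerated on that port, which points at the port or cable rather than at permissions or Quartus.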

Next...

In Part II, I'll go into more detail about actually creating a project and some of the many pitfalls that exist for software engineers - them silicon engineers are an odd bunch of people.