Wouldn’t it be great to write code that runs high-speed parallel algorithms on just about every kind of hardware out there, and without needing to be hand-tweaked to run well on GPUs versus CPUs?
That’s the promise behind a new project being developed by professors and students from the University of Edinburgh and the University of Münster, with support from Google. Together they’re proposing a new open source functional language, called “Lift,” for writing algorithms that run in parallel across a wide variety of hardware.
Lift generates code for OpenCL, a programming system designed to target CPUs, GPUs, and FPGAs alike, and automatically produces the optimizations each of those hardware types needs.
OpenCL can be optimized “by hand” to improve performance in different environments — on a GPU versus a regular CPU, for instance. Unfortunately, those optimizations aren’t portable across hardware types, and code has to be optimized for CPUs and GPUs separately. In some cases, OpenCL code optimized for GPUs won’t even work at all on a CPU. Worse, the optimizations in question are tedious to implement by hand.
Lift is meant to work around all this. In language-hacker terms, Lift is what’s called an “intermediate language,” or IL. According to a paper that describes the language’s concepts, it’s intended to allow the programmer to write OpenCL code by way of high-level abstractions that map to OpenCL concepts. It’s also possible for users to manually declare functions written in “a subset of the C language operating on non-array data types.”
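Those high-level abstractions resemble familiar functional building blocks such as map, reduce, and zip. As a rough illustration (in Python, and not Lift's actual syntax), a dot product might be expressed as a pipeline of such primitives, leaving the compiler free to decide how each one maps onto parallel OpenCL constructs:

```python
from functools import reduce
import operator

# Hypothetical sketch of the composable style such an intermediate
# language encourages: the algorithm is a pipeline of primitives
# (zip -> map -> reduce), with no hardware detail in sight.
def dot(xs, ys):
    products = map(lambda pair: pair[0] * pair[1], zip(xs, ys))
    return reduce(operator.add, products, 0)

print(dot([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32
```

Because the program states only *what* is computed, a compiler is free to fuse, split, or parallelize each stage differently for each target.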
When Lift code is compiled to OpenCL, it’s automatically optimized by iterating through many possible versions of the code, then testing their actual performance. This way, the results are not optimized in the abstract, as it were, for some given piece of hardware. Instead, they’re based on the actual performance of that algorithm.
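The idea of choosing an implementation empirically rather than analytically can be sketched in a few lines of Python. The two variants and the timing harness here are invented for illustration; a real Lift compilation would search over many generated OpenCL kernels rather than Python functions:

```python
import timeit

def sum_loop(data):
    # Candidate 1: a plain accumulation loop.
    total = 0
    for x in data:
        total += x
    return total

def sum_builtin(data):
    # Candidate 2: delegate to the built-in sum().
    return sum(data)

def autotune(variants, data, repeats=5):
    # Time each candidate on the actual input and keep the fastest,
    # mirroring the benchmark-driven selection described above.
    timings = {
        f.__name__: min(timeit.repeat(lambda: f(data), number=100, repeat=repeats))
        for f in variants
    }
    return min(timings, key=timings.get)

data = list(range(10_000))
print(autotune([sum_loop, sum_builtin], data))
```

The winner depends on the machine the search runs on, which is exactly the point: the selection reflects measured performance, not a static model of the hardware.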
One stated advantage of targeting multiple hardware architectures with a single algorithm is that the same application can run on a wider variety of hardware and take advantage of heterogeneous architectures. If you have a system with a mix of CPU, GPU, and FPGA hardware, or two different kinds of GPU, the same application can in theory use all of those resources simultaneously. The end result is easier to deploy, since it isn’t confined to any one kind of setup.