What is KANN?
See the GitHub repo page. In short, KANN is a flexible 4-file deep learning library, supporting convolutional neural networks (CNNs), recurrent neural networks (RNNs) and non-standard topologies addressable with differentiable computation graphs.
Why a new library?
The initial motivation is that I wanted to understand how deep learning frameworks work, down to the very details. The best way is to implement one by myself. After I got the basic working, I realized the code may be of use to other C/C++ programmers who prefer an efficient and flexible library without carrying all the non-trivial dependencies of mainstream frameworks. So, here we go.
Comparison to other deep learning frameworks
Theano and Tensorflow, with a code base many times larger than KANN, are definitely more powerful than KANN. Importantly, they can take the advantage of GPUs and even distributed computing, while KANN not. On the other hand, KANN comes close in flexibility and can be faster in the multi-threading mode for CPU-only training. KANN also has no extra dependencies by default, which makes it easy to deploy.
Tiny-dnn is a popular lightweight framework in C++. Importing pre-trained Caffe models is its particular strength that KANN lacks. However, tiny-dnn does not support RNNs and has difficulties in constructing non-standard model (e.g. variational autoencoder). It is several times slower than KANN and mainstream frameworks. Tiny-dnn also requires a C++11 compiler, which is not available everywhere yet (e.g. on CentOS 6).
Limitations
KANN does not support GPU right now. For MLPs and RNNs with no more than a couple of hundred hidden neurons, multi-threaded KANN is actually no slower than GPU-based implementations, because small matrix multiplications have not saturated the capacity of GPU yet. However, for CNNs and large RNNs, I have seen GPU-based implementations outperforming KANN by a factor of 5. The performance gap is probably larger with bigger networks.
KANN lacks some important operators, such as batch normalization (BN). A direct implementation of the original BN method is tricky as training needs an extra step different from normal training. It seems that Caffe et al are implementing a variant of BN with running average, but I am not so sure.
KANN does not support bidirectional RNNs and seq2seq models out of box. In principle, these models can be constructed with KANN by manually chaining RNN blocks, but I have not tried.
Conclusion
If you are looking for a tiny, standalone, performant, open source library in C/C++ that supports common components including MLP, CNN and RNN, and has the flexibility and extensibility close to mainstream deep learning frameworks, KANN might be your only viable choice as of now.
Leave a comment