The CodeKoan Search Engine

A Source Code Pattern Search Engine

CodeKoan is a specialized search engine that accepts large source code documents as queries. Within each submitted document, CodeKoan recognizes approximate reuses of short code fragments drawn from a large index. The algorithm is mostly independent of the programming language; only a few small parts are language-specific.

Many publicly available code search tools rely on some form of string-based search. CodeKoan instead uses a more complex token-based algorithm that searches for code based on its “micro-structure”. The advantage of this approach is that it can recognize short code patterns across a variety of application domains.
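To illustrate the difference between string-based and token-based matching, here is a minimal Python sketch. It is not CodeKoan's actual algorithm: the tokenizer, the keyword set, and the function names `normalize` and `contains_pattern` are assumptions made for this example, and the matching is exact after normalization rather than approximate.

```python
import re

# Hypothetical tokenizer: identifiers, integer literals, or any single symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")
# Small illustrative keyword set; a real tokenizer would be language-specific.
KEYWORDS = {"def", "for", "in", "if", "else", "while", "return"}


def normalize(source):
    """Turn source text into tokens that ignore names, literals, and whitespace."""
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if tok in KEYWORDS:
            tokens.append(tok)        # keywords carry structure, keep them
        elif re.match(r"[A-Za-z_]", tok):
            tokens.append("ID")       # any identifier
        elif tok.isdigit():
            tokens.append("NUM")      # any integer literal
        else:
            tokens.append(tok)        # punctuation and operators
    return tokens


def contains_pattern(document, pattern):
    """True if the pattern's token sequence occurs contiguously in the document."""
    doc, pat = normalize(document), normalize(pattern)
    n = len(pat)
    return any(doc[i:i + n] == pat for i in range(len(doc) - n + 1))


pattern = "for x in items: total += x"
document = (
    "def f(values):\n"
    "    acc = 0\n"
    "    for v in values:\n"
    "        acc += v\n"
    "    return acc"
)

print(pattern in document)                  # plain string search
print(contains_pattern(document, pattern))  # token-based search
```

The plain substring test fails because the variable names and line layout differ, while the token comparison finds the accumulation loop. A search engine of the kind described above would additionally need to tolerate small differences in the token sequence and look patterns up in an index instead of scanning linearly.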

Other Projects


mnist-idx is a Haskell library for reading and writing the IDX format, which stores vectors and matrices for use in machine learning algorithms. The best-known data set in this format is the MNIST database of handwritten digits.
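For readers unfamiliar with IDX, the layout is simple: two zero bytes, one byte for the element type, one byte for the number of dimensions, then one 32-bit big-endian size per dimension, followed by the raw data in big-endian order. The following Python sketch parses that layout. It does not show the mnist-idx API; the function name `parse_idx` and the flat-list return shape are assumptions made for this example.

```python
import struct

# Element type codes from the IDX format: code -> (struct format char, byte size).
IDX_TYPES = {
    0x08: ("B", 1),  # unsigned byte
    0x09: ("b", 1),  # signed byte
    0x0B: ("h", 2),  # 16-bit integer
    0x0C: ("i", 4),  # 32-bit integer
    0x0D: ("f", 4),  # 32-bit float
    0x0E: ("d", 8),  # 64-bit float
}


def parse_idx(data):
    """Parse IDX bytes into (dimensions, flat list of values in row-major order)."""
    zero, type_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0 or type_code not in IDX_TYPES:
        raise ValueError("not a valid IDX header")

    # One 32-bit big-endian size per dimension follows the magic number.
    dims = struct.unpack(">" + "I" * ndim, data[4:4 + 4 * ndim])

    count = 1
    for d in dims:
        count *= d

    fmt, size = IDX_TYPES[type_code]
    offset = 4 + 4 * ndim
    values = struct.unpack(">" + str(count) + fmt, data[offset:offset + count * size])
    return dims, list(values)


# A hand-built 2x3 matrix of unsigned bytes.
sample = bytes([0, 0, 0x08, 2]) + struct.pack(">II", 2, 3) + bytes([1, 2, 3, 4, 5, 6])
dims, values = parse_idx(sample)
print(dims, values)
```

The MNIST image files use this layout with three dimensions (image count, rows, columns) and unsigned-byte elements.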