A Source Code Pattern Search Engine
Codekoan is a specialized search engine that lets users submit large source
code documents as queries. In these documents, Codekoan recognizes
approximate reuses of short code fragments from a large index. Codekoan's
algorithm is mostly programming-language independent, with a few small
language-specific parts.
Smart Source Code Search
Many publicly available code search tools use some form of string-based
search. Codekoan uses a more complex token-based algorithm that searches for
source code based on the "micro-structure" of source code. The advantage of
this algorithm is that it can recognize short code patterns across a variety
of application domains.
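To make the difference between string-based and token-based search concrete, here is a toy sketch in Python. It is not Codekoan's actual algorithm: the tokenizer, the keyword list, and the names `tokenize` and `find_pattern` are all assumptions made for illustration, and the matching is exact on the token level rather than approximate.

```python
import re

# Hypothetical, simplified tokenizer: identifiers collapse to "ID",
# integer literals to "NUM", and every other non-space character
# becomes its own token.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")
KEYWORDS = {"for", "if", "else", "while", "return", "int"}

def tokenize(code):
    tokens = []
    for lexeme in TOKEN_RE.findall(code):
        if lexeme in KEYWORDS:
            tokens.append(lexeme)      # keywords are kept verbatim
        elif lexeme[0].isalpha() or lexeme[0] == "_":
            tokens.append("ID")        # any identifier
        elif lexeme[0].isdigit():
            tokens.append("NUM")       # any integer literal
        else:
            tokens.append(lexeme)      # punctuation and operators
    return tokens

def find_pattern(pattern, document):
    """Return the token offsets at which pattern occurs in document."""
    p, d = tokenize(pattern), tokenize(document)
    return [i for i in range(len(d) - len(p) + 1) if d[i:i + len(p)] == p]

pattern = "for (int i = 0; i < n; i++)"
document = "total = 0;\nfor (int idx=0; idx<count; idx++) { total += xs[idx]; }"

print(pattern in document)              # substring search: False
print(find_pattern(pattern, document))  # token search: [4]
```

The substring search fails because of spacing and identifier names, while the token sequences are identical. Collapsing every identifier to `ID` is deliberately coarse: it also matches loops that mix different variables, which a real engine would have to handle more carefully.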
Other Projects
mnist-idx is a Haskell library for reading and writing the IDX format, which
stores vectors or matrices for use in machine learning algorithms. The
most widely known data set in this format is the MNIST database of
handwritten digits.
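For readers unfamiliar with IDX, the following sketch shows how the file header is laid out. It is written in Python for illustration only and does not use or mirror the mnist-idx API: the function name `parse_idx_header` and the hand-built sample bytes are assumptions. An IDX file starts with two zero bytes, one byte for the element type, and one byte for the number of dimensions, followed by one big-endian 32-bit size per dimension.

```python
import struct

# Element type codes stored in the third byte of the IDX magic number.
IDX_TYPES = {0x08: "uint8", 0x09: "int8", 0x0B: "int16",
             0x0C: "int32", 0x0D: "float32", 0x0E: "float64"}

def parse_idx_header(data):
    """Return (element type name, dimension sizes, payload offset)."""
    zero1, zero2, type_code, ndim = struct.unpack_from(">BBBB", data, 0)
    if zero1 != 0 or zero2 != 0:
        raise ValueError("not an IDX file: first two bytes must be zero")
    if type_code not in IDX_TYPES:
        raise ValueError("unknown IDX element type: 0x%02X" % type_code)
    # One big-endian 32-bit integer per dimension follows the magic number.
    dims = struct.unpack_from(">%dI" % ndim, data, 4)
    return IDX_TYPES[type_code], dims, 4 + 4 * ndim

# Hand-built sample (not real MNIST data): a 2x3 matrix of unsigned bytes.
sample = bytes([0, 0, 0x08, 2]) + struct.pack(">II", 2, 3) + bytes(range(6))
dtype, dims, offset = parse_idx_header(sample)
print(dtype, dims, list(sample[offset:]))
```

The payload after the header holds the elements in row-major order, so the six bytes in the sample correspond to the rows `[0, 1, 2]` and `[3, 4, 5]`.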