George L. Coulouris (2010).
BASALT: Blocked Alignment Score Approximation in Linear Time
http://coulouris.org/basalt/

About

BASALT is a scalable, bandwidth-bound algorithm to detect regions of similarity in large nucleotide sequences.

Performance

Currently, the best observed speedup of GPU over CPU processing time is about 46x (2752.38 ms vs. 59.30 ms on a Tesla C1060):

USE_ZEROCOPY=0
BASALT_THREADS_PER_BLOCK=64
BASALT_BV_SIZE=2048
BASALT_NUM_QUERY_TILES=512
BASALT_NUM_SUBJECT_TILES=512
device 0 = Tesla C1060
gpu processing time : 59.300999 (ms) 
memory bandwidth    : 67.452490 (GB/s)
cpu processing time : 2752.378906 (ms) 
Test PASSED
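
The speedup figure follows directly from the two timing lines in the run above. As a minimal sketch, the Python snippet below parses those lines and divides CPU time by GPU time. It is not part of BASALT: the helper names `parse_metrics` and `speedup` are hypothetical, and the log text is copied from the output shown here.

```python
import re

# Timing lines copied verbatim from the benchmark output above.
LOG = """\
gpu processing time : 59.300999 (ms)
memory bandwidth    : 67.452490 (GB/s)
cpu processing time : 2752.378906 (ms)
"""


def parse_metrics(log: str) -> dict[str, float]:
    """Map each 'name : value (unit)' line to {name: value}.

    Hypothetical helper for illustration; lines that do not match
    the pattern (e.g. 'Test PASSED') are ignored.
    """
    metrics = {}
    for line in log.splitlines():
        m = re.match(r"\s*(.+?)\s*:\s*([0-9.]+)\s*\(.+?\)\s*$", line)
        if m:
            metrics[m.group(1)] = float(m.group(2))
    return metrics


def speedup(metrics: dict[str, float]) -> float:
    """CPU time divided by GPU time (both in ms, so the units cancel)."""
    return metrics["cpu processing time"] / metrics["gpu processing time"]


metrics = parse_metrics(LOG)
print(f"speedup: {speedup(metrics):.1f}x")  # prints "speedup: 46.4x"
```

To check a different run, paste its output into `LOG`. The sketch assumes the same `name : value (unit)` line layout as above.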

Download

basalt_src.tgz