Database search
N = null model (random bases or AAs)
Report all sequences with
logP(s|M) - logP(s|N) > logP(N) - logP(M)
Example, say a/b hydrolase fold is rare in the database, about 10 in 10,000,000. The threshold is 20 bits. If considering 0.05 as a significant level, then the threshold is 20+4.4 = 24.4 bits.