According to Och, a solid base for building a usable statistical machine translation system for a new pair of languages from scratch would consist of a bilingual text corpus (or parallel collection) of more than 150-200 million words, and two monolingual corpora each of more than a billion words.[85] Statistical desig