Statistics for Natural Language Processing on Data Warehouses