| Corpus | Description |
| Historical / Genres | 45 million words, 1300s-1900s. The texts from the 1900s are divided evenly among spoken, fiction, newspaper, and academic genres. |
| Note: infinitives + clitic (e.g. fazê-lo), -ndo forms + clitic (e.g. fazendo-o), and words like da, nesta are all one word each | |
| Web / Dialects | One billion words in web pages from four Portuguese-speaking countries. |
| Note: infinitives + clitic (e.g. fazer lo), -ndo forms + clitic (e.g. fazendo o), and words like da, nesta are all two words each | |
| NOW (News on the Web) | 1.1 billion words, 4 countries, 2012-2019. |
| Note: infinitives + clitic (e.g. fazer lo), -ndo forms + clitic (e.g. fazendo o), and words like da, nesta are all two words each | |
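
The notes in the table mean that the same form takes one query slot in the Historical / Genres corpus but two in the Web / Dialects and NOW corpora. The sketch below illustrates that difference for the examples given. It is not the corpora's actual tokenizer: the function names, the corpus keys, the lookup tables, and in particular the split of da into "de a" and nesta into "em esta" are assumptions made for illustration, and the infinitive rule covers only simple cases such as fazê-lo.

```python
# Illustrative sketch only; the corpora's real tokenization rules are not published here.
# Assumed expansions of contractions (hypothetical table, covers only the table's examples).
CONTRACTIONS = {"da": ["de", "a"], "nesta": ["em", "esta"]}

# Assumed mapping from the truncated infinitive ending back to the full ending.
INFINITIVE_ENDINGS = {"á": "ar", "ê": "er", "i": "ir"}


def two_word_tokens(form):
    """Split a form the way the two-word corpora are described as doing."""
    lower = form.lower()
    if lower in CONTRACTIONS:
        return list(CONTRACTIONS[lower])
    if "-" in lower:
        host, clitic = lower.rsplit("-", 1)
        # Infinitive + lo/la/los/las: restore the dropped -r (fazê-lo -> fazer lo).
        if clitic in {"lo", "la", "los", "las"} and host[-1:] in INFINITIVE_ENDINGS:
            host = host[:-1] + INFINITIVE_ENDINGS[host[-1]]
        return [host, clitic]
    return [lower]


def query_slots(form, corpus):
    """Number of word slots a form occupies; `corpus` is a hypothetical key."""
    if corpus == "historical":
        return 1  # fazê-lo, fazendo-o, da, nesta each count as one word
    return len(two_word_tokens(form))  # "web" or "now": split into two words


for form in ["fazê-lo", "fazendo-o", "da", "nesta"]:
    print(form, query_slots(form, "historical"), two_word_tokens(form))
```

In practice this means a search for a verb followed by a clitic would be written as a single word in the Historical / Genres corpus and as two consecutive words in the other two.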