o corpus do português

Created by Mark Davies. Funded by the US National Endowment for the Humanities (2004-2005, 2015-2017).

#	Corpus	Size	Created
1	Genre / Historical	45 million words	2006
2	Web / Dialects *	1 billion words	2016
3	NOW (2012 - 2019)	1.1 billion words	2018

This is the "original" Corpus do Português (2006), with the older interface (2008).

The corpus contains 45 million words from the 1200s-1900s and can be used to study the history of Portuguese. The texts from the 1900s are evenly divided among spoken, fiction, newspaper, and academic sources, so you can also use the corpus to compare genres of Portuguese.

The link above is for the original interface, which was created in 2008. We recommend that you use the newer interface (#1 above), which provides many new features.

Currently, the older (2008) interface is the only one available in both Portuguese and English, although we are in the process of translating the new one into Portuguese as well.