o corpus do português

Problems

The new billion-word corpus has been tagged for part of speech (e.g. casa = noun, fazem = verb form) and lemmatized (e.g. faço, fizeram, and fizemos are all forms of the lemma fazer). However, the tagging and lemmatization still contain errors.

If you are a native speaker of Portuguese and can spare even 10-15 minutes a week to help correct errors, we would really appreciate it. You can also "earn credit" that counts toward increased corpus access or corpus data.

More information and tutorial

Thanks for your help!