The ACQDIV Corpus

The ACQDIV Corpus is a database that brings ten corpora together in a formally and semantically standardized format:

The corpus thus contains data from nine of the ten languages in the ACQDIV sample. To learn more about the corpus's linguistic design, structure, and technical realization, read the corpus manual (linked below).

Access to the corpus may be granted to researchers upon request to PI Sabine Stoll. The corpus is not publicly available because it contains sensitive data from unpublished subcorpora. During the current initial research phase, access is restricted to the ACQDIV core team and official collaborators of the project. We plan to publish the subcorpus based on CHILDES data under the Creative Commons license CC BY-NC-SA 3.0, as stipulated by the TalkBank Ground Rules.

Download materials