Resources

The ACQDIV Corpus

The ACQDIV Corpus is a database that brings twelve corpora representing eleven languages together in a formally and semantically standardized format:

Allen Inuktitut Corpus
Chintang Language Corpus
Corpus of the Chisasibi Language Acquisition Study (Cree)
Demuth Sesotho Corpus
Hellwig Qaqet Corpus
Koç University Longitudinal Language Development Database
MiiPro Japanese Corpus
Miyata Japanese Corpus
MPI-EVA Jakarta Child Language Database
Sarvasy Nungon Corpus
Pfeiler Yucatec Child Language Corpus
Stoll Russian Corpus
The Ku Waru Child Language Socialization Study (KWCLSS)

The corpus thus contains data from nine of the ten languages in the original ACQDIV sample and has been expanded to cover two more. To learn more about the corpus's linguistic design, structure, and technical realization, read the corpus manual (linked below).

Researchers may request access to the corpus from PI Sabine Stoll. The corpus is not publicly available because it contains sensitive data from unpublished subcorpora. In the current initial research phase, access is restricted to the ACQDIV core team and official collaborators of the project. We plan to publish the subcorpus based on CHILDES data under the Creative Commons license CC BY-NC-SA 3.0, as stipulated by the TalkBank Ground Rules.

Further Information

ACQDIV Corpus Manual (PDF, 2 MB)

Language, ACQuisition, DIVersity Lab (ACQDIV)
