In many studies involving large numbers of individuals with autism and controls, case cohorts are formed by drawing individuals from repositories such as the Autism Genetic Resource Exchange (AGRE), the Autism Genome Project (AGP), the International Molecular Genetic Study of Autism Consortium (IMGSAC), or the Simons Simplex Collection (SSC).

When available, the following information on both case and control cohorts is extracted and presented in the CNV module as population data:


Cohort description

A brief synopsis of the cohort, including the source of the individuals within the cohort.

Cohort size

Case and control cohorts vary widely in size. Smaller case cohorts frequently provide more information on the phenotypic characteristics of affected individuals, but carry less statistical weight in determining the pathogenic relevance of a CNV at a given locus across populations. Larger case cohorts, on the other hand, are more useful for statistically determining the pathogenic relevance of a CNV, but typically provide far less information on the phenotypic characteristics of affected individuals.
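To illustrate why cohort size matters statistically, the following minimal sketch compares two hypothetical studies with the same CNV carrier frequencies (1% of cases, 0.2% of controls) but different cohort sizes. It uses a one-sided Fisher's exact test purely as an illustration; this is an assumption, not necessarily the test used by any report in the module, and all counts and the function name are invented for the example.

```python
from math import comb


def fisher_one_sided(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher's exact p-value for CNV enrichment in cases.

    Returns the probability of seeing at least `case_carriers` carriers
    among the cases if carriers were distributed at random between the
    case and control cohorts (hypergeometric tail).
    """
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    upper = min(case_total, carriers)
    # Sum exact integer counts first, then divide once to limit rounding error.
    numerator = sum(
        comb(case_total, k) * comb(control_total, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return numerator / comb(total, carriers)


# Hypothetical counts: identical carrier frequencies, different cohort sizes.
p_small = fisher_one_sided(1, 100, 1, 500)        # 100 cases, 500 controls
p_large = fisher_one_sided(20, 2000, 20, 10000)   # 2,000 cases, 10,000 controls

print(f"small cohorts: p = {p_small:.3g}")
print(f"large cohorts: p = {p_large:.3g}")
```

With these made-up counts, the small cohorts cannot distinguish the CNV from chance, while the large cohorts give a far smaller p-value for the same carrier frequencies.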


Diagnosis

The diagnostic criteria (ADI-R, ADOS, etc.) are often described, along with the number of individuals with specific primary diagnoses, such as autism, Asperger syndrome, or PDD-NOS.


Age

Typically given as either a range of ages or a mean age.


Sex

Males are diagnosed with ASD approximately four times more often than females. Reflecting this disparity, males tend to make up roughly 70-85% of most large case cohorts. Control cohorts, on the other hand, are typically 50% male.

Geographical ancestry

The majority of cohorts are predominantly of Caucasian/European origin. As such, determining the pathogenic relevance of a CNV at a given locus across ethnic groups is difficult.

Each cohort in the CNV module dataset is assigned a name (or cohort ID) based on information from the initial report. A cohort name consists of the surname of the first author listed in the report, the year in which the report was published, the disease being investigated, whether the cohort is a discovery cohort or a replication cohort, and whether the cohort consists of cases (i.e., individuals diagnosed with the disease of interest) or controls. While all reports in the database feature a discovery case cohort, only a few also describe a replication case cohort, in which the authors attempt to replicate the findings from the discovery cohort in a new population of cases.

For example, for the ASD discovery case cohort described in Pinto D 2010, the name of the cohort in the module would be: pinto_10_ASD_discovery_cases.

For the ASD replication control cohort described in Glessner JT 2009, the name of the cohort would be: glessner_09_ASD_replication_controls.
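The naming scheme above can be sketched as a small helper. This is an illustrative sketch rather than the module's actual code: the function name and parameters are hypothetical, and the lowercase surname and two-digit year are inferred from the two examples given. Surnames containing spaces or punctuation are not handled.

```python
VALID_STAGES = {"discovery", "replication"}
VALID_GROUPS = {"cases", "controls"}


def build_cohort_name(surname, year, disease, stage, group):
    """Assemble a cohort name such as 'pinto_10_ASD_discovery_cases'.

    Assumed format (inferred from the examples in the text):
    <lowercase surname>_<two-digit year>_<disease>_<stage>_<group>
    """
    if stage not in VALID_STAGES:
        raise ValueError(f"stage must be one of {sorted(VALID_STAGES)}")
    if group not in VALID_GROUPS:
        raise ValueError(f"group must be one of {sorted(VALID_GROUPS)}")
    # Keep only the last two digits of the publication year, zero-padded.
    return f"{surname.lower()}_{year % 100:02d}_{disease}_{stage}_{group}"


print(build_cohort_name("Pinto", 2010, "ASD", "discovery", "cases"))
print(build_cohort_name("Glessner", 2009, "ASD", "replication", "controls"))
```

Validating the stage and group values up front keeps misspelled labels such as "control" from producing names that would not match the rest of the dataset.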

