1.0 Getting Started

Welcome to SFARI Gene.

We hope that you benefit from this publicly available, curated, web-based, searchable database for autism research. SFARI Gene contains interactive modules linking information about risk genes for autism with corresponding data from peer-reviewed research on human genes, animal models, and more.

1.1   About SFARI Gene

SFARI Gene is a database built on information extracted from peer-reviewed scientific and clinical studies on the molecular genetics and biology of autism spectrum disorders (ASD).

The SFARI Gene database features a comprehensive list of genes and copy number variants associated with ASD, an up-to-date collection of every known protein interaction occurring between these gene products, and the latest relevant data curated from associated animal models.

The candidate genes are richly annotated for their relevance to autism. Every entry in the database includes an in-depth description of the gene’s molecular function, along with a calculated score that reflects the strength of the evidence linking a gene to ASD. The list of genes is continuously updated by our dedicated team of researchers that combs newly published scientific data for emerging discoveries regarding ASD candidate genes.

SFARI Gene is organized into interactive modules, which currently include Human Gene, Animal Model, Protein Interaction (PIN), Copy Number Variant (CNV), Gene Scoring, and Data Visualization. The unique genetic information contained in each of these modules can be found via the Advanced Search function, by browsing the database, or by using our new data visualization tools.

1.2   New Features in SFARI Gene 3.0

SFARI Gene 3.0 has been completely retooled and reconfigured to accelerate discoveries by the autism research community. The scientific content of the database has been expanded and reformatted for increased consistency, the user interface has been completely redesigned to enhance usability, and new functionalities and interactive data visualization tools have been added to help users more easily navigate the database.

Our review team has combed through the data to ensure that all data entries are consistent. Empty text fields have been replaced with drop-down menu options to establish uniformity throughout the database and allow for increased interconnectivity between the distinct data modules.

The interface has undergone a complete overhaul for improved functionality, and several new features have been added to the database itself. Universal status columns have been added to the gene summary pages to indicate any recent updates or additions to the database related to that gene, and blue dots now appear on tabs to denote any recent changes. Additionally, users can now view a gene’s scoring history, which will allow researchers to see at a glance whether a gene’s link to ASD has become more or less probable.

Data Visualization Tools
The data visualization tools in SFARI Gene are designed to help users find specific data on individual chromosomes, genes, and proteins and allow them to more clearly see the connections between genes associated with ASD. These dynamic tools instantly reflect any updates or additions made to the database, ensuring that the latest genetic information is available to the autism research community.

The new data visualization tools in SFARI Gene are:

Human Genome Scrubber
The Human Genome Scrubber maps ASD candidate genes by their location along the human genome and provides users with information including the assigned gene score and the number of reports associated with the gene. Results can be filtered by chromosome and gene score, and an overlay feature can be utilized to show the ratio of autism-specific reports versus non-autism-specific reports.

Copy Number Variant (CNV) Scrubber
The CNV Scrubber provides users with a quantitative look at copy number variants that occur in every chromosome. The scrubber shows the number of CNVs found at a particular locus, the number of reports curated, and whether a CNV is primarily caused by deletion or duplication.

Ring Browser
SFARI Gene’s Ring Browser visualizes all the human genetic information contained in the database. This tool is also used to illustrate all known protein interactions that take place between gene products associated with ASD.

Additionally, SFARI Gene 3.0 features advanced functionalities and tools, such as:

Quick Search
The new Quick Search feature allows users to instantly filter rows of results in the main database tables. This will let users easily locate specific information without having to scroll through the entirety of the data or use their browser’s find function.

Advanced Search
The Advanced Search function of SFARI Gene has been updated to give users increased access to all information in the database, including genetic loci, gene scores, associated disorders, and details about the scientific studies used to support any given gene’s inclusion. Search results can be filtered and sorted to help users find information most pertinent to their research.

Advanced Search can be found under the Tools menu. To use the Advanced Search function, simply input your query into the provided box.

Updated Interactome
SFARI Gene now features an updated visual interactome that allows users to easily view the genetic information contained in the database. When users view an individual gene’s summary page, they can actively filter information by the type of protein interactions and connections linking two genes, as opposed to viewing a series of single, static images.


1.3   Information Contained within SFARI Gene

SFARI Gene integrates genetic, neurobiological, and clinical information about genes associated with ASD. SFARI Gene’s content is entirely based on the peer-reviewed scientific literature and is manually annotated by expert researchers and biologists. Data presented in abstracts or at conferences are not included.

1.4   The Modules of SFARI Gene

The site’s interactive modules connect information on candidate autism genes with research that illuminates their molecular functions. Curated from various academic sources of both human and animal genetic information, the data contained in the various modules of SFARI Gene can help researchers more thoroughly and insightfully analyze the most up-to-date data on ASD available.

The modules of SFARI Gene are now more closely interconnected, allowing researchers to see any relevant data contained in different modules and to more easily navigate between the distinct modules. The tabs on a gene’s summary page show related data found in other modules and act as gateways to this information, giving users convenient access to data that may be pertinent to their research.

Current modules include:

Human Gene
The Human Gene module of SFARI Gene is a thoroughly annotated list of genes that have been studied in the context of autism. It contains information about the genes themselves, relevant references from scholarly articles, the genetic variants within the gene that have been identified, and a description of the evidence linking the genes to ASD.

Gene Scoring
SFARI Gene uses an innovative assessment system to assign every gene in the database a score reflecting the strength of the evidence linking the gene to ASD. These scores are regularly updated based on the publication of new scientific data and feedback from the research community.

Copy Number Variant (CNV)
The Copy Number Variant (CNV) module of SFARI Gene is a parallel resource that catalogues the single-gene and multi-gene deletions and duplications in the genome and describes their potential link to autism.

Animal Models
SFARI Gene’s Animal Models module contains information about lines of genetically modified animals that represent potential models of autism. This information includes the nature of the targeting construct, the background strain and, most importantly, a thorough summary of the phenotypic features that are most relevant to autism.

Protein Interaction (PIN)
The Protein Interaction (PIN) module is a compilation of all known interactions between gene products implicated in autism, including both protein-protein and protein-nucleic acid interactions. It presents both graphical and tabular views of interactomes, highlighting connections between autism candidate genes. Each protein interaction is manually verified by consultation with the primary reference.

Data Visualization
SFARI Gene includes a new series of unique data visualization tools to better illustrate the genetic data contained in the database. These visualizations will help researchers see the frequency with which particular mutations have been linked to autism, as well as the full range of protein-protein interactions that occur between autism-associated gene products, all in the scope of the entire genome. These tools will provide researchers with a useful way to assess the progress of the field.

1.5   Gene Classification

Autism-related genes in SFARI Gene are classified into four categories:

(1) Rare: This category applies to genes implicated in rare monogenic forms of ASD, such as SHANK3. The types of allelic variants within this class include rare polymorphisms and single gene disruptions/mutations directly linked to ASD. Submicroscopic deletions/duplications (copy number variations) encompassing single genes specific for ASD are also included.

(2) Syndromic: This category includes genes implicated in syndromic forms of autism, in which a subpopulation of patients with a specific genetic syndrome, such as Angelman syndrome or fragile X syndrome, develops symptoms of autism.

(3) Association: This category is for candidate genes that confer a small increase in risk and carry common polymorphisms identified from genetic association studies in idiopathic ASD. Idiopathic ASD, or autism of unknown cause, makes up the majority of autism cases.

(4) Functional: This category lists functional candidates relevant for ASD biology that are not covered by any of the other genetic categories. One example is the gene CADPS2: knockout mouse models exhibit autistic characteristics, but the gene itself has not been directly tied to known cases of autism.

A gene can belong to more than one category, depending on the mutation. For instance, a common variant may confer risk for developing idiopathic autism, but an inactivating mutation in the same gene places it in the higher risk-conferring categories. In such cases, all the appropriate categories are used to annotate the genes.
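As an illustration only, the multi-category annotation described above could be modeled as a set of flags. This is a minimal sketch and not SFARI Gene's actual data model: the names Category, GeneEntry, and in_category, and the placeholder gene GENE_X, are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Flag, auto


class Category(Flag):
    """The four gene categories listed in this section."""
    RARE = auto()
    SYNDROMIC = auto()
    ASSOCIATION = auto()
    FUNCTIONAL = auto()


@dataclass
class GeneEntry:
    """Hypothetical record: a gene symbol plus every category that applies."""
    symbol: str
    categories: Category


def in_category(entry: GeneEntry, category: Category) -> bool:
    """Return True if the entry is annotated with the given category."""
    return bool(entry.categories & category)


# GENE_X is a placeholder, not a real database entry. It stands for a gene
# with both a common risk variant and an inactivating mutation.
example = GeneEntry("GENE_X", Category.ASSOCIATION | Category.RARE)

print(in_category(example, Category.RARE))       # True
print(in_category(example, Category.SYNDROMIC))  # False
```

Combining flags with the | operator keeps all applicable categories on a single entry, which mirrors the rule that every appropriate category is used to annotate a gene.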

Both rare and syndromic categories represent monogenic forms of ASD. However, we include a distinct class for each because syndromic genes have been definitively linked to syndromic forms of autism, such as fragile X syndrome, whereas rare genes are only potential candidates for autism. Syndromic genes are present in individuals whose ASD was diagnosed secondary to the main clinical features of their genetic disorder. In contrast, rare genes are identified in the course of genetic screening of individuals diagnosed with ASD.

1.6   Data Organization in SFARI Gene 3.0

A SFARI Gene entry is the presentation of a gene linked to ASD along with all of the gene’s molecular, neurobiological, and clinical attributes.

There are several steps involved in the curation of genes into SFARI Gene. First, all reports pertaining to a candidate gene are extracted, the number of studies is counted, and the information is compiled into a gene entry.

Second, molecular information about the gene is annotated from highly cited and recently published articles and reviewed to assess the gene’s relevance to ASD.

Third, these annotations are reviewed and the gene is assigned a score reflecting its link to ASD.

Finally, the information is added to the SFARI Gene database where it is available to the public.

To view our curation criteria, please see the About Curation sections for each of the modules of SFARI Gene:
Human Gene Curation
Animal Models Curation
PIN Curation
CNV Curation
Gene Scoring Curation

1.7   Accessing Data in SFARI Gene

The redesigned interface of SFARI Gene allows users to easily browse all of the genetic information contained in the database. Users can find specific information via the Advanced Search tool or use the provided filters to sort the data by individual chromosome or gene score. Information on related protein interactions and copy number variants can be found on the gene summary page under the related tabs, as well as in the dedicated PIN and CNV modules.
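For readers who work with a local copy of gene-level data, the chromosome and gene score filters described above can be approximated in a few lines. This is a sketch under stated assumptions: the GeneRow fields, the placeholder rows, and the integer score values are invented for illustration and do not reflect the actual SFARI Gene export format or scoring scale.

```python
from typing import List, NamedTuple, Optional


class GeneRow(NamedTuple):
    """Hypothetical row: field names are assumptions, not the real schema."""
    symbol: str
    chromosome: str
    score: int  # placeholder integer; the real scoring scale may differ


# Placeholder rows, not real database content.
ROWS = [
    GeneRow("GENE_A", "1", 2),
    GeneRow("GENE_B", "X", 1),
    GeneRow("GENE_C", "1", 4),
    GeneRow("GENE_D", "7", 3),
]


def filter_rows(rows: List[GeneRow],
                chromosome: Optional[str] = None,
                score: Optional[int] = None) -> List[GeneRow]:
    """Keep rows matching every filter that is set; None means 'no filter'."""
    return [
        r for r in rows
        if (chromosome is None or r.chromosome == chromosome)
        and (score is None or r.score == score)
    ]


def sort_by_score(rows: List[GeneRow]) -> List[GeneRow]:
    """Return a new list ordered by ascending score."""
    return sorted(rows, key=lambda r: r.score)


print([r.symbol for r in filter_rows(ROWS, chromosome="1")])
print([r.symbol for r in sort_by_score(ROWS)])
```

Treating an unset filter as None lets the same function serve a chromosome-only filter, a score-only filter, or both together.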

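Data can be accessed three ways using SFARI Gene 3.0: via the Advanced Search function, by browsing the database, or by using the data visualization tools.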
