ADHDgene Database
  • Published Variant
  • Published Gene: 359
  • Published Region: 128
  • Pathway by PBA: 8
  • Study: 361

Tutorial

1. Overview

To our knowledge, ADHDgene is the first genetic database for ADHD. It contains multiple types of genetic factors related to ADHD, including SNPs, CNVs, VNTRs, microsatellites, genes, chromosomal regions, and biological pathways. Data in this database were derived from both thorough literature screening with manual curation (core data) and extended functional analyses (extended data), such as LD analysis of literature-origin SNPs, pathway-based analysis (PBA) of GWAS data, and gene mapping to identify more candidate susceptibility genes. Each genetic factor was fully annotated, and the correlations among these genetic factors are presented to make ADHDgene a comprehensive and connected information resource for further genetic studies of ADHD. ADHDgene provides several search options (part 5) to help users access all the data and data connections. For visualization, GBrowse was incorporated to show the correlations between different types of genetic factors (variant/gene/region/pathway) in the context of genomic data.


2. Data summary

At present, the ADHDgene database includes a total of 361 studies. The number of entries for each type of genetic factor is summarized in Table 1.

3. Core Data: data collection from literature

We performed a comprehensive search in PubMed for genetic susceptibility studies of ADHD, including genome-wide association studies, candidate-gene association studies, linkage studies, mutational studies, genome-wide copy number variation analyses and meta-analyses. Details of each study (e.g. study design, sample population, analytic method) and the statistical values (e.g. P-values, ORs, LOD scores) of all genetic factors were extracted from the original publication. To better illustrate the associations between genetic candidates and ADHD, the results of statistical analysis were categorized as 'Significant', 'Non-significant' or 'Trend', mainly according to their statistical evidence. The authors' comments are also presented for users' reference. Categorization of the statistical results was made according to the following criteria:

A full annotation of each genetic factor and their correlations was made according to NCBI, Ensembl, UCSC and other public databases.


4. Extended Data: extended functional analysis

4.1 LD analysis of literature-origin SNPs
To find more functional and possible candidates for ADHD, LD proxies of the literature-origin SNPs were included in ADHDgene. The LD data used in this analysis were downloaded from the HapMap FTP site (ftp.ncbi.nlm.nih.gov/hapmap/ld_data/2009-04_rel27/). Briefly, the LD data were compiled from merged genotype data from phases I+II+III (HapMap rel #27, NCBI B36) submitted by HapMap genotyping centers to the DCC. LD SNP pairs were selected with an r² threshold of 0.8. All HapMap populations were included in the LD analysis.
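The proxy selection described above can be sketched roughly as follows. This is only an illustration, not the pipeline actually used for ADHDgene. It assumes whitespace-separated LD rows in the column order pos1, pos2, population, rs1, rs2, D', r², LOD, fbin, and it assumes the threshold is inclusive (r² ≥ 0.8), which the text does not specify. The function name `ld_proxies`, the SNP identifiers and all numbers in the sample rows are made up for the example.

```python
R2_THRESHOLD = 0.8  # cutoff from the text; treating it as inclusive is an assumption


def ld_proxies(lines, query_snps, r2_min=R2_THRESHOLD):
    """Yield (query_snp, proxy_snp, population, r2) for LD pairs passing the cutoff.

    Assumed column order (whitespace-separated):
    pos1 pos2 population rs1 rs2 Dprime r2 LOD fbin
    """
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or malformed rows
        population, rs1, rs2 = fields[2], fields[3], fields[4]
        r2 = float(fields[6])
        if r2 < r2_min:
            continue
        # A pair is relevant if either member is a literature-origin SNP.
        if rs1 in query_snps:
            yield rs1, rs2, population, r2
        if rs2 in query_snps:
            yield rs2, rs1, population, r2


# Made-up rows for illustration only (not real HapMap data).
sample = """\
1000 2000 CEU rsQ1 rsN1 1.0 0.95 12.3 1
1000 3000 CEU rsQ1 rsN2 0.6 0.40 2.1 1
1500 2500 YRI rsN3 rsQ2 0.9 0.80 8.0 1
4000 5000 CEU rsN4 rsN5 1.0 1.00 20.0 1
"""

proxies = list(ld_proxies(sample.splitlines(), {"rsQ1", "rsQ2"}))
for query, proxy, population, r2 in proxies:
    print(query, proxy, population, r2)
```

Because all HapMap populations are included, the sketch keeps the population code with each pair rather than collapsing pairs across populations.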

4.2 Pathway-based analysis (PBA) of GWAS data
We performed pathway-based analysis (PBA) on the two ADHD GWAS data sets [2, 3] using i-GSEA [4, 5], a method developed by our team for identifying pathways/gene sets associated with traits. Default parameters were used for i-GSEA, and the annotated GO terms downloaded from MSigDB v3.0 were used as reference gene sets. Pathways/gene sets with FDR < 0.05 were included in the ADHDgene database.
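As a small illustration of the final inclusion step only, the sketch below keeps gene sets whose FDR is strictly below 0.05. It does not implement i-GSEA. It assumes the per-gene-set FDR values have already been produced by that tool, and the record type `GeneSetResult`, the function `significant_gene_sets`, the gene-set names and the FDR values are all hypothetical.

```python
from typing import NamedTuple

FDR_CUTOFF = 0.05  # inclusion cutoff stated in the text (strictly less than)


class GeneSetResult(NamedTuple):
    """Hypothetical record for one pathway/gene set from a PBA run."""
    gene_set: str
    fdr: float


def significant_gene_sets(results, cutoff=FDR_CUTOFF):
    """Return the results with FDR strictly below the cutoff, lowest FDR first."""
    return sorted((r for r in results if r.fdr < cutoff), key=lambda r: r.fdr)


# Placeholder values for illustration only (not real analysis output).
results = [
    GeneSetResult("GO_TERM_A", 0.012),
    GeneSetResult("GO_TERM_B", 0.050),
    GeneSetResult("GO_TERM_C", 0.300),
    GeneSetResult("GO_TERM_D", 0.001),
]

kept = significant_gene_sets(results)
for r in kept:
    print(r.gene_set, r.fdr)
```

A gene set with FDR exactly equal to 0.05 is excluded here, matching the strict inequality in the text.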


5. Search

In addition to the 'Quick Search' provided in the left navigation bar, two search modules, 'Advanced Search' and 'Cross Search', were developed. The 'Advanced Search' module allows users to search each type of genetic factor or study in both core data and extended data by setting detailed requirements. The 'Cross Search' module allows users to investigate the core data thoroughly by searching the data connections among SNPs, genes, regions and the studies that report them. Figure 1 shows a demo search.

Figure 1

6. GBrowse

GBrowse is a popular tool for visualizing genetic and genomic data. It was incorporated in ADHDgene so that users can view the different types of genetic factors (variant/gene/region) simultaneously in the context of genomic regions. Entries from literature (core data) and from extended functional analysis (extended data) are marked in different colors in the browser.

References:
1. Lander, E. and Kruglyak, L. (1995) Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nat Genet, 11, 241-247.
2. Lesch, K.P., Timmesfeld, N., Renner, T.J., Halperin, R., Roser, C., Nguyen, T.T., Craig, D.W., Romanos, J., Heine, M., Meyer, J. et al. (2008) Molecular genetics of adult ADHD: converging evidence from genome-wide association and extended pedigree linkage studies. J Neural Transm, 115, 1573-1585.
3. Neale, B.M., Lasky-Su, J., Anney, R., Franke, B., Zhou, K., Maller, J.B., Vasquez, A.A., Asherson, P., Chen, W., Banaschewski, T. et al. (2008) Genome-wide association scan of attention deficit hyperactivity disorder. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics, 147B, 1337-1344.
4. Zhang, K., Cui, S., Chang, S., Zhang, L. and Wang, J. (2010) i-GSEA4GWAS: a web server for identification of pathways/gene sets associated with traits by applying an improved gene set enrichment analysis to genome-wide association study. Nucleic Acids Res, 38 Suppl, W90-95.
5. Zhang, K., Chang, S., Cui, S., Guo, L., Zhang, L. and Wang, J. (2011) ICSNPathway: identify candidate causal SNPs and pathways from genome-wide association study by one analytical framework. Nucleic Acids Res, 39 Suppl 2, W437-443.