What is a GUDMAP entry?
For ISH and IHC, entries contain the expression data for a single gene, assayed with a defined probe in a single mouse strain of a single mouse sex at a single stage or time in development. Sex can be unknown (e.g. GUDMAP:9683).
For microarray, entries encompass the expression data for many genes sampled from a single stage or time in development. Sex can be unknown (e.g. GUDMAP:7077) or both. Each entry relates to a single sample taken from a single microarray series within the database. A search by gene therefore returns every microarray entry whose chip includes that gene.
- Querying by gene
- Querying by anatomy
- Boolean Anatomy Search
- Querying by accession number
- Querying by function
- Querying by disease
Query by Gene
The Query > Gene text box on the GUDMAP Gene Expression page can be used to return GUDMAP entries that contain information about the expression of the gene or genes of interest. The text box has a predictive text function for gene symbols. This search addresses all in situ and microarray expression data. The result of this query is returned as an expression summary for each gene. Under ‘options’ there is an alternative option to view results as a list of database entries that satisfy the search criteria.
Genes can be searched for using the following terms:
- Gene Symbols/Names (and synonyms)
- MGI, Ensembl and MTF accession IDs
The predictive text function only works for gene symbols. It is still possible to search using a gene name although this won't appear in the predictive text list.
The search automatically assumes a 'wildcard' at the end of the search string, so 'uro' will search for words/symbols beginning with 'uro', e.g. urothelium, urogenital, uroplakin, etc.
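The trailing wildcard can be pictured as a 'starts with' match. The Python sketch below is illustrative only: the function name prefix_search and the sample vocabulary are assumptions, as is the choice to ignore case and to match against the start of the whole term. It is not GUDMAP's own implementation.

```python
def prefix_search(query, vocabulary):
    """Return the terms that begin with `query`, ignoring case.

    Mimics a search with an implicit wildcard at the end of the
    search string, so 'uro' behaves like 'uro*'.
    """
    needle = query.strip().lower()
    if not needle:
        return []
    return [term for term in vocabulary if term.lower().startswith(needle)]


# Hypothetical vocabulary, for illustration only.
terms = ["urothelium", "urogenital", "uroplakin", "neuron"]
matches = prefix_search("uro", terms)
print(matches)
```

Note that 'neuron' is not returned: it contains 'uro' but does not begin with it.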
By clicking on 'options' next to the text box:
- Users can choose whether the results of the query are returned as expression summaries (the default option) or as a list of database entries, by selecting GUDMAP entries.
- The search on GUDMAP entries can be restricted to an individual Theiler stage.
- Users can perform a batch query and upload batch query files.
- Use care when searching with gene names. They may not be what you expect (e.g. "homeo box").
- When entering an accession ID be sure to use the full ID (e.g. MGI:98957 or ENSMUSG00000036856). Searches will only be successful if an exact match is found, even if the 'starts with' operator is used.
- The correct format for MTF IDs is "MTF" followed by a hash "#" followed by the ID. For example: MTF#66. Do not use leading zeros in the ID number.
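Because accession IDs must match exactly, it can help to check a list of IDs before pasting it into the search box. The following Python sketch shows one way to do this. The patterns are assumptions inferred from the example IDs on this page (real IDs may follow other rules), and the helper names classify_accession and normalise_mtf are hypothetical.

```python
import re

# Patterns inferred from the example IDs in the text (assumption).
ID_PATTERNS = {
    "MGI": re.compile(r"MGI:\d+"),
    "Ensembl": re.compile(r"ENSMUSG\d{11}"),
    "MTF": re.compile(r"MTF#[1-9]\d*"),  # no leading zeros allowed
}


def classify_accession(term):
    """Return the ID type of a well-formed accession ID, else None."""
    candidate = term.strip()
    for id_type, pattern in ID_PATTERNS.items():
        if pattern.fullmatch(candidate):
            return id_type
    return None


def normalise_mtf(term):
    """Rewrite variants such as 'MTF#0066' or 'mtf 66' as 'MTF#66'."""
    match = re.fullmatch(r"mtf\s*#?\s*0*(\d+)", term.strip(), flags=re.IGNORECASE)
    return f"MTF#{match.group(1)}" if match else None


print(classify_accession("MGI:98957"))
print(classify_accession("MTF#066"))
print(normalise_mtf("MTF#0066"))
```

In this sketch 'MTF#066' is rejected by classify_accession because of the leading zero, and normalise_mtf rewrites it into the accepted form.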
It is possible to search for multiple genes in two ways:
Firstly, terms can be entered into the Query > Gene text box as a semicolon (;) separated list. Predictive text is available only for the first gene symbol in the list but other valid symbols can be added by typing on, using ; to separate terms. This list can contain a mixture of different terms (e.g. MGI accession IDs and Gene Symbols).
Secondly, by clicking on options, the batch query tool can be used to upload a file containing a list of gene symbols. The batch query will accept a tab-delimited or newline-delimited list as either a csv file or a plain text file. Ensure your file has the correct file extension, e.g. .txt or .csv. The batch query will only accept queries by gene symbol. Batch files can be created from the output of other databases, for example from a file exported from an Ensembl Biomart job. Queries using a list of terms and batch queries are interpreted as A OR B OR C…
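The two ways of supplying multiple genes, and the OR interpretation, can be sketched in Python as follows. This is a minimal illustration, not GUDMAP's code: the function names are hypothetical, the gene symbols are arbitrary examples, and case-insensitive matching is an assumption.

```python
import re


def parse_term_list(text):
    """Split a semicolon-separated query into trimmed, non-empty terms."""
    return [term.strip() for term in text.split(";") if term.strip()]


def parse_batch_text(text):
    """Split batch-file contents on tabs and newlines into gene symbols."""
    return [s.strip() for s in re.split(r"[\t\r\n]+", text) if s.strip()]


def matches_any(entry_genes, terms):
    """OR semantics: true if the entry mentions at least one queried term."""
    wanted = {term.lower() for term in terms}
    return any(gene.lower() in wanted for gene in entry_genes)


query_terms = parse_term_list("Wt1; Pax2 ;MGI:98957;")
batch_terms = parse_batch_text("Wt1\tPax2\nSix2\r\n")
print(query_terms)
print(batch_terms)
print(matches_any(["Pax2"], query_terms))
```

An entry is returned as soon as it matches one term in the list; it does not need to match all of them.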
To create a batch file using Ensembl Biomart:
Go to http://www.ensembl.org/biomart/martview/
1. Choose database and dataset: Ensembl 56 / Human GRCh37
2. Select filters -> region -> chromosome and Base Pair (or markers / bands)
3. Select Attributes -> Homologs -> Mouse orthologs (and choose mouse Ensembl gene ID and whatever other features you want to export)
4. Select Results -> export as file.
This will give you a list of Mouse Ensembl IDs.
To get Mouse gene symbols, do another Biomart job:
1. Choose the Mouse NCBIM37 as dataset
2. Click on Filters -> Gene -> ID list limit -> Ensembl gene IDs (load the file with the Ensembl IDs)
3. In Attributes -> Gene, ask for Associated gene names.
4. You can output a list of gene names in the correct format to read directly into GUDMAP as a batch file.
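If the exported file contains extra columns, duplicates or empty rows, a short script can reduce it to a clean newline-delimited list of symbols. The Python sketch below assumes a tab-delimited export with a header row; the column name "Associated Gene Name", the function names and the placeholder rows are all assumptions and should be adjusted to match the actual export.

```python
import csv
import io


def symbols_from_export(tsv_text, column="Associated Gene Name"):
    """Collect unique, non-empty gene symbols from a tab-delimited export."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    seen = set()
    symbols = []
    for row in reader:
        symbol = (row.get(column) or "").strip()
        if symbol and symbol not in seen:
            seen.add(symbol)
            symbols.append(symbol)
    return symbols


def to_batch_text(symbols):
    """Format symbols as a newline-delimited batch file body."""
    return "\n".join(symbols) + "\n"


# Placeholder export, for illustration only (not real Biomart output).
example_export = (
    "Ensembl Gene ID\tAssociated Gene Name\n"
    "ID_1\tGeneA\n"
    "ID_2\t\n"
    "ID_3\tGeneB\n"
    "ID_4\tGeneA\n"
)
batch_body = to_batch_text(symbols_from_export(example_export))
print(batch_body)
```

The resulting text can be saved with a .txt extension and uploaded through the batch query tool.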
Query by Anatomy
The Query > Anatomy text box on the GUDMAP Gene Expression page can be used to search for genes annotated with expression present, uncertain or not detected in one or more given structures. This search addresses annotated in situ expression data and microarray data. The result is returned as a list of GUDMAP entries that satisfy the search criteria:
- For ISH/IHC - that contain either a direct annotation for the anatomical term specified (whether it is present, uncertain or not detected) or an inferred annotation for the anatomical term (see below). Each entry includes the relevant symbol for the gene expressed in the queried structure. Symbols are the current standard gene symbols (see MGI). To better understand the context of your results please see the tutorial on genitourinary development.
- For Microarray - where the anatomical term specified is the sample material or where sub-components of the anatomical term have been used as sample material. For example, using the term ‘maturing nephron’ for the query will return database entries where the sample was the ‘early proximal tubule’ and entries where the sample was ‘maturing renal corpuscle’. Both these structures are part of the maturing nephron.
The Query > Anatomy text box on the GUDMAP Gene Expression page accepts anatomy terms contained within the Mouse Genitourinary System Anatomy Ontology. The ontology is described in more detail in the ontology resources page.
Structures are often referred to with a variety of different terms. The predictive text in the anatomy search box on the GUDMAP Gene Expression page will help you enter the correct term. However, some structures have common names that begin with different text from the ontology name. For example, proximal tubule is represented by the term ‘renal proximal tubule’ in the ontology.
You can easily find the ontology term for a structure by viewing the interactive anatomy ontology tree on the left side of the Boolean Anatomy Search page. This tree is supported by a text string search that will find terms containing a given string. For example, typing ‘proximal’ in the ‘find anatomy component’ box will highlight ‘renal proximal tubule’ in the tree. To use a term in the anatomy query on the GUDMAP Gene Expression page, click on the desired term in the anatomy tree; the term will then be shown in red on the right, from where it can be copied and pasted into the anatomy query box. The anatomy query on the GUDMAP Gene Expression page uses this term simply as a text string. This is in contrast to the more advanced Boolean search, which treats items in the tree that share a name as different entities (see help for Boolean Anatomy Search).
Queries can be performed for multiple components by entering terms as a semicolon separated list (e.g. 'kidney; ovary'). Predictive text is available only for the first term in the list but other valid ontology terms can be added by typing on, using ; to separate terms. Queries with multiple terms are treated as A OR B OR C…
Suppose, for example, the anatomical term 'superficial cellular layer' has been annotated as expression 'present' for a particular gene. As a consequence, the anatomical term 'urothelium' has 'inferred present' annotation (even though it has not been annotated directly) because 'superficial cellular layer' is a part of the 'urothelium'. Equally, if 'urothelium' was annotated as 'not detected' then its parts, including 'superficial cellular layer' would have 'inferred not detected' annotation.
The original annotations and original expression images are displayed on the page for the corresponding database entry.
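The inference rule described above can be sketched in Python: 'present' propagates upwards to the structures that contain the annotated term, and 'not detected' propagates downwards to its parts. This is a minimal sketch under stated assumptions. The PART_OF table holds only the single relationship mentioned in the example, the function names are hypothetical, and leaving 'uncertain' annotations unpropagated is an assumption rather than documented GUDMAP behaviour.

```python
# Child -> parent 'part of' links; only the relationship from the example.
PART_OF = {"superficial cellular layer": "urothelium"}


def ancestors(term):
    """Return every structure that `term` is (transitively) a part of."""
    result = []
    while term in PART_OF:
        term = PART_OF[term]
        result.append(term)
    return result


def descendants(term):
    """Return every (transitive) part of `term`."""
    result = []
    for child, parent in PART_OF.items():
        if parent == term:
            result.append(child)
            result.extend(descendants(child))
    return result


def infer_annotations(direct):
    """Derive inferred annotations from a dict of direct annotations.

    Direct annotations take precedence and are never overwritten.
    """
    inferred = {}
    for term, status in direct.items():
        if status == "present":
            targets, label = ancestors(term), "inferred present"
        elif status == "not detected":
            targets, label = descendants(term), "inferred not detected"
        else:
            continue  # 'uncertain' is not propagated in this sketch
        for target in targets:
            if target not in direct:
                inferred.setdefault(target, label)
    return inferred


print(infer_annotations({"superficial cellular layer": "present"}))
print(infer_annotations({"urothelium": "not detected"}))
```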
Boolean Anatomy Search
The Boolean Anatomy Search allows complex queries to be constructed to search for gene expression based on selected anatomical structures. The search allows combinations of structures and developmental stages and different combinations of expression ‘present’, ‘not detected’ and ‘uncertain’. The search can be applied to a combination of structures or just to one structure, for example to retrieve only genes expressed in a structure, or to compare expression in the same structure at different stages.
For more details, please go to the Boolean Anatomy Search Help Page.
Query by accession ID
The Query > Accession ID text box on the GUDMAP Gene Expression page will accept a variety of different accession IDs and will return all GUDMAP entries that contain information about the expression relating to the accession ID.
Accession IDs supported by GUDMAP:
- GUDMAP entry IDs (e.g. GUDMAP:8200, GUDMAP:7720).
- Ensembl ID (e.g. ENSMUSG00000016458).
- MGI ID (e.g. MGI:98968).
- MA probe ID (e.g. maprobe:4427).
Queries using multiple accession IDs can be performed by entering terms as a semicolon (;) separated list (e.g. GUDMAP:7200;MGI:98957).
Please note, the prefix is not required when using this search. If no prefix is specified then the search will be performed across all accession types.
NOTE: when searching using an ID other than a GUDMAP ID, only in situ (ISH and IHC) entries will be returned.
Query by function
The query by function text box on the GUDMAP Gene Expression page can be used to search for genes and probes annotated with a Gene Ontology (GO) Molecular Function, Biological Process, or Subcellular Location term. This search addresses all in situ expression data in GUDMAP.
The query string entered into the text box will be used to search the Gene Ontology for occurrences of this term. For each of the GO terms found, a check is done to find all gene products annotated with these terms. The GO Evidence Decision Tree gives a good explanation of the methods and types of annotation used by GO.
The list of gene products gives a list of gene symbols that can then be used to search the GUDMAP database to find entries that contain information about the expression of these genes.
NOTE: Searching by function will only return ISH or IHC entries.
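The steps of the function query (find matching GO terms, collect the gene products annotated with them, then use the resulting symbols for a gene search) can be sketched in Python as below. Everything here is hypothetical: the function name, the placeholder GO identifiers, term names and gene symbols are invented for illustration, and the use of a case-insensitive substring match is an assumption about how 'occurrences' are found.

```python
def genes_for_function(query, go_terms, annotations):
    """Return sorted gene symbols annotated with GO terms matching `query`.

    go_terms:    dict mapping GO term ID -> term name
    annotations: dict mapping GO term ID -> iterable of gene symbols
    """
    needle = query.strip().lower()
    if not needle:
        return []
    # Step 1: find GO terms whose name contains the query string.
    hits = [go_id for go_id, name in go_terms.items() if needle in name.lower()]
    # Step 2: collect the gene symbols annotated with those terms.
    symbols = set()
    for go_id in hits:
        symbols.update(annotations.get(go_id, ()))
    return sorted(symbols)


# Placeholder data, for illustration only (not real GO content).
example_terms = {
    "GO:EX1": "alpha binding",
    "GO:EX2": "alpha transport",
    "GO:EX3": "beta binding",
}
example_annotations = {
    "GO:EX1": ["GeneA", "GeneB"],
    "GO:EX2": ["GeneB", "GeneC"],
    "GO:EX3": ["GeneD"],
}
symbols = genes_for_function("alpha", example_terms, example_annotations)
print(symbols)
```

The resulting list of symbols plays the role of the gene list that is then used to search for ISH or IHC entries.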
Query by disease
The GUDMAP disease resource is accessible by clicking on Query > Disease or via the Disease menu bar item at the top of the page. This searchable database of associations between genes and diseases of the urogenital system can be queried:
- by Disease-gene associations: select a disease to find genes associated with this condition, or select a gene to search for potential diseases associated with it.
- by Phenotype-gene associations: select a phenotype to find associated genes, or select a gene to search for potential phenotypes associated with it.
Links to GATACA and ToppGene can be found in the left side bar of the GUDMAP disease resource pages.