GPr

Tool version

1.0

Keywords

Database query, GEO Profiles database

Summary

The GPr (GEO Profiles requests) program runs elaborate queries against the GEO Profiles database to select studies that show a significant difference between two (or more) conditions for the genes in the input list.

Description

The process is divided into three main steps.

In the first step, the tool retrieves studies that use all genes listed in the “Gene list” text field. The organism of interest can optionally be specified with the “Organism” drop-down list. We recommend using the “Organism” field, because it reduces computing time by limiting the number of studies to be analyzed further.

In the second step, a p-value is computed for each retrieved study with an ANOVA statistical test. This p-value indicates whether there is at least one significant difference between the means of two conditions. Each green box of the GEO Profiles graphic represents a condition. If a condition contains only one value, the ANOVA test cannot be applied and no p-value is reported (this is shown as a '-' character in the output). In this case, the dataset is treated as significant, and it is left to the user to check it manually. The significance threshold can be adjusted with the “Significativity threshold for the p-value” text field. We recommend one of the following values, from least to most stringent: 0.1, 0.05 or 0.01. If the threshold is set to 1, all studies are retrieved whatever their p-value. More than one dataset can be associated with a gene; in this case, the gene is considered significant if at least one of its datasets is significant.
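The decision rule of this step can be sketched in Python as follows. This is a minimal illustration, not the GPr source code: the function names `dataset_is_significant` and `gene_is_significant` are hypothetical, a missing p-value (shown as '-' in the output) is represented by `None`, and a p-value is assumed to be significant when it is less than or equal to the threshold, which is consistent with a threshold of 1 retrieving every study.

```python
from typing import Optional, Sequence


def dataset_is_significant(p_value: Optional[float], threshold: float) -> bool:
    """Return True if one dataset counts as significant for a gene."""
    if p_value is None:
        # No ANOVA p-value ('-' in the output): counted as significant
        # so that the user can check the dataset manually.
        return True
    # Assumption: "significant" means p-value <= threshold.
    return p_value <= threshold


def gene_is_significant(p_values: Sequence[Optional[float]], threshold: float) -> bool:
    """A gene is significant if at least one of its datasets is significant."""
    return any(dataset_is_significant(p, threshold) for p in p_values)


print(gene_is_significant([0.20, 0.03], 0.05))  # True: one dataset is below 0.05
print(gene_is_significant([0.20, 0.30], 0.05))  # False: no dataset is below 0.05
print(gene_is_significant([None], 0.05))        # True: no p-value available
```

The p-values themselves are taken as given here; the sketch only covers how they are turned into a per-gene decision.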

In the third step, the studies that have at least a minimum proportion x of significant genes among the listed genes are selected. This value, a real number between 0 and 1, can be adjusted with the “Show only studies that have a minimum percentage of significant genes in the gene list” text field. If it is set to 1, only studies that have a significant p-value for every gene are shown. If it is set to 0, all studies are shown, even if they have no significant p-value.
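The filter of this step can be illustrated with a short Python sketch. It is an assumption-based example rather than the tool's actual code: `significant_fraction` and `keep_study` are hypothetical names, and each study is represented by a dictionary mapping a gene name to a flag saying whether that gene is significant in the study.

```python
from typing import Dict


def significant_fraction(gene_flags: Dict[str, bool]) -> float:
    """Fraction of the queried genes that are significant in one study."""
    if not gene_flags:
        return 0.0
    return sum(gene_flags.values()) / len(gene_flags)


def keep_study(gene_flags: Dict[str, bool], min_fraction: float) -> bool:
    """Keep a study if its fraction of significant genes reaches the minimum."""
    return significant_fraction(gene_flags) >= min_fraction


flags = {"adck3": True, "rabac1": False, "rsad2": True}
print(keep_study(flags, 0.5))  # True: 2 of 3 genes are significant
print(keep_study(flags, 1.0))  # False: not every gene is significant
print(keep_study(flags, 0.0))  # True: no filter is applied
```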

General comments (Warning/Tips)

This tool has no specific warning. However, keep in mind that some queries can take some time. To reduce the computation time, avoid the “All” option in the “Organism” drop-down list.

Input

No input file is required to use this tool. It does, however, take the options described below:

  • “Gene list”: this text field takes the list of genes to query. To query several genes, separate the gene names with commas. For example: “adck3, rabac1, rsad2”;
  • “Organism”: this drop-down list indicates the specific organism to query. If you do not want to restrict the query to a specific organism, select the “All” option;
  • “Significativity threshold for the p-value”: this text field sets the significance threshold for the p-value. It must be a real number between 0 and 1. If it is set to 1, all p-values are significant. The most commonly used values are 0.1, 0.05 and 0.01;
  • “Show only studies that have a minimum percentage of significant genes in the gene list”: this text field is used to filter the returned list of studies. Only studies that have at least a proportion x of significant genes among the queried genes are displayed; x is defined by the user and must be a real number between 0 and 1. If it is set to 0, no filter is applied; if it is set to 1, only studies that have a significant p-value for every gene are displayed.
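As an illustration of the formats expected by these fields, the following Python sketch parses a comma-separated gene list and checks that a threshold lies between 0 and 1. The helper names `parse_gene_list` and `parse_unit_interval` are hypothetical, and how GPr itself validates its inputs is not documented here; this only shows one plausible way to do it.

```python
from typing import List


def parse_gene_list(text: str) -> List[str]:
    """Split a comma-separated gene list and trim surrounding whitespace."""
    return [name.strip() for name in text.split(",") if name.strip()]


def parse_unit_interval(text: str, label: str) -> float:
    """Parse a real number and check that it lies between 0 and 1."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{label} must be between 0 and 1, got {value}")
    return value


print(parse_gene_list("adck3, rabac1, rsad2"))  # ['adck3', 'rabac1', 'rsad2']
print(parse_unit_interval("0.05", "p-value threshold"))  # 0.05
```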

Output

The output is in HTML format. It first summarizes all the input parameters, and then lists the studies retained under these criteria. The studies are displayed in a table where each row represents a study and where:

  • the first column is a number indicating the study rank;
  • the second column is the GDS reference number with a link to the web page of this study on the GEO Profiles database;
  • the third column is the title of the study;
  • the fourth column is the percentage of significant genes for this study. The studies are sorted by this value;
  • the other columns represent the genes provided by the user, one column per gene. For each gene and each dataset, the p-value of the ANOVA statistical test is displayed. Significant p-values are shown in red. Each p-value links to the web page of the corresponding GEO Profiles graphic.
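The ordering of the table rows can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `rank_studies` is hypothetical, the GDS identifiers are placeholders rather than real studies, and the sort is assumed to go from the highest to the lowest proportion of significant genes, since the sort direction is not stated above.

```python
from typing import List, Tuple


def rank_studies(studies: List[Tuple[str, float]]) -> List[Tuple[int, str, float]]:
    """Sort (GDS id, fraction of significant genes) pairs and add a 1-based rank."""
    # Assumption: highest fraction first; the sort direction is not documented.
    ordered = sorted(studies, key=lambda study: study[1], reverse=True)
    return [(rank, gds, fraction) for rank, (gds, fraction) in enumerate(ordered, start=1)]


# Placeholder identifiers, not real GEO datasets.
for row in rank_studies([("GDS0001", 0.5), ("GDS0002", 1.0)]):
    print(row)
```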

Download the result web page to your computer for a better view of the results; the page is displayed differently in the Galaxy viewing module.

Example

Input

  • “Gene list”: rsad2
  • “Organism”: Gallus gallus
  • “Significativity threshold for the p-value”: 1.0
  • “Show only studies that have a minimum percentage of significant genes in the gene list”: 1.0

Output

Edited on

July 3rd, 2014.

doc/gpr.txt · Last modified: 2014/11/28 16:58 by slegras