G4RNA

G4RNA Help


This page briefly describes each feature of the G4RNA browsing tool. The G4RNA database is publicly available so that users can perform comparative analyses of known RNA G-quadruplexes. It can also serve as a quick reference for extracting available data on RNA sequences tested for G-quadruplex folding. The browsing tool is designed mainly for precise queries. Although it can generate large tables of results, users who need the entire database are advised to contact Jean-Michel Garant: jean-michel(dot)garant(at)usherbrooke(dot)ca


Search engines


Two search engines are available to query the G4RNA database. They can be used independently or combined to perform a more complex query. To select a search engine, check the corresponding box.

Keyword driven search

The keyword search engine searches G4RNA's entries for a character string and returns every matching sequence. Three form fields are used to formulate the query.

  • "Search by": Allows selection between search strategies below.

    Gene symbol: Search for the symbol attributed to a gene by the HUGO Gene Nomenclature Committee (HGNC).

    Sequence: Search for an RNA sequence.

    Experiment: Search for an experiment used to validate the folding of a G-quadruplex.

    Reference: Search for the reference of a paper, formatted in the style used by Nature Publishing Group.

    doi: Search for a digital object identifier, a unique character string assigned to each published paper.

  • "Search type": Tells the search engine how to interpret the user-supplied term.

    Contains: The supplied term must be found within a G4RNA entry for the entry to be shown in the results. Only one term at a time may be searched with the "Contains" search type. However, "Contains" supports regular expressions and the IUPAC nucleic acid code alphabet (R, Y, W, S, [...], N), which gives the user more flexibility.

    Exact: The supplied term must perfectly match the corresponding entry in G4RNA for the entry to be shown in the search results. More than one term can be searched at a time with the "Exact" search type by separating the terms with a new line, a space character or any of the following characters [ , ; : - _ ].

  • "Search term": Text area where the user can supply search terms to the search engine. Make sure that the terms respect the choices made in the preceding fields.
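The two search types above can be illustrated with a minimal Python sketch. It is not the code used by G4RNA; the function names `iupac_to_regex` and `split_exact_terms` are hypothetical, and the sketch assumes that IUPAC letters are simply expanded into regular-expression character classes and that "Exact" terms are split on the separators listed above. It only handles uppercase IUPAC letters written outside square brackets.

```python
import re

# IUPAC nucleic acid codes mapped to the RNA bases they stand for.
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU",
    "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG",
    "N": "ACGU",
}


def iupac_to_regex(term: str) -> str:
    """Expand uppercase IUPAC letters into character classes.

    Any other character (quantifiers, digits, ...) is copied unchanged,
    so simple regular expressions such as "G{2,}" keep working.
    """
    parts = []
    for ch in term:
        bases = IUPAC_RNA.get(ch)
        if bases is None:
            parts.append(ch)                  # not an IUPAC letter
        elif len(bases) == 1:
            parts.append(bases)               # unambiguous base
        else:
            parts.append("[" + bases + "]")   # ambiguous code
    return "".join(parts)


def split_exact_terms(text: str) -> list[str]:
    """Split an "Exact" query on whitespace or any of , ; : - _"""
    return [t for t in re.split(r"[\s,;:\-_]+", text) if t]


pattern = iupac_to_regex("GGNGG")
found = re.search(pattern, "AUGGAGGC") is not None
terms = split_exact_terms("FMR1, TERF2;NRAS\nVEGFA")
```

Under these assumptions, "GGNGG" becomes `GG[ACGU]GG`, and the four gene symbols are returned as four separate terms. Because the hyphen is one of the listed separators, a term that itself contains a hyphen would be split in two by this sketch.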

Position driven search

The position search engine uses the human genome (hg38) annotations to locate G4RNA's entries within a user-supplied interval on any of the 24 human chromosomes. Three fields are used to define the request.

  • "Chromosome": Select the chromosome of interest.

    All: All entries of G4RNA without any restriction related to chromosome.

    1, 2, 3, [...], 22, X, Y: Indicates which of the 24 human chromosomes will be searched.

    Not in Genome: Selects entries that have no specific chromosomal position, such as telomeric repeat-containing RNA (TERRA) and artificial sequences.

  • "From": Sets the lower bound of the interval in which the search engine looks for G4RNA entries.
  • "To": Sets the upper bound of the interval.
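The logic of a position driven query can be sketched in Python as follows. This is an illustration only, not the G4RNA implementation: the `Entry` class, the `position_search` function and the sample coordinates are hypothetical placeholders, and the sketch assumes that an entry matches only when it lies entirely inside the "From"/"To" interval, which this help page does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    gene: str
    chromosome: Optional[str]  # "1".."22", "X", "Y", or None if not in genome
    start: Optional[int]       # hg38 start position, None if not in genome
    end: Optional[int]         # hg38 end position, None if not in genome


def position_search(entries, chromosome="All", start=None, end=None):
    """Return the entries matching the "Chromosome", "From" and "To" fields."""
    hits = []
    for e in entries:
        if chromosome == "Not in Genome":
            if e.chromosome is None:
                hits.append(e)
            continue
        if chromosome != "All" and e.chromosome != chromosome:
            continue
        # Assumption: the entry must lie entirely inside [start, end].
        if start is not None and (e.start is None or e.start < start):
            continue
        if end is not None and (e.end is None or e.end > end):
            continue
        hits.append(e)
    return hits


# Placeholder entries with made-up coordinates, for illustration only.
entries = [
    Entry("GENE_A", "1", 100, 130),
    Entry("GENE_B", "1", 500, 540),
    Entry("ARTIFICIAL", None, None, None),
]
on_chr1 = position_search(entries, "1", start=50, end=200)
not_in_genome = position_search(entries, "Not in Genome")
```

With these placeholder entries, the first query keeps only GENE_A and the second keeps only the artificial sequence.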

Display fields


The display fields let the user customize the search results by selecting the information to show on screen. The order of the results can also be customized with the sorting options. All available fields are described below in four categories:

Sequences

Each entry in G4RNA contains an RNA sequence that was tested in the laboratory for G-quadruplex folding. The information gathered on those sequences is listed below:

  • Gene symbol: Displays a short abbreviation of the gene containing the sequence, if applicable. The annotation "Artificial" is displayed for G4RNA sequences that are not found in the genome. Displayed symbols are approved by the HUGO Gene Nomenclature Committee (HGNC).
  • Sequence identifier: Displays the annotation used by the experimenters to refer to the sequence. "WT" means that the sequence is wild type and is found in the human genome. The default display is restricted to wild type sequences; the user must select the "Sequence identifier" field to display the mutants and constructs designed by the experimenters.
  • Location in mRNA: Displays the location of the sequence within the messenger RNA (e.g. 5'UTR, 3'UTR, Intron 1, Exon 3...). The "Artificial" annotation is displayed for artificial sequences found in G4RNA.
  • Chromosome: Displays the chromosome on which the sequence is found. The "None" annotation is displayed for non-genomic sequences.
  • Start position: Displays the chromosomal position where the sequence starts using the annotation of the (hg38) assembly.
  • End position: Displays the chromosomal position where the sequence ends using the annotation of the (hg38) assembly.
  • Sequence length: Displays the length of the RNA sequence.
  • Sequence: Displays the RNA sequence as a chain of characters.

Experiments

Experimental results and references.

  • Experiment: Displays experiments performed on the sequence of interest. Each experiment will be displayed as a new row.
  • G4 folding: Displays the outcome of the folding of a G-quadruplex as determined by experimentation.
  • Reference: Displays the entire reference of the article in the format used by Nature Publishing Group.
  • doi: Displays the digital object identifier of the related paper, a unique combination of characters assigned to a publication that points to metadata such as the URL where the article can be found. The doi is a stable way of referring to an article, since changes of URL are tracked in the metadata.

Predictions

Values obtained by submitting the sequences to different G-quadruplex prediction tools.

  • cGcC score: Displays the cGcC score obtained by sequence analysis as described by JD. Beaudoin, R. Jodoin and JP. Perreault.
  • 2D Structure without G4: Displays the predicted secondary structure of the sequence as determined using the default parameters of RNAfold from the ViennaRNA package v2.1.7 by R. Lorenz et al.
  • Mfe without G4: Displays the predicted minimum free energy (Mfe) in kcal/mol as determined using the default parameters of RNAfold from the ViennaRNA package v2.1.7.
  • Mfe per nt without G4: Displays the predicted minimum free energy (Mfe) expressed in kcal/mol per nucleotide (nt) as a way to compare sequences of various lengths. The Mfe is determined using the default parameters of RNAfold from the ViennaRNA package v2.1.7.
  • 2D Structure with G4: Displays the predicted secondary structure of the sequence as determined using the G-quadruplex compatible parameters of RNAfold from the ViennaRNA package v2.1.7.
  • Mfe with G4: Displays the predicted minimum free energy (Mfe) in kcal/mol as determined using the G-quadruplex compatible parameters of RNAfold from the ViennaRNA package v2.1.7.
  • Mfe per nt with G4: Displays the predicted minimum free energy (Mfe) expressed in kcal/mol per nucleotide (nt) as a way to compare sequences of various lengths. The Mfe is determined using the G-quadruplex compatible parameters of RNAfold from the ViennaRNA package v2.1.7.
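The "Mfe per nt" fields amount to dividing the predicted Mfe by the sequence length. A minimal Python sketch, assuming exactly that normalization; the function name `mfe_per_nt` and the numeric values are hypothetical and are not taken from G4RNA:

```python
def mfe_per_nt(mfe_kcal_per_mol: float, sequence: str) -> float:
    """Normalize a minimum free energy by the length of the sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return mfe_kcal_per_mol / len(sequence)


# Made-up values: the same Mfe spread over a short and a long sequence.
short_seq = "G" * 24
long_seq = "G" * 60
short_value = mfe_per_nt(-12.0, short_seq)
long_value = mfe_per_nt(-12.0, long_seq)
```

In this made-up example both sequences have the same Mfe, but the shorter one has the lower (more negative) energy per nucleotide, which is the kind of comparison the normalized fields are meant to support.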

QGRS Mapper

The QGRS Mapper is a G-quadruplex prediction tool developed by O. Kikin, L. D'Antonio and PS. Bagga. It screens a sequence for potential G-quadruplexes and scores them based on the number of quartets and the differences between the loop lengths. The information displayed with this option was obtained for the highest-scoring potential G-quadruplex in each G4RNA sequence, using an adapted version of the QGRS Mapper with the following parameters: a maximal length of 45 nt, a minimal number of quartets of 2 and loop sizes ranging between 1 and 7 nt.

  • QGRS score: Displays the QGRS score of the potential G-quadruplex.
  • QGRS start position: Displays the position of the first nucleotide of the potential G-quadruplex within the sequence.
  • QGRS end position: Displays the position of the last nucleotide of the potential G-quadruplex within the sequence.
  • QGRS sequence length: Displays the potential G-quadruplex length.
  • QGRS sequence: Displays the sequence of the potential G-quadruplex.
  • Number of quartets: Displays the number of quartets in the structure (that is, the number of consecutive Gs in each of the four runs).
  • 1st loop sequence: Displays the sequence of the first loop situated between the first and second G run.
  • 2nd loop sequence: Displays the sequence of the second loop situated between the second and third G run.
  • 3rd loop sequence: Displays the sequence of the third loop situated between the third and fourth G run.
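To make the fields above concrete, the Python sketch below locates candidate motifs made of four G runs of equal length separated by three loops, using the parameters stated for G4RNA (at most 45 nt, at least 2 quartets, loops of 1 to 7 nt). It is a simplified illustration, not the QGRS Mapper algorithm: it does not compute the QGRS score, it reports at most one candidate per start position and quartet count, and the function name `find_g4_candidates` is hypothetical.

```python
import re


def find_g4_candidates(seq, min_quartets=2, loop_min=1, loop_max=7, max_len=45):
    """List candidate motifs G{n}-loop-G{n}-loop-G{n}-loop-G{n} in an RNA sequence."""
    seq = seq.upper().replace("T", "U")
    candidates = []
    # Largest quartet count that can still fit within max_len.
    largest_n = (max_len - 3 * loop_min) // 4
    for n in range(min_quartets, largest_n + 1):
        run = f"G{{{n}}}"
        loop = f"([ACGU]{{{loop_min},{loop_max}}}?)"
        # The lookahead lets candidates starting at different positions overlap.
        pattern = re.compile(f"(?=({run}{loop}{run}{loop}{run}{loop}{run}))")
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(motif) > max_len:
                continue
            candidates.append({
                "start": m.start() + 1,           # 1-based, as in a results table
                "end": m.start() + len(motif),
                "length": len(motif),
                "sequence": motif,
                "quartets": n,
                "loops": [m.group(2), m.group(3), m.group(4)],
            })
    return candidates


hits = find_g4_candidates("AAGGAGGCGGUGGAA")
```

For this short made-up sequence, the sketch finds a single two-quartet candidate, GGAGGCGGUGG, spanning positions 3 to 13 with loops A, C and U.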

Sorting

The results of the query can be sorted by choosing a characteristic and an order (ascending or descending). The choices match the display fields defined above and include:

  • Gene symbol
  • Location
  • Chromosome
  • Sequence length
  • Experiment
  • Reference
  • cGcC score
  • Mfe without G4
  • Mfe per nt without G4
  • Mfe with G4
  • Mfe per nt with G4
  • QGRS score
  • QGRS sequence length
  • Number of quartets


Results


Search results are presented in a table whose columns are defined by the display fields chosen by the user. The columns appear in the same order as the display fields. Sequences, references and 2D structures longer than 45 characters are truncated to keep the table readable. Truncated values can be clicked to display the complete information in a pop-up bubble.

Results can also be downloaded using the button above the table. The table is converted to a comma-separated values format (.csv), which is supported by most spreadsheet applications such as Microsoft Excel and LibreOffice Calc.
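A downloaded table can also be processed outside a spreadsheet. The Python sketch below reads such a file with the standard `csv` module, then filters and sorts the rows. The column headers ("Gene symbol", "Sequence length", "Sequence") are assumed to match the display field names, which this page does not confirm, and the sample rows are made-up placeholders rather than G4RNA data.

```python
import csv
import io

# Placeholder content standing in for a downloaded file (assumed headers).
SAMPLE_CSV = """Gene symbol,Sequence length,Sequence
GENE_A,15,AAGGAGGCGGUGGAA
GENE_B,9,GGAGGUGGA
"""


def load_results(handle):
    """Read a results table and convert the length column to integers."""
    rows = []
    for row in csv.DictReader(handle):
        row["Sequence length"] = int(row["Sequence length"])
        rows.append(row)
    return rows


rows = load_results(io.StringIO(SAMPLE_CSV))

# Keep the genes whose sequence is shorter than 10 nt.
short_genes = [r["Gene symbol"] for r in rows if r["Sequence length"] < 10]

# Sort by sequence length, longest first (a descending sort).
by_length = sorted(rows, key=lambda r: r["Sequence length"], reverse=True)
```

For a real download, the same function can be called on an opened file, for example `with open("results.csv", newline="") as fh: rows = load_results(fh)`, where the file name is a placeholder.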


Contact us


G4RNA is developed in collaboration with the research group of Jean-Pierre Perreault Ph.D. and is supported by the RiboClub.

G4RNA is managed by Jean-Michel Garant. All comments, questions or suggestions should be sent by e-mail: jean-michel(dot)garant(at)usherbrooke(dot)ca