Select

Tool version

1.0.1

Keywords

Selection, Lines, Pattern

Summary

Searches for lines matching (or not) a given pattern.

Description

This tool selects the lines that match (or do not match) the pattern.

General comments (Warning/Tips)

If your data is not tab-delimited, use Text Manipulation→Convert.

For details about the syntax of the pattern, see below or consult documentation on regular expressions.

  SYNTAX
  ( ) { } [ ] . * ? + ^ $ are all special characters. \ can be used to "escape" a special 
  character, allowing that special character to be searched for.
  \A matches the beginning of a string (but not of an internal line).
  \d matches a digit, same as [0-9].
  \D matches a non-digit.
  \s matches a whitespace character.
  \S matches anything BUT a whitespace.
  \t matches a tab.
  \w matches an alphanumeric character (in most regular expression engines, also the underscore).
  \W matches anything but an alphanumeric character (or underscore).
  ( .. ) groups a particular pattern.
  \Z matches the end of a string (but not of an internal line).
  { n or n, or n,m } specifies an expected number of repetitions of the preceding pattern.
      {n} The preceding item is matched exactly n times.
      {n,} The preceding item is matched n or more times.
      {n,m} The preceding item is matched at least n times but not more than m times.
  [ ... ] creates a character class. Within the brackets, single characters can be placed. 
  A dash (-) may be used to indicate a range such as a-z.
  . Matches any single character except a newline.
  * The preceding item will be matched zero or more times.
  ? The preceding item is optional and matched at most once.
  + The preceding item will be matched one or more times.
  ^ has two meanings: it matches the beginning of a line or string, and it indicates negation in 
  a character class. For example, [^...] matches every character except the ones inside the brackets.
  $ matches the end of a line or string.
  | Separates alternate possibilities.
  Source : GALAXY
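
A few of the rules above can be tried out with a short script. This is only a sketch: it assumes the tool's pattern engine behaves like Python's re module, which this page does not state, and the sample strings are made up for illustration.

```python
import re

# Each entry: (pattern, sample text, whether a match is expected somewhere in the text).
# The sample texts are hypothetical and chosen only to exercise one rule each.
checks = [
    (r"\d{3}", "id 123", True),      # {n}: exactly three consecutive digits
    (r"^\d", "id 123", False),       # ^: the text does not start with a digit
    (r"[^0-9]", "2014", False),      # [^...]: no non-digit character is present
    (r"a\.b", "a.b", True),          # \. matches a literal dot
    (r"a\.b", "axb", False),         # ...and not any other character
    (r"colou?r", "color", True),     # ?: the "u" is optional
    (r"\t", "chr1\t100", True),      # \t matches a tab
    (r"cat|dog", "hotdog", True),    # |: either alternative may match
]

# re.search looks for the pattern anywhere in the text.
results = [re.search(pattern, text) is not None for pattern, text, _ in checks]

for (pattern, text, expected), found in zip(checks, results):
    print(f"{pattern!r:12} on {text!r:14} -> {found} (expected {expected})")
```

Running it prints one line per check, so a pattern can be verified before it is entered in the tool.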

Input

  • Select lines from: select the input file.
  • that: select the desired option:
    • Matching: the lines matching the pattern will be selected and the lines not matching removed.
    • NOT Matching: the lines not matching the pattern will be selected and the lines matching will be removed.
  • the pattern: enter the pattern you want to use to select the lines.
  EXAMPLE
  ^chr([0-9A-Za-z])+ would match lines that begin with a chromosome name (e.g. chr1 or chrX), 
  such as lines in a BED format file.
  (ACGT){1,5} would match at least 1 "ACGT" and at most 5 "ACGT" consecutively.
  ([^,][0-9]{1,3})(,[0-9]{3})* would match a large integer that is properly separated with 
  commas such as 23,078,651.
  (abc)|(def) would match either "abc" or "def".
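  ^\W+# would match a line in which a "#" is preceded only by one or more non-alphanumeric 
  characters, such as an indented comment. Use ^\W*# to also match lines that start directly with "#".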
  Source : GALAXY
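
The Matching / NOT Matching choice can be pictured with the minimal sketch below. It is not the tool's actual implementation: select_lines is a hypothetical helper, the input lines are invented, and Python's re module is assumed to behave like the tool's pattern engine.

```python
import re

def select_lines(lines, pattern, matching=True):
    """Keep lines where the pattern is found (matching=True)
    or where it is not found (matching=False)."""
    regex = re.compile(pattern)
    return [line for line in lines if bool(regex.search(line)) == matching]

# Hypothetical input lines, for illustration only.
lines = [
    "chr1\t100\t200",
    "chrX\t5\t50",
    "scaffold_7\t1\t9",
    "ACGTACGT",
]

# "Matching": keeps the two lines that begin with a chromosome name.
chrom = select_lines(lines, r"^chr([0-9A-Za-z])+")

# "Matching": keeps only the line containing at least one "ACGT".
acgt = select_lines(lines, r"(ACGT){1,5}")

# "NOT Matching": keeps the lines that do not begin with "chr".
not_chrom = select_lines(lines, r"^chr", matching=False)

print(chrom, acgt, not_chrom, sep="\n")
```

Every input line ends up in exactly one of the two groups, so the Matching and NOT Matching results for the same pattern never overlap.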

Output

The tab-delimited output dataset contains the selected lines.

Example

Usage example: selecting all genes located on chromosome 2.

Input

Select lines from: file_select.txt

that: Matching

the pattern: chr2
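
Note that chr2 is not anchored, so it is found anywhere in a line. The sketch below uses hypothetical lines (not the contents of file_select.txt) and assumes Python-like regular expression behaviour; the stricter pattern ^chr2\t is only a suggestion and further assumes that the chromosome name is the first tab-delimited column.

```python
import re

# Hypothetical lines for illustration only; not the contents of file_select.txt.
lines = [
    "chr2\t1500\t2300\tgeneA",
    "chr20\t400\t900\tgeneB",
    "chr12\t100\t700\tgeneC",
]

# The unanchored pattern also selects chr20, because "chr20" contains "chr2".
loose = [line for line in lines if re.search(r"chr2", line)]

# Anchoring at the line start and requiring a tab after the name
# restricts the selection to chromosome 2 (assumption: name is in column 1).
strict = [line for line in lines if re.search(r"^chr2\t", line)]

print(len(loose), len(strict))
```
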

With file_select.txt containing :

Output

Edited on

July 18th, 2014

doc/select.txt · Last modified: 2014/11/28 17:03 by slegras