EDNA2: E-probe Diagnostic for Nucleic Acid Analysis


  • Installing EDNA and EDNA2 in your Linux system

    Download EDNA.tar.gz and EDNA2.tar.gz from our Downloads page. You can right-click the EDNA and EDNA2 links, copy each link to the clipboard, and then download the files as follows, using the copied links in place of the file names:

    wget EDNA.tar.gz
    wget EDNA2.tar.gz
    Once you have downloaded the files, you can extract the contents of each compressed file as follows:
    tar -xzvf EDNA.tar.gz
    tar -xzvf EDNA2.tar.gz
    Use the commands that correspond to the EDNA version you would like to use. EDNA refers to the very first version of our bioinformatic tool, a longer Perl pipeline in which multiple steps are followed in order to obtain a positive, negative or suspect result. EDNA2 refers to the second version of EDNA and incorporates various new options, the most important of which is the capability of analyzing multiple samples simultaneously.

  • E-probe generation

    E-probes are generated by comparing the target genome with the genomes of its nearest taxonomic relatives. Comparisons are done using MUMmer and nucmer, and Perl scripts parse the output data to create highly curated e-probe databases.

    Our current pipeline to generate e-probes consists of four steps.

    Raw e-probe design: During this step of the pipeline, raw e-probes are generated with our code by comparing a target genome with the genome of the taxonomically closest species. These are considered raw e-probes because they have only been curated by comparing two genomes.

    Assuming that you have EDNA installed in your Linux system, the e-probe design code has the following options:

    -p OUTPUT “output”: e-probe database output name
    -P PROBE LENGTH “80”: the e-probe length that you want to generate (default 80)
    -t TARGET FASTA FILE: the genome of your pathogen of interest
    -n NEAR NEIGHBOR FASTA FILE: the genome of the taxonomically closest species to your pathogen
    -g MAX GAPS “0”: maximum gap between two adjacent matches in a cluster (default 0)
    -m MIN MATCH LENGTH “15”: minimum length of a maximal exact match (default 20)
    -l MIN PROBE LENGTH “20”: keep unique sequences larger than this length (default 20)
    -L MAX PROBE LENGTH “4000”: keep unique sequences shorter than this length (default 4000)
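
    To make the roles of the match-length and probe-length options more concrete, here is a minimal Python sketch of the general idea behind raw e-probe design: keep the stretches of the target genome that share no sufficiently long exact match with the near neighbor. This is not the Pipeline.pl code and it does not call MUMmer or nucmer. The function names are hypothetical, exact matches are found with a simple k-mer lookup on the forward strand only, the length bounds are treated as inclusive, and the -g and -P options are not modeled.

```python
def shared_mask(target: str, neighbor: str, min_match: int) -> list[bool]:
    """Mark target positions covered by an exact match of >= min_match bases.

    Assumption: a k-mer lookup (k = min_match) stands in for a real
    maximal-exact-match finder such as MUMmer; forward strand only.
    """
    neighbor_kmers = {
        neighbor[i:i + min_match]
        for i in range(len(neighbor) - min_match + 1)
    }
    mask = [False] * len(target)
    for i in range(len(target) - min_match + 1):
        if target[i:i + min_match] in neighbor_kmers:
            for j in range(i, i + min_match):
                mask[j] = True
    return mask


def unique_segments(target: str, neighbor: str, min_match: int = 20,
                    min_len: int = 20, max_len: int = 4000) -> list[str]:
    """Return target segments not shared with the neighbor, filtered by length.

    Assumption: min_len and max_len are inclusive bounds here.
    """
    mask = shared_mask(target, neighbor, min_match)
    segments = []
    start = None
    # The trailing True is a sentinel that closes a run ending at the last base.
    for pos, shared in enumerate(mask + [True]):
        if not shared and start is None:
            start = pos
        elif shared and start is not None:
            if min_len <= pos - start <= max_len:
                segments.append(target[start:pos])
            start = None
    return segments


# Toy sequences, far shorter than real genomes.
target = "AAAACCCCGGGGTTTT"
neighbor = "AAAACCCCTTTT"
print(unique_segments(target, neighbor, min_match=4, min_len=4, max_len=10))
# ['GGGG']
```

    In this toy example the flanking AAAACCCC and TTTT regions also occur in the neighbor, so only the GGGG stretch survives as a candidate. Lowering the minimum match length makes more of the target count as shared, which tends to leave fewer and shorter unique segments.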

    Here we are going to generate raw e-probes using the examples folder of your downloaded EDNA file. If you are located in the uncompressed EDNA folder, change your directory to examples as follows:

    cd examples

    In the examples folder you will find various genomes. Let's focus on a eukaryotic genome from a plant pathogen that infects the roots of seedlings and causes severe losses: Pythium ultimum, which should be found in the examples folder as ultimum.fasta. The file ultimum.fasta is the complete genome of P. ultimum and needs to be compared with the genome of its taxonomically closest species. The closest species with an available genome is Pythium aphanidermatum, a soil-borne pathogen that is usually found co-infecting the roots of seedlings with P. ultimum. We will compare those two genomes using the following code:

    Pipeline.pl -p ultimum-120.fasta -t ultimum.fasta -n aphanidermatum.fasta -m 15 -P 120

    Once the computation is finished, you will find a FASTA file named ultimum-120.fasta in your examples folder. These are your raw e-probes. They are considered raw e-probes because they have only been curated by comparing two genomes.
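
    If you would like to inspect the output before moving on, a short Python sketch like the following can count the records in a FASTA file and tabulate their lengths. It assumes only that the output is a standard FASTA file; the function names are hypothetical and are not part of EDNA.

```python
from collections import Counter


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize_probes(path):
    """Return the number of records and a Counter mapping length -> count."""
    lengths = Counter(len(seq) for _, seq in read_fasta(path))
    return sum(lengths.values()), lengths
```

    For example, calling summarize_probes("ultimum-120.fasta") from the examples folder would return the number of raw e-probes together with a tally of their lengths, which you can compare against the value passed to -P.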

    Subsequently, it is necessary to

  • Rapid Pathogen Detection in Metagenomes

    E-probes are used in sequence alignments against raw reads from sequencing databases, without the need for metagenome assembly.
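
    The idea of querying raw reads directly, with no assembly step, can be illustrated with a toy Python sketch. It is not how EDNA2 performs its alignments: exact substring matching on either strand stands in for a real aligner, which would also tolerate mismatches and partial overlaps, and the function names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def count_probe_hits(probes: dict[str, str], reads: list[str]) -> dict[str, int]:
    """Count, for each e-probe, the reads that contain it on either strand.

    Assumption: exact containment replaces a real sequence alignment.
    """
    hits = {name: 0 for name in probes}
    for read in reads:
        read = read.upper()
        for name, probe in probes.items():
            probe = probe.upper()
            if probe in read or reverse_complement(probe) in read:
                hits[name] += 1
    return hits
```

    Each read is examined independently, which is why no assembled metagenome is required. How hit counts translate into a positive, negative or suspect call is not covered by this sketch.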

    EDNA2 analysis time: variable depending on host, usually less than 1 hour.

  • Rapid Pathogen semi-quantification in Metatranscriptomics

    RNA sequencing might be the best approach to assess the presence of actively infecting or growing microorganisms in a sample. EDNA2 can handle RNAseq data and presents you with a semi-quantitative output relative to a control or baseline.
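
    As an illustration of what an output relative to a control or baseline can look like, the following Python sketch scales e-probe hit counts by sequencing depth and reports the sample signal as a ratio over the control. The normalization, the pseudocount and the function names are all assumptions made for this example; they do not describe the calculation that EDNA2 performs.

```python
def hits_per_million(hit_count: int, total_reads: int) -> float:
    """Scale a hit count to hits per million sequenced reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return hit_count * 1_000_000 / total_reads


def relative_signal(sample_hits: int, sample_reads: int,
                    control_hits: int, control_reads: int,
                    pseudocount: float = 1.0) -> float:
    """Return the depth-normalized sample signal as a ratio over the control.

    Assumption: the pseudocount avoids division by zero when the control
    has no hits; its value is arbitrary.
    """
    sample = hits_per_million(sample_hits, sample_reads)
    control = hits_per_million(control_hits, control_reads)
    return (sample + pseudocount) / (control + pseudocount)
```

    A ratio near 1 would suggest that the sample is indistinguishable from the baseline, while larger values would suggest more pathogen transcripts in the sample than in the control.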

    RNA sequencing analysis with EDNA2 takes less time than a regular RNAseq analysis in which reads must be mapped to the genome.