1. Introduction

Phylogeny.fr has been designed to provide a high-performance platform that transparently chains programs relevant to phylogenetic analysis in a comprehensive and flexible pipeline. Although phylogenetic aficionados will be able to find most of their favorite tools and run sophisticated analyses, the primary philosophy of Phylogeny.fr is to assist biologists with no experience in phylogeny in analyzing their data in a robust way.

Note: The Phylogeny.fr platform offers a phylogeny pipeline which can be executed through three main modes, designed to fit your specific expertise level and analysis requirements.

"One Click" mode

Targets users who do not wish to deal with program and parameter selection. The pipeline is set up to run and connect programs recognized for their accuracy and speed.

Advanced mode

Runs the same succession of programs, but users can choose which steps to perform (multiple sequence alignment, phylogenetic reconstruction, tree drawing) and customize their options.

"A la carte" mode

Offers the possibility of running and testing more alignment and phylogeny programs, such as MUSCLE, ClustalO, T-Coffee, PhyML, IQ-TREE, and others, with complete control.

Alternatively, users can run the different programs separately in the Tools section.

2. Phylogeny Analysis

2.1 "One Click" mode

This is the default mode. It runs a preconfigured pipeline that connects programs recognized for their accuracy and speed to reconstruct a robust phylogenetic tree from a set of sequences. The workflow depends on the chosen execution mode (Fast or Accurate):

Mafft for rapid and robust multiple sequence alignment.
Optionally ClipKit for alignment curation (trimming unreliable regions).
FastTree (Fast mode) or IQ-TREE (Accurate mode) for phylogeny.

Users only need to upload a FASTA file or paste their sequences and pick an execution speed. The system handles everything else, automatically formatting data between steps and applying scientifically sound default parameters.
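
Before uploading, it can help to check locally that the input is well-formed FASTA. The following is a minimal sketch, not the platform's own validation: the function names, the uniqueness check on identifiers, and the minimum of three sequences are assumptions chosen for illustration.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_input(text, min_sequences=3):
    """Return the parsed records, or raise ValueError on obvious problems.

    min_sequences=3 is an assumed lower bound, not a documented platform limit.
    """
    records = parse_fasta(text)
    if len(records) < min_sequences:
        raise ValueError(
            f"need at least {min_sequences} sequences, got {len(records)}"
        )
    # Use the first word of each header as the sequence identifier.
    names = [header.split()[0] if header else "" for header, _ in records]
    if len(set(names)) != len(names):
        raise ValueError("sequence identifiers must be unique")
    if any(not seq for _, seq in records):
        raise ValueError("every record needs a non-empty sequence")
    return records
```

Calling check_input on the text you intend to paste raises a ValueError with a short explanation if something is obviously wrong, and returns the parsed records otherwise.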

You can toggle the curation step (ClipKit) to eliminate poorly aligned positions and highly divergent regions before the tree inference step. At the end of the analysis, an interactive rendering of the generated tree is available for you to explore and export.

Default Parameters

For transparency, the complete list of parameters used by the tools during a One Click run is given below:

Parameter	Default Value
Mafft (Alignment)
Alignment strategy	Default (FFT-NS-2)
ClipKit (Alignment Curation)
Mode	Gappy (removes columns with high gap frequency)
Gaps threshold	0.4 (40%)
FastTree (Fast Phylogeny)
Evolutionary Model	Default (JTT+CAT for proteins, JC+CAT for DNA)
Branch Support	SH-like local supports
IQ-TREE (Accurate Phylogeny)
Model Selection	Auto (ModelFinder)
Branch Support	Ultrafast Bootstrap (5000 replicates)
Burn-in iterations	200
PhyloTree Tool (Tree Utilities)
Rooting	Midpoint Rooting
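
To illustrate what the Gappy curation mode with a 0.4 threshold means in practice, here is a minimal sketch of gap-based column trimming. It is not ClipKit's implementation: the function name is hypothetical, only "-" is counted as a gap, and treating a column that sits exactly at the threshold as removed is an assumption.

```python
def trim_gappy_columns(alignment, gap_threshold=0.4, gap_chars="-"):
    """Drop alignment columns whose gap fraction reaches gap_threshold.

    alignment is a list of equal-length aligned sequences (strings).
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    n = len(alignment)
    # Keep a column only if its fraction of gaps is strictly below the threshold.
    keep = [
        col for col in range(length)
        if sum(seq[col] in gap_chars for seq in alignment) / n < gap_threshold
    ]
    return ["".join(seq[col] for col in keep) for seq in alignment]


if __name__ == "__main__":
    aligned = ["A-CT", "A-CT", "AG-T", "AGCT", "AGCT"]
    for row in trim_gappy_columns(aligned):
        print(row)
```

In this toy alignment the second column has gaps in 2 of 5 sequences (40%) and is removed, while the third column has a single gap (20%) and is kept.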

2.2 Advanced mode

The Advanced mode follows the same robust pipeline philosophy as "One Click" mode, using a predefined succession of programs (Mafft, ClipKit, and FastTree). Unlike One Click mode, however, it gives you complete control over the parameters for each step of the analysis.

This mode is designed for intermediate to advanced users who want to fine-tune their phylogenetic reconstruction without dealing with the complex selection of individual tools from scratch.

Tip: If you are unsure about a specific parameter, the Advanced mode forms provide descriptive tooltips for each option to help you make scientifically sound choices.

2.3 "A la carte" mode

The "A la carte" mode provides the ultimate flexibility for building your phylogenetic analysis pipeline. Unlike the predefined workflows of One Click and Advanced modes, this mode allows you to handpick each tool for every step of the analysis, giving you complete control over your methodology.

Ideal for: Researchers who want to compare different tools, test specific methodologies, or have particular requirements that the standard pipelines do not cover.

Two-Phase Workflow

The A la carte mode operates in two distinct phases:

Pipeline Design: Select which tools you want to include in your analysis. A visual pipeline at the top of the page shows your current selection in real-time.
Configuration & Launch: After confirming your tool selection, configure each tool's parameters individually, upload your FASTA input, and submit the analysis.

Available Tools by Step

Tool	Description
Step 1: Multiple Alignment
MUSCLE	Fast and accurate aligner, ideal for small to medium datasets
T-Coffee	Consistency-based aligner with high accuracy for divergent sequences
ClustalO	Scalable aligner optimized for large datasets using HMM profiles
Mafft	Versatile aligner with multiple strategies (FFT-NS, L-INS-i, etc.)
Step 2: Alignment Curation (Optional)
Gblocks	Removes poorly aligned regions using a block-selection approach
ClipKit	Modern trimming tool that preserves phylogenetically informative sites
BMGE	Removes highly variable or ambiguous regions based on entropy
Step 3: Phylogenetic Tree
PhyML	Maximum Likelihood inference with model selection support
BioNJ	Fast distance-based method, good for exploratory analyses
MrBayes	Bayesian inference using MCMC sampling for posterior probabilities
IQ-TREE	State-of-the-art ML with ultrafast bootstrap and ModelFinder
RAxML-NG	Optimized ML for large-scale phylogenomic datasets
FastTree	Approximate ML, extremely fast for large alignments

Flexible Selection: You are not required to select a tool for every step. For example, you can run only an alignment tool, or skip the curation step entirely. However, at least one tool must be selected to proceed.

How It Works

Navigate to the A la carte page from the main menu.
Click on the tool tiles to select your preferred tool for each step (Alignment, Curation, Phylogeny). Selected tools are highlighted with a checkmark.
Click "Continue to Configuration" to proceed.
Upload your FASTA file or paste your sequences directly.
Configure each selected tool's parameters using the provided forms.
Optionally enter your email to receive a notification when the analysis completes.
Click "Submit Analysis" to start the pipeline.

Note: Make sure your input data flows correctly through the pipeline. For instance, phylogeny tools expect an alignment as input. If you skip the alignment step, ensure your input file is already a valid multiple sequence alignment.
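
A quick sanity check for input that skips the alignment step is that every sequence has the same length, since aligned sequences are padded with gaps to a common width. The sketch below assumes records are already available as (name, sequence) pairs; the function name is hypothetical, and equal length is a necessary condition for a valid alignment, not a sufficient one.

```python
def alignment_problems(records):
    """Return a list of human-readable problems; an empty list means the
    records pass this basic check.

    records is a list of (name, sequence) tuples.
    """
    problems = []
    if len(records) < 2:
        problems.append("an alignment needs at least two sequences")
        return problems
    lengths = {len(seq) for _, seq in records}
    if len(lengths) > 1:
        problems.append(
            f"sequences have different lengths: {sorted(lengths)}"
        )
    return problems
```

If the returned list is not empty, the file is most likely a set of unaligned sequences and an alignment tool should be selected in Step 1.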

3. BLAST Explorer

The BLAST Explorer allows you to search for sequences similar to your query within major biological databases. Beyond standard BLAST searches, our platform automatically builds a phylogenetic tree from the top hits, enabling you to explore the evolutionary relationships among homologous sequences interactively.

Unique Feature: Unlike traditional BLAST services, Phylogeny.fr automatically aligns your query with the best hits using Mafft and reconstructs a phylogenetic tree using FastTree—giving you instant evolutionary context.

3.1 Submitting a query

To run a BLAST search, you need to provide a single sequence in FASTA format. You can either upload a file or paste your sequence directly into the text area.

BLAST Programs

Select the appropriate program based on your query and target database types:

Program	Query	Database	Use case
blastp	Protein	Protein	Find homologous proteins
blastn	Nucleotide	Nucleotide	Find similar DNA/RNA sequences
blastx	Nucleotide	Protein	Translate DNA and search proteins
tblastn	Protein	Nucleotide	Search protein against translated DNA
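
The table above amounts to a lookup keyed on the query type and the database type. The sketch below encodes it directly and adds a rough heuristic for guessing whether a sequence is nucleotide or protein. The heuristic, its 0.9 threshold, and all function names are assumptions for illustration and do not describe how the platform detects sequence types.

```python
# (query type, database type) -> BLAST program, as listed in the table.
BLAST_PROGRAMS = {
    ("protein", "protein"): "blastp",
    ("nucleotide", "nucleotide"): "blastn",
    ("nucleotide", "protein"): "blastx",
    ("protein", "nucleotide"): "tblastn",
}


def guess_sequence_type(sequence, threshold=0.9):
    """Guess 'nucleotide' or 'protein' from the letter composition.

    Heuristic only: a sequence counts as nucleotide when at least
    `threshold` of its letters are A, C, G, T, U or N.
    """
    letters = [c for c in sequence.upper() if c.isalpha()]
    if not letters:
        raise ValueError("empty sequence")
    nucleotide_like = sum(c in "ACGTUN" for c in letters)
    if nucleotide_like / len(letters) >= threshold:
        return "nucleotide"
    return "protein"


def pick_program(query_type, database_type):
    """Return the BLAST program for a query/database type pair."""
    return BLAST_PROGRAMS[(query_type, database_type)]
```

For example, a DNA query searched against SwissProt would map to pick_program("nucleotide", "protein"), which is blastx.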

Available Databases

Choose from a variety of regularly updated sequence databases:

Database	Type	Description
Protein Databases
nr_cluster	Protein	NCBI non-redundant protein sequences (clustered)
SwissProt	Protein	Curated, high-quality UniProt entries
UniRef90	Protein	UniProt clustered at 90% identity
PDB	Protein	Protein Data Bank sequences with known 3D structures
RefSeq Viral Protein	Protein	NCBI curated viral protein sequences
Nucleotide Databases
Core Nucleotide	Nucleotide	NCBI core nucleotide collection
PDB	Nucleotide	PDB nucleotide sequences
RefSeq Viral Genomic	Nucleotide	NCBI curated viral genomic sequences

E-value Threshold

The E-value (Expect value) is the number of hits with an equal or better score that you would expect to find by chance in a database of the same size. A lower E-value indicates a more significant match:

1e-5 (default): Good balance between sensitivity and specificity.
1e-10 to 1e-30: More stringent, returns only highly confident hits.
0.1 to 1: More permissive, may include distant homologs.

Tip: If your initial search returns no hits, try raising the E-value threshold (making the search more permissive) or switching to a broader database like nr_cluster.
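
The relationship between score, search space, and E-value can be sketched with the standard approximation E = m × n × 2^(-S'), where S' is the bit score, m the query length, and n the total database length. The code below is an illustration under that approximation only: BLAST itself works with effective lengths and may apply further corrections, so reported E-values will differ somewhat. The function names are hypothetical.

```python
def expected_hits(bit_score, query_length, database_length):
    """Approximate E-value for a bit score in a given search space."""
    return query_length * database_length * 2.0 ** (-bit_score)


def passes_threshold(evalue, threshold=1e-5):
    """A hit is kept when its E-value does not exceed the threshold."""
    return evalue <= threshold


if __name__ == "__main__":
    # A 250-residue query against a database of one billion residues.
    for score in (40, 60, 80):
        e = expected_hits(score, 250, 10**9)
        print(score, f"{e:.2e}", passes_threshold(e))
```

The same bit score yields a larger E-value in a larger database, which is one reason a hit can fall below the threshold in SwissProt but not in nr_cluster.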

3.2 Interactive results

Once your BLAST job completes, you are presented with an interactive visualization combining a phylogenetic tree and a powerful floating Toolbox panel for filtering and exporting your results.

Phylogenetic Tree Display

The tree displays your query sequence alongside the top BLAST hits, allowing you to visualize evolutionary relationships at a glance:

Query highlighting: Your input sequence is visually emphasized in the tree.
Taxonomy coloring: Sequences are colored by taxonomic group (adjustable depth).
Interactive selection: Click directly on tree leaves to select or deselect sequences.
External links: Click on sequence IDs to access their original database entry.

The Toolbox Panel

The floating Toolbox is your control center for exploring BLAST results. It can be dragged anywhere on screen and offers two distinct modes via tabs:

Tree View Tab

This tab focuses on the top hits displayed in the phylogenetic tree (typically the best-scoring sequences). Use it for precise, small-scale selection:

Selection counter: Shows how many sequences are currently selected out of total tree leaves.
Quick select buttons: Instantly select all or deselect all sequences in the tree.
Bitscore filter pills: Filter sequences by score ranges (<40, 40-50, 50-80, 80-200, >200) with color-coded pills.
Taxonomy depth slider: Adjust the taxonomic classification level displayed (1=Kingdom to 8=Species level).
Gap-free indicator: Shows the percentage of alignment positions without gaps for your current selection, which is useful for assessing alignment quality.
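
As an illustration of the gap-free indicator, the following sketch computes the percentage of alignment columns that contain no gap in any of the selected sequences. It assumes the selected rows are equal-length aligned strings with "-" as the gap character; the function name is hypothetical and the platform's exact computation may differ.

```python
def gap_free_percentage(selected_rows, gap_char="-"):
    """Percentage of columns with no gap in any selected aligned sequence."""
    if not selected_rows:
        return 0.0
    length = len(selected_rows[0])
    if any(len(row) != length for row in selected_rows):
        raise ValueError("selected rows must have the same aligned length")
    if length == 0:
        return 0.0
    # Count columns where every selected row has a residue, not a gap.
    gap_free = sum(
        all(row[col] != gap_char for row in selected_rows)
        for col in range(length)
    )
    return 100.0 * gap_free / length
```

Because the value depends on the current selection, deselecting a sequence with many gaps typically raises the percentage.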

All Hits Tab

This tab operates on all BLAST hits, not just those in the tree. Use it for large-scale filtering across the entire result set:

E-value threshold: Dynamically filter hits by E-value (from 1e-5 to 1e-100).
Histogram filters: Open interactive histograms to select sequences by:
- Score (bitscore): Filter by alignment score strength.
- Similarity (% identity): Filter by sequence similarity percentage.
- Coverage (% query coverage): Filter by how much of your query is covered.
Taxonomy filter: Open a hierarchical taxonomy tree to select sequences from specific taxonomic groups (e.g., only Bacteria, only Vertebrates).
Reset selection: Clear all filters and start fresh.
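
The filters in this tab combine as simple conditions on each hit. The sketch below shows one way to express them on exported results. The Hit fields, the function names, and the definition of query coverage as the aligned query span divided by the query length are assumptions for illustration, not the platform's data model.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    accession: str
    evalue: float
    bitscore: float
    identity: float   # percent identity, 0 to 100
    query_start: int  # 1-based, inclusive
    query_end: int    # 1-based, inclusive


def query_coverage(hit, query_length):
    """Percentage of the query covered by the aligned region of a hit."""
    return 100.0 * (hit.query_end - hit.query_start + 1) / query_length


def filter_hits(hits, query_length, max_evalue=1e-5, min_bitscore=0.0,
                min_identity=0.0, min_coverage=0.0):
    """Keep hits that satisfy every threshold at once."""
    return [
        h for h in hits
        if h.evalue <= max_evalue
        and h.bitscore >= min_bitscore
        and h.identity >= min_identity
        and query_coverage(h, query_length) >= min_coverage
    ]
```

For instance, filter_hits(hits, query_length=200, min_coverage=50.0) keeps only hits at the default E-value threshold whose alignment spans at least half of a 200-residue query.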

Tip: Use the All Hits tab when you need to work with sequences that didn't make it into the tree visualization. The histogram filters let you identify and select sequences based on precise score ranges.

Exporting & Pipeline Integration

Both tabs provide export options for your selected sequences:

Download FASTA: Export selected sequences as a FASTA file for external analysis.
Send to Pipeline: Directly transfer selected sequences to the One Click or Advanced phylogeny pipelines. The sequences are automatically loaded into the analysis form.

Seamless Workflow: A typical workflow might be: run BLAST → explore the tree → filter by taxonomy or score → send a curated subset directly to the phylogeny pipeline for detailed analysis.

Taxonomy Legend

Below the tree, a dynamic legend displays the taxonomic groups present in your results with their associated colors. The groups shown depend on the taxonomy depth setting in the Toolbox.

4. Tools & Versions

Below is the list of all bioinformatics tools integrated in the platform, along with their current versions.

Tool	Description	Version
Blast	Basic Local Alignment Search Tool	2.16.0
MMseqs2	Ultra-fast and sensitive sequence search and clustering	sse2
Mafft	Multiple sequence alignment program for large datasets	7.526
Muscle	Fast and accurate multiple sequence alignment	5.3
Clustal Omega	Fast multiple sequence alignment	1.2.4
TCoffee	Advanced multiple sequence alignment program	13.46.0.919e8c6b
GBlocks	Alignment curation — eliminates poorly aligned and divergent regions	0.91b
ClipKit	Alignment curation using smart-gap trimming	2.11.4
BMGE	Alignment curation using gap trimming	2.0
FastTree	Approximately-maximum-likelihood phylogenetic tree inference	2.2.0
IQ-TREE	Maximum-likelihood phylogenetic inference with automatic model selection	2.4.0
RAxML-NG	Randomized Axelerated Maximum Likelihood phylogenetic inference	2.0.0
MrBayes	Bayesian inference of phylogenetic trees	3.2.7a
PhyML	Maximum-likelihood phylogenetic inference	3.3.20241207
Fastphylo	Fast tools for phylogenetics — distance computation and neighbor-joining	1.0.1
BioNJ	Neighbor-joining phylogenetic inference	1.0
