Output Interpretation Guide

The outputs of this pipeline (.qza and .qzv files) can be opened with QIIME2 View. This page explains the purpose and interpretation of the major output files generated by the 16S microbiome workflow. Most of the files explained here are .qzv files, since they are visualizations of the data.

For more details, refer to QIIME2’s official tutorial.

🧾 Summary Table

| Step | File(s) | Purpose | Key Insights |
| --- | --- | --- | --- |
| Demultiplexing | demux.qzv | Input stats & quality | Check read balance & trimming point |
| Denoising | stats.qzv | Filter & denoise reads | Inspect % input passed filter |
| Features | rep-seqs.qzv, table.qzv | Feature identity & abundance | Explore sequence richness |
| Phylogenetic Tree | rooted-tree.qza, tree.nwk | Evolutionary relationships | Input for diversity analysis |
| Alpha/Beta Diversity | evenness.qzv, faith-pd.qzv, unweighted_unifrac.qzv | Diversity comparisons | Compare diversity between groups |
| Rarefaction | alpha-rarefaction.qzv | Sequencing depth evaluation | Check for plateau |
| Taxonomy | taxa-bar-plots.qzv | Taxonomic composition | Examine dominant taxa |
| ANCOM-BC | da-barplot.qzv, ancombc-level6.qzv | Differential abundance | Identify significant taxa |

Demultiplexing

Files: demux.qzv

Purpose: Provides basic statistics and quality information about your raw input reads.

  • Sequence count summary: Check the number of forward and reverse reads for each sample.
    • The counts should roughly match.
    • Large mismatches may indicate pairing or sequencing issues.
  • Interactive quality plot: Displays the average quality score per base position.
    • Look for the position where the quality drops sharply.
    • This helps you decide your trimming lengths for DADA2.
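
Reading a trimming point off the quality plot can be sketched in code. The example below is a minimal illustration, not part of the pipeline: the quality values are invented, and both the function name `suggest_truncation_length` and the threshold of 25 are assumptions chosen for demonstration. In practice you would read the per-position scores from the interactive plot and pick a threshold that suits your data.

```python
from typing import Sequence


def suggest_truncation_length(median_quality: Sequence[float],
                              min_quality: float = 25.0) -> int:
    """Return how many leading bases to keep before quality first drops below min_quality.

    The default threshold of 25 is an illustrative assumption, not a QIIME2 default.
    """
    for position, quality in enumerate(median_quality):
        if quality < min_quality:
            return position  # keep bases 0 .. position-1
    return len(median_quality)  # quality never dropped below the threshold


# Hypothetical median quality scores per base position (made-up values).
forward_quality = [36, 36, 35, 35, 34, 33, 30, 27, 22, 18]
reverse_quality = [35, 34, 33, 31, 28, 24, 20, 17, 15, 12]

forward_trunc = suggest_truncation_length(forward_quality)
reverse_trunc = suggest_truncation_length(reverse_quality)
print("forward:", forward_trunc, "reverse:", reverse_trunc)
```

Reverse reads usually degrade earlier than forward reads, so the two values often differ. When truncating paired reads, make sure the remaining lengths still overlap enough for merging.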

Denoising

Files: stats.qzv

Purpose: Summarizes how the raw reads were filtered and denoised, including the removal of low-quality and chimeric sequences.

Interpretation:

  • Filter summary: Shows how many reads passed each filtering step.
  • Sorting: The interactive table can be sorted by any column. For example, sorting by the percentage of input passed filter column highlights samples that lost many reads during quality filtering, which can point to lower sequencing quality.
  • This file shows how your data was “cleaned up” before downstream analysis.
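
The sorting described above can also be done outside QIIME2 View. The following is a minimal sketch using invented numbers. The keys `input`, `filtered`, and `non-chimeric` mirror column names shown in the denoising statistics table, while `pct_passed_filter` and `pct_non_chimeric` are hypothetical names introduced here for the computed percentages.

```python
# Hypothetical rows mirroring columns shown in stats.qzv (all values are made up).
stats = [
    {"sample-id": "S1", "input": 10000, "filtered": 9000, "non-chimeric": 8000},
    {"sample-id": "S2", "input": 12000, "filtered": 6000, "non-chimeric": 4800},
    {"sample-id": "S3", "input": 8000, "filtered": 7600, "non-chimeric": 7000},
]


def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100.0 * part / whole, 2) if whole else 0.0


for row in stats:
    row["pct_passed_filter"] = percent(row["filtered"], row["input"])
    row["pct_non_chimeric"] = percent(row["non-chimeric"], row["input"])

# Sort ascending so the samples that lost the most reads come first.
ranked = sorted(stats, key=lambda row: row["pct_passed_filter"])
for row in ranked:
    print(row["sample-id"], row["pct_passed_filter"], row["pct_non_chimeric"])
```

Samples at the top of this ranking are the ones worth inspecting first, for example by revisiting their quality profiles in demux.qzv.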

Feature Table and Representative Sequences

Files:

  • table.qzv
  • rep-seqs.qzv

Purpose: Summarizes the abundance and identity of amplicon sequence variants (ASVs) in your dataset.

Interpretation:

  • rep-seqs.qzv: Shows representative sequences for each feature (ASV).
    • Use this file if you want to inspect the actual FASTA sequences.
  • table.qzv: Displays the abundance of each feature across samples.
    • Overview: High-level statistics of frequency.
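    • Interactive Sample Detail: Shows the total feature frequency in each sample and how many samples would be retained at a chosen sampling depth. It’s useful for comparing sequencing depth across samples.
    • Feature Detail: Lists the frequency of each feature and the number of samples in which it was observed.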
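
To make the summaries in table.qzv concrete, the sketch below derives per-sample and per-feature totals from a small count table. The table contents and all variable names are hypothetical. A real feature table would be exported from the pipeline rather than typed in.

```python
# Hypothetical feature table: keys are features (ASVs), inner keys are samples.
table = {
    "ASV_1": {"S1": 120, "S2": 0, "S3": 40},
    "ASV_2": {"S1": 30, "S2": 75, "S3": 0},
    "ASV_3": {"S1": 5, "S2": 25, "S3": 10},
}

samples = sorted({sample for counts in table.values() for sample in counts})

# Total reads assigned to features in each sample.
sample_frequency = {
    s: sum(counts.get(s, 0) for counts in table.values()) for s in samples
}

# Total reads per feature, and the number of samples each feature appears in.
feature_frequency = {f: sum(counts.values()) for f, counts in table.items()}
samples_observed_in = {
    f: sum(1 for c in counts.values() if c > 0) for f, counts in table.items()
}

# Richness: the number of distinct features detected in each sample.
richness = {
    s: sum(1 for counts in table.values() if counts.get(s, 0) > 0) for s in samples
}

print(sample_frequency)
print(feature_frequency)
print(samples_observed_in)
print(richness)
```

The lowest value in `sample_frequency` is a natural upper bound when choosing a sampling depth that keeps every sample.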

Phylogenetic Tree

Files: rooted-tree.qza

Purpose: Represents the evolutionary relationships among features (ASVs) for use in diversity analyses that require phylogenetic distances.

Interpretation:

  • The rooted tree provides phylogenetic relationships among features.
  • You can download tree.nwk and visualize it in:
    • iTOL — upload tree.nwk.
    • R with ggtree package (library(ggtree)).

Alpha & Beta Diversity

Files:

  • evenness.qzv
  • faith-pd.qzv
  • unweighted_unifrac.qzv

Purpose: Quantifies and compares microbial diversity within and between samples.

Interpretation:

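  • All three files test for differences between metadata groups, but each is based on a different diversity metric.
  • evenness.qzv: Compares evenness (an alpha diversity metric), that is, how evenly reads are distributed among the features within each sample.
    • Look for p-values comparing different groups.
  • faith-pd.qzv: Compares Faith's phylogenetic diversity, an alpha diversity metric that sums the branch lengths of the phylogenetic tree spanned by the features in each sample.
  • unweighted_unifrac.qzv: Tests whether beta diversity (between-sample distance) differs between groups.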
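
As a worked illustration of what an evenness score measures, the sketch below computes Pielou's evenness, J' = H' / ln(S), where H' is the Shannon entropy and S is the number of observed features. This is a standalone example with invented counts, not the pipeline's implementation. Returning 0.0 for samples with fewer than two features is a convention chosen here, since the index is undefined in that case.

```python
import math
from typing import Sequence


def pielou_evenness(counts: Sequence[int]) -> float:
    """Pielou's evenness J' = H' / ln(S) for one sample's feature counts."""
    present = [c for c in counts if c > 0]
    richness = len(present)
    if richness < 2:
        # Undefined for a single feature; 0.0 is a convention chosen for this sketch.
        return 0.0
    total = sum(present)
    shannon = -sum((c / total) * math.log(c / total) for c in present)
    return shannon / math.log(richness)


# Hypothetical samples: one perfectly even, one dominated by a single feature.
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]

print("even:", round(pielou_evenness(even_sample), 3))
print("skewed:", round(pielou_evenness(skewed_sample), 3))
```

Both samples contain four features, so they have the same richness. Only the evenness score separates them, which is why evenness and richness-based metrics such as Faith's PD are reported separately.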

Rarefaction Curves

File: alpha-rarefaction.qzv

Purpose: Visualizes sequencing depth vs. observed diversity.

Interpretation:

  • Look for a plateau in the curve.
  • A plateau indicates sufficient sequencing depth: deeper sequencing is unlikely to reveal many additional features.
  • If the curve is still rising, some samples might need higher read depth.
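
The idea behind a rarefaction curve can be sketched as repeated random subsampling of one sample's reads at increasing depths, counting the features observed each time. The code below is a simplified illustration with invented counts. The function names, the depths, and the number of iterations are assumptions, and this is not how the pipeline itself generates the plot.

```python
import random
from typing import Dict, List, Sequence


def observed_features_at_depth(counts: Dict[str, int], depth: int,
                               rng: random.Random) -> int:
    """Subsample `depth` reads without replacement and count distinct features."""
    pool = [feature for feature, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    return len(set(rng.sample(pool, depth)))


def rarefaction_curve(counts: Dict[str, int], depths: Sequence[int],
                      iterations: int = 10, seed: int = 0) -> List[float]:
    """Mean number of observed features at each depth over repeated subsamples."""
    rng = random.Random(seed)
    return [
        sum(observed_features_at_depth(counts, d, rng) for _ in range(iterations))
        / iterations
        for d in depths
    ]


# Hypothetical sample with 1000 reads spread over five features.
sample_counts = {"ASV_1": 500, "ASV_2": 300, "ASV_3": 150, "ASV_4": 40, "ASV_5": 10}
depths = [10, 50, 100, 500, 1000]
curve = rarefaction_curve(sample_counts, depths)

for depth, value in zip(depths, curve):
    print(depth, value)
```

If the mean barely changes between the last few depths, the curve has flattened. If it is still climbing at the largest depth, rare features are probably still being missed.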

Taxonomic Analysis

File: taxa-bar-plots.qzv

Purpose: Displays the taxonomic composition of samples at multiple classification levels.

Interpretation:

  • Visualizes the relative abundance of taxa at different taxonomic levels (domain to species).
  • Use the level selector to change the taxonomic level (e.g., level 2 = phylum, level 3 = class).
  • The most informative levels are typically 2 and 3.
  • You can:
    • Sort samples by metadata category.
    • Export the underlying data as CSV for downstream analysis in R.
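
Once the data is exported as CSV, relative abundances can be recomputed outside the viewer. The sketch below uses an inline, invented CSV as a stand-in for a real export. The `index` column name for sample IDs and the taxon labels are assumptions, and the sketch assumes any metadata columns have been removed so that every remaining column holds counts.

```python
import csv
import io
from typing import Dict

# Hypothetical stand-in for a level 2 CSV export (column names and counts are invented).
exported_csv = """index,d__Bacteria;p__Firmicutes,d__Bacteria;p__Bacteroidota,d__Bacteria;p__Proteobacteria
S1,600,300,100
S2,200,700,100
"""


def relative_abundance(csv_text: str,
                       id_column: str = "index") -> Dict[str, Dict[str, float]]:
    """Convert per-sample taxon counts into fractions that sum to 1 per sample."""
    result: Dict[str, Dict[str, float]] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        sample_id = row.pop(id_column)
        # Assumes every remaining column is a numeric taxon count.
        counts = {taxon: float(value) for taxon, value in row.items()}
        total = sum(counts.values())
        result[sample_id] = {
            taxon: (count / total if total else 0.0) for taxon, count in counts.items()
        }
    return result


abundance = relative_abundance(exported_csv)
for sample_id, taxa in abundance.items():
    dominant = max(taxa, key=taxa.get)
    print(sample_id, dominant, round(taxa[dominant], 2))
```

The same table can be loaded into R for plotting. The point of the sketch is only to show how the bar heights relate to the underlying counts.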

Differential Abundance

Files:

  • da-barplot-vegetation.qzv
  • ancombc-level6.qzv

Purpose: Identifies taxa that are significantly different in abundance between experimental groups.

Interpretation:

  • da-barplot-vegetation.qzv: Shows which features are enriched or depleted between groups (features are labeled by feature ID only).
  • ancombc-level6.qzv: Displays full taxonomic names and significance results.
    • Shows which taxa are differentially abundant.
    • You can change the taxonomic level through a parameter when running the pipeline. See the customization page for detailed instructions.
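
Reading a differential abundance result usually comes down to two questions: which taxa pass a significance cutoff, and in which direction do they change. The sketch below illustrates that logic on invented values. The field names `taxon`, `lfc` (log fold change), and `q_val` (adjusted p-value), as well as the cutoff of 0.05, are assumptions for illustration and may not match the column names in your actual output.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, float]]

# Hypothetical differential abundance results (all values are made up).
results: List[Row] = [
    {"taxon": "g__Bacillus", "lfc": 2.1, "q_val": 0.001},
    {"taxon": "g__Pseudomonas", "lfc": -1.4, "q_val": 0.03},
    {"taxon": "g__Streptomyces", "lfc": 0.2, "q_val": 0.60},
]


def significant_taxa(rows: List[Row], alpha: float = 0.05) -> List[Row]:
    """Keep rows with q_val below alpha, ordered by effect size (largest first)."""
    hits = [row for row in rows if row["q_val"] < alpha]
    return sorted(hits, key=lambda row: abs(row["lfc"]), reverse=True)


for row in significant_taxa(results):
    direction = "enriched" if row["lfc"] > 0 else "depleted"
    print(row["taxon"], direction, row["lfc"], row["q_val"])
```

The sign of the log fold change is relative to the reference group used in the analysis, so check which group was set as the reference before interpreting the direction.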