SeqVis Documentation

A complete reference for every feature in SeqVis, from loading files to exporting publication-ready plots.

Live preview: drag to rotate, scroll to zoom, click any dot to highlight it
01

Overview

SeqVis is a browser-based tool for visualising compositional heterogeneity in nucleotide alignments. Each sequence is mapped to a point inside a regular tetrahedron whose four vertices represent the nucleotides A, T, G, and C. When all four nucleotide frequencies are equal (0.25 each) a sequence maps to the centroid; sequences biased toward one nucleotide cluster near the corresponding vertex.
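
The mapping described above can be sketched as a barycentric weighting of the four vertices. This is a minimal illustration, not SeqVis's own code: the vertex coordinates and the function name `to_point` are assumptions, since this documentation does not state the coordinates the tool uses.

```python
# Hypothetical vertex coordinates: a regular tetrahedron centred on the origin.
# SeqVis's actual coordinates are not given in this documentation.
VERTICES = {
    "A": (1.0, 1.0, 1.0),
    "T": (1.0, -1.0, -1.0),
    "G": (-1.0, 1.0, -1.0),
    "C": (-1.0, -1.0, 1.0),
}


def to_point(freq):
    """Map nucleotide frequencies to a 3D point by weighting each vertex."""
    total = sum(freq[n] for n in VERTICES)  # normalise in case of rounding
    return tuple(
        sum(freq[n] / total * VERTICES[n][axis] for n in VERTICES)
        for axis in range(3)
    )


# Equal frequencies land on the centroid (the origin in this layout).
centroid = to_point({"A": 0.25, "T": 0.25, "G": 0.25, "C": 0.25})
# An A-rich sequence is pulled toward the A vertex.
a_rich = to_point({"A": 0.70, "T": 0.10, "G": 0.10, "C": 0.10})
```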

02

Input: FASTA / PHYLIP / NEXUS

Upload Data page

Click Upload Data on the home page to reach /visjson. Use the Input FASTA file picker to load a .fasta, .phy, or .nex file. Files are read in 64 KB streaming chunks, so very large genomes (hundreds of MB) load without freezing the browser. A progress bar tracks completion.

  • Each >header line starts a new sequence entry.
  • Unknown bases (N) and gap characters (- . space) are excluded from frequency counts.
  • Frequencies are computed for all positions as well as 1st, 2nd, and 3rd codon positions independently.
  • Both UNIX and Windows line endings are normalised automatically.
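
The rules above can be sketched for the FASTA case as follows. This is an illustration under stated assumptions rather than the tool's implementation: the function names are hypothetical, the whole file is read at once instead of in 64 KB chunks, and codon position is taken from the character's index in the sequence as read (before N and gap characters are dropped), which the documentation does not confirm.

```python
from collections import Counter

BASES = "ATGC"


def parse_fasta(text):
    """Return {name: sequence}; splitlines() copes with UNIX and Windows endings."""
    parts, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):       # each >header starts a new entry
            name = line[1:].strip()
            parts[name] = []
        elif name is not None:
            parts[name].append(line.upper())
    return {n: "".join(chunks) for n, chunks in parts.items()}


def frequencies(seq):
    """Relative A/T/G/C frequencies; N, gaps and other symbols are not counted."""
    counts = Counter(b for b in seq if b in BASES)
    total = sum(counts.values())
    return {b: (counts[b] / total if total else 0.0) for b in BASES}


def position_frequencies(seq):
    """Frequencies for all positions and for each codon position (assumed by index)."""
    return {
        "allPositionFreq": frequencies(seq),
        "firstPositionFreq": frequencies(seq[0::3]),
        "secondPositionFreq": frequencies(seq[1::3]),
        "thirdPositionFreq": frequencies(seq[2::3]),
    }
```

The dictionary keys follow the JSON schema shown in the Input JSON section.
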
03

Input: JSON

Upload Data page

Use the Input JSON file picker to reload a previously exported SeqVis JSON. The expected schema is an object keyed by sequence name, each entry containing four frequency maps:

{
  "Homo sapiens": {
    "allPositionFreq":   { "A": 0.29, "T": 0.29, "G": 0.21, "C": 0.21 },
    "firstPositionFreq": { "A": 0.30, "T": 0.28, "G": 0.22, "C": 0.20 },
    "secondPositionFreq":{ "A": 0.27, "T": 0.31, "G": 0.20, "C": 0.22 },
    "thirdPositionFreq": { "A": 0.30, "T": 0.28, "G": 0.21, "C": 0.21 }
  },
  ...
}
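
Before uploading a hand-edited file, it can help to check it against the schema above. The sketch below is one way to do that outside the browser; `load_seqvis_json` is a hypothetical helper, and the strictness of SeqVis's own validation is not documented here.

```python
import json

FREQ_KEYS = (
    "allPositionFreq",
    "firstPositionFreq",
    "secondPositionFreq",
    "thirdPositionFreq",
)


def load_seqvis_json(text):
    """Parse SeqVis-style JSON and check that every entry has the four frequency maps."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("top level must be an object keyed by sequence name")
    for name, entry in data.items():
        if not isinstance(entry, dict):
            raise ValueError(f"{name}: entry must be an object")
        for key in FREQ_KEYS:
            freq = entry.get(key)
            if not isinstance(freq, dict) or set(freq) != set("ATGC"):
                raise ValueError(f"{name}: {key} must map A, T, G and C to frequencies")
    return data
```
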
04

Input: CSV / TSV

Visualize CSV/TSV page

Click Visualize CSV/TSV on the home page to reach /vistab. Upload any .csv, .tsv, or .tab file with the columns below. This lets you plot published nucleotide composition tables directly without running a sequence alignment.

label,A,T,G,C
Homo sapiens,0.29,0.29,0.21,0.21
Mus musculus,0.28,0.30,0.20,0.22
  • First column is the row label (any name).
  • Columns A T G C must be decimal proportions summing to ≈ 1.
  • Tab-separated files (.tsv / .tab) are auto-detected by extension.
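
A table can be checked against these rules before upload. The following sketch uses Python's standard csv module; `read_table` and its `tolerance` of 0.02 are assumptions, as the documentation does not say how far from 1 a row sum may drift.

```python
import csv
import io


def read_table(text, filename, tolerance=0.02):
    """Read a composition table; the delimiter is chosen from the file extension."""
    delimiter = "\t" if filename.lower().endswith((".tsv", ".tab")) else ","
    reader = csv.DictReader(io.StringIO(text, newline=""), delimiter=delimiter)
    label_column = reader.fieldnames[0]  # first column holds the row label
    rows = {}
    for row in reader:
        freq = {base: float(row[base]) for base in "ATGC"}
        if abs(sum(freq.values()) - 1.0) > tolerance:
            raise ValueError(f"{row[label_column]}: proportions do not sum to about 1")
        rows[row[label_column]] = freq
    return rows
```
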
05

Codon Position Frequency

The four Relative Frequency buttons switch the data being plotted without reloading the file. Each codon position can reveal different evolutionary pressures:

Mode            What is plotted
All Positions   Mean frequency across the whole sequence
1st Position    Frequencies at every 3rd base starting at offset 0
2nd Position    Frequencies at every 3rd base starting at offset 1
3rd Position    Frequencies at every 3rd base starting at offset 2; most variable, reflects synonymous substitutions
06

Vertex Assignment

Expand the Vertex assignment panel to drag nucleotides between the four tetrahedron corners. This lets you group purines (A+G) or pyrimidines (C+T) onto a single vertex to visualise strand-bias or other compositional patterns.

  • Each vertex can hold one or more nucleotide letters (e.g., AG vs CT).
  • When a nucleotide is moved, its frequency is added to the destination vertex weight.
  • The 3D plot and vertex labels update instantly; no re-upload is required.
  • Click Reset Tetrahedron to restore the default A/T/G/C layout.
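
The weight calculation described above amounts to summing the frequencies of the letters assigned to each vertex. A minimal sketch, with `vertex_weights` as a hypothetical name; how SeqVis treats a vertex left with no letters is not documented, so here it simply receives a weight of 0.

```python
def vertex_weights(freq, assignment):
    """Return one weight per vertex: the summed frequency of the letters on it."""
    return [sum(freq[letter] for letter in letters) for letters in assignment]


freq = {"A": 0.29, "T": 0.29, "G": 0.21, "C": 0.21}

# Default layout: one nucleotide per vertex.
default = vertex_weights(freq, ["A", "T", "G", "C"])

# Purines on one vertex, pyrimidines on another, two vertices left empty.
purine_pyrimidine = vertex_weights(freq, ["AG", "CT", "", ""])
```
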
07

Auto-scale Axes

When Auto-scale axes to data maximum is checked, the axis range is set to the highest individual nucleotide frequency present in the dataset (rounded up to the nearest 0.05). This expands the visible region of the tetrahedron so that points fill the space rather than clustering near the centroid.

axisScale = ⌈ max(A, T, G, C across all sequences) / 0.05 ⌉ × 0.05

The current axis max is shown as a badge next to the checkbox when the scale is less than 1. Disable auto-scale to compare multiple datasets on a fixed 0–1 axis.
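
As a worked example of the rounding rule, the sketch below computes the axis maximum for a small dataset. The function name `axis_scale` is hypothetical, and the inner rounding step is an added guard against floating-point noise (for example a quotient of 6.000000000000001 being rounded up to 7), not something the documentation specifies.

```python
import math


def axis_scale(freqs, step=0.05):
    """Round the highest single nucleotide frequency up to the next multiple of step."""
    highest = max(max(f.values()) for f in freqs)
    steps = math.ceil(round(highest / step, 9))  # round first to absorb float noise
    return round(steps * step, 2)


dataset = [
    {"A": 0.29, "T": 0.29, "G": 0.21, "C": 0.21},
    {"A": 0.29, "T": 0.31, "G": 0.20, "C": 0.20},
]
scale = axis_scale(dataset)  # highest value 0.31 rounds up to 0.35
```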

08

Spread (Visual Separation)

The Spread slider (0–100%) applies a visual-only radial expansion to separate overlapping dots. It does not alter the underlying frequency data or the exported JSON.

centroid = mean of all plotted points
dᵢ = Pᵢ − centroid
scale = (inradius × 0.90 × strength) / max(|dᵢ|)
P′ᵢ = centroid + dᵢ × scale (only if scale > 1)
  • The tetrahedron inradius bounds the spread so no point can exit the shape.
  • Relative distances between points are preserved, so nearby sequences stay nearby.
  • Spread re-applies automatically when vertex assignment changes.
  • A badge on the 3D view indicates when spread is active.
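
The four formulas above translate directly into code. This sketch assumes `strength` is the slider value expressed as a fraction between 0 and 1 and that points are (x, y, z) tuples; `apply_spread` is a hypothetical name, and the early return for coincident points is an added guard against division by zero.

```python
import math


def apply_spread(points, inradius, strength):
    """Expand points radially about their centroid, following the formulas above."""
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    offsets = [tuple(p[i] - centroid[i] for i in range(3)) for p in points]
    longest = max(math.sqrt(sum(c * c for c in d)) for d in offsets)
    if longest == 0:
        return list(points)  # all points coincide, nothing to separate
    scale = inradius * 0.90 * strength / longest
    if scale <= 1:
        return list(points)  # spread only ever expands, never contracts
    return [
        tuple(centroid[i] + d[i] * scale for i in range(3))
        for d in offsets
    ]
```

Because every offset is multiplied by the same factor, the ratios between pairwise distances are unchanged.
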
09

3D Interaction

The right panel is a live Three.js canvas. All standard OrbitControls gestures apply:

Gesture                        Action
Left-drag                      Rotate tetrahedron
Right-drag / two-finger drag   Pan the scene
Scroll / pinch                 Zoom in / out
Click a dot                    Toggle highlight (pink) and show species name overlay
Click a table row              Highlight the corresponding dot in the 3D view
Cycle Camera Angle             Snap to 6 preset viewpoints around the tetrahedron
Start / Stop Rotating          Toggle continuous auto-rotation
10

Export

Three export options are available in the Export section of the left panel:

Format   Contents                                   Notes
↓ JSON   Raw frequency data for all sequences       Same schema as Input JSON. Spread / scale not applied, so the values are scientifically accurate.
↓ PNG    Raster screenshot of the current 3D view   Captures the exact current rotation and camera angle. Requires preserveDrawingBuffer on the Canvas.
↓ SVG    SVG wrapper embedding the PNG data-URL     Suitable for vector workflows; the 3D content is a raster image inside the SVG envelope.
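
To illustrate what an SVG wrapper around a PNG data-URL looks like, the sketch below builds one from raw PNG bytes. It is not SeqVis's export code: `wrap_png_in_svg` is a hypothetical helper, and it uses the plain `href` attribute from SVG 2 (older viewers may expect `xlink:href` instead).

```python
import base64


def wrap_png_in_svg(png_bytes, width, height):
    """Return an SVG document whose only content is the PNG, embedded as a data URL."""
    data_url = "data:image/png;base64," + base64.b64encode(png_bytes).decode("ascii")
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}" '
        f'viewBox="0 0 {width} {height}">'
        f'<image href="{data_url}" width="{width}" height="{height}"/>'
        "</svg>"
    )
```

The result scales like any SVG, but the embedded image keeps the pixel resolution of the original screenshot.
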
11

Citation

If you use SeqVis in published work, please cite:

Jermiin LS, Ho JWK, Lichtenstein A (2004). SeqVis: A tool for detecting compositional heterogeneity among aligned nucleotide sequences. Bioinformatics 20(12): 1963–1964.