Written for v1.1.0 · Last updated: Mar 31, 2026

DEG Analysis

InSilicoLab Feature

This feature is exclusive to Portrai InSilicoLab.

Differentially Expressed Gene (DEG) analysis identifies genes whose expression differs with statistical significance between cell populations. It is essential for understanding biological differences between tissue regions, cell types, or experimental conditions.

[Figure: DEG Analysis overview]

Overview

What is DEG Analysis?

DEG analysis compares gene expression levels between two or more groups of cells to find:

  • Upregulated genes - Higher expression in the target group
  • Downregulated genes - Lower expression in the target group
  • Statistically significant differences - Accounting for biological and technical variation

When to Use DEG Analysis

  • Comparing tumor cells vs. normal cells
  • Identifying marker genes for cell types
  • Finding genes that differ between spatial regions
  • Comparing treatment vs. control conditions

Opening DEG Analysis

  1. Click the DEG icon (DNA symbol) in the Activity Bar
  2. The DEG extension opens with:
    • Sidebar - New Analysis button and list of previous results
    • Workspace - New analysis form or selected result viewer

[Figure: DEG extension initial view]

Setting Up Comparison Groups

DEG analysis requires defining at least two groups of cells to compare. Each group is defined using subset conditions.

Quick Setup

Quick Setup automatically creates groups based on a categorical feature.

[Figure: DEG Quick Setup]

  1. In the Quick Setup section, select a categorical feature from the Distribute by dropdown (e.g., cell_type)
  2. A preview panel appears showing all available labels with checkboxes
  3. Select which labels to include as groups (use Select All / Deselect All for convenience)
  4. Click Apply to create groups (minimum 2 labels required)
  5. If existing groups are present, a confirmation dialog asks whether to replace them
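
Conceptually, Quick Setup partitions cells by the chosen categorical feature. The sketch below illustrates that idea in Python; `quick_setup_groups` and the sample labels are hypothetical and are not part of InSilicoLab. It also shows how a selected label with no matching cells yields an empty group.

```python
def quick_setup_groups(labels_by_cell, selected_labels):
    """Build one group of cell IDs per selected label (hypothetical helper)."""
    if len(selected_labels) < 2:
        raise ValueError("Select at least 2 labels to create groups")
    groups = {label: [] for label in selected_labels}
    for cell_id, label in labels_by_cell.items():
        if label in groups:
            groups[label].append(cell_id)
    return groups

# Illustrative data only: cell ID -> value of the categorical feature
cell_type = {"c1": "T cell", "c2": "B cell", "c3": "T cell", "c4": "Tumor"}
groups = quick_setup_groups(cell_type, ["T cell", "B cell", "NK cell"])
print(groups)
```

Here "NK cell" was selected but matches no cells, so its group is empty.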

Manual Setup

For more control, manually configure each comparison group.

[Figure: DEG Comparison Groups]

Adding Groups

  1. Click Add Group to create a new comparison group
  2. Groups are automatically named (Group A, Group B, etc.)
  3. Add at least 2 groups for comparison

Configuring Group Conditions

Each group uses the same condition system as the Subset feature:

  1. Click on a group card to expand it
  2. Add subset conditions:
    • Column conditions - Select by feature values
    • Lasso conditions - Select by spatial/embedding selection
    • Signature conditions - Select by gene set scores
  3. Combine conditions with AND/OR operators
  4. View the item count to verify your selection

Renaming Groups

  1. Click on the group name
  2. Enter a descriptive name (e.g., "Tumor Core", "Stromal Border")
  3. Press Enter to save

Using Saved Subsets

Apply saved subsets to quickly configure groups:

  1. Open the group's condition editor
  2. Click Apply Saved Subset
  3. Each saved subset shows its name and a condition expression preview (hover for full expression)
  4. Select a saved subset
  5. The saved subset's conditions are applied to the group

Handling Overlapping Items

When cells match multiple group criteria, you need to decide how to handle overlaps.

Overlap Statistics

The overlap handler shows:

  • Number of overlapping items between each pair of groups
  • Percentage of total items affected
  • Adjusted item counts after applying the strategy

Overlap Strategies

| Strategy | Description | Best For |
| --- | --- | --- |
| Exclude from all | Remove overlapping items from all groups (default) | Strictest separation |
| Allow overlaps | Keep items in all matching groups (warning only) | When overlap is intentional |
| Keep in first group | Keep in the first matching group only | Priority-based assignment |
| Keep in largest group | Keep in the group with most items | Statistical power |

Choosing a Strategy

  • Allow overlaps - Use when groups intentionally share items (e.g., comparing overlapping phenotypes)
  • Exclude from all - Use when you need completely distinct populations
  • Keep in first group - Use when groups have a natural priority order
  • Keep in largest group - Use to maximize statistical power in each group
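
The four strategies can be illustrated with a small Python sketch. This is not the InSilicoLab implementation: the function name, the strategy identifiers, the tie-breaking rule, and the choice to measure "largest" by the original group size are all assumptions made for illustration.

```python
STRATEGIES = {"exclude_from_all", "allow_overlaps", "keep_in_first", "keep_in_largest"}

def resolve_overlaps(groups, strategy="exclude_from_all"):
    """Apply an overlap strategy to {group name: set of item IDs}.

    Dict order is treated as the group definition order.
    """
    if strategy not in STRATEGIES:
        raise ValueError(f"Unknown strategy: {strategy}")

    # Record which groups each item belongs to, in definition order.
    membership = {}
    for name, items in groups.items():
        for item in items:
            membership.setdefault(item, []).append(name)

    resolved = {name: set() for name in groups}
    for item, names in membership.items():
        if len(names) == 1 or strategy == "allow_overlaps":
            keep = names
        elif strategy == "exclude_from_all":
            keep = []
        elif strategy == "keep_in_first":
            keep = names[:1]
        else:  # keep_in_largest; ties go to the earlier group (assumption)
            keep = [max(names, key=lambda n: len(groups[n]))]
        for name in keep:
            resolved[name].add(item)
    return resolved

# Item 3 matches both groups.
groups = {"Tumor Core": {1, 2, 3}, "Stromal Border": {3, 4, 5, 6}}
excluded = resolve_overlaps(groups, "exclude_from_all")
first = resolve_overlaps(groups, "keep_in_first")
largest = resolve_overlaps(groups, "keep_in_largest")
```

With this input, "Exclude from all" drops item 3 from both groups, "Keep in first group" leaves it in Tumor Core, and "Keep in largest group" leaves it in Stromal Border.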

Validation

Before running analysis, the system validates your configuration.

[Figure: DEG Validation Summary]

Requirements

  • Minimum 2 groups - At least two groups are required for comparison
  • Sufficient items - Groups with very few items may produce unreliable results

Handling Empty Groups

When using Quick Setup, some categories may result in groups with no items (e.g., if certain cell types are absent from the current subset). Empty groups are shown as warnings rather than errors.

[Figure: DEG Empty Groups Confirmation]

When you attempt to run analysis with empty groups:

  1. A confirmation dialog appears listing the empty groups
  2. Choose one of the following:
    • Exclude empty groups and continue - The analysis proceeds without the empty groups
    • Cancel - Return to adjust your group configuration

TIP

If you choose to continue, empty groups are excluded from the analysis automatically; you don't need to remove them manually.

Validation Messages

| Type | Meaning |
| --- | --- |
| Error | Must be fixed before analysis can run |
| Warning | Analysis can run, but results may be affected (e.g., empty groups) |
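
The distinction between errors and warnings can be sketched as follows. This is a hypothetical illustration: the function name, the message wording, and the 50-item cutoff for a "small group" warning are assumptions, not documented behavior.

```python
def validate_groups(group_sizes, min_items_warning=50):
    """Return (errors, warnings) for a {group name: item count} mapping."""
    errors, warnings = [], []
    if len(group_sizes) < 2:
        errors.append("At least two groups are required for comparison")
    for name, size in group_sizes.items():
        if size == 0:
            warnings.append(f"{name} is empty and would be excluded")
        elif size < min_items_warning:  # cutoff is an assumed value
            warnings.append(f"{name} has only {size} items; results may be unreliable")
    return errors, warnings

errors, warnings = validate_groups({"Tumor": 120, "Stroma": 0, "Immune": 12})
print("Errors:", errors)
print("Warnings:", warnings)
```

An empty `errors` list means the analysis could run; any warnings are shown for review first.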

Running Analysis

  1. Verify all groups are configured correctly
  2. Check the validation summary for errors
  3. Click Run DEG Analysis
  4. The analysis is submitted to the server
  5. Progress is shown in the Sidebar

Analysis Duration

Analysis time depends on:

  • Number of cells in each group
  • Number of groups being compared
  • Server load

You can navigate away and return later - results are saved.

Understanding Results

Results List

[Figure: DEG Results List]

Previous analysis results appear in the Sidebar:

  • Click a result to view it
  • Results show a progress indicator while processing
  • Completed results can be reopened anytime

Result View Layout

When you open a completed result, the workspace displays a tabbed visualization pane (Heatmap or Volcano Plot) in the top half and the Data Table in the bottom half:

┌─────────────────────────────────────┐
│  [Heatmap] [Volcano Plot]   (tabs)  │
│       Active visualization          │
│   (Heatmap or Volcano Plot)         │
├══════════════ divider ══════════════┤
│       Data Table (bottom pane)      │
│   Detailed statistics per gene      │
└─────────────────────────────────────┘

  • Tabs - Switch between Heatmap and Volcano Plot using the tab bar at the top of the visualization pane
  • Resizable divider - Drag the horizontal divider between the two panes to adjust the height ratio (default is 50/50)
  • Shared thresholds - The Log2FC and FDR thresholds are shared between the Heatmap and Volcano Plot views; changing thresholds in one view applies to the other
  • Cross-component selection - Genes selected via lasso in the Volcano Plot or via checkboxes in the Data Table are synchronized bidirectionally
  • Both the active visualization and the Data Table are always visible simultaneously

Heatmap View

[Figure: DEG Heatmap]

The heatmap shows expression patterns across groups:

  • Columns - Genes (filtered by significance), in the default orientation
  • Rows - Comparison groups, in the default orientation
  • Colors - Value of the selected display metric (Log2 Fold Change by default)
  • Clustering - Similar genes are grouped together
  • Statistical Method - Displays the test used (e.g., "Wilcoxon rank-sum test")

Statistical Method

The heatmap configuration panel displays the statistical method used for differential expression testing:

| Method | Description |
| --- | --- |
| Wilcoxon rank-sum test | Non-parametric test comparing distributions between groups |

This information helps ensure reproducibility and proper interpretation of results.
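
To show what a Wilcoxon rank-sum test computes for a single gene, here is a minimal pure-Python sketch. It uses the normal approximation without a tie correction in the variance; the function names are illustrative, and the platform's actual implementation and options are not described here.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_test(target, reference):
    """Return (z score, two-sided p-value) comparing target vs. reference."""
    n1, n2 = len(target), len(reference)
    ranks = average_ranks(list(target) + list(reference))
    rank_sum = sum(ranks[:n1])            # rank sum of the target group
    expected = n1 * (n1 + n2 + 1) / 2     # mean under the null hypothesis
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (rank_sum - expected) / std
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p

# Illustrative expression values for one gene in two groups
z, p = rank_sum_test([5.1, 6.3, 7.2, 8.0], [1.0, 2.2, 3.1, 4.4])
print(f"z = {z:.3f}, p = {p:.4f}")
```

A positive z score means the gene tends to rank higher in the target group than in the reference group.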

Display Metric

Choose which statistical measure to visualize in the heatmap:

| Metric | Description | Color Scale | Use Case |
| --- | --- | --- | --- |
| Log2 Fold Change (default) | Expression ratio between groups | Red-Blue (diverging) | Shows expression direction |
| Score (Z-score) | Statistical test score | Red-Blue (diverging) | Compare relative significance |
| Adjusted P-value | FDR-corrected significance | Yellow-Red (sequential) | Focus on statistical significance |

Genes per Group

Control how many top genes are displayed per group in the heatmap:

| Option | Description |
| --- | --- |
| 5 | Show top 5 genes per group (default) |
| 10 | Show top 10 genes per group |
| 20 | Show top 20 genes per group |
| 50 | Show top 50 genes per group |

Genes are ranked by the selected display metric after applying direction and FDR filters, then the top N genes from each group are shown. Increasing this value shows more genes but may make the heatmap harder to read.
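
The selection logic can be sketched as filter, rank, then truncate per group. The row schema (`group`, `gene`, `logfc`, `padj`) and the function name are hypothetical, the ranking is assumed to be descending by log fold change, and the direction filter is omitted for brevity.

```python
def top_genes_per_group(results, n=5, fdr_threshold=0.05):
    """Return {group: top-n gene names} from a list of result rows."""
    by_group = {}
    for row in results:
        # Apply the FDR filter first; None means "No threshold".
        if fdr_threshold is not None and row["padj"] >= fdr_threshold:
            continue
        by_group.setdefault(row["group"], []).append(row)

    top = {}
    for group, rows in by_group.items():
        ranked = sorted(rows, key=lambda r: r["logfc"], reverse=True)
        top[group] = [r["gene"] for r in ranked[:n]]
    return top

# Illustrative rows only
results = [
    {"group": "Tumor", "gene": "G1", "logfc": 3.0, "padj": 0.001},
    {"group": "Tumor", "gene": "G2", "logfc": 2.0, "padj": 0.2},
    {"group": "Tumor", "gene": "G3", "logfc": 1.5, "padj": 0.01},
    {"group": "Tumor", "gene": "G4", "logfc": 2.5, "padj": 0.03},
    {"group": "Stroma", "gene": "G2", "logfc": 1.2, "padj": 0.004},
]
top = top_genes_per_group(results, n=2)
print(top)
```

G2 is dropped from the Tumor group because it fails the FDR filter, even though its fold change would otherwise rank it in the top 2.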

TIP

The selected gene count is reflected in the exported filename (e.g., Top10 in the filename indicates 10 genes per group were displayed).

Axis Swap

Toggle the Swap Axes switch at the bottom of the heatmap to transpose the orientation:

  • Default: Genes on X-axis, groups on Y-axis
  • Swapped: Groups on X-axis, genes on Y-axis

This is useful when you have many groups and want to see them spread horizontally, or when gene labels are easier to read on a specific axis.

Y-Axis Order

Groups are displayed on the Y-axis in the order they were defined, with the first selected group on top. This ensures the visual order matches your group definition order.

Reading the Heatmap

The color interpretation depends on the selected metric:

For Log2 Fold Change and Score:

  • Red/warm colors - Higher expression (upregulated) or positive score
  • Blue/cool colors - Lower expression (downregulated) or negative score
  • Intensity - Magnitude of difference

For Adjusted P-value:

  • Dark colors - More statistically significant (lower p-value)
  • Light colors - Less statistically significant (higher p-value)

Volcano Plot

The Volcano Plot visualizes the relationship between fold change magnitude (X-axis) and statistical significance (Y-axis) for each gene, making it easy to identify genes that are both statistically significant and biologically meaningful.

View Modes

The Volcano Plot offers two viewing modes, selectable via a toggle in the toolbar:

| Mode | Description | Best For |
| --- | --- | --- |
| Single | Displays one comparison group at a time with a group selector | Detailed exploration of a single group |
| Grid | Displays all groups simultaneously in a responsive grid | Quick comparison across all groups |

Single View:

  • Select a comparison group from the Group dropdown
  • Full interactive toolbox available (Pan, Lasso Selection, Reset View)

Grid View:

  • All comparison groups are shown simultaneously in a responsive grid (columns adjust to container width)
  • Each cell displays the group name and a subtitle showing "Up: N Down: N" gene counts
  • Click any cell (title or data points) to switch to Single View for that group
  • Mouse wheel zoom is available within each cell

Toolbar Controls

| Control | Location | Description |
| --- | --- | --- |
| View Mode toggle | Left | Switch between Single and Grid view |
| Group selector | Left | Select comparison group (Single View only) |
| Y-Axis Metric | Left | Choose the Y-axis statistical measure |
| Thresholds | Left | Configure Log2FC and FDR threshold values |
| Gene Labels | Left | Toggle gene name labels for significant genes |
| Export Image | Right | Export chart as PNG (Single View only) |

Y-Axis Metric

Choose which statistical measure to display on the Y-axis:

| Metric | Description |
| --- | --- |
| -log10(Adjusted P-value) (default) | FDR-corrected significance, more conservative |
| -log10(P-value) | Raw significance, less conservative |

Higher values on the Y-axis indicate greater statistical significance.

Thresholds

Click the Thresholds button to open a popover with two settings:

  • Log2FC Threshold - Minimum fold change cutoff (default: 1, equivalent to 2-fold change). Genes beyond ±threshold that pass the FDR cutoff are classified as significant.
  • FDR Threshold - Select from: No threshold, 0.1, 0.05, 0.01.

Threshold lines are displayed on the chart as dashed lines:

  • Vertical lines at ±Log2FC threshold
  • Horizontal line at the FDR threshold (in -log10 scale)
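
Because the Y-axis is on a -log10 scale, the horizontal line sits at -log10 of the FDR threshold. A short Python sketch of where the dashed lines fall (the function name is illustrative):

```python
import math

def threshold_lines(log2fc_threshold=1.0, fdr_threshold=0.05):
    """Return ((x_left, x_right), y) positions of the threshold lines.

    y is None when no FDR threshold is selected.
    """
    vertical = (-log2fc_threshold, log2fc_threshold)
    horizontal = None if fdr_threshold is None else -math.log10(fdr_threshold)
    return vertical, horizontal

for fdr in (0.1, 0.05, 0.01):
    (x_left, x_right), y = threshold_lines(1.0, fdr)
    print(f"FDR {fdr}: vertical lines at {x_left} and {x_right}, horizontal line at y = {y:.3f}")
```

A stricter FDR threshold moves the horizontal line higher, so fewer genes fall above it.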

TIP

Threshold values are shared with the Heatmap view. Changing the Log2FC or FDR threshold in the Volcano Plot also updates the Heatmap, and vice versa.

Gene Labels

Click the Gene Labels button to toggle gene name labels:

  • Labels are shown for all significant (upregulated and downregulated) genes
  • Labels are positioned to the right for upregulated genes (positive logFC) and to the left for downregulated genes (negative logFC)
  • Overlapping labels are automatically hidden to maintain readability

Color Coding

| Category | Color | Criteria |
| --- | --- | --- |
| Upregulated | Red | logFC > threshold AND adjusted p-value < FDR threshold |
| Downregulated | Blue | logFC < -threshold AND adjusted p-value < FDR threshold |
| Not Significant | Gray | Does not meet both fold change and significance cutoffs |
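
The criteria in this table translate directly into a small classification function. The sketch below is illustrative; in particular, treating "No threshold" (passed as `None`) as always passing the significance check is an assumption.

```python
def classify_gene(logfc, padj, logfc_threshold=1.0, fdr_threshold=0.05):
    """Return the volcano plot category for one gene."""
    # Assumption: with no FDR threshold, every gene passes the significance check.
    passes_fdr = fdr_threshold is None or padj < fdr_threshold
    if passes_fdr and logfc > logfc_threshold:
        return "Upregulated"
    if passes_fdr and logfc < -logfc_threshold:
        return "Downregulated"
    return "Not Significant"

print(classify_gene(2.3, 0.001))   # large positive change, significant
print(classify_gene(-1.8, 0.02))   # large negative change, significant
print(classify_gene(2.3, 0.2))     # large change, but fails the FDR cutoff
print(classify_gene(0.4, 0.001))   # significant, but change is too small
```

A gene must pass both cutoffs to be colored red or blue; failing either one leaves it gray.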

Chart Interactions (Single View)

The chart toolbox (top-right corner) provides:

| Tool | Icon | Description |
| --- | --- | --- |
| Pan | Hand | Drag to pan the chart view |
| Lasso Selection | Lasso | Draw a polygon to select genes within the area |
| Reset View | Crosshair | Reset zoom and pan to the initial state |

Additional interactions:

  • Mouse wheel zoom on both X and Y axes
  • Hover tooltip showing gene name, group, Log2 Fold Change, Adjusted P-value, and direction
  • Legend - Click to toggle category visibility; hover to highlight

Cross-component Selection

The Volcano Plot and Data Table share a bidirectional gene selection:

  1. Volcano to Table - Use Lasso Selection to draw a polygon around genes. Selected genes are automatically checked in the Data Table.
  2. Table to Volcano - Select genes via checkboxes in the Data Table. Corresponding points are highlighted in the Volcano Plot.

Selected genes from either source can be used to Create Signature via the Data Table toolbar.

TIP

Lasso selection is only available in Single View mode.

Data Table

[Figure: DEG Data Table]

The data table provides detailed statistics for each gene:

| Column | Description |
| --- | --- |
| Gene Name | Gene symbol or identifier |
| Score | Statistical test score |
| Log Fold Change | log2(Group B / Group A) expression ratio |
| P-value | Raw statistical significance |
| Adjusted P-value (FDR) | Multiple testing corrected p-value |

Column Groups

Columns for each comparison group can be expanded or collapsed:

  • Click the Expand Columns / Collapse Columns button in the table toolbar to toggle all groups
  • You can also manually expand or collapse individual column groups by clicking the group header
  • Default: All column groups are expanded

When expanded, each group shows: Log Fold Change, Adjusted P-value, and Score columns. When collapsed, only Log Fold Change is visible.

Sorting Results

  • Click column headers to sort
  • Click again to reverse sort order
  • Sort by adjusted p-value to find most significant genes

Selection and Cross-component Sync

Select genes using the row checkboxes. Selections are synchronized with the Volcano Plot:

  • Genes selected in the Data Table are highlighted in the Volcano Plot
  • Genes selected via Lasso in the Volcano Plot are automatically checked in the Data Table
  • The selection count is displayed next to the "Data Table" heading

Create Signature from Results

After reviewing DEG results, you can create a Signature directly from selected genes for downstream analysis.

  1. In the Data Table, select genes using the row checkboxes
  2. Click the Create Signature button in the table toolbar (enabled when at least one gene is selected)
  3. In the dialog, enter:
    • Name (required) - e.g., "DEG Upregulated Genes"
    • Description (optional) - Additional context
    • A preview of selected genes is shown (up to 50, with "+X more" for larger selections)
  4. Click Create Signature
  5. A success notification confirms creation

The new Signature appears immediately in the Explorer's Signatures tab and can be used for color mapping, subsetting, or scatter plot axes.

Filtering Results

FDR Threshold

Filter genes by statistical significance using False Discovery Rate (FDR):

| Level | Threshold | Description |
| --- | --- | --- |
| No threshold | All | Show all genes |
| Exploratory | FDR < 0.1 | Lenient, for initial exploration |
| Standard | FDR < 0.05 | Commonly used significance level |
| High Confidence | FDR < 0.01 | Strict, high confidence results |

Choosing a Threshold

  • Start with No threshold to see overall patterns
  • Use Standard (0.05) for publishable results
  • Use High Confidence (0.01) for validation candidates
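
For intuition about what an FDR-adjusted p-value is, the sketch below implements the Benjamini-Hochberg procedure, a widely used FDR correction. Which correction method InSilicoLab applies is not stated here, so treat the choice of Benjamini-Hochberg as an assumption.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
for p, q in zip(raw, adj):
    print(f"p = {p:.2f} -> adjusted = {q:.4f}, passes FDR < 0.05: {q < 0.05}")
```

Adjusted values are never smaller than the raw p-values, which is why fewer genes pass a given cutoff after correction.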

TIP

The FDR threshold is shared between the Heatmap and Volcano Plot views. Changing the threshold in one view also updates the other. The Data Table maintains its own independent filter.

Direction Filter

Filter genes by expression change direction with a configurable logFC threshold:

| Filter | Criteria | Shows |
| --- | --- | --- |
| All | - | All differentially expressed genes |
| Upregulated | logFC > threshold | Genes with higher expression in target |
| Downregulated | logFC < -threshold | Genes with lower expression in target |
| Bidirectional | abs(logFC) > threshold | Strongly changed in either direction |

LogFC Threshold

The logFC threshold determines the minimum fold change required for direction filtering. The default value is 1 (equivalent to a 2-fold change).

  • When selecting a direction filter (Upregulated, Downregulated, or Bidirectional), an inline input field appears within the selected option
  • Adjust the threshold value to make filtering more or less strict
  • Changes apply immediately in the heatmap view

TIP

A logFC threshold of 1 means 2-fold change, 2 means 4-fold change, and 0.5 means ~1.4-fold change. Adjust based on the magnitude of expression differences in your data.
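
The conversion behind these numbers is simply 2 raised to the threshold. A one-function Python sketch (the function name is illustrative):

```python
def fold_change_from_log2(log2fc_threshold):
    """Convert a log2 fold-change threshold to a linear fold change."""
    return 2 ** log2fc_threshold

for threshold in (0.5, 1, 2):
    print(f"logFC {threshold} -> {fold_change_from_log2(threshold):.2f}-fold")
```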

TIP

The logFC threshold is shared between the Heatmap and Volcano Plot views, where it controls the vertical threshold lines at ±logFC. The Data Table maintains its own independent threshold.

Interpreting Direction

  • Upregulated (logFC > threshold) - Higher expression in target group
  • Downregulated (logFC < -threshold) - Lower expression in target group
  • Bidirectional - Strongly changed regardless of direction

Exporting Results

Heatmap Image Export

Capture the heatmap as a PNG image with configurable dimensions:

  1. Open a completed DEG analysis result
  2. Click the Export Image button
  3. A popover appears with dimension settings:
    • Width - Export width in pixels (default: 1200, max: 10000)
    • Height - Export height in pixels (default: 800, max: 10000)
  4. Click Download to export

The exported image includes:

  • Heatmap with expression colors
  • Gene names and group names on axes
  • Color scale legend
  • Current filter settings applied

File naming format: DEG_Heatmap_{date}_{metric}_{direction}_FDR_{threshold}_Top{n}.png

TIP

When axis swap is enabled, the filename includes a "_Transposed" suffix.

See Image Export for more details.

Volcano Plot Image Export

Capture the Volcano Plot as a PNG image with configurable dimensions (Single View only):

  1. Switch to Single View and select the desired comparison group
  2. Click the Export Image button in the toolbar (right side)
  3. A popover appears with dimension settings:
    • Width - Export width in pixels (default: 1200, max: 10000)
    • Height - Export height in pixels (default: 800, max: 10000)
  4. Click Download to export

The exported image includes:

  • Scatter plot with color-coded gene points
  • Threshold lines (logFC and FDR)
  • Legend (Upregulated, Downregulated, Not Significant)
  • Gene labels (if enabled)
  • Current Y-axis metric and threshold settings

File naming format: DEG_Volcano_{date}_{group}_{yAxisMetric}.png

TIP

Image export is only available in Single View mode. Grid View does not support image export.

Data Table CSV Export

Export the full DEG results as a CSV file:

  1. Open a completed DEG analysis result
  2. In the Data Table section, click Export CSV
  3. The file downloads as deg_results.csv

The exported CSV includes for each gene:

  • Gene name
  • For each comparison group:
    • Log fold change (logFC)
    • Adjusted P-value
    • Score

Best Practices

Group Definition

  1. Use meaningful groups - Define groups based on biological questions
  2. Ensure sufficient size - Aim for at least 50-100 cells per group
  3. Check for overlaps - Review overlap statistics before analysis
  4. Name groups clearly - Use descriptive names for documentation

Result Interpretation

  1. Start broad, then filter - Begin with no thresholds, then apply filters
  2. Check multiple genes - Don't rely on single gene results
  3. Validate biologically - Confirm results make biological sense
  4. Document your analysis - Record group definitions and parameters

Common Pitfalls

| Issue | Solution |
| --- | --- |
| Too few significant genes | Relax FDR threshold or check group definitions |
| Too many significant genes | Use stricter FDR threshold or direction filter |
| Unexpected results | Verify group definitions match your hypothesis |
| Analysis fails | Check that groups have sufficient, non-overlapping items |

Troubleshooting

Analysis Won't Start

  • Check validation summary for errors
  • Ensure at least 2 groups are defined
  • Verify each group has items selected

No Results After Analysis

  • The analysis may still be processing (check progress)
  • Try refreshing the results list
  • Results expire after some time - rerun if needed

Unexpected Gene Rankings

  • Verify group definitions are correct
  • Check for overlapping items affecting statistics
  • Consider the overlap handling strategy used
