Written for v1.0.0 · Last updated: Feb 11, 2026

DEG Analysis

InSilicoLab Feature

This feature is exclusive to Portrai InSilicoLab.

Differentially Expressed Gene (DEG) analysis identifies genes whose expression differs with statistical significance between cell populations. It is essential for understanding biological differences between tissue regions, cell types, or experimental conditions.

DEG Analysis overview

Overview

What is DEG Analysis?

DEG analysis compares gene expression levels between two or more groups of cells to find:

  • Upregulated genes - Higher expression in the target group
  • Downregulated genes - Lower expression in the target group
  • Statistically significant differences - Accounting for biological and technical variation

When to Use DEG Analysis

  • Comparing tumor cells vs. normal cells
  • Identifying marker genes for cell types
  • Finding spatially variable genes
  • Comparing treatment vs. control conditions

Opening DEG Analysis

  1. Click the DEG icon (DNA symbol) in the Activity Bar
  2. The DEG extension opens with:
    • Sidebar - New Analysis button and list of previous results
    • Workspace - New analysis form or selected result viewer

DEG extension initial view

Setting Up Comparison Groups

DEG analysis requires defining at least two groups of cells to compare. Each group is defined using subset conditions.

Quick Setup

Quick Setup automatically creates groups based on a categorical feature.

DEG Quick Setup

  1. In the Quick Setup section, select a categorical feature from the Distribute by dropdown (e.g., cell_type)
  2. A preview panel appears showing all available labels with checkboxes
  3. Select which labels to include as groups (use Select All / Deselect All for convenience)
  4. Click Apply to create groups (minimum 2 labels required)
  5. If existing groups are present, a confirmation dialog asks whether to replace them
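
Conceptually, Quick Setup turns each selected label of a categorical feature into one group. The sketch below illustrates that idea only; the function name `groups_from_feature` and the dict-based input are assumptions made for illustration and are not part of InSilicoLab.

```python
def groups_from_feature(labels_by_cell, selected_labels):
    """Build one group per selected label (hypothetical helper, not a product API)."""
    if len(selected_labels) < 2:
        raise ValueError("Select at least 2 labels to create comparison groups")
    groups = {label: set() for label in selected_labels}
    for cell_id, label in labels_by_cell.items():
        if label in groups:
            groups[label].add(cell_id)
    return groups

# Toy input: cell ID -> cell_type label (made-up values)
cell_types = {"c1": "T cell", "c2": "B cell", "c3": "T cell", "c4": "Macrophage"}
groups = groups_from_feature(cell_types, ["T cell", "B cell", "NK cell"])
```

In this toy input no cell is labeled "NK cell", so that group ends up empty, which is the situation reported as a warning during validation.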

Manual Setup

For more control, manually configure each comparison group.

DEG Comparison Groups

Adding Groups

  1. Click Add Group to create a new comparison group
  2. Groups are automatically named (Group A, Group B, etc.)
  3. Add at least 2 groups for comparison

Configuring Group Conditions

Each group uses the same condition system as the Subset feature:

  1. Click on a group card to expand it
  2. Add subset conditions:
    • Column conditions - Select by feature values
    • Lasso conditions - Select by spatial/embedding selection
    • Signature conditions - Select by gene set scores
  3. Combine conditions with AND/OR operators
  4. View the item count to verify your selection
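
One way to think about combining conditions is as predicates joined with AND/OR. The following is a minimal sketch under that assumption; the helper names (`column_equals`, `score_at_least`, `all_of`, `any_of`) and the toy cell records are hypothetical and do not reflect how InSilicoLab stores conditions internally.

```python
def column_equals(column, value):
    """Column condition: keep cells whose feature equals a value."""
    return lambda cell: cell.get(column) == value

def score_at_least(signature, minimum):
    """Signature condition: keep cells whose gene set score meets a minimum."""
    return lambda cell: cell.get(signature, float("-inf")) >= minimum

def all_of(*conditions):
    """AND: every condition must match."""
    return lambda cell: all(cond(cell) for cond in conditions)

def any_of(*conditions):
    """OR: at least one condition must match."""
    return lambda cell: any(cond(cell) for cond in conditions)

# Toy cells with made-up feature values
cells = [
    {"id": "c1", "cell_type": "Tumor", "hypoxia_score": 0.9},
    {"id": "c2", "cell_type": "Tumor", "hypoxia_score": 0.2},
    {"id": "c3", "cell_type": "Stroma", "hypoxia_score": 0.8},
]
condition = all_of(column_equals("cell_type", "Tumor"), score_at_least("hypoxia_score", 0.5))
selected = [cell["id"] for cell in cells if condition(cell)]
item_count = len(selected)  # the number to check against the displayed item count
```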

Renaming Groups

  1. Click on the group name
  2. Enter a descriptive name (e.g., "Tumor Core", "Stromal Border")
  3. Press Enter to save

Using Saved Subsets

Apply saved subsets to quickly configure groups:

  1. Open the group's condition editor
  2. Click Apply Saved Subset
  3. Each saved subset shows its name and a condition expression preview (hover for full expression)
  4. Select a saved subset
  5. The saved subset's conditions are applied to the group

Handling Overlapping Items

When cells match multiple group criteria, you need to decide how to handle overlaps.

Overlap Statistics

The overlap handler shows:

  • Number of overlapping items between each pair of groups
  • Percentage of total items affected
  • Adjusted item counts after applying the strategy
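
The pairwise counts and the affected percentage can be reproduced with plain set operations. This sketch assumes that "total items" means the union of all groups; the exact denominator used by the overlap handler is not documented here, and `overlap_stats` is a hypothetical name.

```python
from itertools import combinations

def overlap_stats(groups):
    """Count items shared by each pair of groups and the share of all items affected."""
    pair_counts = {
        (a, b): len(groups[a] & groups[b])
        for a, b in combinations(groups, 2)
    }
    all_items = set().union(*groups.values())
    overlapping = {
        item for a, b in combinations(groups, 2) for item in groups[a] & groups[b]
    }
    # Assumption: the percentage is taken over the union of all groups.
    percent_affected = 100.0 * len(overlapping) / len(all_items) if all_items else 0.0
    return pair_counts, percent_affected

groups = {
    "Group A": {"c1", "c2", "c3"},
    "Group B": {"c3", "c4"},
    "Group C": {"c5"},
}
pair_counts, percent_affected = overlap_stats(groups)
```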

Overlap Strategies

Strategy | Description | Best For
Exclude from all | Remove overlapping items from all groups (default) | Strictest separation
Allow overlaps | Keep items in all matching groups (warning only) | When overlap is intentional
Keep in first group | Keep in the first matching group only | Priority-based assignment
Keep in largest group | Keep in the group with most items | Statistical power

Choosing a Strategy

  • Allow overlaps - Use when groups intentionally share items (e.g., comparing overlapping phenotypes)
  • Exclude from all - Use when you need completely distinct populations
  • Keep in first group - Use when groups have a natural priority order
  • Keep in largest group - Use to maximize statistical power in each group
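
The four strategies can be illustrated with a short sketch. It assumes groups are kept in definition order, that "largest" refers to group size before overlaps are resolved, and that ties go to the earlier group; these details, the strategy keys, and the function name `resolve_overlaps` are assumptions rather than documented behavior.

```python
STRATEGIES = {"exclude", "allow", "first", "largest"}

def resolve_overlaps(groups, strategy="exclude"):
    """Apply one overlap strategy to ordered groups (dict of name -> set of item IDs)."""
    if strategy not in STRATEGIES:
        raise ValueError(f"Unknown strategy: {strategy}")
    if strategy == "allow":
        return {name: set(items) for name, items in groups.items()}

    # Map each item to the groups that contain it, in definition order.
    membership = {}
    for name, items in groups.items():
        for item in items:
            membership.setdefault(item, []).append(name)

    resolved = {name: set() for name in groups}
    for item, names in membership.items():
        if len(names) == 1:
            resolved[names[0]].add(item)
        elif strategy == "first":
            resolved[names[0]].add(item)
        elif strategy == "largest":
            # max() returns the first maximum, so ties go to the earlier group.
            resolved[max(names, key=lambda n: len(groups[n]))].add(item)
        # strategy == "exclude": the item is dropped from every group
    return resolved

groups = {"Tumor Core": {"c1", "c2", "c3"}, "Stromal Border": {"c3", "c4"}}
excluded = resolve_overlaps(groups, "exclude")
first = resolve_overlaps(groups, "first")
```

Here the shared item `c3` disappears from both groups under "exclude" and stays only in "Tumor Core" under "first".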

Validation

Before running analysis, the system validates your configuration.

DEG Validation Summary

Requirements

  • Minimum 2 groups - At least two groups are required for comparison
  • Sufficient items - Groups with very few items may produce unreliable results

Handling Empty Groups

When using Quick Setup, some categories may result in groups with no items (e.g., if certain cell types are absent from the current subset). Empty groups are shown as warnings rather than errors.

DEG Empty Groups Confirmation

When you attempt to run analysis with empty groups:

  1. A confirmation dialog appears listing the empty groups
  2. Choose one of the following:
    • Exclude empty groups and continue - The analysis proceeds without the empty groups
    • Cancel - Return to adjust your group configuration

TIP

Empty groups are automatically excluded from the analysis. You don't need to manually remove them if you choose to continue.

Validation Messages

Type | Meaning
Error | Must be fixed before analysis can run
Warning | Analysis can run, but results may be affected (e.g., empty groups)
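
The split between errors and warnings can be sketched as follows. The cutoff of 50 items for a "small group" warning and the rule that only non-empty groups count toward the minimum are assumptions for illustration; `validate_groups` is a hypothetical helper, not the validator used by the application.

```python
MIN_GROUPS = 2
SMALL_GROUP_SIZE = 50  # assumed cutoff for the "very few items" warning

def validate_groups(groups):
    """Return (errors, warnings) for a dict of group name -> set of item IDs."""
    errors, warnings = [], []
    non_empty = [name for name, items in groups.items() if items]
    if len(non_empty) < MIN_GROUPS:
        errors.append(f"At least {MIN_GROUPS} non-empty groups are required")
    for name, items in groups.items():
        if not items:
            warnings.append(f"{name} is empty and will be excluded")
        elif len(items) < SMALL_GROUP_SIZE:
            warnings.append(f"{name} has only {len(items)} items")
    return errors, warnings

# Toy groups: one large, one small, one empty
groups = {"Tumor": set(range(120)), "Stroma": set(range(200, 212)), "NK cell": set()}
errors, warnings = validate_groups(groups)
can_run = not errors  # warnings alone do not block the analysis
```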

Running Analysis

  1. Verify all groups are configured correctly
  2. Check the validation summary for errors
  3. Click Run DEG Analysis
  4. The analysis is submitted to the server
  5. Progress is shown in the Sidebar

Analysis Duration

Analysis time depends on:

  • Number of cells in each group
  • Number of groups being compared
  • Server load

You can navigate away and return later - results are saved.

Understanding Results

Results List

DEG Results List

Previous analysis results appear in the Sidebar:

  • Click a result to view it
  • Results show a progress indicator while processing
  • Completed results can be reopened anytime

Result View Layout

When you open a completed result, the workspace displays the Heatmap and Data Table simultaneously in a vertically split layout:

┌─────────────────────────────────────┐
│         Heatmap (top pane)          │
│  Filters, color-coded expression    │
├────────────── divider ──────────────┤
│       Data Table (bottom pane)      │
│  Detailed statistics per gene       │
└─────────────────────────────────────┘
  • Resizable divider - Drag the horizontal divider between the two panes to adjust the height ratio (default is 50/50)
  • Independent filters - Each pane has its own direction and FDR filter controls, allowing you to apply different filter criteria to the heatmap and data table simultaneously
  • Both views are always visible, so you can read expression patterns and detailed statistics together

Heatmap View

DEG Heatmap

The heatmap shows expression patterns across groups:

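  • Genes - Filtered by significance; shown on the X-axis (columns) by default
  • Groups - Comparison groups; shown on the Y-axis (rows) by default
  • Colors - Value of the selected display metric (log2 fold change by default)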
  • Clustering - Similar genes are grouped together
  • Statistical Method - Displays the test used (e.g., "Wilcoxon rank-sum test")

Statistical Method

The heatmap configuration panel displays the statistical method used for differential expression testing:

Method | Description
Wilcoxon rank-sum test | Non-parametric test comparing distributions between groups

This information helps ensure reproducibility and proper interpretation of results.
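
For readers who want to see what a rank-sum test computes for a single gene, here is a minimal standard-library sketch. It uses the normal approximation with tie-averaged ranks and no continuity correction; whether InSilicoLab uses the same variant, and whether its Score column equals this z value, is not stated in this guide, so treat both as assumptions.

```python
import math

def rank_sum_test(group_a, group_b):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p_value). A positive z means values in group_b tend to be higher.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("Both groups need at least one value")
    n = n_a + n_b
    pooled = sorted((value, idx) for idx, value in enumerate(list(group_a) + list(group_b)))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based; ties share the mean rank
        for k in range(i, j + 1):
            ranks[pooled[k][1]] = average_rank
        tie_size = j - i + 1
        tie_term += tie_size ** 3 - tie_size
        i = j + 1
    rank_sum_b = sum(ranks[n_a:])
    expected = n_b * (n + 1) / 2
    variance = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return 0.0, 1.0  # all values identical: no evidence of a difference
    z = (rank_sum_b - expected) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Toy expression values for one gene in two groups (made-up numbers)
group_a = [0.1, 0.2, 0.3, 0.4, 0.5]
group_b = [1.1, 1.2, 1.3, 1.4, 1.5]
z, p_value = rank_sum_test(group_a, group_b)
```

Because the test works on ranks rather than raw values, it does not assume that expression is normally distributed.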

Display Metric

Choose which statistical measure to visualize in the heatmap:

Metric | Description | Color Scale | Use Case
Log2 Fold Change (default) | Expression ratio between groups | Red-Blue (diverging) | Shows expression direction
Score (Z-score) | Statistical test score | Red-Blue (diverging) | Compare relative significance
Adjusted P-value | FDR-corrected significance | Yellow-Red (sequential) | Focus on statistical significance

Genes per Group

Control how many top genes are displayed per group in the heatmap:

Option | Description
5 | Show top 5 genes per group (default)
10 | Show top 10 genes per group
20 | Show top 20 genes per group
50 | Show top 50 genes per group

Genes are ranked by the selected display metric after applying direction and FDR filters, then the top N genes from each group are shown. Increasing this value shows more genes but may make the heatmap harder to read.
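
The per-group selection can be pictured as a sort followed by a cut. In this sketch the row layout, the key names (`logfc`, `padj`), and the rule that adjusted p-values rank in ascending order while other metrics rank in descending order are all assumptions for illustration.

```python
def top_genes_per_group(rows_by_group, n=5, metric="logfc"):
    """Pick the top n genes per group; rows are dicts with 'gene' plus metric keys."""
    # Assumption: smaller is better for adjusted p-values, larger for other metrics.
    descending = metric != "padj"
    return {
        group: [
            row["gene"]
            for row in sorted(rows, key=lambda r: r[metric], reverse=descending)[:n]
        ]
        for group, rows in rows_by_group.items()
    }

# Toy rows that are assumed to have already passed the direction and FDR filters
rows_by_group = {
    "Tumor": [
        {"gene": "GENE_A", "logfc": 2.4, "padj": 0.001},
        {"gene": "GENE_B", "logfc": 1.1, "padj": 0.0001},
        {"gene": "GENE_C", "logfc": 3.0, "padj": 0.02},
    ],
}
top_by_logfc = top_genes_per_group(rows_by_group, n=2)
top_by_padj = top_genes_per_group(rows_by_group, n=2, metric="padj")
```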

TIP

The selected gene count is reflected in the exported filename (e.g., Top10 in the filename indicates 10 genes per group were displayed).

Axis Swap

Toggle the Swap Axes switch at the bottom of the heatmap to transpose the orientation:

  • Default: Genes on X-axis, groups on Y-axis
  • Swapped: Groups on X-axis, genes on Y-axis

This is useful when you have many groups and want to see them spread horizontally, or when gene labels are easier to read on a specific axis.

Y-Axis Order

In the default orientation, groups are displayed on the Y-axis in the order they were defined, with the first selected group on top, so the visual order matches your group definition order.

Reading the Heatmap

The color interpretation depends on the selected metric:

For Log2 Fold Change and Score:

  • Red/warm colors - Higher expression (upregulated) or positive score
  • Blue/cool colors - Lower expression (downregulated) or negative score
  • Intensity - Magnitude of difference

For Adjusted P-value:

  • Dark colors - More statistically significant (lower p-value)
  • Light colors - Less statistically significant (higher p-value)

Data Table

DEG Data Table

The data table provides detailed statistics for each gene:

Column | Description
Gene Name | Gene symbol or identifier
Score | Statistical test score
Log Fold Change | log2(Group B / Group A) expression ratio
P-value | Raw statistical significance
Adjusted P-value (FDR) | Multiple testing corrected p-value

Column Groups

Columns for each comparison group can be expanded or collapsed:

  • Click the Expand Columns / Collapse Columns button in the table toolbar to toggle all groups
  • You can also manually expand or collapse individual column groups by clicking the group header
  • Default: All column groups are expanded

When expanded, each group shows: Log Fold Change, Adjusted P-value, and Score columns. When collapsed, only Log Fold Change is visible.

Sorting Results

  • Click column headers to sort
  • Click again to reverse sort order
  • Sort by adjusted p-value to find the most significant genes

Create Signature from Results

After reviewing DEG results, you can create a Signature directly from selected genes for downstream analysis.

  1. In the Data Table, select genes using the row checkboxes
  2. Click the Create Signature button in the table toolbar (enabled when 1+ genes are selected)
  3. In the dialog, enter:
    • Name (required) - e.g., "DEG Upregulated Genes"
    • Description (optional) - Additional context
    • A preview of selected genes is shown (up to 50, with "+X more" for larger selections)
  4. Click Create Signature
  5. A success notification confirms creation

The new Signature appears immediately in the Explorer's Signatures tab and can be used for color mapping, subsetting, or scatter plot axes.

Filtering Results

FDR Threshold

Filter genes by statistical significance using False Discovery Rate (FDR):

Level | Threshold | Description
No threshold | All | Show all genes
Exploratory | FDR < 0.1 | Lenient, for initial exploration
Standard | FDR < 0.05 | Commonly used significance level
High Confidence | FDR < 0.01 | Strict, high confidence results

Choosing a Threshold

  • Start with No threshold to see overall patterns
  • Use Standard (0.05) for publishable results
  • Use High Confidence (0.01) for validation candidates
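
To make the FDR levels concrete, the sketch below adjusts a list of raw p-values and applies the Standard threshold. It assumes the Benjamini-Hochberg procedure, a common choice for FDR correction; this guide does not state which correction InSilicoLab applies, and the p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value to the smallest
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Thresholds taken from the levels listed in this section
FDR_LEVELS = {"Exploratory": 0.1, "Standard": 0.05, "High Confidence": 0.01}

p_values = [0.001, 0.008, 0.039, 0.041, 0.6]  # toy raw p-values
adjusted = benjamini_hochberg(p_values)
significant = [i for i, q in enumerate(adjusted) if q < FDR_LEVELS["Standard"]]
```

Note that the third and fourth genes have raw p-values below 0.05 but do not pass the Standard level after adjustment, which is why filtering on the adjusted value is stricter.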

Direction Filter

Filter genes by expression change direction with a configurable logFC threshold:

Filter | Criteria | Shows
All | - | All differentially expressed genes
Upregulated | logFC > threshold | Genes with higher expression in target
Downregulated | logFC < -threshold | Genes with lower expression in target
Bidirectional | |logFC| > threshold | Strongly changed in either direction

LogFC Threshold

The logFC threshold determines the minimum fold change required for direction filtering. The default value is 1 (equivalent to a 2-fold change).

  • When selecting a direction filter (Upregulated, Downregulated, or Bidirectional), an inline input field appears within the selected option
  • Adjust the threshold value to make filtering more or less strict
  • Changes apply immediately in the heatmap view

TIP

A logFC threshold of 1 means 2-fold change, 2 means 4-fold change, and 0.5 means ~1.4-fold change. Adjust based on the magnitude of expression differences in your data.
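
The conversion between a log2 fold change and a plain fold change is a power of two. A small sketch of the arithmetic (the function names are illustrative only):

```python
import math

def logfc_to_fold_change(logfc):
    """Convert a log2 fold change to a fold change: 2 ** logFC."""
    return 2.0 ** logfc

def fold_change_to_logfc(fold_change):
    """Convert a fold change back to log2 scale."""
    return math.log2(fold_change)

two_fold = logfc_to_fold_change(1)               # 2.0
four_fold = logfc_to_fold_change(2)              # 4.0
approx_1_4 = round(logfc_to_fold_change(0.5), 2)  # 1.41
```

Going the other way, requiring at least an 8-fold change corresponds to a threshold of `fold_change_to_logfc(8)`, which is 3.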

Interpreting Direction

  • Upregulated (logFC > threshold) - Higher expression in target group
  • Downregulated (logFC < -threshold) - Lower expression in target group
  • Bidirectional - Strongly changed regardless of direction
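
The direction criteria, optionally combined with an FDR cutoff, can be sketched as a single filter. The direction keys (`all`, `up`, `down`, `both`), the row layout, and the function name `filter_genes` are hypothetical; the comparisons follow the criteria listed in this section.

```python
def filter_genes(rows, direction="all", logfc_threshold=1.0, fdr=None):
    """Return gene names passing the direction filter and optional FDR cutoff."""
    checks = {
        "all": lambda lfc: True,
        "up": lambda lfc: lfc > logfc_threshold,
        "down": lambda lfc: lfc < -logfc_threshold,
        "both": lambda lfc: abs(lfc) > logfc_threshold,
    }
    if direction not in checks:
        raise ValueError(f"Unknown direction: {direction}")
    keep = checks[direction]
    return [
        row["gene"]
        for row in rows
        if keep(row["logfc"]) and (fdr is None or row["padj"] < fdr)
    ]

# Toy rows with made-up statistics
rows = [
    {"gene": "GENE_A", "logfc": 2.4, "padj": 0.001},
    {"gene": "GENE_B", "logfc": -1.8, "padj": 0.03},
    {"gene": "GENE_C", "logfc": 0.4, "padj": 0.2},
    {"gene": "GENE_D", "logfc": 1.0, "padj": 0.01},
]
upregulated = filter_genes(rows, "up")
bidirectional_strict = filter_genes(rows, "both", fdr=0.01)
```

`GENE_D` sits exactly at the threshold and is not returned as upregulated, because the criterion is a strict inequality.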

Exporting Results

Heatmap Image Export

Capture the heatmap as a PNG image with configurable dimensions:

  1. Open a completed DEG analysis result
  2. Click the Export Image button
  3. A popover appears with dimension settings:
    • Width - Export width in pixels (default: 1200, max: 10000)
    • Height - Export height in pixels (default: 800, max: 10000)
  4. Click Download to export

The exported image includes:

  • Heatmap with expression colors
  • Gene names and group names on axes
  • Color scale legend
  • Current filter settings applied

File naming format: DEG_Heatmap_{date}_{metric}_{direction}_FDR_{threshold}_Top{n}.png

TIP

When axis swap is enabled, the filename includes a "_Transposed" suffix.
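
If you post-process exported images in scripts, it can help to reconstruct the expected filename. The sketch below follows the naming format given above; the date format (`YYYYMMDD`), the exact spelling of the metric, direction, and threshold tokens, and the position of the "_Transposed" suffix before the extension are assumptions that should be checked against a real export.

```python
from datetime import date

def heatmap_filename(export_date, metric, direction, fdr_label, top_n, transposed=False):
    """Build a filename following DEG_Heatmap_{date}_{metric}_{direction}_FDR_{threshold}_Top{n}.png."""
    # Assumption: the date token is formatted as YYYYMMDD.
    name = f"DEG_Heatmap_{export_date:%Y%m%d}_{metric}_{direction}_FDR_{fdr_label}_Top{top_n}"
    if transposed:
        name += "_Transposed"  # assumed to come right before the extension
    return name + ".png"

# Token values below are placeholders, not confirmed product output
default_name = heatmap_filename(date(2026, 2, 11), "Log2FC", "Upregulated", "0.05", 10)
swapped_name = heatmap_filename(date(2026, 2, 11), "Log2FC", "Upregulated", "0.05", 10, transposed=True)
```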

See Image Export for more details.

Data Table CSV Export

Export the full DEG results as a CSV file:

  1. Open a completed DEG analysis result
  2. In the Data Table section, click Export CSV
  3. The file downloads as deg_results.csv

The exported CSV includes for each gene:

  • Gene name
  • For each comparison group:
    • Log fold change (logFC)
    • Adjusted P-value
    • Score
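
The exported file can be processed with any CSV reader. The sketch below uses Python's `csv` module on an in-memory sample; the column headers (`gene`, `Tumor_logfc`, `Tumor_padj`, `Tumor_score`) are hypothetical, so inspect the header row of your own `deg_results.csv` and adjust the names accordingly.

```python
import csv
import io

# Hypothetical CSV content; real column headers in deg_results.csv may differ.
sample = """gene,Tumor_logfc,Tumor_padj,Tumor_score
GENE_A,2.1,0.001,8.4
GENE_B,-1.5,0.03,-4.2
GENE_C,0.2,0.7,0.5
"""

def load_significant(handle, group, fdr=0.05):
    """Return gene names whose adjusted p-value for one group is below the cutoff."""
    reader = csv.DictReader(handle)
    return [row["gene"] for row in reader if float(row[f"{group}_padj"]) < fdr]

significant = load_significant(io.StringIO(sample), "Tumor")
```

For a real export, pass an opened file instead of the in-memory sample, for example `open("deg_results.csv", newline="")`.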

Best Practices

Group Definition

  1. Use meaningful groups - Define groups based on biological questions
  2. Ensure sufficient size - Aim for at least 50-100 cells per group
  3. Check for overlaps - Review overlap statistics before analysis
  4. Name groups clearly - Use descriptive names for documentation

Result Interpretation

  1. Start broad, then filter - Begin with no thresholds, then apply filters
  2. Check multiple genes - Don't rely on single gene results
  3. Validate biologically - Confirm results make biological sense
  4. Document your analysis - Record group definitions and parameters

Common Pitfalls

Issue | Solution
Too few significant genes | Relax FDR threshold or check group definitions
Too many significant genes | Use stricter FDR threshold or direction filter
Unexpected results | Verify group definitions match your hypothesis
Analysis fails | Check that groups have sufficient, non-overlapping items

Troubleshooting

Analysis Won't Start

  • Check validation summary for errors
  • Ensure at least 2 groups are defined
  • Verify each group has items selected

No Results After Analysis

  • The analysis may still be processing (check progress)
  • Try refreshing the results list
  • Results may expire after some time - rerun the analysis if a result is no longer available

Unexpected Gene Rankings

  • Verify group definitions are correct
  • Check for overlapping items affecting statistics
  • Consider the overlap handling strategy used

MATISSE Explorer Documentation