Written for v1.0.0 · Last updated: Jan 8, 2026

DEG Analysis

InSilicoLab Feature

This feature is exclusive to Portrai InSilicoLab.

Differentially expressed gene (DEG) analysis identifies genes whose expression differs with statistical significance between cell populations. It is essential for understanding biological differences between tissue regions, cell types, or experimental conditions.

[Figure: DEG Analysis overview]

Overview

What is DEG Analysis?

DEG analysis compares gene expression levels between two or more groups of cells to find:

  • Upregulated genes - Higher expression in the target group
  • Downregulated genes - Lower expression in the target group
  • Statistically significant differences - Accounting for biological and technical variation

When to Use DEG Analysis

  • Comparing tumor cells vs. normal cells
  • Identifying marker genes for cell types
  • Finding spatially variable genes
  • Comparing treatment vs. control conditions

Opening DEG Analysis

  1. Click the DEG icon (DNA symbol) in the Activity Bar
  2. The DEG extension opens with:
    • Sidebar - List of previous analysis results
    • Workspace - New analysis form or result viewer

[Figure: DEG extension initial view]

Setting Up Comparison Groups

DEG analysis requires defining at least two groups of cells to compare. Each group is defined using filter conditions.

Quick Setup

Quick Setup automatically creates groups based on a categorical annotation.

[Figure: DEG Quick Setup]

  1. In the Quick Setup section, select a categorical annotation from the Distribute by dropdown (e.g., cell_type)
  2. A preview panel appears showing all available labels with checkboxes
  3. Select which labels to include as groups (use Select All / Deselect All for convenience)
  4. Click Apply to create groups (minimum 2 labels required)
  5. If existing groups are present, a confirmation dialog asks whether to replace them
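
Conceptually, Quick Setup turns one categorical annotation into one group per selected label. The following is a minimal Python sketch of that idea, not the product's implementation; the function name `groups_from_annotation` and the sample data are hypothetical.

```python
def groups_from_annotation(labels, selected):
    """Build one group of cell indices per selected label.

    `labels` holds one categorical value per cell (e.g. a cell_type column).
    `selected` lists the labels ticked in the preview panel.
    """
    if len(selected) < 2:
        raise ValueError("select at least 2 labels")
    groups = {label: [] for label in selected}
    for index, label in enumerate(labels):
        if label in groups:
            groups[label].append(index)
    return groups


# Hypothetical annotation values for five cells.
cell_type = ["T cell", "B cell", "T cell", "Tumor", "B cell"]
groups = groups_from_annotation(cell_type, ["T cell", "Tumor"])
print(groups)
```

Labels that are not selected (here "B cell") simply do not produce a group.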

Manual Setup

For more control, manually configure each comparison group.

[Figure: DEG Comparison Groups]

Adding Groups

  1. Click Add Group to create a new comparison group
  2. Groups are automatically named (Group A, Group B, etc.)
  3. Add at least 2 groups for comparison

Configuring Group Filters

Each group is configured with the same filter system as the Filter feature:

  1. Click on a group card to expand it
  2. Add filter conditions:
    • Column conditions - Filter by annotation values
    • Lasso conditions - Filter by spatial/embedding selection
    • Variable Set conditions - Filter by gene set scores
  3. Combine conditions with AND/OR operators
  4. View the item count to verify your selection
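
One way to picture combined conditions is as predicates joined by AND/OR. The sketch below is illustrative only; the helper names (`column_equals`, `score_above`, `all_of`, `any_of`) and the cell fields are assumptions, not part of the application.

```python
def column_equals(column, value):
    """Hypothetical column condition: the annotation must equal `value`."""
    return lambda cell: cell.get(column) == value


def score_above(column, threshold):
    """Hypothetical variable-set condition: the score must exceed `threshold`."""
    return lambda cell: cell.get(column, float("-inf")) > threshold


def all_of(*conditions):
    """AND: every condition must hold."""
    return lambda cell: all(cond(cell) for cond in conditions)


def any_of(*conditions):
    """OR: at least one condition must hold."""
    return lambda cell: any(cond(cell) for cond in conditions)


cells = [
    {"cell_type": "Tumor", "region": "core", "hypoxia_score": 0.8},
    {"cell_type": "Tumor", "region": "border", "hypoxia_score": 0.2},
    {"cell_type": "Stroma", "region": "border", "hypoxia_score": 0.9},
]

# Tumor AND (region is core OR hypoxia score above 0.5)
group_filter = all_of(
    column_equals("cell_type", "Tumor"),
    any_of(column_equals("region", "core"), score_above("hypoxia_score", 0.5)),
)
selected = [index for index, cell in enumerate(cells) if group_filter(cell)]
print(f"{len(selected)} item(s) selected: {selected}")
```

The printed count plays the role of the item count shown on the group card.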

Renaming Groups

  1. Click on the group name
  2. Enter a descriptive name (e.g., "Tumor Core", "Stromal Border")
  3. Press Enter to save

Using Filter Presets

Apply saved filter presets to quickly configure groups:

  1. Open the group's filter editor
  2. Click Load Preset
  3. Select a saved preset
  4. The preset's conditions are applied to the group

Handling Overlapping Items

When cells match multiple group criteria, you need to decide how to handle overlaps.

Overlap Statistics

The overlap handler shows:

  • Number of overlapping items between each pair of groups
  • Percentage of total items affected
  • Adjusted item counts after applying the strategy
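
To make the first two statistics concrete, here is a small Python sketch that counts pairwise overlaps and the share of affected items. It assumes "total items" means the unique items across all groups, which may differ from how the application defines it; `overlap_stats` is a hypothetical name.

```python
from itertools import combinations


def overlap_stats(groups):
    """Return pairwise overlap counts and the percentage of items in 2+ groups.

    `groups` maps a group name to a set of item ids.
    """
    total = len(set().union(*groups.values()))
    pair_counts = {
        (a, b): len(groups[a] & groups[b]) for a, b in combinations(groups, 2)
    }
    seen, overlapping = set(), set()
    for members in groups.values():
        overlapping |= seen & members  # items already seen in an earlier group
        seen |= members
    percent = 100.0 * len(overlapping) / total if total else 0.0
    return pair_counts, percent


groups = {"Group A": {1, 2, 3, 4}, "Group B": {3, 4, 5}, "Group C": {6}}
pair_counts, percent = overlap_stats(groups)
print(pair_counts)
print(f"{percent:.1f}% of items are in more than one group")
```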

Overlap Strategies

Strategy | Description | Best For
Allow overlaps | Keep items in all matching groups (warning only) | When overlap is intentional
Exclude from all | Remove overlapping items from all groups | Strictest separation
Keep in first group | Keep in the first matching group only | Priority-based assignment
Keep in largest group | Keep in the group with most items | Statistical power

Choosing a Strategy

  • Allow overlaps - Use when groups intentionally share items (e.g., comparing overlapping phenotypes)
  • Exclude from all - Use when you need completely distinct populations
  • Keep in first group - Use when groups have a natural priority order
  • Keep in largest group - Use to maximize statistical power in each group
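
The four strategies can be sketched on plain sets as follows. This is an illustration under stated assumptions, not the application's code: the strategy keys, the function name `resolve_overlaps`, and the tie-breaking rule (equal-sized groups fall back to group order) are all hypothetical.

```python
def resolve_overlaps(groups, strategy):
    """Apply an overlap strategy to `groups` (name -> set of item ids).

    Dictionary order is treated as the group priority order.
    """
    if strategy == "allow":
        return {name: set(items) for name, items in groups.items()}

    # Count how many groups each item belongs to.
    counts = {}
    for items in groups.values():
        for item in items:
            counts[item] = counts.get(item, 0) + 1
    shared = {item for item, n in counts.items() if n > 1}

    if strategy == "exclude_all":
        return {name: items - shared for name, items in groups.items()}
    if strategy == "keep_first":
        order = list(groups)
    elif strategy == "keep_largest":
        # sorted() is stable, so ties keep the original group order.
        order = sorted(groups, key=lambda name: len(groups[name]), reverse=True)
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    result = {name: set() for name in groups}
    claimed = set()
    for name in order:
        result[name] = groups[name] - claimed  # drop items an earlier group took
        claimed |= groups[name]
    return result


groups = {"Group A": {1, 2, 3}, "Group B": {3, 4, 5, 6}}
for strategy in ("allow", "exclude_all", "keep_first", "keep_largest"):
    resolved = resolve_overlaps(groups, strategy)
    print(strategy, {name: sorted(items) for name, items in resolved.items()})
```

In this example item 3 is shared: "keep_first" leaves it in Group A, while "keep_largest" leaves it in Group B.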

Validation

Before running analysis, the system validates your configuration.

[Figure: DEG Validation Summary]

Requirements

  • Minimum 2 groups - At least two groups are required for comparison
  • Sufficient items - Groups with very few items may produce unreliable results

Handling Empty Groups

When using Quick Setup, some categories may result in groups with no items (e.g., if certain cell types are absent from the current filter). Empty groups are shown as warnings rather than errors.

[Figure: DEG Empty Groups Confirmation]

When you attempt to run analysis with empty groups:

  1. A confirmation dialog appears listing the empty groups
  2. Choose Exclude empty groups and continue to run the analysis without the empty groups
  3. Or choose Cancel to return and adjust your group configuration

TIP

Empty groups are automatically excluded from the analysis. You don't need to manually remove them if you choose to continue.

Validation Messages

Type | Meaning
Error | Must be fixed before analysis can run
Warning | Analysis can run, but results may be affected (e.g., empty groups)
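
The distinction between errors and warnings can be sketched as a small check. This is a hypothetical illustration: it assumes the minimum of 2 groups is counted after empty groups are excluded, and the `validate` function and message wording are not taken from the application.

```python
def validate(groups, min_groups=2):
    """Return (errors, warnings) for `groups` (name -> set of item ids)."""
    errors, warnings = [], []
    non_empty = [name for name, items in groups.items() if items]
    empty = [name for name, items in groups.items() if not items]
    if len(non_empty) < min_groups:
        errors.append(f"need at least {min_groups} non-empty groups")
    for name in empty:
        warnings.append(f"{name} is empty and would be excluded")
    return errors, warnings


groups = {"T cell": {1, 2}, "B cell": set(), "Tumor": {3}}
errors, warnings = validate(groups)
print("errors:", errors)
print("warnings:", warnings)
```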

Running Analysis

  1. Verify all groups are configured correctly
  2. Check the validation summary for errors
  3. Click Run DEG Analysis
  4. The analysis is submitted to the server
  5. Progress is shown in the Sidebar

Analysis Duration

Analysis time depends on:

  • Number of cells in each group
  • Number of groups being compared
  • Server load

You can close the tab and return later - results are saved.

Understanding Results

Results List

[Figure: DEG Results List]

Previous analysis results appear in the Sidebar:

  • Click a result to view it
  • Results show a progress indicator while processing
  • Completed results can be reopened anytime

Result Tabs

  • Open multiple results as tabs
  • Switch between results to compare
  • Close tabs when done

Heatmap View

The heatmap shows expression patterns across groups:

  • Rows - Genes (filtered by significance)
  • Columns - Comparison groups
  • Colors - Expression change (log fold change)
  • Clustering - Similar genes are grouped together
  • Statistical Method - Displays the test used (e.g., "Wilcoxon rank-sum test")

Statistical Method

The heatmap configuration panel displays the statistical method used for differential expression testing:

Method | Description
Wilcoxon rank-sum test | Non-parametric test comparing distributions between groups

This information helps ensure reproducibility and proper interpretation of results.
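
For readers unfamiliar with the test, the following is a minimal pure-Python sketch of a two-sided Wilcoxon rank-sum test using the normal approximation. It is a textbook variant (average ranks for ties, no tie or continuity correction); the exact variant and implementation used by the server are not documented here, so treat this only as an illustration of the idea.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p). A positive z means values in `x` tend to rank higher.
    """
    pooled = sorted((value, idx) for idx, value in enumerate(list(x) + list(y)))
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[pooled[k][1]] = average_rank
        i = j + 1

    n1, n2 = len(x), len(y)
    rank_sum_x = sum(ranks[:n1])
    expected = n1 * (n1 + n2 + 1) / 2
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (rank_sum_x - expected) / std
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


# Hypothetical expression values of one gene in two groups.
target = [5.1, 4.8, 6.0, 5.5, 5.9]
reference = [1.2, 0.9, 2.0, 1.5, 1.1]
z, p = rank_sum_test(target, reference)
print(f"z = {z:.3f}, p = {p:.4f}")
```

Because the test works on ranks rather than raw values, it makes no assumption that expression values are normally distributed.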

Reading the Heatmap

  • Red/warm colors - Higher expression (upregulated)
  • Blue/cool colors - Lower expression (downregulated)
  • Intensity - Magnitude of difference

Data Table

The data table provides detailed statistics for each gene:

Column | Description
Gene Name | Gene symbol or identifier
Score | Statistical test score
Log Fold Change | log2(Group B / Group A) expression ratio
P-value | Raw statistical significance
Adjusted P-value (FDR) | Multiple testing corrected p-value
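
Two of these columns are easy to illustrate with a short calculation. The sketch below computes a log2 fold change from two group means and adjusts raw p-values with the Benjamini-Hochberg procedure. Both details are assumptions: the document does not state how group expression is summarized, whether a pseudocount is used, or which correction method produces the FDR values.

```python
import math


def log2_fold_change(mean_b, mean_a, pseudocount=1e-9):
    """log2(Group B / Group A); the pseudocount (an assumption) avoids division by zero."""
    return math.log2((mean_b + pseudocount) / (mean_a + pseudocount))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


print(round(log2_fold_change(4.0, 1.0), 3))
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print([round(value, 4) for value in adjusted])
```

A log2 fold change of 2 corresponds to a 4-fold higher mean in Group B.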

Sorting Results

  • Click column headers to sort
  • Click again to reverse sort order
  • Sort by adjusted p-value to find most significant genes

Filtering Results

FDR Threshold

Filter genes by statistical significance using False Discovery Rate (FDR):

Level | Threshold | Description
No threshold | All | Show all genes
Exploratory | FDR < 0.1 | Lenient, for initial exploration
Standard | FDR < 0.05 | Commonly used significance level
High Confidence | FDR < 0.01 | Strict, high confidence results

Choosing a Threshold

  • Start with No threshold to see overall patterns
  • Use Standard (0.05) for publishable results
  • Use High Confidence (0.01) for validation candidates
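
The same thresholds can be applied to exported results outside the application. The following sketch assumes each gene is a dictionary with a hypothetical `fdr` key; the level names and cutoffs come from the table above.

```python
FDR_LEVELS = {
    "No threshold": None,
    "Exploratory": 0.1,
    "Standard": 0.05,
    "High Confidence": 0.01,
}


def filter_by_fdr(rows, level):
    """Keep rows whose adjusted p-value is strictly below the level's cutoff."""
    cutoff = FDR_LEVELS[level]
    if cutoff is None:
        return list(rows)
    return [row for row in rows if row["fdr"] < cutoff]


# Hypothetical results for three genes.
rows = [
    {"gene": "EPCAM", "fdr": 0.001},
    {"gene": "PTPRC", "fdr": 0.03},
    {"gene": "ACTB", "fdr": 0.9},
]
for level in FDR_LEVELS:
    print(level, [row["gene"] for row in filter_by_fdr(rows, level)])
```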

Direction Filter

Filter genes by expression change direction:

Filter | Criteria | Shows
All | - | All differentially expressed genes
Upregulated | logFC > 1 | Genes with higher expression in target
Downregulated | logFC < -1 | Genes with lower expression in target
Bidirectional | |logFC| > 1 | Strongly changed in either direction

Interpreting Direction

  • Upregulated (logFC > 1) - More than 2-fold higher in the target group, since log fold change is reported in log2
  • Downregulated (logFC < -1) - More than 2-fold lower in the target group
  • Bidirectional - Strongly changed regardless of direction
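
The direction criteria translate directly into a small predicate. This sketch uses the cutoff of 1 from the table above; the function name `matches_direction` is hypothetical, and the fold-change conversion assumes log2 values.

```python
def matches_direction(log_fc, direction, cutoff=1.0):
    """Return True if a log2 fold change passes the given direction filter."""
    if direction == "All":
        return True
    if direction == "Upregulated":
        return log_fc > cutoff
    if direction == "Downregulated":
        return log_fc < -cutoff
    if direction == "Bidirectional":
        return abs(log_fc) > cutoff
    raise ValueError(f"unknown direction: {direction}")


for log_fc in (2.4, -1.8, 0.3):
    fold = 2 ** abs(log_fc)  # convert log2 fold change back to a ratio
    passing = [
        name
        for name in ("All", "Upregulated", "Downregulated", "Bidirectional")
        if matches_direction(log_fc, name)
    ]
    print(f"logFC {log_fc:+.1f} ({fold:.2f}-fold): {passing}")
```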

Exporting Results

Heatmap Image Export

Capture the heatmap as a PNG image:

  1. Open a completed DEG analysis result
  2. In the Heatmap Controls section, click Export Image
  3. The image downloads with current filter settings applied

The exported image includes:

  • Heatmap with expression colors
  • Gene names (rows)
  • Group names (columns)
  • Color scale legend
  • Current filter settings (direction, FDR threshold, genes per group)

File naming format: DEG_Heatmap_{date}_{direction}_{fdr}_Top{n}.png

See Image Export for more details.

Data Table CSV Export

Export the full DEG results as a CSV file:

  1. Open a completed DEG analysis result
  2. In the Data Table section, click Export CSV
  3. The file downloads as deg_results.csv

The exported CSV includes for each gene:

  • Gene name
  • For each comparison group:
    • Log fold change (logFC)
    • Adjusted P-value
    • Score
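
Once exported, the CSV can be processed with any scripting language. The sketch below uses Python's standard `csv` module on an inline sample. The column names (`gene`, `Tumor_logFC`, `Tumor_adj_pvalue`, `Tumor_score`) are hypothetical, since the actual header layout of deg_results.csv is not specified here; check the header row of your file and adjust accordingly.

```python
import csv
import io

# Inline stand-in for deg_results.csv; the header names are assumptions.
sample = """gene,Tumor_logFC,Tumor_adj_pvalue,Tumor_score
EPCAM,2.4,0.001,12.3
PTPRC,-1.8,0.02,-9.1
ACTB,0.1,0.9,0.4
"""


def load_deg_rows(handle, gene_column="gene"):
    """Read DEG rows, converting every non-gene column to float."""
    rows = []
    for row in csv.DictReader(handle):
        gene = row.pop(gene_column)
        rows.append({"gene": gene, **{key: float(value) for key, value in row.items()}})
    return rows


rows = load_deg_rows(io.StringIO(sample))
significant = [row["gene"] for row in rows if row["Tumor_adj_pvalue"] < 0.05]
print(significant)
```

To read a real export, replace `io.StringIO(sample)` with a file opened via `open("deg_results.csv", newline="")`.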

Best Practices

Group Definition

  1. Use meaningful groups - Define groups based on biological questions
  2. Ensure sufficient size - Aim for at least 50-100 cells per group
  3. Check for overlaps - Review overlap statistics before analysis
  4. Name groups clearly - Use descriptive names for documentation

Result Interpretation

  1. Start broad, then filter - Begin with no thresholds, then apply filters
  2. Check multiple genes - Don't rely on single gene results
  3. Validate biologically - Confirm results make biological sense
  4. Document your analysis - Record group definitions and parameters

Common Pitfalls

Issue | Solution
Too few significant genes | Relax FDR threshold or check group definitions
Too many significant genes | Use stricter FDR threshold or direction filter
Unexpected results | Verify group definitions match your hypothesis
Analysis fails | Check that groups have sufficient, non-overlapping items

Troubleshooting

Analysis Won't Start

  • Check validation summary for errors
  • Ensure at least 2 groups are defined
  • Verify each group has items selected

No Results After Analysis

  • The analysis may still be processing (check progress)
  • Try refreshing the results list
  • Results expire after some time - rerun if needed

Unexpected Gene Rankings

  • Verify group definitions are correct
  • Check for overlapping items affecting statistics
  • Consider the overlap handling strategy used

MATISSE Explorer Documentation