3. Adding batch analysis and plots

milesAraya edited this page Apr 1, 2025 · 5 revisions

Adding New Visualizations to OptiNiSt Database

This guide walks through the process of adding new visualizations to the OptiNiSt Digital Marmoset Database website.

An example branch, template/add-new-graphs, contains all of the files that need to be changed, with each addition marked "# Add this:" for easy identification.

1. Complete File Structure Updates

When adding new visualizations, you'll need to create and modify several files. Here's a comprehensive list of files to update:

New Files:

  • studio/app/optinist/wrappers/expdb/my_new_analysis.py - Your analysis implementation
  • studio/app/optinist/wrappers/expdb/params/my_new_analysis.yaml - Parameters for the analysis

Modified Files:

  • studio/app/optinist/dataclass/stat.py - Add properties and visualization methods
  • studio/app/optinist/core/nwb/oristat.py - Update NWB schema with new data types
  • studio/app/optinist/core/expdb/batch_unit.py - Add methods to generate plots during batch processing
  • studio/app/optinist/wrappers/expdb/analyze_stats.py - Integrate your analysis into the main analysis pipeline
  • studio/app/optinist/wrappers/expdb/__init__.py - Register your analysis function in the component registry
  • experiments_public/view_configs/experiment_graphs.yaml - Register experiment-level visualizations on database website
  • studio/app/routers/optinist/routers/view_configs/experiment_graphs.yaml - For permanent changes, also update app template
  • studio/app/config/cell_graphs.yaml - Register cell-specific visualizations (if applicable)

Files You May Need to Modify (Context-Dependent):

  • studio/app/optinist/core/expdb/batch_runner.py - If you need to modify the batch execution flow
  • studio/app/optinist/wrappers/expdb/stat_file_convert.py - When modifying core processing in Stat data flow

2. Configuration File Locations

The OptiNiSt system loads configuration files from specific locations. Understanding these locations is important for proper integration of new visualizations.

2.1. Website YAML Configuration Files

YAML configuration files determine which visualizations appear in the Website:

experiments_public/view_configs/experiment_graphs.yaml  # For experiment-level visualizations
experiments_public/view_configs/cell_graphs.yaml        # For cell-specific visualizations

For experiment-level visualizations in experiment_graphs.yaml:

# Add these lines (content depends on your plot type)
my_primary_plot:
  title: "My Primary Visualization"  # Display name in the website
  dir: "plots"                       # Directory where plot is saved
  type: "single"                     # "single" for one image, "multi" for a file-pattern series

For cell-specific visualizations in cell_graphs.yaml:

my_cell_plot:
  title: "My Cell Analysis"
  dir: "cellplots"

2.2. Parameter Files

Parameter files for analysis functions should be placed in:

# Add this: Add analysis parameters
studio/app/optinist/wrappers/expdb/params/analyze_stats.yaml

These YAML files define default parameters for your analysis functions. For example:

# my_new_analysis
threshold: 0.5
min_value: 0
max_value: 1
units: "normalized"
use_normalization: true
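These defaults are typically merged with any user-supplied overrides before the analysis runs. A minimal sketch of that merge, using a plain dict to stand in for the parsed YAML (resolve_params is a hypothetical name, not OptiNiSt API):

```python
# Parsed YAML defaults, represented as a plain dict for illustration
DEFAULT_PARAMS = {
    "threshold": 0.5,
    "min_value": 0,
    "max_value": 1,
    "units": "normalized",
    "use_normalization": True,
}


def resolve_params(user_params=None):
    """Merge user-supplied overrides onto the YAML defaults."""
    merged = dict(DEFAULT_PARAMS)  # copy so defaults stay untouched
    if user_params:
        merged.update(user_params)
    return merged
```

This mirrors the `if params is None` fallback shown in the analysis function below, but lets a caller override a single parameter while keeping the rest at their defaults.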

2.3. Output File Organization

When saving visualization files, follow these directory conventions:

  • Main plot directory: {output_dir}/plots/

    • Use for most experiment-level visualizations
    • Referenced as dir: "plots" in YAML configuration
  • Cell-specific plots: {output_dir}/cellplots/ or similar

    • Use for cell-specific visualizations
    • Create a dedicated directory if needed for organization
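The conventions above can be sketched as two small path helpers (hypothetical names, not part of OptiNiSt):

```python
import os


def experiment_plot_path(output_dir, name):
    """Experiment-level plot: {output_dir}/plots/{name}.png"""
    return os.path.join(output_dir, "plots", f"{name}.png")


def cell_plot_path(output_dir, name, cell_index):
    """Cell-specific plot: {output_dir}/cellplots/{name}_{cell_index}.png"""
    return os.path.join(output_dir, "cellplots", f"{name}_{cell_index}.png")
```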

2.4. NWB Configuration

NWB schema definitions are defined in:

studio/app/optinist/core/nwb/oristat.py

This file defines the data types and structures for the Neurodata Without Borders (NWB) format integration.

2.5. The Property-Path-Config Connection

To successfully integrate visualizations with the frontend, understand the crucial connection between:

  1. StatData Properties: The property names in your StatData class (e.g., my_primary_plot)
  2. File System Paths: The paths and filenames where visualizations are saved (e.g., plots/my_primary_plot.png)
  3. YAML Configuration Entries: The entries in configuration files (e.g., my_primary_plot: { title: "My Plot", dir: "plots" })

These three elements must align precisely:

  • The property name and YAML key should match exactly
  • The directory in the YAML must match where your files are saved
  • The filename should match the pattern expected by the frontend:
    • Experiment graphs: {property_name}.png
    • Cell graphs: {property_name}_{cell_index}.png
    • Multi-component: Must match the pattern specified in YAML

Misalignment between these elements is the most common cause of visualizations not appearing in the Website.
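A quick way to catch such misalignments before deploying is a small consistency check. This is a hypothetical helper, sketched here to illustrate the three-way connection; check_alignment is not part of OptiNiSt:

```python
def check_alignment(property_names, yaml_config, saved_files):
    """Report mismatches between StatData property names, YAML keys,
    and saved filenames (hypothetical helper, illustration only)."""
    problems = []
    for name in property_names:
        # 1. Property name and YAML key must match exactly
        if name not in yaml_config:
            problems.append(f"{name}: no YAML entry")
            continue
        # 2-3. YAML dir + expected filename pattern must match a saved file
        expected = f"{yaml_config[name]['dir']}/{name}.png"
        if expected not in saved_files:
            problems.append(f"{name}: expected file {expected} not found")
    return problems
```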

3. Creating Backend Analysis Files

3.1. Implementation Overview

Before creating specific files, understand the basic components needed:

  1. A Python module containing your analysis algorithm
  2. A YAML parameter file defining configurable parameters
  3. Registration in the component registry
  4. Updates to the StatData class to store and process your data
  5. NWB schema updates to ensure proper data storage

3.2. Create Analysis Module

Create a new Python file in the studio/app/optinist/wrappers/expdb/ directory:

# studio/app/optinist/wrappers/expdb/my_new_analysis.py

# Add this: A new analysis function
import numpy as np

from studio.app.common.core.logger import AppLogger
from studio.app.optinist.core.nwb.nwb import NWBDATASET
from studio.app.optinist.dataclass import StatData

def my_new_analysis(
    stat: StatData, output_dir: str, params: dict = None, **kwargs
) -> dict:
    """
    Implement a new analysis and visualization method
    Parameters
    ----------
    stat : StatData
        The statistics data object to update
    output_dir : str
        Output directory for results
    params : dict, optional
        Parameters for the analysis
    Returns
    -------
    dict
        Dictionary of outputs including updated stat object and visualizations
    """
    logger = AppLogger.get_logger()
    logger.info("Running my_new_analysis...")
    # Set default parameters if none provided
    if params is None:
        params = {
            "threshold": 0.5,
        }

    # Process data and calculate summary statistics
    all_processed_data = []
    mean_data = np.zeros(stat.ncells)

    for i in range(stat.ncells):
        processed = process_my_data(stat.data_table[i])
        all_processed_data.append(processed)
        mean_data[i] = np.mean(processed)  # Summary statistic (mean)

    # Create 3D array from processed data
    if all_processed_data:
        first_array = all_processed_data[0]
        if hasattr(first_array, "shape"):
            rows, cols = first_array.shape
            stat.my_new_metric = np.array(all_processed_data).reshape(
                len(all_processed_data), rows, cols
            )

    # Store results and calculate responsive cells
    stat.my_summary_value = mean_data
    stat.index_responsive_cells = mean_data >= params["threshold"]
    stat.ncells_responsive = np.sum(stat.index_responsive_cells)

    # Call the setter to create visualization objects
    stat.set_my_new_props()

    # Return updated stat object and visualization properties
    return {
        "stat": stat,
        "my_primary_plot": stat.my_primary_plot,
        "my_summary_plot": stat.my_summary_plot,
        "nwbfile": {NWBDATASET.ORISTATS: stat.nwb_dict_my_new_analysis},
    }


def process_my_data(data):
    """
    Process input data for the analysis
    Parameters
    ----------
    data : numpy.ndarray
        Raw input data to process
    Returns
    -------
    numpy.ndarray
        Processed data ready for visualization
    """
    # Implement your analysis algorithm
    processed_data = data.copy()

    # Example processing
    processed_data = processed_data - np.mean(processed_data, axis=1, keepdims=True)
    processed_data = processed_data / np.std(processed_data, axis=1, keepdims=True)

    return processed_data

3.3. Create Parameter File

Create a YAML parameter file to define default parameters:

# studio/app/optinist/wrappers/expdb/params/my_new_analysis.yaml
# Add this: Add analysis parameters

# Analysis parameters
threshold: 0.5
min_value: 0
max_value: 1
units: "normalized"

# Additional options
use_normalization: true

3.4. Add Analysis Module to Init File

Register your analysis function in the component registry by adding it to the __init__.py file:

# studio/app/optinist/wrappers/expdb/__init__.py

# Add this: Import your new analysis function
from studio.app.optinist.wrappers.expdb.my_new_analysis import my_new_analysis

expdb_wrapper_dict = {
    "expdb": {
        # ... existing entries ...
        
        # Add this: Register your new analysis
        "preset_components": {
            # ... existing components ...
            "my_new_analysis": {
                "function": my_new_analysis,
                "conda_name": "expdb",  # Use appropriate conda environment
            },
        },
    },
}

3.5. Update StatData Class

Add new properties to the StatData class: initialize them in __init__, add a setter method (set_*_props) that creates visualization objects from your calculated data, and add a property that gathers the results into a dictionary for the NWB file.

# studio/app/optinist/dataclass/stat.py

class StatData(BaseData):
    def __init__(self, data_table=None, **kwargs):
        super().__init__(**kwargs)
        
        # Initialize core properties
        self.data_table = data_table
        self.ncells = len(data_table) if data_table is not None else 0
        
        # Add this: Initialize new analysis properties
        # --- my_new_analysis ---
        self.my_new_metric = None
        self.my_new_value = np.full(self.ncells, np.nan)
        self.my_summary_value = np.full(self.ncells, np.nan)
        self.index_responsive_cells = None
        self.ncells_responsive = None


    # Add this: Create a setter method for visualization objects
    def set_my_new_props(self):
        """Create visualization objects for my new analysis."""
        logger = AppLogger.get_logger()
        logger.info("Creating visualizations for my_new_analysis...")
        # Create primary visualization (histogram of values)
        # HistogramData only accepts data and file_name parameters
        self.my_primary_plot = HistogramData(
            data=self.my_summary_value[~np.isnan(self.my_summary_value)],
            file_name="my_primary_plot",  # Must match property name
        )
        # Create summary visualization (pie chart of responsive cells)
        # PieData requires data and labels parameters
        self.my_summary_plot = PieData(
            data=np.array(
                (
                    self.ncells_responsive,
                    self.ncells - self.ncells_responsive,
                ),
                dtype=np.float64,
            ),
            labels=["Responsive", "Non-responsive"],
            file_name="my_summary_plot",
        )

    # Add this: Add your data to a dictionary
    @property
    def nwb_dict_my_new_analysis(self) -> dict:
        """Return NWB dictionary for my new analysis"""
        # Only include fields that are defined in MY_NEW_ANALYSIS_TYPES
        nwb_dict = {}
        nwb_dict["my_new_metric"] = self.my_new_metric
        nwb_dict["my_summary_value"] = self.my_summary_value
        nwb_dict["index_responsive_cells"] = self.index_responsive_cells
        nwb_dict["ncells_responsive"] = self.ncells_responsive

        return nwb_dict


    @property
    def nwb_dict_all(self) -> dict:
        """Return complete NWB dictionary for all properties"""
        nwb_dict = {
            # ... existing entries ...

            # Add this: Add your new properties to the NWB dictionary
            **self.nwb_dict_my_new_analysis,
        }
        return nwb_dict

3.6. Update StatData Visualization Properties

After adding properties to the StatData class to store your analysis results, you need to create visualization objects. These objects are used by the batch processing pipeline to generate plots.

3.6.1. Call the Setter in Your Analysis Function

Your analysis function should call this setter after calculating the necessary data:

def my_new_analysis(
    stat,
    cnmf_info,
    output_dir,
    params=None,
    **kwargs,
) -> dict:
    # ... your analysis code that calculates values ...
    
    # Call the setter to create visualization objects
    stat.set_my_new_props()
    

3.6.2. Visualization Object Types

The OptiNiSt system supports several visualization types:

  • LineData: For line plots (e.g., time series, tuning curves)
  • HistogramData: For histogram plots
  • PieData: For pie charts
  • HeatmapData: For heatmaps
  • ScatterData: For scatter plots
  • PolarData: For polar plots (e.g., orientation preferences)

Choose the appropriate visualization type based on your data and analysis goals. Each visualization type has specific parameters that control its appearance.

3.6.3. Integration with Batch Processing

The batch processing pipeline will call save_plot() on your visualization objects:

# In batch_unit.py, generate_plots or generate_plots_using_cnmf_info

# Add this: to generate and save visualizations
stat_data.my_primary_plot.save_plot(dir_path)
stat_data.my_summary_plot.save_plot(dir_path)

This pattern follows existing implementations like set_pca_props(), set_kmeans_props(), and set_anova_props(), ensuring your visualizations are properly integrated with the rest of the system.

3.7. Update NWB Schema

Add your data types to the NWB schema:

# studio/app/optinist/core/nwb/oristat.py

# Add this: Define types for your new analysis
MY_NEW_ANALYSIS_TYPES = {
    "my_new_metric": ("float", (None, None, None)),       # 3D array, variable shape (ncells, rows, cols)
    "my_summary_value": ("float", (None,)),               # 1D array, one value per cell
    "index_responsive_cells": ("bool", (None,)),          # 1D boolean array
    "ncells_responsive": "int",                           # Single integer value
}

# Update oristat_ext definition
oristat_ext = NWBGroupSpec(
    doc="oristats",
    datasets=[
        # Add this: Add your new analysis types
        *get_dataset_list(MY_NEW_ANALYSIS_TYPES),
    ],
)

3.8. Integrate with analyze_stats.py Pipeline

Your new analysis must be properly integrated into the main analysis pipeline in analyze_stats.py:

# studio/app/optinist/wrappers/expdb/analyze_stats.py

# Add this: Import your new analysis function
from studio.app.optinist.wrappers.expdb import my_new_analysis

def analyze_stats(expdb: ExpDbData, output_dir: str, params: dict = None, **kwargs) -> dict:
    # The existing analysis chain
    stat = stat_file_convert(expdb, output_dir, params).get("stat")
    stat = anova1_mult(stat, output_dir, params).get("stat")
    stat = vector_average(stat, output_dir, params).get("stat")
    stat = curvefit_tuning(stat, output_dir, params).get("stat")
    # Add this: Call your new analysis
    stat = my_new_analysis(stat, kwargs["cnmf_info"], output_dir, params).get("stat")
    
    # Update the return dictionary to include your new properties
    return {
        "stat": stat,
        "tuning_curve": stat.tuning_curve,

        # Add this: Return your new analysis results
        "my_primary_plot": stat.my_primary_plot,
        "my_summary_plot": stat.my_summary_plot,

        "nwbfile": {NWBDATASET.ORISTATS: stat.nwb_dict_all},
    }

3.9. Integrating with Batch Processing

Understanding the batch processing pipeline is crucial for correctly integrating your visualizations. The OptiNiSt system uses a two-tiered approach:

  1. Core Analysis Pipeline (analyze_stats.py): Handles basic statistical calculations and data processing
  2. Visualization Generation (batch_unit.py): Manages plot creation and file saving

3.9.1. Types of Visualizations in the Pipeline

The system has two primary methods for generating visualizations:

  • Standard Plots (generate_plots): For visualizations that only need the output of analyze_stats.py
  • CNMF-dependent Plots (generate_plots_using_cnmf_info): For visualizations that require additional cell detection data

3.9.2. Adding Standard Visualizations

For visualizations that only need the StatData object, modify the generate_plots method:

# studio/app/optinist/core/expdb/batch_unit.py

@stopwatch(callback=__stopwatch_callback)
def generate_plots(self, stat_data: StatData):
    self.logger_.info("process 'generate_plots' start.")

    for expdb_path in self.expdb_paths:
        dir_path = expdb_path.plot_dir
        create_directory(dir_path)

        # Existing visualizations
        stat_data.tuning_curve.save_plot(dir_path)
        stat_data.tuning_curve_polar.save_plot(dir_path)
        # ... other existing plots ...

        # ADD YOUR VISUALIZATION HERE
        if hasattr(stat_data, "my_primary_plot") and stat_data.my_primary_plot is not None:
            stat_data.my_primary_plot.save_plot(dir_path)
            
        if hasattr(stat_data, "my_summary_plot") and stat_data.my_summary_plot is not None:
            stat_data.my_summary_plot.save_plot(dir_path)
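The two hasattr checks above follow a repeated pattern that can be factored into a small helper. This is a sketch, not existing OptiNiSt code; save_plot_if_present is a hypothetical name:

```python
def save_plot_if_present(stat_data, attr_name, dir_path):
    """Save a visualization only if the property exists and is set.

    Returns True if a plot was saved, False otherwise.
    """
    plot = getattr(stat_data, attr_name, None)
    if plot is not None:
        plot.save_plot(dir_path)
        return True
    return False
```

Each new visualization then becomes a one-liner, e.g. `save_plot_if_present(stat_data, "my_summary_plot", dir_path)`.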

3.9.3. Handling Additional Inputs

Sometimes your visualization may need additional inputs beyond just StatData and cnmf_info. Looking at batch_unit.py, there's a clear pattern for how to handle this:

def generate_plots_with_custom_input(self, stat_data: StatData, additional_data):
    """Example of handling custom input requirements"""
    self.logger_.info("process 'generate_plots_with_custom_input' start.")
    
    # Process your data with the additional input
    results = my_custom_analysis(
        stat=stat_data,
        custom_data=additional_data,
        output_dir=self.raw_path.output_dir,
        params=get_default_params("my_custom_analysis"),
        nwbfile=self.nwbfile,
    )
    
    # Update NWB file and save visualizations as shown in previous examples
    # ...

3.9.4. Specific example: CNMF-dependent Visualizations

For visualizations that require cell detection data from CNMF, follow the pattern in generate_plots_using_cnmf_info:

def generate_plots_using_cnmf_info(self, stat_data: StatData, cnmf_info: dict):
    self.logger_.info("process 'generate_plots_using_cnmf_info' start.")
    
    # Existing PCA analysis
    pca_results = pca_analysis(
        stat=stat_data,
        cnmf_info=cnmf_info,
        output_dir=self.raw_path.output_dir,
        params=get_default_params("pca_analysis"),
        nwbfile=self.nwbfile,
    )
    
    # Existing k-means analysis
    kmeans_results = kmeans_analysis(
        stat=stat_data,
        cnmf_info=cnmf_info,
        output_dir=self.raw_path.output_dir,
        params=get_default_params("kmeans_analysis"),
        nwbfile=self.nwbfile,
    )
    
    # Update NWB file with results
    self.nwbfile = pca_results["nwbfile"]
    self.nwbfile = kmeans_results["nwbfile"]
    
    # ADD YOUR ANALYSIS HERE
    self.logger_.info("process 'generate_my_new_analysis_plots' start.")
    
    # Run your analysis function with CNMF data
    my_analysis_results = my_new_analysis(
        stat=stat_data,
        cnmf_info=cnmf_info,
        output_dir=self.raw_path.output_dir,
        params=get_default_params("my_new_analysis"),
        nwbfile=self.nwbfile,
    )
    
    # Update NWB file
    if "nwbfile" in my_analysis_results:
        self.nwbfile = my_analysis_results["nwbfile"]
    
    # Save plots for each path
    for expdb_path in self.expdb_paths:
        dir_path = expdb_path.plot_dir
        create_directory(dir_path)
        
        # Save visualization objects for all analyses
        stat_data.pca_analysis.save_plot(dir_path)
        stat_data.pca_analysis_variance.save_plot(dir_path)
        stat_data.pca_contribution.save_plot(dir_path)
        stat_data.clustering_analysis.save_plot(dir_path)
        
        # Save your visualization objects
        if hasattr(stat_data, "my_primary_plot") and stat_data.my_primary_plot is not None:
            stat_data.my_primary_plot.save_plot(dir_path)
        
        # Generate additional visualizations if needed
        # This pattern is useful for multi-component visualizations
        if hasattr(stat_data, "my_components") and stat_data.my_components is not None:
            generate_my_component_visualization(
                components=stat_data.my_components,
                output_dir=dir_path,
            )

3.9.5. When to Use Each Approach

  • Use the generate_plots method when:

    • Your visualization only needs the statistical data from StatData
    • Your analysis was already performed in the core pipeline (analyze_stats.py)
    • You're visualizing standard tuning curve properties
  • Use the generate_plots_using_cnmf_info method when:

    • You need direct access to fluorescence traces or cell masks
    • Your visualization requires spatial information about cells
    • You're performing more complex analyses like PCA, clustering, or dimensionality reduction

Remember that StatData properties must be populated before they can be visualized:

  1. If using generate_plots, ensure your analysis in analyze_stats.py populates your visualization properties
  2. If using generate_plots_using_cnmf_info, ensure your analysis function populates these properties

The existing pca_analysis and kmeans_analysis functions demonstrate this pattern, where they:

  1. Perform the analysis
  2. Store results in StatData properties
  3. Return an updated StatData object
  4. Generate visualizations based on the analysis results

The key is to follow the existing patterns in the codebase, which help maintain consistency and ensure your visualizations are properly integrated into the batch processing workflow.

3.10. Configure Frontend Visualization

Register your visualizations in the experiment_graphs.yaml file:

# experiments_public/view_configs/experiment_graphs.yaml

# Add this: Register plots to appear on the website
my_primary_plot:
  title: "My Primary Analysis"  # Display name in website
  dir: "plots"                  # Directory containing visualization
  type: "single"                # Standard single-image visualization

my_summary_plot:
  title: "Summary Statistics"
  dir: "plots"
  type: "single"

3.11. Frontend Data Flow

Understanding the frontend data flow:

  1. expdb.py loads configuration files via the load_graph_configs() function
  2. When the frontend requests data, these configurations are used to:
    • Create URLs for your plot images
    • Add them to the response as graph_urls properties
  3. The frontend React components display these URLs as images

The configuration structure directly affects how visualizations appear in the website:

  • title: Controls the column name in the website
  • dir: Must match where your plot is saved in the filesystem
  • For experiment graphs, the format {name}.png is expected
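Putting the conventions together, the URLs the frontend requests can be sketched as follows (hypothetical helpers, shown only to make the expected formats concrete):

```python
def experiment_graph_url(exp_dir, config_key, config_entry):
    """Single-image experiment graph URL: {exp_dir}/{dir}/{key}.png"""
    return f"{exp_dir}/{config_entry['dir']}/{config_key}.png"


def cell_graph_url(exp_dir, config_key, config_entry, cell_index):
    """Cell graph URL: {exp_dir}/{dir}/{key}_{cell_index}.png"""
    return f"{exp_dir}/{config_entry['dir']}/{config_key}_{cell_index}.png"
```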

3.12. Advanced: Modifying Core Data Processing

While most analyses can be added without changing core files, some scenarios require updating stat_file_convert.py:

When to Modify Core Processing

  • New Feature Extraction: When calculating fundamental metrics not currently extracted
  • Different Data Types: When analyzing data with different structures or stimulus paradigms
  • Signal Processing Changes: When implementing alternative preprocessing techniques

The addition of non_circular_index() demonstrates a proper case for extending core feature extraction.

4. Other Graph Types

Beyond standard experiment-level visualizations, the OptiNiSt system supports specialized graph types for different use cases.

4.1. Cell-Specific Graphs

Cell-specific graphs require special handling as they need to be generated for each detected cell.

4.1.1. Create Cell-Specific Visualization Method

In your batch processing class, implement a method to generate individual plots for each cell:

def generate_my_cell_plots(self):
    """Generate plots for each individual cell"""
    self.logger_.info("process 'generate_my_cell_plots' start.")
    
    # Create directories for cell plots
    for expdb_path in self.expdb_paths:
        create_directory(join_filepath([expdb_path.output_dir, "cellplots"]))
    
    # Load cell data
    _, _, ncells = self.load_raw_cellmask_data()
    
    # Generate a plot for each cell
    for i in range(ncells):
        # Get data for this specific cell
        cell_data = self.get_data_for_cell(i)  # Your function to get cell-specific data
        
        # Create and save the plot with cell index in filename
        for expdb_path in self.expdb_paths:
            # Create matplotlib figure
            plt.figure()
            plt.plot(cell_data)
            plt.title(f"Cell {i} Analysis")
            
            # Save with the required naming pattern: {name}_{cell_index}.png
            plot_path = join_filepath(
                [expdb_path.output_dir, "cellplots", f"my_cell_plot_{i}.png"]
            )
            plt.savefig(plot_path, bbox_inches="tight")
            plt.close()
            
            # Always create thumbnails for website performance
            save_thumbnail(plot_path)
    
    self.logger_.info("process 'generate_my_cell_plots' complete.")

4.1.2. Integrate with Batch Processing

Call this method from your main processing flow:

def generate_plots(self, stat_data: StatData):
    # ... existing code ...
    
    # Generate cell-specific plots
    self.generate_my_cell_plots()
    
    # ... rest of method ...

4.1.3. Register Cell Plots in Configuration

Register your cell-specific visualization in the cell_graphs.yaml file:

# studio/app/config/cell_graphs.yaml
my_cell_plot:
  title: "My Cell Analysis"  # Display name in website
  dir: "cellplots"           # Directory where plots are saved

4.1.4. Understanding the Cell Graph URL Generation

The cell plot configuration is used by the get_cell_urls function in expdb.py:

def get_cell_urls(source, exp_dir, index: int, params=None):
    """
    Generates URLs for cell-specific visualizations.
    
    This function expects plot files to follow the pattern:
    {name}_{cell_index}.png
    """
    return [
        ImageInfo(urls=[f"{exp_dir}/{v['dir']}/{k}_{index}.png"], params=params)
        for k, v in source.items()
    ]

The frontend will display your cell-specific plots in the cell details view, with each plot in its own column labeled with the title from your configuration.

4.2. Multi-Component Visualizations

Multi-component visualizations allow you to generate and display a series of related images as a single visualization component. This is especially useful for methods like PCA or clustering analysis that produce multiple output components.

4.2.1. When to Use Multi-Component Visualizations

Use multi-component visualizations when your analysis produces:

  • Multiple related images that should be grouped together (e.g., PCA components)
  • A series of visualizations that share the same type but differ in parameters
  • Component-wise breakdowns of analysis results
  • Time series or sequential data visualizations

4.2.2. Generate Multiple Image Files with Consistent Naming

Modify your analysis function to generate multiple PNG files with a consistent naming pattern:

def generate_multi_component_visualization(self, data, output_dir, n_components=5):
    """Generate multiple component visualizations"""
    
    # For each component, generate a separate visualization
    for i in range(n_components):
        # Process this component (for example, PCA components)
        component_data = data[:, i]
        
        # Generate spatial map
        plt.figure(figsize=(8, 6))
        plt.imshow(component_data.reshape(self.height, self.width), cmap='viridis')
        plt.colorbar(label='Weight')
        plt.title(f"Component {i+1} Spatial Map")
        
        # Save spatial component
        spatial_path = join_filepath(
            [output_dir, f"my_component_{i+1}_spatial.png"]
        )
        plt.savefig(spatial_path, bbox_inches="tight", dpi=100)
        plt.close()
        
        # Create thumbnail
        save_thumbnail(spatial_path)
        
        # Generate time course
        plt.figure(figsize=(10, 4))
        plt.plot(self.time_axis, self.time_courses[i], 'k-')
        plt.title(f"Component {i+1} Time Course")
        plt.xlabel("Time (s)")
        plt.ylabel("Amplitude")
        
        # Save time component
        time_path = join_filepath(
            [output_dir, f"my_component_{i+1}_time.png"]
        )
        plt.savefig(time_path, bbox_inches="tight", dpi=100)
        plt.close()
        
        # Create thumbnail
        save_thumbnail(time_path)

4.2.3. Register Multi-Component Configuration

In your experiment_graphs.yaml file, register your visualization with the special multi type and a file pattern:

# experiments_public/view_configs/experiment_graphs.yaml

my_components_spatial:
  title: "Component Spatial Maps"   # Display name in website
  dir: "plots"                      # Directory containing images
  type: "multi"                     # Specifies multi-component visualization
  pattern: "my_component_*_spatial.png"  # Pattern to match files

my_components_time:
  title: "Component Time Courses"
  dir: "plots"
  type: "multi"
  pattern: "my_component_*_time.png"

The pattern parameter uses glob syntax to match multiple files. Common patterns include:

  • my_component_*_plot.png - Matches all files with variable content in the middle
  • my_component_[0-9]_plot.png - Matches only single-digit components
  • my_component_{type}_*.png - For categorized components
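The glob matching and thumbnail exclusion can be illustrated with Python's fnmatch module (the filenames below are examples; the exclusion mirrors the set-difference in get_experiment_urls):

```python
from fnmatch import fnmatch

# Example filenames that might exist in the plots directory
filenames = [
    "my_component_1_spatial.png",
    "my_component_2_spatial.png",
    "my_component_1_spatial.thumb.png",
    "my_component_1_time.png",
]

# Files selected by the "multi" pattern, with thumbnails excluded
spatial = sorted(
    name for name in filenames
    if fnmatch(name, "my_component_*_spatial.png")
    and not name.endswith(".thumb.png")
)
# spatial is ["my_component_1_spatial.png", "my_component_2_spatial.png"]
```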

4.2.4. How Multi-Components Are Processed

In the backend, the get_experiment_urls function in expdb.py handles multi-component visualizations:

def get_experiment_urls(source, exp_dir, params=None):
    result = []
    for key, value in source.items():
        if value.get("type") == "multi":
            # Handle multi-image components
            component_dir = value["dir"]
            pattern = value["pattern"]
            
            # Construct directory path
            dirs = exp_dir.split("/")
            pub_dir = f"{EXPDB_DIRPATH.PUBLIC_EXPDB_DIR}/{dirs[-2]}/{dirs[-1]}/{component_dir}/"
            
            # Find all matching files using the pattern
            component_files = sorted(
                list(
                    set(glob(f"{pub_dir}/{pattern}"))
                    - set(glob(f"{pub_dir}/*.thumb.png"))
                )
            )
            
            # Create a single ImageInfo with all found files
            if component_files:
                urls = [
                    f"{exp_dir}/{component_dir}/{os.path.basename(file)}"
                    for file in component_files
                ]
                thumb_urls = [url.replace(".png", ".thumb.png") for url in urls]
                
                result.append(
                    ImageInfo(urls=urls, thumb_urls=thumb_urls, params=params)
                )
        else:
            # Standard single-image component
            # ...

This code finds all matching files and bundles them into a single ImageInfo object with multiple URLs.

4.2.5. Integrating with Batch Processing

Add your multi-component visualization generation to the batch processing workflow:

def generate_plots_using_data(self, stat_data, additional_info):
    # ...existing code...
    
    # Save plots for each path
    for expdb_path in self.expdb_paths:
        create_directory(expdb_path.plot_dir)
        
        # Generate multi-component visualizations
        self.generate_multi_component_visualization(
            data=stat_data.component_data,
            output_dir=expdb_path.plot_dir,
            n_components=5  # Or get from parameters
        )

4.2.6. Thumbnail Generation

Thumbnails are used for preview and navigation in the interface, and are therefore required for plots to be uploaded to the database website.

Thumbnail images must be saved using the following naming pattern:

  • my_plot.png # Original image
  • my_plot.thumb.png # Thumbnail image, saved in the same folder

save_thumbnail(plot_path)  # Call this after saving the plot with matplotlib

In batch processing, thumbnails are generated using this function:

# Where THUMBNAIL_HEIGHT = 128
def save_thumbnail(plot_file):
    with Image.open(plot_file) as img:
        # Scale the width to preserve the aspect ratio at the fixed height
        w, h = img.size
        new_width = int(w * (THUMBNAIL_HEIGHT / h))
        # LANCZOS is a high-quality downsampling filter
        thumb_img = img.resize((new_width, THUMBNAIL_HEIGHT), Image.Resampling.LANCZOS)
        thumb_img.save(plot_file.replace(".png", ".thumb.png"))

The resulting thumbnails are automatically referenced by the get_experiment_urls function when it builds image URLs for the frontend.
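The resizing arithmetic is simple to verify in isolation: the height is fixed at THUMBNAIL_HEIGHT and the width scales proportionally. A minimal sketch (pure arithmetic, no PIL; `thumbnail_size` is an illustrative helper, not part of the codebase):

```python
THUMBNAIL_HEIGHT = 128


def thumbnail_size(width: int, height: int) -> tuple[int, int]:
    """Compute thumbnail dimensions the same way save_thumbnail does:
    fixed height, width scaled to preserve the aspect ratio."""
    new_width = int(width * (THUMBNAIL_HEIGHT / height))
    return new_width, THUMBNAIL_HEIGHT


print(thumbnail_size(1024, 512))  # → (256, 128): a 2:1 image stays 2:1
```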

4.3. Tips for Testing Visualizations

To test if your visualizations appear correctly:

  1. Run a batch job processing an experiment
  2. Check that your PNG files are generated in the correct directories
  3. Verify that thumbnails (with .thumb.png extension) are also created
  4. Use the database website to view the experiment and confirm your plots appear in the interface

If your visualizations don't appear:

  • Check browser developer tools for 404 errors (incorrect paths)
  • Verify that your plot names in code match those in the YAML configuration
  • Ensure your plots are being saved to the correct directories with correct filenames
  • For cell plots, confirm the naming pattern {name}_{cell_index}.png is followed
  • For multi-component plots, verify the glob pattern in your YAML matches the filenames
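Several of the checks above can be scripted. The sketch below is a hypothetical sanity check (the `missing_thumbnails` helper is illustrative, not part of OptiNiSt) that lists plots lacking a matching `.thumb.png`, one common reason plots fail to appear:

```python
import os
import tempfile
from glob import glob


def missing_thumbnails(plot_dir: str) -> list[str]:
    """Return the names of PNG plots in plot_dir that have no
    corresponding .thumb.png thumbnail next to them."""
    missing = []
    for plot in glob(os.path.join(plot_dir, "*.png")):
        if plot.endswith(".thumb.png"):
            continue  # skip the thumbnails themselves
        if not os.path.exists(plot.replace(".png", ".thumb.png")):
            missing.append(os.path.basename(plot))
    return sorted(missing)


# Demonstration with throwaway files
with tempfile.TemporaryDirectory() as plot_dir:
    for name in ("ok.png", "ok.thumb.png", "broken.png"):
        open(os.path.join(plot_dir, name), "w").close()
    print(missing_thumbnails(plot_dir))  # → ['broken.png']
```

Running such a check over the batch output directory before opening the website narrows the problem to either file generation or YAML configuration.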

5. Frontend Integration

Once your backend implementation is working correctly, you can integrate your plots into the frontend:

5.1. Overview

The frontend integration requires updates to YAML configuration files and an understanding of how visualization data flows to the website.

5.2. Register Plots in YAML Configuration Files

Your visualizations need to be registered in the YAML configuration files:

5.2.1. For Experiment-Level Visualizations:

Update both copies of experiment_graphs.yaml (see 6.2. for an explanation of the two locations):

  • Inside the app: studio/app/routers/optinist/routers/view_configs/experiment_graphs.yaml
  • Inside the experiments_public folder outside the app: experiments_public/view_configs/experiment_graphs.yaml

# For the single file saved at experiments_public/plots/my_primary_plot.png
my_primary_plot:                     # Must match the name of the saved files (my_primary_plot.png)
  title: "My Primary Visualization"  # Display name in the website
  dir: "plots"                       # Directory where plot is saved (`experiments_public/plots/`)
  type: "single"                     # "single" for one image, "multi" for multiple images

# For the multiple files saved at:
# experiments_public/my_experiments_folder/my_multi_plot_1.png
# experiments_public/my_experiments_folder/my_multi_plot_2.png
my_multi_plot:                      # Must match the base name of the saved files (my_multi_plot_*.png)
  title: "My multi-plots"           # Display name in the website
  dir: "my_experiments_folder"      # Directory where plots are saved (`experiments_public/my_experiments_folder/`)
  type: "multi"                     # "single" for one image, "multi" for multiple images
  pattern: "my_multi_plot_*.png"    # Glob pattern for the saved filenames, where * is a number
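The `pattern` field uses shell-style globbing against on-disk filenames. Python's fnmatch lets you sanity-check that your saved filenames will be matched (the filenames below are illustrative):

```python
from fnmatch import fnmatch

pattern = "my_multi_plot_*.png"
files = [
    "my_multi_plot_1.png",
    "my_multi_plot_2.png",
    "my_multi_plot_1.thumb.png",  # note: * also matches "1.thumb"!
    "other.png",
]
matches = [f for f in files if fnmatch(f, pattern)]
print(matches)  # the thumbnail matches too; 'other.png' does not
```

Because the glob also matches thumbnails, the backend loader explicitly subtracts `*.thumb.png` from the matched set (see the code in section 4 above).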

5.2.2. For Cell-Level Visualizations:

Update studio/app/config/cell_graphs.yaml:

# Add your cell-specific visualizations
my_cell_plot:
  title: "My Cell Analysis"         # Display name in the website
  dir: "cellplots"                  # Directory where plot is saved

5.3. Understanding the Frontend Data Flow

The YAML files you've modified are loaded by the following process:

  1. expdb.py loads them via the load_graph_configs() function
  2. When the frontend requests data, experiment_transformer() and expdbcell_transformer() functions use these configurations to:
    • Create URLs for your plot images
    • Add them to the response as graph_urls or cell_image_urls properties
  3. The frontend React components display these URLs as images

The YAML configuration structure directly affects how visualizations appear in the website:

  • title: Controls the column name in the website
  • dir: Must match where your plot is saved in the filesystem
  • For cell graphs, the format {name}_{cell_index}.png is expected
  • For experiment graphs, the format {name}.png is expected

5.4. Testing Your Frontend Integration

To test if your visualization appears correctly:

  1. Run a batch job processing an experiment
  2. Check that your PNG files are generated in the correct directories
  3. Verify that thumbnails (with .thumb.png extension) are also created
  4. Use the database website to view the experiment and confirm your plots appear in the interface

If your visualizations don't appear:

  • Check browser developer tools for 404 errors (incorrect paths)
  • Verify that your plot names in code match those in the YAML configuration
  • Ensure your plots are being saved to the correct directories with correct filenames

5.5. Note on the Property-Path-Config Connection

To successfully integrate your visualizations with the frontend, you must understand the crucial connection between:

  1. StatData Properties: The property names in your StatData class (e.g., my_primary_plot)
  2. File System Paths: The paths and filenames where visualizations are saved (e.g., plots/my_primary_plot.png)
  3. YAML Configuration Entries: The entries in experiment_graphs.yaml or cell_graphs.yaml (e.g., my_primary_plot: { title: "My Plot", dir: "plots" })

These three elements must align precisely:

  • The property name and YAML key should match exactly
  • The directory in the YAML must match where your files are saved
  • The filename should match the pattern expected by the frontend:
    • Experiment graphs: {property_name}.png
    • Cell graphs: {property_name}_{cell_index}.png
    • Multi-component: Must match the pattern specified in YAML

Misalignment between these elements is the most common cause of visualizations not appearing in the website.
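Since misalignment is the most common failure mode, it can be worth scripting the check. The sketch below is a hypothetical validator (`check_alignment` and all names in it are illustrative, not part of OptiNiSt) that cross-references property names, YAML entries, and files on disk:

```python
import os
import tempfile


def check_alignment(property_names, graph_config, plots_root):
    """Report properties with no YAML entry, or whose expected file
    ({dir}/{name}.png under plots_root) is missing on disk."""
    problems = []
    for name in property_names:
        entry = graph_config.get(name)
        if entry is None:
            problems.append(f"{name}: no YAML entry")
            continue
        expected = os.path.join(plots_root, entry["dir"], f"{name}.png")
        if not os.path.exists(expected):
            problems.append(f"{name}: missing file {expected}")
    return problems


# Demonstration with throwaway files
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "plots"))
    open(os.path.join(root, "plots", "my_primary_plot.png"), "w").close()
    config = {"my_primary_plot": {"title": "My Plot", "dir": "plots"}}
    print(check_alignment(["my_primary_plot", "my_other_plot"], config, root))
    # my_primary_plot aligns; my_other_plot is reported as unregistered
```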

6. Example of all files related to one plot type

6.1. Backend Components

StatData Class (studio/app/optinist/dataclass/stat.py)

  • __init__: Initializes stat data with arrays for analysis results
  • set_anova_props: Creates PlotData objects for ANOVA visualization (direction_responsivity_ratio, orientation_responsivity_ratio, direction_selectivity, orientation_selectivity, best_responsivity)
  • fill_nan_with_none: Utility for database storage of NaN values

Analysis Pipeline

Main Orchestrator (studio/app/optinist/wrappers/expdb/analyze_stats.py)
  • analyze_stats: Main orchestrator that calls stat_file_convert, anova1_mult, vector_average, and curvefit_tuning
ANOVA Analysis (studio/app/optinist/wrappers/expdb/anova1_mult.py)
  • anova1_mult: Performs ANOVA analysis on orientation/direction data
  • multi_compare: Helper for multiple comparisons with Tukey test
Data Preparation (studio/app/optinist/wrappers/expdb/stat_file_convert.py)
  • stat_file_convert: Prepares raw data for analysis
  • detrend_tc: Detrends time courses
  • sort_tc: Sorts time courses by stimulus
  • get_data_tables: Extracts data tables for analysis
  • dir_index: Calculates direction selectivity indices
  • get_stat_data: Creates StatData object

Batch Processing

Batch Unit (studio/app/optinist/core/expdb/batch_unit.py)
  • generate_statdata: Calls analyze_stats to generate StatData
  • generate_plots: Saves visualization plots to disk
  • save_nwb: Saves data to NWB format
Batch Runner (studio/app/optinist/core/expdb/batch_runner.py)
  • __process_dataset_registration: Orchestrates analysis pipeline
  • __process_datasets: Processes multiple datasets

Database Integration (studio/app/optinist/core/expdb/crud_cells.py)

  • bulk_insert_cells: Stores analysis results in database

Utility Functions

  • join_filepath (studio/app/common/core/utils/filepath_creater.py): Utility for creating file paths
  • create_directory (studio/app/common/core/utils/filepath_creater.py): Creates directories for output files
  • save_thumbnail (studio/app/common/dataclass/utils.py): Creates thumbnail images for plots

6.2. Configuration Files (YAML)

Analysis Parameters

  • anova1_mult.yaml (studio/app/optinist/wrappers/expdb/anova1_mult.yaml): Contains parameters for ANOVA analysis (p_value_threshold, r_best_threshold, si_threshold)
  • analyze_stats.yaml (studio/app/optinist/wrappers/expdb/analyze_stats.yaml): Contains parameters for the full analysis pipeline

Frontend YAML Configuration

  • experiment_graphs.yaml: Defines which graphs appear in the website including:
    • orientation_responsivity_ratio
    • orientation_selectivity
    • best_responsivity

Note: experiment_graphs.yaml is defined in two locations:

  • Inside the app: studio/app/routers/optinist/routers/view_configs/experiment_graphs.yaml
  • Inside the experiments_public folder outside the app: experiments_public/view_configs/experiment_graphs.yaml

The experiments_public copy is the overriding file and is easily accessible to users on the database server. If this file is not found, the internal file is used.

6.3. Frontend Components

API Endpoints (studio/app/optinist/routers/expdb.py)

  • search_public_experiments: API endpoint for retrieving experiments
  • search_db_experiments: API endpoint for database experiments
  • get_experiment_urls: Generates URLs for experiment plots
  • experiment_transformer: Transforms experiment data for frontend

Frontend Components (frontend/src/components/Database/DatabaseExperiments.tsx)

  • getColumns: Creates columns for data grid including visualization links
  • handleOpenDialog: Opens image viewer for plots
  • render: Renders the experiment data grid

6.4. Data Flow Summary

The ANOVA1_mult results flow through these components:

  1. Raw data is analyzed by anova1_mult function
  2. Results are stored in StatData object
  3. StatData's set_anova_props creates visualization objects
  4. generate_plots saves these visualizations as PNG files
  5. get_experiment_urls constructs URLs to these files
  6. Frontend DataGrid displays them with thumbnails

7. Glossary

  • CNMF: A region of interest (ROI) extraction algorithm included in the main workflow.
  • NWB: Neurodata Without Borders, a standardized data format for neuroscience data.