
Interpretation of Kcat prediction values in Enzeptional optimization output #267

@lucaspalmeira

Description

When using Enzeptional for enzyme sequence optimization with the XGBoost model, the predicted Kcat values in the output include both low positive numbers (e.g., 0.499) and negative values (e.g., -0.874). Negative values would be physically invalid for Kcat on its original scale, since a turnover number cannot be below zero.

From the framework documentation and previous papers, I understand that:

  • The training process applies a logarithmic transformation to the Kcat data to improve linearity
  • The scaler.pkl file is applied to reverse transformations during prediction

However, it is unclear whether the final scores from SequenceScorer are already on the original scale or whether manual conversion (e.g., 10^x) is required.
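
For reference, the manual conversion I have in mind would look roughly like the sketch below. It assumes the scores are log-transformed Kcat values (base 10 by default); that assumption is exactly what I am asking about, and `log_to_kcat` is a hypothetical helper of mine, not part of Enzeptional or GT4SD.

```python
import math


def log_to_kcat(log_value: float, base: float = 10.0) -> float:
    """Invert a logarithmic transform by computing base ** log_value.

    Hypothetical helper: only meaningful if the score really is
    log_base(Kcat), which is the open question in this issue.
    """
    return math.pow(base, log_value)


# Two scores from my output, read as log10(Kcat) (assumption).
for score in (0.499, -0.874):
    print(f"score={score:+.3f} -> Kcat ~ {log_to_kcat(score):.3f}")
```

Under this reading a negative score is simply a Kcat between 0 and 1, so it would no longer be physically invalid. If the model was trained with a natural logarithm instead, `base=math.e` would be the matching call.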

Questions

  1. Scale Interpretation: Are the Kcat values in the optimization output already converted back to the original (non-logarithmic) scale, or do they require additional transformation?

  2. Negative Values: How should negative Kcat values be interpreted? Do they indicate:

    • Invalid sequences?
    • Pipeline errors?
    • Artifacts of the model prediction range?
    • Values below a certain detection threshold?

  3. Expected Value Range: What is the expected valid range for Kcat predictions, and how should outliers or physically impossible values be handled?

Context

Example output values observed:

  • Positive but low: 0.499, 0.123, 0.876
  • Negative: -0.874, -1.234, -0.567
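
To make the two possible readings concrete, here is a small sketch that checks the values above under each one. `summarize_scores` is a hypothetical helper, and the log10 interpretation is an assumption rather than documented Enzeptional behaviour.

```python
from typing import Dict, List


def summarize_scores(scores: List[float]) -> Dict[str, object]:
    """Compare a linear-scale reading with a log10-scale reading.

    Hypothetical helper for this issue only.
    """
    negatives = [s for s in scores if s < 0]
    return {
        # How many scores are below zero.
        "n_negative": len(negatives),
        # On a linear scale, any negative Kcat is physically impossible.
        "valid_if_linear": not negatives,
        # On a log10 scale, every score maps to a positive Kcat.
        "kcat_if_log10": [10.0 ** s for s in scores],
    }


observed = [0.499, 0.123, 0.876, -0.874, -1.234, -0.567]
summary = summarize_scores(observed)

print("negative scores:", summary["n_negative"])
print("all valid on a linear scale:", summary["valid_if_linear"])
for score, kcat in zip(observed, summary["kcat_if_log10"]):
    print(f"{score:+.3f} -> {kcat:.3f} (if log10)")
```

If the scores are on a linear scale, half of these values are impossible. If they are log10 values, all six correspond to positive Kcat values within roughly two orders of magnitude of each other.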

Additional Information

  • Using Enzeptional via GT4SD library
  • XGBoost model for Kcat prediction
  • Following the example from: examples/enzeptional/example_enzeptional.py

Any clarification on the proper interpretation of these output values would be greatly appreciated for correct analysis of optimization results.
