In Anglerfish parsing, only assign barcode-specific UDFs to barcoded samples #306

Merged: 1 commit, Feb 9, 2024
4 changes: 4 additions & 0 deletions VERSIONLOG.md
@@ -1,5 +1,9 @@
 # Scilifelab_epps Version Log

+## 20240208.1
+
+In Anglerfish parsing, only try to assign barcode-specific UDFs for barcoded samples.
+
 ## 20240130.1

 Handle Anglerfish result parsing for runs W/O ONT barcodes
22 changes: 13 additions & 9 deletions scripts/parse_anglerfish_results.py
@@ -114,15 +114,6 @@ def ont_barcode_well2name(barcode_well: str) -> str:


 def fill_udfs(currentStep: Process, df: pd.DataFrame):
-    # Dictate which LIMS UDF corresponds to which column in the dataframe
-    udfs_to_cols = {
-        "# Reads": "num_reads",
-        "Avg. Read Length": "mean_read_len",
-        "Std. Read Length": "std_read_len",
-        "Representation Within Run (%)": "repr_total_pc",
-        "Representation Within Barcode (%)": "repr_within_barcode_pc",
-    }
-
     # Get Illumina pools
     illumina_pools = [
         input_art
@@ -160,12 +151,25 @@ def fill_udfs(currentStep: Process, df: pd.DataFrame):
                 ]

             else:
+                # Subset df to the current Illumina sample
                 df_sample = df[df["sample_name"] == illumina_sample.name]

             assert (
                 len(df_sample) == 1
             ), f"Multiple entries matching both Illumina sample name {illumina_sample.name} and ONT barcode {barcode_name} was found in the dataframe."

+            # Determine which UDFs to assign
+            udfs_to_cols = {
+                "# Reads": "num_reads",
+                "Avg. Read Length": "mean_read_len",
+                "Std. Read Length": "std_read_len",
+                "Representation Within Run (%)": "repr_total_pc",
+            }
+            if barcode_well:
+                udfs_to_cols[
+                    "Representation Within Barcode (%)"
+                ] = "repr_within_barcode_pc"
+
             # Start putting UDFs
             for udf, col in udfs_to_cols.items():
                 try:
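
The effect of this change can be sketched in isolation. This is a minimal illustration and not the script's actual code: `build_udfs_to_cols` and `udf_values_for_row` are hypothetical helper names, a plain dict stands in for the one-row dataframe subset, and the numeric values and the barcode well `"A1"` are made up for the example. Only the UDF names and column names are taken from the diff.

```python
from typing import Dict, Optional

# UDF-to-column pairs that apply to every sample (names copied from the diff).
BASE_UDFS_TO_COLS: Dict[str, str] = {
    "# Reads": "num_reads",
    "Avg. Read Length": "mean_read_len",
    "Std. Read Length": "std_read_len",
    "Representation Within Run (%)": "repr_total_pc",
}


def build_udfs_to_cols(barcode_well: Optional[str]) -> Dict[str, str]:
    """Hypothetical helper: choose which UDFs to fill for one sample.

    The barcode-specific UDF is included only when the sample has an
    ONT barcode well (any non-empty string).
    """
    udfs_to_cols = dict(BASE_UDFS_TO_COLS)  # copy so the base mapping stays unchanged
    if barcode_well:
        udfs_to_cols["Representation Within Barcode (%)"] = "repr_within_barcode_pc"
    return udfs_to_cols


def udf_values_for_row(row: dict, barcode_well: Optional[str]) -> dict:
    """Hypothetical helper: map the selected UDF names to the row's values."""
    return {udf: row[col] for udf, col in build_udfs_to_cols(barcode_well).items()}


# Made-up row for a run without ONT barcodes: it has no
# "repr_within_barcode_pc" column, so asking for it would raise KeyError.
unbarcoded_row = {
    "num_reads": 1200,
    "mean_read_len": 850.0,
    "std_read_len": 95.5,
    "repr_total_pc": 12.5,
}
unbarcoded_udfs = udf_values_for_row(unbarcoded_row, barcode_well=None)

# Made-up row for a barcoded sample: the extra column is present and used.
barcoded_row = dict(unbarcoded_row, repr_within_barcode_pc=40.0)
barcoded_udfs = udf_values_for_row(barcoded_row, barcode_well="A1")
```

Building the mapping per sample, rather than once at the top of the function, is what lets the barcode-specific entry depend on `barcode_well`, so unbarcoded samples are never asked for a column they do not have.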