Commit 3c6077d

author
Lee Newberg
committed
Say more about feature shapes, SLIC, and huggingface UNI
1 parent 2354768 commit 3c6077d

2 files changed

+51
-6
lines changed

paper/paper.bib

Lines changed: 45 additions & 0 deletions
@@ -63,3 +63,48 @@ @misc{TCGAData
6363
url = {https://www.cancer.gov/tcga},
6464
note = {Accessed: 2022-11-10]}
6565
}
66+
67+
@article{SLIC2012,
68+
author = {Radhakrishna Achanta and
69+
Appu Shaji and
70+
Kevin Smith and
71+
Aurelien Lucchi and
72+
Pascal Fua and
73+
Sabine S\"usstrunk},
74+
title = {SLIC superpixels compared to state-of-the-art superpixel methods},
75+
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
76+
year = {2012},
77+
volume = {34},
78+
number = {11},
79+
pages = {2274-2282}
80+
}
81+
82+
@article{huggingface2024uni,
83+
author = {Chen, Richard J and
84+
Ding, Tong and
85+
Lu, Ming Y and
86+
Williamson, Drew F K and
87+
Jaume, Guillaume and
88+
Song, Andrew H and
89+
Chen, Bowen and
90+
Zhang, Andrew and
91+
Shao, Daniel and
92+
Shaban, Muhammad and
93+
Williams, Mane and
94+
Oldenburg, Lukas and
95+
Weishaupt, Luca L and
96+
Wang, Judy J and
97+
Vaidya, Anurag and
98+
Le, Long Phi and
99+
Gerber, Georg and
100+
Sahai, Sharifa and
101+
Williams, Walt and
102+
Mahmood, Faisal},
103+
title = {Towards a general-purpose foundation model for computational pathology},
104+
journal = {Nature Medicine},
105+
year = {2024},
106+
volume = {30},
107+
number = {3},
108+
pages = {850-862},
109+
month = {Mar}
110+
}

paper/paper.md

Lines changed: 6 additions & 6 deletions
@@ -2,9 +2,9 @@
22
title: 'WSI Superpixel Guided Labeling'
33
tags:
44
- Python
5-
- histology
6-
- bioimage informatics
7-
- whole slide annotation
5+
- histology
6+
- bioimage informatics
7+
- whole slide annotation
88
- whole slide images
99
- guided labeling
1010
# (add orcid for anyone who has one)
@@ -50,7 +50,7 @@ bibliography: paper.bib
5050

5151
# Summary
5252

53-
`WSI Superpixel Guided Labeling` facilitates active learning on whole slide images. It has a user interface built on top of the HistomicsUI [@histomicsui] base and deployed as part of the Digital Slide Archive [@Gutman2017, @digitalslidearchive], and uses the HistomicsTK [@histomicstk] tool kit as part of the process.
53+
`WSI Superpixel Guided Labeling` facilitates active learning on whole slide images. It has a user interface built on top of the HistomicsUI [@histomicsui] base and deployed as part of the Digital Slide Archive [@Gutman2017, @digitalslidearchive], and uses the HistomicsTK [@histomicstk] tool kit as part of the process.
5454

5555
Users label superpixel regions or other segmented areas of whole slide images to be used as classification input for machine learning algorithms. An example algorithm is included which generates superpixels, features, and machine learning models for active learning on a directory of images. The interface allows bulk labeling, labeling the most impactful superpixels to improve the model, and reviewing labeled and predicted categories.
5656

@@ -60,13 +60,13 @@ One of the limitations in generating accurate models is the need for labeled dat
6060

6161
`WSI Superpixel Guided Labeling` provides a user interface and workflow for this guided labeling process. Given a set of whole slide images, the images are segmented based on some user choices. This segmentation is the basis for labeling. The user can specify any number of label categories, including labels that will be excluded from training (for instance, for segmented regions whose categories cannot be accurately determined). After labeling a few initial segments, a model is generated and used to both predict the category of all segments and the segments that would result in the best improvement in the model if they were also labeled. The user can retrain the model at any time and review the results of both the predictions and other users.
6262

63-
For development, the initial segmentation uses superpixels generated with the SLIC algorithm. These are computed on whole slide images in a tiled manner so that they can work on arbitrarily large images, and the tile boundaries are properly handled to avoid visible artifacts. Either of two basic models can be trained and used for predictions: small-scale CNN using image features implemented in tensorflow/keras or torch, or a huggingface foundation model that generates a one-dimensional feature vector. The certainty criteria for which segments should be labeled next can also be selected, and includes confidence, margin, negative entropy, and the BatchBALD [@batchbald2019] algorithm.
63+
For development, the initial segmentation uses superpixels generated with the SLIC [@SLIC2012] algorithm. These are computed on whole slide images in a tiled manner so that they can work on arbitrarily large images, and the tile boundaries are properly handled to avoid visible artifacts. Once generated, segments are represented in one of two ways, either as two-dimensional patches, each centered in a fixed-sized square of pixels with non-segment pixels set to black, or as one-dimensional vectors, such as those generated from the huggingface UNI [@huggingface2024uni] foundation model. One of two basic models is trained based upon the segment representation. For two-dimensional patches, the model to be trained is a small-scale CNN implemented in tensorflow/keras or torch. For one-dimensional vectors, the model to be trained is a single-layer linear classifier. The certainty criteria for which segments should be labeled next can also be selected, and includes confidence, margin, negative entropy, and the BatchBALD [@batchbald2019] algorithm.
6464

6565
We had a placental pathologist provide feedback to validate the efficiency of the user interface and utility of the process.
6666

6767
# Basic Workflow
6868

69-
When starting a new labeling project, the user selects how superpixels are generated, which certainty metric is used for determining the optimal labeling order, and what features are used for model training. The labeling mode allows defining project labels and performing initial labeling. This mode can also be used to add new label categories or combine two categories if they should not have been distinct. Label categories can additionally be marked as excluded, which removes them from training and ensures that superpixels with those labels are no longer suggested for labeling.
69+
When starting a new labeling project, the user selects how superpixels are generated, which certainty metric is used for determining the optimal labeling order, and what features are used for model training. The labeling mode allows defining project labels and performing initial labeling. This mode can also be used to add new label categories or combine two categories if they should not have been distinct. Label categories can additionally be marked as excluded, which removes them from training and ensures that superpixels with those labels are no longer suggested for labeling.
7070

7171
![The Bulk Labeling interface showing one of the project images divided into superpixels with some categories defined. A user can "paint" areas with known labels as an initial seed for the guided labeling process](../docs/screenshots/initial_labels.png)
7272

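The revised paragraph in `paper/paper.md` describes two ideas that a short example may make more concrete: a two-dimensional patch that centers a segment in a fixed-size square with non-segment pixels set to black, and the confidence, margin, and negative-entropy certainty criteria. The following is a minimal sketch using NumPy only. The function names `masked_patch`, `certainty_scores`, and `least_certain` are hypothetical and are not taken from the project, centering on the segment's bounding box is an assumption, and BatchBALD is not covered.

```python
import numpy as np


def masked_patch(image, mask, size):
    """Center a segment in a size x size square with non-segment pixels black.

    image: (H, W, C) array; mask: (H, W) boolean array marking the segment.
    Hypothetical helper: centering on the bounding box is an assumption.
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask is empty")
    # Center of the segment's bounding box.
    cy = (ys.min() + ys.max()) // 2
    cx = (xs.min() + xs.max()) // 2
    top, left = cy - size // 2, cx - size // 2
    patch = np.zeros((size, size, image.shape[2]), dtype=image.dtype)
    # Clip the window to the image bounds; anything outside stays black.
    y0, y1 = max(top, 0), min(top + size, image.shape[0])
    x0, x1 = max(left, 0), min(left + size, image.shape[1])
    window = image[y0:y1, x0:x1].copy()
    window[~mask[y0:y1, x0:x1]] = 0  # black out pixels outside the segment
    patch[y0 - top:y1 - top, x0 - left:x1 - left] = window
    return patch


def certainty_scores(probs):
    """Compute three certainty scores from per-segment class probabilities.

    probs: (n_segments, n_classes) array whose rows sum to 1, n_classes >= 2.
    For every score, a lower value means the model is less certain.
    """
    probs = np.asarray(probs, dtype=float)
    ordered = np.sort(probs, axis=1)
    confidence = ordered[:, -1]                # highest class probability
    margin = ordered[:, -1] - ordered[:, -2]   # gap between the top two classes
    # Negative entropy: sum of p * log(p); the clip avoids log(0).
    negative_entropy = np.sum(probs * np.log(np.clip(probs, 1e-12, None)), axis=1)
    return {
        "confidence": confidence,
        "margin": margin,
        "negative_entropy": negative_entropy,
    }


def least_certain(scores, k):
    """Return the indices of the k segments with the lowest certainty score."""
    return np.argsort(scores)[:k]
```

With class probabilities from any classifier, `least_certain(certainty_scores(probs)["margin"], k)` would give the `k` segments to suggest for labeling next under the margin criterion. The other two criteria work the same way, since all three are oriented so that lower means less certain.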