Commit bebc71b

Merge pull request #49 from dccuchile/fix/benchmark_doc
Fix/benchmark doc
2 parents f2955c2 + 5c36d4a commit bebc71b

4 files changed, +52 -77 lines changed

.readthedocs.yaml

+6-1
@@ -4,11 +4,16 @@ formats:
   - epub
   - pdf

+build:
+  os: ubuntu-22.04
+  tools:
+    python: "3.11"
+
+
 sphinx:
   configuration: docs/conf.py

 python:
-  version: "3.7"
   install:
     - requirements: requirements.txt
     - requirements: requirements-dev.txt

docs/benchmark/benchmark.rst

+28-62
@@ -519,8 +519,8 @@ supporting the same number of number of word sets).



-2. Fair Embedding Engine
-~~~~~~~~~~~~~~~~~~~~~~~~
+Fair Embedding Engine
+~~~~~~~~~~~~~~~~~~~~~

 In the case of Fair Embedding Engine, the WE model is passed in the
 metric instantiation. Then, the output value of the metric is computed
@@ -833,8 +833,8 @@ family vs. career).
 "relatives",
 ]

-1. WEFE
-~~~~~~~
+WEFE
+~~~~

 WEFE defines a standardized framework for executing bias mitigation
 algorithms based on the scikit-learn fit transform interface.
@@ -968,8 +968,8 @@ methods implemented in the library.
 Repulsion Attraction Neutralization debiased model WEAT evaluation: 0.26007230998948216


-1. Fair Embedding Engine
-~~~~~~~~~~~~~~~~~~~~~~~~
+Fair Embedding Engine
+~~~~~~~~~~~~~~~~~~~~~

 The Fair Embedding Engine (FEE) requires the embedding model to be
 passed during instantiation of the algorithm. It currently does not
@@ -1042,8 +1042,8 @@ interface



-1. Responsibly
-~~~~~~~~~~~~~~
+Responsibly
+~~~~~~~~~~~

 In Responsibly the embedding model is provided during the instantiation
 of the ``GenderBiasWe`` class. Definitional pairs cannot be provided by
@@ -1063,8 +1063,8 @@ such as ``twitter-25``.
 gender_bias_we = GenderBiasWE(word2vec) # instance the GenderBiasWE
 gender_bias_we.debias(neutral_words=targets) # apply the debias

-4. EmbeddingBiasScore
-~~~~~~~~~~~~~~~~~~~~~
+EmbeddingBiasScore
+~~~~~~~~~~~~~~~~~~

 The library does not implement mitigation methods, so it is not included
 in this comparison.
@@ -1111,15 +1111,13 @@ SAME ✖ ✖ ✖ ✔
 Generalized WEAT ✖ ✖ ✖ ✔
 ================ ==== === =========== ===================

-The table exclusively focuses on metrics that directly compute from word
-embeddings (WE) using predefined word sets. As a result, it omits
-metrics that are not compatible with the wefe interface such as:
+The table exclusively focuses on metrics that directly compute from word embeddings
+(WE) using predefined word sets. As a result, it omits the following metrics:

-- IndirectBias, a metric that accepts as input only two words and the
-  gender direction, previously calculated in a distinct operation.
-- GIPE, PMN, and Proximity Bias, which evaluate WE models before and
-  after debiasing with auxiliary mitigation methods.
-- SemBias, which is an analogy evaluation dataset.
+- IndirectBias, a metric that accepts as input only two words and the gender
+  direction, previously calculated in a distinct operation.
+- GIPE, PMN, and Proximity Bias, which evaluate WE models before and after debiasing
+  with auxiliary mitigation methods.

 Mitigation algorithms
 ~~~~~~~~~~~~~~~~~~~~~
@@ -1140,47 +1138,15 @@ Conclusion
 The following table summarizes the main differences between the
 libraries analyzed in this benchmark study.

-+-------------+-----------+--------------------+------------+---------+
-| | WEFE | FEE | Responsibl | Embeddi |
-| | | | y | ngBiasS |
-| | | | | cores |
-+=============+===========+====================+============+=========+
-| Implemented | 7 | 7 | 3 | 6 |
-| Metrics | | | | |
-+-------------+-----------+--------------------+------------+---------+
-| Implemented | 5 | 3 | 1 | 0 |
-| Mitigation | | | | |
-| Algorithms | | | | |
-+-------------+-----------+--------------------+------------+---------+
-| Extensible | Easy | Easy | Difficult, | Easy |
-| | | | not very | |
-| | | | modular. | |
-+-------------+-----------+--------------------+------------+---------+
-| Well-define |||||
-| d | | | | |
-| interface | | | | |
-| for metrics | | | | |
-+-------------+-----------+--------------------+------------+---------+
-| Well-define |||||
-| d | | | | |
-| interface | | | | |
-| for | | | | |
-| mitigation | | | | |
-| algorithms | | | | |
-+-------------+-----------+--------------------+------------+---------+
-| Lastest | January | October 2020 | April 2021 | April |
-| update | 2023 | | | 2023 |
-+-------------+-----------+--------------------+------------+---------+
-| Installatio | Easy: pip | No instructions. | Only with | Only |
-| n | or conda | It can be | pip. | from |
-| | | installed from the | Presents | the |
-| | | repository | problems | reposit |
-| | | | | ory |
-+-------------+-----------+--------------------+------------+---------+
-| Documentati | Extensive | Almost no | Limited | No |
-| on | documenta | documentation | documentat | documen |
-| | tion | | ion | tation, |
-| | with | | with some | only |
-| | examples | | examples | example |
-| | | | | s. |
-+-------------+-----------+--------------------+------------+---------+
+==================================================== ========================================= ========================================================== ========================================== ====================================
+Item WEFE FEE Responsibly EmbeddingBiasScores
+==================================================== ========================================= ========================================================== ========================================== ====================================
+Implemented Metrics 7 7 3 6
+Implemented Mitigation Algorithms 5 3 1 0
+Extensible Easy Easy Difficult, not very modular. Easy
+Well-defined interface for metrics ✔ ✖ ✖ ✔
+Well-defined interface for mitigation algorithms ✔ ✖ ✖ ✖
+Lastest update January 2023 October 2020 April 2021 April 2023
+Installation Easy: pip or conda No instructions. It can be installed from the repository Only with pip. Presents problems Only from the repository
+Documentation Extensive documentation with examples Almost no documentation Limited documentation with some examples No documentation, only examples.
+==================================================== ========================================= ========================================================== ========================================== ====================================

examples/benchmark.ipynb

+7-4
@@ -1491,6 +1491,7 @@
 ]
 },
 {
+"attachments": {},
 "cell_type": "markdown",
 "metadata": {},
 "source": [
@@ -1513,11 +1514,13 @@
 "| SAME | ✖ | ✖ | ✖ | ✔ |\n",
 "| Generalized WEAT | ✖ | ✖ | ✖ | ✔ |\n",
 "\n",
-"The table exclusively focuses on metrics that directly compute from word embeddings (WE) using predefined word sets. As a result, it omits metrics that are not compatible with the wefe interface such as: \n",
+"The table exclusively focuses on metrics that directly compute from word embeddings\n",
+"(WE) using predefined word sets. As a result, it omits the following metrics:\n",
 "\n",
-"- IndirectBias, a metric that accepts as input only two words and the gender direction, previously calculated in a distinct operation.\n",
-"- GIPE, PMN, and Proximity Bias, which evaluate WE models before and after debiasing with auxiliary mitigation methods.\n",
-"- SemBias, which is an analogy evaluation dataset."
+"- IndirectBias, a metric that accepts as input only two words and the gender\n",
+" direction, previously calculated in a distinct operation.\n",
+"- GIPE, PMN, and Proximity Bias, which evaluate WE models before and after debiasing\n",
+" with auxiliary mitigation methods."
 ]
 },
 {

requirements-dev.txt

+11-10
@@ -1,16 +1,17 @@
 pytest>=7.0.0
 pytest-cov==3.0.0
-coverage==6.4.2
+coverage==7.2.5
 # flake8==5.0.4
-black==22.6.0
-isort==5.10.1
-mypy==0.812
-Sphinx==5.0.2
-sphinx-gallery==0.11.1
-sphinx-rtd-theme==1.0.0
-sphinx-copybutton==0.5.0
+urllib3==1.26.15
+black==23.3.0
+isort==5.11.5
+mypy==1.2.0
+Sphinx==5.3.0
+sphinx-gallery==0.13.0
+sphinx-rtd-theme==1.2.0
+sphinx-copybutton==0.5.2
 numpydoc==1.5.0
-docutils==0.16
+docutils==0.18
 torch==1.13.1
 ipython==7.34.0
-ruff==0.0.194
+ruff==0.0.264
