@@ -557,15 +557,139 @@ pip install https://huggingface.co/emiltj/da_multi_dupli_rater_1_onto/resolve/ma
  python src/predict_single/predict_rater_2-9.py
  ```

- # GOTTEN TO HERE
-
  - **Assess agreement between rater and model**
    - Make the assessment fine-grained: in Prodigy, use the review recipe to assess each entity type separately
+ For rater 3:
+ Cases where ents are same between rater and model predictions: 638
+ Cases where ents are NOT same between rater and model predictions: 888
+ For rater 4:
+ Cases where ents are same between rater and model predictions: 1114
+ Cases where ents are NOT same between rater and model predictions: 1363
+ For rater 5:
+ Cases where ents are same between rater and model predictions: 422
+ Cases where ents are NOT same between rater and model predictions: 980
+ For rater 6:
+ Cases where ents are same between rater and model predictions: 1046
+ Cases where ents are NOT same between rater and model predictions: 1213
+ For rater 7:
+ Cases where ents are same between rater and model predictions: 754
+ Cases where ents are NOT same between rater and model predictions: 1148
+ For rater 8:
+ Cases where ents are same between rater and model predictions: 622
+ Cases where ents are NOT same between rater and model predictions: 1076
+ For rater 9:
+ Cases where ents are same between rater and model predictions: 906
+ Cases where ents are NOT same between rater and model predictions: 1203
+ Total cases where ents are same: 5502
+ Total cases where ents are NOT same: 7871
  ```bash
  # Go through script manually:
  # src/data_assessment/model_and_raters_agreement.ipynb
  ```

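The per-rater counts listed above can be turned into agreement rates. The following is a minimal sketch, assuming the counts are copied by hand from the list; the names `counts` and `agreement_rate` are illustrative and not part of the repository.

```python
# Counts copied from the list above: rater -> (same, not_same).
counts = {
    3: (638, 888),
    4: (1114, 1363),
    5: (422, 980),
    6: (1046, 1213),
    7: (754, 1148),
    8: (622, 1076),
    9: (906, 1203),
}


def agreement_rate(same: int, not_same: int) -> float:
    """Share of cases where rater and model entities are identical."""
    return same / (same + not_same)


# Totals should match the "Total cases" lines reported above.
total_same = sum(same for same, _ in counts.values())
total_not_same = sum(not_same for _, not_same in counts.values())

for rater, (same, not_same) in counts.items():
    print(f"rater {rater}: {agreement_rate(same, not_same):.1%}")
print(f"overall: {agreement_rate(total_same, total_not_same):.1%}")
```

This only measures exact matches per case; a per-entity-type breakdown still needs the review step in Prodigy.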
+ - **Add predictions to db**
+   - Creates in db:
+     - rater_"$i"_single_unprocessed_preds
+   - Creates in folders:
+     - ./data/single/unprocessed/rater_$i/rater_"$i"_preds.jsonl
+ ```bash
+ # tools/raters_preds_to_db.sh
+ prodigy drop rater_2_single_unprocessed_preds
+ prodigy drop rater_10_single_unprocessed_preds
+ ```
+
+ # GOTTEN TO HERE
+
+ - **Review raters 3, 4, 5, 6, 7, 8, 9**
+ ```bash
+ prodigy review rater_3_single_gold_all rater_3_single_unprocessed,rater_3_single_unprocessed_preds --label PERSON,NORP,FACILITY,ORGANIZATION,LOCATION,EVENT,LAW,DATE,TIME,PERCENT,MONEY,QUANTITY,ORDINAL,CARDINAL,GPE,WORK\ OF\ ART,LANGUAGE,PRODUCT -S -A
+
+ prodigy review rater_4_single_gold_all rater_4_single_unprocessed,rater_4_single_unprocessed_preds --label PERSON,NORP,FACILITY,ORGANIZATION,LOCATION,EVENT,LAW,DATE,TIME,PERCENT,MONEY,QUANTITY,ORDINAL,CARDINAL,GPE,WORK\ OF\ ART,LANGUAGE,PRODUCT -S -A
+
+ prodigy review rater_5_single_gold_all rater_5_single_unprocessed,rater_5_single_unprocessed_preds --label PERSON,NORP,FACILITY,ORGANIZATION,LOCATION,EVENT,LAW,DATE,TIME,PERCENT,MONEY,QUANTITY,ORDINAL,CARDINAL,GPE,WORK\ OF\ ART,LANGUAGE,PRODUCT -S -A
+
+ prodigy review rater_6_single_gold_all rater_6_single_unprocessed,rater_6_single_unprocessed_preds --label PERSON,NORP,FACILITY,ORGANIZATION,LOCATION,EVENT,LAW,DATE,TIME,PERCENT,MONEY,QUANTITY,ORDINAL,CARDINAL,GPE,WORK\ OF\ ART,LANGUAGE,PRODUCT -S -A
+
+ prodigy review rater_7_single_gold_all rater_7_single_unprocessed,rater_7_single_unprocessed_preds --label PERSON,NORP,FACILITY,ORGANIZATION,LOCATION,EVENT,LAW,DATE,TIME,PERCENT,MONEY,QUANTITY,ORDINAL,CARDINAL,GPE,WORK\ OF\ ART,LANGUAGE,PRODUCT -S -A
+
+ prodigy review rater_8_single_gold_all rater_8_single_unprocessed,rater_8_single_unprocessed_preds --label PERSON,NORP,FACILITY,ORGANIZATION,LOCATION,EVENT,LAW,DATE,TIME,PERCENT,MONEY,QUANTITY,ORDINAL,CARDINAL,GPE,WORK\ OF\ ART,LANGUAGE,PRODUCT -S -A
+
+ prodigy review rater_9_single_gold_all rater_9_single_unprocessed,rater_9_single_unprocessed_preds --label PERSON,NORP,FACILITY,ORGANIZATION,LOCATION,EVENT,LAW,DATE,TIME,PERCENT,MONEY,QUANTITY,ORDINAL,CARDINAL,GPE,WORK\ OF\ ART,LANGUAGE,PRODUCT -S -A
+ ```
+
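The seven review commands above differ only in the rater number, so they can be generated instead of copied. The following is a minimal sketch that only builds and prints the command lines; `LABELS` and `review_command` are illustrative names, and the dataset names and flags are taken from the commands above.

```python
import shlex

# Label set copied from the commands above.
LABELS = [
    "PERSON", "NORP", "FACILITY", "ORGANIZATION", "LOCATION", "EVENT",
    "LAW", "DATE", "TIME", "PERCENT", "MONEY", "QUANTITY", "ORDINAL",
    "CARDINAL", "GPE", "WORK OF ART", "LANGUAGE", "PRODUCT",
]


def review_command(rater: int) -> str:
    """Build the `prodigy review` command line for one rater."""
    argv = [
        "prodigy", "review",
        f"rater_{rater}_single_gold_all",
        f"rater_{rater}_single_unprocessed,rater_{rater}_single_unprocessed_preds",
        "--label", ",".join(LABELS),
        "-S", "-A",
    ]
    # shlex.join single-quotes the label list because "WORK OF ART" has spaces;
    # for the shell this is equivalent to the backslash escapes used above.
    return shlex.join(argv)


commands = [review_command(rater) for rater in range(3, 10)]
for command in commands:
    print(command)
```

Each review session is interactive, so the printed commands would still be run one at a time.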
+ - **Split rater_{r}_single_gold_all (raters 3 to 9)**
+   - Creates files:
+     - ./data/single/gold/rater_{r}/rater_{r}_single_gold_all.jsonl
+   - Creates in db:
+     - rater_{r}_single_gold_accepted
+     - rater_{r}_single_gold_ignored
+     - rater_{r}_single_gold_rejected
+ ```bash
+ python src/preprocessing/split_by_answer_rater_3_9_single_gold.py
+ ```
+
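The split script itself is not shown here. As a rough illustration of the idea, the sketch below groups exported annotation tasks by Prodigy's `answer` field (`accept`, `reject`, `ignore`). The function name, the `DATASET_SUFFIX` mapping, and the sample lines are assumptions for illustration; the actual script may work differently, and loading the groups back into the db is left out.

```python
import json
from collections import defaultdict

# Assumed mapping from Prodigy answer values to the dataset suffixes above.
DATASET_SUFFIX = {"accept": "accepted", "reject": "rejected", "ignore": "ignored"}


def split_by_answer(lines):
    """Group JSONL annotation tasks by the value of their "answer" key."""
    groups = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines in the export
        task = json.loads(line)
        groups[task.get("answer", "missing")].append(task)
    return dict(groups)


# Hypothetical sample data standing in for rater_{r}_single_gold_all.jsonl.
sample_lines = [
    '{"text": "Example one", "answer": "accept"}',
    '{"text": "Example two", "answer": "ignore"}',
    '{"text": "Example three", "answer": "accept"}',
    '{"text": "Example four", "answer": "reject"}',
]

groups = split_by_answer(sample_lines)
for answer, tasks in groups.items():
    suffix = DATASET_SUFFIX.get(answer, answer)
    print(f"rater_{{r}}_single_gold_{suffix}: {len(tasks)} tasks")
```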
+ - **Resolve ignored cases in rater_{r}_single_gold_ignored**
+   - Creates in db:
+     - rater_{r}_single_gold_ignored_resolved
+ ```bash
+ prodigy mark rater_3_single_gold_ignored_resolved dataset:rater_3_single_gold_ignored --view-id review --label PERSON,NORP,FACILITY,ORGANIZATION,LOCATION,EVENT,LAW,DATE,TIME,PERCENT,MONEY,QUANTITY,ORDINAL,CARDINAL,GPE,WORK\ OF\ ART,LANGUAGE,PRODUCT
+
+ prodigy mark rater_4_single_gold_ignored_resolved dataset:rater_4_single_gold_ignored --view-id review --label PERSON,NORP,FACILITY,ORGANIZATION,LOCATION,EVENT,LAW,DATE,TIME,PERCENT,MONEY,QUANTITY,ORDINAL,CARDINAL,GPE,WORK\ OF\ ART,LANGUAGE,PRODUCT
+
+ prodigy mark rater_5_single_gold_ignored_resolved dataset:rater_5_single_gold_ignored --view-id review --label PERSON,NORP,FACILITY,ORGANIZATION,LOCATION,EVENT,LAW,DATE,TIME,PERCENT,MONEY,QUANTITY,ORDINAL,CARDINAL,GPE,WORK\ OF\ ART,LANGUAGE,PRODUCT
+
+ prodigy mark rater_6_single_gold_ignored_resolved dataset:rater_6_single_gold_ignored --view-id review --label PERSON,NORP,FACILITY,ORGANIZATION,LOCATION,EVENT,LAW,DATE,TIME,PERCENT,MONEY,QUANTITY,ORDINAL,CARDINAL,GPE,WORK\ OF\ ART,LANGUAGE,PRODUCT
+
+ prodigy mark rater_7_single_gold_ignored_resolved dataset:rater_7_single_gold_ignored --view-id review --label PERSON,NORP,FACILITY,ORGANIZATION,LOCATION,EVENT,LAW,DATE,TIME,PERCENT,MONEY,QUANTITY,ORDINAL,CARDINAL,GPE,WORK\ OF\ ART,LANGUAGE,PRODUCT
+
+ prodigy mark rater_8_single_gold_ignored_resolved dataset:rater_8_single_gold_ignored --view-id review --label PERSON,NORP,FACILITY,ORGANIZATION,LOCATION,EVENT,LAW,DATE,TIME,PERCENT,MONEY,QUANTITY,ORDINAL,CARDINAL,GPE,WORK\ OF\ ART,LANGUAGE,PRODUCT
+
+ prodigy mark rater_9_single_gold_ignored_resolved dataset:rater_9_single_gold_ignored --view-id review --label PERSON,NORP,FACILITY,ORGANIZATION,LOCATION,EVENT,LAW,DATE,TIME,PERCENT,MONEY,QUANTITY,ORDINAL,CARDINAL,GPE,WORK\ OF\ ART,LANGUAGE,PRODUCT
+ ```
+
+ - **Dump rater_{r}_single_gold_ignored**
+ ```bash
+ prodigy db-out rater_{r}_single_gold_ignored data/single/gold
+ ```
+
+ - **Merge rater_{r}_single_gold_ignored and rater_{r}_single_gold_accepted**
+ ```bash
+ prodigy db-merge rater_{r}_single_gold_accepted,rater_{r}_single_gold_ignored rater_1_single_gold
+ ```
+
+
+ # Steps are only written out up to here(!)
+ # Below add:
+ - Merge all single gold for all raters
+ - Add language and product predictions to single gold combined
+ - Resolve them
+ - Overwrite with the resolved cases in single gold combined (see the approach used above)
+ - Merge the single-gold-combined with extra lang+prod into the gold-multi-and-gold-rater-1-single
+ - Use NER manual instead (see the approach used above)
+ ...???
+
+
+
+
+
+ - **Add Language and Product predictions on the gold-multi dataset**
+   - Use tner/roberta-large-ontonotes5
+   - It only adds a single label, and that label is wrong, so I'll skip this step
+   - It may still make sense to mention this in the methods section
+ ```bash
+ # gold-multi-training/datasets/lang_product_predict_gold_multi.py
+ ```
+
+
+ - **Merge all gold datasets in db**
+
+
+
+
+
+
+
+
  - **Potentially: make appropriate changes to the gold-standard-multi data based on the assessment between rater and model**

  - **Potentially: re-train the model on the new gold-standard-multi data**