Warm-Up Round: Completed
Round 1: Completed
Round 2: Completed
#nlp #knowledge_graph #language_model #dialogue #llm
Weight: 1.0

See the forum post "Updates to Task 1 Metrics" for an explanation of the new metrics.

#         Participant                CPDScore  USEScore  BERTScore  Word F1  BLEU   GPU Track  Entries
01        kcy4                       4.423     0.580     0.624      13.026   0.226  False      114
02        nuxi                       4.364     0.605     0.629      15.174   0.234  False      150
03        (not shown)                4.346     0.600     0.634      15.080   0.189  False      146
04        (not shown)                4.328     0.610     0.636      16.444   0.378  False      133
05        (not shown)                4.303     0.620     0.652      17.787   0.560  False      116
06        nicholas_liu               4.254     0.609     0.647      17.455   0.502  False      10
07        (not shown)                4.194     0.626     0.658      19.277   0.790  False      202
08        (not shown)                4.191     0.570     0.620      10.666   0.342  False      58
09        wangzhiyu918               4.112     0.624     0.653      18.938   0.781  False      100
10        (not shown)                4.077     0.618     0.667      19.505   0.706  False      35
11        ni_kai_hua                 4.019     0.636     0.673      20.892   1.235  False      60
12        takeru                     3.927     0.629     0.669      20.785   1.197  False      19
13        (not shown)                3.908     0.613     0.648      17.341   0.354  False      97
14        jiwei_liu                  3.846     0.637     0.679      21.431   1.323  False      3
Baseline  Prompt-Track               3.540     0.623     0.673      18.979   1.065  False      -
Baseline  Baseline LLM-Phi2-Prompt   3.332     0.615     0.672      20.420   1.391  True       -
15        (not shown)                3.047     0.604     0.671      19.214   0.888  True       32
Baseline  PersonaChat-BART-PeaCoK    3.042     0.604     0.671      19.214   0.888  True       -
16        (not shown)                3.041     0.604     0.671      19.214   0.888  True       4
17        dipam                      2.710     0.509     0.617      2.051    0.000  False      7
18        (not shown)                1.883     0.513     0.627      2.203    0.000  True       2