The Exparser models (for both reference extraction and reference parsing) are trained on research papers in German and English.

The following evaluation results are based on a corpus of 125 German-language research papers, including papers whose references appear in footnotes, and 100 English-language research papers. The German and English datasets were evaluated separately.

Exparser Reference identification results (10-fold cross-validation)

Exparser German Dataset

| Model | Precision | Recall | F1 |
|---|---|---|---|
| Exparser | 0.69 | 0.89 | 0.78 |
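As a quick consistency check on the summary row above, F1 is conventionally the harmonic mean of precision and recall. The sketch below is only an illustration of that standard formula; the function name `f1_score` is hypothetical and not part of Exparser, and it is an assumption that the reported scores were computed this way.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall taken from the German reference identification row.
german_f1 = f1_score(0.69, 0.89)
print(round(german_f1, 2))  # 0.78
```

Note that when precision, recall and F1 are each averaged over the folds of a cross-validation, the averaged F1 need not equal the harmonic mean of the averaged precision and recall, so small deviations in other rows would not by themselves indicate an error.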

1 Line = first line of the reference.

I Line = intermediate line of the reference.

L Line = last line of the reference.

| Line type | Precision | Recall | F1 Score |
|---|---|---|---|
| 1 Line | 0.73 | 0.84 | 0.78 |
| I Line | 0.51 | 0.84 | 0.64 |
| L Line | 0.78 | 0.86 | 0.79 |
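Per-line-type scores like those above can be obtained by comparing a gold label and a predicted label for every line. The following is a minimal sketch under stated assumptions, not Exparser's evaluation code: the function `per_label_scores`, the label strings "1", "I" and "L", and the toy label sequences are all hypothetical, and lines outside references are left out for brevity.

```python
from collections import Counter

LABELS = ("1", "I", "L")  # first, intermediate, last line of a reference


def per_label_scores(gold, predicted):
    """Return {label: (precision, recall, f1)} for two equal-length label lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1  # correct label for this line
        else:
            fp[p] += 1  # predicted label was wrong
            fn[g] += 1  # gold label was missed
    scores = {}
    for label in LABELS:
        predicted_total = tp[label] + fp[label]
        gold_total = tp[label] + fn[label]
        precision = tp[label] / predicted_total if predicted_total else 0.0
        recall = tp[label] / gold_total if gold_total else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Toy data (made up for illustration): two three-line references,
# with one intermediate line mistaken for a last line.
gold = ["1", "I", "L", "1", "I", "L"]
predicted = ["1", "I", "L", "1", "L", "L"]
scores = per_label_scores(gold, predicted)
for label in LABELS:
    print(label, [round(x, 2) for x in scores[label]])
```

In this toy case the single confusion lowers recall for the intermediate label and precision for the last-line label, which is the kind of effect the per-line-type breakdown is meant to expose.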

Exparser Reference parsing results (10-fold cross-validation)

Exparser model evaluation on Exgoldstandard English Dataset

| Tag | Precision | Recall | F1 |
|---|---|---|---|
| publisher | 0.959086477 | 0.845181313 | 0.89728132 |
| last page | 0.993896063 | 0.984151736 | 0.9889821 |
| surname | 0.951610043 | 0.884536717 | 0.91505436 |
| article-title | 0.931791929 | 0.972769867 | 0.95151199 |
| url | 0.965273268 | 0.763906669 | 0.808867549 |
| volume | 0.95576658 | 0.937542063 | 0.945621057 |
| source | 0.942626564 | 0.83490314 | 0.883653533 |
| given-names | 0.941450873 | 0.911911429 | 0.925533171 |
| editor | 0.897915808 | 0.778200721 | 0.832358207 |
| first page | 0.997159664 | 0.980429599 | 0.988697629 |
| year | 0.944322999 | 0.933497319 | 0.938589647 |
| identifier | 0.960358176 | 0.70076969 | 0.733457182 |
| issue | 0.958618872 | 0.888918427 | 0.922161952 |
| other | 0.846494385 | 0.722210955 | 0.776731361 |

Exparser model evaluation on Exgoldstandard German Dataset

| Tag | Precision | Recall | F1 |
|---|---|---|---|
| publisher | 0.964192636 | 0.81064742 | 0.875111434 |
| last page | 0.991129788 | 0.962046788 | 0.976161105 |
| surname | 0.90994633 | 0.787478211 | 0.843367591 |
| article-title | 0.893704304 | 0.960605016 | 0.925422401 |
| url | 0.996033448 | 0.800261321 | 0.880681245 |
| volume | 0.932098238 | 0.78021711 | 0.847791857 |
| source | 0.890182663 | 0.748975755 | 0.810636528 |
| given-names | 0.8900896 | 0.823056487 | 0.854994087 |
| editor | 0.877913835 | 0.75151039 | 0.807626961 |
| first page | 0.979024701 | 0.938542128 | 0.958159757 |
| year | 0.904064843 | 0.90138347 | 0.902640056 |
| identifier | 0.902012681 | 0.705569486 | 0.753663066 |
| issue | 0.964353559 | 0.703203661 | 0.798748542 |
| other | 0.848551117 | 0.73500243 | 0.78503847 |