Created 2 months ago

2 replies

Hi all,

I need to customize my F1 measure heavily when evaluating my fine-tuned NER model, but I don't see other people with similar issues. I wonder whether I am doing something wrong, or whether people customize their F1 measures all the time and just don't mention it.

I can easily get an F1 measure from the default Hugging Face evaluation library and add some common lenient measures with minimal effort.

For example, I have the labels <predator> and <food>, and my dataset is:

Mice eat cheese.
The Southeast Asian soldier fly eats nectar.
Owls eat mice.

In most off-the-shelf lenient measures, which ignore exact span boundaries, the partially recognised fly would be counted twice as a positive:

The(B-predator) Southeast(I-predator) Asian soldier fly(B-predator) eats nectar.
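
A minimal sketch of that double counting, in plain Python rather than any particular library. The gold annotation (the whole phrase "The Southeast Asian soldier fly" as one predator, "nectar" as food) and all helper names are assumptions made for illustration. It compares naive overlap counting with a one-to-one matching in which each gold entity can be claimed only once.

```python
def bio_to_spans(tags):
    """Convert BIO tags to (start, end, label) spans, end exclusive."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes the last span
        inside = tag.startswith("I-") and label == tag[2:]
        if not inside:
            if label is not None:
                spans.append((start, i, label))
            start, label = (i, tag[2:]) if tag != "O" else (None, None)
    return spans


def overlaps(a, b):
    """True if two spans share at least one token."""
    return a[0] < b[1] and b[0] < a[1]


def naive_tp(pred, gold):
    """Every predicted span that overlaps any gold span counts as a hit."""
    return sum(any(overlaps(p, g) for g in gold) for p in pred)


def one_to_one_tp(pred, gold):
    """Each gold span can be matched by at most one predicted span."""
    unused = list(gold)
    tp = 0
    for p in pred:
        match = next((g for g in unused if overlaps(p, g)), None)
        if match is not None:
            unused.remove(match)
            tp += 1
    return tp


# Tokens: The Southeast Asian soldier fly eats nectar .
# Assumed gold annotation (hypothetical, not taken from a real dataset).
gold_tags = ["B-predator", "I-predator", "I-predator", "I-predator",
             "I-predator", "O", "B-food", "O"]
# Prediction from the example above: the entity is split into two fragments.
pred_tags = ["B-predator", "I-predator", "O", "O", "B-predator", "O", "O", "O"]

gold_spans = bio_to_spans(gold_tags)
pred_spans = bio_to_spans(pred_tags)

print(naive_tp(pred_spans, gold_spans))       # 2: both fragments hit the same gold entity
print(one_to_one_tp(pred_spans, gold_spans))  # 1: the gold entity is counted once
```

With the naive count, precision comes out as 2/2 even though only one real entity was touched; with one-to-one matching it is 1/2.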

There are also lenient measures that ignore mismatched labels, as long as the text is highlighted.
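
One way to picture the options is as two independent switches: whether the span must match exactly or may merely overlap, and whether the label must match. The sketch below is only an illustration under that assumption. The function name `span_f1`, its flags, and the example spans (including a deliberately mislabelled "nectar") are hypothetical, and it uses a simple greedy one-to-one matching.

```python
from itertools import product


def span_f1(pred, gold, exact_span=True, match_label=True):
    """Precision, recall and F1 over (start, end, label) spans, end exclusive."""

    def is_match(p, g):
        if exact_span:
            span_ok = p[:2] == g[:2]
        else:
            span_ok = p[0] < g[1] and g[0] < p[1]  # any token overlap
        label_ok = (p[2] == g[2]) if match_label else True
        return span_ok and label_ok

    unused = list(gold)  # each gold span may be matched only once
    tp = 0
    for p in pred:
        hit = next((g for g in unused if is_match(p, g)), None)
        if hit is not None:
            unused.remove(hit)
            tp += 1

    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


# Tokens: The Southeast Asian soldier fly eats nectar .
# Hypothetical gold and predicted spans for illustration only.
gold = [(0, 5, "predator"), (6, 7, "food")]
pred = [(0, 2, "predator"), (6, 7, "predator")]  # partial span, then wrong label

scores = {
    (exact, label): span_f1(pred, gold, exact_span=exact, match_label=label)[2]
    for exact, label in product([True, False], repeat=2)
}
for (exact, label), f1 in scores.items():
    print(f"exact_span={exact!s:5} match_label={label!s:5} F1={f1:.2f}")
```

On this toy example the strict setting scores 0.0, each single relaxation scores 0.5, and relaxing both scores 1.0, which shows how far apart the variants can be on the same predictions.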

I ended up considering all the options and calculating 4 types of lenient measures, and I haven't even touched the per-label measurements. Is that too much?

named-entity-recognition nlp

created May 1 at 18:42

FewKey

Hatter The Mad

Hey @FewKey!
There is no such thing as "overengineering" a metric, only defining it inappropriately. So let's start there. What exactly do you wish to evaluate? What is the desired result? What sort of problems give you a reason to "engineer" the metric at all?

May 6 at 6:59, edited May 8 at 14:29

FewKey

Author

I see. In my case, the desired result is "some entity is found" so that we can do downstream analysis. In that case, I guess both the label mismatches and the span issues can be ignored. Thanks!

May 8 at 9:24


Am I overengineering my lenient NER F1 measures?
