Finding the Most Important Sentences Using TF-IDF in Python

Newt Tan
1 min read · Apr 29, 2020


Normally, TF-IDF is applied to words rather than sentences. This implementation is part of my college research project. The dataset is not provided here for privacy reasons.

I read a good article about implementing this in JavaScript. However, parts of that code can be improved and rewritten in Python, so I wrote this article to share my version with you.

The output looks like this:

[Figure: sentence importance ranking]

Part of the code looks like this:

[Figure: InverseDocumentFreq code]
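The code screenshot is not reproduced here, so the following is only a minimal sketch of the general idea, not the notebook's actual code. It assumes each sentence is treated as its own "document" when computing the inverse document frequency, and that a sentence's score is the sum of its words' TF-IDF weights. The function names (tokenize, inverse_document_freq, rank_sentences) and the simple regex tokenizer are hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def inverse_document_freq(tokenized_sentences):
    """Compute IDF per word, treating each sentence as one 'document'."""
    n = len(tokenized_sentences)
    doc_freq = Counter()
    for tokens in tokenized_sentences:
        doc_freq.update(set(tokens))  # count each word once per sentence
    return {word: math.log(n / df) for word, df in doc_freq.items()}


def rank_sentences(sentences):
    """Return (score, sentence) pairs sorted from highest to lowest score."""
    tokenized = [tokenize(s) for s in sentences]
    idf = inverse_document_freq(tokenized)
    scored = []
    for sentence, tokens in zip(sentences, tokenized):
        if not tokens:
            scored.append((0.0, sentence))
            continue
        counts = Counter(tokens)
        # Sum of TF-IDF weights, with TF normalised by sentence length
        score = sum((c / len(tokens)) * idf[w] for w, c in counts.items())
        scored.append((score, sentence))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    demo = ["the cat sat", "the dog sat", "the cat ate fish"]
    for score, sentence in rank_sentences(demo):
        print(f"{score:.3f}  {sentence}")
```

In this sketch a word that appears in every sentence gets an IDF of zero, so it contributes nothing to any score. Sentences containing rarer words rise to the top of the ranking.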

If you want to know more about the principles of the algorithm, you can read the reference article.
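For orientation, one common way to write TF-IDF at the sentence level is shown here. This is an assumed textbook-style formulation, not necessarily the exact variant used in the notebook or the reference article. Here f_{t,s} is the number of times term t occurs in sentence s, N is the total number of sentences, and n_t is the number of sentences that contain t.

```latex
\mathrm{tf}(t, s) = \frac{f_{t,s}}{\sum_{t'} f_{t',s}}, \qquad
\mathrm{idf}(t) = \log \frac{N}{n_t}, \qquad
\mathrm{score}(s) = \sum_{t \in s} \mathrm{tf}(t, s) \cdot \mathrm{idf}(t)
```

Under this formulation, a sentence scores highly when it contains terms that are frequent within it but rare across the other sentences.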

Here are the links:

Source code: https://github.com/Wapiti08/Algorithms_on_Feature_Engineering/blob/master/TF-IDF-Sen.ipynb

Reference: https://hackernoon.com/finding-the-most-important-sentences-using-nlp-tf-idf-3065028897a3
