Unstructured data contains a plethora of information. Like energy, it creates high value for its stakeholders when harnessed.
With keyword extraction, one can obtain important insights into a topic within a short span of time. It helps condense the text and surface the relevant keywords.
One of the techniques used for keyword extraction is TF-IDF (Term Frequency – Inverse Document Frequency).
Term Frequency (TF) – How frequently a term occurs in a text. It is measured as (number of times term t appears in the text) / (total number of words in the document).
Inverse Document Frequency (IDF) – How important a word is in a document, based on how few sentences contain it. It is measured as log(total number of sentences / number of sentences containing term t).
TF-IDF – The importance of a word is measured by this score. It is computed as TF * IDF.
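The three definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (tokenize, split_sentences, tf_idf_scores, top_keywords), the regex-based tokenizer, and the naive sentence splitter are all assumptions made for this example. Following the definitions above, each sentence is treated as a "document" when computing IDF.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(text):
    """Naive sentence splitter on '.', '!' and '?' (an assumption for this sketch)."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def tf_idf_scores(text):
    """Return {term: TF * IDF} for every term in the text."""
    words = tokenize(text)
    # Each sentence becomes a set of its tokens, for fast membership checks.
    sentences = [set(tokenize(s)) for s in split_sentences(text)]
    total_words = len(words)
    total_sentences = len(sentences)

    scores = {}
    for term, count in Counter(words).items():
        # TF: occurrences of the term / total number of words in the text.
        tf = count / total_words
        # IDF: log(total sentences / sentences containing the term).
        sentences_with_term = sum(1 for s in sentences if term in s)
        idf = math.log(total_sentences / sentences_with_term)
        scores[term] = tf * idf
    return scores


def top_keywords(text, n=5):
    """Return the n terms with the highest TF-IDF score."""
    scores = tf_idf_scores(text)
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

With these definitions, a word that appears in every sentence gets an IDF of log(1) = 0 and therefore a score of 0, which is how very common words drop out of the keyword list. In practice one would likely also remove stop words and use a more robust sentence splitter than the one assumed here.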