The traditional naive Bayes algorithm is a commonly used text classification algorithm, but its attribute independence assumption reduces classification performance on text to some extent. In response to this problem, a weighted naive Bayes text classification algorithm based on improved feature weights is proposed. First, the TF-IDF algorithm is improved with respect to inverse feature frequency, category frequency, and related factors; redundant attributes are removed, and the improved values are used to measure the weights of the different feature items, after which cross entropy is applied. The feature item weights are then substituted into the naive Bayes formula to construct the weighted naive Bayes classification algorithm. Compared with several other algorithms, the experimental results show that this algorithm achieves significantly higher precision, recall, and F1 score.
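The general idea of substituting feature weights into the naive Bayes formula can be sketched as follows. The abstract does not give the improved weight or the exact decision rule, so this sketch assumes one common form of weighted naive Bayes, in which the class score is log P(c) plus the sum over terms of w(t) · log P(t | c). The weight used here is an ordinary smoothed IDF, a stand-in and not the paper's improved TF-IDF and cross-entropy weight; the function names `train`, `idf_weights`, and `predict` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(docs, labels):
    """Estimate log class priors and Laplace-smoothed log term likelihoods."""
    vocab = {t for d in docs for t in d}
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    for d, c in zip(docs, labels):
        term_counts[c].update(d)
    log_prior = {c: math.log(n / len(docs)) for c, n in class_docs.items()}
    log_like = {}
    for c in class_docs:
        # Denominator: all term occurrences in class c plus one per vocabulary term.
        total = sum(term_counts[c].values()) + len(vocab)
        log_like[c] = {t: math.log((term_counts[c][t] + 1) / total) for t in vocab}
    return log_prior, log_like


def idf_weights(docs):
    """Placeholder feature weights: smoothed IDF (assumption, NOT the paper's weight)."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return {t: math.log((1 + n) / (1 + k)) + 1.0 for t, k in df.items()}


def predict(doc, log_prior, log_like, weights):
    """Return the class maximising log P(c) + sum_t w(t) * log P(t | c)."""
    scores = {}
    for c in log_prior:
        s = log_prior[c]
        for t in doc:
            if t in log_like[c]:  # terms outside the training vocabulary are skipped
                s += weights.get(t, 1.0) * log_like[c][t]
        scores[c] = s
    return max(scores, key=scores.get)
```

Setting every weight to 1.0 recovers standard multinomial naive Bayes, so the effect of any weighting scheme can be compared against that baseline by swapping the `weights` dictionary.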