Automated text analytics and classification of text documents with machine learning

Authors:

Nihar Ranjan, Abhishek Gupta, Ishwari Dhumale, Payal Gogawale and Rugved Gramopadhye

Subject Area:

Physical Sciences and Engineering

Abstract:

With the dramatic increase in digital data, managing the huge number of documents has become difficult. When an individual searches the internet for information about a particular topic, they may receive a large set of documents. Some of these documents may be in .pdf format, some in .txt format, and others may be Word documents. The titles of these documents may seem relevant to what the individual is looking for, but their content may differ. Thus, there is a need to read, understand and analyze the contents of all the documents at a glance. As a result, it has become necessary to categorize large texts (documents) into specific classes. In our proposed system, we classify documents, both single and multiple, into predefined classes. The documents can be in any of several formats, i.e., .txt, .doc, .docx or .pdf. Preprocessing techniques such as tokenization, stop-word removal and stemming are then applied to the input files. Each document is classified according to the given learning datasets. Dynamic learning is used to update the learning datasets. This project covers how documents are classified and how exactly the desired output (the classified documents) is determined. We also aim to generate a classification report giving the number of documents in a particular class relative to the total number of documents. A pie chart can also show why a particular document is inclined towards a particular category and what percentage of its content consists of information related to that category.
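The pipeline described in the abstract (tokenization, stop-word removal, stemming, classification into predefined classes, and a per-class report) can be illustrated with a minimal sketch. The abstract does not name the classifier or the stemmer, so everything below is an assumption made for illustration: a small multinomial Naive Bayes model with add-one smoothing stands in for the classifier, a crude suffix stripper stands in for a real stemmer, and the names `preprocess`, `NaiveBayes`, `learn`, `scores` and `class_report` are hypothetical. The sketch works on plain strings; extracting text from .pdf or .docx files is out of its scope.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "on", "the", "to", "with"}


def stem(token):
    """Crude suffix stripping; a stand-in for a real stemmer, not Porter's algorithm."""
    for suffix in ("ing", "es", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize into lowercase words, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stem(t) for t in tokens if t not in STOP_WORDS]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (assumed classifier)."""

    def __init__(self):
        self.class_docs = Counter()              # documents seen per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()

    def learn(self, text, label):
        """Add one labeled document; calling this later updates the learned data."""
        tokens = preprocess(text)
        self.class_docs[label] += 1
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

    def scores(self, text):
        """Normalized score per class (sums to 1), usable as pie-chart shares."""
        tokens = preprocess(text)
        total_docs = sum(self.class_docs.values())
        log_scores = {}
        for label, n_docs in self.class_docs.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            logp = math.log(n_docs / total_docs)
            for t in tokens:
                logp += math.log((counts[t] + 1) / denom)
            log_scores[label] = logp
        peak = max(log_scores.values())  # subtract the peak to avoid underflow
        exp = {k: math.exp(v - peak) for k, v in log_scores.items()}
        z = sum(exp.values())
        return {k: v / z for k, v in exp.items()}

    def classify(self, text):
        """Return the class with the highest score."""
        s = self.scores(text)
        return max(s, key=s.get)


def class_report(model, documents):
    """Percentage of the given documents assigned to each class."""
    labels = Counter(model.classify(d) for d in documents)
    return {label: 100.0 * n / len(documents) for label, n in labels.items()}


# Toy training data, invented for this example.
model = NaiveBayes()
model.learn("The team scored two goals in the football match", "sports")
model.learn("The striker was training before the match", "sports")
model.learn("Stock markets and banks reported rising interest rates", "finance")
model.learn("Investors traded shares on the market", "finance")

print(model.classify("goals scored in the match"))
print(class_report(model, ["goals scored in the match",
                           "banks and investors traded stock"]))
```

Because `learn` only increments counts, new labeled documents can be added at any time without retraining from scratch, which is one simple way to read the "dynamic learning" mentioned above. The normalized values returned by `scores` are one possible interpretation of the per-category percentages shown in the pie chart; the paper may compute them differently.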

PDF file:

15322.pdf
