The project "Sherlock Checker" is a plagiarism checker that detects plagiarism in files using cosine similarity. To compare two files, the raw text is transformed into numeric vectors, and the cosine similarity between those vectors is computed. The result is printed as a decimal, where 1.0 indicates 100% similarity.
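The comparison described above can be sketched roughly as follows. This is a minimal illustration built on the `TfidfVectorizer` and `cosine_similarity` tools listed in this README, not the project's actual `main.py`; the function name `similarity` and the sample strings are assumptions made for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def similarity(text_a: str, text_b: str) -> float:
    """Return the cosine similarity of the TF-IDF vectors of two texts."""
    # Fit one vocabulary over both texts so the vectors share dimensions.
    vectors = TfidfVectorizer().fit_transform([text_a, text_b])
    # cosine_similarity returns a 1x1 matrix for two single-row inputs.
    return float(cosine_similarity(vectors[0], vectors[1])[0, 0])


# Hypothetical sample texts, standing in for the contents of two files.
original = "the quick brown fox jumps over the lazy dog"
copied = "the quick brown fox jumps over the lazy dog"
unrelated = "completely different words appear here"

print(round(similarity(original, copied), 2))
print(round(similarity(original, unrelated), 2))
```

Identical texts should score close to 1.0, and texts that share no words should score 0.0.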
Demo video: Sherlock.Checker.Demo.Made.with.Clipchamp.mp4
- 1st scenario shows maximum similarity between the 2 files.
- 2nd scenario shows a few lines removed, and ~70% similarity between the files.
- 3rd scenario shows all the lines of the file removed, and 0% similarity between the files.
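To see why the three scenarios above produce a high, intermediate, and zero score, the cosine formula can be worked by hand. The sketch below is a simplification that uses raw word counts rather than TF-IDF weights, so its numbers will not match the demo's; the function name `cosine` and the sample texts are hypothetical.

```python
import math
from collections import Counter


def cosine(text_a: str, text_b: str) -> float:
    """Cosine similarity of two texts using raw word counts as vectors."""
    a, b = Counter(text_a.split()), Counter(text_b.split())
    # Dot product over shared words (Counter returns 0 for missing words).
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    # An empty text has a zero vector, so define its similarity as 0.0.
    return dot / norm if norm else 0.0


full = "line one\nline two\nline three"
partial = "line one\nline two"  # a few lines removed
empty = ""                      # all lines removed

print(cosine(full, full))     # identical files
print(cosine(full, partial))  # partially matching files
print(cosine(full, empty))    # nothing left to match
```

Removing lines shrinks the overlap between the two vectors, which lowers the score; an empty file has no words in common with anything, so the score falls to zero.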
- Python.
- sklearn.
- TfidfVectorizer.
- cosine_similarity.
- PyCharm.
https://forms.gle/uYKJLqpQH4FryyG19
- Clone the Repository with:
git clone https://github.com/Akash-Ramjyothi/Sherlock-Checker
- Install required dependencies with:
pip install scikit-learn
- Run the script with:
`python3 main.py`
- Take a look at the Existing Issues or create your own Issues!
- Wait for the Issue to be assigned to you after which you can start working on it.
- Fork the Repo and create a Branch for any Issue that you are working on.
- Create a Pull Request, which will be promptly reviewed, with suggestions added to improve it.
- Add Screenshots to help me understand what your Code does.