🚀 Duplo Analyser - GitHub Action

⚡️ Lightning fast duplicate code detection! Supports all text formats with special handling of comments for common languages.

📝 Overview

Duplo Analyser is a GitHub Action for detecting duplicate code blocks in your repository. It scans source files and identifies similar code snippets based on configurable parameters. If any duplicate blocks are found, the action fails the build.

🔋 This action is powered by Duplo - the fastest (?) duplicate detector on GitHub.

🚀 Usage

Add the following step to your GitHub Actions workflow:

- name: Run Duplo Analyser
  uses: dlidstrom/duplo-analyser@v2
  with:
    directory: '.'
    include-pattern: '.*'
    minimum-lines: "10"
    minimum-line-length: "3"
    max-files: "100"
    ignore-preprocessor-directives: "true"
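
Because the action fails the build when duplicates are found, it may be useful to run it in advisory mode while an existing code base is being cleaned up. The following is a minimal sketch that relies on the standard GitHub Actions `continue-on-error` step setting, which is not an input of this action:

```yaml
- name: Run Duplo Analyser (advisory)
  uses: dlidstrom/duplo-analyser@v2
  # Standard GitHub Actions step key: the job continues even if this step fails.
  continue-on-error: true
  with:
    directory: '.'
    include-pattern: '.*'
```

Duplicates are still printed to the workflow log; the step is simply not allowed to fail the job.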

🔧 Inputs

| 🔹 Input Name | 📝 Description | 🏷️ Default |
| --- | --- | --- |
| `directory` | 📂 Top directory from which to search for files. Only used with `include-pattern`. | `.` |
| `include-pattern` | 🔍 Regular expression for including filenames (case-insensitive). Mutually exclusive with `file-list`. | `.*` |
| `exclude-pattern` | 🚫 Regular expression for excluding filenames (case-insensitive). Only used with `include-pattern`. | `.^` |
| `file-list` | 📝 File with filenames to analyse. Mutually exclusive with `include-pattern`. | `''` |
| `minimum-lines` | 📏 Minimum number of lines required for duplicate detection | `10` |
| `minimum-line-length` | ✂️ Minimum number of characters per line (shorter lines are ignored) | `3` |
| `max-files` | 📊 Maximum number of files to report (useful for large duplicate sets) | `100` |
| `ignore-preprocessor-directives` | 🛑 Removes preprocessor directives before duplicate detection | `true` |
| `version` | 📌 Version of Duplo to use | `v2.0.1` |

🔄 Example Workflow

🔍 Using regular expressions

name: Detect Duplicate Code
on: [push, pull_request]

jobs:
  duplication-check:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Run Duplo Analyser
        uses: dlidstrom/duplo-analyser@v2
        with: # optionally override the defaults
          directory: '.'
          include-pattern: '.*'
          minimum-lines: "10"
          minimum-line-length: "3"
          max-files: "100"
          ignore-preprocessor-directives: "true"

Sample include patterns (partial match is sufficient):

C/C++: '\.(h|cpp)$'
JavaScript: '\.js$' - or any other extension you need

The OR (|) operator only works inside groups (). Excluding files works in the same fashion.

The grep utility is used on all platforms, with POSIX extended regular expression syntax.
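
As an illustration of combining the include and exclude inputs, the step below analyses C/C++ sources while skipping test files. The `_test` suffix is a hypothetical naming convention, not something the action requires; adjust both patterns to your repository.

```yaml
- name: Run Duplo Analyser on C/C++ sources
  uses: dlidstrom/duplo-analyser@v2
  with:
    directory: '.'
    # Alternation is wrapped in a group; single quotes keep backslashes literal in YAML.
    include-pattern: '\.(h|cpp)$'
    # Hypothetical convention: skip files such as foo_test.cpp.
    exclude-pattern: '_test\.(h|cpp)$'
```

Both patterns are anchored to the end of the name with `$`, so they should behave the same whether they are matched against a bare filename or a full path.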

📝 Using file list

name: Detect Duplicate Code
on: [push, pull_request]

jobs:
  duplication-check:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Run Duplo Analyser
        uses: dlidstrom/duplo-analyser@v2
        with:
          file-list: 'files.lst'

The file list option is useful when analysing only a specific subset of files, for example the files changed in a PR. This allows stricter rules for the files that are actively being worked on.
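
One possible way to build such a list for a pull request is sketched below. It assumes the file list contains one path per line, and it uses a full-history checkout so that the base branch is available for `git diff`. The step name, the `BASE_REF` variable, and the `files.lst` filename are illustrative choices, not requirements of the action.

```yaml
name: Detect Duplicate Code in Changed Files
on: [pull_request]

jobs:
  duplication-check:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 0 # full history, so the base branch can be diffed against

      - name: List files changed in the PR
        env:
          BASE_REF: ${{ github.base_ref }} # target branch of the pull request
        run: |
          # Names of files changed since the merge base with the target branch.
          # --diff-filter=d leaves out files that the PR deletes.
          git diff --name-only --diff-filter=d "origin/${BASE_REF}...HEAD" > files.lst

      - name: Run Duplo Analyser
        uses: dlidstrom/duplo-analyser@v2
        with:
          file-list: 'files.lst'
```

The list produced this way contains every changed file, including non-source files; filter it further (for example with `grep -E`) if only certain extensions should be analysed.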

📤 Output

The action prints duplicate code blocks to the workflow logs, allowing you to identify and refactor repeated code.

Sample:

Loading and hashing files ... 2 done.

tests/Quake2/g_chase.c(137)
tests/Quake2/g_chase.c(113)
	int i;
	edict_t *e;
	if (!ent->client->chase_target)
		return;
	i = ent->client->chase_target - g_edicts;
	do {

tests/Quake2/g_chase.c found: 1 block(s)
Configuration:
  Number of files: 1
  Minimal block size: 4
  Minimal characters in line: 3
  Ignore preprocessor directives: 0
  Ignore same filenames: 0

Results:
  Lines of code: 96
  Duplicate lines of code: 6
  Total 1 duplicate block(s) found.

🛠️ How It Works

🖥️ Platform-Specific Setup:
- Sets the executable path based on the operating system.
📦 Caching:
- Caches the Duplo binary to speed up future runs.
📥 Downloading Duplo (If Not Cached):
- Fetches the specified version of Duplo and unzips it.
📂 File Analysis:
- Uses find to locate files based on include/exclude patterns.
- Runs Duplo on the matching files.
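
The file selection described above can be approximated in a separate workflow step to preview which files a given pair of patterns would select. This is a rough sketch, not the action's actual implementation: it assumes the patterns are applied with `grep -E -i` to the paths printed by `find`, and the patterns themselves are only examples.

```yaml
- name: Preview files matched by the patterns
  shell: bash
  run: |
    # List every regular file, keep paths matching the include pattern,
    # then drop paths matching the exclude pattern (-v inverts the match).
    # '|| true' keeps the step from failing when nothing matches, since
    # grep exits with status 1 when it selects no lines.
    find . -type f \
      | grep -E -i '\.(h|cpp)$' \
      | grep -E -i -v '_test\.(h|cpp)$' \
      || true
```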

📜 License

This action is open-source and available under the MIT License.
