[Pre-commit] Fix Critical Issues and Improve Code Quality

## Summary

This issue tracks critical fixes and quality improvements for the Husky pre-commit script in the `develop` branch. The script orchestrates multiple validation tools (Node.js/pnpm, Python, git) to ensure code quality before commits.

## Related
- Part 2: #6318 (Cross-Platform, Maintainability & Performance improvements)

---

## 1. 🔴 CRITICAL ISSUES

### Issue 1.1: Dangerous `git add .` Stages All Files
**Severity**: CRITICAL  
**Lines**: 151

**Description**:
This stages **ALL files** in the repository (including untracked files, .env files, secrets, node_modules changes, etc.), not just the formatted files. This can lead to:
- Accidentally committing secrets or config files
- Staging unrelated changes the developer didn't intend to commit
- Breaking the principle of explicit staging

**Current Code**:
```bash
git add .
```

**Proposed Fix**:
```diff
-git add .
+# Only re-stage the files that were originally staged and potentially modified by formatters
+if [ -n "$STAGED_SRC" ]; then
+  echo "$STAGED_SRC" | xargs git add
+fi
+
+if [ -n "$CSS_STAGED" ]; then
+  echo "$CSS_STAGED" | xargs git add
+fi
+
+# Re-add any files that format:fix or other formatters might have modified
+git diff --cached --name-only --diff-filter=ACMRT | xargs -I {} git add {}
```

---

### Issue 1.2: Missing Shell Error Handling (`set -e`)
**Severity**: CRITICAL  
**Lines**: 1-2

**Description**:
Without `set -e` or `set -euo pipefail`, the script continues even when commands fail. Combined with inconsistent `|| exit 1` usage, some failures are silently ignored.

**Current Code**:
```bash
#!/usr/bin/env sh
pnpm run generate-docs
```

**Proposed Fix**:
```diff
 #!/usr/bin/env sh
+set -e  # Exit immediately if any command fails
+set -u  # Treat unset variables as errors
+set -o pipefail  # Catch failures in pipes
+
-pnpm run generate-docs
+# If you need to allow specific commands to fail, use explicit handling:
+pnpm run generate-docs || {
+  echo "❌ Documentation generation failed"
+  exit 1
+}
```

---

### Issue 1.3: Security Risk - Executing Downloaded External Script
**Severity**: HIGH  
**Lines**: 52-139

**Description**:
- No SHA verification of downloaded script
- If GitHub is compromised or MitM attack occurs, malicious code could execute
- Cache could be poisoned if downloaded during an attack

**Current Code**:
```bash
SCRIPT_URL="https://raw.githubusercontent.com/PalisadoesFoundation/.github/main/.github/workflows/scripts/disable_statements_check.py"
# ... downloads and executes without integrity verification
```

**Proposed Fix (Option 1 - Pin to commit SHA)**:
```diff
-SCRIPT_URL="https://raw.githubusercontent.com/PalisadoesFoundation/.github/main/.github/workflows/scripts/disable_statements_check.py"
+# Pin to specific commit SHA instead of 'main'
+SCRIPT_URL="https://raw.githubusercontent.com/PalisadoesFoundation/.github/abc123commitsha/.github/workflows/scripts/disable_statements_check.py"
```

**Proposed Fix (Option 2 - Add SHA256 verification)**:
```diff
+EXPECTED_SHA="abc123...your-sha-here"
 if [ "$NEEDS_DOWNLOAD" = true ]; then
+  # Download to temp file first
+  TEMP_FILE=$(mktemp)
-  curl -sSfL "$SCRIPT_URL" -o "$SCRIPT_PATH" || exit 1
+  curl -sSfL "$SCRIPT_URL" -o "$TEMP_FILE" || exit 1
+  
+  # Verify SHA256
+  ACTUAL_SHA=$(shasum -a 256 "$TEMP_FILE" | cut -d' ' -f1)
+  if [ "$ACTUAL_SHA" != "$EXPECTED_SHA" ]; then
+    echo "❌ Security error: Script checksum mismatch!"
+    echo "Expected: $EXPECTED_SHA"
+    echo "Got: $ACTUAL_SHA"
+    rm "$TEMP_FILE"
+    exit 1
+  fi
+  
+  mv "$TEMP_FILE" "$SCRIPT_PATH"
+  echo "✅ Script downloaded and verified"
-  echo "✅ Script downloaded successfully"
 fi
```

**Proposed Fix (Option 3 - Vendor the script - MOST SECURE)**:
- Copy the script to `.github/workflows/scripts/` and commit it to the repository

---

### Issue 1.4: Unquoted Variable Expansion - Command Injection Risk
**Severity**: MEDIUM-HIGH  
**Lines**: 140, 149

**Description**:
If filenames contain spaces or special characters (e.g., `file; rm -rf /`), xargs could misinterpret them.

**Current Code**:
```bash
echo "$STAGED_SRC" | xargs "$VENV_BIN" "$SCRIPT_PATH" --files
echo "$CSS_STAGED" | xargs "$VENV_BIN" .github/workflows/scripts/css_check.py --files
```

**Proposed Fix**:
```diff
-echo "$STAGED_SRC" | xargs "$VENV_BIN" "$SCRIPT_PATH" --files
+# Use null-byte separation for safety
+if [ -n "$STAGED_SRC" ]; then
+  printf '%s\n' $STAGED_SRC | xargs -I {} "$VENV_BIN" "$SCRIPT_PATH" --files {}
+fi
+
-echo "$CSS_STAGED" | xargs "$VENV_BIN" .github/workflows/scripts/css_check.py --files
+# Better: Pass files directly if script supports it
+if [ -n "$CSS_STAGED" ]; then
+  "$VENV_BIN" .github/workflows/scripts/css_check.py --files $CSS_STAGED
+fi
```

---

## 2. ⚠️ CODE QUALITY ISSUES

### Issue 2.1: Inconsistent Error Handling
**Severity**: MEDIUM  
**Lines**: 17, 44-46

**Description**:
These checks can fail silently, allowing problematic commits to pass.

**Current Code**:
```bash
# No exit on failure:
pnpm run check-i18n -- --staged $STAGED_SRC

# No exit on failure:
"$VENV_BIN" .github/workflows/scripts/check_docstrings.py --directories .github
"$VENV_BIN" .github/workflows/scripts/compare_translations.py --directory public/locales
"$VENV_BIN" .github/workflows/scripts/countline.py --lines 600 --files ./.github/workflows/config/countline_excluded_file_list.txt
```

**Proposed Fix**:
```diff
-pnpm run check-i18n -- --staged $STAGED_SRC
+# Add explicit error handling
+if [ -n "$STAGED_SRC" ]; then
+  pnpm run check-i18n -- --staged $STAGED_SRC || {
+    echo "❌ i18n check failed"
+    exit 1
+  }
+fi
 
-"$VENV_BIN" .github/workflows/scripts/check_docstrings.py --directories .github
-"$VENV_BIN" .github/workflows/scripts/compare_translations.py --directory public/locales
-"$VENV_BIN" .github/workflows/scripts/countline.py --lines 600 --files ./.github/workflows/config/countline_excluded_file_list.txt
+# Python checks
+"$VENV_BIN" .github/workflows/scripts/check_docstrings.py --directories .github || exit 1
+"$VENV_BIN" .github/workflows/scripts/compare_translations.py --directory public/locales || exit 1
+"$VENV_BIN" .github/workflows/scripts/countline.py --lines 600 --files ./.github/workflows/config/countline_excluded_file_list.txt || exit 1
```

---

### Issue 2.2: Missing Tool Availability Checks
**Severity**: MEDIUM  
**Lines**: Various

**Description**:
If tools aren't installed, errors are cryptic. Better to fail fast with clear messages.

**Current Code**:
```bash
pnpm run generate-docs  # Assumes pnpm exists
node .github/workflows/scripts/check-minio-compliance.cjs  # Assumes node exists
```

**Proposed Fix**:
```diff
 #!/usr/bin/env sh
 set -e
 
+# Validate required tools at the start
+check_tool() {
+  if ! command -v "$1" >/dev/null 2>&1; then
+    echo "❌ Error: Required tool '$1' is not installed."
+    echo "Please install it and try again."
+    exit 1
+  fi
+}
+
+check_tool pnpm
+check_tool node
+check_tool npx
+check_tool git
+
+# Now proceed with the rest of the script...
```

---

### Issue 2.3: Unquoted Variable in Command Arguments
**Severity**: LOW-MEDIUM  
**Line**: 17

**Description**:
`$STAGED_SRC` should be quoted to handle filenames with spaces.

**Current Code**:
```bash
pnpm run check-i18n -- --staged $STAGED_SRC
```

**Proposed Fix**:
```diff
-pnpm run check-i18n -- --staged $STAGED_SRC
+if [ -n "$STAGED_SRC" ]; then
+  pnpm run check-i18n -- --staged "$STAGED_SRC"
+fi
```

---

### Issue 2.4: Race Condition in Virtual Environment Setup
**Severity**: LOW-MEDIUM  
**Line**: 26

**Description**:
If multiple git hooks run simultaneously (unlikely but possible), they could conflict when setting up the venv.

**Current Code**:
```bash
VENV_BIN=$(./.github/workflows/scripts/venv.sh) || exit 1
```

**Proposed Fix**:
```diff
+# Use a lock file to prevent concurrent venv setup
+LOCK_FILE=".git/venv-setup.lock"
+exec 200>"$LOCK_FILE"
+flock -n 200 || {
+  echo "⏳ Waiting for virtual environment setup by another process..."
+  flock 200
+}
+
 VENV_BIN=$(./.github/workflows/scripts/venv.sh) || exit 1
+
+# Release lock (automatic on exit, but explicit is clearer)
+flock -u 200
```

---

## 3. 🔄 WORKFLOW & DEPENDENCIES

### Issue 3.1: Running Expensive Operations on Empty Staged Files
**Severity**: LOW  
**Lines**: 20-24

**Description**:
These commands run even when no files are staged, wasting time on no-op commits.

**Current Code**:
```bash
pnpm run update:toc
pnpm run check:pom
npx knip --include files,exports,nsExports,nsTypes
```

**Proposed Fix**:
```diff
+# Only run if there are staged files
+STAGED_FILES=$(git diff --cached --name-only)
+if [ -n "$STAGED_FILES" ]; then
   pnpm run update:toc
   pnpm run check:pom
   npx knip --include files,exports,nsExports,nsTypes
+  npx knip --config knip.deps.json --include dependencies
+  pnpm run check-mock-cleanup
+  pnpm run check-localstorage
+fi
```

---

### Issue 3.2: CSS Check Has Potential for False Empty
**Severity**: LOW  
**Lines**: 143-147

**Description**:
The `|| true` swallows the exit code, but if `git diff` itself fails (not just grep having no matches), you won't know.

**Current Code**:
```bash
CSS_STAGED=$(git diff --cached --name-only --diff-filter=ACMRT \
  | grep -v '^src/utils/' \
  | grep -v '^src/types/' || true)
```

**Proposed Fix**:
```diff
-CSS_STAGED=$(git diff --cached --name-only --diff-filter=ACMRT \
-  | grep -v '^src/utils/' \
-  | grep -v '^src/types/' || true)
+# Separate git diff error handling from grep matching
+STAGED_FILES=$(git diff --cached --name-only --diff-filter=ACMRT) || exit 1
+CSS_STAGED=$(echo "$STAGED_FILES" | grep -v '^src/utils/' | grep -v '^src/types/' || true)
 
 if [ -n "$CSS_STAGED" ]; then
-  echo "$CSS_STAGED" | xargs "$VENV_BIN" .github/workflows/scripts/css_check.py --files
+  echo "$CSS_STAGED" | xargs "$VENV_BIN" .github/workflows/scripts/css_check.py --files || exit 1
 fi
```

---

### Issue 3.3: Cache Age Calculation Fallback Issue
**Severity**: LOW  
**Lines**: 67-81

**Description**:
On Windows or other platforms, this will re-download every time, bypassing the cache entirely.

**Current Code**:
```bash
# Fallback if stat failed (unsupported platform)
if [ -z "$FILE_MOD_TIME" ]; then
  echo "⚠️  Could not determine cache age, forcing re-download"
  NEEDS_DOWNLOAD=true
```

**Proposed Fix**:
```diff
+# Add Windows support with PowerShell fallback
 if [ -z "$FILE_MOD_TIME" ]; then
+  # Try Windows (Git Bash / PowerShell)
+  if command -v powershell.exe >/dev/null 2>&1; then
+    FILE_MOD_TIME=$(powershell.exe -Command "(Get-Item '$SCRIPT_PATH').LastWriteTime.ToFileTime()" 2>/dev/null)
+  fi
+  
+  # If still failed, re-download
+  if [ -z "$FILE_MOD_TIME" ]; then
     echo "⚠️  Could not determine cache age (unsupported platform), forcing re-download"
     NEEDS_DOWNLOAD=true
+  fi
 fi
```

---

## 📋 Implementation Checklist

### Critical Issues
- [ ] Issue 1.1: Replace `git add .` with selective staging
- [ ] Issue 1.2: Add `set -e`, `set -u`, and `set -o pipefail`
- [ ] Issue 1.3: Implement script integrity verification (choose Option 1, 2, or 3)
- [ ] Issue 1.4: Fix unquoted variable expansion

### Code Quality Issues
- [ ] Issue 2.1: Add error handling to all Python checks
- [ ] Issue 2.2: Add tool availability checks
- [ ] Issue 2.3: Quote variables properly
- [ ] Issue 2.4: Add lock file for venv setup

### Workflow & Dependencies
- [ ] Issue 3.1: Skip expensive operations when no files staged
- [ ] Issue 3.2: Separate git diff error handling from grep
- [ ] Issue 3.3: Add Windows support for cache age calculation

---

## 🧪 Testing Checklist

### macOS Testing
- [ ] Pre-commit hook runs successfully on macOS
- [ ] All checks pass with valid code
- [ ] Hook fails appropriately with invalid code
- [ ] File staging works correctly (no unintended files)

### Linux Testing
- [ ] Pre-commit hook runs successfully on Linux
- [ ] All checks pass with valid code
- [ ] Hook fails appropriately with invalid code
- [ ] File staging works correctly (no unintended files)

### Windows Testing
- [ ] Pre-commit hook runs successfully on Windows (Git Bash)
- [ ] All checks pass with valid code
- [ ] Hook fails appropriately with invalid code
- [ ] File staging works correctly (no unintended files)
- [ ] Cache mechanism works correctly on Windows

### Edge Cases
- [ ] Test with filenames containing spaces
- [ ] Test with filenames containing special characters
- [ ] Test with no staged files
- [ ] Test with network disconnection (cached scripts)
- [ ] Test with missing tools (should fail gracefully)


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[Pre-commit] Fix Critical Issues and Improve Code Quality #6317

Summary

Related

1. 🔴 CRITICAL ISSUES

Issue 1.1: Dangerous `git add .` Stages All Files

Issue 1.2: Missing Shell Error Handling (`set -e`)

Issue 1.3: Security Risk - Executing Downloaded External Script

Issue 1.4: Unquoted Variable Expansion - Command Injection Risk

2. ⚠️ CODE QUALITY ISSUES

Issue 2.1: Inconsistent Error Handling

Issue 2.2: Missing Tool Availability Checks

Issue 2.3: Unquoted Variable in Command Arguments

Issue 2.4: Race Condition in Virtual Environment Setup

3. 🔄 WORKFLOW & DEPENDENCIES

Issue 3.1: Running Expensive Operations on Empty Staged Files

Issue 3.2: CSS Check Has Potential for False Empty

Issue 3.3: Cache Age Calculation Fallback Issue

📋 Implementation Checklist

Critical Issues

Code Quality Issues

Workflow & Dependencies

🧪 Testing Checklist

macOS Testing

Linux Testing

Windows Testing

Edge Cases

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

[Pre-commit] Fix Critical Issues and Improve Code Quality #6317

Description

Summary

Related

1. 🔴 CRITICAL ISSUES

Issue 1.1: Dangerous git add . Stages All Files

Issue 1.2: Missing Shell Error Handling (set -e)

Issue 1.3: Security Risk - Executing Downloaded External Script

Issue 1.4: Unquoted Variable Expansion - Command Injection Risk

2. ⚠️ CODE QUALITY ISSUES

Issue 2.1: Inconsistent Error Handling

Issue 2.2: Missing Tool Availability Checks

Issue 2.3: Unquoted Variable in Command Arguments

Issue 2.4: Race Condition in Virtual Environment Setup

3. 🔄 WORKFLOW & DEPENDENCIES

Issue 3.1: Running Expensive Operations on Empty Staged Files

Issue 3.2: CSS Check Has Potential for False Empty

Issue 3.3: Cache Age Calculation Fallback Issue

📋 Implementation Checklist

Critical Issues

Code Quality Issues

Workflow & Dependencies

🧪 Testing Checklist

macOS Testing

Linux Testing

Windows Testing

Edge Cases

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Issue 1.1: Dangerous `git add .` Stages All Files

Issue 1.2: Missing Shell Error Handling (`set -e`)