PyCmd: A Beginner’s Guide to Python Command-Line Tools


Why migrate from shell scripts to PyCmd?

  • Portability: Python runs consistently across platforms (Windows, macOS, Linux) without shell differences.
  • Error handling: Exceptions and structured logging replace fragile exit-code checks and brittle parsing.
  • Maintainability: Python modules, unit tests, and type hints scale better than long shell files.
  • Extensibility: Reuse libraries, parse complex data formats (JSON, XML), and integrate with web APIs easily.
  • User experience: Provide subcommands, richer help messages, and consistent flags with less boilerplate.

Plan the migration

  1. Inventory scripts: list all shell scripts, note inputs/outputs, environment variables, cron jobs, and how they’re invoked.
  2. Prioritize by complexity and frequency of change: start with a small, frequently used script to get confidence.
  3. Specify behavior: document intended behavior, side effects (files created/deleted), error conditions, and edge cases.
  4. Design CLI surface: determine subcommands, flags, defaults, and expected exit codes.
  5. Prepare tests: capture current behavior with integration tests (can run existing shell scripts to form baselines).

Setup: project layout and environment

Create a project structure optimized for a CLI library and tool:

mytool/ ├─ mytool/ │  ├─ __init__.py │  ├─ cli.py         # PyCmd entrypoints and command definitions │  ├─ core.py        # business logic, pure functions │  └─ utils.py       # small helpers ├─ tests/ │  ├─ test_core.py │  └─ test_cli.py ├─ pyproject.toml └─ README.md 

Use a modern Python toolchain:

  • Python 3.10+ recommended.
  • Use a virtual environment (venv, pipx for installing CLI locally).
  • Add tooling: pytest for tests, mypy for type checks, and black for formatting.

Install dependencies:

python -m venv .venv source .venv/bin/activate pip install pycmd pytest mypy black 

Understand common shell patterns and their Python equivalents

Below are typical shell idioms and how to implement them in Python.

  • File and path manipulation

    • Shell: mkdir -p, rm -rf, cp
    • Python: use pathlib.Path and shutil (Path.mkdir(parents=True, exist_ok=True); shutil.rmtree)
  • Command pipelines and text processing

    • Shell: grep, awk, sed
    • Python: iterate file lines, use regex (re), or libraries like csv, json
  • Environment variables and defaults

    • Shell: ${VAR:-default}
    • Python: os.environ.get("VAR", "default") or click/PyCmd option defaults
  • Exit codes and error messages

    • Shell: exit 1 or || exit 1
    • Python: raise SystemExit(1) or let exceptions bubble and map to exit codes in cli layer
  • Background jobs / scheduling

    • Shell: &, cron
    • Python: subprocess with Popen for background, or use scheduling libs (APScheduler) or rely on system cron for scheduling

Example: a simple script migration

Shell script (backup.sh):

#!/bin/sh set -e SRC="/etc/myapp" DEST="/backups/myapp-$(date +%F).tar.gz" tar -czf "$DEST" -C "$(dirname "$SRC")" "$(basename "$SRC")" find /backups -type f -mtime +30 -delete echo "Backup created: $DEST" 

Step-by-step migration to PyCmd:

  1. Extract business logic into functions (core.py)
  2. Create CLI layer handling flags and logging (cli.py)
  3. Replace shell utilities with Python equivalents

core.py:

from pathlib import Path import shutil import tarfile from datetime import datetime, timedelta def create_backup(src: Path, dest_dir: Path) -> Path:     dest_dir.mkdir(parents=True, exist_ok=True)     dest_name = f"{src.name}-{datetime.utcnow().date().isoformat()}.tar.gz"     dest_path = dest_dir / dest_name     with tarfile.open(dest_path, "w:gz") as tf:         tf.add(src, arcname=src.name)     return dest_path def prune_backups(dest_dir: Path, days: int = 30) -> int:     cutoff = datetime.utcnow() - timedelta(days=days)     removed = 0     for p in dest_dir.glob("*.tar.gz"):         if datetime.utcfromtimestamp(p.stat().st_mtime) < cutoff:             p.unlink()             removed += 1     return removed 

cli.py (PyCmd usage):

from pycmd import command, option from pathlib import Path from .core import create_backup, prune_backups @command() @option("--src", default="/etc/myapp", help="Source directory to back up") @option("--dest-dir", default="/backups", help="Destination directory") @option("--prune-days", type=int, default=30, help="Prune backups older than DAYS") def backup(src: str, dest_dir: str, prune_days: int):     src_path = Path(src)     dest_path = Path(dest_dir)     created = create_backup(src_path, dest_path)     removed = prune_backups(dest_path, prune_days)     print(f"Backup created: {created}")     if removed:         print(f"Removed {removed} old backups") 

Notes:

  • Validate inputs (existence, permissions) and raise informative errors.
  • Use UTC timestamps to avoid timezone surprises.
  • Consider atomic writes (create temp file then rename) if interruptions are a concern.

Handling subprocesses and external commands

If your shell script invokes external tools, prefer Python libraries when feasible. When you must call subprocesses, use the subprocess module safely:

  • Avoid shell=True unless necessary.
  • Capture output and check return codes explicitly.
  • Use timeout to avoid hanging processes.

Example:

import subprocess def run_tool(cmd: list[str], timeout: int = 60) -> str:     completed = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)     if completed.returncode != 0:         raise RuntimeError(f"Command {cmd} failed: {completed.stderr.strip()}")     return completed.stdout 

Logging and error handling

Replace ad-hoc echo/error messages with the logging module and structured exceptions:

  • Configure logging in cli entrypoint with appropriate levels.
  • Use exceptions in core logic; catch at top-level to set exit codes and present friendly messages.
  • Provide verbose/debug flag to increase logging detail.

Example:

import logging logger = logging.getLogger("mytool") def main():     logging.basicConfig(level=logging.INFO)     try:         # run CLI         ...     except Exception as e:         logger.error("Operation failed: %s", e)         raise SystemExit(1) 

Tests and verification

  • Unit tests: test pure functions in core.py with pytest.
  • Integration tests: run the CLI in a temporary filesystem using tmp_path and click.testing or PyCmd’s test helpers.
  • Regression tests: compare outputs of old shell scripts to new CLI for a set of representative inputs.

Example pytest for create_backup:

def test_create_backup(tmp_path):     src = tmp_path / "data"     src.mkdir()     (src / "file.txt").write_text("hello")     dest = tmp_path / "backups"     created = create_backup(src, dest)     assert created.exists()     # optionally verify tar contents 

Packaging and distribution

  • Provide an entry point in pyproject.toml so users can install with pip and run a single command.

pyproject.toml (snippet):

[project.scripts] mytool = "mytool.cli:main" 
  • For local usage, pip install -e . or use pipx for isolated installs.
  • Consider shipping as a single-file executable with tools like PEX or PyOxidizer if you need self-contained binaries.

Deployment and running as cron/systemd

  • For cron jobs, call the installed CLI with absolute paths and a dedicated virtual environment or an installed package.
  • For systemd timers, create a unit that runs the CLI with proper environment variables.
  • Keep secrets out of scripts; prefer config files with appropriate permissions or environment variables managed by the system.

Example cron line:

0 2 * * * /usr/bin/python3 -m mytool backup --src /etc/myapp --dest-dir /backups 

Performance and resource considerations

  • For large files, stream data rather than loading entirely into memory; use shutil.copyfileobj or tarfile.add with file-like objects.
  • For parallelizable tasks (processing many files), use concurrent.futures with process pools, mindful of I/O bottlenecks.
  • Monitor memory and handle OOM cases gracefully with clear error messages.

Common pitfalls and mitigation

  • Relying on locale-dependent behavior: specify encoding and use binary modes when needed.
  • Permissions differences: check and set file modes explicitly with Path.chmod.
  • Timezones and timestamps: prefer UTC for deterministic behavior.
  • Silent failures: ensure subprocesses and file ops raise exceptions or return non-zero exit codes.

Example migration checklist

  • [ ] Inventory scripts and usage
  • [ ] Capture current behavior with tests
  • [ ] Design CLI commands and options
  • [ ] Implement core logic with pure functions
  • [ ] Implement PyCmd CLI layer with helpful help texts
  • [ ] Add logging and error handling
  • [ ] Write unit and integration tests
  • [ ] Package and install locally
  • [ ] Replace cron/systemd entries to call new CLI
  • [ ] Monitor first runs and iterate on error handling

Conclusion

Migrating shell scripts to PyCmd modernizes tooling by leveraging Python’s ecosystem for robust error handling, cross-platform support, and easier testing. Start small, focus on preserving behavior, and refactor iteratively: separate core logic from CLI glue, test thoroughly, and use Python idioms for file handling and subprocess management. The result will be a more maintainable, extensible, and user-friendly command-line toolchain.

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *