Troubleshooting Unison-SSH: Common Issues and Fixes

Optimizing Unison-SSH Performance for Large Repositories

1. Use a recent Unison and SSH

  • Upgrade: Install the latest stable Unison and OpenSSH builds to benefit from performance and bug fixes.
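  • Matching versions: Run the same Unison version on both ends. Releases before 2.52 generally refuse to work with a different version on the other side; 2.52 and later interoperate more freely, but matching versions is still the safest setup.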
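
A quick way to check this before a long run is to compare the output of unison -version on both hosts. The sketch below assumes that output begins with "unison version X.Y.Z" and that example is an SSH alias for the remote host; the function names are made up for illustration, and the major.minor comparison is deliberately stricter than newer releases require.

```shell
#!/bin/sh
# Extract "major.minor" from a line such as "unison version 2.53.3 (ocaml 4.14.1)".
unison_minor() {
  printf '%s\n' "$1" | sed -n 's/^unison version \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeed only when both version lines parse and share the same major.minor.
versions_match() {
  a=$(unison_minor "$1")
  b=$(unison_minor "$2")
  [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage (commented out: needs Unison locally and SSH access to the remote):
# local_v=$(unison -version)
# remote_v=$(ssh example unison -version)
# versions_match "$local_v" "$remote_v" || echo "Unison version mismatch" >&2
```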

2. Reduce filesystem scanning cost

  • Limit sync roots: Sync only necessary directories rather than entire mounts.
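  • Prune ignored paths: Add large or frequently changing directories to your ignore list (e.g., build/, node_modules/, .git/).
    • Example ignore patterns:
      • ignore = Path node_modules
      • ignore = Path .git
    • Path matches only that exact path relative to the sync root; use ignore = Name node_modules to match the directory at any depth.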
  • Restrict the sync tree: Use path preferences (e.g., path = src) to sync only selected subtrees, or move files that never need syncing out of the sync tree.
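
Put together, the points above might look like the profile written by this sketch. The profile name bigrepo, the roots, the host alias example, and the subtree names are all placeholders, not values from any real setup.

```shell
#!/bin/sh
# Write a hypothetical profile "bigrepo.prf" into the directory given as $1.
write_profile() {
  mkdir -p "$1" || return 1
  cat > "$1/bigrepo.prf" <<'EOF'
# Roots: local working copy and the same tree on the remote host (placeholders).
root = /home/me/src/bigrepo
root = ssh://example//home/me/src/bigrepo

# Scan only these subtrees (paths are relative to the roots).
path = services
path = docs

# Skip generated or fast-changing directories at any depth.
ignore = Name node_modules
ignore = Name .git
# Skip only this one build directory.
ignore = Path services/build
EOF
}

# Unison reads profiles from $UNISON if set, otherwise ~/.unison on most Unix systems.
write_profile "${UNISON:-$HOME/.unison}"
```

With the file in place, unison bigrepo would load it by name.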

3. Tune Unison profile and options

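  • Fast checks: Keep fastcheck enabled (the default on Unix) so unchanged files are detected from modification time and size instead of being re-read; setting times = true keeps modification times in sync across replicas. Note that prefer does not affect speed: it only decides which replica wins a conflict.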
  • Batch updates: Run Unison in batch mode for scripted runs: unison -batch profile.
  • Limit memory use: Unison's memory use grows with the number of files it tracks; if memory is constrained, split the sync into smaller roots or subfolders.
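
For scripted batch runs it helps to act on Unison's exit code. The wrapper below is a minimal sketch: the function names and the SYNC_CMD override are inventions for this example, while the meaning of codes 0 to 3 follows the Unison manual.

```shell
#!/bin/sh
# Map Unison's documented exit codes to a short message.
describe_exit() {
  case "$1" in
    0) echo "in sync" ;;
    1) echo "some files skipped, transfers ok" ;;
    2) echo "non-fatal transfer failures" ;;
    *) echo "fatal error or interrupted" ;;
  esac
}

# Run one profile unattended and report the outcome.
# SYNC_CMD defaults to unison; it can be overridden to test the wrapper.
run_batch() {
  "${SYNC_CMD:-unison}" "$1" -batch
  rc=$?
  echo "$1: $(describe_exit "$rc")"
  return "$rc"
}

# Usage (requires an existing profile, here the hypothetical "bigrepo"):
# run_batch bigrepo
```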

4. Parallelize and split work

  • Split large repository: Break sync into multiple profiles focused on subtrees to allow parallel runs.
  • Run multiple Unison instances: On multicore servers, run separate Unison processes for different subtrees to utilize CPU and I/O concurrency.
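
A simple way to run several profiles side by side is with shell background jobs, as sketched below. This assumes each profile has its own pair of roots, since runs that share the same roots would likely contend for the same archive; the profile names and the SYNC_CMD override are hypothetical.

```shell
#!/bin/sh
# Run one batch sync per profile in parallel and count failures.
# SYNC_CMD defaults to unison; override it to dry-run the wrapper.
run_parallel() {
  pids=""
  for profile in "$@"; do
    "${SYNC_CMD:-unison}" "$profile" -batch &
    pids="$pids $!"
  done
  failed=0
  for pid in $pids; do
    # wait returns the exit status of that background job.
    wait "$pid" || failed=$((failed + 1))
  done
  echo "$failed of $# runs reported a non-zero exit code"
  [ "$failed" -eq 0 ]
}

# Usage (hypothetical profile names, one per subtree):
# run_parallel repo-services repo-docs repo-assets
```

Start with two or three concurrent runs and watch disk and network load before adding more; beyond the point where the disk is saturated, extra processes only slow each other down.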

5. Optimize SSH

  • Connection reuse: Use ControlMaster in SSH config to reuse TCP connections:

    Host example
      ControlMaster auto
      ControlPath ~/.ssh/cm-%r@%h:%p
      ControlPersist 10m
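  • Compression: Enable SSH compression (-C, or Compression yes in SSH config) when CPU is cheap and the network is slow; leave it off if CPU is the bottleneck or the data is already compressed.
  • Cipher selection: If encryption limits throughput, try a faster cipher (e.g., -c aes128-ctr on the command line, or the Ciphers option in SSH config). Run ssh -Q cipher to list what your build supports, and measure before committing to one.
  • Keepalive: Set ServerAliveInterval so that firewalls and NAT devices do not drop the connection while it sits idle during long scans.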
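
The SSH settings from this section can be collected into one Host block. The helper below only prints such a block so it can be reviewed first; the alias, the host name sync.example.org, and the interval values are illustrative assumptions rather than recommendations.

```shell
#!/bin/sh
# Print an ssh_config Host block for alias $1 pointing at host name $2.
ssh_host_block() {
  cat <<EOF
Host $1
  HostName $2
  # Reuse one authenticated connection for later sessions.
  ControlMaster auto
  ControlPath ~/.ssh/cm-%r@%h:%p
  ControlPersist 10m
  # Compress traffic; drop this line if CPU is the bottleneck.
  Compression yes
  # Probe the server every 30 s; disconnect after 4 unanswered probes.
  ServerAliveInterval 30
  ServerAliveCountMax 4
EOF
}

# Review the output before appending it to ~/.ssh/config:
ssh_host_block example sync.example.org
```

Options that are easier to pass per run can also go through Unison's sshargs preference, for example sshargs = -C in the profile.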

6. Reduce transfer volume

  • Avoid transferring unchanged data: Unison already skips files whose contents have not changed, and with the rsync preference left at its default (true) it sends only the differing parts of large files. Avoid turning that off.
  • Use rsync for initial bulk: Seed the remote with an rsync copy for the initial sync, then use Unison for incremental two-way updates.
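
Seeding with rsync could look like the sketch below. The paths and host alias are placeholders, and the DRY_RUN switch is a convenience added for this example so the command can be inspected before it copies anything.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN is non-empty.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# One-way seed of $2 from $1. -a preserves modification times and
# permissions so the two replicas differ as little as possible; the
# excludes mirror the ignore list used for Unison.
seed_remote() {
  run rsync -a --exclude node_modules/ --exclude .git/ "$1" "$2"
}

# Usage (placeholder paths; the trailing slash copies directory contents):
# DRY_RUN=1 seed_remote /home/me/src/bigrepo/ example:/home/me/src/bigrepo/
# seed_remote /home/me/src/bigrepo/ example:/home/me/src/bigrepo/
```

The first Unison run after seeding still has to read both replicas to build its archives, so expect it to be slow even though it transfers very little.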

7. Network and I/O tuning

  • Increase socket buffers: Tune TCP window sizes on both ends for high-latency links.
  • Filesystem performance: Use SSDs or tuned filesystems, and ensure background tasks (indexers, antivirus) are minimized during sync.
  • Monitor I/O: Identify hotspots with iostat, iotop, or similar tools and adjust concurrency accordingly.

8. Monitoring and diagnostics

  • Verbose logs: Run unison profile -debug verbose (or -debug all for full detail) to see which operations take the most time.
  • Profile runs: Time separate phases (scan vs transfer) to know whether CPU, disk, or network is the bottleneck.
  • Iterate: Change one knob at a time and measure impact.
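
One rough way to separate scan cost from transfer cost is to time two consecutive runs: once the replicas are in sync, the second run has nothing to transfer and is dominated by scanning. The helper below is a sketch; time_run is a made-up name, and it relies on date +%s, which is widely available but not strictly POSIX.

```shell
#!/bin/sh
# Run a command and print its wall-clock duration in whole seconds to stderr.
time_run() {
  start=$(date +%s)
  "$@"
  rc=$?
  end=$(date +%s)
  echo "elapsed: $((end - start))s (exit code $rc)" >&2
  return "$rc"
}

# Usage: run the same (hypothetical) profile twice and compare the timings.
# time_run unison bigrepo -batch
# time_run unison bigrepo -batch
```

If the second run takes nearly as long as the first, scanning is the bottleneck and the ignore and path settings are the place to look; if it is much shorter, focus on the network and SSH settings.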

Quick checklist

  • Update Unison/SSH versions and match them.
  • Ignore large generated dirs (.git, node_modules, build).
  • Reuse SSH connections and pick appropriate ciphers/compression.
  • Split sync into subtrees and parallelize where safe.
  • Seed with rsync for initial bulk transfers.
  • Monitor scans vs transfers and optimize based on the bottleneck.
