Portfolio: Server-to-server transfer manager

This page describes a proprietary system, built while I was employed as a Linux system administrator and Linux architect, that transfers data between servers over encrypted channels, either on a schedule or when triggered by file activity.

Key details
Brief description: A system for point-to-point encrypted data transfer channels, each of which automatically synchronises files from source to destination when its trigger conditions are met.
Consumer: Data integration team, ERP development and support team, Financial controls team, Linux system administration team
Impact to consumer:
  • Improved security posture
  • Reduced complexity of batch processes
  • More reliable automated data transfers, with a robust exception alerting mechanism
  • Short time to deployment due to ease of integration with existing systems and processes
Technical features:
  • Endpoints run an agent written in C, with modular components segregating OpenSSL processing, rsync transfers, and the central daemon logic
  • Transfers run from agent to agent, only using the central management server to receive new configuration and to store a copy of transfer logs
  • Minimal overhead with no complicated frameworks or dependencies
  • Once a channel is set up, management of it can be delegated to the appropriate team
Technologies used: C, OpenSSL, rsync, Apache HTTP Server, HTML::Mason (Perl), MariaDB

Many corporate processes involve data being processed on one server and then transferred to another. A wide range of transfer mechanisms are in use, such as FTP, SFTP, SCP, and rsync. All of these involve security tradeoffs and some, like FTP, can be quite complex and fragile.

To reduce the support overhead and the general attack surface, I extended the general idea of rsync-over-SSH transfers into the server-to-server transfer manager. This uses an agent at both ends of the transfer, with a central server to manage the overall configuration and receive a copy of transfer logs for operators to review under a unified interface. Dozens of critical internal data flows now rely on this system.

Transfers are triggered either on a schedule (like a crontab), or by an external process creating or removing marker files; several types of marker file can apply to a transfer channel.
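As a rough illustration of the marker-file trigger, the sketch below uses hypothetical names (the watched directory and "transfer.go" file name are assumptions, not the agent's real conventions). The agent checks for a marker file and consumes it, so the trigger fires only once per request:

    /* Hypothetical sketch of a marker-file trigger check; the real
     * agent's internals and file names are not shown here. An external
     * process creates "transfer.go" to request a run, and the agent
     * consumes the marker so the trigger fires only once. */
    #include <limits.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    static int marker_triggered(const char *watch_dir)
    {
        char path[PATH_MAX];
        struct stat st;

        snprintf(path, sizeof(path), "%s/transfer.go", watch_dir);
        if (stat(path, &st) != 0)
            return 0;               /* no marker file: nothing to do */

        if (unlink(path) != 0) {    /* consume the marker */
            perror("unlink marker");
            return 0;
        }
        return 1;                   /* trigger met: start the transfer */
    }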

The central server polls each agent continuously, transmitting any configuration changes and retrieving any new transfer logs. Problems are presented in the web interface, and also exported as metrics files so that the Zabbix monitoring system can detect them and raise alerts.
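The metrics export might look something like the following sketch; the key names and layout here are assumptions, not the real file format. Writing to a temporary file and then renaming it keeps each update atomic, so the monitoring side never reads a half-written file:

    /* Sketch of exporting a channel's status as a metrics file; the
     * key names and layout are assumptions, not the real format.
     * Writing a temporary file and renaming it keeps the update
     * atomic, so the monitoring system never reads a partial file. */
    #include <limits.h>
    #include <stdio.h>

    static int export_metrics(const char *path, const char *channel,
                              int last_exit, long last_run)
    {
        char tmp[PATH_MAX];
        FILE *f;

        snprintf(tmp, sizeof(tmp), "%s.tmp", path);
        f = fopen(tmp, "w");
        if (f == NULL)
            return -1;
        fprintf(f, "channel=%s\nlast_exit=%d\nlast_run=%ld\n",
                channel, last_exit, last_run);
        if (fclose(f) != 0)
            return -1;
        return rename(tmp, path);   /* atomic replace on one filesystem */
    }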

Each agent listens for connections from the central server and from other agents. The OpenSSL libraries are used to provide encryption and certificate validation. Agents perform the network I/O in a dedicated, unprivileged subprocess, for security isolation.
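The privilege-separation pattern can be sketched as follows; the UID/GID values and function names are placeholders rather than the agent's actual code. The parent keeps its privileges while the child, which handles all network I/O, drops to an unprivileged account before touching the network:

    /* Sketch of the privilege-separation pattern; the UID/GID values
     * and names are placeholders. The parent keeps its privileges
     * while the child, which will perform all network I/O, drops to
     * an unprivileged account before touching the network. */
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <unistd.h>

    #define NET_UID 65534   /* placeholder: dedicated unprivileged user */
    #define NET_GID 65534

    static pid_t spawn_net_child(int sv[2])
    {
        pid_t pid;

        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
            return -1;

        pid = fork();
        if (pid == 0) {
            /* Drop the group first: setuid() removes the right to
             * change groups afterwards. */
            if (setgid(NET_GID) != 0 || setuid(NET_UID) != 0)
                _exit(1);
            close(sv[0]);           /* keep only the child's end */
            /* ... run the network I/O loop on sv[1] ... */
            _exit(0);
        }
        if (pid > 0)
            close(sv[1]);           /* parent keeps sv[0] */
        return pid;
    }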

When an agent runs a transfer, it communicates its intent to the agent at the other end over the encrypted connection, and both sides then launch the rsync tool in a subprocess with the UID, GID, and configuration file appropriate to the channel configuration. Progress is recorded in the transfer logs, and the exit status is captured to indicate whether the transfer was successful.
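A minimal sketch of that launch-and-capture step is below. A plain local copy stands in for the real agent-to-agent transfer, and the per-channel rsync configuration file is omitted for brevity; the function name and signature are illustrative only:

    /* Minimal sketch of launching rsync under a channel's credentials
     * and capturing its exit status. A plain local copy stands in for
     * the real agent-to-agent transfer, and the per-channel rsync
     * configuration file is omitted for brevity. */
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static int run_rsync(uid_t uid, gid_t gid,
                         const char *src, const char *dst)
    {
        int status;
        pid_t pid = fork();

        if (pid < 0)
            return -1;
        if (pid == 0) {
            /* Drop to the channel's UID/GID before exec'ing rsync. */
            if (setgid(gid) != 0 || setuid(uid) != 0)
                _exit(127);
            execlp("rsync", "rsync", "-a", src, dst, (char *)NULL);
            _exit(127);             /* reached only if exec failed */
        }
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        /* rsync exits 0 on success; other codes classify the failure */
        return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    }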

Figure: The configuration page of a server-to-server transfer channel

The figure above shows the configuration page of a server-to-server transfer channel in the management tool. The current status of the channel is shown at the top right; this is also reflected in the monitoring system via the exported metrics files.

The tool allows channel configuration to be delegated by listing Active Directory users or groups in the Editors and Viewers boxes. Viewer access is read-only, allowing those people to see only the channel's status and logs.

Source and destination servers are not selected directly in the channel configuration; instead, each end refers to a "server group". In practice most server groups contain only one server, but where a service is provided by an active/passive server pair, the server group mechanism allows the transfer channel to always point to the currently active node.
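A server group can be thought of as resolving to its currently active member. The sketch below uses hypothetical data structures to show the idea; the real system stores this in its MariaDB configuration database rather than in-memory structs:

    /* Sketch of server-group resolution with hypothetical data
     * structures: a channel names a group, and the group resolves to
     * whichever member is currently active. */
    #include <stddef.h>

    struct server {
        const char *hostname;
        int active;             /* maintained by role checks elsewhere */
    };

    struct server_group {
        struct server *members;
        size_t count;
    };

    static const char *active_host(const struct server_group *group)
    {
        for (size_t i = 0; i < group->count; i++)
            if (group->members[i].active)
                return group->members[i].hostname;
        return NULL;            /* no active member: channel cannot run */
    }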

Some of the rsync transfer options are surfaced in the Options section, as different teams use this transfer mechanism in different ways. Similarly, the glob patterns that rsync should match when transferring are also configurable here.