Fix rsync message authentication error
At home I run a Raspberry Pi based digital video recorder. To copy the recorded files to my server, I use a cron-triggered rsync script. While the same approach works fine in other setups, here I always got an "error in rsync protocol data stream (code 12) at io.c(226)". After finally looking deeper into it, I think I have an explanation and a solution.
In my scripts that rsync data from one system to another over SSH, I usually use a simple command line that tells rsync to open an SSH connection and run all communication over it:
rsync -av -e "ssh -l ssh-user" rsync-user@host::module /dest
Here I log in as a different user, "ssh-user", with certificate-based authentication. The command is taken straight from the rsync man page.
This usually worked very well. With one specific combination of source and target hosts, however, it always failed with an error message like this:
receiving incremental file list
./
Foo_bar_recording.ts
ssh_dispatch_run_fatal: Connection to host.server.name port 22: message authentication code incorrect
rsync: connection unexpectedly closed (60231093 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(226) [receiver=3.1.2]
rsync: connection unexpectedly closed (2321 bytes received so far) [generator]
rsync error: unexplained error (code 255) at io.c(226) [generator=3.1.2]
Looking at the error message, I suspected a networking problem, so I even tried changing hardware: switching the Ethernet ports on both ends (using a USB-based Ethernet adapter on the Raspberry Pi), replacing cables, and even replacing the hub between the two endpoints. Nothing helped.
I checked the software versions on both ends: rsync was the same, and SSH was 7.4p1 on one side and 7.5p1 on the other. I could not find any useful information on the internet; most reports were about errors at other line numbers in io.c, or about cases where zero bytes had been transferred.
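For anyone who wants to repeat the version comparison, here is a minimal sketch to run on each host. The helper name `tool_version` is my own invention for illustration; it only relies on `rsync --version` and `ssh -V`, which print a version banner.

```shell
#!/bin/sh
# Print the first line of a tool's version banner,
# or a short note if the tool is not installed.
# tool_version is a hypothetical helper name, not part of rsync or ssh.
tool_version() {
    tool="$1"
    flag="$2"
    if command -v "$tool" >/dev/null 2>&1; then
        # ssh -V writes its banner to stderr, so merge stderr into stdout.
        "$tool" "$flag" 2>&1 | head -n 1
    else
        echo "$tool: not installed"
    fi
}

tool_version rsync --version
tool_version ssh -V
```

Running this on both ends and comparing the two lines of output is enough to spot a version mismatch.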
Finally I looked more closely at how rsync works and what it expects from the command given to the "-e" option. The man page states that this is used "to run an rsync daemon on the remote host and all data will be transmitted through that remote shell connection". So I was doing everything "by the book" and still receiving the error. I had to be missing something.
rsync expects to use the remote shell connection to transmit all data, and that data must of course not be corrupted. I had assumed the corruption was happening on the hardware side, but what if it was on the software side? How does SSH open the remote connection when called like this? In the SSH man page I saw something about opening a pseudo-terminal for interactive sessions. Then I remembered that I had recently started using "setterm" to switch off the monitor even when on the console. For reference:
/usr/bin/setterm --blank 1 --powerdown 1 --powersave powerdown
So what would happen if SSH opened a pseudo-terminal and the terminal at some point injected escape sequences into the data stream to switch off the monitor? What helped was the SSH man page's remark on pseudo-terminals: SSH enables pseudo-terminals by default, and requests one for interactive sessions when the client has one. So when I ran the command on the command line to test, it would request a pseudo-terminal. When run from a cron script it would probably not even do that, as that is not interactive (an untested assumption).
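Since the cron part is an untested assumption, here is a small sketch of how it could be checked. The function name `stdin_kind` is made up for illustration; it uses the standard `[ -t 0 ]` test to ask whether standard input is attached to a terminal. The remote commands in the comments use the placeholder names from this post and the standard `tty` utility; I have not run them against my own hosts.

```shell
#!/bin/sh
# Report whether standard input of this process is attached to a terminal.
# stdin_kind is a hypothetical helper name used only in this sketch.
stdin_kind() {
    if [ -t 0 ]; then
        echo "tty"
    else
        echo "no tty"
    fi
}

stdin_kind

# The same question can be asked of the remote end of an SSH session
# (ssh-user and host are placeholders; run these from a terminal):
#   ssh -T -l ssh-user host tty    # should report "not a tty"
#   ssh -t -l ssh-user host tty    # should report a device such as /dev/pts/0
```

Calling the script once from an interactive shell and once from a cron job (with the output redirected to a log file) would show whether the two environments really differ.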
So what is the solution? I tried disabling pseudo-terminal allocation with the "-T" option. The man page states: "If no pseudo-terminal has been allocated, the session is transparent and can be used to reliably transfer binary data." And indeed, the error did not appear again! So I would recommend that the rsync man page be changed to include the "-T" option:
rsync -av -e "ssh -T -l ssh-user" rsync-user@host::module /dest
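For a cron job, the command could be wrapped roughly as follows. This is only a sketch, not my actual script: the variable values are the placeholders from the command line above, and the function names and the `RUN_TRANSFER` switch are hypothetical. The check for exit code 12 matches the "error in rsync protocol data stream" status documented in the rsync man page.

```shell
#!/bin/sh
# Hypothetical cron wrapper; the values below are the placeholders from this post.
SSH_USER="ssh-user"
RSYNC_USER="rsync-user"
HOST="host"
MODULE="module"
DEST="/dest"

# Build the remote-shell string for rsync's -e option.
# -T tells ssh not to allocate a pseudo-terminal.
remote_shell() {
    printf 'ssh -T -l %s' "$1"
}

# Pull the module's contents over the SSH connection.
pull_recordings() {
    rsync -av -e "$(remote_shell "$SSH_USER")" \
        "${RSYNC_USER}@${HOST}::${MODULE}" "$DEST"
}

# Only start a transfer when RUN_TRANSFER=1 is set (hypothetical switch),
# so the file can be inspected or sourced without touching the network.
if [ "${RUN_TRANSFER:-0}" = "1" ]; then
    pull_recordings
    status=$?
    if [ "$status" -eq 12 ]; then
        echo "rsync reported a protocol data stream error (code 12)" >&2
    fi
    exit "$status"
fi
```

From cron, the script would be started with RUN_TRANSFER=1 in its environment. Keeping the "-T" in one small function makes it easy to see, and to test, exactly what is handed to rsync's "-e" option.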
I hope this helps you find a solution to your own rsync problems. It took me long enough to figure out that it seemed worth writing up here.