Bundling a binary file into a shell script


When creating an auto-scaling group in EC2 I often try to package the deployment script into the user data. Installing some packaged software is easy to do but bundling configuration files that are needed is less straightforward. If the files are not confidential in any way, I either clone a Git repository or download a tarball from our static assets domain. But this leads to a dependency on external services and a slightly more complex deployment procedure. A few days ago I was faced with the same options again but it didn't sit right with me to do all this for a couple of files that are a few K's in size totally. I remembered that some software have installation scripts that bundle the binary blob inside the script.

First version

I searched and found an article in the Linux Journal that seemed to show what I wanted to (and seems to be copied everywhere). You could download a single file that was a shell script with the binary blob inside. Your usage will be close to this

wget http://hostname.tld/bundle
sh bundle

or this

wget http://hostname.tld/bundle
chmod +x bundle

Which is fine. However the code was a bit longer than it should have been and I felt it could be done better. A little more research and I found an answer in Stack Overflow that mentioned uuencode and uudecode. Reading the man page I saw it was closer to what I wanted. The code I wrote is available on my GitLab instance.

The implementation works as follows. The bundle has the script at the start of the file with the encoded binary at the end. The shell executes the script part (which ends with exit as to not continue any further, causing errors) and uudecode only starts processing after it sees the relevant header. The script feeds itself to uudecode (uudecode "$0") which decodes the binary and outputs it to disk which the script can then use. The code has both the build instruction in the Makefile and usage example in the bats tests.

Second version

However something kept nagging me. I wanted a simple invocation method like so:

curl http://hostname.tld/bundle | sh

And in the case of the user data in EC2, I could simply use the bundle. Otherwise I would need to host it somewhere and in the user data I would download and run the bundle. Which means that if the bundle was unavailable the instance would fail to provision.

Everything I found assumed that the file was present in the file system for uudecode to decode. If it was piped there was no file that uudecode could then decode. I kept mauling over it and a came up with a short, clean solution to this problem, which is available here, again with build instruction and test examples.

This time I used AWK to replace a single line in the script with the file, encoded using uuencode but this time in base64 (to keep the script valid without any characters with special meanings). That is piped to uudecode which decodes and saves it to disk. The script can then continue with the binary blob present.

This method is less space efficient and the build procedure is less obvious. But the ability to use resulting script as the user data (or piping the output from curl to sh) is worth it in my opinion.