Retrieving data from hijacked PNG images using HTML canvas and JavaScript

Two weeks ago I wrote an article, Hijacking HTML canvas and PNG images to store arbitrary text data. This is the complementary article that deals with extracting data back out of these images. As a summary, in the first article I described how arbitrary text data (or JSON in my specific case) could be stored as pixels in a PNG image. This required converting input data from JavaScript's Strings to a Uint8Array and then storing bytes using three channels (RGB) within an ImageData object before drawing this image to a Canvas and saving as a PNG file. So let's see what's involved in doing the reverse of this process and getting back the original data...

At the end of the last article we had an image that looked something like this...

This image stores a serialised JSON object that has 3700 indexed properties each holding a String value 'The quick brown fox jumps over the lazy dog'.

Loading the Image

The image is loaded by creating an Image object and setting its src attribute to the PNG image from the previous article. In a real use case the src attribute can be set to a URL or even a data URL. This example keeps it very simple and references a file.

JavaScript

var img = new Image();

img.onload = function() {

...

};

img.src = 'image.png';

The onload() callback is where all of the decoding code goes. For now this is an empty function, which will be filled out in the following sections.
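If you prefer not to nest everything inside the callback, the same loading step can be wrapped in a Promise. This is only a sketch: the loadImage name and the optional ImageCtor parameter are my own additions for illustration (the parameter exists purely so the helper can be tried outside a browser), and the onerror handler is something the original example does not have.

```javascript
// Hypothetical helper: resolves with the loaded image, rejects on failure.
// ImageCtor defaults to the browser's Image constructor.
function loadImage(src, ImageCtor) {
  var Ctor = ImageCtor || Image;
  return new Promise(function (resolve, reject) {
    var img = new Ctor();
    img.onload = function () { resolve(img); };
    img.onerror = function () { reject(new Error('Could not load ' + src)); };
    // Set src last so both handlers are attached before loading starts.
    img.src = src;
  });
}

// Browser usage:
// loadImage('image.png').then(function (img) { /* decoding code goes here */ });
```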

Drawing Image to Canvas

Once the image is loaded we need to access its pixel data which means it has to be drawn on an off-screen canvas first. A canvas with the same dimensions as the image (assuming everything is square) is created and the image is then drawn to the 2D context.

JavaScript

var imgSize = img.width;

var canvas = document.createElement('canvas');

canvas.width = canvas.height = imgSize;

var ctx = canvas.getContext('2d');

ctx.drawImage(img, 0, 0);

Converting Pixels to a Byte Array

From the previous article, the maximum size of the source image was restricted to 256 pixels square and the first row of pixels was used to encode the size of the square that held actual data, leaving 255 rows to work with. This meant that the last column of pixels was also wasted which left us a 255 pixel square to store data into. The actual size of the data square could vary and the red component of the first pixel in the source image was used to store the size of this square.
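To put some numbers on that layout, here is a small sketch of the arithmetic, assuming three payload bytes per pixel as described above. The function names are mine, and squareSideFor only shows one plausible way an encoder could pick the square size; it is not necessarily what the code in the previous article does.

```javascript
// Each pixel carries 3 payload bytes (R, G, B); alpha is not used for data.
var BYTES_PER_PIXEL = 3;
// The size is stored in a single byte, so the data square is at most 255 wide.
var MAX_SIDE = 255;

// Largest payload that fits in the biggest allowed data square.
function maxPayloadBytes() {
  return MAX_SIDE * MAX_SIDE * BYTES_PER_PIXEL;
}

// Smallest square side (in pixels) that can hold byteLength bytes.
function squareSideFor(byteLength) {
  var pixels = Math.ceil(byteLength / BYTES_PER_PIXEL);
  return Math.ceil(Math.sqrt(pixels));
}

console.log(maxPayloadBytes()); // 195075
```

So a single image tops out at a little under 200 KB of data.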

So we simply read the first pixel's worth of data and grab the first byte to get the data size back.

JavaScript

var headerData = ctx.getImageData(0, 0, 1, 1);

var dataSize = headerData.data[0];

Once the data size is known, the data square can be fetched. Remembering that this data is stored as RGBA pixels where the alpha value is always set to 255 (full opacity), we need to create a Uint8Array that is big enough to hold just the RGB data.

JavaScript

var imageData = ctx.getImageData(0, 1, dataSize, dataSize);

var paddedData = imageData.data;

var uint8array = new Uint8Array(paddedData.length / 4 * 3);

We have to skip every 4th byte and copy blocks of 3-byte values to the new array. This serves a double purpose: since the image data is returned as a Uint8ClampedArray and we want a standard Uint8Array, this one loop both skips the alpha channel data and does the conversion to the correct data type!

JavaScript

var idx = 0;

for (var i = 0; i < paddedData.length - 1; i += 4) {

var subArray = paddedData.subarray(i, i + 3);

uint8array.set(subArray, idx);

idx += 3;

}
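The same step can also be written as a standalone function, which makes it easy to try out without a canvas. This is a sketch under the assumption that the input length is a multiple of 4 (whole RGBA pixels); the stripAlpha name is mine.

```javascript
// Copies the R, G and B bytes of every RGBA pixel into a new Uint8Array,
// dropping the alpha byte. Accepts the Uint8ClampedArray returned by
// getImageData() or any other array of bytes.
function stripAlpha(rgba) {
  var rgb = new Uint8Array(rgba.length / 4 * 3);
  for (var src = 0, dst = 0; src < rgba.length; src += 4, dst += 3) {
    rgb[dst] = rgba[src];         // red
    rgb[dst + 1] = rgba[src + 1]; // green
    rgb[dst + 2] = rgba[src + 2]; // blue
  }
  return rgb;
}

// Two fully opaque pixels: (72, 105, 33) and (0, 0, 0).
var pixels = new Uint8ClampedArray([72, 105, 33, 255, 0, 0, 0, 255]);
var rgbOnly = stripAlpha(pixels); // holds 72, 105, 33, 0, 0, 0
```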

At this point we'll have an array that has all of our data plus a whole lot of zero-padded data at the end, which also needs to be skipped. We need to find where in the array the real data ends and the zero padding begins, so we just loop over the array from the end until we hit the first non-zero byte.

JavaScript

var includeBytes = uint8array.length;

// Walk backwards from the end, dropping zero bytes until real data is found

for (var i = uint8array.length - 1; i >= 0; i--) {

if (uint8array[i] === 0) {

includeBytes--;

}

else {

break;

}

}
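For reference, the trimming step can be packaged as a small function. This sketch assumes the payload itself never ends in a zero byte, which is true for UTF-8 encoded JSON text but would not hold for arbitrary binary data. The trimTrailingZeros name is mine.

```javascript
// Returns a view of bytes without the trailing run of zero bytes.
// subarray() does not copy; it shares memory with the original array.
function trimTrailingZeros(bytes) {
  var end = bytes.length;
  while (end > 0 && bytes[end - 1] === 0) {
    end--;
  }
  return bytes.subarray(0, end);
}

var padded = new Uint8Array([72, 105, 0, 33, 0, 0, 0]);
var trimmed = trimTrailingZeros(padded); // holds 72, 105, 0, 33
```

Zero bytes in the middle of the data are left alone; only the padding at the end is removed.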

Decoding the Byte Array

To get the original String value back the TextDecoder.decode() function is used on the subarray that excludes zero padded data.

JavaScript

var data = uint8array.subarray(0, includeBytes);

var strData = (new TextDecoder('utf-8')).decode(data);

That's it! Now the strData variable holds the original string data that was encoded into the PNG image. In my case it was JSON so I could easily convert it back to an object using JSON.parse().
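To see the whole thing end to end without a browser, here is a sketch that runs the same steps over a raw RGBA buffer, such as the one you would get from ctx.getImageData(0, 0, width, width).data. The decodePixels name and the hand-built 3x3 test image are mine, made up purely for illustration.

```javascript
// rgba:  RGBA bytes of the whole image, row by row
// width: image width in pixels
function decodePixels(rgba, width) {
  var dataSize = rgba[0]; // red channel of the first pixel
  var bytes = new Uint8Array(dataSize * dataSize * 3);
  var dst = 0;
  // The data square starts on the second row (y = 1) at x = 0.
  for (var y = 1; y <= dataSize; y++) {
    for (var x = 0; x < dataSize; x++) {
      var src = (y * width + x) * 4;
      bytes[dst++] = rgba[src];     // red
      bytes[dst++] = rgba[src + 1]; // green
      bytes[dst++] = rgba[src + 2]; // blue (alpha is skipped)
    }
  }
  // Drop the zero padding at the end, then decode as UTF-8.
  var end = bytes.length;
  while (end > 0 && bytes[end - 1] === 0) end--;
  return new TextDecoder('utf-8').decode(bytes.subarray(0, end));
}

// Build a tiny 3x3 test image by hand: a header row plus a 2x2 data square.
var payload = new TextEncoder().encode('{"a":1}'); // 7 bytes
var width = 3, side = 2;
var image = new Uint8ClampedArray(width * width * 4);
for (var p = 0; p < width * width; p++) image[p * 4 + 3] = 255; // opaque
image[0] = side; // data size goes in the red channel of the first pixel
for (var i = 0; i < payload.length; i++) {
  var pixel = Math.floor(i / 3);
  var px = pixel % side, py = 1 + Math.floor(pixel / side);
  image[(py * width + px) * 4 + (i % 3)] = payload[i];
}

var decoded = JSON.parse(decodePixels(image, width)); // an object with a === 1
```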

As I stated in the previous article, this is not the most efficient or best code and not the best way of storing data, but it works and it met the needs of my project. I also went a little further in my implementation and stored a specific sequence of pixels in the first row that fingerprinted the images as decodable. If an image did not have this sequence of pixels, the decoding code would reject it.
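A fingerprint check along those lines could look something like the sketch below. Everything specific in it is an assumption made for illustration: the signature bytes, their position in the first row, and the hasFingerprint name are all hypothetical and are not the values from my implementation.

```javascript
// Checks whether the RGB channels of the first row contain the expected
// signature bytes, starting at the given pixel. Alpha bytes are skipped.
// firstRowRgba: RGBA bytes of the first image row
// signature:    array of expected byte values (hypothetical)
// pixelOffset:  index of the pixel where the signature starts (hypothetical)
function hasFingerprint(firstRowRgba, signature, pixelOffset) {
  var start = pixelOffset * 4;
  for (var i = 0; i < signature.length; i++) {
    var idx = start + Math.floor(i / 3) * 4 + (i % 3);
    if (firstRowRgba[idx] !== signature[i]) {
      return false; // mismatch, or the row is too short
    }
  }
  return true;
}

// First pixel holds the data size; the made-up signature starts at pixel 1.
var firstRow = new Uint8ClampedArray([5, 0, 0, 255, 1, 2, 3, 255, 4, 0, 0, 255]);
var ok = hasFingerprint(firstRow, [1, 2, 3, 4], 1);  // true
var bad = hasFingerprint(firstRow, [9, 9, 9, 9], 1); // false
```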

It's always nice being able to take a technology that was created for one purpose and distort it in a useful way to meet another purpose. Hopefully this is useful to others too. If you end up using this in your work, do let me know!

-i