Why You Should Avoid Base64 Data URLs Within WordPress Posts
I was working on a project that lets users submit posts for approval through an external editor, outside the wp-admin/edit.php page. This made it possible to submit images within posts as Base64 data URLs, which at some point maxed out PHP's memory because someone submitted a post with an embedded high-quality 5 MB image. Let us talk a little about Base64-encoded data URIs.
Base64 & Data URLs
Base64 encoding schemes are commonly used when there is a need to encode binary data, especially when that data needs to be stored and transferred over media that are designed to deal with text. This encoding helps to ensure that the data remains intact without modification during transport. Base64 is used commonly in a number of applications including email via MIME, as well as storing complex data in XML or JSON – www.base64decode.org
Data URLs were created to embed small media content in-line in a web page. The browser decodes the Base64 data to get the binary content and interprets it using the provided MIME type. The format is data:[<mediatype>][;base64],<data>, where <mediatype> is the MIME type and <data> is the Base64-encoded data.
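As a rough illustration of the format, the following shell sketch builds a data URL from a file. The function name `to_data_url` and the sample payload are made up for this example. The `tr -d '\n'` step is there because some `base64` implementations wrap their output across lines.

```shell
# Hypothetical helper: print a data URL for a file, given its MIME type.
to_data_url() {
    mime="$1"
    file="$2"
    # Strip newlines, since GNU base64 wraps its output at 76 columns by default.
    printf 'data:%s;base64,%s\n' "$mime" "$(base64 < "$file" | tr -d '\n')"
}

# Example: a tiny text payload standing in for an image.
sample="$(mktemp)"
printf 'Hello World' > "$sample"
to_data_url text/plain "$sample"
# Prints: data:text/plain;base64,SGVsbG8gV29ybGQ=
rm -f "$sample"
```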
The Problem
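Base64 increases the size of the data by roughly a third: every 3 bytes of input become 4 characters of output, and the last group is padded to a full 4 characters. For very short inputs, the padding and the trailing newline make the growth look even larger, as the character counts below show.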
```
$ echo 'Hello World' | wc -c
# Outputs: 12

$ echo 'Hello World' | base64 | wc -c
# Outputs: 17

$ echo 'Foo' | wc -c
# Outputs: 4

$ echo 'Foo' | base64 | wc -c
# Outputs: 9
```
A real-life example is this red dot image, which is 85 bytes when decoded and 116 bytes when encoded (117 with the trailing newline that echo adds).

```
$ RED_DOT='iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg=='

$ echo "$RED_DOT" | base64 --decode | wc -c
# Outputs: 85

$ echo "$RED_DOT" | wc -c
# Outputs: 117
```

The size grew by about 36%. My point is this: this is just a red dot of a few bytes. A 200 KB image grows to roughly 267 KB once encoded, a 5 MB image to roughly 6.7 MB, and all of this data is held in PHP's memory while it is read and written to the database. A lot of these encoded images in posts will grow your database drastically and could max out PHP's memory if care is not taken. It is not only about your server: it also takes more bandwidth for users to load your website.
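For reference, the encoded size can be estimated without running the encoder: Base64 emits 4 characters for every 3 input bytes, rounding the last group up to a full 4 characters. A small sketch of that arithmetic follows. The function name `b64_len` is made up for illustration, and the result ignores any line wrapping the encoder may add.

```shell
# Hypothetical helper: number of characters in the padded Base64 encoding
# of N bytes, without line breaks.
b64_len() {
    echo $(( ( ($1 + 2) / 3 ) * 4 ))
}

b64_len 85        # the red dot PNG  -> 116
b64_len 204800    # a 200 KB image   -> 273068
b64_len 5242880   # a 5 MB image     -> 6990508
```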
The Solution – ImageBase642File Plugin
I made this plugin to automatically catch encoded images in WordPress posts when they are created or edited. It will:
- Look for Base64 data URIs within the post content
- Read and decode them chunk by chunk, no matter the size, to avoid maxing out PHP's memory
- Write each chunk to a file until all the data is decoded
- Replace each Base64 URI with the URL of the resulting image attachment, and attach the image to the post in WordPress
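The plugin itself runs inside WordPress. The shell sketch below only illustrates the chunked-decoding idea from the list above and is not the plugin's actual code. It assumes the Base64 payload is stored in a file as a single line, and the function name `decode_in_chunks` and the 4096-character chunk size are arbitrary choices. The one real constraint is that the chunk length must be a multiple of 4, because Base64 decodes in groups of 4 characters.

```shell
# Hypothetical helper: decode a Base64 file in fixed-size chunks.
# $1 = file holding the Base64 text (single line), $2 = output file.
decode_in_chunks() {
    : > "$2"    # start with an empty output file
    # fold splits the text into lines of at most 4096 characters (a multiple
    # of 4), so each chunk is valid Base64 and can be decoded on its own.
    fold -w 4096 "$1" | while IFS= read -r chunk || [ -n "$chunk" ]; do
        printf '%s' "$chunk" | base64 --decode >> "$2"
    done
}

# Usage: decode a small payload and show the result.
b64_file="$(mktemp)"
out_file="$(mktemp)"
printf 'SGVsbG8gV29ybGQ=' > "$b64_file"
decode_in_chunks "$b64_file" "$out_file"
cat "$out_file"    # Hello World
rm -f "$b64_file" "$out_file"
```

Only one chunk is held in memory at a time, so memory use stays flat regardless of the image size.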
For example, it converts post content from this:

```
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==" alt="Red dot" />
Nec feugiat nisl pretium fusce.
```
To this:

```
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
<img src="https://site.com/wp-content/uploads/2021/05/post_title_id_no.png" alt="Red dot" />
Nec feugiat nisl pretium fusce.
```
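That substitution can be approximated in shell as follows, assuming the image has already been decoded and saved. The function name `replace_data_uri` and the pattern are illustrative only. A real implementation would work on the post content inside WordPress and give each image its own URL.

```shell
# Hypothetical helper: replace the first Base64 image data URI on each line
# of stdin with a URL. $1 is the replacement URL (must not contain | or &).
replace_data_uri() {
    sed -E "s|src=\"data:image/[a-zA-Z0-9.+-]+;base64,[A-Za-z0-9+/=]+\"|src=\"$1\"|"
}

echo '<img src="data:image/png;base64,iVBORw0KGgo=" alt="Red dot" />' |
    replace_data_uri 'https://site.com/wp-content/uploads/2021/05/post_title_id_no.png'
# Prints: <img src="https://site.com/wp-content/uploads/2021/05/post_title_id_no.png" alt="Red dot" />
```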
This drastically reduces the size of the post content and saves database disk space. The plugin can be found in the WordPress plugin directory, with its GitHub repo here, which I maintain. I look forward to your comments 🙂
When the image is really small, Base64 encoding can be an advantage, because the browser gets the image together with the page and does not need a separate request to load it. But for an image file larger than 50 KB, it is not advisable in my opinion.