Migrating WordPress Gutenberg blocks to Statamic
Migrating data between CMSs is, in most cases, the reason why you don't change your CMS. I wanted to find out how difficult it would be to migrate from WP to Statamic.
There are two great guides about migrating WordPress data to Statamic out there, and they handle the problem in slightly different ways:
The one by Lucky Media uses Corcel and a CLI command
The second, by Stopa Development, is a set of plugins for WP and Statamic that migrates the data
Good as they are, both treat content as if Gutenberg never existed.
The Block Editor aka Gutenberg
With the WordPress 5.0 release, Gutenberg became the new content editor and replaced TinyMCE. It changed a lot about how content is created. With TinyMCE, the content was just a blob of HTML that we could simply import into Statamic.
Gutenberg is more similar to Bard, where everything consists of blocks.
With both editors sharing a similar approach to handling content, it would be great to move some blocks directly to Bard sets.
The problem
The only catch is the way Gutenberg saves data:
<!-- wp:paragraph -->
<p>Gesha coffee, also known as Geisha coffee, is a remarkable and highly sought-after variety that has taken the coffee world by storm. Originally discovered in the remote Gesha village of Ethiopia, this coffee has gained worldwide recognition for its exceptional quality and unique flavor profile.</p>
<!-- /wp:paragraph -->
<!-- wp:acf/test {"name":"acf/test","data":{"name":"Maciek Palmowski","_name":"field_66c11a6ea3876","is_important":"1","_is_important":"field_66c11a87a3877","bigger_text":"This is a very important content\r\n\r\nThat's the message","_bigger_text":"field_66c11aa6a3878"},"mode":"preview"} /-->
In this form it isn't easy to parse outside of WordPress. Luckily, WordPress has a parse_blocks function that allows you to convert this into a more readable format:
"block_data": [
{
"blockName": "core/paragraph",
"attrs": {
"align": "",
"content": "Gesha coffee, also known as Geisha coffee, is a remarkable and highly sought-after variety that has taken the coffee world by storm. Originally discovered in the remote Gesha village of Ethiopia, this coffee has gained worldwide recognition for its exceptional quality and unique flavor profile.",
"dropCap": false,
"placeholder": "",
"direction": "",
"lock": [],
"metadata": [],
"style": [],
"backgroundColor": "",
"textColor": "",
"gradient": "",
"className": "",
"fontSize": "",
"fontFamily": "",
"anchor": ""
},
"innerBlocks": [],
"innerHTML": "\n<p>Gesha coffee, also known as Geisha coffee, is a remarkable and highly sought-after variety that has taken the coffee world by storm. Originally discovered in the remote Gesha village of Ethiopia, this coffee has gained worldwide recognition for its exceptional quality and unique flavor profile.</p>\n",
"innerContent": [
"\n<p>Gesha coffee, also known as Geisha coffee, is a remarkable and highly sought-after variety that has taken the coffee world by storm. Originally discovered in the remote Gesha village of Ethiopia, this coffee has gained worldwide recognition for its exceptional quality and unique flavor profile.</p>\n"
],
"rendered": "\n<p>Gesha coffee, also known as Geisha coffee, is a remarkable and highly sought-after variety that has taken the coffee world by storm. Originally discovered in the remote Gesha village of Ethiopia, this coffee has gained worldwide recognition for its exceptional quality and unique flavor profile.</p>\n"
},
{
"blockName": "acf/test",
"attrs": {
"name": "acf/test",
"data": {
"name": "Maciek Palmowski",
"_name": "field_66c11a6ea3876",
"is_important": "1",
"_is_important": "field_66c11a87a3877",
"bigger_text": "This is a very important content\r\n\r\nThat's the message",
"_bigger_text": "field_66c11aa6a3878"
},
"mode": "preview",
"align": "",
"lock": [],
"metadata": [],
"className": "",
"anchor": ""
},
"innerBlocks": [],
"innerHTML": "",
"innerContent": [],
"rendered": "<h1>Render something</h1>"
}
]
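To get a feel for why the raw format is awkward outside of WordPress, here is a rough sketch that pulls block names and JSON attributes out of the comment delimiters with a regular expression. It is a simplified illustration, not WordPress's parser: the listBlockOpeners function and its pattern are my own assumptions, and it ignores closing delimiters, nested blocks, and any HTML between blocks.

```php
<?php

/**
 * Lists the opening (or self-closing) block delimiters found in raw post content.
 * Hypothetical helper for illustration; it does not build innerBlocks or innerHTML.
 */
function listBlockOpeners(string $postContent): array
{
    // <!-- wp:name {optional JSON} --> or <!-- wp:namespace/name {optional JSON} /-->
    $pattern = '/<!--\s+wp:([a-z][a-z0-9_-]*(?:\/[a-z][a-z0-9_-]*)?)\s+(\{.*?\}\s+)?(\/)?-->/s';
    preg_match_all($pattern, $postContent, $matches, PREG_SET_ORDER);

    $blocks = [];
    foreach ($matches as $m) {
        // Core blocks are stored without a namespace, e.g. "paragraph".
        $name = str_contains($m[1], '/') ? $m[1] : 'core/' . $m[1];
        // The attributes group is missing or empty when the block has none.
        $attrs = ($m[2] ?? '') !== '' ? json_decode(trim($m[2]), true) : [];
        $blocks[] = ['blockName' => $name, 'attrs' => $attrs ?? []];
    }

    return $blocks;
}

$sample = <<<'HTML'
<!-- wp:paragraph -->
<p>Gesha coffee</p>
<!-- /wp:paragraph -->
<!-- wp:acf/test {"name":"acf/test","data":{"name":"Maciek Palmowski","is_important":"1"},"mode":"preview"} /-->
HTML;

$blocks = listBlockOpeners($sample);
foreach ($blocks as $block) {
    echo $block['blockName'], PHP_EOL;
}
```

Even this small version needs special handling for namespaces and optional attributes, which is a good argument for letting WordPress do the parsing.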
Sadly, while TipTap is amazing, moving structured data into it isn't as straightforward as you might expect.
What we want to achieve
We'll create a CLI command that:
gets the post data from the REST API
converts some of the blocks into Bard sets
converts the rest from their rendered HTML
saves the result as an entry in a Statamic collection
WordPress preparations
Before we start, we'll need to install the wp-rest-blocks plugin by Jonny Harris. You can find it on GitHub. There is also a similar plugin called VIP Block Data API by Automattic, but I prefer the one by Jonny.
Thanks to this, we'll get this nice structured data in the REST API. After the installation, you'll get access to the block_data node inside your API.
Also, make sure that every post type you want to migrate is accessible via the REST API.
Creating the CLI command
I went with the Lucky Media approach, except that instead of using Corcel, I decided to use Guzzle and fetch the data from the REST API.
Creating the command is as simple as running:
php artisan make:command ImportWordPress
Grabbing the data
This is a very simple scenario, where I just want to grab content from posts:
<?php
namespace App\Console\Commands;
use Illuminate\Console\Command;
use GuzzleHttp\Client;
use Statamic\Facades\Entry;
use Carbon\Carbon;
class ImportWordPress extends Command
{
/**
* The name and signature of the console command.
*
* @var string
*/
protected $signature = 'import:wp';
/**
* The console command description.
*
* @var string
*/
protected $description = 'Import posts from WordPress';
/**
* Execute the console command.
*/
public function handle()
{
foreach ($this->getPosts() as $post) {
$this->info("Importing post: {$post['title']['rendered']}");
$entry = Entry::make()
->collection('posts')
->slug($post['slug'])
->date(Carbon::createFromFormat('Y-m-d\TH:i:s', $post['date']))
->data([
'title' => $post['title']['rendered'],
'content' => 'our future content will be here'
]);
$entry->save();
}
}
public function getPosts()
{
$client = new Client();
$apiUrl = "http://wpapi.test/wp-json/wp/v2/posts";
try {
$response = $client->get($apiUrl);
$data = json_decode($response->getBody(), true);
return $data;
} catch (\Exception $e) {
$this->error("Error: " . $e->getMessage());
// Return an empty list so the foreach in handle() has nothing to iterate.
return [];
}
}
}
The getPosts method takes care of grabbing posts from the API and returning them.
Apart from grabbing the data, in this step we're also saving some basic data into Statamic, like the slug and title. I won't focus more on those because this was already covered in the Lucky Media tutorial.
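One thing to keep in mind: the WordPress REST API is paginated. It returns 10 posts per request by default (per_page can be raised to at most 100) and reports the number of pages in the X-WP-TotalPages response header, so a single request only covers the first page. Below is a minimal sketch of a paging loop. The collectAllPages function and the shape of its callable are assumptions made for illustration, and the HTTP call is replaced by a stand-in so the example runs on its own.

```php
<?php

/**
 * Collects items from every page of a paginated source.
 *
 * $fetchPage is a hypothetical callable: given a 1-based page number, it
 * returns ['items' => array, 'totalPages' => int].
 */
function collectAllPages(callable $fetchPage): array
{
    $all = [];
    $page = 1;

    do {
        $result = $fetchPage($page);
        $all = array_merge($all, $result['items']);
        // Guard against a missing or zero page count.
        $totalPages = max(1, (int) $result['totalPages']);
        $page++;
    } while ($page <= $totalPages);

    return $all;
}

// Stand-in for the HTTP call: 25 fake posts served 10 per page.
$fakePosts = array_map(fn (int $i) => ['slug' => "post-{$i}"], range(1, 25));
$fetchFake = fn (int $page) => [
    'items' => array_slice($fakePosts, ($page - 1) * 10, 10),
    'totalPages' => (int) ceil(count($fakePosts) / 10),
];

$posts = collectAllPages($fetchFake);
echo count($posts), PHP_EOL;
```

With Guzzle, the callable would pass ['query' => ['page' => $page, 'per_page' => 100]] as the second argument to $client->get() and read the page count with $response->getHeaderLine('X-WP-TotalPages').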
Grabbing the default blocks
First, let's start with the blocks that we want to copy as they are, letting TipTap handle the rest.
<?php
namespace App\Console\Commands;
use Illuminate\Console\Command;
use GuzzleHttp\Client;
use Statamic\Facades\Entry;
use Carbon\Carbon;
use Tiptap\Editor;
class ImportWordPress extends Command
{
/**
* The name and signature of the console command.
*
* @var string
*/
protected $signature = 'import:wp';
/**
* The console command description.
*
* @var string
*/
protected $description = 'Import posts from WordPress';
/**
* Execute the console command.
*/
public function handle()
{
foreach ($this->getPosts() as $post) {
$value = [];
$this->info("Importing post: {$post['title']['rendered']}");
foreach ($post['block_data'] as $block) {
$value[] = match($block['blockName']) {
default => $this->parseDefaultBlock($block),
};
}
$entry = Entry::make()
->collection('posts')
->slug($post['slug'])
->date(Carbon::createFromFormat('Y-m-d\TH:i:s', $post['date']))
->data([
'title' => $post['title']['rendered'],
'content' => $value
]);
$entry->save();
}
}
public function getPosts()
{
$client = new Client();
$apiUrl = "http://wpapi.test/wp-json/wp/v2/posts";
try {
$response = $client->get($apiUrl);
$data = json_decode($response->getBody(), true);
return $data;
} catch (\Exception $e) {
$this->error("Error: " . $e->getMessage());
// Return an empty list so the foreach in handle() has nothing to iterate.
return [];
}
}
public function parseDefaultBlock($block)
{
$tmp_value = (new \Tiptap\Editor)
->setContent($block['rendered'])
->getJSON();
return json_decode($tmp_value, true)['content'][0];
}
}
With the parseDefaultBlock method, we're grabbing the rendered HTML of the block, converting it to TipTap's JSON, and decoding that into an array. This way, we're recreating TipTap's data structure. This should work in most cases. There is a chance you'll need some additional extensions to handle some types of data, but all of this is covered in the tutorial I mentioned earlier.
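Note that ['content'][0] keeps only the first top-level node of the converted document. If a block renders to several sibling elements, or to nothing at all, it may be safer to append every node. Here is a minimal sketch, assuming getJSON() returns a document with a top-level content array, as the code above already relies on. The topLevelNodes helper is hypothetical, and the JSON is hand-written to stand in for the editor's output.

```php
<?php

/**
 * Returns every top-level node of a TipTap document given as a JSON string,
 * or an empty array when the document has no content.
 */
function topLevelNodes(string $tiptapJson): array
{
    $doc = json_decode($tiptapJson, true);

    return $doc['content'] ?? [];
}

// Hand-written document standing in for the output of getJSON().
$json = '{"type":"doc","content":['
    . '{"type":"paragraph","content":[{"type":"text","text":"First"}]},'
    . '{"type":"paragraph","content":[{"type":"text","text":"Second"}]}'
    . ']}';

$value = [];
array_push($value, ...topLevelNodes($json));            // appends both paragraphs
array_push($value, ...topLevelNodes('{"type":"doc"}')); // appends nothing
echo count($value), PHP_EOL;
```

In the command, this would mean appending the returned nodes to $value inside the loop instead of assigning a single match result per block.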
Creating sets
TipTap has a way to create custom sets. Sadly, it focuses on parsing HTML to grab data, a step that we don't need because we already have structured data.
That is why I had to reverse-engineer the whole process a bit. Let's go step by step.
In WordPress, we have a block called acf/test and it consists of three data fields:
name - string
is_important - boolean
bigger_text - string
Before going forward, we have to create a similar set in Statamic.
Such a set will be saved as:
-
  type: set
  attrs:
    values:
      type: test_data
      name: 'Maciek Palmowski'
      is_important: '1'
      bigger_text: "This is a very important content\r\n\r\nThat's the message"
Knowing this, and knowing that the Bard set is called test_data, I could create:
<?php
namespace App\Console\Commands;
use Illuminate\Console\Command;
use GuzzleHttp\Client;
use Statamic\Facades\Entry;
use Carbon\Carbon;
use Tiptap\Editor;
class ImportWordPress extends Command
{
/**
* The name and signature of the console command.
*
* @var string
*/
protected $signature = 'import:wp';
/**
* The console command description.
*
* @var string
*/
protected $description = 'Import posts from WordPress';
/**
* Execute the console command.
*/
public function handle()
{
foreach ($this->getPosts() as $post) {
$value = [];
$this->info("Importing post: {$post['title']['rendered']}");
foreach ($post['block_data'] as $block) {
$value[] = match($block['blockName']) {
'acf/test' => $this->parseTestBlock($block),
default => $this->parseDefaultBlock($block),
};
}
$entry = Entry::make()
->collection('posts')
->slug($post['slug'])
->date(Carbon::createFromFormat('Y-m-d\TH:i:s', $post['date']))
->data([
'title' => $post['title']['rendered'],
'content' => $value
]);
$entry->save();
}
}
public function getPosts()
{
$client = new Client();
$apiUrl = "http://wpapi.test/wp-json/wp/v2/posts";
try {
$response = $client->get($apiUrl);
$data = json_decode($response->getBody(), true);
return $data;
} catch (\Exception $e) {
$this->error("Error: " . $e->getMessage());
// Return an empty list so the foreach in handle() has nothing to iterate.
return [];
}
}
public function parseDefaultBlock($block)
{
$tmp_value = (new \Tiptap\Editor)
->setContent($block['rendered'])
->getJSON();
return json_decode($tmp_value, true)['content'][0];
}
public function parseTestBlock($block)
{
$tmp_value = (new \Tiptap\Editor)
->setContent([
'type' => 'set',
'content' => '',
'attrs' => [
'values' => [
'type' => 'test_data',
'name' => $block['attrs']['data']['name'],
'is_important' => $block['attrs']['data']['is_important'],
'bigger_text' => $block['attrs']['data']['bigger_text'],
]
],
])
->getJSON();
return json_decode($tmp_value, true);
}
}
As you can see, the parseTestBlock method creates the array manually and fills in the blanks with the proper data from the API. If you want to migrate data similarly, you'll have to reverse-engineer each set's structure.
Also, by default sets don't have the content key, but when you create one this way, the key has to be added: without it, an error is thrown saying that content is missing. I'm not sure why.
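If you have many ACF blocks to migrate, writing one such array per block gets repetitive. The sketch below generalizes the same shape into a helper. The bardSetNode name is hypothetical, and it assumes the Bard set's field handles match the ACF field names, which was the case for test_data. Keys that start with an underscore are skipped, because in the API output they only hold ACF field references such as field_66c11a6ea3876.

```php
<?php

/**
 * Builds the array for a Bard set node.
 *
 * $handle is the set's handle in the Bard field; $data is the ACF block's
 * "data" array. Keys starting with "_" hold ACF field references and are skipped.
 */
function bardSetNode(string $handle, array $data): array
{
    $fields = array_filter(
        $data,
        fn ($key) => !str_starts_with((string) $key, '_'),
        ARRAY_FILTER_USE_KEY
    );

    return [
        'type' => 'set',
        'content' => '', // kept because the import failed without it
        'attrs' => [
            // The set handle comes first and wins over any ACF field named "type".
            'values' => ['type' => $handle] + $fields,
        ],
    ];
}

$node = bardSetNode('test_data', [
    'name' => 'Maciek Palmowski',
    '_name' => 'field_66c11a6ea3876',
    'is_important' => '1',
    '_is_important' => 'field_66c11a87a3877',
]);

print_r($node['attrs']['values']);
```

The returned array has the same shape as the one built by hand in parseTestBlock, so it could be passed to setContent() in its place.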
Closing thoughts
As I mentioned earlier, it wasn't as straightforward as it could be. Until we have a unified format for this type of content (and that will probably never happen), migrations between CMSs will be tricky and will require some manual work.
On the other hand, once you understand how to import any data as Bard sets, it becomes simpler. The most difficult part for me was reverse-engineering it for the first time.