Migrate Data to a Downstream TiDB Table with More Columns
This document provides the additional steps to be taken when you migrate data to a downstream TiDB table with more columns than the corresponding upstream table. For regular migration steps, see the following migration scenarios:
- Migrate MySQL of Small Datasets to TiDB
- Migrate MySQL of Large Datasets to TiDB
- Migrate and Merge MySQL Shards of Small Datasets to TiDB
- Migrate and Merge MySQL Shards of Large Datasets to TiDB
Use DM to migrate data to a downstream TiDB table with more columns
When replicating the upstream binlog, DM tries to use the current table schema of the downstream to parse the binlog and generate the corresponding DML statements. If the column number of the table in the upstream binlog does not match the column number in the downstream table schema, the following error occurs:
"errors": [
{
"ErrCode": 36027,
"ErrClass": "sync-unit",
"ErrScope": "internal",
"ErrLevel": "high",
"Message": "startLocation: [position: (mysql-bin.000001, 2022), gtid-set:09bec856-ba95-11ea-850a-58f2b4af5188:1-9 ], endLocation: [ position: (mysql-bin.000001, 2022), gtid-set: 09bec856-ba95-11ea-850a-58f2b4af5188:1-9]: gen insert sqls failed, schema: log, table: messages: Column count doesn't match value count: 3 (columns) vs 2 (values)",
"RawCause": "",
"Workaround": ""
}
]
The following is an example upstream table schema:
# Upstream table schema
CREATE TABLE `messages` (
`id` int(11) NOT NULL,
PRIMARY KEY (`id`)
)
The following is an example downstream table schema:
# Downstream table schema
CREATE TABLE `messages` (
`id` int(11) NOT NULL,
`message` varchar(255) DEFAULT NULL, # This is the additional column that only exists in the downstream table.
PRIMARY KEY (`id`)
)
When DM tries to use the downstream table schema to parse the binlog event generated by the upstream, DM reports the above Column count doesn't match
error.
In such cases, you can use the binlog-schema
command to set a table schema for the table to be migrated from the data source. The specified table schema needs to correspond to the binlog event data to be replicated by DM. If you are migrating sharded tables, for each sharded table, you need to set a table schema in DM to parse binlog event data. The steps are as follows:
Create a SQL file in DM and add the
CREATE TABLE
statement that corresponds to the upstream table schema to the file. For example, save the following table schema tolog.messages.sql
.# Upstream table schema CREATE TABLE `messages` ( `id` int(11) NOT NULL, PRIMARY KEY (`id`) )
Use the
binlog-schema
command to set the table schema for the table to be migrated from the data source. At this time, the data migration task should be in the Paused state due to the aboveColumn count doesn't match
error.tiup dmctl --master-addr ${advertise-addr} binlog-schema update -s ${source-id} ${task-name} ${database-name} ${table-name} ${schema-file}
The descriptions of parameters in this command are as follows:
Parameter Description -master-addr
Specifies ${advertise-addr}
of any DM-master node in the cluster where dmctl is to be connected.${advertise-addr}
indicates the address that DM-master advertises to the outside world.binlog-schema set
Manually set the schema information. -s
Specifies the source. ${source-id}
indicates the source ID of MySQL data.${task-name}
Specifies the name of the migration task defined in the task.yaml
configuration file of the data migration task.${database-name}
Specifies the database. ${database-name}
indicates the name of the upstream database.${table-name}
Specifies the name of the upstream table. ${schema-file}
Specifies the table schema file to be set. For example:
tiup dmctl --master-addr 172.16.10.71:8261 binlog-schema update -s mysql-01 task-test -d log -t message log.message.sql
Use the
resume-task
command to resume the migration task in the Paused state.tiup dmctl --master-addr ${advertise-addr} resume-task ${task-name}
Use the
query-status
command to confirm that the data migration task is running correctly.tiup dmctl --master-addr ${advertise-addr} query-status resume-task ${task-name}