The TwinDB data recovery toolkit is a set of tools that work with InnoDB tablespaces at a low level.

Incredible Performance of stream_parser

The stream_parser tool finds InnoDB pages in a stream of bytes. The stream can be either a file, such as ibdata1 or *.ibd, or a raw partition.
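To make the idea of finding pages in a byte stream concrete, here is a minimal sketch in Python. It is not how stream_parser is implemented; it only illustrates one plausible heuristic, assuming the default 16 KiB page size and the standard InnoDB page layout (page type at offset 24, LSN at offset 16, and the low 32 bits of the LSN repeated in the last 4 bytes of the page). The function names are hypothetical.

```python
import struct

PAGE_SIZE = 16 * 1024      # default InnoDB page size (assumed here)
FIL_PAGE_LSN = 16          # offset of the 8-byte LSN in the page header
FIL_PAGE_TYPE = 24         # offset of the 2-byte page type
FIL_PAGE_INDEX = 0x45BF    # page type value of a B-tree index page


def looks_like_index_page(page):
    """Heuristic check: right type, and header LSN agrees with the trailer."""
    if len(page) != PAGE_SIZE:
        return False
    (page_type,) = struct.unpack_from(">H", page, FIL_PAGE_TYPE)
    (lsn,) = struct.unpack_from(">Q", page, FIL_PAGE_LSN)
    # The last 4 bytes of a page repeat the low 32 bits of the LSN.
    (trailer_lsn,) = struct.unpack_from(">I", page, PAGE_SIZE - 4)
    return page_type == FIL_PAGE_INDEX and (lsn & 0xFFFFFFFF) == trailer_lsn


def find_index_pages(data, step=PAGE_SIZE):
    """Yield offsets in `data` where an index page appears to start."""
    for offset in range(0, len(data) - PAGE_SIZE + 1, step):
        if looks_like_index_page(data[offset:offset + PAGE_SIZE]):
            yield offset
```

On a raw partition, pages are not guaranteed to sit on 16 KiB boundaries, so a smaller `step` may be needed there. A real tool would also apply stricter checks than these two fields to reduce false positives.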

The stream_parser tool runs as many parallel workers as there are CPUs in the system. Its performance is amazing! Compare how stream_parser outperforms page_parser on a four-CPU virtual machine running on my laptop:

# ./page_parser -f /dev/mapper/vg_twindbdev-lv_root -t 18G

Opening file: /dev/mapper/vg_twindbdev-lv_root

...

Size to process: 19327352832 (18.000 GiB)

1.00% done. 2014-06-23 03:03:48 ETA(in 00:18 hours). Processing speed: 17570320 B/sec

2.00% done. 2014-06-23 03:05:27 ETA(in 00:19 hours). Processing speed: 16106127 B/sec

3.00% done. 2014-06-23 03:02:11 ETA(in 00:16 hours). Processing speed: 19327352 B/sec

4.00% done. 2014-06-23 03:03:48 ETA(in 00:17 hours). Processing speed: 17570320 B/sec

...

So it takes almost 20 minutes to parse an 18 GiB partition.

Now let’s check stream_parser:

# ./stream_parser -f /dev/mapper/vg_twindbdev-lv_root -t 18G

...

Size to process: 19327352832 (18.000 GiB)

Worker(0): 1.91% done. 2014-06-23 02:51:41 ETA(in 00:00:56). Processing speed: 79.906 MiB/sec

Worker(2): 1.74% done. 2014-06-23 02:51:47 ETA(in 00:01:02). Processing speed: 72.000 MiB/sec

Worker(3): 3.30% done. 2014-06-23 02:51:15 ETA(in 00:00:30). Processing speed: 144.000 MiB/sec

Worker(1): 1.21% done. 2014-06-23 02:52:20 ETA(in 00:01:35). Processing speed: 47.906 MiB/sec

Worker(2): 5.38% done. 2014-06-23 02:51:11 ETA(in 00:00:25). Processing speed: 168.000 MiB/sec

Worker(3): 9.72% done. 2014-06-23 02:51:00 ETA(in 00:00:14). Processing speed: 296.000 MiB/sec

...

Worker(0): 88.91% done. 2014-06-23 02:52:06 ETA(in 00:00:02). Processing speed: 191.625 MiB/sec

Worker(0): 93.42% done. 2014-06-23 02:52:06 ETA(in 00:00:01). Processing speed: 207.644 MiB/sec

Worker(0): 97.40% done. 2014-06-23 02:52:06 ETA(in 00:00:00). Processing speed: 183.641 MiB/sec

All workers finished in 31 sec

So, about 18 minutes versus 31 seconds: roughly 35 times faster. Impressive, isn’t it?
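The comparison can be re-derived from the numbers in the two transcripts. The sketch below takes the "Size to process" value, one of the page_parser speed samples (17570320 B/sec), and the 31 seconds reported by stream_parser. Picking a different speed sample shifts the result slightly, so treat the figures as estimates.

```python
SIZE_BYTES = 19327352832        # "Size to process" reported by both tools
PAGE_PARSER_SPEED = 17570320    # B/sec, one of the page_parser samples
STREAM_PARSER_SECONDS = 31      # "All workers finished in 31 sec"


def estimate_seconds(size_bytes, bytes_per_sec):
    """Time needed to scan size_bytes at a constant bytes_per_sec."""
    return size_bytes / bytes_per_sec


page_parser_seconds = estimate_seconds(SIZE_BYTES, PAGE_PARSER_SPEED)
speedup = page_parser_seconds / STREAM_PARSER_SECONDS
stream_parser_mib_per_sec = SIZE_BYTES / STREAM_PARSER_SECONDS / 2**20

print(f"page_parser: about {page_parser_seconds / 60:.1f} minutes")
print(f"speedup: about {speedup:.1f}x")
print(f"stream_parser aggregate rate: about {stream_parser_mib_per_sec:.0f} MiB/sec")
```

With this sample the page_parser run comes out a little over 18 minutes, and the ratio lands in the mid-thirties.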

c_parser Improvements

The c_parser tool reads one or many InnoDB pages, extracts the records, and stores them in tab-separated values dumps. An InnoDB page with user data doesn’t store information about the table structure, so you have to tell c_parser what fields you’re looking for. The command line option -t specifies a file with the CREATE TABLE statement.

This is how it works. Here’s the CREATE TABLE statement (I took it from mysqldump):

# cat sakila/actor.sql

CREATE TABLE `actor` (

`actor_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,

`first_name` varchar(45) NOT NULL,

`last_name` varchar(45) NOT NULL,

`last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,

PRIMARY KEY (`actor_id`),

KEY `idx_actor_last_name` (`last_name`)

) ENGINE=InnoDB AUTO_INCREMENT=201 DEFAULT CHARSET=utf8;

And now let’s fetch the records of table actor from the InnoDB pages:

# ./c_parser -6f pages-actor.ibd/FIL_PAGE_INDEX/0000000000001828.page -t sakila/actor.sql

-- Page id: 3, Format: COMPACT, Records list: Valid, Expected records: (200 200)

000000005313 970000013C0110 actor 1 "PENELOPE" "GUINESS" "2006-02-15 04:34:33"

000000005313 970000013C011B actor 2 "NICK" "WAHLBERG" "2006-02-15 04:34:33"

000000005313 970000013C0126 actor 3 "ED" "CHASE" "2006-02-15 04:34:33"

...

000000005313 970000013C09D8 actor 199 "JULIA" "FAWCETT" "2006-02-15 04:34:33"

000000005313 970000013C09E4 actor 200 "THORA" "TEMPLE" "2006-02-15 04:34:33"

-- Page id: 3, Found records: 200, Lost records: NO, Leaf page: YES
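Since the dump is tab-separated, it is easy to post-process. The following is a minimal sketch, not part of the toolkit, of loading such lines into Python dictionaries. It assumes that fields are separated by tabs, that string values are wrapped in double quotes, and that each record line starts with two internal fields followed by the table name and then the user columns in CREATE TABLE order. The function name and the `_table` key are hypothetical, and real dumps may escape special characters in ways this sketch does not handle.

```python
import csv
import io

# Column names taken from the CREATE TABLE statement above.
ACTOR_COLUMNS = ["actor_id", "first_name", "last_name", "last_update"]


def parse_dump(text, columns):
    """Turn tab-separated c_parser-style lines into a list of dicts."""
    rows = []
    reader = csv.reader(io.StringIO(text), delimiter="\t", quotechar='"')
    for fields in reader:
        # Skip blank lines, "-- Page id: ..." comments and short lines.
        if len(fields) < 3 or fields[0].startswith("--"):
            continue
        # Assumed layout: two internal fields, table name, user columns.
        _internal_1, _internal_2, table, *values = fields
        row = dict(zip(columns, values))
        row["_table"] = table
        rows.append(row)
    return rows
```

All values come back as strings; converting `actor_id` to an integer or `last_update` to a datetime is left to the caller.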

MySQL 5.6 introduced a few format changes. Most of them were already supported. On top of that, the new c_parser fixes some bugs in processing temporal fields.

The new UnDROP tool for InnoDB is still no reason not to take backups :-), but at least you can be better armed if the inevitable happens.

How to Recover Table Structure

MySQL stores the table structure in a respective .frm file. When the table is dropped, the .frm file is gone. Fortunately, InnoDB stores a copy of the structure in the dictionary. The sys_parser tool can read the dictionary and generate a CREATE TABLE statement. Check how you can Recover Table Structure From InnoDB Dictionary.

How to Install TwinDB Data Recovery Toolkit

Check out the source code from Launchpad:

$ bzr branch lp:undrop-for-innodb

Branched 33 revisions.

Or you can download an archive with the latest revision from the download page.

Compile the source code, but first install the dependencies: make, gcc, flex, and bison.
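Before running make, it can help to confirm that the four build tools are actually on the PATH. The snippet below is a small, hypothetical pre-flight check, not part of the toolkit. It only looks up the executables by name and does not verify their versions.

```python
import shutil

REQUIRED_TOOLS = ["make", "gcc", "flex", "bison"]


def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing build dependencies:", ", ".join(missing))
    else:
        print("All build dependencies found.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough.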

[root@twindb-dev undrop-for-innodb]# make

cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe -I./include -c stream_parser.c

cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe -I./include -pthread -lm stream_parser.o -o stream_parser

flex sql_parser.l

bison -o sql_parser.c sql_parser.y

sql_parser.y: conflicts: 6 shift/reduce

cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe -I./include -c sql_parser.c

lex.yy.c:3078: warning: ‘yyunput’ defined but not used

lex.yy.c:3119: warning: ‘input’ defined but not used

cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe -I./include -c c_parser.c

./include/ctype-latin1.c:359: warning: ‘my_mb_wc_latin1’ defined but not used

./include/ctype-latin1.c:372: warning: ‘my_wc_mb_latin1’ defined but not used

cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe -I./include -c tables_dict.c

cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe -I./include -c print_data.c

cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe -I./include -c check_data.c

cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe -I./include sql_parser.o c_parser.o tables_dict.o print_data.o check_data.o -o c_parser -pthread -lm

cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe -I./include -o innochecksum_changer innochecksum.c

[root@twindb-dev undrop-for-innodb]#

UPDATE

The toolkit is tested on the following systems:

CentOS release 5.10 (Final) x86_64

CentOS release 6.5 (Final) x86_64

CentOS Linux release 7.0.1406 (Core) x86_64

Fedora release 20 (Heisenbug) x86_64

Ubuntu 10.04.4 LTS (lucid) x86_64

Ubuntu 12.04.4 LTS (precise) x86_64

Ubuntu 14.04 LTS (trusty) x86_64

Debian GNU/Linux 7.5 (wheezy) x86_64

32-bit operating systems are not supported.

Indeed, an InnoDB index doesn’t carry information about the table structure. MySQL keeps the structure in .frm files, and InnoDB stores the structure in the dictionary. When the table structure isn’t available from an external source (an old backup, an installation script, etc.), the possible ways to recover it are:

1) Recover from .frm files. Some tools are available for this. I prefer to create a dummy table, replace its .frm file, and run SHOW CREATE TABLE. This option, however, is useless after DROP TABLE, because MySQL deletes the .frm file as well.

2) Recover the structure from the InnoDB dictionary. InnoDB stores almost all the necessary information about the table structure in the dictionary. When a user runs DROP TABLE, the respective records are deleted from the dictionary tables, so when recovering the dictionary tables you need to pass the -D option to c_parser (-D recovers records that are marked as deleted). The tables you need are SYS_TABLES, SYS_INDEXES, SYS_FIELDS and SYS_COLUMNS. Then load everything into a live instance of MySQL. The sys_parser tool from the toolkit reads the SYS_* tables from MySQL and generates a CREATE TABLE statement.

