#!/usr/bin/perl -w
$table = "";
$aut_inc = 0;
@keys = ();
@extra = ();
$first = 0;
$ignore = 0;
while ($line = <>) {
chomp ($line);
if ($line =~ /^CREATE TABLE (\S+)/i) {
$first = 1;
$table = $1;
# if ($table eq "sessions") {
# $ignore = 1;
# } else {
# $ignore = 0;
# }
$inBlock = 1;
$aut_inc = 0;
print "$line\n";
$line = "";
} elsif ($line =~ /^\).*;/) {
$inBlock = 0;
print "\n);\n";
while (@keys > 0) {
$line = pop(@keys);
$line =~ s/,$//;
$line =~ s/([^_])active/$1"ACTIVE"/i;
$line =~ s/ KEY (\w+) (.+)/CREATE INDEX $1 on $table $2;\n/;
print $line;
}
while (@extra > 0) {
print "-- ".pop(@extra)."\n";
}
$line = "";
if ($aut_inc) {
print "-- Table $table requires an autoinc field\n";
print "-- $aut_inc_text\n";
$aut_inc = 0;
$aut_inc_text =~ s/\s*(\S+).*/$1/;
$gen_name = "gen_".$table."_".$aut_inc_text;
print "CREATE GENERATOR $gen_name;\n";
print "SET GENERATOR $gen_name TO 1;\n";
print "SET TERM ^^;\n";
print "CREATE TRIGGER ".$table."_autoinc FOR $table\n";
print "BEFORE INSERT AS\n";
print "BEGIN\n";
print "IF (NEW.$aut_inc_text IS NULL) THEN\n";
print "NEW.$aut_inc_text = GEN_ID($gen_name, 1);\n";
print "END^^\nSET TERM ;^^\n\n";
}
}
$line =~ s/0000-00-00 00:00:00/1900-01-01 00:00:00/i;
if ($inBlock) {
if ($line =~ / auto_increment/i) {
$aut_inc = 1;
$aut_inc_text = $line;
$line =~ s/ auto_increment//i;
}
if ($line =~ / KEY/) {
push(@keys,$line);
$line = "";
} else
{
# scoop specific hack to reduce key sizes
$line =~ s/(category varchar)\(128/$1(118/i;
$line =~ s/(prefname varchar)\(200/$1(180/i;
# differences between mysql & interbase
$line =~ s/ timestamp/ numeric/i;
$line =~ s/datetime /timestamp /i;
$line =~ s/ active / "ACTIVE" /i;
$line =~ s/ (int|smallint|tinyint)/ numeric/i;
$line =~ s/ (text|tinytext)/ BLOB SUB_TYPE text/;
$line =~ s/ bigint\([0-9]+\)/ numeric\(18\)/i;
$line =~ s/ FULLTEXT KEY (.*)//i;
$line =~ s/,$//i;
$line =~ s/(NULL|NOT NULL) +default (.+)/default $2 $1/;
# reserved words in interbase need quoting
$line =~ s/(^ *)length([ ,])/$1"LENGTH"$2/;
$line =~ s/(^ *)date([ ,])/$1"DATE"$2/;
$line =~ s/(^ *)time([ ,])/$1"TIME"$2/;
$line =~ s/(^ *)type([ ,])/$1"TYPE"$2/;
$line =~ s/(^ *)value([ ,])/$1"VALUE"$2/;
if ($line =~ / enum[^\)]*\)/) {
push(@extra,$line);
$line =~ s/ enum[^\)]*\)/ char(1)/;
}
}
if (length($line) > 1) {
if ($first) {
$first = 0;
} else {
$line =~ s/^ /,\n /;
}
print "$line";
}
} else {
$line =~ s/\\'/''/g;
if (!$ignore) {
print "$line\n";
}
}
}
There are quite a few differences between mysql and interbase.
- No autoinc fields, use generators instead
- Move KEY statements outside the create table statement to a create index statement.
- Quotes must be delimited with '' instead of \' (erm, thats two ', doesn't come across too well)
- Date format of 0000-00-00 is invalid, chose 1900-01-01 instead
- No fulltext search that I know of at the moment, those keys are just removed for now
- No bigint. I think there is an int64 field, but I've not checked it out just yet.
So. This is a first draft. It runs on my machine, and converts my mysql scoop database to an interbase database.
Why is it useless? Well, scoop dosn't use interbase. It uses mysql. I found a postgress conversion, but I haven't looked to closely at it yet.
So, whats next? Well I'm going to try and implement some portions of scoop outside of scoop. The part that reads all the comments for a story. I'll stick that in an external script, and run it against the two databases and see if there is a big difference in speed between the two.
After that? Well, if anybody else is interested I might try helping with any effort to abstract the data layer in scoop. Isn't somebody working on that?
Also, does anybody think that scoop would work better if the tables were normallized a bit better? Things like the Story IDs being such big identifiers? I think they would be better as Integers. 4 bytes rather than 20. That would shrink the size of some of the most used primary keys (on the story table & comment table) which would make them faster.
Is it worth starting on doing that, or is somebody already working on it?