Scoop -- the swiss army chainsaw of content management
Front Page · Everything · News · Code · Help! · Wishlist · Project · Scoop Sites · Dev Notes · Latest CVS changes · Development Activities
My useless perl script. Developer Diary
By hulver , Section Dev Notes []
Posted on Mon Aug 26, 2002 at 12:53:05 PM PST
Here is a useless perl script I've been working on. It's my first major perl code, so forgive any stupidities (but please point them out to me).

Modified from some sugesstions given to me in my kuro5hin diary.

This is a perl script that takes a text file as produced from mysqldump, and converts it so that it can be used on an interbase server.

#!/usr/bin/perl -w

$table = "";
$aut_inc = 0;
@keys = ();
@extra = ();
$first = 0;
$ignore = 0;
while ($line = <>) {
    chomp ($line);
    if ($line =~ /^CREATE TABLE (\S+)/i) {
	$first = 1;
	$table = $1;
#	if ($table eq "sessions") {
#		$ignore = 1;
#	} else {
#		$ignore = 0;
#	}
	$inBlock = 1;
	$aut_inc = 0;
	print "$line\n";
	$line = "";
    } elsif ($line =~ /^\).*;/) {
	$inBlock = 0;
	print "\n);\n";
	while (@keys > 0) {
		$line = pop(@keys);
		$line =~ s/,$//;
		$line =~ s/([^_])active/$1"ACTIVE"/i;
		$line =~ s/  KEY (\w+) (.+)/CREATE INDEX $1 on $table $2;\n/;
		print $line; 
	}
	while (@extra > 0) {
		print "-- ".pop(@extra)."\n";
	}
	$line = "";
	if ($aut_inc) {
		print "-- Table $table requires an autoinc field\n";
		print "-- $aut_inc_text\n";
		$aut_inc = 0;
		$aut_inc_text =~ s/\s*(\S+).*/$1/;
		$gen_name = "gen_".$table."_".$aut_inc_text;
		print "CREATE GENERATOR $gen_name;\n";
		print "SET GENERATOR $gen_name TO 1;\n";
		print "SET TERM ^^;\n";
		print "CREATE TRIGGER ".$table."_autoinc FOR $table\n";
		print "BEFORE INSERT AS\n";
		print "BEGIN\n";
		print "IF (NEW.$aut_inc_text IS NULL) THEN\n";
		print "NEW.$aut_inc_text = GEN_ID($gen_name, 1);\n";
		print "END^^\nSET TERM ;^^\n\n";
	}
    }
    $line =~ s/0000-00-00 00:00:00/1900-01-01 00:00:00/i;
    if ($inBlock) {
	if ($line =~ / auto_increment/i) {
		$aut_inc = 1;
		$aut_inc_text = $line;
		$line =~ s/ auto_increment//i;
	}
	if ($line =~ /	KEY/) {
		push(@keys,$line);
		$line = "";
	} else
	{
#		scoop specific hack to reduce key sizes
		$line =~ s/(category varchar)\(128/$1(118/i;
		$line =~ s/(prefname varchar)\(200/$1(180/i;
#		differences between mysql & interbase
		$line =~ s/ timestamp/ numeric/i;
		$line =~ s/datetime /timestamp /i;
		$line =~ s/ active / "ACTIVE" /i;
		$line =~ s/ (int|smallint|tinyint)/ numeric/i;
		$line =~ s/ (text|tinytext)/ BLOB SUB_TYPE text/;
		$line =~ s/ bigint\([0-9]+\)/ numeric\(18\)/i;
		$line =~ s/ FULLTEXT KEY (.*)//i;
		$line =~ s/,$//i;
		$line =~ s/(NULL|NOT NULL) +default (.+)/default $2 $1/;
#		reserved words in interbase need quoting
		$line =~ s/(^ *)length([ ,])/$1"LENGTH"$2/;
		$line =~ s/(^ *)date([ ,])/$1"DATE"$2/;
		$line =~ s/(^ *)time([ ,])/$1"TIME"$2/;
		$line =~ s/(^ *)type([ ,])/$1"TYPE"$2/;
		$line =~ s/(^ *)value([ ,])/$1"VALUE"$2/;
		if ($line =~ / enum[^\)]*\)/) {
			push(@extra,$line);
			$line =~ s/ enum[^\)]*\)/ char(1)/;
		}
	}
	if (length($line)  > 1) {
		if ($first) {
			$first = 0;
		} else {
			$line =~ s/^ /,\n /;
		}
		print "$line";
	}
   } else {
	$line =~ s/\\'/''/g;
	if (!$ignore) {
		print "$line\n";
	}
  }
}

There are quite a few differences between mysql and interbase.

  • No autoinc fields, use generators instead
  • Move KEY statements outside the create table statement to a create index statement.
  • Quotes must be delimited with '' instead of \' (erm, thats two ', doesn't come across too well)
  • Date format of 0000-00-00 is invalid, chose 1900-01-01 instead
  • No fulltext search that I know of at the moment, those keys are just removed for now
  • No bigint. I think there is an int64 field, but I've not checked it out just yet.

So. This is a first draft. It runs on my machine, and converts my mysql scoop database to an interbase database.

Why is it useless? Well, scoop dosn't use interbase. It uses mysql. I found a postgress conversion, but I haven't looked to closely at it yet.

So, whats next? Well I'm going to try and implement some portions of scoop outside of scoop. The part that reads all the comments for a story. I'll stick that in an external script, and run it against the two databases and see if there is a big difference in speed between the two.

After that? Well, if anybody else is interested I might try helping with any effort to abstract the data layer in scoop. Isn't somebody working on that?

Also, does anybody think that scoop would work better if the tables were normallized a bit better? Things like the Story IDs being such big identifiers? I think they would be better as Integers. 4 bytes rather than 20. That would shrink the size of some of the most used primary keys (on the story table & comment table) which would make them faster.
Is it worth starting on doing that, or is somebody already working on it?

< Scoop Rewrite In near Future | themes... still going (slowly) forward >

Menu
· create account
· faq
· search
· report bugs
· Scoop Administrators Guide
· Scoop Box Exchange

Login
Make a new account
Username:
Password:

Related Links
· Scoop
· Kuro5hin
· hulver's Diary

Story Views
  9 Scoop users have viewed this story.

Display: Sort:
My useless perl script. | 3 comments (3 topical, 0 hidden)
A quick note (none / 0) (#1)
by panner on Thu Sep 05, 2002 at 06:13:46 PM PST

I just want you to know that I have read this, and have read your messages to scoop-dev. But I've not replied to either.

I fully intend to, but I'm lazy. You bring up a lot of good points, and good points mean thinking. And thinking means getting the monkey to start cranking gears. Once that monkey gets to work, I'll look more closely.

So, in conclusion, it's all the monkey's fault.



--
Keith Smiley



My useless perl script. | 3 comments (3 topical, 0 hidden)
Display: Sort:

Hosted by ScoopHost.com Powered by Scoop
All trademarks and copyrights on this page are owned by their respective companies. Comments are owned by the Poster. The Rest © 1999 The Management

create account | faq | search