Skip to content

Commit

Permalink
Item14435: ensure DB is properly rolled back if an insert fails for a…
Browse files Browse the repository at this point in the history
…ny reason
  • Loading branch information
cdot committed Jul 14, 2017
1 parent a5e51a6 commit a00d9a4
Show file tree
Hide file tree
Showing 3 changed files with 158 additions and 69 deletions.
89 changes: 41 additions & 48 deletions data/System/DBIStoreContrib.txt
@@ -1,47 +1,13 @@
%META:TOPICINFO{author="ProjectContributor" date="1499336104" format="1.1" version="1"}%
%META:TOPICINFO{author="ProjectContributor" date="1500033372" format="1.1" version="1"}%
---+!! !DBIStoreContrib
<!--
* Set SHORTDESCRIPTION = %$SHORTDESCRIPTION%
-->
%SHORTDESCRIPTION%

%TOC%

This extension supports fast queries and searches over the
content of topics and attachments by cacheing wiki topics in
an SQL database.

Works as a mirror of an existing store (e.g. !PlainFile, RCS)
1. With all searching operations performed using SQL _or_
2. With just structured queries performed using SQL _or_
3. Simply to support other clients querying data using SQL.

Searches in Foswiki fall into two types - _text searches_ and _query
searches_. Text searches are what you use to find text in a topic,
while query searches use a structured query language to specify
logical combinations of field values. Foswiki has two internal
configuration options, ={Store}{SearchAlgorithm}= and
={Store}{QueryAlgorithm}=, that select the algorithms that it will use
for these different types of search.

The default query algorithm works by cherry-picking parts of queries
to create regular expression searches that can then be passed though
the search algorithm. These searches are then used as a filter on the
set of topics, to narrow down to a subset that is then "brute force"
matched against the original query to give a final set of matching
topics. This is fine when you have relatively few queries, but is
not terribly efficient if you make a lot of use of query searches.

The !DBIStoreContrib does things differently. When you select
!DBIStoreContrib for the query algorithm, query searches are
automatically transformed into SQL queries, which are very
fast. Further, you can optionally choose to use the database for
standard text searches (including regular expression searches, if your
DB supports them).

A final benefit is that your Foswiki data is "cached" in an external
SQL database that can be queried by other tools (though changes made
in the database will *not* be reflected in the wiki).
* Automatically translates =%<nop>SEARCH{type="query"= into SQL queries
* Supports plain-text searches over attachment content
* Foswiki data is "cached" in an external SQL database that can be queried by other tools (though changes made in the database by other tools will *not* be reflected in the wiki)

The !DBIStoreContrib has been designed to be *as compatible as possible* with
existing =%SEARCH= expressions, and is usable with all well-written
Expand All @@ -63,6 +29,8 @@ Foswiki save operations to incrementally update the database.
Alternatively, you can use the included =dbistore_manage.pl= program
to create cron (or iwatch) jobs to update the database offline.

%TOC%

---++ Database Schema
The database schema must reflect 1:1 the structure of data in Foswiki
topics. The default schema will work for most databases, though you
Expand Down Expand Up @@ -125,15 +93,15 @@ The schema should normally specify a table for every =%<nop>META:=
that may occur in topics. This works so long as you know what
meta-data will be added by the plugins you use. If you don't know, you
can optionally specify that new tables are automatically added by
setting ={AutoloadUnknownMETA}= in =configure.
setting ={AutoloadUnknownMETA}= in =configure=.

Normally only columns defined in the schema are loaded. Columns can be
automatically added for attributes missing from the schema by setting
={AutoAddUnknownFields}= in configure. Automatically added columns
={AutoAddUnknownFields}= in =configure=. Automatically added columns
will be given the type specified by the =_DEFAULT= column type
declaration.

Note that it is always best to make sure all meta-data has a schema
%I% it is always best to make sure all meta-data has a schema
entry. Automatic extension of the schema is relatively expensive, and
should be avoided if possible.

Expand All @@ -160,7 +128,7 @@ Columns are defined in a Perl hash containing attributes:
column.
* =index= if true, then an index will be created for the column

For example, the WORKFLOW table might
For example, the WORKFLOW table for the WorkflowPlugin might be:
<verbatim>
tid => { type => 'INT', primary => 1 }
name => { type => 'TEXT', truncate_to => 30 }
Expand Down Expand Up @@ -413,12 +381,37 @@ Note that there has been *no* testing with ODBC drivers that are
compiled with unicode support enabled.

---++ How it works
!DBIStoreContrib works by using an (included) Foswiki plugin to
"eavesdrop" on save operations in Foswiki. When a topic is saved, it
is automatically added to the database in the background. The DB only
caches the latest version of a topic; version histories are not stored
(that would be the job of a full store implementation, which is
a different project).

Searches in Foswiki fall into two types - _text searches_ and
_query searches_. Text searches are what you use to find text in a topic,
while query searches use a structured query language to specify
logical combinations of field values. Foswiki has two internal
configuration options, ={Store}{SearchAlgorithm}= and
={Store}{QueryAlgorithm}=, that select the algorithms that it will use
for these different types of search.

The default Foswiki query algorithm works by cherry-picking parts of
queries to create regular expression searches that can then be passed
though the search algorithm. These searches are then used as a filter
on the set of topics, to narrow down to a subset that is then "brute
force" matched against the original query to give a final set of
matching topics. This is fine when you have relatively few queries,
but is not terribly efficient if you make a lot of use of query
searches.

The !DBIStoreContrib does things differently. When you select
!DBIStoreContrib for the query algorithm, query searches are
automatically transformed into SQL queries, which are very
fast. Further, you can optionally choose to use the database for
standard text searches (including regular expression searches, if your
DB supports them).

!DBIStoreContrib maintains the SQL database using an (included)
Foswiki plugin to "eavesdrop" on save operations in Foswiki. When a
topic is saved, it is automatically added to the database in the
background. The DB only caches the latest version of a topic; version
histories are not stored (that would be the job of a full store
implementation, which is a different project).

The database is interfaced to via the standard Perl DBI interface, so
any RDBMS that has an adapter can be used for the cache. A number of
Expand Down
95 changes: 83 additions & 12 deletions lib/Foswiki/Contrib/DBIStoreContrib.pm
Expand Up @@ -23,8 +23,8 @@ use DBI ();
use Encode ();
use File::Temp;

our $VERSION = '2.2'; # plugin version is also locked to this
our $RELEASE = '6 Jul 2017';
our $VERSION = '2.3'; # plugin version is also locked to this
our $RELEASE = '14 Jul 2017';

# Global options, used to control tracing etc throughout the module
our %TRACE = (
Expand All @@ -39,11 +39,14 @@ our %TRACE = (
require Exporter;
our @ISA = qw(Exporter);
our @EXPORT_OK = qw(
trace personality insert remove rename expandIDs
trace personality insert remove rename expandIDs checkDBIntegrity
$TABLE_PREFIX %TRACE traceSQL
NAME NUMBER STRING UNKNOWN BOOLEAN SELECTOR VALUE TABLE PSEUDO_BOOL);

our $SHORTDESCRIPTION = 'Use DBI to implement a store using an SQL database.';
our $SHORTDESCRIPTION =
'Supports fast queries and searches over the '
. 'content of topics and attachments by caching wiki topics in '
. 'an SQL database (e.g. MS SQL Server, !MySQL, !PostgreSQL)';

# Type identifiers.
# FIRST 3 MUST BE KEPT IN LOCKSTEP WITH Foswiki::Infix::Node
Expand Down Expand Up @@ -192,7 +195,7 @@ sub error {
trace(@_);
unless ( $TRACE{cli} ) {

# Repreint to STDERR
# Reprint to STDERR
$TRACE{cli} = 1;
trace(@_);
$TRACE{cli} = 0;
Expand Down Expand Up @@ -565,15 +568,15 @@ sub load {
while ( $wit->hasNext() ) {

my $w = ( $wo->web ? $wo->web . '/' : '' ) . $wit->next();
next unless !defined($wre) || $w =~ /^$wre$/;

# Load subweb
my $swo = Foswiki::Meta->load( $wo->session, $w );
load( $swo, $wre, $tre, $reload );
}

# No topics at root level
return unless ( $wo->web() && ( !defined($wre) || $w =~ /^$wre$/ ) );
return
unless ( $wo->web() && ( !defined($wre) || $wo->web() =~ /^$wre$/ ) );

my $tit = $wo->eachTopic();
while ( $tit->hasNext() ) {
Expand Down Expand Up @@ -857,6 +860,8 @@ sub _insertTopic {
# Insert this row
my $data = $mo->{$type};

unshift( @$undos, [ "DELETE FROM #T<$type> WHERE tid=?", $tid ] );

foreach my $item (@$data) {

# All tables have 'tid'
Expand Down Expand Up @@ -970,7 +975,7 @@ sub _remove {
if $TRACE{action};

# Iterate over all our tables removing the tid
foreach my $tr ( 'topic', @tables ) {
foreach my $tr ( @tables, 'topic' ) {
eval {
personality->sql( "DELETE FROM #T<$tr> WHERE #<tid>=?",
$tid );
Expand All @@ -983,6 +988,71 @@ sub _remove {
}
}

sub checkDBIntegrity {
my $repair = shift;

getDBH();

my @problems;

# Get list of our tables from metatypes
my $sth = personality->sql('SELECT #<name> FROM #T<metatypes>');
my @tables;

foreach my $row ( @{ $sth->fetchall_arrayref() } ) {
my $t = $row->[0];
if ( personality->table_exists($t) ) {
push( @tables, $t );
}
else {
error("Table $t in metatypes does not exist in DB");
if ($repair) {
eval {
personality->sql(
"DELETE FROM #T<metatypes> WHERE #<name>=?", $t );
error("...removed");
};
if ($@) {
error( "repair failed: ", $@ );
}
}
}
}

# Check tids in all tables
$sth = personality->sql('SELECT #<tid>,#<web>,#<name> FROM #T<topic>');
my %tids =
map { $_->[0] => "$_->[1].$_->[2]" } @{ $sth->fetchall_arrayref() };

foreach my $tr (@tables) {
$sth = personality->sql("SELECT #<tid> FROM #T<$tr>");
my $tiddles = $sth->fetchall_arrayref();
my %missing_tids;
foreach my $row (@$tiddles) {
my $tid = $row->[0];
unless ( exists $tids{$tid} ) {
$missing_tids{$tid} = 1;
}
}
if ( scalar keys %missing_tids ) {
error( "Table $tr has tids missing from topic table: ",
join( ',', sort keys %missing_tids ) );
if ($repair) {
eval {
personality->sql(
"DELETE FROM #T<$tr> WHERE #<tid> IN ("
. join( ',', keys %missing_tids )
. ")" );
error("...removed");
};
if ($@) {
error( "repair failed: ", $@ );
}
}
}
}
}

=begin TML
---++ StaticMethod remove($meta [, $attachment])
Expand Down Expand Up @@ -1113,15 +1183,16 @@ sub query {
getDBH();

my $sth = personality->sql($sql);
my $rv = $sth->fetchall_arrayref();

$rv = _applyToStrings(
my $rv = [];
if ( $sql =~ /^\s*select\W/i ) {
$rv = $sth->fetchall_arrayref();
}
return _applyToStrings(
$rv,
sub {
return uc2site( personality->from_db( $_[0] ) );
}
);
return $rv;
}

=begin TML
Expand Down
43 changes: 34 additions & 9 deletions tools/dbistore_manage.pl
Expand Up @@ -30,8 +30,11 @@ BEGIN
my $web = '*';
my $topic = '*';
my $clean;
my $check;
my $repair;

my $result = Getopt::Long::GetOptions(
'check' => \$check,
'clean' => \$clean,
'dbitrace=s' => \$dbitrace,
'help' => sub {
Expand All @@ -49,6 +52,7 @@ BEGIN
'load=s' => \@loads,
'query=s' => \$query,
'reload' => \$reload,
'repair' => \$repair,
'reset' => \$reset,
'sql:s' => \$sql,
'topic=s' => \$topic,
Expand Down Expand Up @@ -87,6 +91,11 @@ BEGIN
$opsDone++;
}

if ($check) {
Foswiki::Contrib::DBIStoreContrib::checkDBIntegrity($repair);
$opsDone++;
}

# Topic(s) to be loaded
if ( scalar @loads ) {
foreach my $wt (@loads) {
Expand Down Expand Up @@ -133,7 +142,7 @@ BEGIN
$Foswiki::Plugins::SESSION->search->parseSearch( $query,
{ type => 'query' } );
$sql = Foswiki::Contrib::DBIStoreContrib::HoistSQL::hoist($query);
$sql = "SELECT #<web>,#<name> FROM #T<topic> WHERE $sql";
$sql = "SELECT #<web>,#<name>,#<tid> FROM #T<topic> WHERE $sql";
$sql .= " AND #<name> LIKE '$topic'" if $topic && $topic ne '%';
$sql .= " AND #<web> LIKE '$web'" if $web && $web ne '%';

Expand All @@ -149,8 +158,8 @@ BEGIN
}
my $rv = Foswiki::Contrib::DBIStoreContrib::query($sql);
if ( $sql =~ /^\s*select\W/i ) {
foreach my $topic (@$rv) {
trace( $topic->[0], '.', $topic->[1] );
foreach my $result (@$rv) {
trace( join( ', ', @$result ) );
}
}
else {
Expand Down Expand Up @@ -222,12 +231,6 @@ =head1 OPTIONS
Only one of C<--query> or C<--sql> may be given. Use B<--sql -> to
read a query from standard input.
=item B<--trace>
Enable a trace option. C<--help trace> can be used to list trace keys.
C<--trace all> will switch on all trace options. Trace output is printed
to STDERR.
=item B<--clean>
In theory, nothing ever gets deleted from a Foswiki database. In
Expand All @@ -236,6 +239,28 @@ =head1 OPTIONS
entries that no longer correspond to live topics. Cleaning is done
before any --query or --sql is executed.
=item B<--check> check database integrity
It's possible that a database operation might be incomplete, for
example if the database server crashes halfway through a deletion. In
this case it is possible for the database to become internally
inconsistent. If you see SQL error messages in the logs, you should
run with --check to determine if the database is internally
consistent. If it isn't, your can run --check again with the --repair
option, to repair any inconsistencies. Checking (and repair) are done
before and --query or --sql is executed.
=item B<--repair> repair database integrity
Does nothing on it's own, but when used with --check will repair any
inconsistencies found by --check.
=item B<--trace>
Enable a trace option. C<--help trace> can be used to list trace keys.
C<--trace all> will switch on all trace options. Trace output is printed
to STDERR.
=item B<--dbitrace> trace
Set DBI trace options, which can be used to enable tracing of the DBI
Expand Down

0 comments on commit a00d9a4

Please sign in to comment.