[#1011286] Hash collisions cause more errors

Date: 2012-12-03 05:30
Priority: 3
State: Closed
Submitted by: Robert Coup (rcoup)
Assigned to: Nobody (None)
Category: None
Group: None
Resolution: Accepted

Summary: Hash collisions cause more errors

Detailed description
v2.1.1

Hash collisions are causing errors, much as in http://pgfoundry.org/tracker/index.php?func=detail&aid=1011179&group_id=1000037&atid=230


CREATE TABLE test1 (id integer PRIMARY KEY, content text);
CREATE TABLE test2 (id integer PRIMARY KEY, content text);

INSERT INTO test1 (id, content) VALUES
(536636, '74e09d3896d47b6dcc83e2d27484d50e'),
(838939, '9f279730ed8658f642597f922064402c'),
(4672156, 'e37dc45df4d341c15c4f709666107466');

INSERT INTO test2 (id, content) VALUES
(536636, '74e09d3896d47b6dcc83e2d27484d50e'),
(838939, '9f279730ed8658f642597f922064402c'),
(4700943, 'acc5bfbbf8e12b402e0672391d76e1ef');

Then

./pg_comparator --max-ratio=99999999.0 --checksum-function=md5 --stats txt 'pgsql:///mydb/test2?"id":"content";' 'pgsql:///mydb/test1'
UPDATE 838939
DELETE 838939
UPDATE 4700943
INSERT 536636
revision: 1375
testing: pgsql/pgsql
table size: 3
folding factor: 7
levels: 2 (cut-off from 2)
query number: 10
query size: 1058
fetched sums: 2
fetched chks: 6
fetched data: 0
query metadata: 4
key size: 4
col size: 33
diffs found: 4
expecting: undef
options: 248
total time: 0.102206
checksum: 0.090238
summary: 0.005300
merge: 0.003988
bulks: 0.000014
synchro: 0.000002
clear: 0.000002
end: 0.002662


UPDATE 838939 <- Should be no-change
DELETE 838939 <- Should be no-change. Also, key not repeated.
UPDATE 4700943 <- Should be INSERT
INSERT 536636 <- Should be no-change
And missing a 'DELETE 4672156'

Problem #1:
- key hash collisions are pretty easy to achieve when only 32 of the checksum's 128 bits are used.
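
To put a rough number on Problem #1, here is a small sketch of the standard birthday-bound approximation. It assumes the key checksums are uniformly distributed and that "32/128" means 32 bits kept out of a 128-bit MD5; the function name is made up for illustration and is not part of pg_comparator.

```python
import math


def collision_probability(n_keys: int, bits: int) -> float:
    """Birthday-bound approximation of P(at least one collision)
    among n_keys uniformly distributed hashes of the given width."""
    space = 2.0 ** bits
    # 1 - exp(-n(n-1) / (2 * space)), via expm1 for precision near zero
    return -math.expm1(-n_keys * (n_keys - 1) / (2.0 * space))


if __name__ == "__main__":
    for n in (1_000, 100_000, 1_000_000):
        p32 = collision_probability(n, 32)
        p128 = collision_probability(n, 128)
        print(f"{n:>9} keys: 32 bits -> {p32:.3g}, 128 bits -> {p128:.3g}")
```

Under these assumptions, a table of around a hundred thousand rows is already more likely than not to contain at least one pair of colliding 32-bit key checksums, while the full 128 bits keep the probability negligible.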

Problem #2:
- when hash collisions do occur, lines 2227/2252/2272 all try to do comparisons (eq, lt, gt) on arrays, which Perl doesn't support. It compares the array lengths instead, so the result does not depend on the array contents.
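
The pitfall itself is Perl-specific, but its effect on a merge can be sketched in Python. The sketch assumes each merge entry is a (key checksum, key) pair, which is a guess at the data layout rather than something taken from the pg_comparator source; both helper functions are hypothetical.

```python
def compare_by_length(a, b):
    """Mimic comparing two arrays by their lengths: contents are ignored."""
    return (len(a) > len(b)) - (len(a) < len(b))


def compare_by_content(a, b):
    """Compare element by element (lexicographic order on tuples)."""
    return (a > b) - (a < b)


# Hypothetical entries: same (colliding) key checksum, different keys.
left = (0x1234, 838939)
right = (0x1234, 4700943)

print(compare_by_length(left, right))   # entries look identical
print(compare_by_content(left, right))  # entries are ordered by key
```

With a length-only comparison, every pair of entries looks equal, so a merge would pair up rows that merely share a checksum. Comparing the contents keeps colliding keys distinct.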

(also, is it possible to put the source code on github/bitbucket/somewhere? Makes it vastly easier to analyse changes, contribute improvements, and improve the test coverage. And the bug tracker is 100x nicer to use!)

Followup

Message
Date: 2013-03-07 12:09
Sender: Fabien Coelho

Indeed. Thanks for the report and the fix.

This will be fixed in the next release, based on your version.

"pg_comparator" is under svn, but not public because it is mixed with other things that I would have to separate.

The validation is based on random tables, so it is not simple to add a particular case by hand to the process...

Date: 2012-12-03 21:30
Sender: Robert Coup

I think this solves it, but Perl is not my strong point. Can a key hash collision check be added to the test suite?

https://gist.github.com/4198249

Changes:

Field       Old Value  Date              By
status_id   Open       2013-03-07 12:09  fabien
close_date  None       2013-03-07 12:09  fabien
Resolution  None       2013-03-07 12:09  fabien