mirror of
https://github.com/DBDiff/DBDiff.git
synced 2026-04-25 05:15:49 +03:00
[GH-ISSUE #28] DBDiff says falsely tables are identical #155
Labels
No labels
bug
enhancement
good first issue
help wanted
pull-request
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
starred/DBDiff#155
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Originally created by @Pachat on GitHub (Oct 1, 2016).
Original GitHub issue: https://github.com/DBDiff/DBDiff/issues/28
Depending on the presence of a third field DBDiff detects or not the difference in the second field.
Step to reproduce
1°) Create the a11_elements and a12_elements tables in two databases
2°) In database2, change a value
3°) Run DBDiff
1°)
CREATE TABLE
a11_elements(idint(6) NOT NULL,show_in_list_summaryint(1) DEFAULT NULL,filter_exact_matchint(1) DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO
a11_elements(id,show_in_list_summary,filter_exact_match) VALUES(646, 1, NULL);ALTER TABLE
a11_elementsADD PRIMARY KEY (id);CREATE TABLE
a12_elements(idint(6) NOT NULL,show_in_list_summaryint(1) DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO
a12_elements(id,show_in_list_summary) VALUES (646, 1);ALTER TABLE
a12_elementsADD PRIMARY KEY (id);2°)
database2.a11_elements.
show_in_list_summary=0database2.a12_elements.
show_in_list_summary=03°)
php dbdiff --server1=root:root@localhost:3306 --type=data --include=all server1.database1.a11_elements:server1.database2.a11_elements --output=dbdiff_database1to2.sql
Results :
php dbdiff --server1=root:root@localhost:3306 --type=data --include=all server1.database1.a12_elements:server1.database2.a12_elements --output=dbdiff_database1to2.sql
Results :
---------- UP ----------
UPDATE
a12_elementsSETshow_in_list_summary= '0' WHEREid= '646';---------- DOWN ----------
UPDATE
a12_elementsSETshow_in_list_summary= '1' WHEREid= '646';Conclusion: With the same context for the first two fields (646 on both databases and respectively 1 / 0 on each database), only when deleting the third field DBDiff detects the difference in the second field.
@djshen-ponddy commented on GitHub (Oct 5, 2016):
What if
filter_exact_matchis not NULL ?@Pachat commented on GitHub (Oct 5, 2016):
When filter_exact_match is not NULL, the result is correct.
Step to reproduce
Mydatabase1
CREATE TABLE a13_elements (
id int(6) NOT NULL,
show_in_list_summary int(1) DEFAULT NULL,
filter_exact_match int(1) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO a13_elements (id, show_in_list_summary, filter_exact_match) VALUES(646, 1, 0);
ALTER TABLE a13_elements ADD PRIMARY KEY (id);
Mydatabase2
CREATE TABLE a13_elements (
id int(6) NOT NULL,
show_in_list_summary int(1) DEFAULT NULL,
filter_exact_match int(1) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO a13_elements (id, show_in_list_summary, filter_exact_match) VALUES(646, 0, 0);
ALTER TABLE a13_elements ADD PRIMARY KEY (id);
php dbdiff --server1=root:root@localhost:3306 --type=data --include=all server1.mydatabase1.a13_elements:server1.mydatabase2.a13_elements --output=dbdiff_13.sql
So, the result that is still incorrect is for a11_elements (see first post) when
filter_exact_match is NULL
(which may happen)
Does this gives a direction for the correction of the bug?
@djshen-ponddy commented on GitHub (Oct 6, 2016):
I think this bug is the same as #9 .
In getChangeDiff` of src/DB/Data/LocalTableData.php, I replace NULL value by ''.
@Pachat commented on GitHub (Oct 11, 2016):
@djshen-ponddy
That's an interesting proposal. However, it makes NULL and '' equivalent. Which might not be the case.
A NULL value by default, may become '' when the user decides to leave it blank.
NULL shows that the user did not update the value, '' shows that the user updated to blank.
We may have to use explicitly IS NULL functions to compare?
@djshen-ponddy commented on GitHub (Oct 12, 2016):
So we have to replace NULL value and use ISNULL as a flag for each field, and then concatenate all fields and flags and calculate md5 hash?
@Pachat commented on GitHub (Oct 12, 2016):
It looks like the culprit here is MD5 or any other PHP hash function. For example:
I think that, indeed, we cannot avoid the use of ISNULL to get the difference
@jasdeepkhalsa commented on GitHub (Oct 12, 2016):
Oh dear! We do need to use hashes in some way to make it easier to compare the data...
If hashes are the problem I wonder how GIT manages to do it! (Maybe we'll have to ask them for their secret)
@djshen-ponddy commented on GitHub (Oct 12, 2016):
SELECT CONCAT("a", NULL)is NULL andSELECT MD5(NULL)is NULL too.I think we can still use hashes to compare the data, but we need to handle NULL value to avoid the problem above.
MySQL CONCAT and MD5
@rekitae commented on GitHub (Oct 13, 2016):
I had the same problem.
There is also a way to solve CONCAT_WS and COALESCE functions.
How COALESCE
How CONCAT_WS
@djshen-ponddy commented on GitHub (Oct 13, 2016):
I found another problem about CONCAT.
The two records, (1, 'abc', 'de') and (1, 'abcde', ''), are both concatenated into '1abcde', so their MD5 hashes are identical.
@rekitae commented on GitHub (Oct 13, 2016):
concat_ws seems to be a good alternative.
The example below
@djshen-ponddy commented on GitHub (Oct 13, 2016):
I think we need a special separator (magic number) to reduce the probability of occurrence of this problem.
@isgroup-srl commented on GitHub (Jul 30, 2017):
@djshen-ponddy you can add data (a space) to the string before calling md5() to avoid the issue md5() returning null if the data is null. This way you will not have any collision (eg: "NULL" and " " will not have the same MD5 as the string will become " " and " "). With concat_ws() you cannot reliably add a separator that is not in the data itself, btw using a random token of enough entropy will make collisions unlikely.
@SergioMBS commented on GitHub (Oct 15, 2017):
concat_ws seems to work, why hasn't this been changed in the code?