Click to See Complete Forum and Search --> : Find differences in two strings
julifos
02-06-2004, 02:37 PM
Hi all!
I'm looking for a tricky and speedy way to find the differences (or only the first difference) comparing two strings. Eg:
x = 'This is a sample';
y = 'This is a sXmple';
findDifferences(x,y);
//-> 11
I can do it looping through the characters (or through the words), but it is a bad speed-bussiness when comparing long strings, so I'm looking for a dirty trick to make it much quicker...
Tx!
olerag
02-06-2004, 02:43 PM
Maybe someone else knows if there's a simpler way but I
believe you'll need to loop thru the strings testing char by
char. Also, I would suggest to begin the first "for" loop with
the smaller string, but you probably already know that.
not sure how much of a difference this would make, but you could split up strings longer than X characters into substrings, and narrow it down. But again, prolly wouldn't save more than a few ms
julifos
02-07-2004, 07:43 AM
Well... Then, finally, here is the code I'll use (quick code, not extensivelly tested):
function findDifferences(x,y){
// check first char, just in case
if (y[0] != x[0]) return 0;
// check if x and y are equal, just in case
if (x.indexOf(y) != -1) return -1;
var chunks = 5;
xl = x.length;
xldiv = (xl - (xl % chunks)) / chunks;
lastCheck = '';
// loop through "big" chunks
for (i=1;i<xldiv;i++){
lastCheck += y.substring(lastCheck.length,(i*xldiv) - 1);
if (x.indexOf(lastCheck) == -1) {
lastCheck = lastCheck.substring(0,((i-1)*xldiv))
break;
}
}
// loop against conflictive chunk
for (i=lastCheck.length; i < y.length; i++){
lastCheck += y.charAt(i);
if (x.indexOf(lastCheck) == -1) return i
}
}
x = "La vida es una mierda, siempre y cuando el mundo sea algo menos absurdo";
y = "La vida es una mierda, siempre y cuando el mundo sea algo menos absurda";
firstDifference = findDifferences(x,y) //--> 70
y[firstDifference] //--> "a"
I define the number of chunks to split the text (5), but I could add some routine to check for the length of the text to compare, so I can adjust the number of chunks and optimize routine's performance.
Cheers ;)
AdamGundry
02-07-2004, 11:54 AM
I think technically the fastest method for finding the first difference would be a kind of "binary chop", whereby you begin by splitting the strings into two, taking the first with a difference and splitting it, and so on until you find the individual character that varies. Bear in mind that JS is slow anyway, and so this is unlikely to give a massive performance boost, but if you have absurdely long strings the time taken will be less.
Adam
julifos
02-07-2004, 05:50 PM
Yep! If I understood ok, this routine is much faster for long-text processing, according to my tests:
textOK = 0;
lastIndex = 0;
findDifferences2(x,y);
//—> the offending character's index is at textOK
function findDifferences2(x,y){ //
if (x.length < 2) return; // finish!
midx = Math.round(x.length/2);
x1 = x.substring(0,midx);
y1 = y.substring(0,midx);
x2 = x.substring(midx,x.length);
y2 = y.substring(midx,y.length);
if (x1.indexOf(y1) != -1){ // first chunk is equal
textOK += x1.length;
findDifferences2(x2,y2);
} else { // first chunk is not equal
findDifferences2(x1,y1);
}
}