Comparing a single character in a string?


Sanjan
05-23-2009, 08:19 PM
I have an array set up and I want to compare a single site across all of its elements. For example, suppose this is the array:

@sequence contains these values
[0]JHHDHS
[1]JHHUEN
[2]JHHDHS

I want to compare the 4th letter in all 3 strings, but what I want to find out is the percent similarity, meaning how conserved each letter is at a site. So for site 4 I should be able to find out that D occupies 2 of 3 positions and U occupies 1 of 3.

How can I do that in Perl?

Sanjan

Sixtease
05-24-2009, 02:05 AM
This function takes the position of the character you want to extract (counting from zero), followed by the list of strings to search. It returns a reference to a hash of the form { character => ratio }.
# parameters: 1) position starting from 0, 2) list of strings
sub get_char_ratio {
    my $pos = shift;    # get and remove the position from the parameters
    my $total = @_;     # count the strings
    my %count;
    for my $str (@_) {  # count how many times each character appeared
        $count{ substr $str, $pos, 1 }++;
    }
    for (values %count) { $_ /= $total }  # divide counts by total to get ratio
    return \%count;
}
Here is a script for testing it out:
#!/usr/bin/perl

use strict;
use warnings;

my @data = qw(
    JHHDHS
    JHHUEN
    JHHDHS
);

# parameters: 1) position starting from 0, 2) list of strings
sub get_char_ratio {
    my $pos = shift;
    my $total = @_;
    my %count;
    for my $str (@_) {
        $count{ substr $str, $pos, 1 }++;
    }
    for (values %count) { $_ /= $total }
    return \%count;
}

my $count = get_char_ratio( 3, @data );
while ( my ($char, $ratio) = each %$count ) {
    printf "%s: %.1f%%\n", $char, 100*$ratio;
}
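
If you want the ratios for every site at once instead of one position at a time, the same counting idea can be extended. This is only a sketch under a few assumptions: get_all_ratios is a hypothetical name that does not appear in the post above, strings are allowed to differ in length, and each site is divided by the number of strings that actually reach that position rather than by the total number of strings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original function): returns a
# reference to an array where element $pos is a hash { char => ratio }.
# Assumption: a string shorter than $pos is simply not counted at that site.
sub get_all_ratios {
    my @strings = @_;
    my @sites;    # $sites[$pos] = { char => count }
    for my $str (@strings) {
        my @chars = split //, $str;
        for my $pos ( 0 .. $#chars ) {
            $sites[$pos]{ $chars[$pos] }++;
        }
    }
    # Turn counts into ratios, site by site.
    for my $site (@sites) {
        my $total = 0;
        $total += $_ for values %$site;
        $_ /= $total for values %$site;
    }
    return \@sites;
}

my @data  = qw( JHHDHS JHHUEN JHHDHS );
my $sites = get_all_ratios(@data);

for my $pos ( 0 .. $#$sites ) {
    my $site = $sites->[$pos];
    # Sort by descending ratio, then by character, so the output order is stable.
    my @chars = sort { $site->{$b} <=> $site->{$a} || $a cmp $b } keys %$site;
    printf "site %d: %s\n", $pos + 1,
        join ', ', map { sprintf '%s %.1f%%', $_, 100 * $site->{$_} } @chars;
}
```

Sites are printed counting from 1 to match the question, so for the sample data site 4 should show D at 66.7% and U at 33.3%. Sorting the keys also avoids the arbitrary ordering you get from each on a hash.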