Parsing this piece of string


fambi
10-14-2004, 04:28 PM
If anybody is wondering, I decided not to kill myself.

Anyhow, I have this piece of string which I need to parse.

$string="{variables_xyz=[{variable_x=123, variable_y=456, variable_z=789}]}";

At any moment, I could be receiving about 50,000 of these in an array and therefore need the most efficient method of parsing them to extract variable_x, variable_y & variable_z.

As you may well know, I am a complete amateur/newbie/etc. and will probably spend hours on this only to write an inefficient script.

For this reason, I am making this post and hope that one of you wonderful PHP pros can help me with an efficient script.

Lots and lots and lots of thanks in advance.

Paul Jr
10-14-2004, 04:47 PM
Wouldn’t it be more efficient to use an array, instead? That’s what they’re there for…

HaganeNoKokoro
10-14-2004, 04:56 PM
But if these values are coming in that format (for whatever reason) from a file or something, then he has to parse them into something useful.

Paul Jr
10-14-2004, 05:34 PM
That is true, but it would be much more efficient to use an array if possible.

ShrineDesigns
10-14-2004, 06:51 PM
try this:

function parse_varstr($string)
{
    $retval = array();
    $string = preg_replace(array("/\\{/", "/\\}/"), '', $string);
    preg_match_all("/variable_[a-z0-9]*/i", $string, $vars, PREG_SET_ORDER);
    preg_match_all("/(=)([a-z0-9]*)(,?)/i", $string, $vals, PREG_SET_ORDER);

    if(count($vals) > count($vars))
    {
        array_shift($vals);
    }
    for($i = 0; $i < count($vars); $i++)
    {
        $retval[$vars[$i][0]] = $vals[$i][2];
    }
    return $retval;
}

parse_varstr() returns:

Array
(
    [variable_x] => 123
    [variable_y] => 456
    [variable_z] => 789
)

fambi
10-15-2004, 06:51 AM
My apologies, and this just goes to prove what a newbie I am, but when I said 'array', I meant 'loop'.

The above string is extracted from an http header, so it looks like this:

$result[9]="{variables_xyz=[{variable_x=123, variable_y=456, variable_z=789}]}";

I tried ShrineDesigns' idea but I think there must be something a bit simpler.

Looking forward to more great help.

yuna
10-15-2004, 08:02 AM
If you know for sure that the line will always begin with "{variables_xyz=", then one method might be:


$result="{variables_xyz=[{variable_x=123, variable_y=456, variable_z=789}]}";

# lop off the first fifteen characters ("{variables_xyz=")
# then remove the brackets and braces
$vars=str_replace(array("[","{","]","}"),"",substr($result,15));
# this should leave $vars="variable_x..._z=789"

$values=explode(",",$vars);
# use trim to get rid of any leading spaces
$xstring=explode("=",trim($values[0]));
$x=$xstring[1];
$ystring=explode("=",trim($values[1]));
$y=$ystring[1];
$zstring=explode("=",trim($values[2]));
$z=$zstring[1];
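
For what it's worth, the same values can also be pulled out without counting characters at all. This is only a minimal sketch, assuming every name and value is made of letters, digits and underscores; extract_pairs() is a made-up helper name, not something from the posts above.

```php
<?php
// Hypothetical helper: collects every name=value pair into an associative array.
// Assumes names and values consist only of letters, digits and underscores.
function extract_pairs($line)
{
    $pairs = array();
    // First (\w+) captures the name, second (\w+) the value.
    // "variables_xyz=[" is skipped because "[" is not a word character.
    preg_match_all('/(\w+)=(\w+)/', $line, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $pairs[$m[1]] = $m[2];
    }
    return $pairs;
}

$result = "{variables_xyz=[{variable_x=123, variable_y=456, variable_z=789}]}";
$vars = extract_pairs($result);
echo $vars['variable_x'], " ", $vars['variable_y'], " ", $vars['variable_z'], "\n";
```

Because the pairs end up keyed by name, the order in which they appear in the line no longer matters.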

fambi
10-15-2004, 09:12 AM
Made a few modifications, but that is exactly what i am looking for.

Thank you very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very much .

fambi
10-15-2004, 10:30 AM
Ok. Things just got much, much, much worse!

Having dealt with the first string, i now have to deal with something that is a lot more ugly!

$string="{variableA=abc, variableB=bcd, variableC=cde, variableE=efg, variableF=fgh, variableG=ghi},{variableA=abc, variableB=bcd, variableC=cde, variableE=efg, variableF=fgh, variableG=ghi},{variableA=abc, variableB=bcd, variableC=cde, variableE=efg, variableF=fgh, variableG=ghi},{variableA=abc, variableB=bcd, variableC=cde, variableE=efg, variableF=fgh, variableG=ghi}";

I now have to parse $string, but unlike my previous request, I have to parse an unpredictable number of responses, each enclosed within curly brackets and separated by a comma as shown above, and then insert the results into a database.

In the above example, there are 4 responses, but when I get these responses, I won't know how many to expect!

One response is something that (with Yuna's help) I have dealt with, but an unlimited and unpredictable number of responses is about 70,000 miles above my head!

Heeeeeeeeeeeeeeeeeeeeeeeeeeeeeelllllllllpppp

HaganeNoKokoro
10-15-2004, 12:18 PM
This code first splits the string by curly braces, making an array of everything that is surrounded by curly braces. Then it splits out the variables using a regexp and places them in an array.

EDIT: Found a dumb mistake in the code, fixed


function parseString($string) {
    preg_match_all("|\\{[^\\}\\{]*\\}|", $string, $arr); //split by braces
    $answer=array();
    for($i=0; $i<count($arr[0]); $i++) { //for each brace set
        preg_match_all("|variable[\\w]+=[\\w]*|", $arr[0][$i], $match); //split out the variables
        $answer[$i]=array();
        for($j=0; $j<count($match[0]); $j++) { //for each variable
            $spl=explode("=", $match[0][$j]); //split about = sign
            $answer[$i][$spl[0]]=$spl[1];
        }
    }
    return $answer;
}

So

$str="{variableA=abc, variableB=bcd, variableC=cde, variableE=efg, variableF=fgh, variableG=ghi}, {variableA=abc, variableB=bcd, variableC=cde, variableE=efg, variableF=fgh, variableG=ghi}, {variableA=abc, variableB=bcd, variableC=cde, variableE=efg, variableF=fgh, variableG=ghi}";
$vars=parseString($str);
print_r($vars);

gives

Array (
[0] => Array ( [variableA] => abc [variableB] => bcd [variableC] => cde [variableE] => efg [variableF] => fgh [variableG] => ghi )
[1] => Array ( [variableA] => abc [variableB] => bcd [variableC] => cde [variableE] => efg [variableF] => fgh [variableG] => ghi )
[2] => Array ( [variableA] => abc [variableB] => bcd [variableC] => cde [variableE] => efg [variableF] => fgh [variableG] => ghi )
)

fambi
10-15-2004, 04:20 PM
Thanks Hagane, now can anyone tell me how I stick all of that into a table (assuming that the table has an auto-increment)?

HaganeNoKokoro
10-15-2004, 04:44 PM
That will depend on what database management system you are using, and on your table structure.

fambi
10-16-2004, 01:27 AM
Well, i am using mysql and the table structure will be exactly the same as the variables.

HaganeNoKokoro
10-16-2004, 02:26 AM
I found a silly error in my code, fixed above.

Anyway, to form queries I would do something like this:

$str="{variableA=abc, variableB=bcd, variableC=cde, variableE=efg, variableF=fgh, variableG=ghi},
{variableA=cbc, variableB=bcd, variableC=cde, variableE=efg, variableF=fgh, variableG=ghi},
{variableA=acc, variableB=bcd, variableC=cde, variableE=efg, variableF=fgh, variableG=ghi}";
$vars=parseString($str);
foreach($vars as $cvar) {
    $query="INSERT INTO variables (";
    $i=0;
    foreach($cvar as $k=>$v) {
        $query.=$k;
        if($i<count($cvar)-1) $query.=",";
        else $query .=") VALUES (";
        $i++;
    }
    //quote each value; real input should also be escaped for the database
    $query.="'".implode("','", $cvar)."')";
    echo $query."\n"; //or pass it to your MySQL query function
}

This results in the following queries:

INSERT INTO variables (variableA,variableB,variableC,variableE,variableF,variableG) VALUES ('abc','bcd','cde','efg','fgh','ghi')
INSERT INTO variables (variableA,variableB,variableC,variableE,variableF,variableG) VALUES ('cbc','bcd','cde','efg','fgh','ghi')
INSERT INTO variables (variableA,variableB,variableC,variableE,variableF,variableG) VALUES ('acc','bcd','cde','efg','fgh','ghi')

Hopefully this fits what you were thinking in terms of table structure.
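
If the values ever contain quotes or other arbitrary text, pasting them straight into the SQL gets risky. One possible alternative is to build the query with placeholders and hand the values over separately. The following is just a sketch under assumptions: build_insert() is a made-up helper, and the PDO connection in $pdo is assumed to exist and is not shown.

```php
<?php
// Hypothetical helper: builds an INSERT with one "?" placeholder per value.
// Column names are checked against a strict pattern because placeholders
// cannot stand in for identifiers.
function build_insert($table, array $row)
{
    $columns = array();
    foreach (array_keys($row) as $name) {
        $name = trim($name);
        if (!preg_match('/^\w+$/', $name)) {
            throw new InvalidArgumentException("Unexpected column name: $name");
        }
        $columns[] = $name;
    }
    $placeholders = implode(',', array_fill(0, count($row), '?'));
    $sql = "INSERT INTO $table (" . implode(',', $columns) . ") VALUES ($placeholders)";
    return array($sql, array_values($row));
}

$row = array('variableA' => 'abc', 'variableB' => 'bcd', 'variableC' => 'cde');
list($sql, $params) = build_insert('variables', $row);
echo $sql, "\n";

// With a PDO connection in $pdo (assumed, not shown), the row would be stored with:
// $stmt = $pdo->prepare($sql);
// $stmt->execute($params);
```

The auto-increment column is simply left out of the column list, so MySQL fills it in by itself.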

fambi
10-16-2004, 10:41 AM
Ok. It works fine with your script, but when I use the actual values, it doesn't work!

Try this:

$str="{Time=2004-10-13 12:40:46, Status=4, Id=0410131237513251X, List=123456789, cost=0.0, Body=ok}, {Time=2004-10-13 16:33:34, Status=4, Id=0410131633024521X, List=987654321, cost=0.0, Body=unregistered}";

You seem to know about 5,000,000 kilos of php more than me and i would appreciate it if you could also explain how you come to the solution.

HaganeNoKokoro
10-16-2004, 11:58 AM
Aha, you had us all thinking your variables were actually called "variableA", not things like "Time". Here's a fixed version of the parseString() code that should work better:

function parseString($string) {
    preg_match_all("|\\{[^\\}]*\\}|", $string, $arr); //split by braces
    $answer=array();
    for($i=0; $i<count($arr[0]); $i++) { //for each brace set
        preg_match_all("|[^\\{\\}\\=\\,]+=[^\\,\\{\\}]*|", $arr[0][$i], $match); //split out the variables
        $answer[$i]=array();
        for($j=0; $j<count($match[0]); $j++) { //for each variable
            $spl=explode("=", $match[0][$j], 2); //split about the first = sign
            $answer[$i][trim($spl[0])]=trim($spl[1]); //trim the space left after each comma
        }
    }
    return $answer;
}
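
To show how the parsed result can be used with the real field names, here is a small self-contained sketch. parse_records() is a made-up stand-in that follows the same idea (split by braces, then into name and value), and it assumes that no value contains a comma or a curly brace.

```php
<?php
// Hypothetical stand-in for the parser above: one associative array per {...} group.
// Assumes values never contain commas or curly braces.
function parse_records($string)
{
    $records = array();
    preg_match_all('/\{([^{}]*)\}/', $string, $groups); // contents of each brace pair
    foreach ($groups[1] as $group) {
        $record = array();
        foreach (explode(',', $group) as $pair) {
            // limit 2: only the first "=" separates name from value
            $parts = explode('=', $pair, 2);
            if (count($parts) == 2) {
                $record[trim($parts[0])] = trim($parts[1]);
            }
        }
        $records[] = $record;
    }
    return $records;
}

$str = "{Time=2004-10-13 12:40:46, Status=4, Id=0410131237513251X, List=123456789, cost=0.0, Body=ok}, "
     . "{Time=2004-10-13 16:33:34, Status=4, Id=0410131633024521X, List=987654321, cost=0.0, Body=unregistered}";

foreach (parse_records($str) as $n => $record) {
    // each record is an associative array, so fields are read by name
    echo $n, ": ", $record['Id'], " at ", $record['Time'], " -> ", $record['Body'], "\n";
}
```

However many brace groups arrive, the outer loop runs once per group, so the number of responses does not need to be known in advance.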

fambi
10-17-2004, 01:17 AM
EXCELLENT! Thank you Hagane, that is absolutely, tremendously perfect!

Now, hopefully to bring the "parsing this piece of string" saga to an end...

In my original code, we used to receive an http header and we had to extract two pieces of data that were kept in line 0 & line 8 of the returned header (i.e. $result). So the original code looked like this:


for($mm=0; $mm<$count; $mm++)
{
$status[$mm] = $result[0];
$id[$mm] = $result[8];
if(trim($status[$mm])=='11')
{
bla bla bla...


But the thing is, now we get all the data fed to us on line 9 of the header in one piece of string.

I thought it would be sufficient to parse it, but i think that there is more to it, because, although it is parsed correctly, it doesn't work when placed within an array.

Although my PHP knowledge is increasing day by day (with your kind help, I may add), I still can't get a grip on 'arrays'.

The code shown below is, with the help of Yuna, the code I am using to parse line 9 of the http header, but what do I have to do for it to work within the for($mm=0; $mm<$count; $mm++) statement?


//$line_9=$result[9];
$line_9="{Report=[{ID=0410141738518081X, Status=11}]}";
$result=str_replace (array ('Report=','{[{', '}]}'), array ('','', ''), $line_9);
$values=explode(",",$result);
$xstring=explode("=",trim($values[0]));
$id=$xstring[1];
$ystring=explode("=",trim($values[1]));
$status=$ystring[1];


I hope that i am making sense.

I look forward to your help and would also appreciate it if you could give me some links to helpful tutorials on arrays.

drythirst
10-17-2004, 05:32 PM
Please Clarify

fambi
10-17-2004, 05:47 PM
I wish i could clarify, but i myself don't understand.

1. We used to receive an http header as a response.
2. Within that header, there were 2 pieces of data that we would extract and insert into a table (they were kept in line 0 & line 8 of the header).
3. As we were receiving many responses at the same time, my programmer used to use this code


for($mm=0; $mm<$count; $mm++)
{
$status[$mm] = $result[0];
$id[$mm] = $result[8];
if(trim($status[$mm])=='11')
{


4. Now, however, this set up has changed and we receive both pieces of information within line 9 of the header.
5. With the help of 2 very kind PHP programmers (Yuna & HaganeNoKokoro) I have been able to parse line 9 and extract the two pieces of information.
6. HOWEVER, in my dialogue with them, I didn't realise that it might be slightly more complicated than simply parsing the line, as I also have to account for:


for($mm=0; $mm<$count; $mm++)


I have tried to mimic the original code by adding the [$mm] as shown below


for($mm=0; $mm<$count; $mm++)
{
$line_9="{Report=[{ID=0410141738518081X, Status=11}]}";
$result=str_replace (array ('Report=','{[{', '}]}'), array('','', ''), $line_9);
$values=explode(",",$result);
$xstring=explode("=",trim($values[0]));
$id[$mm]=$xstring[1]; //in this line
$ystring=explode("=",trim($values[1]));
$status[$mm]=$ystring[1];//and in this line

if(trim($status[$mm])=='11')
{


but it didn't work, leaving me to assume that it is not as simple as that.


I would appreciate any assistance that i could get.

Thanks.

fambi
10-18-2004, 08:04 AM
Heeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeelp.

HaganeNoKokoro
10-18-2004, 12:14 PM
What exactly is $count the count of?

fambi
10-18-2004, 01:19 PM
I am about to try and find out...

fambi
10-18-2004, 01:25 PM
OK, this is what I found... I hope it makes sense:


if($send == "YES")
{
$split = str_split($data,160);
$count = sizeof($split);
$index = $index+$m;
$formatted = str_replace('\\', '',$data);
$split = str_split($formatted,160);
for($mm=0; $mm<$count; $mm++)
{

HaganeNoKokoro
10-18-2004, 02:38 PM
So it takes the $data string and splits it into 160-character long chunks for processing, and puts the chunks in the $split array. But then I don't see the $split being used anywhere in the loop. Very mysterious. I would have thought that, each time through the loop, we would be doing something with $split[$mm].

What is in this $data? I'm assuming it's the header data you're trying to work with. Do you have an example?
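
To illustrate what I would have expected, here is a rough sketch of a loop that actually uses each chunk. The $data value is invented test input, and the sketch relies on the str_split() that is built into PHP 5 and later.

```php
<?php
// Invented test input: two full 160-character chunks plus a short remainder.
$data = str_repeat("a", 160) . str_repeat("b", 160) . "tail";

// str_split() returns an array of chunks of at most 160 characters each.
$split = str_split($data, 160);
$count = count($split);

for ($mm = 0; $mm < $count; $mm++) {
    // each pass through the loop handles exactly one chunk
    echo $mm, ": ", strlen($split[$mm]), " characters\n";
}
```

So $count is the number of chunks, and $mm is the index of the chunk being processed on each pass.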

fambi
10-18-2004, 05:08 PM
Thanks for the help.

Well basically, $data is a message that is sent out via http GET.

In response, an http header is received (i.e. $result), which has the status of the message (e.g. OK) as well as a unique ID for the message.

Here is the script (without all the unnecessary stuff). Make sure to read the comments.


<?include("/post_function.php");?>
<?
$text_data = "This is an example of a message.";
$message_text_save = $text_data;
$recipients_count =1;
$total_message_count =0;
$successful_message =0;
$failed_message =0;
$message_index =0;

if(!function_exists('str_split')) //PHP 5 and later already provide str_split()
{
function str_split($str,$num)
{
if($num < 1) return FALSE;
$arr = array();
for ($j = 0; $j < strlen($str); $j= $j+$num)
{
$arr[] = substr($str,$j,$num);
}
return $arr;
}
}

if(isset($SendMessage))
{
$split_message = str_split($text_data,160);
$message_count = sizeof($split_message);
$message_index = $message_index+$m;
$formatted_message = str_replace('\\', '',$text_data);
$split_message = str_split($formatted_message,160);

for($mm=0; $mm<$message_count; $mm++)
{
$data["1"] = "A";
$data["2"] = "B";
$data["3"] = "C";
$data["4"] = "D";
$data["5"] = $split_message[$mm];

$result = post_it($data, "http://url.com/send");//this is the feedback we get from the post function
//The following 2 lines were originally used to extract the required info from the returned header, i.e. $result.
$message_status[$mm] =substr($result[0],13);//this is the first piece of info i.e. Status
$message_id[$mm] =$result[8];//this is the second piece of info i.e. ID

/*However, now both pieces of information are kept within $result[9].
With the help of you guys, i was able to use the following code to parse a hard coded simulation of $result[9]
$result[9]="{AcceptReport=[{ID=0410141738518081X, Status=11}]}";
$result1=$result[9];
$result2=str_replace (array ('AcceptReport=','{[{', '}]}'), array ('','', ''), $result1);
$values=explode(",",$result2);
$xstring=explode("=",trim($values[0]));
$message_id=$xstring[1];
$ystring=explode("=",trim($values[1]));
$message_status=$ystring[1];

However, this doesn't work when placed within this script, leaving me to believe it has something to do with the [$mm] issue.
*/

if(trim($message_status[$mm] )=='OK')
{
$successful_message = $successful_message+1;
}
else
{
$failed_message = $failed_message+1;
}
}
$total_message_count =$message_count;
}
?>


This script was written by a friend as a great favour to me and, although it may not be as professional as wanted, it worked.

But, due to unforeseen circumstances, things have changed and now the results are received in $result[9].

Thanks again for your help.

HaganeNoKokoro
10-19-2004, 01:44 AM
Will the first part always be the same (like will it ALWAYS be AcceptReport, or can it be any of several things)? Because if it will not be the same in every result, then we'll have to rethink the method a little.

for($mm=0; $mm<$message_count; $mm++) {
    $data["1"] = "A";
    $data["2"] = "B";
    $data["3"] = "C";
    $data["4"] = "D";
    $data["5"] = $split_message[$mm];

    $result = post_it($data, "http://url.com/send");

    $result1=preg_replace("|[\\w]+=|", "", $result[9], 1);
    $result2=str_replace(array("{[{", "}]}"), array("", ""), $result1);

    $values=explode(",", $result2);
    $idpair=explode("=", trim($values[0]), 2); //drop the "ID=" part
    $message_id[$mm]=$idpair[1];
    $statuspair=explode("=", trim($values[1]), 2); //drop the "Status=" part
    $message_status[$mm]=$statuspair[1];

    // ... whatever you're gonna do down below
}
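
If the report line might occasionally come back in a different shape, a slightly more defensive variant could check the format before storing anything under [$mm]. This is only a sketch: parse_report() is a made-up helper, the second sample line is invented, and the $lines array stands in for the $result[9] values that post_it() would return.

```php
<?php
// Hypothetical helper: pulls ID and Status out of a line such as
// {AcceptReport=[{ID=0410141738518081X, Status=11}]}
// Returns null when the line does not have the expected shape.
function parse_report($line)
{
    if (preg_match('/\{ID=([^,}]*),\s*Status=([^,}]*)\}/', $line, $m)) {
        return array('id' => trim($m[1]), 'status' => trim($m[2]));
    }
    return null;
}

$message_id = array();
$message_status = array();

// Stand-ins for the $result[9] of each pass; the second line is invented sample data.
$lines = array(
    "{AcceptReport=[{ID=0410141738518081X, Status=11}]}",
    "{AcceptReport=[{ID=0410141738518082X, Status=4}]}",
);

foreach ($lines as $mm => $line) {
    $report = parse_report($line);
    if ($report !== null) {
        // one slot per message, indexed by the loop counter
        $message_id[$mm] = $report['id'];
        $message_status[$mm] = $report['status'];
    }
}
print_r($message_status);
```

The pattern does not care what the first word is, so it would keep working even if "AcceptReport" changed to something else.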

fambi
10-19-2004, 05:54 AM
Well, for the foreseeable future, it is always going to be the same.

So if you ever leave the webdeveloper forum, make sure to leave me some contact details (just joking!).

I haven't tried implementing your feedback, but as soon as i do, i will let you know how it has gone.

THANX!

fambi
10-23-2004, 09:02 AM
OK. Sorry for the delay. But I have implemented the developments into the scripts and tested them, and they all work.

THANK YOU HAGANE and of course yuna as well.

I now officially declare the "Parsing this piece of string" saga CLOSED.

Thanks again.