Experts Round Table Network

Serverside Technology => PHP => Topic started by: sajuks on February 24, 2007, 07:07:41 PM



Title: Regular expression Help
Post by: sajuks on February 24, 2007, 07:07:41 PM
Hi folks,

  Need your help in building a proper regular expression.
I have a file which contains information in the following format:

Information: The Batch Tw-09-hg was started at 22-09-06 at 11:30PM by Turin
Information: Document upload starts
Information: Document closed at 11:40PM
Information: The Batch Tw-09-hg was closed at 22-09-06 at 11:30PM by Turin

I need a regular expression which will search for multiple values in each line and, if a line satisfies
my values, display it on the page.
For example, if I am searching for the keywords 22-09-06 and Turin, then when I scan through the file
it should find all the lines that contain both 22-09-06 and Turin and display each whole line on a page.
Basically I am looking to display only those lines from my file which contain those specific words. The file is more than 9MB, so some file-reading tips would also be appreciated.

Thanks
Saju




Title: Re: Regular expression Help
Post by: CrYpTiC_MauleR on February 24, 2007, 09:37:16 PM
You can try this method, which doesn't use a regular expression, since a regex would be a slow way to do basic string matching, especially when parsing a 9MB file.

Code (PHP):
<?php
 
// Every keyword in this list must appear in a line for it to match.
$keywords = array('Turin', '22-09-06', 'Tw-09-hg');
 
// For a real file, load the lines from disk instead of the sample below:
//$data = file('myfile.txt', FILE_IGNORE_NEW_LINES);
$data = explode("\n", 'Information: The Batch Tw-09-hg was started at 22-09-06 at 11:30PM by Turin
Information: Document upload starts
Information: Document closed at 11:40PM
Information: Hello World!
Information: The Batch Tw-09-hg was closed at 22-09-06 at 11:30PM by Turin'
);
 
$matches = array();
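foreach ($data as $line)
{
   // Assume the line matches until one keyword is missing.
   $match = true;
   foreach ($keywords as $keyword)
   {
       // strpos() returns false only when the keyword is absent.
       if (false === strpos($line, $keyword))
       {
           $match = false;
           break; // no need to test the remaining keywords
       }
   }
   if ($match)
   {
       $matches[] = $line;
   }
}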
 
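// Escape the output and close the <pre> tag so the page stays valid HTML.
echo '<pre>', htmlspecialchars(print_r($matches, true)), '</pre>';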
 
?>
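
On the file-reading side of the question: file() loads the whole 9MB into an array at once. A minimal sketch of a line-by-line alternative, assuming the log is a plain text file on disk, is below. The function name findLinesWithAllKeywords and the path myfile.txt are made up for illustration; they are not from the original post.

```php
<?php
// Hypothetical helper (name is an assumption): returns every line of $path
// that contains all of the given keywords, reading one line at a time.
function findLinesWithAllKeywords($path, array $keywords)
{
    $matches = array();

    $handle = fopen($path, 'r');
    if ($handle === false) {
        return $matches; // file could not be opened
    }

    // fgets() returns false at end of file, so only one line is held in memory.
    while (($line = fgets($handle)) !== false) {
        $found = true;
        foreach ($keywords as $keyword) {
            if (strpos($line, $keyword) === false) {
                $found = false;
                break; // one missing keyword is enough to reject the line
            }
        }
        if ($found) {
            // Strip the trailing line break before storing the line.
            $matches[] = rtrim($line, "\r\n");
        }
    }

    fclose($handle);
    return $matches;
}

// Example call (assumes myfile.txt exists):
// print_r(findLinesWithAllKeywords('myfile.txt', array('22-09-06', 'Turin')));
```

The matching logic is the same as in the loop above; only the way the lines are obtained changes.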


Title: Re: Regular expression Help
Post by: VGR on February 26, 2007, 10:33:35 AM
100% agree.
Never use regexps except for moderately complex patterns, and only on very small files or contents.
Why? Because they are either "unreadable two weeks after you wrote them", and thus unmaintainable, or awfully slow.
Even the Perl version (and the XML parser, BTW) is not suited for real work.
I once had to parse 300 MB of XML, and believe me, that is a lot once it is "expanded" into nodes in memory.
Especially on a 256 MB RAM machine ;-)
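
To illustrate the readability point, here is a rough sketch of what the regex version of the "all keywords in one line" check could look like, using one lookahead per keyword. The helper name buildAllKeywordsPattern is an assumption made for this example, not something from the thread.

```php
<?php
// Hypothetical helper (name is an assumption): builds a pattern that matches
// only if every keyword appears somewhere in the line, via one lookahead each.
function buildAllKeywordsPattern(array $keywords)
{
    $pattern = '/^';
    foreach ($keywords as $keyword) {
        // preg_quote() escapes regex metacharacters inside the keyword.
        $pattern .= '(?=.*' . preg_quote($keyword, '/') . ')';
    }
    return $pattern . '/';
}

$pattern = buildAllKeywordsPattern(array('22-09-06', 'Turin'));
$line = 'Information: The Batch Tw-09-hg was started at 22-09-06 at 11:30PM by Turin';

// preg_match() returns 1 when the pattern matches the line.
var_dump(preg_match($pattern, $line) === 1);
```

It works, but compare it with a plain strpos() loop and judge for yourself which one you would rather maintain.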

I solved the problem using PHP, as usual.

Don't use regexps. Use strpos() !== false.
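
The strict comparison matters because strpos() returns the offset of the match, and an offset of 0 is falsy. A small sketch with a made-up sample line (not from the original file) shows the difference:

```php
<?php
// Sample line invented for this example; 'Turin' sits at offset 0.
$line = 'Turin started the batch Tw-09-hg on 22-09-06';

// strpos() returns the match offset, or false when there is no match.
$pos = strpos($line, 'Turin');

// Loose check: offset 0 converts to false, so the match is wrongly missed.
$looseFound = (bool) $pos;

// Strict check: only a real false means the keyword is absent.
$strictFound = ($pos !== false);

var_dump($pos, $looseFound, $strictFound);
// prints int(0), bool(false), bool(true)
```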