Saturday, June 4, 2011

How to read records from a file that are not delimited by NEWLINE (\n)

One of Perl's special variables makes this possible.

Special variable used :- $/ .

$/ is the input record separator, newline (\n) by default. (Do not confuse it with $\, which is the output record separator.) $/ may be set to a value longer than one character in order to match a multi-character delimiter. If $/ is undefined (i.e. $/ = undef), no record separator is matched, and <FILEHANDLE> will read everything to the end of the current file.
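
The two behaviours described above can be tried out with a small sketch. It assumes an in-memory filehandle instead of a real file, and the helper name read_records and the sample string are made up for illustration only.

```perl
use strict;
use warnings;

# Sample data held in a string so the sketch needs no file on disk.
my $data = "a--b--c";

# Returns the list of records read from $text using $sep as $/.
# (read_records is a hypothetical helper, not a built-in.)
sub read_records {
    my ($text, $sep) = @_;
    local $/ = $sep;                 # undef means "read everything at once"
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    my @records = <$fh>;             # list context: one element per record
    close $fh;
    return @records;
}

my @parts = read_records($data, "--");   # multi-character separator
my @whole = read_records($data, undef);  # no separator: whole string
print scalar(@parts), " records, then ", scalar(@whole), " record\n";
```

Note that each record keeps its separator at the end ("a--", "b--", "c"), just as a line read with the default $/ keeps its newline.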

This is useful for a file that contains several XML documents one after another, for example:-
________________________________________________________________________________
<DOC>
  <NAME>ABC</NAME>
  <CONTENT>This is first sample doc </CONTENT>
</DOC>

<DOC>
  <NAME>PQR</NAME>
  <CONTENT>This is second sample doc </CONTENT>
</DOC>

_________________________________________________________________________________

An XML parser will fail on the above text as a whole, because a well-formed XML document may have only one root element.

Hence we can separate the XML documents while reading from the file by setting the $/ special variable as follows:-
______________________________________________________________________________
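local $/ = "</DOC>";    # each read now ends at the closing tag, not at \n
open(my $fh, '<', "ap/ap890101") or die "Cannot open ap/ap890101: $!";
while (my $doc = <$fh>) {
     next unless $doc =~ /\S/;    # skip the whitespace left after the last </DOC>
     print $doc;
}
close $fh;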
______________________________________________________________________________
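
As a rough sketch of what can be done with each chunk once it is separated, the example below reads the sample text from a string (an assumption, so that it runs without the ap/ap890101 file) and pulls out the NAME of each document with a simple regular expression. The helper name split_docs is hypothetical, and the regular expression only stands in for a real parser here.

```perl
use strict;
use warnings;

my $text = <<'END';
<DOC>
  <NAME>ABC</NAME>
  <CONTENT>This is first sample doc </CONTENT>
</DOC>

<DOC>
  <NAME>PQR</NAME>
  <CONTENT>This is second sample doc </CONTENT>
</DOC>

END

# Splits $input on "</DOC>" and returns only the chunks that hold a document.
sub split_docs {
    my ($input) = @_;
    local $/ = "</DOC>";
    open my $fh, '<', \$input or die "Cannot open in-memory handle: $!";
    my @docs;
    while (my $chunk = <$fh>) {
        $chunk =~ s/^\s+//;               # drop blank lines between documents
        next unless $chunk =~ /<DOC>/;    # skip the whitespace-only tail
        push @docs, $chunk;
    }
    close $fh;
    return @docs;
}

for my $doc (split_docs($text)) {
    my ($name) = $doc =~ m{<NAME>(.*?)</NAME>};
    print "Found document: $name\n";
}
```

Each element returned by split_docs is a single <DOC>...</DOC> block, so it could equally be handed to an XML parser one at a time.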

Courtesy of this page, which describes the Perl special variables:-
http://www.kichwa.com/quik_ref/spec_variables.html

Can't locate XML/DOM.pm in @INC: How to install the DOM parser on Windows

I was frustrated for an hour trying to find out how to install the DOM parser on Windows, as it is not preinstalled with Strawberry Perl.

Just type at the command prompt:-

 c:\> ppm install XML-DOM

This will work perfectly if the PATH environment variable is set up properly.
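
To confirm that the install worked, a short check like the one below may help. It is only a sketch: the helper name module_available is made up for illustration, and it simply tries to load the module from @INC and reports the result instead of dying.

```perl
use strict;
use warnings;

# Returns 1 if the named module can be loaded from @INC, 0 otherwise.
# (module_available is a hypothetical helper, not part of XML::DOM.)
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # XML::DOM -> XML/DOM.pm
    return eval { require $file; 1 } ? 1 : 0;
}

print module_available('XML::DOM')
    ? "XML::DOM is installed\n"
    : "XML::DOM was not found in \@INC\n";
```

If the second message is printed, the "Can't locate XML/DOM.pm in @INC" error will still appear in scripts that use the module.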