Saturday, June 4, 2011

How to read part of a file that is not delimited by a newline (\n)

One of Perl's special variables makes this possible.

Special variable used :- $/

$/ is the input record separator, newline (\n) by default. (It should not be confused with $\, the output record separator that print appends.) $/ may be set to a value longer than one character in order to match a multi-character delimiter; the value is treated as a literal string, not a regular expression. If $/ is undefined (i.e. $/ = undef), no record separator is matched, and <FILEHANDLE> will read everything to the end of the current file.
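
To see both behaviours of $/ side by side, here is a minimal sketch. The sample string, the '::' delimiter and the read_records helper are made up for illustration, and the data is read through an in-memory filehandle so that no file is needed.

```perl
use strict;
use warnings;

# Hypothetical sample data; '::' stands in for any multi-character delimiter.
my $data = "alpha::beta::gamma";

# Illustrative helper: returns the records of $text as split by $sep.
# Passing undef as $sep reads the whole input as a single record.
sub read_records {
    my ($text, $sep) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    local $/ = $sep;            # the old value of $/ is restored when the sub returns
    my @records = <$fh>;
    close $fh;
    return @records;
}

my @by_delim = read_records($data, '::');    # the separator stays attached to each record
my @slurped  = read_records($data, undef);   # one record holding everything

print scalar(@by_delim), " records with '::'\n";
print scalar(@slurped), " record with undef\n";
```

Using local inside a sub or block keeps the change from leaking into other code that still expects line-by-line reads.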

Hence this can be used for a file that contains many XML documents, for example:-
________________________________________________________________________________
<DOC>
  <NAME>ABC</NAME>
  <CONTENT>This is first sample doc </CONTENT>
</DOC>

<DOC>
  <NAME>PQR</NAME>
  <CONTENT>This is second sample doc </CONTENT>
</DOC>

_________________________________________________________________________________

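Now an XML parser will fail if it is given the above text as a whole, because a well-formed XML document can have only one root element and this text has two <DOC> elements at the top level.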

Hence we can separate the XML documents while reading from the file by using the $/ special variable as follows:-
______________________________________________________________________________
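local $/ = "</DOC>";    # each read now ends at the closing </DOC> tag
open(my $fh, '<', "ap/ap890101") or die "Cannot open ap/ap890101: $!";
while (my $value = <$fh>) {
     print $value;      # one <DOC>...</DOC> block per iteration
}
close($fh);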
______________________________________________________________________________
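
One detail to watch for: any text after the last </DOC> (here just blank lines) comes back as one extra record, and the blank lines between documents end up at the start of the next record. The following is a rough sketch of one way to tidy this up, assuming the sample text shown earlier; it reads from an in-memory filehandle instead of a real file, and the @docs array is only an illustrative name.

```perl
use strict;
use warnings;

# Sample text copied from the example above, held in memory instead of a file.
my $text = <<'END';
<DOC>
  <NAME>ABC</NAME>
  <CONTENT>This is first sample doc </CONTENT>
</DOC>

<DOC>
  <NAME>PQR</NAME>
  <CONTENT>This is second sample doc </CONTENT>
</DOC>

END

my @docs;
{
    local $/ = "</DOC>";                 # restored at the end of this block
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    while (my $record = <$fh>) {
        next unless $record =~ /\S/;     # skip the whitespace-only tail after the last </DOC>
        $record =~ s/^\s+//;             # drop the blank lines left between documents
        push @docs, $record;
    }
    close $fh;
}

print scalar(@docs), " documents found\n";
```

Each element of @docs is then a single <DOC>...</DOC> block that can be handed to an XML parser on its own.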

Courtesy of this link, which provides information about Perl special variables :-
http://www.kichwa.com/quik_ref/spec_variables.html
