[Linux-disciples] Perl: displaying the part of a match that failed

Chung-chieh Shan ccshan at post.harvard.edu
Fri Jun 24 20:56:22 EDT 2005


On 2005-06-24T16:19:46-0400, Stephen R Laniel wrote:
> On Fri, Jun 24, 2005 at 04:14:39PM -0400, Chung-chieh Shan wrote:
> > I don't understand what "the part of the file that failed to match"
> > means.  What part of "abcde" failed to match /^.*f.*$/ ?
> Good point; it was unclear on my part. What I mean is "on
> what part of the pattern did the regex parser fail and
> decide that the string didn't match the pattern?" In the
> above, the string matches up to the end of the first '.*'
> and fails at 'f'.

But where in the string "abcde" did the match fail?

> So in my dreamworld, Perl would say, "CSS
> parsing failed at
> 
> elemName {
> 	font-fcae: Times New Roman;
> }
> 
> because there is no property 'font-fcae'."
> 
> Is that any clearer?

I think if you try to specify (e.g., to code up) how to match a given
regular expression against a given string and return an error location
on failure, you would see where you are and aren't being clear.

To be more specific: I assume that this dreamworld is where you told
Perl to match against the incorrect CSS string

    elemName {
        font-face: Times New Roman;
    }
    elemName {
        font-fcae: Times New Roman;
    }
    elemName {
        font-face: Times New Roman;
    }

where the first and third declarations are correct, and the second one
is not.

But note that you haven't told Perl what constitutes an incorrect
declaration.  It has no way of distinguishing between your dream and the
nightmare where it says "CSS parsing failed at

    elemName {
        font-fcae: Times New Roman;
    }

because there is no property 'font-fcae: Tim'."

Moreover, you haven't told Perl that you want the first incorrect
-declaration- it encounters, not the first incorrect stylesheet it
encounters (or the first incorrect block or statement it encounters,
whatever that would mean).  Why shouldn't Perl say "CSS parsing failed
at

    font-fcae

because there is no property 'font-fcae'.", or "CSS parsing failed at

    elemName {
        font-face: Times New Roman;
    }
    elemName {
        font-fcae: Times New Roman;
    }
    elemName {
        font-face: Times New Roman;
    }

because there is no property 'font-fcae'."?
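
To make that concrete, here is a minimal sketch of what "telling Perl" might look like. Everything in it is my assumption rather than your code: the whitelist %known, the function name first_bad_declaration, and the toy idea that a block is "selector { body }" and a declaration is "property: value;". Text that doesn't look like a block at all is simply ignored here.

```perl
use strict;
use warnings;

# Assumption: a toy whitelist; a real one would list every property.
my %known = map { $_ => 1 } qw(font-face color);

# Report the first bad *declaration*, naming only its property.
# Returns an error string, or undef when nothing is wrong.
sub first_bad_declaration {
    my ($css) = @_;
    # A "block" is: selector { body }
    while ($css =~ /\G\s*([\w-]+)\s*\{([^}]*)\}/gc) {
        my ($selector, $body) = ($1, $2);
        # A "declaration" is: property : value ;
        for my $decl (split /;/, $body) {
            next unless $decl =~ /\S/;              # skip the blank tail
            (my $text = $decl) =~ s/^\s+|\s+$//g;   # trim for the message
            my ($prop) = $text =~ /^([\w-]+)\s*:/;
            return sprintf("in '%s': malformed declaration '%s'",
                           $selector, $text)
                unless defined $prop;
            return sprintf("in '%s': there is no property '%s'",
                           $selector, $prop)
                unless $known{$prop};
        }
    }
    return;
}

my $css = <<'CSS';
elemName {
    font-face: Times New Roman;
}
elemName {
    font-fcae: Times New Roman;
}
elemName {
    font-face: Times New Roman;
}
CSS

my $error = first_bad_declaration($css);
print defined $error ? "CSS parsing failed $error\n" : "CSS is fine\n";
```

The point is that both the unit of error (the declaration) and the reason (the property name up to the colon is not in the whitelist) are spelled out in ordinary code. Neither falls out of a single pattern match.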

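For your purpose, I suspect that it would help to put "|(.*)" at the end
of your $stylesheet pattern, so that text the rest of the pattern rejects
still matches and lands in a capture group you can report.  But in
general, making a parser generate good error messages is tricky.  If
you're in the mood for danger, you could access "pos $input_string" from
within a well-positioned "(?{code})" block, which runs Perl code whenever
the matcher reaches it.  The following program prints 13, then 8, the two
offsets at which the matcher attempts "cd":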

    $a = 'abcabcabcdeabcdef';
    $a =~ m!.*(?{ print pos $a, "\n" })cd.*c!;
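
The same trick can be pointed at the CSS case. Below is a rough sketch, not a recommendation: it remembers the farthest offset at which the matcher began looking for a property name, and reports that offset if the whole match fails. The grammar, the variable $farthest and the function failure_offset are all made up for illustration, and I am assuming a reasonably recent Perl. Inside the block, pos() with no argument gives the current match position.

```perl
use strict;
use warnings;

our $farthest;   # package variable, updated from inside the pattern

# Assumption: a toy grammar where only 'font-face' and 'color' are valid.
my $stylesheet = qr/
    \A
    (?:
        \s* [\w-]+ \s* \{
        (?:
            \s*
            (?{ $farthest = pos() if pos() > $farthest })
            (?:font-face|color) \s* : [^;}]* ;
        )*
        \s* \}
    )*
    \s* \z
/x;

# Returns undef if $css matches, else the farthest offset at which
# the matcher started to look for a property name.
sub failure_offset {
    my ($css) = @_;
    $farthest = 0;
    return $css =~ $stylesheet ? undef : $farthest;
}

my $css = <<'CSS';
elemName {
    font-face: Times New Roman;
}
elemName {
    font-fcae: Times New Roman;
}
elemName {
    font-face: Times New Roman;
}
CSS

my $offset = failure_offset($css);
if (defined $offset) {
    # Show the rest of the line starting at that offset.
    my ($line) = substr($css, $offset) =~ /\A([^\n]*)/;
    printf "CSS parsing got no further than offset %d: %s\n", $offset, $line;
}
```

On this input the reported offset should be where "font-fcae" begins. Keep in mind that "farthest position reached" is only a heuristic: it depends on where you put the block and on how the matcher backtracks, so it can point somewhere unhelpful on other inputs.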

-- 
Edit this signature at http://www.digitas.harvard.edu/cgi-bin/ken/sig
OSCE report on 2004 USA elections: http://www.osce.org/item/13658.html