[Linux-disciples] Backslashes within quotes

Stephen R Laniel linux-disciples@bostoncoop.net
Mon, 24 Nov 2003 16:47:58 -0500


I'm looking to write a Perl regex that would convert, e.g.,
book="(.*)" into <span class="booktitle">$1</span> (with more
generality, but that's the idea).

The trouble is, what if $1 contains quotation marks? Let's imagine I had
a book called "Screwed: Life Aboard "The Titanic"". How could I write a
regex that would properly turn this into
<span class="booktitle">Screwed: Life Aboard "The Titanic"</span>?
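
To make the problem concrete, here is a small sketch of how the naive
pattern behaves. The convert helper and the Emma/Persuasion line are made
up for illustration; they aren't part of my real script.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap whatever the given pattern captures in a span.
sub convert {
    my ($text, $re) = @_;
    (my $out = $text) =~ s{$re}{<span class="booktitle">$1</span>}g;
    return $out;
}

my $greedy = qr/book="(.*)"/;    # runs to the LAST quote on the line
my $lazy   = qr/book="(.*?)"/;   # stops at the FIRST quote after book="

my $one = 'book="Screwed: Life Aboard "The Titanic""';
my $two = 'book="Emma" and book="Persuasion"';   # illustrative second case

print convert($one, $greedy), "\n";  # happens to work: one title on the line
print convert($one, $lazy),   "\n";  # title cut off before "The Titanic"
print convert($two, $greedy), "\n";  # both titles swallowed into one span
```

So the greedy form only works when there is exactly one title per line,
and the non-greedy form breaks on the first embedded quote.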

I could insist that my users escape each nested quote with a backslash, a
la
book="My Life As a \"Monkee\""

which seems like the best solution for now; embedded quotes are quite
rare. (Of the 359 books that I own, I find only two that have quotes
within the title. They're both by Feynman: _"Surely You're Joking, Mr.
Feynman!"_ and _"What Do You Care What Other People Think?"_)

But I don't know how to write the regex to properly handle escaped
quotation marks. Any ideas?
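
One candidate, offered as a sketch rather than a settled answer: match
either a character that is neither a quote nor a backslash, or a backslash
followed by any one character, and strip the backslashes afterwards. The
names convert_escaped and unescape are hypothetical, and the sketch assumes
titles never span lines.

```perl
use strict;
use warnings;

# Hypothetical helper: remove one level of backslash escaping (\" becomes ").
sub unescape {
    my ($s) = @_;
    $s =~ s/\\(.)/$1/g;
    return $s;
}

# Hypothetical helper: convert book="..." where inner quotes are written \".
sub convert_escaped {
    my ($text) = @_;
    # (?:[^"\\]|\\.)* = any run of ordinary characters or backslash pairs,
    # so the match ends only at the first quote that is NOT escaped.
    $text =~ s{book="((?:[^"\\]|\\.)*)"}
              {'<span class="booktitle">' . unescape($1) . '</span>'}ge;
    return $text;
}

print convert_escaped('book="My Life As a \"Monkee\""'), "\n";
print convert_escaped('book="Emma" and book="Persuasion"'), "\n";
```

Because the match stops at the first unescaped quote, two titles on one
line (the second test line, again made up) should come out as two separate
spans.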

Also, what if I have a movie title within a book title, so that I might
want to write

book="movie="Adaptation": The Screenplay"

Users shouldn't be expected to escape the quotes around the movie title.
A first pass over this string would turn movie="Adaptation" into
<span class="movietitle">Adaptation</span>, whose class="movietitle"
attribute would then put a pair of unescaped quotation marks inside the
outer set.
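
Here is a sketch of that two-pass failure. The book pattern used in the
second pass is just one assumed form that stops at the first quote not
preceded by a backslash; any pattern with that behavior would trip the same
way.

```perl
use strict;
use warnings;

my $text = 'book="movie="Adaptation": The Screenplay"';

# First pass: the inner movie title (no nested quotes, so [^"]* suffices).
(my $pass1 = $text) =~ s{movie="([^"]*)"}{<span class="movietitle">$1</span>}g;

# Second pass: a book pattern that ends at the first unescaped quote.
(my $pass2 = $pass1) =~ s{book="((?:[^"\\]|\\.)*)"}{<span class="booktitle">$1</span>}g;

print "$pass1\n";  # the generated class="movietitle" now sits inside book="..."
print "$pass2\n";  # the book span closes right after <span class=
```

The second pass treats the quote that opens class="movietitle" as the end
of the book title, so the output is mangled.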

So. Hm. Any ideas how to handle edge cases like this?

-- 
``What must it be like to see the world from inside David Bernstein's
  head? ... It must be like living in a Mondrian painting.''
 -Kieran Healy, http://www.crookedtimber.org/archives/000869.html