[Linux-disciples] Regex for an English word?

Stephen R Laniel steve at laniels.org
Wed Nov 30 12:00:42 EST 2005


Perl's \w "word" regex is designed for programming-language
"words" -- letters, digits, and underscores [1]. Is there a
standard regex that people use for English words? You want to
allow apostrophes within them, but only after the first
character. Something like

[[:alpha:]]+[']?[[:alpha:]]+

would do it, but not quite: it requires at least two letters,
so the one-letter word "I" wouldn't match. Getting this exactly
right might be a little tricky, but I assume someone has come
up with a fairly standard and robust regex for English words.
Does anyone know of a good one?
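
A minimal sketch of one candidate, written in Python rather
than Perl, though the pattern itself should port unchanged. It
assumes ASCII letters only and allows an apostrophe only
between letters; the names WORD and english_words are made up
for illustration and are not any standard.

```python
import re

# One or more letters, then zero or more (apostrophe + letters) groups.
# Assumption: ASCII letters only; accented letters need a wider class.
WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")

def english_words(text):
    """Return every word-like token found in text, in order."""
    return WORD.findall(text)

print(english_words("I don't think 'twas Steve's idea"))
```

The trade-off is that a leading or trailing apostrophe (as in
"'twas" or "dogs'") is left out of the match, since it can't
be told apart from a single quotation mark.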

In general, I assume there's some database somewhere of
standard regexes. E.g., I normally use

\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}

to match IP addresses, but that accepts out-of-range octets
such as 999 and misses other nuances in IPs. Someone must use
a better regex here, right?
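
As a rough sketch of a stricter version, again in Python: each
octet is limited to 0-255 and the whole string must match. It
assumes dotted-decimal IPv4 only and rejects leading zeros
such as "01", which is a choice rather than a rule; OCTET,
IPV4, and is_ipv4 are hypothetical names.

```python
import re

# One octet: 250-255, 200-249, 100-199, or 0-99 with no leading zero.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"

# Three "octet." groups followed by a final octet.
IPV4 = re.compile(r"(?:%s\.){3}%s" % (OCTET, OCTET))

def is_ipv4(s):
    """True if s is, in its entirety, a dotted-decimal IPv4 address."""
    return IPV4.fullmatch(s) is not None

print(is_ipv4("192.168.0.1"), is_ipv4("999.1.1.1"))
```

To find addresses inside longer text rather than validate a
whole string, the same pattern would need word boundaries or
similar guards around it.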

[1] - http://www.perl.com/doc/manual/html/pod/perlre.html

-- 
Stephen R. Laniel
steve at laniels.org
+(617) 308-5571
http://laniels.org/
PGP key: http://laniels.org/slaniel.key