[Linux-disciples] AND with grep

Adam Kessel adam at rosi-kessel.org
Thu May 20 18:45:06 EDT 2004


On Thu, May 20, 2004 at 04:38:21PM -0400, Stephen R Laniel wrote:
> If you want to find all documents that contain both [term1]
> AND [term2], not necessarily on the same line, here's my
> thought on how to do it:
>  
> 1) grep for [term1], assume that the output is of the form
> '[filename]:.*$', and grab all the [filename]s that match
> 
> 2) repeat 1) for [term2].
> 
> 3) compare the lists obtained from 1) and 2). Return any
> filenames matching both.
> 
> Is there a better way?
> 
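
The three steps you describe can be written fairly directly with grep -l, which prints only the names of matching files, so there is no '[filename]:' prefix to parse. A minimal sketch, assuming the terms are ordinary grep patterns; the function name grep_both is made up for illustration:

```shell
# Hypothetical helper: list the files that contain both patterns,
# not necessarily on the same line. grep_both is an illustrative
# name, not an existing tool.
grep_both() {
    term1=$1
    term2=$2
    shift 2
    list1=$(mktemp) && list2=$(mktemp) || return 1
    # Steps 1 and 2: grep -l prints only the names of matching files.
    # Errors (e.g. for directories) are discarded.
    grep -l -e "$term1" -- "$@" 2>/dev/null | sort > "$list1"
    grep -l -e "$term2" -- "$@" 2>/dev/null | sort > "$list2"
    # Step 3: comm -12 keeps only the lines common to both sorted lists.
    comm -12 "$list1" "$list2"
    rm -f "$list1" "$list2"
}
```

Called as "grep_both term1 term2 *". It expects at least one file argument; with none, grep would read standard input instead.
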
Maybe not a "better" way, but there are other ways.

For example:

cat file | tr "\n" " " | grep "\(term1.*term2\)\|\(term2.*term1\)"

Or

for x in *; do if [ -f "$x" ]; then cat "$x" | tr "\n" " " | perl -n -e "/(term1.*term2)|(term2.*term1)/is and print \"$x\n\""; fi; done

Or the above as a bash script, i.e.

grepand term1 term2 scope

#!/bin/bash
term1=$1
term2=$2
shift 2
for x in "$@"; do if [ -f "$x" ]; then cat "$x" | tr "\n" " " | perl -n -e "/($term1.*$term2)|($term2.*$term1)/is and print \"$x\n\""; fi; done

Etc.

I don't know if there's a really graceful way to do it, though.
-- 
Adam Kessel
http://adam.rosi-kessel.org