PCREGREP(1)                                                        PCREGREP(1)



NAME
       pcregrep  -  a grep with Perl-compatible regular expres-
       sions.

SYNOPSIS
       pcregrep [options] [long options] [pattern] [path1 path2
       ...]

DESCRIPTION

       pcregrep  searches  files for character patterns, in the
       same way as other grep commands do, but it uses the PCRE
       regular  expression library to support patterns that are
       compatible with the regular expressions of Perl  5.  See
       pcrepattern(3)  for  a  full  description  of syntax and
       semantics of the regular expressions that PCRE supports.

       Patterns,  whether  supplied on the command line or in a
       separate file, are given without delimiters.  For  exam-
       ple:

         pcregrep Thursday /etc/motd

       If  you  attempt to use delimiters (for example, by sur-
       rounding a pattern with slashes, as is  common  in  Perl
       scripts),  they  are interpreted as part of the pattern.
       Quotes can of course be used on the command line because
       they  are  interpreted by the shell, and indeed they are
       required if a pattern  contains  white  space  or  shell
       metacharacters.
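
       As a rough illustration of the quoting rules above, the
       following sketch assumes that pcregrep is on the PATH.
       The file motd.txt and its contents are hypothetical and
       are created in a temporary directory:

```shell
# Hypothetical sample data in a scratch directory.
dir=$(mktemp -d)
printf 'Closed next Thursday\nSee /Thursday/ notes\n' > "$dir/motd.txt"

# No delimiters: both lines contain "Thursday".
plain=$(pcregrep Thursday "$dir/motd.txt")

# The slashes are part of the pattern, so only the second line matches.
slashed=$(pcregrep /Thursday/ "$dir/motd.txt")

# Shell quotes keep the space inside a single argument.
spaced=$(pcregrep 'next Thursday' "$dir/motd.txt")

printf '%s\n' "$plain" "$slashed" "$spaced"
rm -rf "$dir"
```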

       The  first  argument that follows any option settings is
       treated as the single pattern to be matched when neither
       -e  nor  -f is present.  Conversely, when one or both of
       these options are used to specify  patterns,  all  argu-
       ments are treated as path names. At least one of -e, -f,
       or an argument pattern must be provided.

       If no files are specified, pcregrep reads  the  standard
       input.  The  standard  input can also be referenced by a
       name consisting of a single hyphen.  For example:

         pcregrep some-pattern /file1 - /file3
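
       A minimal sketch of mixing files with the standard input
       follows. It assumes pcregrep is on the PATH; the files
       file1 and file3 are hypothetical and live in a temporary
       directory. The -l option is used so that only the names
       of matching inputs are listed:

```shell
# Hypothetical sample files; only file3 contains the pattern.
dir=$(mktemp -d)
printf 'nothing here\n' > "$dir/file1"
printf 'some-pattern here\n' > "$dir/file3"

# The "-" argument stands for the standard input, read between the two files.
names=$(printf 'some-pattern on stdin\n' |
  pcregrep -l some-pattern "$dir/file1" - "$dir/file3")

printf '%s\n' "$names"
rm -rf "$dir"
```

       The standard input should be listed under the name
       "(standard input)" unless --label supplies another name.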

       By default, each line that matches the pattern is copied
       to  the  standard  output, and if there is more than one
       file, the file name is output at the start of each line.
       However,  there are options that can change how pcregrep
       behaves. In particular, the -M option makes it  possible
       to  search  for patterns that span line boundaries. What
       defines a line boundary is controlled by the -N  (--new-
       line) option.

       Patterns   are  limited  to  8K  or  BUFSIZ  characters,
       whichever  is  the  greater.   BUFSIZ  is   defined   in
       <stdio.h>.

       If  the  LC_ALL or LC_CTYPE environment variable is set,
       pcregrep uses the value to set a locale when calling the
       PCRE  library.  The --locale option can be used to over-
       ride this.

OPTIONS

       --        This terminates the list of options. It is use-
                 ful  if  the  next  item  on  the command line
                 starts with a hyphen but  is  not  an  option.
                 This allows for the processing of patterns and
                 filenames that start with hyphens.

       -A number, --after-context=number
                 Output number  lines  of  context  after  each
                 matching  line.  If filenames and/or line num-
                 bers are being output, a hyphen  separator  is
                 used instead of a colon for the context lines.
                 A line containing "--" is output between  each
                 group  of  lines, unless they are in fact con-
                 tiguous in the input file. The value of number
                 is  expected  to be relatively small. However,
                 pcregrep guarantees to have up to 8K  of  fol-
                 lowing text available for context output.

       -B number, --before-context=number
                 Output  number  lines  of  context before each
                 matching line. If filenames and/or  line  num-
                 bers  are  being output, a hyphen separator is
                 used instead of a colon for the context lines.
                 A  line containing "--" is output between each
                 group of lines, unless they are in  fact  con-
                 tiguous in the input file. The value of number
                 is expected to be relatively  small.  However,
                 pcregrep  guarantees  to have up to 8K of pre-
                 ceding text available for context output.

       -C number, --context=number
                 Output number lines of context both before and
                 after  each matching line.  This is equivalent
                 to setting both -A and -B to the same value.
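
                 As a small sketch of context output, assuming
                 pcregrep is on the PATH and using a hypotheti-
                 cal five-line file, sample.txt:

```shell
# Hypothetical sample file with one word per line.
dir=$(mktemp -d)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$dir/sample.txt"

# One line of context on each side of the matching line "three".
ctx=$(pcregrep -C 1 three "$dir/sample.txt")

printf '%s\n' "$ctx"
rm -rf "$dir"
```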

       -c, --count
                 Do not output individual lines;  instead  just
                 output  a  count  of  the number of lines that
                 would otherwise have been output.  If  several
                 files are given, a count is output for each of
                 them. In this mode, the -A, -B, and -C options
                 are ignored.

       --colour, --color
                 If  this  option is given without any data, it
                 is equivalent to "--colour=auto".  If data  is
                 required,  it  must be given in the same shell
                 item, separated by an equals sign.

       --colour=value, --color=value
                 This option specifies under what circumstances
                 the  part  of  a  line  that matched a pattern
                 should be coloured in the  output.  The  value
                 may  be  "never"  (the  default), "always", or
                 "auto". In the last case,  colouring  happens
                 only  if the standard output is connected to a
                 terminal. The colour can be specified by  set-
                 ting  the environment variable PCREGREP_COLOUR
                 or PCREGREP_COLOR. The value of this  variable
                 should  be  a string of two numbers, separated
                 by a semicolon.  They are copied directly into
                 the  control  string  for  setting colour on a
                 terminal, so  it  is  your  responsibility  to
                 ensure that they make sense. If neither of the
                 environment variables is set, the  default  is
                 "1;31", which gives red.

       -D action, --devices=action
                 If  an  input  path is not a regular file or a
                 directory, "action" specifies how it is to  be
                 processed.   Valid   values  are  "read"  (the
                 default) or "skip" (silently skip the path).

       -d action, --directories=action
                 If an input  path  is  a  directory,  "action"
                 specifies  how  it  is to be processed.  Valid
                 values are  "read"  (the  default),  "recurse"
                 (equivalent  to  the  -r  option),  or  "skip"
                 (silently skip the path). In the default case,
                 directories  are read as if they were ordinary
                 files. In some operating systems the effect of
                 reading  a directory like this is an immediate
                 end-of-file.

       -e pattern, --regex=pattern, --regexp=pattern
                 Specify a pattern to be
                 matched.  This  option  can  be  used multiple
                 times in order to specify several patterns. It
                 can also be used as a way of specifying a sin-
                 gle pattern that starts with a hyphen. When -e
                 is used, no argument pattern is taken from the
                 command line; all  arguments  are  treated  as
                 file names. There is an overall maximum of 100
                 patterns. They are applied to each line in the
                 order  in  which  they  are  defined until one
                 matches (or fails to match if -v is used).  If
                 -f  is used with -e, the command line patterns
                 are matched first, followed  by  the  patterns
                 from  the  file,  independent  of the order in
                 which these options are specified.  Note  that
                 multiple use of -e is not the same as a single
                 pattern with alternatives.  For  example,  X|Y
                 finds  the first character in a line that is X
                 or Y, whereas if the two  patterns  are  given
                 separately, pcregrep finds X if it is present,
                 even if it follows Y in the line. It  finds  Y
                 only if there is no X in the line. This really
                 matters only if you are using -o to  show  the
                 portion of the line that matched.
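
                 The difference can be sketched as follows,
                 assuming pcregrep is on the PATH. The input
                 line "YX" is a made-up example. Only the first
                 line of output is kept, since other releases
                 of pcregrep may report more than one match per
                 line with -o:

```shell
# Single pattern with alternatives: the leftmost match in the line wins.
alt=$(printf 'YX\n' | pcregrep -o 'X|Y' | head -n 1)

# Separate patterns: X is tried first and is found even though it follows Y.
sep=$(printf 'YX\n' | pcregrep -o -e X -e Y | head -n 1)

printf 'alternation: %s\nseparate -e: %s\n' "$alt" "$sep"
```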

       --exclude=pattern
                 When  pcregrep  is  searching  the  files in a
                 directory as a consequence of the  -r  (recur-
                 sive  search)  option,  any  files whose names
                 match the pattern are excluded. The pattern is
                 a  PCRE  regular  expression.  If  a file name
                 matches both --include and  --exclude,  it  is
                 excluded.  There  is  no  short  form for this
                 option.

       -F, --fixed-strings
                 Interpret each pattern  as  a  list  of  fixed
                 strings,  separated by newlines, instead of as
                 a regular expression. The -w (match as a word)
                 and  -x (match whole line) options can be used
                 with -F. They  apply  to  each  of  the  fixed
                 strings.  A  line  is  selected  if any of the
                 fixed strings are found in it (subject  to  -w
                 or -x, if present).

       -f filename, --file=filename
                 Read  a  number of patterns from the file, one
                 per line, and match them against each line  of
                 input.  A  data  line  is output if any of the
                 patterns match it. The filename can  be  given
                 as "-" to refer to the standard input. When -f
                 is used, patterns  specified  on  the  command
                 line  using  -e  may also be present; they are
                 tested before the file's patterns. However, no
                 other  pattern is taken from the command line;
                 all arguments are treated as file names. There
                 is   an   overall  maximum  of  100  patterns.
                 Trailing white  space  is  removed  from  each
                 line,  and  blank  lines are ignored. An empty
                 file  contains  no  patterns   and   therefore
                 matches nothing.
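
                 A minimal sketch of reading patterns from a
                 file, assuming pcregrep is on the PATH. The
                 names patterns.txt and log.txt, and their con-
                 tents, are hypothetical:

```shell
# Hypothetical pattern file (one pattern per line) and data file.
dir=$(mktemp -d)
printf 'error\nwarn(ing)?\n' > "$dir/patterns.txt"
printf 'all good\nwarning: disk\nerror: net\n' > "$dir/log.txt"

# A data line is output if any pattern from the file matches it.
hits=$(pcregrep -f "$dir/patterns.txt" "$dir/log.txt")

printf '%s\n' "$hits"
rm -rf "$dir"
```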

       -H, --with-filename
                 Force  the  inclusion  of  the filename at the
                 start of output lines when searching a  single
                 file. By default, the filename is not shown in
                 this case. For matching lines, the filename is
                 followed  by  a colon and a space; for context
                 lines, a hyphen separator is used. If  a  line
                 number  is  also  being output, it follows the
                 file name without a space.

       -h, --no-filename
                 Suppress the output of filenames when searching
                 multiple  files.  By  default,  filenames  are
                 shown when multiple files  are  searched.  For
                 matching  lines, the filename is followed by a
                 colon and a space; for context lines, a hyphen
                 separator  is  used.  If a line number is also
                 being output, it follows the file name without
                 a space.

       --help    Output a brief help message and exit.

       -i, --ignore-case
                 Ignore  upper/lower  case  distinctions during
                 comparisons.

       --include=pattern
                 When pcregrep is  searching  the  files  in  a
                 directory  as  a consequence of the -r (recur-
                 sive search) option, only  those  files  whose
                 names match the pattern are included. The pat-
                 tern is a PCRE regular expression. If  a  file
                 name  matches both --include and --exclude, it
                 is excluded. There is no short form  for  this
                 option.

       -L, --files-without-match
                 Instead  of  outputting  lines from the files,
                 just output the names of the files that do not
                 contain any lines that would have been output.
                 Each file name is output once, on  a  separate
                 line.

       -l, --files-with-matches
                 Instead  of  outputting  lines from the files,
                 just output the names of the files  containing
                 lines  that  would have been output. Each file
                 name is  output  once,  on  a  separate  line.
                 Searching  stops as soon as a matching line is
                 found in a file.

       --label=name
                 This option supplies a name to be used for the
                 standard  input when file names are being out-
                 put. If not supplied,  "(standard  input)"  is
                 used.  There is no short form for this option.

       --locale=locale-name
                 This option specifies a locale to be used  for
                 pattern  matching.  It  overrides the value in
                 the LC_ALL or LC_CTYPE environment  variables.
                 If  no locale is specified, the PCRE library's
                 default (usually  the  "C"  locale)  is  used.
                 There is no short form for this option.

       -M, --multiline
                 Allow  patterns  to  match more than one line.
                 When this option is given, patterns  may  use-
                 fully  contain  literal newline characters and
                 internal occurrences of ^  and  $  characters.
                 The  output  for  any one match may consist of
                 more than one line. When this option  is  set,
                 the  PCRE  library  is  called  in "multiline"
                 mode.  There is a limit to the number of lines
                 that  can  be matched, imposed by the way that
                 pcregrep buffers the input file  as  it  scans
                 it. However, pcregrep ensures that at least 8K
                 characters  or  the  rest  of   the   document
                 (whichever  is  the shorter) are available for
                 forward matching, and similarly  the  previous
                 8K characters (or all the previous characters,
                 if fewer than 8K) are guaranteed to be  avail-
                 able for lookbehind assertions.
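
                 As an illustration, the sketch below matches a
                 pattern across a line boundary. It assumes
                 that pcregrep is on the PATH and that LF is
                 the newline convention in use; the input text
                 is made up:

```shell
# The \n inside single quotes reaches PCRE unchanged and matches a newline.
block=$(printf 'alpha\nstart\nend\nomega\n' | pcregrep -M 'start\nend')

# Both lines covered by the match are output.
printf '%s\n' "$block"
```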

       -N newline-type, --newline=newline-type
                 The  PCRE library supports four different con-
                 ventions for indicating  the  ends  of  lines.
                 They  are  the  single-character  sequences CR
                 (carriage return) and LF (linefeed), the  two-
                 character  sequence CRLF, and an "any" conven-
                 tion,  in  which  any  Unicode   line   ending
                 sequence is assumed to end a line. The Unicode
                 sequences are the three just  mentioned,  plus
                 VT   (vertical  tab,  U+000B),  FF  (formfeed,
                 U+000C), NEL (next  line,  U+0085),  LS  (line
                 separator,  U+2028), and PS (paragraph separa-
                 tor, U+2029).

                 When the PCRE  library  is  built,  a  default
                 line-ending  sequence  is  specified.  This is
                 normally the standard sequence for the operat-
                 ing system. Unless otherwise specified by this
                 option, pcregrep uses the  library's  default.
                 The  possible  values  for this option are CR,
                 LF, CRLF, or ANY. This makes  it  possible  to
                 use  pcregrep  on  files  that  have come from
                 other environments without  having  to  modify
                 their  line endings. If the data that is being
                 scanned does not agree with the convention set
                 by this option, pcregrep may behave in strange
                 ways.
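
                 A short sketch of scanning a file with CRLF
                 line endings, assuming pcregrep is on the
                 PATH. The file dos.txt is hypothetical. The -c
                 option is used so that the result is a plain
                 count:

```shell
# Hypothetical file whose lines end in CRLF.
dir=$(mktemp -d)
printf 'foo\r\nbar\r\n' > "$dir/dos.txt"

# With -N CRLF the $ anchor matches immediately before the CRLF pair.
count=$(pcregrep -c -N CRLF 'foo$' "$dir/dos.txt")

printf '%s\n' "$count"
rm -rf "$dir"
```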

       -n, --line-number
                 Precede each output line by its line number in
                 the  file, followed by a colon and a space for
                 matching lines or a hyphen  and  a  space  for
                 context  lines.  If the filename is also being
                 output, it precedes the line number.

       -o, --only-matching
                 Show only the part of the line that matched  a
                 pattern.  In  this  mode, no context is shown.
                 That is,  the  -A,  -B,  and  -C  options  are
                 ignored.

       -q, --quiet
                 Work  quietly, that is, display nothing except
                 error  messages.  The  exit  status  indicates
                 whether or not any matches were found.

       -r, --recursive
                 If  any given path is a directory, recursively
                 scan the files it contains, taking note of any
                 --include  and --exclude settings. By default,
                 a directory is read as a normal file; in  some
                 operating systems this gives an immediate end-
                 of-file. This option is a shorthand  for  set-
                 ting the -d option to "recurse".

       -s, --no-messages
                 Suppress  error messages about non-existent or
                 unreadable  files.  Such  files  are   quietly
                 skipped.  However, the return code is still 2,
                 even if matches were found in other files.

       -u, --utf-8
                 Operate in UTF-8 mode. This option  is  avail-
                 able only if PCRE has been compiled with UTF-8
                 support. Both patterns and subject lines  must
                 be valid strings of UTF-8 characters.

       -V, --version
                 Write  the version numbers of pcregrep and the
                 PCRE library that is being used to  the  stan-
                 dard error stream.

       -v, --invert-match
                 Invert  the  sense of the match, so that lines
                 which do not match any of the patterns are the
                 ones that are found.

       -w, --word-regex, --word-regexp
                 Force  the patterns to match only whole words.
                 This is equivalent to having \b at  the  start
                 and end of the pattern.

       -x, --line-regex, --line-regexp
                 Force  the  patterns to be anchored (each must
                 start matching at the beginning of a line) and
                 in  addition,  require  them  to  match entire
                 lines. This is equivalent to having  ^  and  $
                 characters at the start and end of each alter-
                 native branch in every pattern.

ENVIRONMENT VARIABLES

       The environment variables LC_ALL and LC_CTYPE are  exam-
       ined, in that order, for a locale. The first one that is
       set is used. This can  be  overridden  by  the  --locale
       option.  If no locale is set, the PCRE library's default
       (usually the "C" locale) is used.

NEWLINES

       The -N (--newline) option allows pcregrep to scan  files
       with  different  newline  conventions  from the default.
       However, the setting of this option does not affect  the
       way in which pcregrep writes information to the standard
       error and output streams. It uses the string "\n"  in  C
       printf()  calls  to  indicate newlines, relying on the C
       I/O library to convert this to an  appropriate  sequence
       if the output is sent to a file.

OPTIONS COMPATIBILITY

       The  majority  of  short  and  long  forms of pcregrep's
       options are the same as in the  GNU  grep  program.  Any
       long  option  of the form --xxx-regexp (GNU terminology)
       is also available  as  --xxx-regex  (PCRE  terminology).
       However,  the --locale, -M, --multiline, -u, and --utf-8
       options are specific to pcregrep.

OPTIONS WITH DATA

       There are four different ways in which  an  option  with
       data  can be specified.  If a short form option is used,
       the data may follow immediately, or in the next  command
       line item. For example:

         -f/some/file
         -f /some/file

       If  a  long  form option is used, the data may appear in
       the same command line item, separated by an equals char-
       acter, or (with one exception) it may appear in the next
       command line item. For example:

         --file=/some/file
         --file /some/file

       Note, however, that if you want to supply  a  file  name
       beginning  with  ~  as data in a shell command, and have
       the shell expand ~ to a home directory, you  must  sepa-
       rate  the  file  name from the option, because the shell
       does not treat ~ specially unless it is at the start  of
       an item.
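
       The shell behaviour described above can be observed
       without running pcregrep at all. In this sketch printf
       stands in for the command, and patterns.txt is a hypo-
       thetical file name:

```shell
# Joined form: the ~ is not at the start of the word, so it stays literal.
joined=$(printf '%s' --file=~/patterns.txt)

# Separate item: the ~ starts the word and is expanded by the shell.
separate=$(printf '%s' ~/patterns.txt)

printf 'joined:   %s\nseparate: %s\n' "$joined" "$separate"
```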

       The  exception to the above is the --colour (or --color)
       option, for which the data is optional. If  this  option
       does  have  data,  it  must  be given in the first form,
       using an equals character. Otherwise it will be  assumed
       that it has no data.
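
       A minimal sketch of the equals form follows, assuming
       pcregrep is on the PATH. The input text and the colour
       value "1;32" are arbitrary choices. Written instead as
       "--colour always alert", the option would be taken to
       have no data, leaving "always" as the pattern and
       "alert" as a file name:

```shell
# Data attached with "=": "always" is the colour setting, "alert" the pattern.
coloured=$(printf 'red alert\n' |
  PCREGREP_COLOUR='1;32' pcregrep --colour=always alert)

# The matched text is wrapped in terminal control strings.
printf '%s\n' "$coloured"
```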

MATCHING ERRORS

       It is possible to supply a regular expression that takes
       a very long time to fail to match  certain  lines.  Such
       patterns normally involve nested indefinite repeats, for
       example: (a+)*\d when matched against a line of a's with
       no  final  digit.  The  PCRE  matching  function  has  a
       resource limit that causes it to abort in these  circum-
       stances. If this happens, pcregrep outputs an error mes-
       sage and the line that caused the problem to  the  stan-
       dard  error  stream.  If  there  are  more  than 20 such
       errors, pcregrep gives up.

DIAGNOSTICS

       Exit status is 0 if any matches  were  found,  1  if  no
       matches  were  found,  and  2 for syntax errors and non-
       existent or inaccessible files  (even  if  matches  were
       found in other files) or too many matching errors. Using
       the -s option to suppress error messages about inaccess-
       ible files does not affect the return code.
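
       The three exit codes can be checked from a script along
       these lines. This is a sketch that assumes pcregrep is
       on the PATH; the file a.txt and the missing file name
       are hypothetical:

```shell
# Hypothetical one-line sample file.
dir=$(mktemp -d)
printf 'hello\n' > "$dir/a.txt"

# A match is found.
found=0
pcregrep -q hello "$dir/a.txt" || found=$?

# No line matches.
nomatch=0
pcregrep -q absent "$dir/a.txt" || nomatch=$?

# One path does not exist; -s hides the message but not the status.
bad=0
pcregrep -s hello "$dir/a.txt" "$dir/no-such-file" > /dev/null || bad=$?

printf 'found=%s nomatch=%s bad=%s\n' "$found" "$nomatch" "$bad"
rm -rf "$dir"
```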

SEE ALSO

       pcrepattern(3), pcretest(1).

AUTHOR

       Philip Hazel
       University Computing Service
       Cambridge CB2 3QH, England.

Last updated: 29 November 2006
Copyright (c) 1997-2006 University of Cambridge.



                                                                   PCREGREP(1)
