App-Greple
--all print entire data
-F, --filter use as a filter (implies --all --need=0 --exit=0)
-m, --max=n[,m] max count of blocks to be shown
-A,-B,-C [n] after/before/both match context
--join remove newline in the matched part
--joinby=string replace newline in the matched text with a string
--nonewline do not add newline character at the end of block
--filestyle=style how filenames are printed (once, separate, line)
--linestyle=style how line numbers are printed (separate, line)
--blockstyle=style how block numbers are printed (separate, line)
--separate set filestyle, linestyle, blockstyle "separate"
--format LABEL=... define the format for line number and file name
--frame-top top frame line
--frame-middle middle frame line
--frame-bottom bottom frame line
FILE
--glob=glob glob target files
--chdir=dir change directory before search
--readlist get filenames from stdin
COLOR
--color=when use terminal colors (auto, always, never)
--nocolor same as --color=never
--colormap=color R, G, B, C, M, Y, etc.
--colorsub=... shortcut for --colormap="sub{...}"
--colorful use default multiple colors
--colorindex=flags color index method: Ascend/Descend/Block/Random/Unique/Group/GP
--random use a random color each time (--colorindex=R)
--uniqcolor use a different color for each unique string (--colorindex=U)
--uniqsub=func preprocess function to check uniqueness
--ansicolor=s ANSI color 16, 256 or 24bit
--[no]256 same as --ansicolor 256 or 16
--regioncolor use different color for inside and outside regions
--face enable or disable visual effects
BLOCK
-p, --paragraph enable paragraph mode
--border=pattern specify a border pattern
--block=pattern specify a block of records
--blockend=s block-end mark (Default: "--")
--join-blocks join consecutive blocks that are back-to-back
REGION
--inside=pattern select matches inside of pattern
--outside=pattern select matches outside of pattern
--include=pattern limit matches to the area
--exclude=pattern limit matches to outside of the area
--strict enable strict mode for --inside/outside --block
CHARACTER CODE
--icode=name input file encoding
--ocode=name output file encoding
FILTER
--if,--of=filter input/output filter command
--pf=filter post-process filter command
--noif disable the default input filter
RUNTIME FUNCTION
--begin=func call a function before starting the search
--end=func call a function after completing the search
--prologue=func call a function before executing the command
--epilogue=func call a function after executing the command
--postgrep=func call a function after each grep operation
--callback=func callback function for each matched string
OTHER
--usage[=expand] show this help message
--version show version
--exit=n set the command exit status
--norc skip reading startup file
--man display the manual page for the command or module
--show display the module file contents
--path display the path to the module file
--error=action action to take after a read error occurs
--warn=type runtime error handling type
--alert [name=#] set alert parameters (size/time)
-d flags display info (f:file d:dir c:color m:misc s:stat)
# INSTALL
## CPANMINUS
$ cpanm App::Greple
# SUMMARY
**greple** is a grep-like tool designed for searching structured text
such as source code and documents. Key features include:
- **Flexible pattern matching**: Multiple keyword search with AND/OR/NOT
logic, including multi-line matching and lexical expressions.
- **Region control**: Target specific sections with `--inside`,
`--outside`, `--include`, and `--exclude` options. Useful for
searching only within code blocks, comments, or other delimited
regions.
- **Block-oriented processing**: Define and search custom text blocks
such as paragraphs or function definitions.
- **Multi-byte support**: Native handling of Japanese and other Asian
languages with proper character encoding.
- **Extensibility**: Module system allows custom search patterns and
filters for specific document types or use cases.
While it can be used for general text search, greple excels at
searching source code, structured documents, and multi-byte text where
context and precision matter.
# DESCRIPTION
## MULTIPLE KEYWORDS
### AND
**greple** can take multiple search patterns with the `-e` option, but
unlike the [egrep(1)](http://man.he.net/man1/egrep) command, it will search them in AND context.
For example, the next command prints lines that contain all of
`foo` and `bar` and `baz`.
greple -e foo -e bar -e baz ...
Each word can appear in any order and any place in the string. So
this command finds all of the following lines.
foo bar baz
baz bar foo
the foo, bar and baz
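The AND behavior above can be approximated in a few lines of Perl. This is only an illustrative sketch: the helper name `match_all` is hypothetical and is not part of greple. It checks each line against every pattern independently, which is why the order of the words does not matter.

```perl
use strict;
use warnings;

# Hypothetical helper: true when every pattern matches somewhere in $line.
sub match_all {
    my ($line, @patterns) = @_;
    for my $re (@patterns) {
        return 0 unless $line =~ $re;
    }
    return 1;
}

my @patterns = (qr/foo/, qr/bar/, qr/baz/);
my @lines = (
    "foo bar baz",
    "baz bar foo",
    "the foo, bar and baz",
    "foo and bar only",
);

# Prints the first three lines; the last one lacks "baz".
print "$_\n" for grep { match_all($_, @patterns) } @lines;
```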
### OR
`(?-m)` at the beginning of regex if you want to explicitly disable
it.
The order of capture groups in the pattern is not guaranteed. Avoid
direct indexes, and use relative or named capture groups instead.
For example, if you want to search repeated characters, use
`(\w)\g{-1}` or `(?<c>\w)\g{c}` rather than
`(\w)\1`.
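To see why a relative or named reference is safer, consider what happens when a pattern is embedded in a larger regex that already contains a capture group. The sketch below is plain Perl and makes no claim about how greple combines patterns internally; the wrapping group `(zzz)` is purely an assumption for illustration.

```perl
use strict;
use warnings;

# Three ways to write "the same word character twice in a row".
my %pattern = (
    absolute => '(\w)\1',
    relative => '(\w)\g{-1}',
    named    => '(?<c>\w)\g{c}',
);

for my $name (sort keys %pattern) {
    # Embed the pattern after another capture group, which shifts
    # the absolute group numbers by one.
    my $wrapped = qr/(zzz)|$pattern{$name}/;
    my $result = "aa" =~ $wrapped ? "matches" : "does not match";
    print "$name: $result\n";
}
```

Once wrapped, `\1` points at the `(zzz)` group rather than at `(\w)`, so only the relative and named forms still find the repeated character.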
Extended Bracketed Character Classes (`(?[...])`) and Variable Length
Lookbehind can be used without warnings. See
["Extended Bracketed Character Classes" in perlrecharclass](https://metacpan.org/pod/perlrecharclass#Extended-Bracketed-Character-Classes) and
["(?<=pattern)" in perlre](https://metacpan.org/pod/perlre#pattern).
- **-e** _pattern_, **--and**=_pattern_
Specify the positive match pattern. Next command prints lines containing
all of `foo`, `bar` and `baz`.
greple -e foo -e bar -e baz
- **-t** _pattern_, **--may**=_pattern_
Specify the optional (tentative) match pattern. Next command prints
lines containing `foo` and `bar`, and highlights `baz` if it exists.
greple -e foo -e bar -t baz
Since it does not affect the bare pattern argument, you can add the
highlighting word to the end of the command argument as follows.
greple foo file
greple foo file -t bar
greple foo file -t bar -t baz
- **-r** _pattern_, **--must**=_pattern_
Specify the required match pattern. If one or more required patterns
exist, the other positive match patterns become optional.
greple -r foo -r bar -e baz
Because `-t` promotes all other `-e` patterns to required, the next command
does the same thing. Mixing `-r`, `-e` and `-t` is not recommended,
though.
greple -r foo -e bar -t baz
- **-v** _pattern_, **--not**=_pattern_
Specify the negative match pattern. Because it does not affect the
bare pattern argument, you can narrow down the search result like
this.
greple foo file
greple foo file -v bar
greple foo file -v bar -v baz
In the above pattern options, space characters are treated specially.
They are replaced by a pattern which matches any amount of
whitespace, including newlines. So the pattern can extend over multiple
lines. The next command searches for the word sequence `foo` `bar` `baz`
even if the words are separated by newlines.
greple -e 'foo bar baz'
This is done by converting pattern `foo bar baz` to
`foo\s+bar\s+baz`, so that word separator can match one or more white
spaces.
For Asian wide characters, the pattern is cooked so that zero or more
whitespace characters are allowed between any two characters. So the Japanese string
pattern `日本語` will be converted to `日\s*本\s*語`.
If you don't want these conversions, use the `-E` (or `--re`) option.
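The whitespace conversion described above can be sketched in Perl as follows. This covers only the ASCII case (`foo bar baz` to `foo\s+bar\s+baz`). The function name `cook_pattern` is hypothetical, and greple's actual implementation, including its handling of wide characters, is more involved.

```perl
use strict;
use warnings;

# Hypothetical helper: replace each run of literal spaces in the
# pattern with \s+, so the words may be separated by any whitespace,
# including newlines.
sub cook_pattern {
    my ($pattern) = @_;
    (my $cooked = $pattern) =~ s/ +/\\s+/g;
    return $cooked;
}

my $cooked = cook_pattern('foo bar baz');
print "$cooked\n";    # foo\s+bar\s+baz

my $text = "foo\nbar\n   baz\n";
print "matched across lines\n" if $text =~ /$cooked/;
```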
- **-x** _pattern_, **--le**=_pattern_
Treat the pattern string as a collection of tokens separated by
spaces. Each token is interpreted according to its first character. A token
starting with `-` is a **negative** pattern, `?` means **optional**, and
`+` means **required**.
The next example prints lines containing `foo` and `yabba`,
and none of `bar` and `dabba`, with highlighting `baz` and `doo`
if they exist.
greple --le='foo -bar ?baz yabba -dabba ?doo'
This is the summary of start character for `--le` option:
+ Required pattern
- Negative match pattern
? Optional pattern
& Function call (see next section)
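As a rough illustration of the token rules summarized above, the following Perl sketch classifies each token by its first character. The helper `classify_tokens` is hypothetical, it ignores the `&` function form, and it is not how greple itself parses the option.

```perl
use strict;
use warnings;

# Hypothetical helper: split a --le style string into tokens and
# classify each one by its first character.
sub classify_tokens {
    my ($spec) = @_;
    my %kind = ('+' => 'required', '-' => 'negative', '?' => 'optional');
    my @result;
    for my $token (split ' ', $spec) {
        if ($token =~ /^([-+?])(.+)/) {
            push @result, [ $kind{$1}, $2 ];
        } else {
            # No prefix: an ordinary positive pattern.
            push @result, [ 'positive', $token ];
        }
    }
    return @result;
}

for my $t (classify_tokens('foo -bar ?baz yabba -dabba ?doo')) {
    printf "%-9s %s\n", @$t;
}
```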
- **-x** \[**+?-**\]**&**_function_, **--le**=\[**+?-**\]**&**_function_
If the pattern starts with ampersand (`&`), it is treated as a
function, and the function is called instead of searching pattern.
Function call interface is the same as the one for block/region options.
If you have a definition of _odd\_line_ function in your `.greplerc`,
which is described in this manual later, you can print odd number
lines like this:
greple -n '&odd_line' file
The required (`+`), optional (`?`) and negative (`-`) marks can also be used
with function patterns.
**CALLBACK FUNCTION**: Region list returned by function can have two
extra elements besides start/end position. Third element is index.
Fourth element is a callback function pointer which will be called to
produce string to be shown in command output. Callback function is
called with four arguments (start position, end position, index,
matched string) and expected to return replacement string. If the
function returns `undef`, the result is not changed.
- **-E** _pattern_, **--re**=_pattern_
block treatment. Just print all contents. Can be negated by the
**--no-all** option.
- **-F**, **--filter**
Use **greple** as a filter. This option implicitly sets **--all**,
**--need**=`0` and **--exit**=`0`, so the entire input is printed
regardless of whether or not any pattern is matched.
With this option, a search pattern is not required. The first
argument is treated as a filename, not a pattern. To specify a
pattern, use an explicit option such as **-E**. When a
pattern is given, matched parts are highlighted but no lines are
excluded from the output.
Can be negated by the **--no-filter** option.
- **-m** _n_\[,_m_\], **--max-count**=_n_\[,_m_\]
Set the maximum count of blocks to be shown to _n_.
Actually _n_ and _m_ are simply passed to perl [splice](https://metacpan.org/pod/splice) function as
_offset_ and _length_. Works like this:
greple -m 10 # get first 10 blocks
greple -m 0,-10 # get last 10 blocks
greple -m 0,10 # remove first 10 blocks
greple -m -10 # remove last 10 blocks
greple -m 10,10 # remove 10 blocks from 10th (10-19)
This option does not affect search performance or command exit
status.
Note that the **grep** command has the same option, but its behavior is
different when invoked with multiple files. **greple** produces the given
number of outputs for each file, while **grep** treats it as the total
number of outputs.
- **-m** _\*_, **--max-count**=_\*_
In fact, _n_ and _m_ can be repeated as many times as needed. The next example
removes the first 10 blocks (by `0,10`), then gets the first 10 blocks from
the result (by `10`). Consequently, it gets 10 blocks starting from the 10th (10-19).
greple -m 0,10,10
The next command gets the first 20 (by `20,`) and then the last 10 (by `,-10`),
producing the same result. An empty string behaves like an absent value for
_length_ and like zero for _offset_.
greple -m 20,,,-10
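Because the numbers are handed to `splice` as offset and length pairs, the examples above can be reproduced on a plain Perl array. The sketch below is an approximation under that assumption; `apply_max_count` is a hypothetical helper, not greple's own code, and the blocks are represented by the numbers 0 to 29.

```perl
use strict;
use warnings;

# Hypothetical helper: apply a "-m" style spec to a list of blocks by
# calling splice once for each offset/length pair.
sub apply_max_count {
    my ($spec, @blocks) = @_;
    my @param = split /,/, $spec, -1;    # -1 keeps empty fields
    while (@param) {
        my ($offset, $length) = splice @param, 0, 2;
        $offset = 0 if !defined $offset or $offset eq '';
        if (defined $length and $length ne '') {
            splice @blocks, $offset, $length;
        } else {
            splice @blocks, $offset;     # absent length: remove to the end
        }
    }
    return @blocks;
}

my @blocks = (0 .. 29);
for my $spec ('10', '0,-10', '0,10', '-10', '10,10', '0,10,10', '20,,,-10') {
    my @kept = apply_max_count($spec, @blocks);
    printf "-m %-9s keeps %d blocks: %s\n", $spec, scalar @kept, "@kept";
}
```

Note that `splice` removes the selected elements, so what is shown is whatever remains after each removal.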
- **-A**\[_n_\], **--after-context**\[=_n_\]
- **-B**\[_n_\], **--before-context**\[=_n_\]
- **-C**\[_n_\], **--context**\[=_n_\]
Print _n_ blocks before/after the matched string. The value _n_ can be
omitted, and the default is 2. When used with the `--paragraph` or
`--block` option, _n_ means the number of paragraphs or blocks.
Actually, these options expand the area of the logical operation. This
means that
greple -C1 'foo bar baz'
matches following text.
foo
bar
baz
Moreover
greple -C1 'foo baz'
also matches this text, because the matching blocks around `foo` and
`baz` overlap each other and form a single block.
- **--join**
- **--joinby**=_string_
Convert newline characters found in the matched string to an empty string or to the specified
_string_. Using `--join` with the `-o` (only-matching) option, you can
collect the matched strings as a one-per-line list. This is
sometimes useful for Japanese text processing. For example, the next
command prints the list of KATAKANA words, including those spread
across multiple lines.
greple -ho --join '\p{InKatakana}+(\n\p{InKatakana}+)*'
A space-separated word sequence can be processed with the `--joinby`
option. The next example prints every `for *something*` pattern in the pod
documents within a Perl script.
greple -Mperl --pod -ioe '\bfor \w+' --joinby ' '
- **--\[no\]newline**
Since **greple** can handle arbitrary blocks other than normal text
lines, they sometimes do not end with a newline character. The `-o` option
creates a similar situation. In that case, an extra newline is appended at
the end of the block to be shown. The `--no-newline` option disables this
behavior.
- **--filestyle**=\[`line`,`once`,`separate`\], **--fs**
Default style is _line_, and **greple** prints filename at the
beginning of each line. Style _once_ prints the filename only once
at the first time. Style _separate_ prints filename in the separate
line before each line or block.
- **--linestyle**=\[`line`,`separate`\], **--ls**
Default style is _line_, and **greple** prints line numbers at the
beginning of each line. Style _separate_ prints line number in the
separate line before each line or block.
- **--blockstyle**=\[`line`,`separate`\], **--bs**
Default style is _line_, and **greple** prints block numbers at the
beginning of each line. Style _separate_ prints block number in the
newlines at the beginning of text or following another newline (`\R`
means more general linebreaks including `\r\n`; consult
[perlrebackslash](https://metacpan.org/pod/perlrebackslash) for detail).
The next command treats the data as a series of 10-line units.
greple -n --border='(.*\n){1,10}'
Contrary to the next `--block` option, `--border` never produces
disjoint records.
If you want to treat the entire file as a single block, setting the border to
the start or end of the whole data is an efficient way. The next commands work
the same way.
greple --border '\A' # beginning of file
greple --border '\z' # end of file
- **--block**=_pattern_
- **--block**=_&sub_
Specify the record block to display. The default block is a single line.
Empty blocks are ignored. When blocks are not contiguous, matches
that occur outside the blocks are ignored.
If multiple block options are given, overlapping blocks are merged
into a single block.
Please be aware that this option is sometimes quite time consuming,
because it finds all blocks before processing.
- **--blockend**=_string_
Change the end mark displayed after `-pABC` or `--block` options.
Default value is "--". Escape sequences `\t`, `\n`, `\r`, and
`\f` are recognized.
- **--join-blocks**
Join consecutive blocks together. The logical operation is done for each
individual block, but if the resulting blocks are back-to-back, they are
merged into a single block for final output.
**Related options:**
**-b**/**--block-number** (["STYLES"](#styles)),
**-A**/**-B**/**-C** (["STYLES"](#styles)),
**--inside**/**--outside**/**--include**/**--exclude** (["REGIONS"](#regions))
## REGIONS
- **--inside**=_pattern_
- **--outside**=_pattern_
The `--inside` and `--outside` options limit the text area to be matched.
As a simple example, if you want to find the string `and` where it is not part of the word
`command`, it can be done like this.
greple --outside=command and
The block can be larger and expand to multiple lines. Next command
searches from C source, excluding comment part.
greple --outside '(?s)/\*.*?\*/'
Next command searches only from POD part of the perl script.
greple --inside='(?s)^=.*?(^=cut|\Z)'
When multiple **inside** and **outside** regions are specified, those
regions are combined as a union.
In a multiple color environment, if a single keyword is specified,
matches in each `--inside`/`--outside` region are printed in a different
color. To force this behavior with multiple keywords, use the
`--regioncolor` option.
- **--inside**=_&function_
- **--outside**=_&function_
If the pattern name begins with an ampersand (&) character, it is treated
as the name of a subroutine which returns a list of blocks. Using this
option, the user can use an arbitrary function to determine which part of
the text to search. User-defined functions can be defined in the
`.greplerc` file or by a module option.
- **--include**=_pattern_
- **--exclude**=_pattern_
- **--include**=_&function_
- **--exclude**=_&function_
The `--include`/`--exclude` options behave exactly the same as
`--inside`/`--outside` when used alone.
When used in combination, `--include`/`--exclude` are mixed in AND
manner, while `--inside`/`--outside` are in OR.
Thus, in the next example, first line prints all matches, and second
does none.
greple --inside PATTERN --outside PATTERN
greple --include PATTERN --exclude PATTERN
You can build up the desired matches using the `--inside`/`--outside` options,
then remove unnecessary parts with `--include`/`--exclude`.
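The difference between the OR and AND combinations can be illustrated with a small Perl model. This is a simplified sketch, not greple's implementation: a match and a region are both modeled as `[start, end)` offset pairs, the helper `overlaps` is hypothetical, and any overlap is counted as being inside the region.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the match overlaps the region.
sub overlaps {
    my ($match, $region) = @_;
    return $match->[0] < $region->[1] && $region->[0] < $match->[1];
}

my $region  = [ 10, 20 ];                          # area selected by PATTERN
my @matches = ([ 2, 5 ], [ 12, 15 ], [ 30, 33 ]);

for my $m (@matches) {
    my $in  = overlaps($m, $region) ? 1 : 0;       # inside / include test
    my $out = $in ? 0 : 1;                         # outside / exclude test

    my $or  = ($in || $out) ? "kept" : "dropped";  # --inside P --outside P
    my $and = ($in && $out) ? "kept" : "dropped";  # --include P --exclude P
    my $label = join '-', @$m;
    print "$label inside/outside: $or, include/exclude: $and\n";
}
```

Every match is either inside or outside the same region, so the OR combination keeps all of them, while the AND combination keeps none.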
- **--strict**
Limit the match area strictly.
By default, the `--block`, `--inside`/`--outside`, and
`--include`/`--exclude` options allow a partial match within the
specified area. For instance,
greple --inside and command
matches the pattern `command` because part of the matched string is
included in the specified inside-area. Partial matches fail when the
`--strict` option is given, and a longer string never matches within a shorter
area.
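The distinction between a partial and a strict match can be expressed as two offset comparisons. The following Perl sketch works through the `command`/`and` example. The helper `span` is hypothetical, and the comparisons are an assumption about what "partial" and "strict" mean here rather than a copy of greple's logic.

```perl
use strict;
use warnings;

# Hypothetical helper: return the (start, end) offsets of the first
# match of $re in $text, or an empty list when nothing matches.
sub span {
    my ($text, $re) = @_;
    return $text =~ $re ? ($-[0], $+[0]) : ();
}

my $text  = "command";
my @area  = span($text, qr/and/);       # inside-area: offsets 4 to 7
my @match = span($text, qr/command/);   # candidate match: offsets 0 to 7

# Default behavior: any overlap with the area is enough.
my $partial = $match[0] < $area[1] && $area[0] < $match[1];

# Strict behavior: the match must lie entirely within the area.
my $strict  = $area[0] <= $match[0] && $match[1] <= $area[1];

print "partial: ", ($partial ? "match" : "no match"), "\n";
print "strict:  ", ($strict  ? "match" : "no match"), "\n";
```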
printed is replaced by the result of the function. Arbitrary function
can be defined in `.greplerc` file or module. Matched data is placed
in variable `$_`. Filename is passed by `&FILELABEL` key, as
described later.
It is possible to use multiple `--print` options. In that case,
second function will get the result of the first function. The
command will print the final result of the last function.
This option and the next **--continue** option are no longer recommended
because the **--colormap** and **--callback** functions are simpler and more
powerful.
- **--continue**
When the `--print` option is given, **greple** will immediately print the
result returned from the print function and finish the cycle. The
`--continue` option forces the normal printing process to continue after the print
function is called. So please make sure that all data remains consistent.
For these run-time functions, an optional argument list can be given in the
form of `key` or `key=value`, connected by commas. These arguments
will be passed to the function as a key => value list. A key without a value will
have the value one. The name of the file being processed is also passed with the key
of the `FILELABEL` constant. As a result, an option in either of the following forms:
--begin function(key1,key2=val2)
--begin function=key1,key2=val2
will be transformed into following function call:
function(&FILELABEL => "filename", key1 => 1, key2 => "val2")
As described earlier, `FILELABEL` parameter is not given to the
function specified with module option. So
-Mmodule::function(key1,key2=val2)
-Mmodule::function=key1,key2=val2
simply becomes:
function(key1 => 1, key2 => "val2")
The function can be defined in `.greplerc` or in modules. Assign the
arguments to a hash, and you can then access the argument list as members of
the hash. It is safe to delete the FILELABEL key if you expect arbitrary
parameters to be given. The content of the target file can be accessed through
`$_`. The ampersand (`&`) is required to prevent the hash key from being
interpreted as a bare word.
    sub function {
        my %arg = @_;
        my $filename = delete $arg{&FILELABEL};
        $arg{key1};             # 1
        $arg{key2};             # "val2"
        $_;                     # contents
    }
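The transformation from `key1,key2=val2` to a key => value list can be mimicked with a short Perl sketch. The helper `parse_args` is hypothetical and only illustrates the rule described above. It does not add the `FILELABEL` entry, so the result corresponds to the module-option form.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "key1,key2=val2" into a key => value list.
# A key without "=value" gets the value 1.
sub parse_args {
    my ($spec) = @_;
    my @list;
    for my $item (split /,/, $spec) {
        my ($key, $value) = split /=/, $item, 2;
        push @list, $key, defined $value ? $value : 1;
    }
    return @list;
}

# Stand-in for a user-defined runtime function.
sub function {
    my %arg = @_;
    return join ', ', map { "$_ => $arg{$_}" } sort keys %arg;
}

print function(parse_args('key1,key2=val2')), "\n";
```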
## OTHERS
- **--usage**\[=_expand_\]
**greple** prints usage and exits when the `--usage` option is given or when no valid
parameter is specified. In this case, module options are displayed
with help information if available. If you want to see how they are
expanded, supply a non-empty value to the `--usage` option, like:
greple -Mmodule --usage=expand
- **--version**
Show version.
- **--exit**=_number_
When **greple** runs normally, it exits with status 0 or 1 depending
on whether something matched. Sometimes you want status 0 even
if nothing matched. This option sets the status code for normal
execution. It still exits with a non-zero status when an error occurs.
- **--man**, **--doc**
Show manual page.
Display module's manual page when used with `-M` option.
- **--show**, **--less**
Show module file contents. Use with `-M` option.
- **--path**
Show module file path. Use with `-M` option.
- **--norc**
Do not read startup file: `~/.greplerc`. This option has to be
placed before any other options including `-M` module options.
Setting `GREPLE_NORC` environment has the same effect.
- **--error**=_action_
Because **greple** tries to read data as a character string, it sometimes fails
to convert the data into its internal representation, and the file is skipped
without processing by default. This works fine for skipping binary
data. (**skip**)
It also sometimes encounters code mapping errors due to character
encoding. In this case, reading the file as binary data helps to
produce meaningful output. (**retry**)
This option specifies the action to take when a data read error occurs.
- **skip**
Skip the file. Default.
- **retry**
Retry reading the file as a binary data.
- **fatal**
Abort the operation.
- **ignore**
Ignore error and continue to read anyway.
- **GREPLEOPTS**
The environment variable GREPLEOPTS is used as default options. They
are inserted before the command line options.
- **GREPLE\_NORC**
If set to a non-empty string, the startup file `~/.greplerc` is not processed.
- **DEBUG\_GETOPT**
Enable [Getopt::Long](https://metacpan.org/pod/Getopt%3A%3ALong) debug option.
- **DEBUG\_GETOPTEX**
Enable [Getopt::EX](https://metacpan.org/pod/Getopt%3A%3AEX) debug option.
- **NO\_COLOR**
If true, all coloring capability with ANSI terminal sequence is
disabled. See [https://no-color.org/](https://no-color.org/).
Before starting execution, **greple** reads the file named `.greplerc`
in the user's home directory. The following directives can be used.
- **option** _name_ string
The argument _name_ of the **option** directive is a user-defined option name.
The rest is processed by the `shellwords` routine defined in the
Text::ParseWords module. Be aware that this module sometimes requires
backslashes to be escaped.
Any kind of string can be used as an option name, but it is not combined
with other options.
option --fromcode --outside='(?s)\/\*.*?\*\/'
option --fromcomment --inside='(?s)\/\*.*?\*\/'
If the option named **default** is defined, it will be used as a
default option.
To include the following arguments within the replaced
strings, two special notations can be used in an option definition.
The string `$<n>` is replaced by the _n_th argument after the
substituted option, where _n_ is a number starting from one. The string
`$<shift>` is replaced by the following command line argument, and
that argument is removed from the option list.
For example, when
option --line --le &line=$<shift>
is defined, command
greple --line 10,20-30,40
will be evaluated as this:
greple --le &line=10,20-30,40
- **expand** _name_ _string_
Define the local option _name_. The **expand** command is almost the same as
the **option** command in terms of its function. However, an option defined
by this command is expanded in, and only in, the process of
definition, while an option definition is expanded when command arguments
are processed.
This is similar to string macro defined by following **define**
command. But macro expansion is done by simple string replacement, so
you have to use **expand** to define option composed by multiple
arguments.
- **define** _name_ string
Define a macro. This is similar to **option**, but the argument is not
processed by _shellwords_ and is treated as simple text, so
meta-characters can be included without escaping. Macro expansion is
done for option definitions and other macro definitions. Macros are not
evaluated in command line options. Use the option directive if you want to
use them on the command line.
define (#kana) \p{InKatakana}
option --kanalist --nocolor -o --join --re '(#kana)+(\n(#kana)+)*'
help --kanalist List up Katakana string
- **help** _name_
If **help** directive is used for same option name, it will be printed
in usage message. If the help message is `ignore`, corresponding
line won't show up in the usage.
- **builtin** _spec_ _variable_
Define a built-in option which should be processed by the option parser.
Arguments are assumed to be a [Getopt::Long](https://metacpan.org/pod/Getopt%3A%3ALong) style spec, and
_variable_ is a string starting with `$`, `@` or `%`. It will be
replaced by a reference to the object which the string represents.
See the **pgp** module for an example.
- **autoload** _module_ _options_ ...
Define module which should be loaded automatically when specified
option is found in the command arguments.
For example,
autoload -Mdig --dig --git
replaces the option "`--dig`" with "`-Mdig --dig`", so that the **dig** module
is loaded before processing the `--dig` option.
Environment variable substitution is done for string specified by
`option` and `define` directives. Use Perl syntax **$ENV{NAME}** for
this purpose. You can use this to make a portable module.
When **greple** finds a `__PERL__` line in the `.greplerc` file, the rest
of the file is evaluated as a Perl program. You can define your own
subroutines which can be used by the `--inside`/`--outside`,
`--include`/`--exclude`, and `--block` options.
For those subroutines, the file content is provided in the global
variable `$_`. The expected response from the subroutine is a list of
array references, each made up of a start and end offset pair.
For example, suppose that the following function is defined in your
`.greplerc` file. The start and end offsets for each pattern match can be
taken from the array elements `$-[0]` and `$+[0]`.
    __PERL__
    sub odd_line {
        my @list;
        my $i;
        while (/.*\n/g) {
            push(@list, [ $-[0], $+[0] ]) if ++$i % 2;
        }
        @list;
    }
You can use the next command to search for a pattern within odd-numbered
lines.
% greple --inside '&odd_line' pattern files...
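A region function like this can be checked outside of greple by placing sample text in `$_` and calling it directly. The harness below is only a sketch of that idea; the sample text and the printing loop are assumptions added for illustration.

```perl
use strict;
use warnings;

# Same function as in the .greplerc example: returns [start, end]
# offset pairs for the odd-numbered lines of the text in $_.
sub odd_line {
    my @list;
    my $i;
    while (/.*\n/g) {
        push(@list, [ $-[0], $+[0] ]) if ++$i % 2;
    }
    @list;
}

# Emulate the calling convention by placing the file content in $_.
local $_ = "one\ntwo\nthree\nfour\n";
for my $region (odd_line()) {
    my ($start, $end) = @$region;
    printf "%2d-%2d: %s", $start, $end, substr($_, $start, $end - $start);
}
```

For this input the function returns the offset pairs of the first and third lines, which cover `one` and `three`.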
# MODULE
You can extend the **greple** command using modules. Module files are
placed in the `App/Greple/` directory in the Perl library, and therefore have
the **App::Greple::module** package name.
On the command line, modules have to be specified before any other
options in the form of **-M**_module_. However, they can also be
specified at the beginning of an option expansion.
If the package name is declared properly, the `__DATA__` section in the
module file will be interpreted in the same way as `.greplerc` file content. So
you can declare module-specific options there. Functions declared
in the module can be used from those options, which makes highly
extensible interaction between options and programming possible.
Using `-M` without module argument will print available module list.
Option `--man` will display module document when used with `-M`
option. Use `--show` option to see the module itself. Option
`--path` will print the path of module file.
See this sample module code. This sample defines options to search
within pod, comment and other segments of a Perl script. These capabilities
can be implemented both as functions and as macros.
    package App::Greple::perl;

    use Exporter 'import';
    our @EXPORT      = qw(pod comment podcomment);
    our %EXPORT_TAGS = ( );
    our @EXPORT_OK   = qw();

    use App::Greple::Common;
    use App::Greple::Regions;

    my $pod_re = qr{^=\w+(?s:.*?)(?:\Z|^=cut\s*\n)}m;
    my $comment_re = qr{^(?:\h*#.*\n)+}m;

    sub pod {
        match_regions(pattern => $pod_re);
    }
    sub comment {
        match_regions(pattern => $comment_re);
    }
    sub podcomment {
        match_regions(pattern => qr/$pod_re|$comment_re/);
    }

    1;

    __DATA__

    define :comment: ^(\s*#.*\n)+
    define :pod: ^=(?s:.*?)(?:\Z|^=cut\s*\n)

    #option --pod --inside :pod:
    #option --comment --inside :comment:
    #option --code --outside :pod:|:comment:

    option --pod --inside '&pod'
    option --comment --inside '&comment'
    option --code --outside '&podcomment'
You can use the module like this:
greple -Mperl --pod default greple
greple -Mperl --colorful --code --comment --pod default greple
If the special subroutines `initialize()` and `finalize()` are defined in
the module, they are called at the beginning with a
[Getopt::EX::Module](https://metacpan.org/pod/Getopt%3A%3AEX%3A%3AModule) object as the first argument. The second argument is
a reference to `@ARGV`, and you can modify the actual `@ARGV` through
it. See the [App::Greple::find](https://metacpan.org/pod/App%3A%3AGreple%3A%3Afind) module for an example.