No Header Line in grep Search
When I need to find a process, I normally use ps with the grep command.
sh> ps aux | grep fish
myoungj+ 695 0.0 0.0 88596 7000 tty2 S+ 09:17 0:00 -/usr/bin/fish -c /usr/bin/gnome-session -l
myoungj+ 2490 0.0 0.1 164660 10140 pts/1 Ss 09:21 0:00 -fish
myoungj+ 2665 0.0 0.1 172848 10076 pts/2 Ss+ 09:24 0:00 -fish
myoungj+ 2781 0.0 0.1 172724 9712 pts/0 Ss+ 09:27 0:00 -fish
myoungj+ 3024 0.0 0.1 164528 9552 pts/3 Ss+ 09:32 0:00 -fish
myoungj+ 4709 0.0 0.0 9136 2692 pts/5 S+ 10:00 0:00 grep --color=auto fish
A header line is helpful
However, I found that the missing header line sometimes leaves me wondering what each column actually means. I thought it would be nicer if I could see the header line of the ps command along with the search results.
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
myoungj+ 695 0.0 0.0 88596 7000 tty2 S+ 09:17 0:00 -/usr/bin/fish -c /usr/bin/gnome-session -l
myoungj+ 2490 0.0 0.1 164660 10140 pts/1 Ss 09:21 0:00 -fish
.. snip ..
Quick Solution for single use
awk or sed is useful here if you don't need any other feature from grep.
sh> ps aux | awk 'NR == 1 || /fish/ { print; }'
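For comparison, here is a minimal Python sketch of the same filter as the awk one-liner: keep line 1 unconditionally, and keep any later line that matches the pattern. The function name header_and_matches and the sample rows are made up for illustration; they are not part of the original scripts.

```python
import re


def header_and_matches(lines, pattern):
    """Yield the first line unconditionally, then every later line matching pattern."""
    regex = re.compile(pattern)
    for number, line in enumerate(lines, start=1):
        # same condition as awk's: NR == 1 || /pattern/
        if number == 1 or regex.search(line):
            yield line


# hypothetical sample standing in for `ps aux` output
sample = [
    "USER PID COMMAND\n",
    "me 2490 -fish\n",
    "me 4709 vim notes.txt\n",
]
result = list(header_and_matches(sample, "fish"))
```

Like the awk version, this gives up grep's own features (context lines, colouring and so on), which is the trade-off discussed next.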
But I think grep is a more powerful tool.
In bash, it looks straightforward to me.
bash> ps aux | { read line; echo "$line"; grep 'fish'; }
Or using a subshell:
bash> ps aux | ( read line; echo "$line"; grep 'fish'; )
Or in fish shell (a little longer):
fish> ps aux | begin read -l line; echo "$line"; grep 'fish'; end
Still, I think bash is better than fish for one-liner commands.
More Serious Approach
But those one-liners are not very friendly. IMHO, every program should at least offer simple usage. So I decided to go a little deeper.
Fish shell solution
The latest version is on my GitHub: hgrep.fish
The basic options are below:
- -h|--help : print a help message and exit
- -C|--context : passed through to grep as an option, which is useful when we need the lines around a match.
#!/usr/bin/env fish

# note: fish's `set` takes no `=`; with one, PROG would become the list ('=' 'hgrep.fish')
set -l PROG 'hgrep.fish'

# ref: https://fishshell.com/docs/current/cmds/argparse.html#cmd-argparse
set -l options 'C/context=' 'h/help'

function usage -S -d "basic usage for $PROG"
    echo \
    "Usage: $PROG [-C|--context context] <SEARCH> [<INPUT PATH>]"
end

# parse args here
argparse $options -- $argv

set -l argc (count $argv)

# note: processed arguments are removed from $argv
if test $argc -ne 1 -a $argc -ne 2
    usage
    exit 0
end

set -l search_string $argv[1] # first argument
set -l input_path /dev/stdin

if test $argc -gt 1
    # <INPUT PATH> is specified
    set input_path $argv[-1]
end

# debug output, disabled so that nothing is printed before the header line
# echo $input_path

set -l grep_options -i
if set -q _flag_context
    set --append grep_options '-C' $_flag_context
end
set --append grep_options $search_string

begin
    # print head first
    read -l line
    echo "$line"
    # let 'grep' do the rest
    exec grep $grep_options
end < $input_path
I used the begin .. end < $input_path pattern before, when I made fish-pandoc-any-to-markdown, so I found this version a bit easier than the others.
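The read-then-grep pattern only works if reading the header consumes no more than the first line, so that grep still finds the rest on the same input. The sketch below illustrates that idea in Python by reading the header one byte at a time from a raw file descriptor and then handing the descriptor to grep. It assumes grep is available on PATH; read_header and hgrep are hypothetical names chosen for this example, not functions from the scripts in this post.

```python
import os
import subprocess


def read_header(fd):
    """Read one line from fd a byte at a time, leaving all later bytes unread."""
    collected = bytearray()
    while True:
        byte = os.read(fd, 1)
        if not byte:  # end of input before any newline
            break
        collected += byte
        if byte == b"\n":
            break
    return bytes(collected)


def hgrep(pattern, path, grep_options=("-i",)):
    """Return the header line plus grep's matches for the file at path."""
    fd = os.open(path, os.O_RDONLY)
    try:
        header = read_header(fd)
        # grep inherits the descriptor, already positioned just after the header
        result = subprocess.run(
            ["grep", *grep_options, "--", pattern],
            stdin=fd,
            stdout=subprocess.PIPE,
            check=False,
        )
    finally:
        os.close(fd)
    return header + result.stdout
```

Reading through a buffered file object instead would typically pull a whole block into Python's buffer, and grep would then miss those lines. That is also why the Perl version below copies the remaining lines to grep itself.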
Perl Solution
My Perl solution was written a very long time ago, and I'm happy to see that it still works. The basic routine is the same, except that it has one more option, --nohead, which is not necessary. I think I just wanted to check how OptArgs works at that time.
I realized today that the routine from the fish version is also applicable here:
- read one line from the input and print it to stdout
- exec grep with the options
Nevertheless, I believe it was worth learning!
Parsing options in Perl
Thanks to the OptArgs module, I could handle options conveniently and in a more structured way. (However, I think it is a little heavier than Python's argparse.)
The latest version is on my GitHub: hgrep.pl
#!/usr/bin/env perl
# -*- Mode: cperl; cperl-indent-level:4; tab-width: 8; indent-tabs-mode: nil -*-
# -*- coding: utf-8 -*-
# vim: set tabstop=8 expandtab:

use strict; use warnings;
use OptArgs; # https://metacpan.org/dist/OptArgs/view/bin/optargs

my @grep_options = qw(-i);

# colourise unless the terminal is dumb
# (the earlier `if { } default { }` form ran the default block unconditionally)
push @grep_options, "--color=auto"
    unless ( $ENV{'TERM'} // '' ) =~ /dumb/;

# ref: https://metacpan.org/pod/OptArgs
## option parts ...
opt context =>
    ( isa     => 'Num',
      alias   => 'C',
      default => 3,
      comment => 'print NUM lines of output context' );

opt help =>
    ( isa     => 'Bool',
      alias   => 'h',
      comment => 'print a help message and exit',
      ishelp  => 1 );

# argument parts ...
arg search =>
    ( isa      => 'Str',
      required => 1,
      comment  => 'string to search from file' );

arg file_name =>
    ( isa     => 'Str',
      default => '-', # default input from stdin
      comment => 'the file which we search from' );

# parsing options via optargs function!
my $opts = optargs;
Now we process the parsed arguments and open a file (or stdin):
if ( defined $opts->{'context'} and $opts->{'context'} > 0 ) {
    push @grep_options, '-C', $opts->{'context'};
}

my $fh;
if ( $opts->{'file_name'} ne '-' ) {
    # three-argument open into the outer $fh
    # (a second `my $fh` here would shadow it and leave the outer one undefined)
    open( $fh, '<', $opts->{'file_name'} )
        or die "Can't open `$opts->{file_name}': $!";
}
else {
    # http://perldoc.perl.org/functions/open.html
    open( $fh, "<&=", *STDIN )
        or die "Can't reopen stdin: $!";
}

if ( not $opts->{nohead} ) {
    my $head = <$fh>;
    # FIXME: colourising ....
    print "$head";
}

my $to_gh;
Requirement for system programming
When I tried to go further, I found that I needed a little more system programming underneath, which the shell normally does for me.
To communicate with the grep process, we need to open a pipe via the open function.
my $grep_pid = open( $to_gh, '|-' );
if ( not defined $grep_pid ) {
    die "Can't fork: $!";
}
|- means creating a pipe and forking implicitly at the same time. Now we have two processes: whatever the parent writes into the new handle $to_gh, the child reads from its stdin.
In terms of shell script, it looks like this at the moment:
sh> parent_perl <some options ...> | child_perl
i.e. parent_perl and child_perl now communicate through a pipe (|), and the child_perl process will be replaced with a grep process via exec.
There is a simple way to tell whether we are in the parent_perl process or the child_perl process: check the $grep_pid value.
if ( $grep_pid ) {
    # if grep_pid is not zero, this is parent_perl (parent process),
    # which handles both file handles.
    while ( <$fh> ) { print $to_gh $_; }
    close $_ for $to_gh, $fh;

    # the parent process has to wait until its child process has finished.
    waitpid $grep_pid, 0;
}
else {
    # otherwise, this is child_perl (child process)
    close $fh;    # not used in child process
    exec 'grep', @grep_options, $opts->{'search'}
        or die "Can't exec grep: $!";
}

exit 0;
Finally, exec 'grep' ... replaces the child perl process itself with a grep process. No process can be created without a parent, which is why we fork first and exec afterwards.
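To make "exec replaces the process" concrete, here is a small Python sketch, assuming a POSIX system with sh on PATH. The child forks off, redirects its stdout into a pipe, and then execs a shell that prints its own PID; the parent compares that with the PID returned by fork. The function name pid_seen_after_exec is made up for this illustration.

```python
import os


def pid_seen_after_exec():
    """Fork a child that execs `sh`; return (pid from fork, pid reported by sh)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # child: send stdout into the pipe, then replace this process with sh
        os.close(read_fd)
        os.dup2(write_fd, 1)
        os.close(write_fd)
        try:
            os.execvp("sh", ["sh", "-c", "echo $$"])
        finally:
            # only reached if exec itself failed
            os._exit(127)
    # parent: read what the exec'd program printed, then reap the child
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        reported = reader.read().strip()
    os.waitpid(pid, 0)
    return pid, int(reported)


if __name__ == "__main__":
    forked, reported = pid_seen_after_exec()
    print("fork returned", forked, "and the exec'd shell reported", reported)
```

The two numbers should be equal: exec does not create a new process, it swaps the program running inside the existing child.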
I found that it is worth trying to understand basic system programming in Perl. However, a shell script handles this much more easily.
Python Solution (as a newbie)
How about Python? I think the same logic applies in Python as well. However, I had not had a chance to write a Python script before, so I did not define any functions and wrote it as simply as possible.
credit:
- os pipe: https://www.tutorialspoint.com/python/os_pipe.htm
- for loop: https://realpython.com/python-for-loop/
- file i/o: https://www.w3schools.com/python/python_file_open.asp
- optparse: https://docs.python.org/3/library/optparse.html
- execvp: https://docs.python.org/3/library/os.html?highlight=popen#os.execvp
- waitpid: https://docs.python.org/3/library/os.html#os.waitpid
I went through a similar pattern as I did in Perl. You can find the latest version on my GitHub: hgrep.py
#!/usr/bin/env python3

import os, sys
import argparse

# handle options first
parser = argparse.ArgumentParser()  # prog="hgrep.py")
parser.add_argument( "-C", "--context",
                     type = int,  # a single int (nargs=1 would yield a one-element list)
                     dest = "context",
                     required = False,
                     help = "print NUM lines of output context" )

parser.add_argument( "search",
                     # upper case in the help message
                     metavar = "<SEARCH>",
                     help = "string to search from <file_path>" )

parser.add_argument( "file_path",
                     # upper case in the help message
                     metavar = "[<FILE PATH>]",
                     default = '-',
                     help = "<file_path> to search" )

# case insensitive search
grep_options = [ '-i' ]

# highlighting (lower() must be called; comparing the bound method is always unequal)
if os.environ.get( 'TERM', '' ).lower() != 'dumb':
    grep_options.append( "--color=auto" )
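A side note on the file_path positional: a default alone does not make a positional argument optional in argparse, but combining it with nargs="?" does. The following is a minimal sketch of that alternative, not the code used in hgrep.py; build_parser is a hypothetical helper name.

```python
import argparse


def build_parser():
    """Build a parser where the trailing file path may be omitted."""
    parser = argparse.ArgumentParser(prog="hgrep.py")
    parser.add_argument("-C", "--context", type=int, default=None,
                        help="print NUM lines of output context")
    parser.add_argument("search", metavar="<SEARCH>",
                        help="string to search for")
    # nargs="?" makes the positional optional; default applies when it is omitted
    parser.add_argument("file_path", metavar="<FILE PATH>", nargs="?", default="-",
                        help="file to search (default: stdin)")
    return parser


only_search = build_parser().parse_args(["fish"])
with_file = build_parser().parse_args(["-C", "2", "fish", "ps.txt"])
```

Here only_search.file_path falls back to "-", while with_file carries both the context value and the explicit path.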
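With the parser as I declared it, argparse treats file_path as a required positional argument even though it has a default. (argparse can make a positional optional through nargs='?', but I did not use that here.) An optional file argument is natural in grep, so I wanted to keep that behaviour and used a workaround.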
    # WORKAROUND for the optional positional argument (the file path)
    argv = sys.argv[1:]
    if len(argv) == 0:
        print("{prog}: No argument given".format(prog=sys.argv[0]),
              file=sys.stderr)
        parser.print_help()
        sys.exit(1)

    if len(argv) == 1:
        # the user omitted the input file path
        # default: - (stdin)
        argv.append('-')

    args = parser.parse_args(argv)

    # check more grep options
    if args.context is not None and args.context > 0:
        # execvp() needs every argument as a string
        grep_options.extend(['-C', str(args.context)])

    grep_options.append(args.search)
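For comparison, argparse does accept `nargs="?"` on a positional argument, which makes it optional and falls back to `default` when it is left out. The sketch below is only an illustration under assumptions: `build_parser` is a hypothetical helper, and the argument names (`search`, `file_path`, `context`) are taken from the snippet above rather than from the full script.

```python
import argparse


def build_parser():
    # Hypothetical parser; argument names mirror the ones used in this post.
    parser = argparse.ArgumentParser(prog="hgrep")
    parser.add_argument("-C", "--context", type=int, default=None,
                        help="number of context lines to pass to grep")
    parser.add_argument("search", help="pattern to search for")
    # nargs="?" makes this positional optional; "-" (stdin) is used if omitted
    parser.add_argument("file_path", nargs="?", default="-",
                        help="input file, or - for stdin")
    return parser
```

With a parser like this, `hgrep fish` and `hgrep -C 2 fish ps.txt` would both parse without touching `sys.argv` by hand.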
I don’t know Python well, but I think I ended up using its very low-level os.pipe() function.
    # and let's go for plumbing
    # file descriptors r, w for reading and writing
    r, w = os.pipe()

    if args.file_path == "-":
        # read from stdin
        file_to_read = sys.stdin
    else:
        # or open the file path to read
        if os.path.isfile(args.file_path):
            file_to_read = open(args.file_path, "r")
        else:
            print("A file path:({fp}) is not readable".format(fp=args.file_path),
                  file=sys.stderr)
            sys.exit(2)

    # read the head line first and print it to stdout directly
    # (readline() keeps the trailing newline, so print() must not add another)
    print(file_to_read.readline(), end="", file=sys.stdout, flush=True)

    # fork() creates a child process; the return value tells us which side
    # we are on (the child's pid in the parent, 0 in the child)
    grep_pid = os.fork()
    if grep_pid:
        # parent process
        # only the writing file descriptor is used to talk to the child
        os.close(r)
        os.dup2(w, sys.stdout.fileno())
        for line in file_to_read:
            print(line, end="")
        # stdout now points at the pipe, so w itself can be closed
        os.close(w)
        # WNOHANG: return immediately if the child has not exited yet
        os.waitpid(grep_pid, os.WNOHANG)
    else:
        # child process
        os.dup2(r, sys.stdin.fileno())
        # the child only needs the duplicated stdin,
        # so the original r and w can be closed here
        os.close(r)
        os.close(w)
        os.execvp('grep', grep_options)

    sys.exit(0)
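As a point of comparison with the low-level plumbing, the same "head line first, the rest through a filter" idea can be sketched with the standard subprocess module, which creates the pipe and waits for the child itself. This is a swapped-in technique, not what the script above does; `head_then_filter` is a hypothetical name, and the sketch reads the whole input into memory, which the original avoids.

```python
import subprocess


def head_then_filter(text, command):
    """Split off the first line, run the rest through `command`, return both."""
    # everything before the first newline is the head line
    header, _, rest = text.partition("\n")
    # subprocess.run() feeds `rest` to the child's stdin and waits for it
    result = subprocess.run(command, input=rest,
                            capture_output=True, text=True)
    return header, result.stdout
```

For example, `head_then_filter(ps_output, ["grep", "-i", "fish"])` would return the `ps` head line and the matching rows separately, assuming `ps_output` holds the text of `ps aux`.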
Where I found difficulty
os.dup2 is essential for communicating with grep in the child process, because grep only cares about stdin here, and there is no way to tell the child that the parent has opened new file descriptors (r, w). So we have to re-bind the new file descriptor to stdin ourselves. TBH, I spent too much time on this because of my lack of knowledge about system programming.
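The re-binding can be seen in isolation with a small sketch that stays in one process: it points file descriptor 0 at the read end of a pipe, reads from "stdin", and then restores the original. `read_through_stdin` is a hypothetical helper for illustration, and it assumes a payload small enough to fit in the pipe buffer.

```python
import os


def read_through_stdin(payload):
    """Write `payload` into a pipe, re-bind fd 0 to its read end, read it back."""
    r, w = os.pipe()
    saved_stdin = os.dup(0)      # remember the real stdin so it can be restored
    try:
        os.write(w, payload)     # assumes payload fits in the pipe buffer
        os.close(w)              # closing the write end lets the reader see EOF
        os.dup2(r, 0)            # fd 0 (stdin) now refers to the pipe
        os.close(r)              # the duplicate on fd 0 is all we need
        data = os.read(0, 65536)
    finally:
        os.dup2(saved_stdin, 0)  # put the original stdin back
        os.close(saved_stdin)
    return data
```

A program that only reads fd 0, as grep does when given no file, cannot tell the difference between this pipe and a terminal or a shell pipeline.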
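os.waitpid was the other difficulty. I assumed that passing 0 as the option would behave like os.WNOHANG, which it does not: with 0, os.waitpid blocks until the child exits, and my programme hung. Passing os.WNOHANG makes it return immediately instead. The likely cause of the hang is that the parent's stdout still held the write end of the pipe, so grep never saw end-of-file and never exited.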
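The difference between the two option values can be sketched with a child that simply waits for end-of-file on a pipe, standing in for grep. `poll_then_wait` is a hypothetical helper, not part of the script: it polls once with os.WNOHANG while the child is still blocked, then closes the write end and does a blocking wait.

```python
import os


def poll_then_wait():
    """Show WNOHANG versus a blocking wait on a child that waits for EOF."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # child: blocks in read() until every write end of the pipe is closed
        os.close(w)
        os.read(r, 1)
        os._exit(3)
    # parent
    os.close(r)
    # the child is still blocked, so WNOHANG returns (0, 0) right away
    polled = os.waitpid(pid, os.WNOHANG)
    # closing the last write end gives the child its end-of-file
    os.close(w)
    # now a blocking wait (options=0) returns once the child has exited
    _, status = os.waitpid(pid, 0)
    return polled, os.WEXITSTATUS(status)
```

If the parent skipped `os.close(w)`, or kept a duplicate of the write end open elsewhere, the blocking wait would never return, because the child would never see end-of-file.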
Wrapping Up
pipe and shell’s power
- It was a good chance to learn basic pipe usage, but shell script is already very powerful for simple communication between two processes.
- The arguments of Perl’s old-style open function are a little bit hacky.
parsing options is easier with modules
I also tried adding options and testing them.
- fish’s argparse is relatively new, but useful for my cases.
- Perl’s OptArgs has more features and handles optional arguments as well.
- Python’s argparse has a good type system for checking data types and is performant. However, I did not find a way to make a positional argument optional, so I applied a workaround.
Suggestions after posting
- It would be nicer to have an option for case-sensitive search, because I made case-insensitive the default.
- After making fish-pandoc-any-to-markdown and hgrep, I realised that all I need is a programme that pre-processes the head and lets another application handle the rest. So it could become a more general programme like below:
    sh> ps aux | head-with get-one-line --tail-with grep -i /fish/
    # or in fish-pandoc-any-to-markdown
    sh> cat some.org | head-with retrieve-metadata --tail-with pandoc -t markdown > some.md
Well… but not today. Maybe after I get more chances to use similar patterns!
Thank you for reading, and happy coding!
Original link: Grep with Head Line