Grep with Head Line

No Head Line in grep search

When I need to find a process, I normally use ps with the grep command.

sh> ps aux | grep fish
myoungj+     695  0.0  0.0  88596  7000 tty2     S+   09:17   0:00 -/usr/bin/fish -c /usr/bin/gnome-session -l
myoungj+    2490  0.0  0.1 164660 10140 pts/1    Ss   09:21   0:00 -fish
myoungj+    2665  0.0  0.1 172848 10076 pts/2    Ss+  09:24   0:00 -fish
myoungj+    2781  0.0  0.1 172724  9712 pts/0    Ss+  09:27   0:00 -fish
myoungj+    3024  0.0  0.1 164528  9552 pts/3    Ss+  09:32   0:00 -fish
myoungj+    4709  0.0  0.0   9136  2692 pts/5    S+   10:00   0:00 grep --color=auto fish

Headline is helpful

However, without the header line I sometimes wonder what each column actually means. I thought it would be nicer to see the header line of the ps output along with the search results.

USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
myoungj+     695  0.0  0.0  88596  7000 tty2     S+   09:17   0:00 -/usr/bin/fish -c /usr/bin/gnome-session -l
myoungj+    2490  0.0  0.1 164660 10140 pts/1    Ss   09:21   0:00 -fish
.. snip ..

Quick Solution for single use

awk or sed is enough here if you don’t need any other feature of grep.

sh> ps aux | awk 'NR == 1 || /fish/ { print; }'
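For comparison, the same filter can be sketched in Python: keep the first line unconditionally, then keep every later line that matches. This only illustrates what the awk program does; the function name header_filter and the sample data are made up for this example.

```python
import re

def header_filter(lines, pattern):
    """Yield the first line, then every later line that matches pattern."""
    regex = re.compile(pattern)
    for number, line in enumerate(lines, start=1):
        # same condition as awk's  NR == 1 || /pattern/
        if number == 1 or regex.search(line):
            yield line

# hypothetical, shortened ps output
sample = [
    "USER PID COMMAND",
    "me 695 -fish",
    "me 700 bash",
]
print("\n".join(header_filter(sample, "fish")))
```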

But I think grep is a more powerful tool.

In bash, it looks straightforward to me.

bash> ps aux | { read line; echo "$line"; grep 'fish'; }

Or using a subshell:

bash> ps aux | ( read line; echo "$line"; grep 'fish'; )

Or in fish shell (a little longer):

fish> ps aux | begin read -l line; echo "$line"; grep 'fish'; end

Still, I think bash is better than fish for one-liners.

More Serious Approach

But those one-liners are not very friendly. IMHO, every program should at least provide simple usage. So I decided to go a little deeper.

Fish shell solution

The latest version is on my GitHub: hgrep.fish

The basic options are below:

  • -h|--help : print a help message and exit
  • -C|--context : passed on to grep as its -C option, which is useful when we need the lines around a match.
#!/usr/bin/env fish

# no '=' here: fish's set takes "NAME VALUE", and '=' would become part of the value
set -l PROG 'hgrep.fish'
# ref: https://fishshell.com/docs/current/cmds/argparse.html#cmd-argparse
set -l options 'C/context=' 'h/help'

function usage -S -d "basic usage for $PROG"
    echo \
"Usage: $PROG [-C|--context context] <SEARCH> [<INPUT PATH>]"
end

# parse args here
argparse $options -- $argv

set -l argc (count $argv)
# note: processed arguments are removed from $argv
if test $argc -ne 1 -a $argc -ne 2
    usage
    exit 0
end

set -l search_string $argv[1] # first argument
set -l input_path /dev/stdin

if test $argc -gt 1
    # <INPUT PATH> is specified
    set input_path $argv[-1]
end

# debug output goes to stderr so the header stays the first line on stdout
echo $input_path >&2

set -l grep_options -i

if set -q _flag_context
    set --append grep_options '-C' $_flag_context
end

set --append grep_options $search_string

begin
    # print head first
    read -l line
    echo "$line"

    # let 'grep' do the rest
    exec grep $grep_options

end < $input_path

I had used the begin .. end < $input_path pattern before, when I made fish-pandoc-any-to-markdown, so I found this version a bit easier than the others.

Perl Solution

I wrote my Perl solution a very long time ago, and I’m happy to see that it still works. The basic routine is the same, except that it has one more option, --nohead, which is not necessary. I think I just wanted to check how OptArgs works at that time.

I realized today that the routine from the fish shell version is also applicable here:

  1. read one line from input and print to stdout
  2. exec to grep with option

Nevertheless, I believe it is worth learning!
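The two-step routine above can be sketched in Python as well. This is a minimal sketch, not the author's script: the names read_header and hgrep_exec are hypothetical, and it assumes a POSIX system with grep on the PATH. The header is read one byte at a time because a buffered read would swallow input that grep still needs after the exec.

```python
import os

def read_header(fd=0):
    """Read one line from fd, one byte at a time.

    Reading byte by byte leaves everything after the first newline
    unread, so the program that takes over stdin still sees it.
    """
    data = bytearray()
    while True:
        byte = os.read(fd, 1)
        if not byte:          # end of input before any newline
            break
        data += byte
        if byte == b"\n":
            break
    return bytes(data)

def hgrep_exec(grep_args):
    """Step 1: copy the header to stdout. Step 2: become grep."""
    os.write(1, read_header(0))
    # execvp replaces this Python process; it does not return on success
    os.execvp("grep", ["grep", *grep_args])
```

Calling hgrep_exec(["-i", "fish"]) at the end of a script fed by ps aux would print the header and then hand the rest of the input to grep.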

parsing options in perl

Thanks to the OptArgs module, I could handle options conveniently and in a more structured way. (However, I think it is a little heavier than Python’s argparse.)

The latest version is on my GitHub: hgrep.pl

#!/usr/bin/env perl
# -*- Mode: cperl; cperl-indent-level:4; tab-width: 8; indent-tabs-mode: nil -*-
# -*- coding: utf-8 -*-
# vim: set tabstop=8 expandtab:

use strict; use warnings;
use OptArgs; # https://metacpan.org/dist/OptArgs/view/bin/optargs

my @grep_options = qw(-i);

# colourise matches unless the terminal is dumb (or TERM is unset)
if ( ( $ENV{'TERM'} // 'dumb' ) !~ /dumb/ ) {
    push @grep_options, "--color=auto";
}

# ref: https://metacpan.org/pod/OptArgs
## option parts ...
opt context =>
  ( isa     => 'Num',
    alias   => 'C',
    default => 3,
    comment => 'print NUM lines of output context' );

opt help =>
  ( isa     => 'Bool',
    alias   => 'h',
    comment => 'print a help message and exit',
    ishelp  => 1 );

# argument parts ...
arg search =>
  ( isa      => 'Str',
    required => 1,
    comment  => 'string to search from file' );

arg file_name =>
  ( isa     => 'Str',
    default => '-', # default input from stdin
    comment => 'the file which we search from' );

# parsing options via optargs function!
my $opts = optargs;

Now we process the parsed arguments and open a file (or stdin).

if ( defined $opts->{'context'} and $opts->{'context'} > 0 ) {
    push @grep_options, '-C', $opts->{'context'};
}
my $fh;

if ( $opts->{'file_name'} ne '-' ) {
    # assign to the outer $fh; a second `my` here would shadow it
    open( $fh, '<', $opts->{'file_name'} )
        or die "Can't open '$opts->{file_name}': $!";
}
else {
    # http://perldoc.perl.org/functions/open.html
    open( $fh, "<&=", *STDIN )
        or die "Can't reopen STDIN: $!";
}

if ( not $opts->{nohead} ) {
    my $head = <$fh>;
    # FIXME: colourising ....
    print "$head";
}

my $to_gh;

requirement for system programming

When I tried to go further, I found that I needed a little more system programming underneath, which the shell normally does for me.

To communicate with the grep process, we need to open a pipe via the open function.

my $grep_pid = open( $to_gh, '|-' );
if ( not defined $grep_pid ) {
    die "Can't fork: $!";
}

|- creates a pipe and implicitly forks at the same time, so now we have two processes: when the parent writes into the new handle $to_gh, the child reads it from its stdin.

In shell terms, it looks like this at the moment:

sh> parent_perl <some options ...> | child_perl

i.e. parent_perl and child_perl now communicate through a pipe (|), and the child_perl process will be replaced with a grep process via exec.

There is a simple way to tell whether we are in the parent_perl process or the child_perl process: check the value of $grep_pid.

if ( $grep_pid ) {
    # if grep_pid is not zero, this is parent_perl (parent process),
    # which handles both file handles.
    while ( <$fh> ) { print $to_gh $_; }

    close $_ for $to_gh, $fh;

    # the parent process has to wait until its child process has finished.
    waitpid $grep_pid, 0;
}
else {
    # otherwise, this is child_perl (child process)
    close $fh; # not used in child process
    exec 'grep', @grep_options, $opts->{'search'};
}

exit 0;

Finally, exec 'grep' ... replaces the child's own perl process with a grep process. No process can be created without a parent, so the fork has to come first.
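The pipe, fork, and exec sequence that Perl's open with '|-' hides can be spelled out step by step. The following is a rough Python sketch of that sequence, assuming a POSIX system; the function name pipe_to_command is hypothetical and this is not a translation of hgrep.pl.

```python
import os
import sys

def pipe_to_command(lines, argv):
    """Send lines to the stdin of argv through a pipe; return its exit status."""
    sys.stdout.flush()                  # avoid duplicating buffered output in the child
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # child: pid is 0 here, like the else branch in the Perl code
        os.close(write_end)
        os.dup2(read_end, 0)            # the pipe becomes stdin
        os.close(read_end)
        try:
            os.execvp(argv[0], argv)    # replaced by the command on success
        finally:
            os._exit(127)               # only reached if exec failed
    # parent: pid is the child's process id
    os.close(read_end)
    with os.fdopen(write_end, "w") as to_child:
        to_child.writelines(lines)      # closing the handle signals end of input
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

For example, pipe_to_command(["me 1 fish\n"], ["grep", "fish"]) would let grep print the matching line and return grep's exit status.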

I found that it is worth trying to understand basic system programming in Perl. However, a shell script handles this much more easily.

Python Solution (as a newbie)

How about Python? I think the same logic can be applied in Python as well. However, I hadn’t had a chance to write a Python script before, so I didn’t define any functions and wrote it as simply as possible.

credit:

I followed a pattern similar to the one I used in Perl. You can find the latest version on my GitHub: hgrep.py

#!/usr/bin/env python3
import os, sys
import argparse

# handle options first
parser = argparse.ArgumentParser()  # prog="hgrep.py")
parser.add_argument( "-C", "--context",
                     # nargs = 1 stores the value as a one-element list
                     nargs = 1,
                     type = int,
                     dest = "context",
                     required = False,
                     help="print NUM lines of output context" )

parser.add_argument( "search",
                     # upper case in the help message
                     metavar = "<SEARCH>",
                     help = "string to search from <file_path>" )

parser.add_argument( "file_path",
                     # upper case in the help message
                     metavar = "[<FILE PATH>]",
                     default = '-',
                     help = "<file_path> to search" )

# case insensitive search
grep_options = [ '-i' ]

# highlighting (lower is a method, so it has to be called)
if os.environ.get( 'TERM', 'dumb' ).lower() != 'dumb':
    grep_options.append( "--color=auto" )

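I could not get the argparse module to treat a positional argument as optional (it does support this through nargs = '?', which the script above does not use). An optional file argument is natural in grep, so I wanted to keep that behaviour and worked around it by hand.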
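For comparison, here is a minimal sketch of how the same parser might look if the file path were declared optional with nargs="?". It is not the code used in this post: the function name build_parser, the metavar spelling and the plain integer -C option (without nargs = 1) are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser whose file path may be omitted (hypothetical variant)."""
    parser = argparse.ArgumentParser(prog="hgrep.py")
    parser.add_argument("-C", "--context", type=int, default=0,
                        help="print NUM lines of output context")
    parser.add_argument("search", metavar="<SEARCH>",
                        help="string to search from <file_path>")
    # nargs="?" accepts zero or one value; the default is used when omitted
    parser.add_argument("file_path", metavar="<FILE_PATH>", nargs="?",
                        default="-", help="<file_path> to search")
    return parser


if __name__ == "__main__":
    parser = build_parser()
    # file path omitted: falls back to "-" (stdin)
    print(parser.parse_args(["fish"]))
    # file path given together with an option
    print(parser.parse_args(["-C", "2", "fish", "ps.txt"]))
```

With this declaration the manual patching of sys.argv would probably not be needed, and options such as -C could be combined with an omitted file path.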

# WORKAROUND: supply the optional file path by hand
argv = sys.argv[1:]
if len(argv) == 0:
    print( "{prog}: No argument given".format(prog= sys.argv[0] ),
           file = sys.stderr )
    parser.print_help()
    sys.exit( 1 )

if len(argv) == 1:
    # user omitted the input file path
    # default : - (stdin)
    argv.append( '-' )

args = parser.parse_args( argv )

# check more grep options
# -C is declared with nargs = 1, so args.context is a one-element list
if args.context is not None and args.context[0] > 0:
    # execvp() accepts only strings, so convert the number
    grep_options.extend( [ '-C', str( args.context[0] ) ] )

grep_options.append( args.search )


I don’t know Python very well, and I suspect I ended up using its very low-level os.pipe() function.

# and let's go for plumbing
# file descriptors r,w for reading and writing
r, w = os.pipe()

if args.file_path == "-":
    # read from stdin
    file_to_read = sys.stdin
else:
    # or open file path to read
    if os.path.isfile( args.file_path ):
        file_to_read = open( args.file_path, "r" )
    else:
        print( "A file path:({fp}) is not readable"
               .format( fp=args.file_path )
               , file = sys.stderr )
        sys.exit( 2 )

# read head first and print into stdout directly
# (the line already ends with a newline)
print( file_to_read.readline(), end = '', file = sys.stdout, flush = True )

# fork() will create a child process
# and we can distinguish which one is parent process by checking
# return value
grep_pid = os.fork()

if grep_pid:
    # parent process

    # to communicate with a child process
    # writing file descriptor will be used
    os.close(r)
    os.dup2( w, sys.stdout.fileno() )

    for line in file_to_read:
        print( line, end = '' )

    # push buffered lines into the pipe
    sys.stdout.flush()

    # It is good practice to close all the file open
    os.close( w )

    # check on the child process without blocking
    os.waitpid( grep_pid,
                os.WNOHANG # if child process status not available: no wait
              )

else:
    # child process
    os.dup2( r, sys.stdin.fileno() )

    # child process only requires 'r' as stdin
    # and stdout so it is better to close r,w here.
    os.close( r )
    os.close( w )
    # the first element of the list becomes grep's argv[0]
    os.execvp( 'grep', [ 'grep' ] + grep_options )

sys.exit(0)
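A higher-level alternative to os.pipe() and os.fork() is the subprocess module. The sketch below shows the same idea (print the head line untouched, let grep filter the rest) under a few assumptions: the function name hgrep is hypothetical, the whole input is held in memory for simplicity, and grep is expected to be on PATH.

```python
import subprocess
import sys


def hgrep(lines, pattern, grep_options=("-i",)):
    """Return the first line unchanged, followed by grep's matches in the rest."""
    lines = iter(lines)
    header = next(lines, "")   # empty input gives an empty header
    body = "".join(lines)      # simplification: reads everything at once
    # -e marks the next argument as the pattern, even if it starts with "-"
    result = subprocess.run(["grep", *grep_options, "-e", pattern],
                            input=body, text=True, capture_output=True)
    return header + result.stdout


# run only when a search string is given on the command line
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.stdout.write(hgrep(sys.stdin, sys.argv[1]))
```

Saved under a file name of your choice, it could be used as `ps aux | python3 that_file.py fish`. Because grep's output is captured instead of going straight to the terminal, --color=auto would not highlight anything in this variant.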


Where I found difficulty

os.dup2 is essential for talking to grep in the child process: grep only reads from stdin here, and it has no way of knowing that the parent opened new file descriptors (r, w). So the read end of the pipe has to be re-bound to the child's stdin. To be honest, I spent too much time on this because of my lack of knowledge about system programming.

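Also, os.waitpid needed os.WNOHANG as its options value. I thought that value would be 0, but it is not: 0 asks for a blocking wait, and with it my programme hung even after grep had printed all of its results. The likely reason is that, after os.dup2, the parent's stdout still refers to the write end of the pipe, so grep never sees end-of-file while the parent is waiting for it to exit.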
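The two points above can be sketched together. This is a minimal illustration rather than the script from this post: the helper name pipe_to_child is hypothetical, and it assumes a POSIX system where os.fork is available. The parent writes through the pipe's write end directly and closes it, so the child sees end-of-file and a blocking os.waitpid( pid, 0 ) returns.

```python
import os
import sys


def pipe_to_child(argv, lines):
    """Feed lines to argv's stdin through a pipe and return its exit code."""
    r, w = os.pipe()
    sys.stdout.flush()      # keep earlier output ahead of the child's output
    pid = os.fork()
    if pid == 0:
        # child: re-bind the read end to stdin, then drop both pipe fds
        try:
            os.dup2(r, 0)
            os.close(r)
            os.close(w)
            os.execvp(argv[0], argv)   # argv[0] is the programme name
        except OSError:
            os._exit(127)              # exec failed
    # parent: only the write end is needed
    os.close(r)
    with os.fdopen(w, "w") as out:
        for line in lines:
            out.write(line)
    # leaving the with-block closed the last write end: the child gets EOF
    _, status = os.waitpid(pid, 0)     # 0 = block until the child exits
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    code = pipe_to_child(["grep", "-i", "fish"], ["bash\n", "-fish\n"])
    print("grep exited with", code)
```

The design difference is that stdout is never re-bound in the parent, so no hidden copy of the write end stays open while waiting.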

Wrapping Up

pipe and shell’s power

  • It was a good chance to learn basic pipe usage, but a shell script is still a very powerful way to connect two processes.
  • The arguments of Perl’s old open function are a little hacky.

parsing option is easier with modules

I also tried adding options and testing them.

  • fish’s argparse is relatively new, but useful for my cases.
  • Perl’s OptArgs has more features and handles optional arguments as well.
  • Python’s argparse checks argument types well and is performant. I could not get it to accept an optional positional argument (its nargs = '?' setting covers this), so I applied a workaround.

Suggestion after post

  • It would be nicer to have an option for case-sensitive search, because I made case-insensitive search the default.

  • After making fish-pandoc-any-to-markdown and hgrep, I realised that all I need is a programme that pre-processes the head of the input and lets another application handle the rest. That would be a more general programme, like below:

sh> ps aux | head-with get-one-line --tail-with grep -i /fish/
# or in fish-pandoc-any-to-markdown
sh> cat some.org | head-with retrieve-metadata --tail-with pandoc -t markdown > some.md
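As a rough idea of how such a head-with programme might read its command line, the sketch below splits the arguments at --tail-with into a head command and a tail command. Everything here is hypothetical: head-with does not exist yet, and the function name split_head_tail is only an illustration.

```python
def split_head_tail(argv, separator="--tail-with"):
    """Split argv into (head_cmd, tail_cmd) at the first separator."""
    if separator not in argv:
        raise ValueError("missing {}".format(separator))
    i = argv.index(separator)
    head_cmd, tail_cmd = argv[:i], argv[i + 1:]
    if not head_cmd or not tail_cmd:
        raise ValueError("both a head command and a tail command are required")
    return head_cmd, tail_cmd


if __name__ == "__main__":
    head, tail = split_head_tail(
        ["get-one-line", "--tail-with", "grep", "-i", "/fish/"])
    print("head:", head)
    print("tail:", tail)
```

The head command would then receive the first part of the input, and the tail command would be started with the rest, much like grep in hgrep.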


Well… but not today. Maybe after I get more chances to use similar patterns!

Thank you for reading, and happy coding!

Original article: Grep with Head Line
