Grep for a term and return all lines from within a multiline HTML tag

I'm building a command-line tool that curls my company's directory and searches it for entries.

For example ./tel bnye

returns

Bill Nye
Scientist
Building A
555-555-5555
[email protected]

Current script:

#!/bin/bash

person="$1"
# Fetch the page, strip tags and &nbsp; entities, then print each match plus the 5 lines above it.
curl -u myUsername https://www.company.com/staffDir |
sed 's/<[^>]\+>/ /g; s/&nbsp;//g' | grep -B5 -- "$person"

Each entry in the directory is surrounded by <tr> tags. So an entry would be

<tr>
Bill Nye
Scientist
Building A
555-555-5555
[email protected]
</tr>

So that works OK, BUT the all-time greatest would be to grep for a name and then return just the info from within the tag it's in. I don't know how to do that, though.

Any advice?

I'm not sure I get what you want as the output. "Grep for a name and then just return all the info from within the tag it's within": isn't that what it already does? Or do you want the contents of the parent tag of that tag?

Yeah this one is a tough one to type out: I'll try again.

So - forget my script for now - it works, but I think there's a better way to implement it.

Let's say I have a site with the following:

<tr>
    <td nowrap>Nye, Bill</td>
    <td nowrap>555-555-5555</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
</tr>


<tr>
    <td nowrap>Hopper, Grace</td>
    <td nowrap>555-555-5565</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
</tr>


<tr>
    <td nowrap>Ritchie, Dennis</td>
    <td nowrap>555-555-5525</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
</tr>

Is there a way that, if I grep for Ritchie, it would return everything between his <tr> ... </tr> tags? Or, if I grep for ghopper, everything between her <tr> ... </tr> tags?

Hopefully that helps, but if not please let me know I'll elaborate more - thank you!

In bash it's possible to grep for the name and get its line number with grep -n -i Ritchie. If the format is always the same, the easiest way would be to store that line number and the number of the line two below it in variables, then have the script print that range with sed -n "${start},${end}p" (double quotes, so the variables expand; start and end are just example names).

But that would be a solution in Bash.

Python solution:
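
A minimal sketch of one way to do this with only the standard library, not necessarily what was originally posted. It assumes each entry sits in its own <tr>; RowCollector and find_entries are made-up names for illustration, and SAMPLE stands in for the page you would fetch with curl.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect the text of every <tr> as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []    # finished rows
        self._row = None  # cells of the row being read, None outside a row

    def _flush(self):
        # Keep the current row only if it actually holds some text.
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._flush()  # a new <tr> also ends a row whose </tr> is missing
            self._row = []

    def handle_endtag(self, tag):
        if tag == "tr":
            self._flush()

    def handle_data(self, data):
        text = data.strip()
        if text and self._row is not None:
            self._row.append(text)

    def close(self):
        super().close()
        self._flush()


def find_entries(html, term):
    """Return every row in which some cell contains term (case-insensitive)."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    term = term.lower()
    return [row for row in parser.rows
            if any(term in cell.lower() for cell in row)]


# Stand-in for the downloaded directory page.
SAMPLE = """
<tr>
    <td nowrap>Nye, Bill</td>
    <td nowrap>555-555-5555</td>
</tr>
<tr>
    <td nowrap>Ritchie, Dennis</td>
    <td nowrap>555-555-5525</td>
</tr>
"""

for row in find_entries(SAMPLE, "ritchie"):
    print("\n".join(row))
```

Matching whole rows instead of single lines means the result no longer depends on how many lines an entry spans, which is what grep -B5 has to guess at.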

@The_Cable Thanks for the suggestion!
@chmod000 My py-fu is weak - seeing this makes me realize I need to step it up in that area.

Here's what I finally did. I made a utility that works like this: tel [arg] [searchTerm]. You can use -n for a "Name" search and -e for an "email" search. I'll post the code once I'm back on my netbook, where it's located.
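
For anyone curious how a -n / -e interface like that could be wired up, here is a guess at it in Python with argparse. This is only a sketch, not the actual utility: the records, the example.com addresses, and the function names are all placeholders.

```python
import argparse

# Hypothetical records; the real tool would build these from the directory page.
DIRECTORY = [
    {"name": "Nye, Bill", "phone": "555-555-5555", "email": "bnye@example.com"},
    {"name": "Hopper, Grace", "phone": "555-555-5565", "email": "ghopper@example.com"},
]


def build_parser():
    parser = argparse.ArgumentParser(prog="tel", description="Search the staff directory.")
    # Exactly one of -n / -e must be given; both store the field to search in args.field.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-n", dest="field", action="store_const", const="name",
                      help="search by name")
    mode.add_argument("-e", dest="field", action="store_const", const="email",
                      help="search by email")
    parser.add_argument("term", help="text to look for")
    return parser


def search(records, field, term):
    """Return the records whose chosen field contains term (case-insensitive)."""
    term = term.lower()
    return [r for r in records if term in r[field].lower()]


def main(argv=None):
    args = build_parser().parse_args(argv)
    matches = search(DIRECTORY, args.field, args.term)
    for record in matches:
        print(record["name"], record["phone"], record["email"], sep="\n")
    return 0 if matches else 1  # non-zero exit status when nothing matched, like grep


# Equivalent to running: tel -n hopper
main(["-n", "hopper"])
```

In a real script the last line would pass sys.argv[1:] (or nothing, which argparse treats the same way) instead of a hard-coded list.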

No problem, though I do encourage people to use Python too. Bash is a mix of scripts and command-line tools. If you really think about it, it's one confusing mesh, but it's a low bar for fixing things really quickly.

+10000