[SOLVED] BASH NOOB - email addresses from text

Hi,

I have an assignment where I have to extract email addresses from a text and add them to a file that already contains email addresses, without creating duplicates. The problem is that I have to use a loop and the script needs to be at least 20 lines long.

$emailList contains already given email addresses
$email is an article that contains email addresses that have to be extracted

The code I currently have:

#!/bin/sh
#bash project
emailList=$1
email=$2
buffer='bufferBestand.txt'
grep -o -E "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b" "$email" | sort -u >> "$buffer"
cat "$emailList" >> "$buffer"
cat "$buffer" | sort -u > "$emailList"

I extract all the email addresses from $email and use sort -u to get rid of the duplicates. The result is written to a buffer file. Then I append the email addresses that were already in $emailList to $buffer, use sort -u again to get rid of the duplicates, and write it all back to $emailList.
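The pipeline described above can be sketched on made-up sample data as follows. The temporary directory, file names, and addresses are hypothetical and only serve the illustration, and the sketch assumes GNU grep for `-o` and `\b`.

```shell
#!/bin/sh
# Hypothetical sample data in a temporary directory; not part of the assignment.
tmpdir=$(mktemp -d)
printf 'Contact bob@example.com or alice@example.org.\nAgain: bob@example.com\n' > "$tmpdir/article.txt"
printf 'carol@example.net\nalice@example.org\n' > "$tmpdir/list.txt"

# Extract every address from the article, add the existing list,
# and let sort -u drop the duplicates in one pass.
{
    grep -o -E '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b' "$tmpdir/article.txt"
    cat "$tmpdir/list.txt"
} | sort -u > "$tmpdir/merged.txt"

merged=$(cat "$tmpdir/merged.txt")
printf '%s\n' "$merged"
rm -r "$tmpdir"
```

With this sample input, the merged list should contain each of the three addresses exactly once, in sorted order.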

It works but I can't turn it in because it only has 6 lines of code.

I've been trying for a couple of days now to get it to work with loops, but I'm getting nowhere and can't make sense of my code anymore, as I've rewritten it countless times by now.

The code with loops I currently have:
#!/bin/sh
# Script for the second Linux project
emailLijst=$1
email=$2
teller=0
buffer=/home/student/Documents/Project2/bufferBestand.txt
grep -o -E "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b" $email > bufferBestand.txt
cat "$emailLijst" >> bufferBestand.txt
teller=0
while IFS= read -r line
do
	while IFS= read -r line2
	do
		if [ "$line" = "$line2" ]
		then
			((teller+=1))
		fi
		if [ "$teller" -eq 0 ]
		then
			echo "$line" > bufferBestand2.txt
		fi
	done<bufferBestand2.txt
done<bufferBestand.txt

Apologies for the formatting. Not really used to typing code in a forum post.

No problem. You can put three backticks (grave accents) on a line of their own before and after the code block to get this:

#!/bin/sh
#bash project
emailList=$1
email=$2
buffer='cat bufferBestand.txt'
grep -o -E "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b"  $email | sort -u >> "$buffer"
cat $emailLijst >> "$buffer"
cat "$buffer" | sort -u > "$emailList"

and this:

#!/bin/sh
# Script for the second Linux project
emailLijst=$1
email=$2
teller=0
buffer=/home/student/Documents/Project2/bufferBestand.txt
grep -o -E "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b" $email > bufferBestand.txt
cat "$emailLijst" >> bufferBestand.txt
teller=0
while IFS= read -r line
do
	while IFS= read -r line2
	do
		if [ "$line" = "$line2" ]
		then
			((teller+=1))
		fi
		if [ "$teller" -eq 0 ]
		then
			echo "$line" > bufferBestand2.txt
		fi
	done<bufferBestand2.txt
done<bufferBestand.txt

Also, the forum uses Markdown, so this might be helpful: https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet

Not everything listed there works on the forum, but most of it does.


Thanks. I didn't know this.
I've also been able to solve my problem.
This is what my script looks like now.

#!/bin/sh
# Script for the second Linux project
emailLijst=$1
email=$2
buffer='bufferBestand.txt'
bufferSchrijf='bufferSchrijfBestand.txt'
# Start with an empty output buffer (the bare file name that stood here
# had presumably lost its redirection)
: > "$bufferSchrijf"
grep -o -E "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b" "$email" > "$buffer"
cat "$emailLijst" >> "$buffer"
while IFS= read -r email1
do
	COUNTER=0
	# Count how often this address already occurs in the output buffer
	while IFS= read -r emailUitBufferSchrijfBestand
	do
		if [ "$email1" = "$emailUitBufferSchrijfBestand" ]
		then
			COUNTER=$((COUNTER + 1))
		fi
	done < "$bufferSchrijf"
	# Append the address only if it has not been written yet
	if [ "$COUNTER" -eq 0 ]
	then
		echo "$email1" >> "$bufferSchrijf"
	fi
done < "$buffer"
cat "$bufferSchrijf" > "$emailLijst"
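
For anyone finding this later: the inner loop only has to answer one question, namely whether an address is already in the output file. A minimal sketch of the same idea, with `grep -qxF` doing that check, is below. The function name `add_unique` and the sample files are made up for illustration; this is one possible variant, not the script that was handed in.

```shell
#!/bin/sh
# add_unique is a hypothetical helper: append each line of file $1 to file $2
# unless $2 already contains that exact line.
add_unique() {
    src=$1
    dest=$2
    : >> "$dest"    # make sure the destination exists without truncating it
    while IFS= read -r address
    do
        # -q: no output, -x: match the whole line, -F: fixed string, not a regex
        if ! grep -qxF -e "$address" "$dest"
        then
            printf '%s\n' "$address" >> "$dest"
        fi
    done < "$src"
}

# Usage with made-up sample files
tmpdir=$(mktemp -d)
printf 'bob@example.com\nalice@example.org\nbob@example.com\n' > "$tmpdir/found.txt"
printf 'alice@example.org\n' > "$tmpdir/list.txt"
add_unique "$tmpdir/found.txt" "$tmpdir/list.txt"
result=$(cat "$tmpdir/list.txt")
printf '%s\n' "$result"
rm -r "$tmpdir"
```

Unlike the `sort -u` approach, this keeps the addresses in the order in which they were first seen, and it appends with `>>` so that earlier entries are never overwritten.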