Perl / sed / grep regex trouble (didn't use Python)

cotton · October 26, 2016, 2:37am

Hi,

I'm trying to make a regex to return only a certain part of an entry. Please consider the following two entries

BIO-2140-01 (54321) Foundations of Biology
IDS-3150-F2-02 (12345) History of Everything

I'd like to return the numbers in parentheses only - so 54321 and 12345

So far my approach has been to search for everything that isn't 5 consecutive digits and replace it with ''.

I've tried the regexes s/[^\d\d\d\d\d]//g and s/[^\s(\d{5})]//g
Add this one to the mix of ones that don't work: s/[^\b(\d{5})\b]//g;

The first one returns all the numbers squished together, i.e. 21400154321; the second one returns 214001 (54321).

Any thoughts on just grabbing the digits in between the parens?

nakamura · October 26, 2016, 8:47am

Haven't used Perl in a while, but the regex for capturing the content of parentheses is

\(([^\)]+)\)

To fine-tune the regex, I recommend that you use regex101.com or regexr.com
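
As for why the bracketed attempts return what they do: square brackets define a character class, which matches a single character, so \d{5} inside them is not read as "five digits in a row". Here is a rough sketch of that behaviour, written in Python only because its re module handles these particular classes the same way; the variable names are made up for illustration.

```python
import re

entry = "BIO-2140-01 (54321) Foundations of Biology"

# A character class matches ONE character, so repeating \d inside it
# changes nothing: [^\d\d\d\d\d] is just [^\d], "any single non-digit".
digits_only = re.sub(r"[^\d\d\d\d\d]", "", entry)
print(digits_only)  # 21400154321

# Inside [...], the characters ( ) { } are literals, so this class keeps
# whitespace, digits and those literal characters, and deletes the rest.
kept = re.sub(r"[^\s(\d{5})]", "", entry)
print(kept.strip())  # 214001 (54321)
```

Deleting "everything that is not the ID" is hard to express with a negated class, which is why matching the parentheses and capturing what is inside them tends to be the easier route.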

Ion · October 26, 2016, 9:41am

Also, you might find it handy to use capture groups (e.g. $1 and so on). Then, if you want, you could capture all three parts at once, which is a lot easier since only one regex is needed to get the three parts that you will probably need.

But as nakamura pointed out, always use a regex tester before trying anything, as it will save you so much hassle.

cotton · October 26, 2016, 3:21pm

Thanks @nakamura - that matches it.

@Ion I'm trying to figure out how to use the capture group but no luck. I've tried

$var1 = "BIO-1200-01 (12345) Some Course Name" =~

s/(\(([^\)]+)\))/$1/;

No luck... any advice? I'd like to set a variable equal to that matched result.

eidolonFIRE · October 26, 2016, 5:22pm

step 1) Abandon Perl.. run for your life to Python.

cotton · October 26, 2016, 5:24pm

I'd be willing to consider it, but how would you solve this using Python? I'm used to sed (which is very close to Perl). Python's regex requires me to know both regex and Python - with Perl you pretty much just need to know regex.

There, I changed the title to "willing to consider Python" - I'm language agnostic, as long as it can get the job done.

eidolonFIRE · October 26, 2016, 5:35pm

Python is incredibly readable and quick to pick up compared to perl.

I would use the regular expression library, re. It's how you parse input with regexes in Python.

https://docs.python.org/2/library/re.html
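
A minimal sketch of what that could look like with re, assuming the ID is always exactly five digits inside parentheses (the variable names are just for illustration, not anything from the original data):

```python
import re

entry = "IDS-3150-F2-02 (12345) History of Everything"

# \( and \) match literal parentheses; the inner (...) is capture group 1.
match = re.search(r"\((\d{5})\)", entry)

# re.search returns None when nothing matches, so check before using it.
course_id = match.group(1) if match else None
print(course_id)  # 12345
```

The match object holds everything the regex found, and group(1) pulls out only the part inside the first pair of capturing parentheses, which is what ends up in the variable.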

cotton · October 26, 2016, 6:56pm

Thanks, I'll check that out. Unfortunately, I still don't conceptually understand how to assign only the matching part of a regex to a variable... nor can I seem to come up with a solid Google search that helps me understand.

cotton · October 27, 2016, 2:12am

Stuck with perl - here's the nasty regex

($courseID) = ($course =~ /[^(]+\(([^)]+)\)/)

omg my eyes!!!! IT HURTS!

reikoshea · October 27, 2016, 7:23am

me@my-PC ~
$ echo "BIO-2140-01 (54321)" | perl -ne 'm/\(([^)]+)\)/; print $1 . "\n";'
54321
me@my-PC ~
$ echo "IDS-3150-F2-02 (12345)" | perl -ne 'm/\(([^)]+)\)/; print $1 . "\n";'
12345

me@my-PC ~
$ cat test.txt
BIO-2140-01 (54321) Foundations of Biology
IDS-3150-F2-02 (12345) History of Everything
me@my-PC ~
$ cat test.pl
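#!/usr/bin/perl
use strict;
use warnings;

my $fn = 'test.txt';
open(my $fh, '<', $fn) or die "Cannot open $fn: $!";
while (my $line = <$fh>) {
        chomp($line);
        # Capture whatever sits between the first pair of parentheses
        if ($line =~ m/\(([^)]+)\)/) {
                print $1 . "\n";
        }
}
close($fh);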
me@my-PC ~
$ ./test.pl
54321
12345

Ion · October 27, 2016, 9:37am

@cotton, @reikoshea's example is what you want, but just to help you understand capture groups, see the code below.

#!/usr/bin/perl
# n2 - extract forename and surname

print "please enter your name ";
chomp ($name = <STDIN>);  # chomp strips only a trailing newline

if ($name =~ /^\s*(\S+)\s+(\S+)\s*$/) {
 print "Hi $1. Your Surname is $2.";
} else {
 print "no match";
}
print "\n";

Hammerhead_Corvette · November 6, 2016, 10:13pm

Break it up into groups: 's/(\w{3})-(\d{4})-(\d{2}) \(\d{5}\)/$1-$2-$3-/g'. Each group is referred to by a $ and its number, so this basically removes the five digits and the parentheses. Hope this helps. It will work with Perl rename, @cotton.

rename -n 's/(\w{3})-(\d{4})-(\d{2}) \((\d{5})\)/$4/g'
With the digits captured as a fourth group, this replaces the matched part with the numbers in the parentheses only.
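
For comparison, here is a rough Python sketch of the same "break it into groups" idea that pulls out all three parts of an entry in one go. The pattern is an assumption on my part: it uses \S+ for the course code instead of \w{3}-\d{4}-\d{2} so that a code like IDS-3150-F2-02 also matches, and the names ENTRY, entries and parsed are made up for the example.

```python
import re

# Hypothetical pattern: course code, five-digit ID in parentheses, title.
ENTRY = re.compile(r"^(\S+)\s+\((\d{5})\)\s+(.*)$")

entries = [
    "BIO-2140-01 (54321) Foundations of Biology",
    "IDS-3150-F2-02 (12345) History of Everything",
]

parsed = []
for entry in entries:
    m = ENTRY.match(entry)
    if m:
        # groups() returns the captured parts in order: $1, $2, $3 in Perl terms
        code, course_id, title = m.groups()
        parsed.append((code, course_id, title))

for row in parsed:
    print(row)
```

Lines that don't fit the pattern are simply skipped, so it is worth checking how many entries end up in parsed if the real data is messier than the two samples.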

cotton · November 11, 2016, 2:50am

Ok honestly - what the heck, are you guys seeing this???? If I show that to any "normal" person and told them it actually means something they'd tell me I'm crazy. I mean for gosh sakes - there's not even a number or letter in that thing.

Hammerhead_Corvette · November 11, 2016, 3:05am

It speaks to us!

reikoshea · November 11, 2016, 8:21pm

That's why we get the big bucks.