Brian

Grep!

Probably my most used Linux tool is grep. Grep stands for "global regular expression print." It lets the user search files and command output for matching text straight from the command line. The biggest strength of this searching tool is its implementation of regular expressions (Shotts, 2019).


 

Regular Expressions


“Simply put, regular expressions are symbolic notations used to identify patterns in text” (Shotts, 2019).

Regular expressions let users define common patterns. Some of the more common ones that come to mind include phone numbers, email addresses, and IPv4 addresses. Other uses for regular expressions include searching for data that you know is in a specific format and validating input.
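
As a minimal sketch of the input validation use case, grep can act as a yes/no check by using only its exit status. The function name is_us_phone and the 555-123-4567 style format are assumptions made up for this illustration, and real phone numbers come in many more formats.

```shell
# is_us_phone is a hypothetical helper, not part of grep.
# Assumption: a valid number looks exactly like NNN-NNN-NNNN.
is_us_phone() {
    # -E: extended regular expressions
    # -q: print nothing, only set the exit status
    # -x: the whole line must match the pattern
    printf '%s\n' "$1" | grep -E -q -x '[0-9]{3}-[0-9]{3}-[0-9]{4}'
}

if is_us_phone "555-123-4567"; then
    echo "valid"
else
    echo "invalid"
fi
```

Because of -x, a string that merely contains something phone-like (for example "call 555-123-4567 now") is rejected, which is usually what you want when validating a single field.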

Regular expressions are supported by many tools and programming languages. Still, I believe the quickest and easiest way to locate anything is almost always to use grep from the Linux CLI. The other regular expression implementation I have used is the Python re module (https://docs.python.org/3/library/re.html).


You can see grep implemented in my Thumper script:

You can see Python’s re module implemented in suri-rule-gen.py:


 

Regular expressions are written with a special syntax that defines which characters should be accepted in which place in the string. Check out how to begin building grep regular expressions with this post here: https://linuxize.com/post/regular-expressions-in-grep/


Additionally, here are some examples of greps I frequently use.


Grep for IPv4:

grep -E -o "([0-9]{1,3}\.){3}[0-9]{1,3}" file.txt

Grep for Email Addresses:

grep -E -o "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b" file.txt
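
To try both patterns without touching real data, here is a small sketch that wraps them in two functions and runs them on made-up text. The function names and the sample text are assumptions for illustration only. The IPv4 pattern here escapes the dot as \. so that only a literal dot is accepted between the groups of digits.

```shell
# Hypothetical helpers for illustration; both read text on standard input.
extract_ipv4() {
    # -E: extended regex, -o: print only the matched text.
    # \. matches a literal dot between the groups of digits.
    grep -E -o '([0-9]{1,3}\.){3}[0-9]{1,3}'
}

extract_emails() {
    # \b (word boundary) is an extension supported by GNU grep.
    grep -E -o '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b'
}

# Made-up sample text standing in for file.txt.
sample='Login from 192.168.1.10 by alice@example.com
Contact bob.smith@mail.example.org via gateway 10.0.0.1'

printf '%s\n' "$sample" | extract_ipv4
printf '%s\n' "$sample" | extract_emails
```

Each match is printed on its own line because of -o, which makes the output easy to pipe into other tools.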

Whenever I try out a new tool, I highly recommend reading its man page first.

man grep

or check it out here: https://linux.die.net/man/1/grep


 

Example


To show the value of grep’s searching capabilities, I decided to analyze the Emerging Threats Open ruleset, which comes configured by default in Suricata. I started with a fresh install of Suricata:

sudo apt-get install suricata

I then updated the Suricata ruleset with:

sudo suricata-update

Once I updated the ruleset with suricata-update, I had a copy of the latest version of ET Open located at /var/lib/suricata/rules/suricata.rules.


My first step was to determine how often the word ransomware appears in the Suricata rules. I used grep to search for the string “ransomware” with the -i and -o flags to make this determination. The -o flag specified that I only wanted to print the matching content rather than the whole line. The -i flag indicated that I was conducting a case-insensitive search. The other tools I used to determine this number were sort and uniq. Sort grouped identical matches together, and uniq -c counted how many times each distinct string occurred.



Grep used to search ET Open ruleset for the word ransomware.
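
Since the screenshot is not reproduced in text, here is a sketch of a pipeline that matches the description above. It is a reconstruction, so the exact command in the screenshot may differ. The function name count_variants is hypothetical, and the rules path is the one mentioned earlier in the post.

```shell
# count_variants is a hypothetical wrapper; it reads text on standard input.
count_variants() {
    # -i: case-insensitive, -o: print only the matched text.
    # sort puts identical matches next to each other so uniq -c can count them.
    grep -i -o -- "$1" | sort | uniq -c
}

# Path taken from the post; the search only runs if the file is readable.
rules=/var/lib/suricata/rules/suricata.rules
if [ -r "$rules" ]; then
    count_variants 'ransomware' < "$rules"
fi
```

Because -o prints each match exactly as it appears in the file, spellings such as "Ransomware" and "ransomware" are counted separately. Each number is a count of occurrences, not a count of rules.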

In my next search, I used grep to search for all IPv4 addresses located in suricata.rules. I did this by using grep with the -E flag to specify that I was using an extended regular expression, and the regular expression I used was:

"([0-9]{1,3}[\.]){3}[0-9]{1,3}"

I piped the output to more for the sake of the screenshot. Otherwise, the command scrolled off the screen because of the number of lines returned.


Grep used to search ET Open ruleset for IPv4 addresses.
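
One thing to keep in mind is that a pattern built from {1,3} digit groups also matches strings such as 999.1.1.1, which are not valid IPv4 addresses. The sketch below is one possible way to tidy up the results: it keeps only matches whose four octets are all 255 or less and removes duplicates. The function name is hypothetical, and the awk filter is an addition for illustration, not something used in the searches above.

```shell
# Hypothetical helper: list unique IPv4-looking strings with all octets <= 255.
# Reads text on standard input.
unique_valid_ipv4() {
    grep -E -o '([0-9]{1,3}\.){3}[0-9]{1,3}' |
        # Split each match on dots and compare every octet numerically.
        awk -F. '$1 <= 255 && $2 <= 255 && $3 <= 255 && $4 <= 255' |
        # sort -u sorts the matches and drops duplicates.
        sort -u
}

# Made-up sample input.
printf '%s\n' 'src 10.0.0.1 dst 999.1.1.1' 'again 10.0.0.1' | unique_valid_ipv4
```

This is still only a rough filter. For example, it does not stop grep from matching the tail end of a longer run of digits.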

Finally, I decided to use grep to determine how many Suricata rules in ET Open had both the word ransomware and an IP address. I did this by running the grep from the first example and piping its output into a second grep that used the regular expression from the second example. As it turns out, there are three Suricata rules with both an IPv4 address and the word ransomware.


Grep used to search ET Open ruleset for rules with IPv4 addresses and the word ransomware.
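
Chaining two greps like this can be sketched as follows. This is an illustration under assumptions rather than the exact command from the screenshot: the function name is hypothetical and the sample lines are made up, not real ET Open rules. The first grep is run without -o here, because -o would pass along only the matched word and leave nothing for the second grep to find an address in.

```shell
# Hypothetical helper: count lines that contain both the word "ransomware"
# (in any case) and something that looks like an IPv4 address.
# Reads text on standard input.
count_ransomware_with_ipv4() {
    # First grep keeps whole lines mentioning ransomware.
    # Second grep uses -c to count the lines that also contain an IPv4 pattern.
    grep -i 'ransomware' | grep -E -c '([0-9]{1,3}\.){3}[0-9]{1,3}'
}

# Made-up lines shaped loosely like rules, for demonstration only.
sample='alert http any any -> 203.0.113.7 any (msg:"Example Ransomware Checkin";)
alert dns any any -> any any (msg:"Example Ransomware Domain";)
alert tcp any any -> 198.51.100.9 any (msg:"Example Scanner";)'

printf '%s\n' "$sample" | count_ransomware_with_ipv4
```

Only the first sample line contains both the word and an address, so the count for this sample is 1.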

To summarize, grep is my favorite tool. Grep is mainly used to apply regular expressions from the Linux CLI. Regular expressions define patterns such as a phone number or an IP address. Regular expressions can be helpful when searching for data or when validating input. If you have not already, I encourage you to try out grep for yourself!


 


References

Shotts, W. (2019). The Linux Command Line: A Complete Introduction (2nd ed.). No Starch Press.


ShellHacks. (2016a). RegEx: Find Email Addresses in a File using Grep.

ShellHacks. (2016b). RegEx: Find IP Addresses in a File Using Grep.



