Linux: Grep: Searching in Files

GREP is one of the most efficient utilities UNIX provides for searching inside files. We can also use it to search the output of an earlier command, and its companion zgrep can search compressed files without extracting them first. The key features of the GREP command are explained below.

The basic usage of grep is to give the search string inside single (') or double (") quotes, followed by the path of the file or files to be searched. Below are examples of the GREP command and its most useful flags. Usage is almost the same across UNIX, Linux, AIX and similar environments.

  • grep error *.log

The above command searches for the word “error” in all files in the current directory whose names end in .log and displays each line that contains it. To search every file in the current directory, give “*” instead of “*.log”. This command does not look into subdirectories or compressed files.

  • grep "error message" *.log
  • grep 'error message' *.log

The above commands search for the string “error message” in all files with the extension .log. If we run the command without quotes, the shell passes “message” as a separate argument, so grep searches for “error” in a file named message (if it exists) as well as in the *.log files.
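The effect of the quotes can be seen in a small sketch like the one below. The directory, file names and log lines are made up for illustration only.

```shell
# Hypothetical sample data in a throwaway directory.
workdir=$(mktemp -d)
printf 'disk error message: volume full\nall good\n' > "$workdir/app.log"
printf 'error only\n' > "$workdir/message"   # a file that happens to be named "message"

# Quoted: one pattern ("error message"), searched in the .log files.
quoted=$(cd "$workdir" && grep "error message" *.log)

# Unquoted: the pattern is just "error"; "message" becomes a file to search.
unquoted=$(cd "$workdir" && grep error message *.log)

printf 'quoted:\n%s\n\nunquoted:\n%s\n' "$quoted" "$unquoted"
rm -rf "$workdir"
```

In the unquoted run grep prefixes each line with a file name, because it was given more than one file to search.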

  • grep -i error *.log

This is the case-insensitive option. By default every grep search is case sensitive, which means error, ERROR and Error are treated as different strings. With the “i” flag we tell grep to ignore case while searching.

  • grep '^error' *.log

This returns only the lines that begin with “error”, not lines that contain it in the middle or at the end. The caret “^” anchors the pattern to the start of the line.

  • grep 'error$' *.log

Similarly, the dollar sign “$” anchors the pattern to the end of the line, so this returns only the lines that end with the given text.
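A minimal sketch of both anchors, run against a made-up three-line log file (the file name and contents are assumptions for the example):

```shell
workdir=$(mktemp -d)
cat > "$workdir/app.log" <<'EOF'
error at startup
job ended with error
an error in the middle
EOF

# ^ anchors the pattern to the start of the line.
starts=$(grep '^error' "$workdir/app.log")

# $ anchors the pattern to the end of the line.
ends=$(grep 'error$' "$workdir/app.log")

printf 'starts with error: %s\n' "$starts"
printf 'ends with error:   %s\n' "$ends"
rm -rf "$workdir"
```

The single quotes matter here: they stop the shell from trying to interpret “$” itself.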

  • grep -v error *.log

The “v” flag inverts the search and returns the lines that do not match the given input. With large sets of files we should use this flag carefully, otherwise it prints almost everything to the screen, or to a file if the output is redirected, e.g. “grep -v error *.log > /tmp/output.log”.

  • grep -f input.txt *.log

We can also supply several search strings in a file. grep treats every line of that file as a separate search string and returns the lines matching any of them. Such input files are often used with known error messages from the error logs: “grep error *.log | grep -f input.txt” keeps only the known messages, while “grep -vf input.txt *.log | grep error” filters them out.

We can additionally give the “x” flag (grep -xf input.txt *.log), which matches only when a pattern matches the whole line. E.g. if one line of the input file is “error”, only lines consisting of exactly “error” are returned. Lines where “error” appears with other text, such as “error message” or “file exception error”, are not returned.
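The three variants can be compared side by side. This is only a sketch: the log lines and the pattern file are invented for the example.

```shell
workdir=$(mktemp -d)
cat > "$workdir/app.log" <<'EOF'
error
error: disk full
error: known timeout
warning: low memory
EOF

# One search string per line (an empty line here would match every line).
cat > "$workdir/input.txt" <<'EOF'
error
known timeout
EOF

# -f: lines matching any pattern from the file.
any_match=$(grep -f "$workdir/input.txt" "$workdir/app.log")

# -v -f: lines matching none of the patterns.
no_match=$(grep -vf "$workdir/input.txt" "$workdir/app.log")

# -x -f: lines that are exactly equal to one of the patterns.
whole_line=$(grep -xf "$workdir/input.txt" "$workdir/app.log")

printf -- '-f:\n%s\n\n-vf:\n%s\n\n-xf:\n%s\n' "$any_match" "$no_match" "$whole_line"
rm -rf "$workdir"
```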

  • grep -n error *.log

With the “n” option we get the line number of every matched line in the given files.

We can combine any of these flags to meet the search criteria.

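  • grep -E "err|exception|warn" *.log
  • egrep "err|exception|warn" *.log

With the “E” (upper case) option, or the equivalent egrep command, we can give multiple search strings in the same command, using “|” (pipe) as the delimiter between them. The lower case “e” flag is different: it introduces a single pattern and can be repeated, e.g. grep -e err -e exception -e warn *.log. Generally, this type of search can be a bit more expensive than a single-string search from a performance perspective.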
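The difference between the flags shows up in the match counts. The following is a sketch on a made-up log file; the counts in the comments assume exactly these four lines.

```shell
workdir=$(mktemp -d)
cat > "$workdir/app.log" <<'EOF'
err: failed
exception raised
warning issued
all fine
EOF

# -E: "|" separates alternatives, so three lines match.
extended=$(grep -cE 'err|exception|warn' "$workdir/app.log")

# Without -E the "|" is an ordinary character, so nothing matches.
basic=$(grep -c 'err|exception|warn' "$workdir/app.log")

# Repeated -e gives the same result as the -E alternation.
repeated=$(grep -c -e err -e exception -e warn "$workdir/app.log")

printf 'extended=%s basic=%s repeated=%s\n' "$extended" "$basic" "$repeated"
rm -rf "$workdir"
```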

  • grep -F error *.log
  • fgrep error *.log

Whenever we want to search for fixed strings inside one or more files we should use the “F” (upper case) option, as it can return output faster than standard grep because the search string is not interpreted as a regular expression. This is also referred to as fgrep.

  • grep -v '^$' output.log

The above command prints the given input file without its empty lines. The file itself is not modified.
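To keep the cleaned result, the output can be redirected to a second file. A small sketch, with hypothetical file names:

```shell
workdir=$(mktemp -d)
printf 'a\n\nb\n\n\nc\n' > "$workdir/output.log"    # six lines, three of them empty

# Write the non-empty lines to a new file; redirecting onto the input
# file itself would truncate it before grep gets to read it.
grep -v '^$' "$workdir/output.log" > "$workdir/output.clean"

before=$(grep -c '' "$workdir/output.log")
after=$(grep -c '' "$workdir/output.clean")
printf 'lines before: %s, after: %s\n' "$before" "$after"
rm -rf "$workdir"
```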

  • grep -r error /usr/input

The “r” flag is the recursive flag: it searches for the given string in the given directory and all of its subdirectories, if any exist. It can impact performance if there are many subdirectories with big files.
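A sketch of a recursive search combined with the “n” flag, plus “l” to list only the matching file names. The directory tree is created just for the example.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/input/sub/deep"
printf 'ok\nerror one\n' > "$workdir/input/a.log"
printf 'error two\n' > "$workdir/input/sub/deep/b.log"

# -r descends into subdirectories, -n adds the line number to each match.
matches=$(cd "$workdir" && grep -rn error input)

# -l prints only the names of the files that contain a match.
files=$(cd "$workdir" && grep -rl error input | sort)

printf '%s\n\n%s\n' "$matches" "$files"
rm -rf "$workdir"
```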

Normal grep does not search inside compressed files. As a workaround, some people uncompress such files and then search the extracted copies. We can instead use zgrep to search gzip-compressed files directly, as below. On many systems zgrep also reads uncompressed files, but plain grep remains the usual choice for those.

  • zgrep  error *.log.gz
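As a sketch, the following compresses a made-up log file and searches it both with zgrep and with the equivalent “decompress to a pipe, then grep” form. It assumes gzip and zgrep are installed, which is the case on most systems.

```shell
workdir=$(mktemp -d)
printf 'ok\nerror in archive\n' > "$workdir/app.log"
gzip "$workdir/app.log"                 # leaves only app.log.gz behind

# zgrep decompresses on the fly and passes the text to grep.
direct=$(zgrep error "$workdir/app.log.gz")

# The same search written out by hand.
piped=$(gzip -dc "$workdir/app.log.gz" | grep error)

printf 'zgrep: %s\npiped: %s\n' "$direct" "$piped"
rm -rf "$workdir"
```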

With all of the above commands, we can pipe the result to the “more” command to view the output one page at a time.

To summarise, GREP is one of the commands UNIX users use most, and we sometimes write longer commands or bigger scripts because we are not aware of everything it can do. There are more flags available with GREP, but I have tried to list the key ones that are needed most often.

Thanks.


Linux : Debugging Networking/Connectivity Issues ?

When it comes to debugging connectivity issues between two devices, “traceroute” is one of the most widely used commands. But this command sometimes gets stuck along the way, because many firewalls filter traceroute and ping traffic given the risk of exposing network routes. How can we investigate such issues then?

Any TCP connection completes a three-way handshake before communication starts between the two devices: the client sends SYN, the server replies with SYN-ACK, and the client answers with ACK. In netstat the client shows SYN_SENT while it waits for the reply, the server shows SYN_RECV (SYN_RCVD on some systems) while it waits for the final ACK, and both show ESTABLISHED once the handshake is complete. So in order to confirm connectivity between two devices, we must ensure that the three-way handshake is complete. This can be checked with the “netstat” command.

netstat -an :- this command lists the active network connections on the machine. To check the three-way handshake, we grep the destination IP address from the output of this command.

E.g. (assuming your destination ip is 10.120.20.40 )

netstat -an | grep 10.120.20.40

If the output shows the ESTABLISHED state, your server has successfully established a connection to the destination, and you can concentrate on the application code to investigate further.

If the output shows the SYN_SENT state, your server has sent the first connection request but is still waiting for an acknowledgment from the destination server.

In such a case, you must run the same command on the destination server, giving the source machine's IP in the grep command.

If you get no output on the destination server, the request was not received there. You can then concentrate on taking a “tcpdump” with the help of the network/firewall team to investigate the outgoing route to the destination server. Most of the time it turns out to be a firewall policy filtering traffic to the destination.

If the destination server shows SYN_RECV (SYN_RCVD on some systems), we must check the return route. This state means that the connection request was received at the destination and the server sent its acknowledgment, but the reply is not reaching the source. In such a case, we check the return route from the destination to the source machine.

netstat is a very effective command and is available to most users, which means it does not need superuser access to run.
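The check can be wrapped in a small helper. The sketch below runs against saved sample text instead of a live netstat so that it works anywhere; the addresses, ports and the conn_states name are made up for illustration, and it assumes the state is the last column, as in typical “netstat -an” TCP output.

```shell
# Print the state column of every connection line that mentions the given IP.
# Expects "netstat -an" style text on standard input.
conn_states() {
    # -F treats the address as a fixed string, so "." is not a wildcard.
    grep -F -- "$1" | awk '{ print $NF }'
}

# Hypothetical netstat output saved to a file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
tcp        0      0 10.120.20.10:51514      10.120.20.40:1521       ESTABLISHED
tcp        0      1 10.120.20.10:51520      10.120.20.40:1521       SYN_SENT
tcp        0      0 10.120.20.10:22         10.120.20.99:40022      ESTABLISHED
EOF

states=$(conn_states 10.120.20.40 < "$sample")
printf '%s\n' "$states"
rm -f "$sample"
```

On a live system the same helper would be fed from a pipe: netstat -an | conn_states 10.120.20.40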

Linux : Finding faster in Files

GREP is the most used command in Linux environments when it comes to searching inside files. But when we need to search large or multiple files at regular intervals, we need to look for efficient options.

In such cases, the fgrep command (the same as grep -F) gives faster results when searching for fixed strings. It can be considerably faster than normal grep, although the gain depends on the pattern, the data and the grep implementation. It takes the same format as the standard grep command.

e.g.

fgrep test files*log

On the other hand, if we need to search in gzip-compressed files without extracting them, we can use the zgrep command, which again follows the same pattern as the normal grep command.

zgrep search_string zipfile*gz

Linux : Why Can’t I access crontab ?

Crontab allows users to schedule jobs in a controlled manner. In Linux environments every user can have their own crontab. When a cron.allow file exists in the cron directory, a user must be listed in it to create a crontab. Most operating systems do not create this file by default; the system admin creates it to restrict which users can create cron entries. There is another file called cron.deny, which blocks the users listed in it from creating cron entries. cron.deny is consulted only when cron.allow does not exist.

There is no special format for these files. You just add the system user names, one per line.
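The usual lookup order (cron.allow first, cron.deny only when cron.allow is absent) can be sketched as a small function. The function name and the temporary files are assumptions for the example; what happens when neither file exists differs between systems, and the sketch simply refuses in that case.

```shell
# Return 0 if the user may use crontab, given the allow and deny file paths.
may_use_crontab() {
    user=$1 allow=$2 deny=$3
    if [ -f "$allow" ]; then
        grep -qxF -- "$user" "$allow"        # must be listed in cron.allow
    elif [ -f "$deny" ]; then
        ! grep -qxF -- "$user" "$deny"       # must not be listed in cron.deny
    else
        return 1                             # assumption: refuse when neither file exists
    fi
}

workdir=$(mktemp -d)
printf 'alice\n' > "$workdir/cron.allow"
printf 'alice\nbob\n' > "$workdir/cron.deny"

# cron.allow exists, so cron.deny is not looked at.
may_use_crontab alice "$workdir/cron.allow" "$workdir/cron.deny"; alice_with_allow=$?
may_use_crontab bob   "$workdir/cron.allow" "$workdir/cron.deny"; bob_with_allow=$?

# No cron.allow: only cron.deny decides.
may_use_crontab carol "$workdir/missing" "$workdir/cron.deny"; carol_deny_only=$?
may_use_crontab bob   "$workdir/missing" "$workdir/cron.deny"; bob_deny_only=$?

printf 'alice=%s bob=%s carol=%s bob(deny only)=%s\n' \
    "$alice_with_allow" "$bob_with_allow" "$carol_deny_only" "$bob_deny_only"
rm -rf "$workdir"
```

A status of 0 means access is granted, 1 means it is refused.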

cron directory locations

  • Linux environments: /etc (the files are /etc/cron.allow and /etc/cron.deny on most distributions)
  • AIX: /var/adm/cron
  • MAC: /usr/lib/cron

Linux : Finding who logged on the Server ?

Unix/Linux environments are very good at tracking and auditing. System logs generally record most user activity, but they are mostly accessible only to admin users. There is another command called “last”, which provides the login time, user name and IP details for all users. It is a very useful command that provides critical information for investigating security issues. We can also search for a specific user's logins with “last sushant”, or show only the last n (e.g. 200) rows with “last -200”.

Some examples on macOS


wtmp begins Wed Nov  4 11:50

Sushs-MBP:log sushantmhatre$ last -10
sushantmhatre  ttys000                   Sat Oct 29 11:44   still logged in
_mbsetupuser  console                   Tue Aug  9 10:18   still logged in
sushantmhatre  console                   Tue Aug  9 10:18   still logged in
reboot    ~                         Tue Aug  9 10:17
shutdown  ~                         Tue Aug  9 10:17
root      console                   Tue Aug  9 10:14 - shutdown  (00:02)
sushantmhatre  ttys000                   Mon Jul 11 19:42 - 21:47 (1+02:04)
_mbsetupuser  console                   Thu May 26 13:34 - shutdown (74+20:43)
sushantmhatre  console                   Thu May 26 13:34 - 10:14 (74+20:40)
reboot    ~                         Thu May 26 13:33


Sushs-MBP:log sushantmhatre$ last sushantmhatre
sushantmhatre  ttys000                   Sat Oct 29 11:44   still logged in
sushantmhatre  console                   Tue Aug  9 10:18   still logged in
sushantmhatre  ttys000                   Mon Jul 11 19:42 - 21:47 (1+02:04)
sushantmhatre  console                   Thu May 26 13:34 - 10:14 (74+20:40)
sushantmhatre  ttys000                   Wed Apr 13 19:26 - 21:56 (16+02:30)
sushantmhatre  ttys000                   Tue Apr 12 23:28 - 19:26  (19:57)
sushantmhatre  console                   Tue Mar 29 12:25 - 13:30 (58+01:05)
sushantmhatre  console                   Thu Mar 24 17:07 - crash (4+18:17)
sushantmhatre  ttys000                   Fri Mar 18 13:14 - 16:57 (6+03:43)
sushantmhatre  console                   Fri Mar 18 13:13 - 16:58 (6+03:45)
sushantmhatre  ttys000                   Thu Feb 25 16:31 - crash (21+20:41)
sushantmhatre  console                   Mon Feb  8 20:16 - crash (38+16:56)
sushantmhatre  console                   Sun Jan 31 22:12 - crash (7+22:03)
sushantmhatre  console                   Sun Jan 31 21:39 - 22:09  (00:29)
sushantmhatre  console                   Mon Jan 18 12:17 - crash (13+09:22)
sushantmhatre  console                   Sat Jan 16 11:52 - 03:19 (1+15:26)
sushantmhatre  ttys000                   Thu Jan 14 20:20 - 21:47  (01:26)
sushantmhatre  console                   Wed Jan 13 16:15 - 00:53 (2+08:37)
sushantmhatre  console                   Sun Jan  3 22:03 - crash (9+18:12)
sushantmhatre  console                   Mon Dec 21 18:48 - crash (13+03:14)
sushantmhatre  console                   Sun Dec 20 21:08 - 14:40  (17:32)
sushantmhatre  console                   Mon Nov 30 15:02 - crash (20+06:05)
sushantmhatre  console                   Wed Nov 18 00:29 - crash (12+14:33)
sushantmhatre  console                   Sun Nov  8 22:18 - crash (9+02:11)
sushantmhatre  console                   Wed Nov  4 11:51 - 22:17 (4+10:26)
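When the output gets long, it can be summarised with awk and grep. The sketch below works on a few lines saved from the listings above rather than on a live “last”, and the choice of what to count is only an illustration.

```shell
# A few lines of "last" output saved to a file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
sushantmhatre  ttys000                   Sat Oct 29 11:44   still logged in
_mbsetupuser  console                   Tue Aug  9 10:18   still logged in
sushantmhatre  console                   Tue Aug  9 10:18   still logged in
reboot    ~                         Tue Aug  9 10:17
sushantmhatre  console                   Thu Mar 24 17:07 - crash (4+18:17)
EOF

# Sessions per user, skipping pseudo entries and the trailing "wtmp begins" line.
summary=$(awk 'NF && $1 != "reboot" && $1 != "shutdown" && $1 != "wtmp" { n[$1]++ }
               END { for (u in n) print u, n[u] }' "$sample" | sort)

# Sessions that ended in a crash.
crashes=$(grep -c ' crash ' "$sample")

printf '%s\ncrashed sessions: %s\n' "$summary" "$crashes"
rm -f "$sample"
```

On a real machine the same pipeline could read from “last” directly instead of the sample file.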


Linux : Copying commands from Windows

While copying commands from word processors and other rich-text editors into Linux environments we need to be very careful, because depending on the environment settings, characters can change after the paste.

The most common example is “-” changing to a dot “.” when copied from MS Word into a Linux environment. In a typical Linux environment “-” is used to pass flags to the current command, while “.” is used as a wildcard character. Therefore we need to be 100% sure that exactly the same command reaches the destination.
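Word processors may also swap “-” for a typographic dash, or straight quotes for curly ones, which is hard to spot by eye. One way to check a pasted command before running it is to look for bytes outside printable ASCII. This is a sketch: the has_non_ascii name is made up, and the check also flags tabs and other control characters.

```shell
# Succeeds (exit status 0) if the text contains any byte outside printable ASCII.
has_non_ascii() {
    printf '%s\n' "$1" | LC_ALL=C grep -q '[^ -~]'
}

# A command as it might arrive from a word processor: the "-" before "i" has
# become an en dash (UTF-8 bytes E2 80 93, written here as octal escapes).
pasted=$(printf 'grep \342\200\223i error app.log')
typed='grep -i error app.log'

if has_non_ascii "$pasted"; then pasted_result="suspicious"; else pasted_result="clean"; fi
if has_non_ascii "$typed";  then typed_result="suspicious";  else typed_result="clean";  fi

printf 'pasted: %s\ntyped:  %s\n' "$pasted_result" "$typed_result"
```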

Cron – Unix Job Scheduling

Crontab allows you to schedule scripts on the server in a controlled manner. The format is given below.

The script below is scheduled to run every day at 20:00 (the 20th hour).

0 20 * * * /home/oracle/scripts/export_dump.sh  > /dev/null 2>&1
 # ┌───────────── min (0 - 59)
 # │ ┌────────────── hour (0 - 23)
 # │ │ ┌─────────────── day of month (1 - 31)
 # │ │ │ ┌──────────────── month (1 - 12)
 # │ │ │ │ ┌───────────────── day of week (0 - 6) (0 to 6 are Sunday to
 # │ │ │ │ │                  Saturday, or use names; 7 is also Sunday)
 # │ │ │ │ │
 # │ │ │ │ │
 # * * * * *  command to execute

In the above example, we discard both the script's normal output and its error output.

Sometimes scripts expect environment variables at run time. If they are not set in the script itself, we must source the .profile file before the script runs (assuming .profile holds all the environment variables).

0 20 * * * . /home/.profile; /home/oracle/scripts/export_dump.sh  > /dev/null 2>&1

Alternatively, we can send the output to a separate log file, as shown below. Here “>>” appends the normal output to the log and “2>&1” sends the errors to the same place.

0 20 * * * . /home/.profile; /home/oracle/scripts/export_dump.sh >> /tmp/script_log.txt 2>&1
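The order and number of redirections matter. When two redirections target standard output in the same command, the later one wins, so a log file followed by “> /dev/null” stays empty. A small sketch with a stand-in job function (hypothetical, in place of the real script):

```shell
workdir=$(mktemp -d)

# Stand-in for the scheduled script: one line on stdout, one on stderr.
job() { echo "normal output"; echo "error output" >&2; }

# stdout is appended to the log, then stderr is pointed at the same place.
job >> "$workdir/both.log" 2>&1

# Two stdout redirections: /dev/null replaces the log file as the target,
# so the log is created but nothing is ever written to it.
job >> "$workdir/lost.log" > /dev/null 2>&1

both_lines=$(grep -c '' "$workdir/both.log")
lost_bytes=$(wc -c < "$workdir/lost.log" | tr -d ' ')
printf 'both.log lines: %s, lost.log bytes: %s\n' "$both_lines" "$lost_bytes"
rm -rf "$workdir"
```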

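The system keeps a separate crontab file for every user, typically under a spool directory such as /var/spool/cron or /var/spool/cron/crontabs; the exact location depends on the platform.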

crontab -e :-

This command is used to edit the crontab file.

crontab -l :-

This command is used to list crontab entries for the current user.

crontab -r :-

Be careful with this command as it will remove all cron entries for the given user.

Please be advised that cron has no mechanism to detect duplicate jobs. If we schedule a job to run every minute, cron triggers it every minute without checking whether the earlier run has completed or is still running. Therefore, whenever we schedule a script in cron at a high frequency, we must include a duplicate-run check in the script.
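One simple way to build such a check is a lock directory, since mkdir either creates the directory or fails if it already exists. The following is a minimal sketch, not the only approach: the lock path and function names are made up, and a real script would also need to clear a stale lock left behind by a killed run (for example with a trap).

```shell
# Hypothetical lock location; a real job would use a fixed path such as
# /tmp/export_dump.lock so that every run contends for the same lock.
base=$(mktemp -d)
lockdir="$base/export_dump.lock"

# mkdir is atomic: it succeeds for exactly one caller at a time.
acquire_lock() { mkdir "$lockdir" 2>/dev/null; }
release_lock() { rmdir "$lockdir"; }

run_job() {
    if ! acquire_lock; then
        echo "previous run still active, skipping"
        return 1
    fi
    echo "running job"
    # ... the real work would go here ...
    release_lock
}

first=$(run_job)      # lock is free, so the job runs
acquire_lock          # simulate an earlier run that has not finished yet
second=$(run_job)     # lock is held, so this run is skipped
release_lock

printf 'first:  %s\nsecond: %s\n' "$first" "$second"
rm -rf "$base"
```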