A filter is a program that takes its input from the standard input file, processes (or filters) it, and sends its output to the standard output file. Linux has a rich set of filters that can be used to work on data effectively. Some examples of filters are cat, grep, wc, tr, and cut.
The grep Filter
The grep filter searches a file for a particular pattern of characters and displays all the lines that contain that pattern. The pattern that is searched for is referred to as a regular expression. The grep filter cannot be used without specifying a regular expression.
grep regular_expression [filename]
The file name is optional in the grep command. Without a file name, grep reads from standard input (i.e. input typed at the keyboard). As each line is entered, grep searches it for the regular expression and displays the line if it contains the expression. Execution stops when the user indicates the end of input by pressing Ctrl + d.
For example, suppose grep is started with the pattern job and no file name, and the user types the words work, job, and task on separate lines. When grep reads the line job, it displays job again on the standard output. That is why job appears twice on the screen: once as typed and once as printed by grep.
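The keyboard session described above can also be reproduced without typing. The following is a minimal sketch, assuming the three input words work, job, and task; the scratch file and the variable names are made up for illustration, and the file simply stands in for the keyboard.

```shell
# Put the three "typed" lines into a scratch file (hypothetical input).
input=$(mktemp)
printf 'work\njob\ntask\n' > "$input"

# With no file name argument, grep reads standard input, which is
# redirected from the scratch file here. Only the matching line is printed.
matches=$(grep job < "$input")
echo "$matches"   # prints: job

rm -f "$input"
```

Because the input comes from a file rather than the terminal, job appears only once here: the shell does not echo the input lines the way a terminal does.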
grep "root" /etc/passwd
In the above example, the lines containing the expression “root” are displayed.
When specifying the pattern “root” in the above example, the double quotes (“) are optional. The above command can also be given as:
grep root /etc/passwd
Using the characters summarized in the following table, you can specify complex regular expressions.
|Character||Description||Example||Explanation|
|[ ]||Matches any one of a set of characters||grep "New[abc]" filename||It specifies the search pattern as ‘Newa’, ‘Newb’, or ‘Newc’.|
|[ ] with hyphen||Matches any one of a range of characters||grep "New[a-c]" filename||It specifies the search pattern as ‘Newa’, ‘Newb’, or ‘Newc’.|
|^||Matches the pattern following it if the pattern occurs at the beginning of a line||grep "^New[abc]" filename||It specifies the search pattern as ‘Newa’, ‘Newb’, or ‘Newc’, but this must occur at the beginning of the line.|
|^ within [ ]||Matches any one character that is not in the specified set||grep "New[^a-c]" filename||It specifies a pattern containing the word ‘New’ followed by any character other than ‘a’, ‘b’, or ‘c’.|
|$||Matches the pattern preceding it if the pattern occurs at the end of a line||grep "New[abc]$" filename||It specifies the search pattern as ‘Newa’, ‘Newb’, or ‘Newc’, but this must occur at the end of the line.|
|. (dot)||Matches any one character||grep "New.[abc]" filename||It specifies a pattern containing the word ‘New’ followed by any one character, followed by ‘a’, ‘b’, or ‘c’.|
|\ (backslash)||Ignores the special meaning of the character following it||grep "New\.\[abc]" filename||It specifies the literal search pattern New.[abc], in which the dot signifies the dot character itself and the special meaning of [ is ignored, so [abc] is matched character for character.|
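The special characters in the table can be tried out on a small file. The sketch below is only an illustration: the sample lines and variable names are invented, and the -c option (described in the next section) is used so that each pattern reports how many lines it matched.

```shell
# Build a small sample file (hypothetical content) to try the patterns on.
sample=$(mktemp)
printf '%s\n' 'Newa is first' 'this is Newb' 'Newd is out' \
    'New.a dotted' 'New.[abc] literal' > "$sample"

set_count=$(grep -c 'New[abc]' "$sample")         # Newa, Newb, or Newc anywhere
start_count=$(grep -c '^New[abc]' "$sample")      # only at the start of a line
end_count=$(grep -c 'New[abc]$' "$sample")        # only at the end of a line
not_count=$(grep -c 'New[^a-c]' "$sample")        # New, then anything except a, b, c
dot_count=$(grep -c 'New.[abc]' "$sample")        # New, any one character, then a, b, or c
literal_count=$(grep -c 'New\.\[abc]' "$sample")  # the literal text New.[abc]

# prints: 2 1 1 3 1 1
echo "$set_count $start_count $end_count $not_count $dot_count $literal_count"

rm -f "$sample"
```

Single quotes are used around the patterns so that the shell passes characters such as $ and \ to grep unchanged.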
Options of the grep Filter
The grep filter also has options that alter the output of the command. These are:
-n : Prints each line matching the pattern, along with its line number. The number is printed at the beginning of the line.
-c : Prints only a count of the lines that match the pattern.
-v : Prints all the lines that do not match the pattern specified by the regular expression.
Options must be specified before the regular expression. Options can also be combined. (For example, -n and -v can be used together as -nv.)
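The effect of each option is easiest to see side by side. This is a small sketch on invented data (three passwd-like lines, two of which contain root); the file and variable names are assumptions made for the example.

```shell
# Hypothetical sample data: three lines, two of which contain "root".
sample=$(mktemp)
printf '%s\n' 'root:x:0:0' 'daemon:x:1:1' 'chroot:x:2:2' > "$sample"

numbered=$(grep -n root "$sample")    # matching lines, prefixed with line numbers
count=$(grep -c root "$sample")       # number of matching lines
inverted=$(grep -v root "$sample")    # lines that do NOT match
combined=$(grep -nv root "$sample")   # non-matching lines, with line numbers

echo "$numbered"   # 1:root:x:0:0 and 3:chroot:x:2:2
echo "$count"      # 2
echo "$inverted"   # daemon:x:1:1
echo "$combined"   # 2:daemon:x:1:1

rm -f "$sample"
```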
The wc Filter
The wc filter is used to count the number of lines, words, and characters in a disk file or in the standard input.
wc [options] [filename/s]
2 7 29
The file sample has two lines, seven words, and 29 characters.
The following table summarizes the options of the wc filter.
-l : displays the number of lines.
-w : displays the number of words.
-c : displays the number of characters (strictly speaking, bytes; the two are the same for plain ASCII text).
Since wc is a filter, it uses the standard input if no file name is provided, as depicted below:
wc is an example of a filter
<Press Ctrl and d>
1 7 29
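Each of the three counts can also be requested on its own with the options listed earlier. The sketch below reuses the sentence from the example above; the scratch file and variable names are made up for illustration. The $(( ... )) wrapper is only there to strip the leading spaces that some wc implementations print.

```shell
# Store the example sentence, followed by a newline, in a scratch file.
sample=$(mktemp)
printf 'wc is an example of a filter\n' > "$sample"

# Reading from standard input keeps the file name out of wc's output.
lines=$(( $(wc -l < "$sample") ))   # number of lines
words=$(( $(wc -w < "$sample") ))   # number of words
chars=$(( $(wc -c < "$sample") ))   # number of bytes, newline included

echo "$lines $words $chars"   # prints: 1 7 29

rm -f "$sample"
```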
The cut Filter
The cut filter is useful when specific columns from the output of certain commands (such as ls -l and who) need to be extracted. It can also be used to extract specific columns from files.
cut [options] [filename/s]
Refer to the following for the options of the cut filter:
-f : displays the specified columns
-c : displays the specified characters.
-d : specifies the column delimiter (that is, the separator).
cut -d ':' -f1 /etc/passwd
Here, the cut command has been used to extract only the names of the users in the /etc/passwd file. The field separator is “:”.
cut -c1-5 /etc/passwd
This command displays the first five characters of each line of the file /etc/passwd.
cut -d ':' -f1,6 /etc/passwd
The above command displays the first and sixth columns of the file /etc/passwd.
cut -d ':' -f1-4 /etc/passwd
The above command displays the first, second, third, and fourth columns of the file /etc/passwd.
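Since the contents of /etc/passwd differ from one machine to another, the same four cut commands can be tried on a small stand-in file. The two records below are invented for the example, as are the variable names.

```shell
# A two-line stand-in for /etc/passwd (hypothetical records).
sample=$(mktemp)
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
    'alice:x:1000:1000:Alice:/home/alice:/bin/sh' > "$sample"

names=$(cut -d ':' -f1 "$sample")         # first field: the user names
first_five=$(cut -c1-5 "$sample")         # first five characters of each line
name_home=$(cut -d ':' -f1,6 "$sample")   # fields 1 and 6
first_four=$(cut -d ':' -f1-4 "$sample")  # fields 1 through 4

echo "$names"        # root, alice
echo "$first_five"   # root:, alice
echo "$name_home"    # root:/root, alice:/home/alice
echo "$first_four"   # root:x:0:0, alice:x:1000:1000

rm -f "$sample"
```

Note that when several fields are selected, cut keeps the delimiter between them in its output.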
The tr Filter
The tr filter can be used to translate one set of characters to another. The tr filter translates the characters in the output only and does not affect the input file.
tr ':' ' ' < /etc/passwd
This command replaces all the occurrences of the ‘:’ character with a space for the file /etc/passwd, and displays the output on the standard output.
This filter can also be used to squeeze (reduce) repeated occurrences of a character into one. Several commands produce multicolumn output in which the gap between columns is more than one space. In such cases, cut cannot extract a column directly, because it treats every single space as a delimiter and the columns are separated by more than one. The solution is to squeeze the multiple spaces between columns into a single space and then use the cut command to extract the desired columns.
The -s option is used to squeeze several occurrences of a character into one character, as illustrated below using the output of the who command.
alexander :0 2016-06-12 21:40 (:0)
alexander pts/0 2016-06-14 12:07 (:0)
To make the column separator a single space, tr -s has to be used, as shown below:
who > temp_file
tr -s " " < temp_file
alexander :0 2016-06-12 21:40 (:0)
alexander pts/0 2016-06-14 12:07 (:0)
Here, the output of the who command is first redirected to a file named ‘temp_file’. The -s option of tr then works on every record, squeezing (reducing) the repeated spaces into a single space.
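Once the spaces are squeezed, cut can pick out a column reliably. The sketch below assumes who-style output with runs of spaces between the columns; the two records and the file names are made up for the example, since real who output depends on who is logged in.

```shell
# Hypothetical who-style output with several spaces between columns.
who_out=$(mktemp)
printf '%s\n' 'alexander :0           2016-06-12 21:40 (:0)' \
    'alexander pts/0        2016-06-14 12:07 (:0)' > "$who_out"

# Step 1: squeeze every run of spaces into a single space.
squeezed=$(mktemp)
tr -s ' ' < "$who_out" > "$squeezed"

# Step 2: with exactly one space between columns, cut can split on it.
terminals=$(cut -d ' ' -f2 "$squeezed")   # second column: the terminal
dates=$(cut -d ' ' -f3 "$squeezed")       # third column: the login date

echo "$terminals"   # :0 and pts/0
echo "$dates"       # 2016-06-12 and 2016-06-14

rm -f "$who_out" "$squeezed"
```

Running the same cut -f3 on the unsqueezed file would return empty fields, because each extra space counts as a separate delimiter.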
Another common use of the tr filter is case-conversion.
tr "[a-z]" "[A-Z]"
The cat jumped over the dog quickly
THE CAT JUMPED OVER THE DOG QUICKLY
The above command converts all lowercase letters to uppercase.
The contents of a file can also be translated on the command line. Let us translate the temp_file we created earlier to all uppercase:
tr "[a-z]" "[A-Z]" < temp_file
ALEXANDER :0 2016-06-12 21:40 (:0)
ALEXANDER PTS/0 2016-06-14 12:07 (:0)
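Case conversion can also be written with the POSIX character classes [:lower:] and [:upper:] instead of explicit ranges, which some people find easier to read. The following sketch is one possible variant, not the form used in this article; the scratch file and variable names are invented.

```shell
# The sentence from the earlier example, stored in a scratch file.
sentence=$(mktemp)
printf 'The cat jumped over the dog quickly\n' > "$sentence"

# [:lower:] and [:upper:] name all lowercase and all uppercase letters.
upper=$(tr '[:lower:]' '[:upper:]' < "$sentence")
# Swapping the two sets converts in the other direction.
lower=$(tr '[:upper:]' '[:lower:]' < "$sentence")

echo "$upper"   # THE CAT JUMPED OVER THE DOG QUICKLY
echo "$lower"   # the cat jumped over the dog quickly

rm -f "$sentence"
```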
In Linux, filters and other commands can be combined so that the standard output of one filter or command is sent as standard input to another filter or command.
For example, to display the contents of the current directory one screen at a time, you can type the following commands:
ls > temp_file
more temp_file
Here, a listing (using the ‘ls’ command) of the directory is stored in the file temp_file by the first command. This file is then used as input by the more command.
Through the Linux pipe feature, these two steps can be combined and executed as a single command without creating a temporary file, as shown below:
ls | more
The vertical bar (|) is the pipe character, which indicates to the shell that the output of the command before ‘|’ is sent as input to the command after ‘|’.
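That the pipe and the temporary file give the same result can be checked directly. In the sketch below, wc -l takes the place of more so that the example runs without waiting for a keypress; the directory, the three file names, and the variable names are all made up for illustration.

```shell
# A scratch directory with three files (hypothetical names).
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt" "$dir/c.txt"

# Two steps: save the listing in a temporary file, then read it back.
listing=$(mktemp)
ls "$dir" > "$listing"
two_step=$(( $(wc -l < "$listing") ))

# One step: the pipe hands the output of ls straight to wc.
piped=$(( $(ls "$dir" | wc -l) ))

echo "$two_step $piped"   # prints: 3 3

rm -rf "$dir"
rm -f "$listing"
```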
Another advantage of the pipe feature is that utilities do not have to be rewritten to perform complex tasks. Instead, Linux tools (commands) can be combined. There is no limit to the number of filters or commands in a pipe. Consider the example in the figure below to understand how a pipe works.
The following is a sample output of the ls -l command:
drwxrwxr-x 2 alexander alexander 4096 May 24 08:46 deja-dup
drwxr-xr-x 9 alexander alexander 4096 Jul 30 12:08 Desktop
drwxr-xr-x 5 alexander alexander 4096 Jul 29 16:10 Documents
drwxr-xr-x 2 alexander alexander 4096 Jul 30 11:31 Downloads
-rw-r--r-- 1 alexander alexander 8980 Apr 28 07:46 examples.desktop
drwxr-xr-x 2 alexander alexander 4096 Apr 28 07:54 Music
drwxr-xr-x 3 alexander alexander 4096 Jul 30 11:31 Pictures
drwxr-xr-x 2 alexander alexander 4096 Apr 28 07:54 Public
-rw-rw-r-- 1 alexander alexander 109 Jul 30 13:34 tempfile
drwxr-xr-x 2 alexander alexander 4096 Apr 28 07:54 Templates
drwxrwxr-x 2 alexander alexander 4096 Jun 6 11:02 ubuntu
drwxr-xr-x 2 alexander alexander 4096 Jul 30 11:30 Videos
Some examples of the pipe feature, based on the above output, are given below.
1. To display the names of all the ordinary files in the current directory, type the following command:
Here, the output of the ls -l command is given as input to the grep command. The ls -l command gives a detailed list of the files in your current directory. The grep command extracts all the ordinary files. This output is given to the tr command, which replaces multiple spaces with a single space. The cut command takes this as input, extracts the ninth field, which is the file name, and displays it.
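A pipeline matching this description might look like the sketch below. The grep pattern '^-' is an assumption: it relies on ls -l starting the line for an ordinary file with a hyphen, which is consistent with the sample listing above. The scratch directory and its contents are invented, and LC_ALL=C is set only to keep the date format, and therefore the field numbering, predictable.

```shell
# A scratch directory with two ordinary files and one subdirectory.
dir=$(mktemp -d)
touch "$dir/notes.txt" "$dir/report.txt"
mkdir "$dir/subdir"

# ls -l   : detailed listing, one entry per line
# grep    : keep lines starting with "-", i.e. ordinary files (assumed pattern)
# tr -s   : squeeze runs of spaces into a single space
# cut     : keep the ninth space-separated field, the file name
files=$(cd "$dir" && LC_ALL=C ls -l | grep '^-' | tr -s ' ' | cut -d ' ' -f9)

echo "$files"   # notes.txt and report.txt; subdir is filtered out

rm -rf "$dir"
```

This approach breaks for file names that contain spaces, because cut would return only the first word of the name.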
2. To display the file names in the current directory along with the file sizes, one screenful at a time, give the command:
ls -l | tr -s " " | cut -d" " -f5,9 | more
3. Let's say there is a staff data file (call it ‘staffContact’) that contains information like this:
mikel 60 male 08066786568
sarah 23 female 09086676768
collins 34 male 08054768769
brymo 44 male 07054646271
gerad 23 Male 08165242516
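To scan the file staffContact for the string “male” (assuming we want to extract all the staff who are male), we can use this command:
cat staffContact | grep "male" | more
The same search can be written without cat by redirecting the file into grep:
grep "male" < staffContact | more
Be aware that the plain pattern male also matches the line containing female.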
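On the sample data, the plain pattern male is not quite precise: it also matches female, and it misses Male because grep is case-sensitive. One way to tighten the search, sketched below, is to add the -w option (match whole words only; supported by GNU and BSD grep, though not required by POSIX) and the -i option (ignore case). The file is recreated here from the records listed above, with the comma in the first record read as a space; the variable names are made up.

```shell
# Recreate the staff file from the records shown above.
staff=$(mktemp)
printf '%s\n' 'mikel 60 male 08066786568' 'sarah 23 female 09086676768' \
    'collins 34 male 08054768769' 'brymo 44 male 07054646271' \
    'gerad 23 Male 08165242516' > "$staff"

plain=$(grep -c 'male' "$staff")       # counts sarah too, because "female" contains "male"
whole=$(grep -cw 'male' "$staff")      # whole word only: mikel, collins, brymo
nocase=$(grep -ciw 'male' "$staff")    # whole word, any case: gerad is now included

echo "$plain $whole $nocase"   # prints: 4 3 4

rm -f "$staff"
```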
The tee Command
The intermediate output in a pipe is discarded by Linux, which means it is not saved on the disk. Sometimes, you may want to send the output of a command to another command and, in the process, also save it on the disk for later use. The tee command takes standard input and writes it to standard output and to one or more files. If the file to be written to does not exist, it is created. If the file already exists, its contents are overwritten. The -a (append) option can be used to append to an existing file instead.
ls -al | tee allFiles
The above command displays all the files in the current directory on the screen and also writes to the file “allFiles”.
cut -d ':' -f1-4 /etc/passwd | tee users-passwords
The above command extracts the first, second, third, and fourth columns from the passwd file and passes them to the tee command, which stores the data in the file users-passwords and also displays it on the screen.
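The difference between overwriting and appending with tee can be seen in a short experiment. This is a minimal sketch; the scratch file and the three words written to it are invented for the example.

```shell
log=$(mktemp)

# tee copies its input to standard output AND to the file.
shown=$(echo 'first' | tee "$log")

# With -a, tee appends to the file instead of overwriting it.
echo 'second' | tee -a "$log" > /dev/null
appended=$(cat "$log")

# Without -a, the previous contents of the file are replaced.
echo 'third' | tee "$log" > /dev/null
overwritten=$(cat "$log")

echo "$shown"         # first
echo "$appended"      # first, then second
echo "$overwritten"   # third

rm -f "$log"
```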