Sunday, September 6, 2009

LINUX DATA MANIPULATION cut

SkyHi @ Sunday, September 06, 2009
Consider a slight variation on the company.data file we've been playing with in this section:

406378:Sales:Itorre:Jan
031762:Marketing:Nasium:Jim
636496:Research:Ancholie:Mel
396082:Sales:Jucacion:Ed

If you want to print just columns 1 to 6 of each line (the employee serial numbers), use the -c1-6 flag, as in this command:

cut -c1-6 company.data
406378
031762
636496
396082

If you want to print just columns 4 and 8 of each line (the first letter of the department and the fourth digit of the serial number), use the -c4,8 flag, as in this command:

cut -c4,8 company.data
3S
7M
4R
0S

And since this file obviously has fields delimited by colons, we can pick out just the last names by specifying the -d: and -f3 flags, like this:

cut -d: -f3 company.data
Itorre
Nasium
Ancholie
Jucacion

Here is a summary of the most common flags for the cut command:

-c [n | n,m | n-m] Specify a single column, multiple columns (separated by a comma), or range of columns (separated by a dash).
-f [n | n,m | n-m] Specify a single field, multiple fields (separated by a comma), or range of fields (separated by a dash).
-dc Specify the field delimiter.
-s Suppress (don't print) lines not containing the delimiter.

Reference: http://lowfatlinux.com/linux-columns-cut.html