awk内置函数sub gensub gsub match等介绍

asort和asorti

格式：

asort(s [, d])
asorti(s [, d]) 

功能及返回值： asort：对数组进行排序，如果省略参数d，则修改数组s，如果提供参数d，则将数组s拷贝到d中然后进行排序，数组s不会被修改，排序后数组的下标从1开始；最终返回数组中元素个数

[root@MySQL ~]# awk 'BEGIN{a[0]="a";a[1]="c";a[2]="b";print "before sorting:";for(i in a){print i,a[i]};asort(a);print "after sorting:";for(i in a){print i,a[i]}}'
before sorting:
a
c
b
after sorting:
a
b
c

[root@mysql ~]# awk 'BEGIN{a[0]="a";a[1]="c";a[2]="b";print "before sorting:";for(i in a){print i,a[i]};asort(a,d);print "after sorting:";for(i in a){print i,a[i]};print ;for(i in d){print i,d[i]}}'
before sorting:
a
c
b
after sorting:
a
c
b

a
b
c

sub、gensub和gsub函数

格式：

sub(r, s [, t])
gsub(r, s [, t])
gensub(r, s, h [, t])

功能及返回值： sub:对于t中匹配r的字串，将第一个匹配的子串替换为s，如果t省略，则t为$0；返回值为替换的字符串个数 gsub:对于t中匹配r的字串，将匹配的所有子串替换为s，如果t省略，则t为$0；返回值为替换的字符串个数 gensub:对于t中匹配r的字串，如果h是以”g”或”G”开头的字符串，则将匹配的所有子串替换为s，如果h是数字n，则将第n处匹配进行替换；如果参数t省略，则t为$0

sub及gsub案例

[root@MySQL ~]# awk 'BEGIN{r="or|ll";s="wj";t="hello,world!hello,awk";print sub(r,s,t),t}'
1 hewjo,world!hello,awk

[root@MySQL ~]# awk 'BEGIN{r="or|ll";s="wj";t="hello,world!hello,awk";print gsub(r,s,t),t}'
3 hewjo,wwjld!hewjo,awk

[root@MySQL ~]# echo "hello,world;hello,awk"|awk '{r="or|ll";s="wj";print sub(r,s),$0}'
1 hewjo,world;hello,awk

[root@MySQL ~]# echo "hello,world;hello,awk"|awk '{r="or|ll";s="wj";print gsub(r,s),$0}'
3 hewjo,wwjld;hewjo,awk

注：正则表达式另外一种写法

[root@MySQL ~]# echo "hello,world;hello,awk"|awk \'{s="wj";print gsub(/or|ll/,s),$0}\'
3 hewjo,wwjld;hewjo,awk

注：sub和gsub函数功能相同，前者指替换匹配的第一个字符串，而后者进行全局替换

gensub案例

省略参数t

[root@MySQL ~]# echo "hello,world\!hello,awk\!hello Linux\!"|awk 'BEGIN{s="ww";r="ll"}{print gensub(r,s,"g")}'
hewwo,world!hewwo,awk!hewwo linux!

参数h以g或G开头

[root@MySQL ~]# awk 'BEGIN{s="ww";t="hello,world!hello,awk!hello linux!";r="ll";print gensub(r,s,"g",t)}'
hewwo,world!hewwo,awk!hewwo linux!

[root@MySQL ~]# awk 'BEGIN{s="ww";t="hello,world!hello,awk!hello linux!";r="ll";print gensub(r,s,"g1",t)}'
hewwo,world!hewwo,awk!hewwo linux!

[root@MySQL ~]# awk 'BEGIN{s="ww";t="hello,world!hello,awk!hello linux!";r="ll";print gensub(r,s,"G1",t)}'
hewwo,world!hewwo,awk!hewwo linux!

[root@MySQL ~]# awk 'BEGIN{s="ww";t="hello,world!hello,awk!hello linux!";r="ll";print gensub(r,s,"g1",t)}'
hewwo,world!hewwo,awk!hewwo linux!

[root@MySQL ~]# awk 'BEGIN{s="ww";t="hello,world!hello,awk!hello linux!";r="ll";print gensub(r,s,"G1",t)}'
hewwo,world!hewwo,awk!hewwo linux!

如果参数h不是数字也不以g或G开头，则替换第一处

[root@MySQL ~]# awk 'BEGIN{s="ww";t="hello,world!hello,awk!hello linux!";r="ll";print gensub(r,s,"a",t)}' 
hewwo,world!hello,awk!hello linux!

参数h是数字

[root@MySQL ~]# awk 'BEGIN{s="ww";t="hello,world!hello,awk!hello linux!";r="ll";print gensub(r,s,"1",t)}'
hewwo,world!hello,awk!hello linux!

[root@MySQL ~]# awk 'BEGIN{s="ww";t="hello,world!hello,awk!hello linux!";r="ll";print gensub(r,s,"3",t)}'
hello,world!hello,awk!hewwo linux!

[root@MySQL ~]# awk 'BEGIN{s="ww";t="hello,world!hello,awk!hello linux!";r="ll";print gensub(r,s,3,t)}'
hello,world!hello,awk!hewwo linux!

[root@MySQL ~]# awk 'BEGIN{s="ww";t="hello,world!hello,awk!hello linux!";r="ll";print gensub(r,s,0,t)}'
awk: warning: gensub: third argument of 0 treated as 1
hewwo,world!hello,awk!hello linux!

index函数

格式：

index(s, t)

功能及返回值： 返回字符串t在字符串s中的索引，如果字符串t在字符串s中不存在，则返回0(这表明字符串的索引是从1开始的)

[root@MySQL ~]# awk 'BEGIN{s="hello,world";t="llo";print index(s,t)}'
3

[root@MySQL ~]# awk 'BEGIN{s="hello,world";t="lloo";print index(s,t)}'
0

[root@MySQL ~]# awk 'BEGIN{s="hello,world";t="hello,world!";print index(s,t)}'
0

[root@MySQL ~]# awk 'BEGIN{s="hello,world";t="";print index(s,t)}'
1

注意：当字符串t为空时，返回的索引为1

length函数

格式：

length([s])

功能及返回值： 返回字符串s的长度，如果参数s省略，则返回$0的长度；从3.1.5版本开始，作为非标准扩展，如果参数为数组，则返回数组元素个数。

[root@MySQL ~]# awk 'BEGIN{s="hello,world";t="";print index(s,t)}'
1

[root@MySQL ~]# awk 'BEGIN{s="";print length(s)}'
0

[root@MySQL ~]# awk 'BEGIN{s=123;print length(s)}'
3

[root@MySQL ~]# awk 'BEGIN{s="hello world";print length(s)}'
11

[root@MySQL ~]# awk 'BEGIN{print length()}'
0

[root@MySQL ~]# echo "123 345" | awk '{print length()}'
7

match函数

格式：

match(s, r [, a])

功能及返回值： 当正则表达式r匹配字符串s中的某一部分时，返回匹配部分的索引，如果匹配不上，返回0，同时设置内置变量RSTART和RLENGTH；如果提供没有省略参数数组a，数组中的第1-n个元素为字符串s匹配正则表达式r中的带括号的子表达式的部分，数组a的第0个元素为字符串s匹配正则表达式r的完整匹配，数组的下标a[n, “start”]和a[n, “length”]分别表示匹配字符串的第一个字符的索引及匹配的字符串的长度。可能描述地不是很清楚，下面通过例子来讲解

案例一：省略参数数组a

[root@MySQL ~]# awk 'BEGIN{s="hello,world!";r="ll";print match(s,r)}'
3

匹配ll，索引为3，返回值为3

[root@MySQL ~]# awk 'BEGIN{s="hello,world!";r="wj";print match(s,r)}'
0

匹配wj，没有匹配上，返回值为0

案例二：提供参数数组a(正则表达式中没有带括号的子表达式)

[root@MySQL ~]# awk 'BEGIN{s="hello,world!";r="ll";print match(s,r,a);print ;for(i in a){print "subscript:"i"\t""valus:"a[i]}}'
3

返回s中匹配ll的索引，为3

subscript:0 start valus:3 #数组a[0,”start”]，值为s中匹配ll的索引，即3 subscript:0 length valus:2 #数组a[0,”length”]，值为匹配的字符串的长度，即ll的长度，为2 subscript:0 valus:ll #数组a[0]，值为匹配的字符串，即ll

另外一种写法，将正则表达式放在//中，和上面是同样的效果

[root@MySQL ~]# awk 'BEGIN{s="hello,world!";print match(s,/ll/,a);print ;for(i in a){print "subscript:"i"\t""valus:"a[i]}}'
3

subscript:0start valus:3 subscript:0length valus:2 subscript:0 valus:ll

案例三：提供参数数组a(正则表达式中有带括号的子表达式)

[root@MySQL ~]# awk 'BEGIN{s="hello,world!";r="(ll).*(or.*d)";print match(s,r,a);print length(a);print ;for(i in a){print "subscript:"i"\t""valus:"a[i]}}'
3 #匹配的字符串索引位置
9 #数组a中的元素个数

subscript:0start valus:3 subscript:0length valus:9 subscript:1start valus:3 subscript:2start valus:8 subscript:0 valus:llo,world subscript:1 valus:ll subscript:2 valus:orld subscript:2length valus:4 subscript:1length valus:2

#上面输出数组a的元素顺序有点乱，整理下，如下：

subscript:0 valus:llo,world subscript:0 start valus:3 subscript:0 length valus:9 subscript:1 valus:ll subscript:1 start valus:3 subscript:1 length valus:2 subscript:2 valus:orld subscript:2 start valus:8 subscript:2 length valus:4

当正则表达式中有带括号的子表达式时，数组a中的第0个元素为正则表达式的完整表达式，数组第1-n个元素为正则表达式中子表达式的内容

(ll).(or.d)

对于字符串“hello,world!”来说，

正则表达式(ll).(or.d)的的完整匹配为“llo,world”，所以a[0]的值为“llo,world”，a[0,”start”]为“llo,world”中的起始字符“l”在“hello,world!”中的索引，即3；a[0,”length”]为“llo,world”的长度，即9。

正则表达式(ll).(or.d)中的子表达式分别为(ll)和(or.*d)，匹配“hello,world!”时，分别匹配”ll”和“orld”，所以a[1]和a[2]的值分别为”ll”和“orld”，a[n, “start”]和a[n, “length”]（n=2,3）分别存储对应的索引和长度

注：

RSTART:match()函数匹配的第一个字符的索引；如果没有匹配，则为0 RLENGTH:match()函数匹配的字符串的长度；如果没有匹配，则为-1 RSTART The index of the first character matched by match(); 0 if no match. (Thisimplies that character indices start at one.) RLENGTH The length of the string matched by match(); -1 if no match.

split函数

格式：

split(s, a [, r])

功能及返回值：将字符串s用正则表达式r作为分隔符进行分割，将分割的多个字段(域)存储到数组a中；如果r省略，用awk内置的FS变量对字符串s进行分割，将将分割的多个字段(域)存储到数组a中。返回分割的字段数也即数组中元素个数。

[root@MySQL ~]# awk 'BEGIN{s="hello,world;hello,awk";r=",";print split(s,a,r);for(i in a){print i,a[i]}}'
3
1 hello
2 world;hello
3 awk

[root@MySQL ~]# awk 'BEGIN{s="hello,world;hello,awk";r="hello";print split(s,a,r);for(i in a){print i,a[i]}}'
3
1
2 ,world;
3 ,awk

[root@MySQL ~]# head -n 1 /etc/passwd|awk 'BEGIN{s="hello,world;hello"}{print split(s,a);for(i in a){print i,a[i]}}'
1
1 hello,world;hello

[root@MySQL ~]# head -n 1 /etc/passwd|awk 'BEGIN{FS=";";s="hello,world;hello"}{print split(s,a);for(i in a){print i,a[i]}}'
2
1 hello,world
2 hello

注：FS变量的赋值也可以放在pattern+action外面

[root@MySQL ~]# head -n 1 /etc/passwd|awk -v FS=";" 'BEGIN{s="hello,world;hello"}{print split(s,a);for(i in a){print i,a[i]}}'
2
1 hello,world
2 hello

sprintf函数

对于该函数，后续会单独写一篇文章介绍

strtonum函数

格式：

strtonum(str)

功能及返回值：将字符串类型转化为数字类型，如果str以0开头则被转化为8进制，如果str以0x或0X开头则被转换为16进制

[root@MySQL ~]# awk 'BEGIN{s="123";print strtonum(s)}'
123
[root@MySQL ~]# awk 'BEGIN{s="0123";print strtonum(s)}'
83
[root@MySQL ~]# awk 'BEGIN{s="0x123";print strtonum(s)}'
291
[root@MySQL ~]# awk 'BEGIN{s="0X123";print strtonum(s)}'
291
[root@MySQL ~]# awk 'BEGIN{s="a123";print strtonum(s)}'
0
[root@MySQL ~]# awk 'BEGIN{s="12a3";print strtonum(s)}'
12
[root@MySQL ~]# awk 'BEGIN{s="123a";print strtonum(s)}'
123
[root@MySQL ~]# awk 'BEGIN{s="123.456";print strtonum(s)}'
123.456
[root@MySQL ~]# awk 'BEGIN{s="";print strtonum(s)}'
0

substr函数

格式：

substr(s, i [, n])

功能及返回值：

substr:返回字符串s中从索引i开始的最大长度为n字符串，如果n省略，则返回从索引i到字符串s末尾的字符串

[root@MySQL ~]# awk 'BEGIN{s="hello,world";print substr(s,2)}'
ello,world
[root@MySQL ~]# awk 'BEGIN{s="hello,world";print substr(s,2,5)}'
ello,
[root@MySQL ~]# awk 'BEGIN{s="hello,world";print substr(s,0)}'
hello,world
[root@MySQL ~]# awk 'BEGIN{s="hello,world";print substr(s,-2)}'
hello,world
[root@MySQL ~]# awk 'BEGIN{s="hello,world";print substr(s,3,-2)}'

[root@MySQL ~]# 

tolower和toupper函数

格式：

tolower(str)
toupper(str)

功能及返回值： tolower:将字符转化为小写字母，非字母则不变 toupper:将字符转换为大写字母，非字母则不变

[root@MySQL ~]# awk 'BEGIN{s="HellO";print tolower(s)}'
hello
[root@MySQL ~]# awk 'BEGIN{s="^He;llO$";print tolower(s)}'
^he;llo$
[root@MySQL ~]# awk 'BEGIN{s="HellO";print toupper(s)}'
HELLO
[root@MySQL ~]# awk 'BEGIN{s="^He;llO$";print toupper(s)}'
^HE;LLO$

文档信息
--------------
* 作者:
* 版权声明：自由转载-非商用
* 转载: []

awk内置函数sub gensub gsub match等介绍