Counting unique occurrences in each column

html5 • December 15, 2022, 9:26 pm • Q&A

I have a file with multiple columns, such as $2, $3 (up to $32), like this:

A refdevhet devdevhomo
B refdevhet refdevhet
C refrefhomo refdevhet
D devrefhet  refdevhet

I need to count the occurrences of each unique element in each column separately,

so that I get:

refdevhet  2 3
refrefhomo 1 0
devrefhet  1 0
devdevhomo 0 1

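I tried several variants, but they print the cumulative sum of the occurrences of each unique element across the selected fields instead.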

Answer

Here is a solution:

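BEGIN {
    FS=OFS="\t"    # tab-separated input and output
}
{
    if (NF>mxf) mxf = NF;                        # track the widest row
    for(i=1; i<=NF; i++) {ws[$i]=1; c[$i,i]++}   # remember each value; count it per column
}
END {
    for (w in ws) {
        printf "%s", w
        for (i=1;i<=mxf;i++) printf "%s%d", OFS, c[w,i];
        print ""
    }
}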

Note that the solution is generic, so it also counts the first column. To omit the first column, change i=1 to i=2 in both places.
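As a minimal sketch of that change, the script below starts both loops at i=2 and runs on the four sample rows from the question. Two things here are assumptions rather than part of the answer: the function name `count_per_column` is hypothetical, and the default whitespace field splitting is used instead of tabs because the sample appears to be space-separated. The output is piped through `sort` only because awk does not guarantee the order of `for (w in ws)`.

```shell
# Hypothetical helper: count each value per column, skipping column 1.
# Assumes whitespace-separated input (default FS), unlike the tab-separated original.
count_per_column() {
  awk '
    BEGIN { OFS = "\t" }
    {
      if (NF > mxf) mxf = NF                           # widest row seen so far
      for (i = 2; i <= NF; i++) { ws[$i]; c[$i, i]++ } # both loops start at 2
    }
    END {
      for (w in ws) {
        printf "%s", w
        for (i = 2; i <= mxf; i++) printf "%s%d", OFS, c[w, i]
        print ""
      }
    }
  ' "$@" | sort
}

printf 'A refdevhet devdevhomo\nB refdevhet refdevhet\nC refrefhomo refdevhet\nD devrefhet  refdevhet\n' |
  count_per_column
```

On the sample rows this prints four tab-separated lines: `devdevhomo 0 1`, `devrefhet 1 0`, `refdevhet 2 3` and `refrefhomo 1 0`, which matches the counts requested in the question.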

When you use the comma, the actual index is `$i SUBSEP i`, where the default value of SUBSEP is `\034` (the ASCII "FS" character).
Also, `print ""` is the same as `printf "\n"` (with the default ORS).
`ws[$i]=1` is not needed; `ws[$i]` is enough, because merely referencing an element creates it.
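To illustrate the SUBSEP point, here is a small sketch that iterates over the composite keys directly and splits each one back into its value and column number with `split(key, part, SUBSEP)`. The function name `split_keys` and the two-row input are made up for the demonstration; this is not how the answer above prints its table, only a way to see what the keys contain.

```shell
# Hypothetical demo: show that c[$i, i] is stored under the key "$i SUBSEP i".
split_keys() {
  awk '
    { for (i = 1; i <= NF; i++) c[$i, i]++ }
    END {
      for (key in c) {
        split(key, part, SUBSEP)        # part[1] is the value, part[2] the column
        print part[1], part[2], c[key]  # value, column number, count
      }
    }
  ' "$@" | sort
}

printf 'a b\na c\n' | split_keys
```

For this input the output is `a 1 2`, `b 2 1` and `c 2 1`: the value `a` occurs twice in column 1, while `b` and `c` each occur once in column 2.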

Answer

In addition to @Andriy's fine answer, with GNU awk you can also use a true two-dimensional array:

gawk '
  {for (i=2; i<=NF; i++) count[$i][i]++}
  END {
    for (word in count) {
      printf "%s", word
      for (i=2; i<=NF; i++) printf "%s%d", OFS, count[word][i]
      print ""
    }
  }
' file | column -t

Here I am assuming that every line has the same number of fields as the last line.
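If that assumption might not hold, one possible variation is to remember the largest NF seen, as the first answer does, and loop up to that in the END block. This is a sketch, not part of the original answer: the function name `count_per_column_gawk` and the variable `maxnf` are hypothetical, and it requires gawk 4.0 or later for the `count[word][i]` syntax.

```shell
# Hypothetical variant of the gawk answer that tolerates rows of different widths.
count_per_column_gawk() {
  gawk '
    {
      if (NF > maxnf) maxnf = NF                 # widest row seen so far
      for (i = 2; i <= NF; i++) count[$i][i]++   # true 2-D array (gawk only)
    }
    END {
      for (word in count) {
        printf "%s", word
        for (i = 2; i <= maxnf; i++) printf "%s%d", OFS, count[word][i]
        print ""
      }
    }
  ' "$@" | sort
}
```

It can be used like the original, for example `count_per_column_gawk file | column -t`. With a short last line such as the input `A x y` followed by `B x`, it still reports both columns (`x 2 0` and `y 0 1`), whereas looping up to the final NF would stop after the first counted column.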
