Printing columns by column name

Suppose I have a file test.txt containing:

a,b,c,d,e
1,2,3,4,5
6,7,8,9,10

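I want to print the columns whose names match a given list, where the list comes either from another text file or from an array. For example, given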

arr=(a b c)

I want my output to be

a,b,c
1,2,3
6,7,8

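How can I do this with bash utilities, awk, or sed? My actual text file is 3GB (and the line holding the column names I want to match is actually line 3), so an efficient solution would be much appreciated. This is what I have so far: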

for j in "${arr[@]}"; do awk -F ',' -v a=$j '{ for(i=1;i<=NF;i++) {if($i==a) {print $i}}}' test.txt; done

But the output I get is

a
b
c

Not only are the other rows missing, but each column name is also printed on its own line.

Answer

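With the samples you have shown, please try the following. The code reads two files: another_file.txt (which contains a b c, as in your example) and the actual input file, test.txt (which contains all the values).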

awk '
FNR==NR{
  for(i=1;i<=NF;i++){
    arr[$i]
  }
  next
}
FNR==1{
  for(i=1;i<=NF;i++){
    if($i in arr){
      valArr[i]
      header=(header?header OFS:"")$i
    }
  }
  print header
  next
}
{
  val=""
  for(i=1;i<=NF;i++){
    if(i in valArr){
       val=(val?val OFS:"")$i
    }
  }
  print val
}
' another_file.txt FS="," OFS="," test.txt

With the sample input, the output is as follows:

a,b,c
1,2,3
6,7,8

Explanation: the same solution, annotated line by line.

awk '                                         ##Start of the awk program.
FNR==NR{                                      ##TRUE only while reading the first file, another_file.txt.
  for(i=1;i<=NF;i++){                         ##Loop over all fields of the current line.
    arr[$i]                                   ##Record each wanted column name as an index of arr.
  }
  next                                        ##Skip the remaining blocks for this line.
}
FNR==1{                                       ##TRUE for the 1st line (the header) of test.txt.
  for(i=1;i<=NF;i++){                         ##Loop over all fields of the header line.
    if($i in arr){                            ##If this column name is one of the wanted names:
      valArr[i]                               ##remember its field number as an index of valArr,
      header=(header?header OFS:"")$i         ##and append the name to header, separated by OFS.
    }
  }
  print header                                ##Print the filtered header.
  next                                        ##Skip the remaining block for this line.
}
{                                             ##Runs for every remaining line of test.txt.
  val=""                                      ##Reset val for the current line.
  for(i=1;i<=NF;i++){                         ##Loop over all fields of the current line.
    if(i in valArr){                          ##If this field number was remembered from the header:
       val=(val?val OFS:"")$i                 ##append the field value to val, separated by OFS.
    }
  }
  print val                                   ##Print the selected values.
}
' another_file.txt FS="," OFS="," test.txt    ##Input files; FS and OFS are set to "," only before test.txt is read.
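
The question also asks about taking the names from a bash array, and mentions that the header is not always on line 1. The following is only a sketch of one way to cover both, not part of the solution above. It assumes the names contain no spaces or backslashes, and it introduces two variables of my own: cols (the array joined by spaces) and hdr (the line number of the header row). Skipping the lines before the header is also an assumption, since the question does not say what should happen to them.

```shell
#!/usr/bin/env bash
# Recreate the sample data from the question so the sketch is self-contained.
printf '%s\n' 'a,b,c,d,e' '1,2,3,4,5' '6,7,8,9,10' > test.txt

arr=(a b c)

# cols: wanted names joined by spaces (assumes no spaces or backslashes in names).
# hdr:  hypothetical variable, line number of the header row (1 for the sample).
out=$(awk -F',' -v OFS=',' -v cols="${arr[*]}" -v hdr=1 '
BEGIN {
  n = split(cols, names, " ")          # split the joined names back apart
  for (i = 1; i <= n; i++) want[names[i]]
}
FNR < hdr { next }                     # assumption: lines before the header are skipped
FNR == hdr {
  k = 0
  for (i = 1; i <= NF; i++)
    if ($i in want) keep[++k] = i      # remember wanted field numbers in file order
}
{
  line = ""
  for (j = 1; j <= k; j++)
    line = (j > 1 ? line OFS : "") $(keep[j])
  print line                           # the header line itself is printed here too
}' test.txt)

printf '%s\n' "$out"
```

For the real file described in the question you would pass hdr=3. Storing the matching field numbers in keep means each data line only visits the selected columns rather than every field, which may help on a 3GB file, though I have not measured it.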

