Print columns by column name
Suppose I have a test.txt containing
a,b,c,d,e
1,2,3,4,5
6,7,8,9,10
I want to print columns by matching column names, taken either from another text file or from an array. For example, if I am given
arr=(a b c)
I want my output to be
a,b,c
1,2,3
6,7,8
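How can I do this with bash utilities/awk/sed? My actual text file is 3GB (and the line holding the column names I want to match is actually line 3), so an efficient solution would be much appreciated. This is what I have so far: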
for j in "${arr[@]}"; do awk -F ',' -v a=$j '{ for(i=1;i<=NF;i++) {if($i==a) {print $i}}}' test.txt; done
But the output I get is
a
b
c
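Not only are the other rows missing, but each column name is printed on its own line.
Answer
With your shown samples, please try the following. The code reads two files: another_file.txt (which contains a b c, as in the example) and the actual input file, test.txt (which contains all the values).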
awk '
FNR==NR{
  for(i=1;i<=NF;i++){
    arr[$i]
  }
  next
}
FNR==1{
  for(i=1;i<=NF;i++){
    if($i in arr){
      valArr[i]
      header=(header?header OFS:"")$i
    }
  }
  print header
  next
}
{
  val=""
  for(i=1;i<=NF;i++){
    if(i in valArr){
      val=(val?val OFS:"")$i
    }
  }
  print val
}
' another_file.txt FS="," OFS="," test.txt
With the shown samples, the output will be as follows:
a,b,c
1,2,3
6,7,8
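The question also asks about taking the names from a bash array rather than a file. A minimal sketch of that, assuming bash (arrays are not POSIX sh): the array is printed one name per line and piped to awk, which reads it as its first input via the `-` operand. The temp directory, the sample file it recreates, and the awk names `want`, `keep` and `n` are illustrative choices, not part of the answer above. Columns come out in file order, not array order.

```shell
#!/usr/bin/env bash
# Recreate the sample input from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'a,b,c,d,e' '1,2,3,4,5' '6,7,8,9,10' > "$workdir/test.txt"

arr=(a b c)

# "-" makes awk read the piped names first; FS/OFS are assigned
# only after that, just before test.txt is opened.
out=$(printf '%s\n' "${arr[@]}" | awk '
FNR==NR { want[$1]; next }                  # first input: one wanted name per line
FNR==1  {                                   # header row of test.txt
  for (i=1; i<=NF; i++)
    if ($i in want) keep[++n]=i             # remember matching field numbers, in file order
}
{                                           # header row and data rows alike
  for (k=1; k<=n; k++)
    printf "%s%s", $(keep[k]), (k<n ? OFS : ORS)
}
' - FS=, OFS=, "$workdir/test.txt")

printf '%s\n' "$out"
rm -rf "$workdir"
```

Storing the kept field numbers in a list means each data row costs one loop over the selected columns only, which may matter on a multi-gigabyte file with many columns.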
Explanation: a detailed, line-by-line explanation of the above solution.
awk '  ## Start the awk program.
FNR==NR{  ## True while reading the first file, another_file.txt.
  for(i=1;i<=NF;i++){  ## Loop over all fields of the current line.
    arr[$i]  ## Create an entry in arr keyed by the field value (a wanted column name).
  }
  next  ## Skip the remaining rules for this line.
}
FNR==1{  ## First line of test.txt, i.e. the header row.
  for(i=1;i<=NF;i++){  ## Loop over all fields of the header.
    if($i in arr){  ## If this column name is one of the wanted names:
      valArr[i]  ## record its field number in valArr
      header=(header?header OFS:"")$i  ## and append the name to header, separated by OFS.
    }
  }
  print header  ## Print the selected header names.
  next  ## Skip the remaining rules for this line.
}
{  ## Every other line of test.txt.
  val=""  ## Reset val for this row.
  for(i=1;i<=NF;i++){  ## Loop over all fields of the current line.
    if(i in valArr){  ## If this field number was selected from the header:
      val=(val?val OFS:"")$i  ## append the field value to val, separated by OFS.
    }
  }
  print val  ## Print the selected values.
}
' another_file.txt FS="," OFS="," test.txt  ## Input files; FS and OFS are set to "," before test.txt is read.
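Two details from the question are not covered by the program above: the real header sits on line 3, and real data may contain empty fields. Because `val?val OFS:""` treats an empty accumulated string as "nothing yet", a row whose first selected field is empty loses a separator. The following is a sketch of one possible variant, not the answer's own code. It assumes the lines before the header can simply be skipped, and `hdr`, `big.txt`, `names.txt`, `want`, `keep` and `n` are hypothetical names chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sample: two preamble lines, header on line 3,
# and a data row whose first field is empty.
workdir=$(mktemp -d)
printf '%s\n' '# preamble 1' '# preamble 2' 'a,b,c,d,e' ',2,3,4,5' '6,7,8,9,10' > "$workdir/big.txt"
printf '%s\n' a b c > "$workdir/names.txt"

out=$(awk -v hdr=3 '
FNR==NR  { want[$1]; next }                 # names file: one wanted name per line
FNR<hdr  { next }                           # assumption: drop everything before the header
FNR==hdr {
  for (i=1; i<=NF; i++)
    if ($i in want) keep[++n]=i             # selected field numbers, in file order
}
{
  # Separators depend on position k, not on field content,
  # so empty fields keep their place in the output.
  for (k=1; k<=n; k++)
    printf "%s%s", $(keep[k]), (k<n ? OFS : ORS)
}
' "$workdir/names.txt" FS=, OFS=, "$workdir/big.txt")

printf '%s\n' "$out"
rm -rf "$workdir"
```

On this sample the second output row is `,2,3`, with the leading empty field preserved.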