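Testing a condition on every element of a matrix
I need to evaluate a condition on every element of a matrix, where the test involves the dimnames at each index. Specifically, if an element's row name and column name share a particular feature (here, the prefix before the underscore), I want to set that element to zero.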
m <- matrix(rnorm(81),nrow=9)
colnames(m) <- paste(c(rep("A",3),rep("B",3),rep("C",3)),1:ncol(m),sep = "_")
rownames(m) <- paste(c(rep("A",3),rep("B",3),rep("C",3)),1:nrow(m),sep = "_")
m
A_1 A_2 A_3 B_4 B_5 B_6 C_7 C_8 C_9
A_1 -0.03201198 -2.2241923 -0.65584334 -0.346745371 -0.9263060 -1.99181830 0.9138187 -2.4959751 -0.96723090
A_2 -1.44319826 -0.2225057 -1.35327091 -0.009194619 0.5798469 2.42753826 -1.4574564 -0.8858597 -1.41595891
A_3 -0.05863965 -0.2177708 -0.39131739 0.729532751 1.4106448 0.15899085 -1.7521345 0.5398222 -0.05073061
B_4 1.11006840 1.0315201 0.10434758 -0.508430234 -1.7095192 0.90913528 1.7367210 -0.9006098 -1.41698688
B_5 0.21405173 0.4735690 0.42655214 -0.367748304 0.9820261 -0.77933908 1.1326391 0.5316226 2.24951820
B_6 0.27153476 -0.3506076 0.16943749 -0.666135969 0.2962018 1.12236640 -1.3103133 -1.9494454 -0.57526358
C_7 1.69732641 -1.1439368 -0.02734925 -0.814635435 0.6658583 0.68069434 0.3330596 -1.2564933 0.15807742
C_8 0.35194835 -0.7075880 -0.45814046 0.773997223 -0.6530986 0.01295098 0.2557955 1.4658751 -3.33651509
C_9 0.58610083 0.7908394 -1.38909037 -0.742739398 -0.3745243 2.80990368 0.2172529 -0.3672324 0.56309688
The result I want is easy to get with a for loop, but that is of course very slow on large matrices.
for (i in seq_len(nrow(m))) {
  for (j in seq_len(ncol(m))) {
    # zero the element when the row and column names share the prefix before "_"
    m[i, j] <- ifelse(sub("_.*", "", rownames(m)[i]) == sub("_.*", "", colnames(m)[j]), 0, m[i, j])
  }
}
m
A_1 A_2 A_3 B_4 B_5 B_6 C_7 C_8 C_9
A_1 0.0000000 0.0000000 0.00000000 -0.346745371 -0.9263060 -1.99181830 0.9138187 -2.4959751 -0.96723090
A_2 0.0000000 0.0000000 0.00000000 -0.009194619 0.5798469 2.42753826 -1.4574564 -0.8858597 -1.41595891
A_3 0.0000000 0.0000000 0.00000000 0.729532751 1.4106448 0.15899085 -1.7521345 0.5398222 -0.05073061
B_4 1.1100684 1.0315201 0.10434758 0.000000000 0.0000000 0.00000000 1.7367210 -0.9006098 -1.41698688
B_5 0.2140517 0.4735690 0.42655214 0.000000000 0.0000000 0.00000000 1.1326391 0.5316226 2.24951820
B_6 0.2715348 -0.3506076 0.16943749 0.000000000 0.0000000 0.00000000 -1.3103133 -1.9494454 -0.57526358
C_7 1.6973264 -1.1439368 -0.02734925 -0.814635435 0.6658583 0.68069434 0.0000000 0.0000000 0.00000000
C_8 0.3519484 -0.7075880 -0.45814046 0.773997223 -0.6530986 0.01295098 0.0000000 0.0000000 0.00000000
C_9 0.5861008 0.7908394 -1.38909037 -0.742739398 -0.3745243 2.80990368 0.0000000 0.0000000 0.00000000
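This is conceptually similar to this question, but I can't figure out how to evaluate a condition on the dimnames inside the apply family. Any suggestions? Thanks!
Answer
sub is vectorized. The only change needed is to replicate the name vectors (here by indexing them with row(m) and col(m)) so that each has as many elements as the matrix, then extract the matching elements with the resulting logical vector and assign 0 to them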
rnm <- sub("_.*", "", rownames(m))
cnm <- sub("_.*", "", colnames(m))
m[rnm[row(m)] == cnm[col(m)]] <- 0
-output
m
A_1 A_2 A_3 B_4 B_5 B_6 C_7 C_8 C_9
A_1 0.0000000 0.00000000 0.00000000 -2.84612856 0.1611173 0.29004403 -0.8844186 0.8363131 -0.57395543
A_2 0.0000000 0.00000000 0.00000000 1.92166045 0.2671856 -0.50424582 0.8672366 -2.2496354 0.04046654
A_3 0.0000000 0.00000000 0.00000000 -0.01515657 0.6775447 -0.03862803 1.7764642 0.7146040 0.33652933
B_4 1.0842025 0.32981334 -0.61961179 0.00000000 0.0000000 0.00000000 0.3366096 0.5196087 1.67367867
B_5 -1.2897726 0.04082185 1.62661008 0.00000000 0.0000000 0.00000000 -1.6750488 0.5464289 0.98246881
B_6 -0.4213025 -0.70439232 0.09091241 0.00000000 0.0000000 0.00000000 0.7050487 -0.4151445 -0.04604658
C_7 1.0594241 -1.43071282 -0.75394573 1.08360040 1.3646551 -0.88687658 0.0000000 0.0000000 0.00000000
C_8 1.3908082 1.12093371 1.73690687 1.05202987 0.6152715 0.45601621 0.0000000 0.0000000 0.00000000
C_9 -0.3026523 1.14507291 -1.04714611 -2.38087279 -0.6976168 -0.96394113 0.0000000 0.0000000 0.00000000
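As a sanity check, the same mask can also be built with outer(), which compares every row prefix against every column prefix directly. The sketch below assumes the group is everything before the first underscore; the seed value and the names grp, mask, m_vec, m_outer and m_loop are illustrative and not part of the original post.

```r
set.seed(42)  # arbitrary seed, only so the example is reproducible
m <- matrix(rnorm(81), nrow = 9)
grp <- rep(c("A", "B", "C"), each = 3)
colnames(m) <- paste(grp, seq_len(ncol(m)), sep = "_")
rownames(m) <- paste(grp, seq_len(nrow(m)), sep = "_")

# Group prefix of each row and column name (text before the first "_")
rnm <- sub("_.*", "", rownames(m))
cnm <- sub("_.*", "", colnames(m))

# Vectorized approach from the answer
m_vec <- m
m_vec[rnm[row(m)] == cnm[col(m)]] <- 0

# Equivalent mask built with outer(): mask[i, j] is TRUE when the prefixes match
mask <- outer(rnm, cnm, "==")
m_outer <- m
m_outer[mask] <- 0

# Reference result from a double loop, as in the question
m_loop <- m
for (i in seq_len(nrow(m))) {
  for (j in seq_len(ncol(m))) {
    if (rnm[i] == cnm[j]) m_loop[i, j] <- 0
  }
}

identical(m_vec, m_outer)
identical(m_vec, m_loop)
```

Both comparisons should return TRUE. The outer() form avoids building the two index matrices from row(m) and col(m), which may be slightly easier to read; either way the work is done in vectorized code rather than element by element.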
Or another option is to reshape to 'long' format, set the 'Freq' column to 0 based on a substr condition, and then use xtabs to reshape back
xtabs(Freq ~ Var1 + Var2, transform(as.data.frame.table(m),
Freq = Freq * (substr(Var1, 1, 1) != substr(Var2, 1, 1))))
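Note that xtabs returns an object of class "xtabs"/"table" rather than a plain matrix, and that substr(x, 1, 1) only compares the first character, so it assumes single-character group prefixes. A minimal sketch of the same idea written step by step, with the result converted back to an ordinary matrix (the seed and the names long, tab and res are illustrative, not from the original post):

```r
set.seed(42)  # arbitrary seed, only so the example is reproducible
m <- matrix(rnorm(81), nrow = 9)
grp <- rep(c("A", "B", "C"), each = 3)
colnames(m) <- paste(grp, seq_len(ncol(m)), sep = "_")
rownames(m) <- paste(grp, seq_len(nrow(m)), sep = "_")

# Long format: one row per matrix cell, with columns Var1, Var2, Freq
long <- as.data.frame.table(m)

# Multiply by FALSE (0) where the first characters match, by TRUE (1) otherwise
long$Freq <- long$Freq * (substr(long$Var1, 1, 1) != substr(long$Var2, 1, 1))

# Back to wide format; each (Var1, Var2) pair occurs once, so the sum is the value itself
tab <- xtabs(Freq ~ Var1 + Var2, data = long)

# Drop the table class and the Var1/Var2 labels to recover a plain matrix
res <- matrix(as.vector(tab), nrow = nrow(m), dimnames = unname(dimnames(tab)))

# Compare with the direct vectorized assignment
rnm <- sub("_.*", "", rownames(m))
cnm <- sub("_.*", "", colnames(m))
m_vec <- m
m_vec[rnm[row(m)] == cnm[col(m)]] <- 0
all.equal(res, m_vec)
```

For large matrices the direct logical-index assignment is likely the cheaper of the two, since the long-format route creates a data frame with one row per cell.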