How can I pivot wider and count the occurrences of a pair across two columns?

Here is the data frame:

dt <- structure(list(ID = c(1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 4, 5, 5, 
5, 6, 6, 6, 7, 7, 7), V1 = c("ABC", "ABC", "DEF", "GHI", "GHI", 
"GHI", "JKL", "JKL", "DEF", "ABC", "MNO", "GHI", "GHI", "ABC", 
"DEF", "DEF", "GHI", "MNO", "MNO", "ABC"), V2 = c("DEF", "MNO", 
"MNO", "JKL", "DEF", "ABC", "DEF", "ABC", "ABC", "JKL", "JKL",
"ABC", "DEF", "DEF", "GHI", "MNO", "MNO", "ABC", "JKL", "JKL"
)), row.names = c(NA, -20L), class = c("data.table", "data.frame"))

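For example, ABC appears 5 times in column V1, and DEF also appears 5 times in V2. However, the two are paired with each other only three times. I want to create a count column that counts each pair regardless of which column (V1 or V2) its members fall in.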

Answer

Update

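library(igraph)  # also makes the %>% pipe available

dt[, c(2, 3, 1)] %>%                                  # V1, V2 as endpoints; ID becomes an edge attribute
    graph_from_data_frame(directed = FALSE) %>%
    get.adjacency(type = "upper") %>%                # upper triangle holds each unordered pair once
    graph_from_adjacency_matrix(weighted = TRUE) %>%  # cell counts become edge weights
    get.data.frame()                                  # long format: from, to, weight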

   from  to weight
1   ABC DEF      3
2   ABC GHI      2
3   DEF GHI      3
4   ABC JKL      3
5   DEF JKL      1
6   GHI JKL      1
7   ABC MNO      2
8   DEF MNO      2
9   GHI MNO      1
10  JKL MNO      2
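
For comparison, here is a minimal base R sketch of the same long-format count that does not need igraph. It assumes the pair members are plain character values, so that pmin() and pmax() can put each pair in alphabetical order. The names edges and pair_counts are only illustrative, and the data is rebuilt as an ordinary data.frame so the snippet runs on its own.

```r
# Sample data copied from the question (plain data.frame, ID column omitted).
dt <- data.frame(
  V1 = c("ABC", "ABC", "DEF", "GHI", "GHI", "GHI", "JKL", "JKL", "DEF", "ABC",
         "MNO", "GHI", "GHI", "ABC", "DEF", "DEF", "GHI", "MNO", "MNO", "ABC"),
  V2 = c("DEF", "MNO", "MNO", "JKL", "DEF", "ABC", "DEF", "ABC", "ABC", "JKL",
         "JKL", "ABC", "DEF", "DEF", "GHI", "MNO", "MNO", "ABC", "JKL", "JKL"),
  stringsAsFactors = FALSE
)

# Order each pair alphabetically so (DEF, ABC) and (ABC, DEF) share one key.
edges <- data.frame(
  from   = pmin(dt$V1, dt$V2),
  to     = pmax(dt$V1, dt$V2),
  weight = 1L,
  stringsAsFactors = FALSE
)

# Sum the per-row weight of 1 within each unordered pair.
pair_counts <- aggregate(weight ~ from + to, data = edges, FUN = sum)
pair_counts
```

The result should have one row per unordered pair, with the weights adding up to the 20 rows of the input.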

I think you can try the igraph option below

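library(data.table)  # needed for the dt[, -"ID"] column syntax
library(igraph)

# Treat each (V1, V2) row as an undirected edge; the adjacency matrix
# then counts how often each pair occurs, regardless of column order.
get.adjacency(
    graph_from_data_frame(dt[, -"ID"],
        directed = FALSE
    ),
    sparse = FALSE
)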

This gives

    ABC DEF GHI JKL MNO
ABC   0   3   2   3   2
DEF   3   0   3   1   2
GHI   2   3   0   1   1
JKL   3   1   1   0   2
MNO   2   2   1   2   0
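
As a cross-check, the same symmetric matrix can be sketched in base R with table(). This is an illustration rather than part of the igraph approach: lv, directed and pair_matrix are names made up here, and the data is rebuilt as a plain data.frame so the snippet is self-contained. Note that if a row ever had the same value in both V1 and V2, this sketch would count it twice on the diagonal; the sample data has no such rows.

```r
# Sample data copied from the question (plain data.frame, ID column omitted).
dt <- data.frame(
  V1 = c("ABC", "ABC", "DEF", "GHI", "GHI", "GHI", "JKL", "JKL", "DEF", "ABC",
         "MNO", "GHI", "GHI", "ABC", "DEF", "DEF", "GHI", "MNO", "MNO", "ABC"),
  V2 = c("DEF", "MNO", "MNO", "JKL", "DEF", "ABC", "DEF", "ABC", "ABC", "JKL",
         "JKL", "ABC", "DEF", "DEF", "GHI", "MNO", "MNO", "ABC", "JKL", "JKL"),
  stringsAsFactors = FALSE
)

# Shared level set so the table is square with identical row and column order.
lv <- sort(unique(c(dt$V1, dt$V2)))

# Directed counts: rows are V1 values, columns are V2 values.
directed <- table(factor(dt$V1, levels = lv), factor(dt$V2, levels = lv))

# Adding the transpose folds (a, b) and (b, a) into one symmetric count.
pair_matrix <- directed + t(directed)
pair_matrix
```

Each off-diagonal cell then holds the number of rows in which the two values appear together, in either column order.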

If you want to add a column showing the counts, you can try

transform(
    dt,
    cnts = ave(ID, pmin(V1, V2), pmax(V1, V2), FUN = length)
)

This gives

   ID  V1  V2 cnts
 1:  1 ABC DEF    3
 2:  1 ABC MNO    2
 3:  1 DEF MNO    2
 4:  2 GHI JKL    1
 5:  2 GHI DEF    3
 6:  2 GHI ABC    2
 7:  2 JKL DEF    1
 8:  2 JKL ABC    3
 9:  2 DEF ABC    3
10:  3 ABC JKL    3
11:  4 MNO JKL    2
12:  5 GHI ABC    2
13:  5 GHI DEF    3
14:  5 ABC DEF    3
15:  6 DEF GHI    3
16:  6 DEF MNO    2
17:  6 GHI MNO    1
18:  7 MNO ABC    2
19:  7 MNO JKL    2
20:  7 ABC JKL    3

