Combining four columns into two new columns in R

I am working in R and have an interesting problem. I want to convert the following data frame:

DF = data.frame(ID = c(1, 2, 3),
              Person1 = c("Devin Davey", "Rui Butt", "Keon Dotson"),
              Sign = "artist",
              Person2 = c("Eli Greer", "Alvin Simons", "Leona Ford"),
              Sex = c("female", "male", "female"),
              Score = c(10, 20, 30)) 



  ID     Person1   Sign      Person2    Sex Score
1  1 Devin Davey artist    Eli Greer female    10
2  2    Rui Butt artist Alvin Simons   male    20
3  3 Keon Dotson artist   Leona Ford female    30

into this format:

  ID         Name   Sign Score
1  1  Devin Davey artist    10
2  1    Eli Greer female    10
3  2     Rui Butt artist    20
4  2 Alvin Simons   male    20
5  3  Keon Dotson artist    30
6  3   Leona Ford female    30

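That is, the four columns Person1, Sign, Person2 and Sex should be stacked pairwise into two new columns, Name and Sign.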

My current idea is as follows:

library(dplyr)
library(tidyr)

PART1 <- DF %>% 
            select(ID, Person1, Person2, Score) %>%
            gather(key, Name, -c(ID, Score), na.rm = TRUE) %>%
            select(-key) %>%
            arrange(ID) %>%
            mutate(temp_id = 1:n())

PART2 <- DF %>% 
            select(ID, Sign, Sex) %>%
            gather(key, Sign, -ID, na.rm = TRUE) %>%
            select(-key) %>%
            arrange(ID) %>%
            mutate(temp_id = 1:n())

PART1 %>%
        left_join(PART2, by = c("ID" = "ID", "temp_id" = "temp_id")) %>%
        select(-temp_id) %>%
        relocate(Score, .after = Sign)

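But this solution does not look very elegant to me, and I think the problem can be solved in a better way.

So I would appreciate a solution that uses the tidyverse.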

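Answer

We can rename 'Sign' and 'Sex' to a common name, 'Sign', with a sequence number appended as a suffix ('Sign1', 'Sign2') so that they match the Person column names, and then use pivot_longer. The names_sep pattern splits each column name at the boundary between the letters and the trailing digit.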

library(dplyr)
library(tidyr)
DF %>% 
   rename_at(vars(c('Sign', 'Sex')), ~ c('Sign1', 'Sign2')) %>% 
   pivot_longer(cols = -c(ID, Score), names_to = c(".value", "grp"), 
        names_sep = "(?<=[a-z])(?=\\d)") %>%
   select(ID, Name = Person, Sign, Score)

Output:

# A tibble: 6 x 4
#     ID Name         Sign   Score
#  <dbl> <chr>        <chr>  <dbl>
#1     1 Devin Davey  artist    10
#2     1 Eli Greer    female    10
#3     2 Rui Butt     artist    20
#4     2 Alvin Simons male      20
#5     3 Keon Dotson  artist    30
#6     3 Leona Ford   female    30
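
As a variation on the same idea, the sketch below renames all four columns with an explicit underscore separator, so that names_sep can be a plain "_" rather than a lookaround regex. The intermediate names Name_1, Sign_1, Name_2 and Sign_2 and the variable result are illustrative assumptions, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question
DF <- data.frame(ID = c(1, 2, 3),
                 Person1 = c("Devin Davey", "Rui Butt", "Keon Dotson"),
                 Sign = "artist",
                 Person2 = c("Eli Greer", "Alvin Simons", "Leona Ford"),
                 Sex = c("female", "male", "female"),
                 Score = c(10, 20, 30))

result <- DF %>%
  # Hypothetical intermediate names: <target column>_<group number>
  rename(Name_1 = Person1, Sign_1 = Sign,
         Name_2 = Person2, Sign_2 = Sex) %>%
  # ".value" turns the part before "_" into output columns (Name, Sign);
  # the part after "_" goes into the helper column grp
  pivot_longer(cols = -c(ID, Score),
               names_to = c(".value", "grp"),
               names_sep = "_") %>%
  # Drop the helper column and put the columns in the requested order
  select(ID, Name, Sign, Score)

print(result)
```

Renaming everything up front costs one extra line, but it keeps the separator trivial and makes the pairing of columns visible in the code.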


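Answer

In base R, you can use the reshape function. Because it returns the rows in a different order, we reorder them by ID to get exactly the data shown above, although this step is not strictly necessary. Note that the name column keeps the name Person1 in the result.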

DF1<-reshape(DF, matrix(2:5, 2), dir="long")
DF1[order(DF1$ID),c("ID", "Person1","Sign", "Score")]

    ID      Person1   Sign Score
1.1  1  Devin Davey artist    10
1.2  1    Eli Greer female    10
2.1  2     Rui Butt artist    20
2.2  2 Alvin Simons   male    20
3.1  3  Keon Dotson artist    30
3.2  3   Leona Ford female    30
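
For comparison, here is a minimal base R sketch that avoids reshape altogether: it builds one data frame per person, stacks them with rbind, and sorts by ID. The variable names first, second, stacked and result are illustrative assumptions; this is an alternative technique, not the method used in the answer above.

```r
# Same sample data as in the question
DF <- data.frame(ID = c(1, 2, 3),
                 Person1 = c("Devin Davey", "Rui Butt", "Keon Dotson"),
                 Sign = "artist",
                 Person2 = c("Eli Greer", "Alvin Simons", "Leona Ford"),
                 Sex = c("female", "male", "female"),
                 Score = c(10, 20, 30))

# One slice per person, using the target column names
first  <- data.frame(ID = DF$ID, Name = DF$Person1, Sign = DF$Sign, Score = DF$Score)
second <- data.frame(ID = DF$ID, Name = DF$Person2, Sign = DF$Sex,  Score = DF$Score)

# Stack the slices, then sort by ID; order() keeps ties in their
# original order, so Person1 stays ahead of Person2 within each ID
stacked <- rbind(first, second)
result <- stacked[order(stacked$ID), ]
rownames(result) <- NULL

print(result)
```

This version is more verbose for wider data, but it names the output columns directly and needs no packages.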

