组合：rowwise()、mutate()、cross()，用于多种功能

html5 • 2022年9月14日 pm1:29 • 问答

这在某种程度上与此相关的问题：原则上我试着去了解如何rowwise操作与mutate多个列采用更然后像（1个功能mean()，sum()，min()等）的工作。

我已经了解到可以across完成这项工作而不是c_across。我已经学会了该功能mean()是将不同的功能min()以如下方式mean()不起作用在dataframes，我们需要将其更改到可以不公开或as.matrix做载体- >从Ronak沙阿了解到这里了解横行（）和 c_across()

现在以我的实际情况为例：我能够完成这项任务，但我丢失了一个 column d。我怎样才能避免d这种设置中的柱子松动。

我的 df：

df <- structure(list(a = 1:5, b = 6:10, c = 11:15, d = c("a", "b", 
"c", "d", "e"), e = 1:5), row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame"))

不工作：

df %>% 
  rowwise() %>% 
  mutate(across(a:e), 
         avg = mean(unlist(cur_data()), na.rm = TRUE),
         min = min(unlist(cur_data()), na.rm = TRUE), 
         max = max(unlist(cur_data()), na.rm = TRUE)
  )

# Output:
      a     b     c d         e   avg min   max  
  <int> <int> <int> <chr> <int> <dbl> <chr> <chr>
1     1     6    11 a         1    NA 1     a    
2     2     7    12 b         2    NA 12    b    
3     3     8    13 c         3    NA 13    c    
4     4     9    14 d         4    NA 14    d    
5     5    10    15 e         5    NA 10    e

作品，但我松散列d：

df %>% 
  select(-d) %>% 
  rowwise() %>% 
  mutate(across(a:e), 
         avg = mean(unlist(cur_data()), na.rm = TRUE),
         min = min(unlist(cur_data()), na.rm = TRUE), 
         max = max(unlist(cur_data()), na.rm = TRUE)
  )

      a     b     c     e   avg   min   max
  <int> <int> <int> <int> <dbl> <dbl> <dbl>
1     1     6    11     1  4.75     1    11
2     2     7    12     2  5.75     2    12
3     3     8    13     3  6.75     3    13
4     4     9    14     4  7.75     4    14
5     5    10    15     5  8.75     5    15

回答

使用pmap()frompurrr可能更可取，因为您只需要选择一次数据，并且可以使用 select 助手：

df %>% 
 mutate(pmap_dfr(across(where(is.numeric)),
                 ~ data.frame(max = max(c(...)),
                              min = min(c(...)),
                              avg = mean(c(...)))))

      a     b     c d         e   max   min   avg
  <int> <int> <int> <chr> <int> <int> <int> <dbl>
1     1     6    11 a         1    11     1  4.75
2     2     7    12 b         2    12     2  5.75
3     3     8    13 c         3    13     3  6.75
4     4     9    14 d         4    14     4  7.75
5     5    10    15 e         5    15     5  8.75

或者加上tidyr：

df %>% 
 mutate(res = pmap(across(where(is.numeric)),
                   ~ list(max = max(c(...)),
                          min = min(c(...)),
                          avg = mean(c(...))))) %>%
 unnest_wider(res)

回答

编辑：

最好的出路

df %>%
  rowwise() %>% 
  mutate(min = min(c_across(a:e & where(is.numeric)), na.rm = TRUE),
         max = max(c_across(a:e & where(is.numeric)), na.rm = TRUE), 
         avg = mean(c_across(a:e & where(is.numeric)), na.rm = TRUE)
  )

# A tibble: 5 x 8
# Rowwise: 
      a     b     c d         e   min   max   avg
  <int> <int> <int> <chr> <int> <int> <int> <dbl>
1     1     6    11 a         1     1    11  4.75
2     2     7    12 b         2     2    12  5.75
3     3     8    13 c         3     3    13  6.75
4     4     9    14 d         4     4    14  7.75
5     5    10    15 e         5     5    15  8.75

较早的回答
您this will work甚至无法正常工作，如果您更改输出顺序，请参阅

df %>% 
  select(-d) %>% 
  rowwise() %>% 
  mutate(across(a:e), 
         min = min(unlist(cur_data()), na.rm = TRUE),
         max = max(unlist(cur_data()), na.rm = TRUE), 
         avg = mean(unlist(cur_data()), na.rm = TRUE)
  )

# A tibble: 5 x 7
# Rowwise: 
      a     b     c     e   min   max   avg
  <int> <int> <int> <int> <int> <int> <dbl>
1     1     6    11     1     1    11  5.17
2     2     7    12     2     2    12  6.17
3     3     8    13     3     3    13  7.17
4     4     9    14     4     4    14  8.17
5     5    10    15     5     5    15  9.17

因此，建议这样做 -

df %>% 
  select(-d) %>% 
  rowwise() %>% 
  mutate(min = min(c_across(a:e), na.rm = TRUE),
         max = max(c_across(a:e), na.rm = TRUE), 
         avg = mean(c_across(a:e), na.rm = TRUE)
  )

# A tibble: 5 x 7
# Rowwise: 
      a     b     c     e   min   max   avg
  <int> <int> <int> <int> <int> <int> <dbl>
1     1     6    11     1     1    11  4.75
2     2     7    12     2     2    12  5.75
3     3     8    13     3     3    13  6.75
4     4     9    14     4     4    14  7.75
5     5    10    15     5     5    15  8.75

另一种选择是

cols <- c('a', 'b', 'c', 'e')
df %>%
  rowwise() %>% 
  mutate(min = min(c_across(cols), na.rm = TRUE),
         max = max(c_across(cols), na.rm = TRUE), 
         avg = mean(c_across(cols), na.rm = TRUE)
  )

# A tibble: 5 x 8
# Rowwise: 
      a     b     c d         e   min   max   avg
  <int> <int> <int> <chr> <int> <int> <int> <dbl>
1     1     6    11 a         1     1    11  4.75
2     2     7    12 b         2     2    12  5.75
3     3     8    13 c         3     3    13  6.75
4     4     9    14 d         4     4    14  7.75
5     5    10    15 e         5     5    15  8.75

在这些情况下，即使 @Sinh 建议的 group_by 方法也无法正常工作。

以上是组合：rowwise()、mutate()、cross()，用于多种功能的全部内容。

THE END

二维码

将Pandas数据框中的值合并为字符串

< <上一篇

获得每组“摘要”输出的整洁方法？

下一篇>>

搜索内容

组合：rowwise()、mutate()、cross()，用于多种功能

回答

回答

目录

目录

推荐文章

最新文章