ptype parameter of unnest() does not work #1594

Closed
kilojarek opened this issue Apr 3, 2025 · 4 comments

@kilojarek

kilojarek commented Apr 3, 2025

Hello,

I've encountered a situation where I have to enforce a type on specific columns during unnesting, and the ptype parameter does not seem to work:

library(tidyr)
df <- tibble(
  id = 1:3,
  test = list(1:2, "a", c(TRUE, FALSE)))
df %>%
  unnest(test, ptype = list(test = list()))
#> Error in `list_unchop()`:
#> ! Can't convert `x[[1]]` <integer> to <list>.

Created on 2025-04-03 with reprex v2.1.1

The full error message is:

#Error in `list_unchop()`:
#! Can't convert `x[[1]]` <integer> to <list>.
#Backtrace:
 #1. df %>% unnest(test, ptype = list())
 #3. tidyr:::unnest.data.frame(., test, ptype = list())
 #4. tidyr::unchop(...)
 #5. tidyr:::df_unchop(...)
 #6. vctrs::list_unchop(col, ptype = col_ptype)

Is this a bug? The error persists with the variants ptype = list(test = as.character()) and ptype = list(test = as.numeric()).

@DavisVaughan
Member

DavisVaughan commented Apr 4, 2025

It looks like it is working to me. You're asking it to convert each element of the original test to <list> before unnesting, but there is no vctrs conversion from 1:2 (i.e. <integer>) to <list>.
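As a minimal sketch of the point above, the snippet below contrasts a prototype that vctrs can cast to with one it cannot. The data frame `df_int` and the names `out` and `cast_error` are hypothetical and only serve the illustration; the sketch assumes current tidyr and vctrs releases.

```r
library(tidyr)

# Hypothetical data for illustration: every element of `x` is an integer vector.
df_int <- tibble(id = 1:2, x = list(1:2, 3L))

# integer -> double is a cast that vctrs supports, so `ptype` takes effect here.
out <- unnest(df_int, x, ptype = list(x = double()))
typeof(out$x)

# integer -> list is not a supported cast, so the conversion fails.
# The error is captured as a string so the script keeps running.
cast_error <- tryCatch(
  vctrs::vec_cast(1:2, to = list()),
  error = function(e) conditionMessage(e)
)
cast_error
```

In other words, ptype seems to help only when each element can already be cast to the requested prototype.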

What type are you trying to preserve? If you can make a tibble of your expected outcome, that would be helpful.


Were you going for something like this?

library(tidyr)

df <- tibble(
  id = 1:3,
  test = list(1:2, "a", c(TRUE, FALSE)))

df %>%
  dplyr::mutate(test = purrr::map(test, as.list)) %>%
  unchop(test)
#> # A tibble: 5 × 2
#>      id test     
#>   <int> <list>   
#> 1     1 <int [1]>
#> 2     1 <int [1]>
#> 3     2 <chr [1]>
#> 4     3 <lgl [1]>
#> 5     3 <lgl [1]>

Created on 2025-04-04 with reprex v2.1.1

@kilojarek
Author

kilojarek commented Apr 4, 2025

Your solution is close to how I worked around this behaviour of ptype. My actual case is a bit more complicated, as I have tibbles within tibbles, but here is what is happening with my data (the data types now match my situation):

library(tidyr)
library(dplyr)

nested_df1 <- tibble(
  nested_id = 1:3,
  test = list(c("a", "b", "c"), "f", c("d", "e")))

nested_df2 <- tibble(
  nested_id = 4,
  test = "f")

main_df <- tibble(
    id = 1:2,
    nested_dfs = list(nested_df1, nested_df2))

main_df %>% pull(nested_dfs)
#> [[1]]
#> # A tibble: 3 × 2
#>   nested_id test     
#>       <int> <list>   
#> 1         1 <chr [3]>
#> 2         2 <chr [1]>
#> 3         3 <chr [2]>
#> 
#> [[2]]
#> # A tibble: 1 × 2
#>   nested_id test 
#>       <dbl> <chr>
#> 1         4 f

main_df %>% unnest(nested_dfs, ptype = list(test = list()))
#> Error in `list_unchop()`:
#> ! Can't combine `x[[1]]$test` <list> and `x[[2]]$test` <character>.

What I expected to happen is for unnest(), with ptype set to list, to turn column test in nested_df2 into a list-column.

Created on 2025-04-04 with reprex v2.1.1

@DavisVaughan
Member

I imagine what is confusing you is the fact that ptype applies after you have combined all of the inner data frames together, so that means that the inner data frames themselves have to be compatible first.

For you, nested_df1$test and nested_df2$test are not compatible (for the same reason as above: you can't combine a list and a character vector, because they have no common type).

So you'll have to manually convert to compatible types first, then combine:

library(tidyr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

nested_df1 <- tibble(
  nested_id = 1:3,
  test = list(c("a", "b", "c"), "f", c("d", "e"))
)

nested_df2 <- tibble(
  nested_id = 4,
  test = "f"
)

main_df <- tibble(
  id = 1:2,
  nested_dfs = list(nested_df1, nested_df2)
)

main_df %>%
  mutate(
    nested_dfs = purrr::map(nested_dfs, function(df) {
      if (!is.list(df$test)) {
        df$test <- as.list(df$test)
      }
      df
    })
  ) %>%
  unnest(nested_dfs)
#> # A tibble: 4 × 3
#>      id nested_id test     
#>   <int>     <dbl> <list>   
#> 1     1         1 <chr [3]>
#> 2     1         2 <chr [1]>
#> 3     1         3 <chr [2]>
#> 4     2         4 <chr [1]>

Created on 2025-04-04 with reprex v2.1.1
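The compatibility rule described in this comment can be checked directly with vctrs. The sketch below is illustrative only: the one-row data frames `a` and `b` are hypothetical stand-ins for nested_df1 and nested_df2, and `common_type_error` and `combined` are names introduced here.

```r
library(vctrs)

# integer and double share a common type, so such columns can be combined.
# This is why nested_id comes out as <dbl> in the result above.
vec_ptype2(integer(), double())

# A list and a character vector have no common type, so combining them errors.
common_type_error <- tryCatch(
  vec_ptype2(list(), character()),
  error = function(e) conditionMessage(e)
)
common_type_error

# Hypothetical one-row stand-ins for the nested data frames in this thread.
a <- data.frame(nested_id = 1L)
a$test <- list(c("a", "b"))
b <- data.frame(nested_id = 4, test = "f", stringsAsFactors = FALSE)

# Wrapping the character column in a list makes the two frames compatible.
b$test <- as.list(b$test)
combined <- vec_rbind(a, b)
combined
```

Once both `test` columns are lists, the row-bind succeeds, which mirrors the manual conversion done before unnest() above.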

@kilojarek
Author

Ah, yes, that makes sense. If ptype only applies after the unnesting has been done, then yes, it won't work. So it's not a bug then :-)

Thank you.
