JRomanowska_automagical

background-image: url("https://images.unsplash.com/photo-1534447677768-be436bb09401?ixlib=rb-4.0.3&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=1494&q=80")
background-size: cover

<p style="font-weight: 900; font-family: Georgia; font-size: 3rem; color: white; text-shadow: 2px 2px black;">
Auto-magical tables in R
</p>

<p style="font-weight: 900; font-family: Georgia; font-size: 2.5rem; color: white; text-shadow: 2px 2px black;">
Julia Romanowska
</p>

<p style="font-weight: 900; font-family: Georgia; font-size: 1.5rem; color: white; text-shadow: 2px 2px black;">
January 28, 2023
</p>

<p style="font-size: 12pt; font-weight: bold; right: 10px; bottom: 20px; position: absolute; ">
Photo by <a href="https://unsplash.com/@jplenio?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Johannes Plenio</a> on <a href="https://unsplash.com/photos/DKix6Un55mw?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a>
</p>

???

---

## Overview

### Get the data

### Design the table

### Create a table

### Hands-on

### Extras

---

## Get the data

- from analyses in R:

```r
analysis_results <- glm(
  outcome ~ exposure + covariate,
  data = input_data_hospitals
)
```

- from analyses elsewhere:

```r
library(haven) # reading from SAS, SPSS, etc.
library(readr) # reading from text files (.csv, .txt, etc.)
```
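
A minimal sketch of the second route, assuming the results were exported to a text file. The file and its values are made-up placeholders written to a temporary location so the example runs on its own. The `haven` calls in the comments use hypothetical file names.

```r
library(readr)

# Placeholder for a file exported from other software (values are made up)
results_file <- tempfile(fileext = ".csv")
write_csv(
  data.frame(
    term = c("exposure", "covariate"),
    estimate = c(0.42, -0.10),
    p_value = c(0.01, 0.35)
  ),
  results_file
)

# Read the exported results back into R
external_results <- read_csv(results_file, show_col_types = FALSE)
external_results

# For SAS or SPSS files, {haven} offers analogous readers, e.g.:
# haven::read_sas("results.sas7bdat")  # hypothetical file name
# haven::read_sav("results.sav")       # hypothetical file name
```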

---

## Magical tables

- [{finalfit}](https://finalfit.org/index.html) + [knitr::kable()](https://bookdown.org/yihui/rmarkdown-cookbook/kable.html)

- [{parameters}](https://easystats.github.io/parameters/) + [knitr::kable()](https://bookdown.org/yihui/rmarkdown-cookbook/kable.html)

???

Some packages create output tables automatically. Most of the time, you
can use these at least as a starting point for your publication. However, you
will often need to adjust the default output beyond what the function options
allow.
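
As a rough illustration of such an adjustment (not taken from the package documentation), the sketch below formats the p-values by hand before passing the coefficient table to `knitr::kable()`. The model and the column labels are chosen only for this example.

```r
library(knitr)

# A simple model on a built-in dataset, used only for illustration
model <- lm(Sepal.Length ~ Petal.Length, data = iris)
coefs <- as.data.frame(coef(summary(model)))

# Manual step: rounding alone would print tiny p-values as 0,
# so format them as text first
coefs[["Pr(>|t|)"]] <- format.pval(coefs[["Pr(>|t|)"]], digits = 2, eps = 0.001)

# Adjust what kable() does offer: rounding, column labels, alignment
adjusted <- kable(
  coefs,
  digits = 2,
  col.names = c("Estimate", "SE", "t", "p"),
  align = "r"
)
adjusted
```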

---

## Magical tables - {finalfit}

```r
library(finalfit)
library(dplyr)

data(colon_s)
colon_s %>% glimpse()
```

```
## Rows: 929
## Columns: 32
## $ id              <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,…
## $ rx              <fct> Lev+5FU, Lev+5FU, Obs, Lev+5FU, Obs, Lev+5FU, Lev, Obs…
## $ sex             <dbl> 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, …
## $ age             <dbl> 43, 63, 71, 66, 69, 57, 77, 54, 46, 68, 47, 52, 64, 68…
## $ obstruct        <dbl> NA, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1,…
## $ perfor          <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ adhere          <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, …
## $ nodes           <dbl> 5, 1, 7, 6, 22, 9, 5, 1, 2, 1, 1, 2, 1, 3, 4, 1, 6, 1,…
## $ status          <dbl> 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, …
## $ differ          <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 2, 2, 2, 2, 2, …
## $ extent          <dbl> 3, 3, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, …
## $ surg            <dbl> 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, …
## $ node4           <dbl> 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, …
## $ time            <dbl> 1521, 3087, 963, 293, 659, 1767, 420, 3192, 3173, 3308…
## $ sex.factor      <fct> Male, Male, Female, Female, Male, Female, Male, Male, …
## $ rx.factor       <fct> Lev+5FU, Lev+5FU, Obs, Lev+5FU, Obs, Lev+5FU, Lev, Obs…
## $ obstruct.factor <fct> NA, No, No, Yes, No, No, No, No, No, No, No, No, No, Y…
## $ perfor.factor   <fct> No, No, No, No, No, No, No, No, No, No, No, No, No, No…
## $ adhere.factor   <fct> No, No, Yes, No, No, No, No, No, Yes, No, Yes, No, No,…
## $ differ.factor   <fct> Moderate, Moderate, Moderate, Moderate, Moderate, Mode…
## $ extent.factor   <fct> Serosa, Serosa, Muscle, Serosa, Serosa, Serosa, Serosa…
## $ surg.factor     <fct> Short, Short, Short, Long, Long, Short, Long, Short, S…
## $ node4.factor    <fct> Yes, No, Yes, Yes, Yes, Yes, Yes, No, No, No, No, No, …
## $ status.factor   <fct> Died, Alive, Died, Died, Died, Died, Died, Alive, Aliv…
## $ age.factor      <fct> 40-59 years, 60+ years, 60+ years, 60+ years, 60+ year…
## $ loccomp         <dbl> NA, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1,…
## $ loccomp.factor  <fct> NA, No, Yes, Yes, No, No, No, No, Yes, No, Yes, No, No…
## $ time.years      <dbl> 4.1671233, 8.4575342, 2.6383562, 0.8027397, 1.8054795,…
## $ mort_5yr        <fct> Died, Alive, Died, Died, Died, Died, Died, Alive, Aliv…
## $ age.10          <dbl> 4.3, 6.3, 7.1, 6.6, 6.9, 5.7, 7.7, 5.4, 4.6, 6.8, 4.7,…
## $ mort_5yr.num    <dbl> 2, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, …
## $ hospital        <fct> hospital_5, hospital_3, hospital_5, hospital_4, hospit…
```

???

These examples can be found on the official documentation webpage.
The data here is on colon cancer survival after surgery.

---

.panel1-table1_finalfit_example-user[
```r
*explanatory <- c(
* "age",
* "age.factor",
* "sex.factor",
* "obstruct.factor"
*) 
*dependent <- "perfor.factor"
```
]
 
.panel2-table1_finalfit_example-user[

]

---
count: false

.panel1-table1_finalfit_example-user[
```r
explanatory <- c(
  "age",
  "age.factor",
  "sex.factor",
  "obstruct.factor"
)
dependent <- "perfor.factor"
*table1 <- summary_factorlist(
* colon_s,
* dependent,
* explanatory,
* p = TRUE,
* add_dependent_label = TRUE
*) 
```
]
 
.panel2-table1_finalfit_example-user[

]

---
count: false

.panel1-table1_finalfit_example-user[
```r
explanatory <- c(
  "age",
  "age.factor",
  "sex.factor",
  "obstruct.factor"
)
dependent <- "perfor.factor"
table1 <- summary_factorlist(
  colon_s,
  dependent,
  explanatory,
  p = TRUE,
  add_dependent_label = TRUE
)
*knitr::kable(
* table1,
* row.names = FALSE,
* align = c("l", "l", "r", "r", "r")
*) 
```
]
 
.panel2-table1_finalfit_example-user[

|Dependent: Perforation |            |          No|         Yes|     p|
|:----------------------|:-----------|-----------:|-----------:|-----:|
|Age (years)            |Mean (SD)   | 59.8 (11.9)| 58.4 (13.3)| 0.542|
|Age                    |<40 years   |    68 (7.5)|     2 (7.4)| 1.000|
|                       |40-59 years |  334 (37.0)|   10 (37.0)|      |
|                       |60+ years   |  500 (55.4)|   15 (55.6)|      |
|Sex                    |Female      |  432 (47.9)|   13 (48.1)| 1.000|
|                       |Male        |  470 (52.1)|   14 (51.9)|      |
|Obstruction            |No          |  715 (81.2)|   17 (63.0)| 0.035|
|                       |Yes         |  166 (18.8)|   10 (37.0)|      |
]

---

.panel1-table2_finalfit_example-user[
```r
*explanatory <- c(
* "age.factor",
* "sex.factor",
* "obstruct.factor",
* "perfor.factor"
*) 
*dependent <- "mort_5yr"
```
]
 
.panel2-table2_finalfit_example-user[

]

---
count: false

.panel1-table2_finalfit_example-user[
```r
explanatory <- c(
  "age.factor",
  "sex.factor",
  "obstruct.factor",
  "perfor.factor"
)
dependent <- "mort_5yr"
*table_res_log_regression <- finalfit(
* colon_s,
* dependent,
* explanatory
*) 
```
]
 
.panel2-table2_finalfit_example-user[

]

---
count: false

.panel1-table2_finalfit_example-user[
```r
explanatory <- c(
  "age.factor",
  "sex.factor",
  "obstruct.factor",
  "perfor.factor"
)
dependent <- "mort_5yr"
table_res_log_regression <- finalfit(
  colon_s,
  dependent,
  explanatory
)
*knitr::kable(
* table_res_log_regression,
* row.names = FALSE,
* align = c("l", "l", "r", "r", "r", "r")
*) 
```
]
 
.panel2-table2_finalfit_example-user[

|Dependent: Mortality 5 year |            |      Alive|       Died|          OR (univariable)|        OR (multivariable)|
|:---------------------------|:-----------|----------:|----------:|-------------------------:|-------------------------:|
|Age                         |<40 years   |  31 (46.3)|  36 (53.7)|                         -|                         -|
|                            |40-59 years | 208 (61.4)| 131 (38.6)| 0.54 (0.32-0.92, p=0.023)| 0.57 (0.34-0.98, p=0.041)|
|                            |60+ years   | 272 (53.4)| 237 (46.6)| 0.75 (0.45-1.25, p=0.270)| 0.81 (0.48-1.36, p=0.426)|
|Sex                         |Female      | 243 (55.6)| 194 (44.4)|                         -|                         -|
|                            |Male        | 268 (56.1)| 210 (43.9)| 0.98 (0.76-1.27, p=0.889)| 0.98 (0.75-1.28, p=0.902)|
|Obstruction                 |No          | 408 (56.7)| 312 (43.3)|                         -|                         -|
|                            |Yes         |  89 (51.1)|  85 (48.9)| 1.25 (0.90-1.74, p=0.189)| 1.25 (0.90-1.76, p=0.186)|
|Perforation                 |No          | 497 (56.0)| 391 (44.0)|                         -|                         -|
|                            |Yes         |  14 (51.9)|  13 (48.1)| 1.18 (0.54-2.55, p=0.672)| 1.12 (0.51-2.44, p=0.770)|
]

---

## Magical tables - {parameters}

```r
library(parameters)

data(iris)
as_tibble(iris)
```

```
## # A tibble: 150 × 5
##    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
##           <dbl>       <dbl>        <dbl>       <dbl> <fct>  
##  1          5.1         3.5          1.4         0.2 setosa 
##  2          4.9         3            1.4         0.2 setosa 
##  3          4.7         3.2          1.3         0.2 setosa 
##  4          4.6         3.1          1.5         0.2 setosa 
##  5          5           3.6          1.4         0.2 setosa 
##  6          5.4         3.9          1.7         0.4 setosa 
##  7          4.6         3.4          1.4         0.3 setosa 
##  8          5           3.4          1.5         0.2 setosa 
##  9          4.4         2.9          1.4         0.2 setosa 
## 10          4.9         3.1          1.5         0.1 setosa 
## # … with 140 more rows
```

---

.panel1-table2_parameters-user[
```r
*model <- lm(
* Sepal.Length ~ Species * Petal.Length,
* data = iris
*) 
```
]
 
.panel2-table2_parameters-user[

]

---
count: false

.panel1-table2_parameters-user[
```r
model <- lm(
  Sepal.Length ~ Species * Petal.Length,
  data = iris
)
*model_parameters(model)
```
]
 
.panel2-table2_parameters-user[

```
## Parameter                           | Coefficient |   SE |         95% CI | t(144) |      p
## -------------------------------------------------------------------------------------------
## (Intercept)                         |        4.21 | 0.41 | [ 3.41,  5.02] |  10.34 | < .001
## Species [versicolor]                |       -1.81 | 0.60 | [-2.99, -0.62] |  -3.02 | 0.003 
## Species [virginica]                 |       -3.15 | 0.63 | [-4.41, -1.90] |  -4.97 | < .001
## Petal Length                        |        0.54 | 0.28 | [ 0.00,  1.09] |   1.96 | 0.052 
## Species [versicolor] × Petal Length |        0.29 | 0.30 | [-0.30,  0.87] |   0.97 | 0.334 
## Species [virginica] × Petal Length  |        0.45 | 0.29 | [-0.12,  1.03] |   1.56 | 0.120
```
]

---
count: false

.panel1-table2_parameters-user[
```r
model <- lm(
  Sepal.Length ~ Species * Petal.Length,
  data = iris
)
model_parameters(model)

*mp <- model_parameters(model)
*print_md(mp)
```
]
 
.panel2-table2_parameters-user[

|Parameter                           | Coefficient |   SE |            95% CI | t(144) |      p |
|:-----------------------------------|:-----------:|:----:|:-----------------:|:------:|:------:|
|(Intercept)                         |        4.21 | 0.41 |      (3.41, 5.02) |  10.34 | < .001 |
|Species (versicolor)                |       -1.81 | 0.60 |    (-2.99, -0.62) |  -3.02 | 0.003  |
|Species (virginica)                 |       -3.15 | 0.63 |    (-4.41, -1.90) |  -4.97 | < .001 |
|Petal Length                        |        0.54 | 0.28 | (-4.76e-03, 1.09) |   1.96 | 0.052  |
|Species (versicolor) × Petal Length |        0.29 | 0.30 |     (-0.30, 0.87) |   0.97 | 0.334  |
|Species (virginica) × Petal Length  |        0.45 | 0.29 |     (-0.12, 1.03) |   1.56 | 0.120  |
]

---

## Design the table

- do I have all the data?

- get extra info from the data or the model object?

- what type of table?

- target readers?

???

Sometimes the output of the analysis gives one table, but we would also like
to include some extra information. In that case, we first need to merge the
tables manually.

Or vice versa: sometimes the output includes too much detail and we want
to show only a small part of it. Then we should filter first, so that the later
steps operate on the final set of data and stay faster and simpler.

- Table 1 (characteristics of the study population) will look different from
any table showing results.
- Do we need some special grouping?
- Do we need colors?
- Formatting of numbers?
- Maybe we want to merge some cells into one?
- ...

**All this might influence the choice of the tool!**
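
The merge-then-filter idea from these notes could look roughly like the base R sketch below. It reuses the `iris` model from the {parameters} slides. The helper names (`est`, `ci`, `res`, `res_small`) and the choice to keep only the interaction terms are assumptions made for illustration.

```r
model <- lm(Sepal.Length ~ Species * Petal.Length, data = iris)

# First table: estimates and p-values from the model summary
est <- as.data.frame(coef(summary(model)))
est$term <- rownames(est)

# Second table: extra information (confidence intervals) from the model object
ci <- as.data.frame(confint(model))
names(ci) <- c("ci_low", "ci_high")
ci$term <- rownames(ci)

# Merge the two tables manually on the shared term name
res <- merge(est, ci, by = "term")

# Filter before formatting: keep only the interaction terms and
# the columns we intend to show
res_small <- res[
  grepl(":", res$term),
  c("term", "Estimate", "ci_low", "ci_high", "Pr(>|t|)")
]
res_small
```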

---

## Create a table

**Which package to choose?**

- [{gt}](https://gt.rstudio.com/)    
adapt the table for publication

- [{gtsummary}](https://www.danieldsjoberg.com/gtsummary/index.html)    
    easily create 'Table 1' or tables with statistical results!
    
    - [{gtExtras}](https://jthomasmock.github.io/gtExtras/index.html)    
    extra functionality for {gt}, e.g., plots inside tables!

---

## Create a table

**Which package to choose?**

- [{flextable}](https://davidgohel.github.io/flextable/index.html)    
similar to {gt}, but has some other functionality, e.g., merging cells automatically

- [{DT}](https://rstudio.github.io/DT/)    
interactive tables in .html!

- \* [{reactable}](https://glin.github.io/reactable/index.html)    
interactive table - more advanced

- \* [{forestplot}](https://cran.r-project.org/web/packages/forestplot/)    
narrow usage and not so flexible

- \* [{formattable}](https://renkun-ken.github.io/formattable/)    
very similar functionality to {gt} and {flextable}

- \* [{modelsummary}](https://vincentarelbundock.github.io/modelsummary/)    
summarising results of statistical analyses and datasets
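
To give an idea of what "merging cells automatically" in {flextable} means, here is a small sketch. The subset of `iris` is chosen only to get repeated values in one column.

```r
library(flextable)

# Two rows per species, so the Species column contains repeated values
demo_data <- iris[
  c(1, 2, 51, 52, 101, 102),
  c("Species", "Sepal.Length", "Petal.Length")
]

ft <- flextable(demo_data)
# Merge vertically adjacent cells that hold the same value
ft <- merge_v(ft, j = "Species")
# Adjust column widths to the content
ft <- autofit(ft)
ft
```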

???

There are other packages too; some are listed here, but I will not cover them.

---

## Hands-on examples

<br>

### Descriptive table with {gtsummary} and {gt}

### Table with results, using {flextable}

### Interactive table with {DT}