The following dplyr functions are useful for manipulating data; each is shown with a short example.
select()
Columns can be selected by name
beds_data |> select(org_code, org_name)
# A tibble: 4,558 × 2
org_code org_name
<chr> <chr>
1 R1A Worcestershire Health And Care
2 R1C Solent
3 R1E Staffordshire And Stoke On Trent Partnership
4 R1F Isle Of Wight
5 R1H Barts Health
6 R1J Gloucestershire Care Services
7 RA2 Royal Surrey County Hospital
8 RA3 Weston Area Health
9 RA4 Yeovil District Hospital
10 RA7 University Hospitals Bristol
# ℹ 4,548 more rows
Or by position (including a range written as from:to)
beds_data |> select(3:5)
# A tibble: 4,558 × 3
org_name beds_av occ_av
<chr> <dbl> <dbl>
1 Worcestershire Health And Care 129 117
2 Solent 105 82
3 Staffordshire And Stoke On Trent Partnership NA NA
4 Isle Of Wight 54 42
5 Barts Health NA NA
6 Gloucestershire Care Services NA NA
7 Royal Surrey County Hospital NA NA
8 Weston Area Health NA NA
9 Yeovil District Hospital NA NA
10 University Hospitals Bristol NA NA
# ℹ 4,548 more rows
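The from:to range also works with column names, which tends to be more robust than positions if the column order changes. A minimal sketch, assuming a small made-up tibble (`toy_beds`, hypothetical values) with the same column names and order as beds_data:

```r
library(dplyr)

# Hypothetical stand-in for beds_data: same column names, made-up rows
toy_beds <- tibble(
  date     = as.Date(c("2013-09-01", "2013-09-01")),
  org_code = c("R1A", "R1C"),
  org_name = c("Worcestershire Health And Care", "Solent"),
  beds_av  = c(129, 105),
  occ_av   = c(117, 82)
)

# Columns 3 to 5 by position
by_position <- toy_beds |> select(3:5)

# The same three columns as a range of names
by_name <- toy_beds |> select(org_name:occ_av)

names(by_position)
names(by_name)
```

Both calls return org_name, beds_av and occ_av for this toy data.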
Deselecting
beds_data |> select(-org_code)
# A tibble: 4,558 × 4
date org_name beds_av occ_av
<date> <chr> <dbl> <dbl>
1 2013-09-01 Worcestershire Health And Care 129 117
2 2013-09-01 Solent 105 82
3 2013-09-01 Staffordshire And Stoke On Trent Partnership NA NA
4 2013-09-01 Isle Of Wight 54 42
5 2013-09-01 Barts Health NA NA
6 2013-09-01 Gloucestershire Care Services NA NA
7 2013-09-01 Royal Surrey County Hospital NA NA
8 2013-09-01 Weston Area Health NA NA
9 2013-09-01 Yeovil District Hospital NA NA
10 2013-09-01 University Hospitals Bristol NA NA
# ℹ 4,548 more rows
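Several columns can be dropped in one call by wrapping them in c(). A minimal sketch, assuming a small made-up tibble (`toy_beds`, hypothetical values) in place of beds_data:

```r
library(dplyr)

# Hypothetical stand-in for beds_data: same column names, made-up rows
toy_beds <- tibble(
  date     = as.Date(c("2013-09-01", "2013-09-01")),
  org_code = c("R1A", "R1C"),
  org_name = c("Worcestershire Health And Care", "Solent"),
  beds_av  = c(129, 105),
  occ_av   = c(117, 82)
)

# Drop two columns at once with a minus sign
dropped_minus <- toy_beds |> select(-c(date, org_code))

# The ! operator negates a selection in the same way
dropped_bang <- toy_beds |> select(!c(date, org_code))

names(dropped_minus)
names(dropped_bang)
```

Either form leaves org_name, beds_av and occ_av in this example.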
Select everything()
Move a column to the front, then use everything() to keep all the remaining columns in their existing order
beds_data |> select(org_name, everything())
# A tibble: 4,558 × 5
org_name date org_code beds_av occ_av
<chr> <date> <chr> <dbl> <dbl>
1 Worcestershire Health And Care 2013-09-01 R1A 129 117
2 Solent 2013-09-01 R1C 105 82
3 Staffordshire And Stoke On Trent Partners… 2013-09-01 R1E NA NA
4 Isle Of Wight 2013-09-01 R1F 54 42
5 Barts Health 2013-09-01 R1H NA NA
6 Gloucestershire Care Services 2013-09-01 R1J NA NA
7 Royal Surrey County Hospital 2013-09-01 RA2 NA NA
8 Weston Area Health 2013-09-01 RA3 NA NA
9 Yeovil District Hospital 2013-09-01 RA4 NA NA
10 University Hospitals Bristol 2013-09-01 RA7 NA NA
# ℹ 4,548 more rows
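As an alternative to the select() and everything() pattern, dplyr also provides relocate(), which moves columns to the front by default. This is a different function from the one used above; a minimal sketch comparing the two on a made-up tibble (`toy_beds`, hypothetical values):

```r
library(dplyr)

# Hypothetical stand-in for beds_data: same column names, made-up rows
toy_beds <- tibble(
  date     = as.Date(c("2013-09-01", "2013-09-01")),
  org_code = c("R1A", "R1C"),
  org_name = c("Worcestershire Health And Care", "Solent"),
  beds_av  = c(129, 105),
  occ_av   = c(117, 82)
)

# Pattern used above: name the column, then everything else
with_select <- toy_beds |> select(org_name, everything())

# relocate() moves the named column to the front when no position is given
with_relocate <- toy_beds |> relocate(org_name)

names(with_select)
names(with_relocate)
```

Both give the column order org_name, date, org_code, beds_av, occ_av for this toy data.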
Select starts_with()
Select columns whose names start with the same text
beds_data |> select(starts_with("org"))
# A tibble: 4,558 × 2
org_code org_name
<chr> <chr>
1 R1A Worcestershire Health And Care
2 R1C Solent
3 R1E Staffordshire And Stoke On Trent Partnership
4 R1F Isle Of Wight
5 R1H Barts Health
6 R1J Gloucestershire Care Services
7 RA2 Royal Surrey County Hospital
8 RA3 Weston Area Health
9 RA4 Yeovil District Hospital
10 RA7 University Hospitals Bristol
# ℹ 4,548 more rows
ends_with() works in the same way, matching the end of the column name instead
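A minimal sketch of ends_with(), assuming a small made-up tibble (`toy_beds`, hypothetical values) with the same column names as beds_data:

```r
library(dplyr)

# Hypothetical stand-in for beds_data: same column names, made-up rows
toy_beds <- tibble(
  date     = as.Date(c("2013-09-01", "2013-09-01")),
  org_code = c("R1A", "R1C"),
  org_name = c("Worcestershire Health And Care", "Solent"),
  beds_av  = c(129, 105),
  occ_av   = c(117, 82)
)

# Keep only the columns whose names end in "_av"
av_cols <- toy_beds |> select(ends_with("_av"))

names(av_cols)
```

Here the selection keeps beds_av and occ_av.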
contains()
Matches a literal string anywhere in the column name, so no SQL-style % wildcards are needed
beds_data |> select(contains("s_a"))
# A tibble: 4,558 × 1
beds_av
<dbl>
1 129
2 105
3 NA
4 54
5 NA
6 NA
7 NA
8 NA
9 NA
10 NA
# ℹ 4,548 more rows
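By default contains() ignores case, which can be switched off with the ignore.case argument. A minimal sketch, assuming a small made-up tibble (`toy_beds`, hypothetical values) in place of beds_data:

```r
library(dplyr)

# Hypothetical stand-in for beds_data: same column names, made-up rows
toy_beds <- tibble(
  date     = as.Date(c("2013-09-01", "2013-09-01")),
  org_code = c("R1A", "R1C"),
  org_name = c("Worcestershire Health And Care", "Solent"),
  beds_av  = c(129, 105),
  occ_av   = c(117, 82)
)

# Upper-case search text still matches beds_av because case is ignored
case_insensitive <- toy_beds |> select(contains("S_A"))

# With ignore.case = FALSE nothing matches, so no columns are returned
case_sensitive <- toy_beds |> select(contains("S_A", ignore.case = FALSE))

names(case_insensitive)
ncol(case_sensitive)
```

A helper that matches nothing returns a tibble with zero columns rather than an error.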
Using n() and n_distinct()
beds_data |>
  summarise(
    number = n(),
    # distinct number of org_name
    distinct_number = n_distinct(org_name),
    .by = org_code
  ) |>
  filter(distinct_number > 1) |>
  arrange(desc(distinct_number))
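n() counts the rows in each group, while n_distinct() counts the unique values of a column, so the pipeline above finds organisation codes recorded under more than one name. A minimal sketch on made-up data (`toy_orgs` and its values are hypothetical), assuming dplyr 1.1.0 or later for the .by argument:

```r
library(dplyr)

# Hypothetical data: R1A appears under two different names, R1C under one
toy_orgs <- tibble(
  org_code = c("R1A", "R1A", "R1C", "R1C"),
  org_name = c("Name A", "Name A Trust", "Solent", "Solent")
)

name_check <- toy_orgs |>
  summarise(
    number = n(),                            # rows per org_code
    distinct_number = n_distinct(org_name),  # unique names per org_code
    .by = org_code
  ) |>
  filter(distinct_number > 1) |>
  arrange(desc(distinct_number))

name_check
```

Only R1A remains in this toy result, with two rows and two distinct names; R1C is filtered out because its name is always the same.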