Vectorised Pattern Matching with Keyboard Shortcut

Convenient wrapper around grepl() to match a pattern: x %like% pattern. It always returns a logical vector and is always case-insensitive (use x %like_case% pattern for case-sensitive matching). Also, pattern can be as long as x, in which case the items at each index of both vectors are compared, or either x or pattern can have length 1, in which case it is matched against every item of the other.
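The way x and pattern are paired up can be sketched in base R. This is only an illustration of the behaviour, not the package's implementation, and like_each() is a hypothetical helper name introduced here for the example.

```r
# Hypothetical helper (not part of the package): pairs up x and pattern,
# recycling whichever of the two has length 1.
like_each <- function(x, pattern) {
  stopifnot(
    length(x) == length(pattern) || length(x) == 1 || length(pattern) == 1
  )
  mapply(
    function(xi, p) grepl(p, xi, ignore.case = TRUE, perl = TRUE),
    as.character(x), as.character(pattern),
    USE.NAMES = FALSE
  )
}

x <- c("Test case", "Something different", "Yet another thing")
like_each(x, c("case", "diff", "yet"))     # x[i] is matched against pattern[i]
like_each(x[1], c("case", "diff", "yet"))  # one string against every pattern
like_each(x, "case")                       # every string against one pattern
```

The first call compares index by index, while the other two recycle the length-1 argument over the longer one.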

Usage

like(x, pattern, ignore.case = TRUE)

x %like% pattern

x %unlike% pattern

x %like_case% pattern

x %unlike_case% pattern

Source

Idea from the like function from the data.table package, although altered as explained in Details.

Arguments

x: a character vector where matches are sought, or an object which can be coerced by as.character() to a character vector.
pattern: a character vector containing regular expressions (or a character string for fixed = TRUE) to be matched in the given character vector. Coerced by as.character() to a character string if possible.
ignore.case: if FALSE, the pattern matching is case sensitive and if TRUE, case is ignored during matching.

Value

A logical vector

Details

These like() and %like%/%unlike% functions:

Are case-insensitive (use %like_case%/%unlike_case% for case-sensitive matching)
Support multiple patterns
Check whether pattern is a valid regular expression and set fixed = TRUE if it is not, to greatly improve speed (vectorised over pattern)
Always use Perl-compatible regular expressions unless fixed = TRUE, to greatly improve speed
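
The fallback from a regular expression to a fixed string can be sketched in base R along these lines. This is an assumption about how such a check could be written, not the package's actual code; is_valid_regex() and match_one() are hypothetical names used only for this example.

```r
# Hypothetical helper: TRUE if `pattern` compiles as a Perl-compatible
# regular expression, FALSE if grepl() raises a warning or an error.
is_valid_regex <- function(pattern) {
  tryCatch(
    {
      grepl(pattern, "", perl = TRUE)
      TRUE
    },
    warning = function(w) FALSE,
    error = function(e) FALSE
  )
}

# Hypothetical helper: match a single pattern against x, treating the
# pattern as a literal string when it is not a valid regular expression.
match_one <- function(x, pattern, ignore.case = TRUE) {
  if (is_valid_regex(pattern)) {
    grepl(pattern, x, ignore.case = ignore.case, perl = TRUE)
  } else if (ignore.case) {
    # grepl() ignores ignore.case when fixed = TRUE, so lower-case both sides
    grepl(tolower(pattern), tolower(x), fixed = TRUE)
  } else {
    grepl(pattern, x, fixed = TRUE)
  }
}

match_one("Enterococcus", "^ent")   # valid regex, matched with perl = TRUE
match_one("Dose (mg", "(MG")        # invalid regex, matched as a fixed string
```

An unbalanced "(" would normally make grepl() fail, so in this sketch the second call only succeeds because the pattern is treated literally.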

Using RStudio? The %like%/%unlike% functions can also be inserted directly into your code from the Addins menu and can be given their own keyboard shortcut, such as Shift+Ctrl+L or Shift+Cmd+L (see menu Tools > Modify Keyboard Shortcuts...). If you keep pressing your shortcut, the inserted text cycles through %like% -> %unlike% -> %like_case% -> %unlike_case%.

Examples

a <- "This is a test"
b <- "TEST"
a %like% b
#> [1] TRUE
b %like% a
#> [1] FALSE

# also supports multiple patterns
a <- c("Test case", "Something different", "Yet another thing")
b <- c(     "case",           "diff",      "yet")
a %like% b
#> [1] TRUE TRUE TRUE
a %unlike% b
#> [1] FALSE FALSE FALSE

a[1] %like% b
#> [1]  TRUE FALSE FALSE
a %like% b[1]
#> [1]  TRUE FALSE FALSE

# get isolates whose names start with 'Ent' or 'ent'
example_isolates[which(mo_name(example_isolates$mo) %like% "^ent"), ]
#> # A tibble: 106 × 49
#>    date       hospi…¹ ward_…² ward_…³ ward_…⁴   age gender patie…⁵ mo           
#>    <date>     <fct>   <lgl>   <lgl>   <lgl>   <dbl> <chr>  <chr>   <mo>         
#>  1 2002-02-21 C       FALSE   TRUE    FALSE      69 M      4FC193  B_ENTRC_FACM 
#>  2 2002-04-08 A       TRUE    TRUE    FALSE      78 M      130252  B_ENTRC_FCLS 
#>  3 2002-06-23 D       FALSE   TRUE    FALSE      82 M      798871  B_ENTRC_FCLS 
#>  4 2002-06-23 D       FALSE   TRUE    FALSE      82 M      798871  B_ENTRC_FCLS 
#>  5 2003-04-20 D       TRUE    TRUE    FALSE      62 M      6BC362  B_ENTRC      
#>  6 2003-04-21 D       TRUE    TRUE    FALSE      62 M      6BC362  B_ENTRC      
#>  7 2003-08-13 B       TRUE    FALSE   FALSE      52 M      F35553  B_ENTRBC_CLOC
#>  8 2003-08-13 B       TRUE    FALSE   FALSE      52 M      F35553  B_ENTRC_FCLS 
#>  9 2003-09-05 B       TRUE    FALSE   FALSE      52 M      F35553  B_ENTRC      
#> 10 2003-09-05 B       TRUE    FALSE   FALSE      52 M      F35553  B_ENTRBC_CLOC
#> # … with 96 more rows, 40 more variables: PEN <rsi>, OXA <rsi>, FLC <rsi>,
#> #   AMX <rsi>, AMC <rsi>, AMP <rsi>, TZP <rsi>, CZO <rsi>, FEP <rsi>,
#> #   CXM <rsi>, FOX <rsi>, CTX <rsi>, CAZ <rsi>, CRO <rsi>, GEN <rsi>,
#> #   TOB <rsi>, AMK <rsi>, KAN <rsi>, TMP <rsi>, SXT <rsi>, NIT <rsi>,
#> #   FOS <rsi>, LNZ <rsi>, CIP <rsi>, MFX <rsi>, VAN <rsi>, TEC <rsi>,
#> #   TCY <rsi>, TGC <rsi>, DOX <rsi>, ERY <rsi>, CLI <rsi>, AZM <rsi>,
#> #   IPM <rsi>, MEM <rsi>, MTR <rsi>, CHL <rsi>, COL <rsi>, MUP <rsi>, …
# \donttest{
# faster way, since mo_name() is context-aware:
example_isolates[which(mo_name() %like% "^ent"), ]
#> ℹ Using column 'mo' as input for `mo_name()`
#> # A tibble: 106 × 49
#>    date       hospi…¹ ward_…² ward_…³ ward_…⁴   age gender patie…⁵ mo           
#>    <date>     <fct>   <lgl>   <lgl>   <lgl>   <dbl> <chr>  <chr>   <mo>         
#>  1 2002-02-21 C       FALSE   TRUE    FALSE      69 M      4FC193  B_ENTRC_FACM 
#>  2 2002-04-08 A       TRUE    TRUE    FALSE      78 M      130252  B_ENTRC_FCLS 
#>  3 2002-06-23 D       FALSE   TRUE    FALSE      82 M      798871  B_ENTRC_FCLS 
#>  4 2002-06-23 D       FALSE   TRUE    FALSE      82 M      798871  B_ENTRC_FCLS 
#>  5 2003-04-20 D       TRUE    TRUE    FALSE      62 M      6BC362  B_ENTRC      
#>  6 2003-04-21 D       TRUE    TRUE    FALSE      62 M      6BC362  B_ENTRC      
#>  7 2003-08-13 B       TRUE    FALSE   FALSE      52 M      F35553  B_ENTRBC_CLOC
#>  8 2003-08-13 B       TRUE    FALSE   FALSE      52 M      F35553  B_ENTRC_FCLS 
#>  9 2003-09-05 B       TRUE    FALSE   FALSE      52 M      F35553  B_ENTRC      
#> 10 2003-09-05 B       TRUE    FALSE   FALSE      52 M      F35553  B_ENTRBC_CLOC
#> # … with 96 more rows, 40 more variables: PEN <rsi>, OXA <rsi>, FLC <rsi>,
#> #   AMX <rsi>, AMC <rsi>, AMP <rsi>, TZP <rsi>, CZO <rsi>, FEP <rsi>,
#> #   CXM <rsi>, FOX <rsi>, CTX <rsi>, CAZ <rsi>, CRO <rsi>, GEN <rsi>,
#> #   TOB <rsi>, AMK <rsi>, KAN <rsi>, TMP <rsi>, SXT <rsi>, NIT <rsi>,
#> #   FOS <rsi>, LNZ <rsi>, CIP <rsi>, MFX <rsi>, VAN <rsi>, TEC <rsi>,
#> #   TCY <rsi>, TGC <rsi>, DOX <rsi>, ERY <rsi>, CLI <rsi>, AZM <rsi>,
#> #   IPM <rsi>, MEM <rsi>, MTR <rsi>, CHL <rsi>, COL <rsi>, MUP <rsi>, …

if (require("dplyr")) {
  example_isolates %>%
    filter(mo_name() %like% "^ent")
}
#> ℹ Using column 'mo' as input for `mo_name()`
#> # A tibble: 106 × 49
#>    date       hospi…¹ ward_…² ward_…³ ward_…⁴   age gender patie…⁵ mo           
#>    <date>     <fct>   <lgl>   <lgl>   <lgl>   <dbl> <chr>  <chr>   <mo>         
#>  1 2002-02-21 C       FALSE   TRUE    FALSE      69 M      4FC193  B_ENTRC_FACM 
#>  2 2002-04-08 A       TRUE    TRUE    FALSE      78 M      130252  B_ENTRC_FCLS 
#>  3 2002-06-23 D       FALSE   TRUE    FALSE      82 M      798871  B_ENTRC_FCLS 
#>  4 2002-06-23 D       FALSE   TRUE    FALSE      82 M      798871  B_ENTRC_FCLS 
#>  5 2003-04-20 D       TRUE    TRUE    FALSE      62 M      6BC362  B_ENTRC      
#>  6 2003-04-21 D       TRUE    TRUE    FALSE      62 M      6BC362  B_ENTRC      
#>  7 2003-08-13 B       TRUE    FALSE   FALSE      52 M      F35553  B_ENTRBC_CLOC
#>  8 2003-08-13 B       TRUE    FALSE   FALSE      52 M      F35553  B_ENTRC_FCLS 
#>  9 2003-09-05 B       TRUE    FALSE   FALSE      52 M      F35553  B_ENTRC      
#> 10 2003-09-05 B       TRUE    FALSE   FALSE      52 M      F35553  B_ENTRBC_CLOC
#> # … with 96 more rows, 40 more variables: PEN <rsi>, OXA <rsi>, FLC <rsi>,
#> #   AMX <rsi>, AMC <rsi>, AMP <rsi>, TZP <rsi>, CZO <rsi>, FEP <rsi>,
#> #   CXM <rsi>, FOX <rsi>, CTX <rsi>, CAZ <rsi>, CRO <rsi>, GEN <rsi>,
#> #   TOB <rsi>, AMK <rsi>, KAN <rsi>, TMP <rsi>, SXT <rsi>, NIT <rsi>,
#> #   FOS <rsi>, LNZ <rsi>, CIP <rsi>, MFX <rsi>, VAN <rsi>, TEC <rsi>,
#> #   TCY <rsi>, TGC <rsi>, DOX <rsi>, ERY <rsi>, CLI <rsi>, AZM <rsi>,
#> #   IPM <rsi>, MEM <rsi>, MTR <rsi>, CHL <rsi>, COL <rsi>, MUP <rsi>, …
# }