Creates a data frame from an exported 'WhatsApp' chat log, with one row per message and columns for the date and time the message was sent, the name of the sender, and the body of the message. Intended only as an intermediary function called from within parse_chat.

parse_android(
  chatlog,
  newline_indicator = "\n",
  media_omitted = "<media omitted>",
  media_indicator = "(file attached)",
  sent_location = paste0("location: (?=https:\\/\\/maps\\.google\\.com\\/",
    "\\?q=\\d\\d.\\d{6}\\,\\d\\.\\d{6})"),
  live_location = "^live location shared$",
  datetime_indicator = paste("(?!^)(?=((\\d{2}\\.\\d{2}\\.\\d{2})|(\\d{1,2}",
    "\\/\\d{1,2}\\/\\d{2})),\\s\\d{2}\\:\\d{2}((\\s\\-)|(\\s(?i:(am|pm))\\s\\-)))",
    sep = ""),
  newline_replace = " start_newline ",
  media_replace = " media_omitted ",
  foursquare_loc = "^.*: https://foursquare.com/v/.*$"
)

Arguments

chatlog

'WhatsApp' chat preprocessed by parse_chat

newline_indicator

character string that marks a newline in the chat log. Default is "\n".

media_omitted

character string inserted by 'WhatsApp' instead of file names when not exporting media.

media_indicator

character string for detecting media and file attachments.

sent_location

Regex for detecting automatically generated messages for locations shared via chat.

live_location

Regex for detecting automatically generated messages for live locations shared via chat.

datetime_indicator

Regex for detecting the DateTime indicator at the beginning of each message.

newline_replace

replacement string for a newline character in parsed message. Default is " start_newline ".

media_replace

replacement string for omitted media files. Default is " media_omitted ".

foursquare_loc

Regex for detecting locations sent as Foursquare links.
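
As an illustration of what the default datetime_indicator matches, the following base R sketch finds the zero-width positions where a new message begins and cuts the raw text there. It is not necessarily how parse_android uses the pattern internally; the variable names (chat, hits, starts, ends, messages) are assumptions made for this example.

```r
# Default datetime_indicator, copied from the usage above.
datetime_indicator <- paste("(?!^)(?=((\\d{2}\\.\\d{2}\\.\\d{2})|(\\d{1,2}",
  "\\/\\d{1,2}\\/\\d{2})),\\s\\d{2}\\:\\d{2}((\\s\\-)|(\\s(?i:(am|pm))\\s\\-)))",
  sep = "")

# Sample text taken from the Examples section.
chat <- "29.01.18, 23:33 - Alice: Hi?\n 29.01.18, 23:45 - Bob: Hi\n"

# The pattern is a lookahead, so each match is zero-width and marks the
# position where a timestamp starts; (?!^) skips the very first timestamp.
hits <- gregexpr(datetime_indicator, chat, perl = TRUE)[[1]]

# Build start/end positions for every message (illustrative only).
starts <- c(1, as.integer(hits[hits > 0]))
ends <- c(starts[-1] - 1, nchar(chat))
messages <- trimws(substring(chat, starts, ends))

print(messages)
```

Cutting by position with substring avoids strsplit, whose handling of zero-width matches in base R can split off single characters unexpectedly.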

Value

A data frame containing the timestamp, the name of the sender, and the message body, with one row per message.
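
To make the shape of such a result concrete, here is a minimal base R sketch that turns two already separated messages into a three-column data frame. The column names (DateTime, Sender, Message), the splitting pattern, and the UTC time zone are assumptions chosen for illustration; the actual column names and parsing logic of parse_android may differ.

```r
messages <- c("29.01.18, 23:33 - Alice: Hi?", "29.01.18, 23:45 - Bob: Hi")

# Hypothetical pattern: timestamp, " - ", sender up to the first colon, body.
pattern <- "^(\\d{2}\\.\\d{2}\\.\\d{2}, \\d{2}:\\d{2}) - ([^:]+): (.*)$"
parts <- regmatches(messages, regexec(pattern, messages, perl = TRUE))

# Element 1 of each match is the full string; 2 to 4 are the capture groups.
sketch <- data.frame(
  DateTime = as.POSIXct(vapply(parts, `[`, character(1), 2),
                        format = "%d.%m.%y, %H:%M", tz = "UTC"),
  Sender = vapply(parts, `[`, character(1), 3),
  Message = vapply(parts, `[`, character(1), 4),
  stringsAsFactors = FALSE
)

str(sketch)
```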

Examples

ParsedChat <- parse_android("29.01.18, 23:33 - Alice: Hi?\n 29.01.18, 23:45 - Bob: Hi\n")
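
The newline_replace and media_replace arguments describe plain string substitutions. The sketch below shows the general idea on a single message body using base R only; it is an assumption about the effect of these arguments, not a copy of the package's code, and the variable name body is made up for the example.

```r
# A message body containing a line break and the default media_omitted marker.
body <- "first line\nsecond line <media omitted>"

# fixed = TRUE treats the search strings literally rather than as regex.
body <- gsub("\n", " start_newline ", body, fixed = TRUE)
body <- gsub("<media omitted>", " media_omitted ", body, fixed = TRUE)

print(body)
```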