I have a spreadsheet containing ticket data from an IT ticketing system. The field I need to extract data from is a non-indexed text field, so I am not able to search this field in the ticket system itself. I have exported the data into Excel, where the entire Work Notes field is in one cell for each ticket. The output looks something like this:
2021-09-21 17:15:41 John Smith [Work Notes] - Update inside ticket 2021-09-21 16:12:39 Jane Doe [Work Notes] - Another update inside ticket.
What I am trying to do is capture the individual work notes and separate them into their own column so that I can calculate the time between updates and count the number of updates each individual makes. Counting the updates is easy, I used a simple CountIF formula to get the total number of times each individual has made an update in the tickets I have exported. The difficult part is how to split each update into their own cell.
I have cleaned up the data so that the only time brackets( [ ] ) are used is for the [Work Notes] or [Additional Comments] (a customer visible work note entry). I then tried text to comments with the right bracket as the delimiter, but this works for splitting the first entry into it's own cell, but the next cell will contain the previous entry's comment and then the next entry. So it would look like this:
2021-09-21 17:15:41 John Smith [Work Notes
*next cell*
- Update inside ticket 2021-09-21 16:12:39 Jane Doe [Work Notes
and so forth. This does not really work since I still cannot perform the time calculation and starts to mix up the different updates within a ticket.
Does anyone have an suggestions on a better way to do this? I am open to using VBA if needed. Thanks for the help!
CodePudding user response:
This can be done with a find/replace and then text-to-columns.
2021-09-21 17:15:41 John Smith [Work Notes] - Update inside ticket 2021-09-21 16:12:39 Jane Doe [Work Notes] - Another update inside ticket.
Since the date is the beginning of the entry, you only need to add a character before the date. This can be anything you know will not appear in the rest of the messages.
Once you do this, you can use the text-to-columns to select the special character as your delimiter.
CodePudding user response:
Using a regex. Put strings in Column A on Sheet1.
Option Explicit
Sub extract()
Dim wb As Workbook, ws As Worksheet, cell As Range
Dim lastRow As Long, i As Integer, j As Integer
Dim c As Long, s As String
' build regex pattern
Dim regex As Object, m As Object
Set regex = CreateObject("vbscript.regexp")
With regex
.Global = True
.MultiLine = False
.IgnoreCase = True
.Pattern = "((20\d\d-\d\d-\d\d \d\d:\d\d:\d\d)(.*?)(\[[^]]*\]))" ' pattern
End With
' data
Set wb = ThisWorkbook
Set ws = wb.Sheets("Sheet1")
lastRow = ws.Cells(Rows.Count, "A").End(xlUp).Row
For Each cell In ws.Range("A1").Resize(lastRow)
s = cell.Value
c = 1
If regex.test(s) Then
Set m = regex.Execute(s) '
For i = 1 To m.Count
For j = 2 To m(i - 1).submatches.Count
cell.Offset(0, c) = m.Item(i - 1).submatches(j - 1) ' matched term
c = c 1
Next
Next
End If
Next
MsgBox lastRow & " rows parsed", vbInformation
End Sub
CodePudding user response:
With Microsoft365 formulae:
Formula in A3
:
=FILTERXML("<t><s>"&CONCAT(IF(ISNUMBER(SEARCH(" ????-??-?? ??:??:??",MID(A1,SEQUENCE(LEN(A1)),20))),"</s><s>",MID(A1,SEQUENCE(LEN(A1)),1)))&"</s></t>","//s")
Obviously nest this inside TRANSPOSE()
to spill this over to columns but for the sake of visibility in the screenshot I opted not to do so.
EDIT:
I can only guess your follow-up question but if you want to analyse the data a bit better e.g DateTime-Stamp;Person;Ticket you could try the following:
Formula in B3
:
=CHOOSE({1,2,3},--LEFT(A3#,19),MID(A3#,21,FIND(" [",A3#,20)-21),MID(A3#,FIND("] - ",A3#) 4,LEN(A3#)))
As you see it spilled all the different data in seperate columns.
Now if one has access to LET()
all this can be condensed into a single formula.
Hope it helps.