I have written an XQuery function that takes a key as an argument and returns that key's value. It can be applied to unformatted text.

I defined the separator between pairs as a regular expression that matches any run of whitespace and commas:

`[ \t\r\n,]+`

And then I applied another regular expression to extract the key and value parts out of each pair:

`([a-zA-Z0-9\.]+)=(\"?[a-zA-Z0-9\.\-\=]*\"?)`

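This pattern matches keys made of alphanumeric characters and dots, and values made of alphanumeric characters, dots, dashes and equals signs. The value may optionally be wrapped in double quotes, and the quotes are part of the captured value.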
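To make the two steps concrete, here is a minimal sketch that splits a made-up sample string and prints the key and value captured from each pair. The sample text is my own assumption, not real data. I also dropped the backslashes before `"` and `=`, because as far as I know they are not valid escapes in the XML Schema regular expression syntax and a strict processor may reject them.

```xquery
xquery version "1.0";

(: Made-up sample text, for illustration only. :)
let $text := 'host=example.org, port=8080 mode="fast"'
(: Same pattern as above, minus the backslashes before " and =. :)
let $regexp := '([a-zA-Z0-9\.]+)=("?[a-zA-Z0-9\.\-=]*"?)'
(: Step 1: split the text on runs of whitespace and commas. :)
for $item in fn:tokenize($text, '[ \t\r\n,]+')
(: Step 2: keep only the tokens that look like key=value. :)
where fn:matches($item, $regexp)
return fn:concat(fn:replace($item, $regexp, '$1'),
                 ' -> ',
                 fn:replace($item, $regexp, '$2'))

(: Result: three strings,
   host -> example.org
   port -> 8080
   mode -> "fast"
:)
```

Note that the last value keeps its surrounding quotes, since the optional quotes sit inside the second capture group.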

The full function is as follows:

    (: Returns the value(s) stored under $key in $arg, or the empty sequence. :)
    declare function xf:GetValue($arg as xs:string, $key as xs:string) as xs:string* {
      (: Split the text into candidate pairs on whitespace and commas. :)
      for $item in fn:distinct-values(fn:tokenize($arg, '[ \t\r\n,]+'))
      let $regexp := '([a-zA-Z0-9\.]+)=(\"?[a-zA-Z0-9\.\-\=]*\"?)'
      where fn:matches($item, $regexp)
      return
        if ($key = replace($item, $regexp, '$1'))
        then replace($item, $regexp, '$2')
        else ()
    };

It can be adapted to different scenarios, but I believe the version above covers the majority of cases. One exception is a format where pairs are glued to each other with no separator, something like `key1=valueKey2=value`. In that case we need one or more additional patterns to tell where one pair ends and the next begins.

My function isn't optimized for performance. Say the text contains *n* pairs and we want to look up *k* keys: every call scans all the pairs, so the total cost is `O(kn)`, which is linear, `O(n)`, only if *k* is treated as a constant. What we can do instead is scan the text once and populate a balanced binary search tree, then look each key up in the tree. Building the tree costs `O(n.log(n))` and each lookup costs `O(log(n))`, so the total becomes `O((n+k).log(n))`.
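A rough sketch of the "scan once, look up many times" idea is below. It assumes a processor that supports XQuery 3.1, and it uses the built-in `map` type instead of a hand-written binary search tree, so the actual lookup cost depends on how your processor implements maps. The function name `xf:BuildIndex`, the `xf` namespace URI and the sample text are hypothetical. The backslashes before `"` and `=` are dropped from the pattern, since a strict processor may reject them.

```xquery
xquery version "3.1";

declare namespace map = "http://www.w3.org/2005/xpath-functions/map";
(: Placeholder namespace URI; use whatever xf is bound to in your project. :)
declare namespace xf = "http://example.com/xf";

(: Hypothetical helper: parses the text once and returns a key -> value map. :)
declare function xf:BuildIndex($arg as xs:string) as map(xs:string, xs:string) {
  let $regexp := '([a-zA-Z0-9\.]+)=("?[a-zA-Z0-9\.\-=]*"?)'
  return map:merge(
    for $item in fn:tokenize($arg, '[ \t\r\n,]+')
    where fn:matches($item, $regexp)
    return map:entry(fn:replace($item, $regexp, '$1'),
                     fn:replace($item, $regexp, '$2'))
  )
};

(: Build the index once, then reuse it for every lookup. :)
let $index := xf:BuildIndex('host=example.org, port=8080 mode="fast"')
return ($index('port'), $index('host'))
```

With the sample text above, the two lookups should return `8080` and `example.org` without tokenizing the input again.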