Content Extraction
Content extraction lets you parse the response data and extract parts of it. This is useful when subsequent requests need dynamically generated content, such as access tokens.
To extract content from a response, set the extraction option in the request options parameter.
You can use more than one content extraction per request.
- specify the desired type (jsonpath, xpath, regexp, header)
- specify the names for referencing
- specify the expressions
For example:
session.get("/tokens", {
  extraction: {
    jsonpath: {
      "accessToken": "authorization.token",
      "checksum": "authorization.checksum"
    }
  }
});
To use the extracted content in subsequent requests, read it with the getVar function.
session.get("/ping?token=" + session.getVar("accessToken"));
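Because more than one extraction can be configured per request, several variables can be filled from a single response. The following is a minimal sketch, assuming that different extraction types may be combined in one extraction object; the variable name serverHeader and the helper list variableNames are illustrative only. The typeof guard exists only so that the snippet also runs outside a test case definition, where session is not defined.

```javascript
// Request options that fill three variables from one response.
// Assumption: different extraction types can be combined in one object.
const tokenRequestOptions = {
  extraction: {
    jsonpath: {
      accessToken: "authorization.token",
      checksum: "authorization.checksum",
    },
    header: {
      serverHeader: "server",
    },
  },
};

// Names under which the extracted values can later be read with getVar.
const variableNames = Object.values(tokenRequestOptions.extraction)
  .flatMap((namedExpressions) => Object.keys(namedExpressions));

// Inside a test case definition, `session` is provided by the platform.
if (typeof session !== "undefined") {
  session.get("/tokens", tokenRequestOptions);
  session.get("/ping?token=" + session.getVar("accessToken"));
}
```

Keeping the options in a named object makes it easy to see which variable names a request is expected to define.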
Note
Since a launched client will execute only one session and they don’t share anything, you cannot reuse extracted data from responses across sessions or clients.
JSONPath
JSONPath is used to find information in a JSON document.
In your test case definition you can use JSONPath as follows.
Given this JSON response:
{
  "authorization": {
    "token": "s3cret-access-token"
  }
}
and these request options to extract the access token:
session.get("/tokens", {
  extraction: {
    jsonpath: {
      "accessToken": "$.authorization.token",
    }
  }
});
will make accessToken available as a variable to the current client within the same session:
session.get("/ping?token=" + session.getVar("accessToken"));
JSONPath Support
Note
We only support a subset of what JSONPath describes.
- $. at the beginning, which indicates the document root, is optional and implied.
- We allow dot-notation (attribute1.subattribute2.subattribute3) and array brackets for more complex fields (attribute1["ProductName with $"]).
- We allow array position access via brackets (attribute1[0]).
- We do not support script expressions.
- You can use simple list filters in the form of $.something[?@.attribute==value].attribute
  - attribute may be one or more attributes (e.g. attr1 or attr1.attr2.attr3)
  - value can be a simple number or a quoted string
  - see below for examples
- If the result of your JSONPath expression is not true, false, a string or a number, we will re-encode the result into a JSON-encoded string. Note that this does not preserve whitespace or field order of the original input.
- If your expression does not match anything, you will get either an empty string ("") or, when using list filters, an empty list ("[]"). You can use conditionals to check for empty matches.
Example condition to check for an empty match when using list filters:
session.if(session.getVar("listResult"), "=", "[]", function(context) { /* listResult did not match */ });
Example condition for an empty match for normal jsonpath expressions:
session.if(session.getVar("accessToken"), "=", "", function(context) { /* accessToken did not match */ });
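To make the rules above concrete, here is a small local sketch of how the optional $. prefix, dot-notation, array indexes, empty matches and JSON re-encoding could behave. The function extractPath is hypothetical and is not the platform's implementation; it ignores quoted bracket keys and list filters, and returning booleans and numbers unchanged is an assumption.

```javascript
// Local sketch of the documented JSONPath subset. NOT the platform's code.
function extractPath(doc, expression) {
  // "$." (the document root) is optional and implied, so strip it.
  const path = expression.replace(/^\$\.?/, "");
  // Split "a.b[0].c" into ["a", "b", "0", "c"].
  const keys = path.split(/[.\[\]]/).filter((key) => key !== "");

  let current = doc;
  for (const key of keys) {
    if (current === null || typeof current !== "object" || !(key in current)) {
      return ""; // no match yields an empty string
    }
    current = current[key];
  }

  // true, false, strings and numbers are returned as they are (assumption);
  // everything else is re-encoded as a JSON string.
  if (["boolean", "string", "number"].includes(typeof current)) {
    return current;
  }
  return JSON.stringify(current);
}

const tokenResponse = { authorization: { token: "s3cret-access-token" } };
const withRoot = extractPath(tokenResponse, "$.authorization.token");
const withoutRoot = extractPath(tokenResponse, "authorization.token");
const reencoded = extractPath(tokenResponse, "$.authorization");
const missing = extractPath(tokenResponse, "$.authorization.checksum");
```

Here withRoot and withoutRoot hold the same token, reencoded is a JSON string of the authorization object, and missing is the empty string that the conditional above checks for.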
Here is another example for selecting elements from an array. Given this JSON document:
{
  "products": [
    {
      "id": "8e7cd50a-ff44-4518-a02b-51d2dceaafb1",
      "type": "42",
      "name": "iPhone 6",
      "properties": {
        "color": "spacegray"
      },
      "available": true
    },
    {
      "id": "5e41653c-de17-4d6a-9616-5d80257d4b7e",
      "type": "23",
      "name": "iPhone 5",
      "properties": {
        "color": "white"
      },
      "available": false
    }
  ]
}
- Accessing items by index
  - $.products[0].id will match 8e7cd50a-ff44-4518-a02b-51d2dceaafb1
  - $.products[1].id will match 5e41653c-de17-4d6a-9616-5d80257d4b7e
  - $.products[3].id will result in an empty string
  - $.products[random(length(@))].id will pick a random item from products and return its id
- Selecting by filter
  - $.products[?@.type==23].name will match iPhone 5
  - $.products[?@.type==42].properties.color will match spacegray
  - $.products[?@.name=="iPhone 6"].type will match 42
  - $.products[?@.name=="iAndroid 2099"].type will match [], as nothing matched the filter
  - $.products[?@.available].name will return iPhone 6
  - $.products[?@.available==false].name will return iPhone 5
  - $.products[?@.available!=true].name will return iPhone 5
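The filter examples can be mirrored locally with plain array filtering, which may help when reasoning about what a filter will return. This is a rough sketch, not the platform's implementation: selectByFilter is a hypothetical helper, and encoding several matches as a JSON list is an assumption, since the examples above only cover zero or one match.

```javascript
const products = [
  { id: "8e7cd50a-ff44-4518-a02b-51d2dceaafb1", type: "42", name: "iPhone 6", available: true },
  { id: "5e41653c-de17-4d6a-9616-5d80257d4b7e", type: "23", name: "iPhone 5", available: false },
];

// Rough local equivalent of $.products[?<predicate>].<field>.
function selectByFilter(items, predicate, field) {
  const values = items.filter(predicate).map((item) => item[field]);
  if (values.length === 0) return "[]"; // documented: nothing matched the filter
  if (values.length === 1) return values[0]; // documented examples show the bare value
  return JSON.stringify(values); // assumption: several matches are JSON-encoded
}

// $.products[?@.name=="iPhone 6"].type
const typeOfIphone6 = selectByFilter(products, (p) => p.name === "iPhone 6", "type");
// $.products[?@.available!=true].name
const unavailableName = selectByFilter(products, (p) => p.available !== true, "name");
// $.products[?@.name=="iAndroid 2099"].type
const noMatch = selectByFilter(products, (p) => p.name === "iAndroid 2099", "type");
```

With the document above, typeOfIphone6 is "42", unavailableName is "iPhone 5" and noMatch is "[]".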
Note
JSONPath functions are currently in beta, so their usage is limited. Feel free to give feedback on more use-cases and ideas.
Inside the filters you can use the following functions:
- length(element) - length() returns the length of the passed element. Normally this is used with @ to access the length of the current element in a filter.
- random(int) - random() returns a random integer between zero (inclusive) and the provided parameter (exclusive). This can be used in combination with length() to pick a random element from a list: $.products[random(length(@))].id.
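The semantics of the two functions can be sketched in plain JavaScript. The local functions random and length below only imitate the described behavior (zero inclusive, parameter exclusive) and are not the platform's implementation.

```javascript
// random(n): a random integer between zero (inclusive) and n (exclusive).
function random(upperExclusive) {
  return Math.floor(Math.random() * upperExclusive);
}

// length(element): the number of items in the passed list.
function length(element) {
  return element.length;
}

const productIds = [
  "8e7cd50a-ff44-4518-a02b-51d2dceaafb1",
  "5e41653c-de17-4d6a-9616-5d80257d4b7e",
];

// Local equivalent of $.products[random(length(@))].id
const pickedId = productIds[random(length(productIds))];
```

Because the upper bound is exclusive, random(length(list)) always yields a valid index for a non-empty list.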
XPath
XPath (the XML Path language) is a language for finding information in an XML document.
In your test case definition you can use XPath as follows.
Given this response:
<?xml version="1.0" encoding="UTF-8"?>
<users>
  <user>
    <firstname>Giada</firstname>
    <lastname>De Laurentiis</lastname>
    <email>giadalaurentiis@example.com</email>
  </user>
</users>
and these request options:
session.get("/tokens", {
  extraction: {
    xpath: {
      "email": "/users/user[1]/email"
    }
  }
});
will make email available as a dynamic data source within the same session:
session.put("/user?email=" + session.getVar("email"));
Regular Expression
Regular expressions are used to find a matching string.
In your test case definition you can use regexp as follows.
Given this response:
<p>Welcome john.doe@example.com!</p>
<p>You can confirm your account email through the link below:</p>
<p>
  <a href="http://test/users/confirmation?confirmation_token=noXuMgKe/i5pPP4wdv5Kq&locale=en">
    Confirm my account
  </a>
</p>
and these request options:
session.get("/data/test.html", {
  extraction: {
    regexp: {
      "confirmationToken": "confirmation_token=([\\w_\\/-]*)",
    }
  },
});
will make confirmationToken available as a dynamic data source within the same session:
session.get("/ping?token=" + session.getVar("confirmationToken"));
Note that you MUST have a match group within your regular expression - the first specified match group will be assigned to the variable.
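The role of the first match group can be checked locally before putting a pattern into a test case. The sketch below runs the same pattern with JavaScript's RegExp against the link from the example response; the platform's regular expression engine may differ from JavaScript's in details, and falling back to an empty string when nothing matches is an assumption.

```javascript
// Link fragment taken from the example response.
const responseBody =
  '<a href="http://test/users/confirmation?confirmation_token=noXuMgKe/i5pPP4wdv5Kq&locale=en">';

// Same pattern as in the request options; "\\w" in a string literal is \w in the regex.
const pattern = new RegExp("confirmation_token=([\\w_\\/-]*)");

const match = pattern.exec(responseBody);
// match[0] is the whole match, match[1] is the first match group.
const confirmationToken = match ? match[1] : ""; // "" on no match is an assumption
```

Here confirmationToken is noXuMgKe/i5pPP4wdv5Kq: the group stops at the & because that character is not part of the character class.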
Regular Expressions on Headers
Since the regexp extraction only applies to the response body, you can use the regexpheader extraction to work with headers, e.g. to grab parts of a Link header.
session.get("https://example.com/api/", {
  extraction: {
    "regexpheader": {
      "docid": "Link: .*/id/(.*)",
    },
  },
});
session.assert("doc_present", session.getVar("docid"), "!=", "");
session.get("/doc/:docid", {
  params: {
    docid: session.getVar("docid"),
  }
});
Assuming the first request returns a Link: https://api.example.com/doc/id/4711 header, the regexpheader extraction would store 4711 into the variable docid.
Each regexpheader extraction is checked against each header in a random order and the first match is used.
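The matching described above can be approximated locally. This sketch assumes that each header is presented to the pattern as a "Name: value" line, which the example pattern suggests but the text does not state; extractFromHeaders is a hypothetical helper, it walks the headers in the given order rather than a random one, and the empty-string fallback is an assumption.

```javascript
// Try the pattern against each header line and return the first match group.
function extractFromHeaders(headerLines, patternSource) {
  const pattern = new RegExp(patternSource);
  for (const line of headerLines) {
    const match = pattern.exec(line);
    if (match && match.length > 1) {
      return match[1];
    }
  }
  return ""; // assumption: no matching header yields an empty string
}

const linkResponseHeaders = [
  "Content-Type: application/json",
  "Link: https://api.example.com/doc/id/4711",
];

const docid = extractFromHeaders(linkResponseHeaders, "Link: .*/id/(.*)");
```

With these headers docid is 4711. Since the order in which headers are checked is random, a pattern that could match more than one header may give varying results, so anchoring it on the header name as done here is the safer choice.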
HTTP Response Header
HTTP response headers contain meta information about the response message.
In your test case definition you can extract HTTP response header field values using the request option header as follows.
Given these response headers:
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 211
Connection: keep-alive
Status: 200 OK
Date: Fri, 12 Feb 2016 08:43:01 GMT
X-Powered-By: Phusion Passenger 5.0.23
Server: nginx/1.8.0 + Phusion Passenger 5.0.23
and these request options:
session.get("/tokens", {
  extraction: {
    header: {
      "serverHeader": "server"
    }
  },
});
will make the HTTP response header Server with the value nginx/1.8.0 + Phusion Passenger 5.0.23 available as a dynamic data source named serverHeader:
session.get("/ping?server=" + session.getVar("serverHeader"));
Keep in mind that dynamic data sources are available in the same session only.
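The example requests the header as server although the response spells it Server; HTTP header field names are case-insensitive. A local sketch of such a lookup follows. headerValue is a hypothetical helper, not the platform's implementation, and returning an empty string for a missing header is an assumption.

```javascript
const headers = {
  "Content-Type": "text/html",
  "Server": "nginx/1.8.0 + Phusion Passenger 5.0.23",
};

// Look up a header value while ignoring the case of the field name.
function headerValue(responseHeaders, name) {
  const wanted = name.toLowerCase();
  for (const [fieldName, value] of Object.entries(responseHeaders)) {
    if (fieldName.toLowerCase() === wanted) {
      return value;
    }
  }
  return ""; // assumption: a missing header yields an empty string
}

const serverHeader = headerValue(headers, "server");
```

Here serverHeader holds nginx/1.8.0 + Phusion Passenger 5.0.23, the same value the extraction above stores.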
Cookie Extraction
Similarly, cookie extraction lets you extract cookie values:
session.get("/login", {
  extraction: {
    cookie: {
      "varAuthToken": "Api-Token"
    }
  }
})
This example copies the Api-Token cookie value from the response into the dynamic data source varAuthToken.
Body Extraction
Note
This feature is very costly and should be used sparingly.
You can also store the whole body in a variable:
session.get("/profile/me.html", {
  extraction: {
    body: {
      "varContentBody": true
    }
  }
})
This example will fill the variable varContentBody with the response body of /profile/me.html.