6-minute read

Quick summary: Proven successful approaches to generating random test data in JMeter as part of a more complex API testing framework

Random data prevents optimized libraries from caching requests and displaying artificially high performance. It is also useful to test the range of accepted values to ensure that the full range is supported. In addition, random data can help with building negative test cases, which are important both for exercising functionality and for determining whether negative test cases result in performance impacts (which are also a security risk). There are many ways to create random data, and this article provides examples of some approaches I have used with consistently positive results as part of a more complex API testing framework.

Apache functions

There are several functions documented on the Apache site. I find that I need to look beyond the documentation for good examples, and you will need to look beyond this short article for more than one, though this is one I use often: ${__RandomString([length],[characters], [variableName])}:

${__RandomString(${__Random(3,20)},ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz,
              notes.saveSessionNotes.myOrderSessionApi)}

Using JSR223 Samplers

JSR223 Samplers are great for taking advantage of the efficiency of the Groovy language in JMeter. There are two types of random data that are well suited for JSR223 Sampler generation. One is generic data, such as numbers, dates, and alpha-numeric characters with an easily defined pattern. The other is a small sample of specific values that can be used randomly.

The static data stored in the JSR223 Sampler should be as realistic as possible. Extracting the data from a production database copy using a query such as

SELECT DISTINCT [FIELD_NAME] from [SCHEMA.TABLE]

is the best way to get realistic and valid values. In cases where an empty string as input will return null as output, consideration must be given to the impact of skipping the null inputs versus using assertions that need to be very complex in order to handle the nulls dynamically.

Random dates

The example below demonstrates an approach to extract the difference in days between the earliest and latest date, then randomly select a number in that range of days and deduct it from the latest date to create a random test value (in this case for a date of birth):

Date  todayDob  = new Date();
Date  bottomDate = Date.parse('yyyyMMdd', '19010101');
Int   diffDays  = todayDob.minus(bottomDate);
Int   rndDiff   = org.apache.commons.lang3.RandomUtils.nextInt(0, diffDays);
Date  randDob   = todayDob.minus(rndDiff);
vars.put('myApiPersist.dob',randDob.format( 'yyyy-MM-dd' ));

Random strings from list

An object array is easier to read and maintain for simple arrays.

String[] empStatuses  = ['Full time student', 'Unemployed', 'Retired',
                       'Employed','Part time student','Unspecified'];
int      empStatusIdx = org.apache.commons.lang3.RandomUtils.nextInt(0,
                        empStatuses.size());
vars.put(‘myApiPersist.employmentStatus', empStatuses[empStatusIdx]);

In the example above, note that org.apache.commons.lang3.RandomUtils is used instead of the Groovy Random class because of how Random is not actually random and can easily lead to clashes across threads. (There are other functions and approaches as well, and I tend to re-use previous code until a compelling reason to change occurs).

For sets of very small lists, using a random number and the modulus operator is more efficient if the random number is already being generated for another value, as in this example:

String[] patGender = ['M', 'F'];
vars.put(‘myApiPersist.gender', patGender[(int)todayDob.getTime() % 2]);

Two-dimensional arrays

2D arrays are necessary when values must be used in pairs to be valid. For those unfamiliar with working with multi-dimensional arrays, here is an example:

String[][] nearestLocation = [["Don't know","UNK"],["SoCal","42035005"],
                       ["None with 100 miles","OTH"],
                       ["Midwest","20430005"],
                          ["Virtual","ASKU"],
                          ["America South","38628009"]];
int      nearestLocationIdx = org.apache.commons.lang3.RandomUtils.nextInt(0,
                            nearestLocation.size());
vars.put(‘myApiPersist.sexualOrientation', nearestLocation[nearestLocationIdx][0]);
vars.put(‘myApiPersist.sexualOrientationCode', nearestLocation[nearestLocationIdx][1]);

Maps

Maps are an alternative to the 2D array. Here is an example from a file upload API test:

def fileSize = [1,5,10,20];
def extensionAndMimeType = [tif: 'image/tiff', pdf: 'application/pdf', jpg: 'image/jpeg', docx: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'];
def extensions = extensionAndMimeType.collect{entry -> entry.key};
String filePrefix = fileSize.get(java.util.concurrent.ThreadLocalRandom.current().nextInt(0, fileSize.size()));
String fileExtension = extensions.get(java.util.concurrent.ThreadLocalRandom.current().nextInt(0, extensions.size()));
vars.put('session_file.name', String.format('%sMB_File.%s', filePrefix, fileExtension));
vars.put('mime.type', extensionAndMimeType.get(fileExtension));

Random keys from maps

Sometimes you only need the key from a map at certain points in your test. Rather than maintain two samplers, you can access just the keys.

def fileTypes 		= 	[ 'FileAttachment': 1, 'FileAttachmentOpaqueId': 1, 'ClinicalNote': 3, 'ClinicalNoteTemplate': 4, 'ExternalDocument': 5, 'FaxDocument': 6, 'Multiple': 8, 'CCDABatch': 9 ];
def randomFileTypeKey = (fileTypes.keySet() as List).get(ThreadLocalRandom.current().nextInt(fileTypes.size())).toString();
OR:
def randomFileTypeValue = fileTypes.get((fileTypes.keySet() as List).get(ThreadLocalRandom.current().nextInt(fileTypes.size())).toString());
OR:
def randomFileTypeValue = fileTypes.get(randomFileTypeKey);

Lists

At a certain (arbitrary) level of complexity, a List implementation may be easier to maintain (or it may justify an exception to use a CSV file instead). Here is an example of a List of Lists being used to generate all possible combinations of three sets of health-related parameters:

List<List> testValues = [
  ["FEVER","COUGH","SORE_THROAT"], //v3SymptomsSet
  ["POSITIVE","NEGATIVE","PENDING"], //testResult
  ["YES","NO","INDICATED_BUT_UNAVAILABLE"]];//v2TestedStatus
List<List> patCOVID19ScreeningApiVars = [];//
List temp = [];
void generatePermutations(List<List> lists, List result, int depth, List current) {
    if (depth == lists.size()) {
       result.add(current);
        return;}
    for (int i = 0; i < lists.get(depth).size(); i++) {
        generatePermutations(lists, result, depth + 1, current + lists.get(depth).get(i));}
}
generatePermutations(testValues, patCOVID19ScreeningApiVars, 0, temp);
vars.putObject("patCOVID19ScreeningApiVars",patCOVID19ScreeningApiVars);
vars.put("patCOVID19ScreeningApiVars.Count", (patCOVID19ScreeningApiVars.size - 1).toString())
SampleResult.setIgnore();

Ctrl+C/Ctrl+V

There are some alternate ways to generate random values described in this Stack Overflow thread.

Some key considerations for options to generate a random number:

  1. java.util.Random does not generate true random values. A test in JMeter resulted in the same series of numbers every single run. This can increase the likelihood of clashes between threads.
  2. While org.apache.commons.lang3.RandomUtils is a wrapper around java.util.Random, it is (like most of the Apache Commons collection) a well-designed wrapper that results in values that are more random and tested to result in a different series of values on every test run.
  3. ThreadLocalRandom is recommended by some as more efficient. Given these efficiencies are measured in sub-milliseconds and that the Apache Commons is a more reliable collection than the java.util collection, the recommendation is to use RandomUtils.

Random conclusion

The failure of many API tests to catch defects prior to release is the limited inputs used to test them. Using random data in conjunction with iterative loops will help lower the amount of time spent patching production issues and leave more time for building business capabilities and value.

Like what you see?

Paul Lee

Senior Technical Architect Scott S. Nelson has over 20 years of experience consulting on robust and secure solutions in areas such as multi- and hybrid-cloud, system integrations, business process automation, human workflow management, and quality assurance innovation.

Author