According to USAID research standards, quality data must exhibit five key attributes, abbreviated V-TRIP: Validity, Timeliness, Reliability, Integrity and Precision.
Valid data gives a true representation of the measure of interest (the indicator), and changes in it can credibly be associated with the interventions in question. It should be free of sampling and non-sampling errors. Validity is achieved by developing proper data collection tools and then using them effectively during the data collection exercise. Soon after the exercise, the integrity of the data must be protected. This depends largely on how well the data management system in place reduces the possibility of introducing bias, whether by transcription error or by deliberate manipulation, during data entry and cleaning.
In reality, though, isn’t this just idealism? During a data collection exercise there are usually many factors in play that may hinder the collection of valid data. What follows data collection is therefore often hard work spent preserving the integrity of invalid data.
The challenge does not start in the field but in the planning of a research exercise, and most of it has to do with time. Several time-related areas must be handled carefully to avoid these “time bottlenecks”. I conducted a small, essentially qualitative study to gather opinions and experiences from fellow researchers, and here are some of the concerns raised:
1. Training period. Proper planning ultimately determines the level of success of any given project. It is therefore advisable to spend as much time as needed on preparation, so that by the time of execution every foreseeable challenge and risk has been mitigated or prepared for. Training of enumerators is part of planning. Spending only a day training enumerators who are going to carry out a seven-day survey, only to end up getting 50% of the responses wrong, doesn’t make sense. Wouldn’t it be wiser to spend two days on training and increase the precision and validity of the collected data to 90%? Release enumerators to the field only when you are sure they will bring you not merely good but excellent data.
2. Data collection period. A questionnaire that takes an hour during a mock survey in the training venue will not take the same amount of time in the field. It will probably take half an hour more. At about an hour and a half per interview, eight questionnaires would need twelve hours of interviewing before any travel is counted, so it is not logical to expect an enumerator to bring back eight questionnaires at the end of the day. The plan must consider the sampling method used and the travel time needed to reach target participants. All of this is about TIME. If you set unreasonable targets, the enumerator will use unethical means (compromised integrity) to reach them, and the result will most likely affect the validity of the data.
3. Sample size versus daily target. Bias and assumption among enumerators often creep in when they have had to ask the same questions over and over and keep getting the same answers. By the time they are administering the fifth hour-long questionnaire of the day, they have basically switched to ‘auto-drive mode’. They assume that the responses to some questions will be similar to what they got before, so they do not pose these questions to the respondent but fill in the assumed response instead. This is also reported to happen when respondents look unsettled or seem to be in a rush, or when the enumerator is tired, feels far from reaching the set daily target or is running out of time as the day concludes.
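The arithmetic behind a realistic daily target can be sketched in a few lines of Python. This is only an illustration, not a planning standard: the one-hour mock interview and the extra half hour in the field come from the concerns above, while the 20 minutes of travel per interview, the 8-hour working day, the sample of 400 and the team of 10 enumerators are assumed figures, and the function names are hypothetical.

```python
import math


def feasible_daily_target(mock_minutes, field_extra_minutes,
                          travel_minutes, working_hours):
    """Whole questionnaires one enumerator can complete in a working day."""
    # Time per interview in the field: mock duration, plus the extra time
    # interviews take in the field, plus travel to reach the respondent.
    minutes_per_interview = mock_minutes + field_extra_minutes + travel_minutes
    return math.floor(working_hours * 60 / minutes_per_interview)


def survey_days(sample_size, enumerators, daily_target):
    """Days needed to cover the sample at the given daily target."""
    return math.ceil(sample_size / (enumerators * daily_target))


# From the text: 60-minute mock interview, about 30 minutes more in the field.
# Assumed for illustration: 20 minutes of travel, an 8-hour working day.
target = feasible_daily_target(60, 30, 20, 8)

# Assumed for illustration: 400 respondents, 10 enumerators.
days = survey_days(400, 10, target)

# Interviewing time alone for a target of eight questionnaires a day.
hours_for_eight = 8 * (60 + 30) / 60

print(target, days, hours_for_eight)
```

With these assumed figures the feasible target is four questionnaires per enumerator per day, and the survey needs ten days. Working backwards from the time available to the target, rather than from a fixed number of days to whatever target that implies, is the point of the sketch.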
Is it possible to deal with these “time bottlenecks” to beat the issue of validity and integrity at the level of data collection?
Several suggestions were put forward, but no single one works on its own. The suggested approaches must be combined to move from the 90% that can be achieved with proper planning to 98%. First of all, sufficient time must be allocated and used in the planning phase. Train, carry out a mock survey, re-train, pre-test with a sample of targeted respondents, then re-train. Ask questions and engage the trainees. This helps in gauging their understanding of the tool, their confidence with it and their readiness to undertake the exercise. Do not depend on phone calls to clarify issues for enumerators after they have been deployed to the field. Network reception may be terrible, or something else may make communication impossible, and enumerators will then make do with guesswork. Attain an excellent mark before deployment.
Secondly, allocate sufficient time for the survey. Don’t set unreasonable targets, because enumerators will hit the target but deliver invalid data. It was also suggested that sound recording would be a great tool for confirming validity.