Part 1 of the Science Critique 101 Series.
Science critique is absolutely necessary for interpreting all the information we can readily access these days, but before you decide to accept or reject new findings, it’s important to understand that research is never perfect, and that’s the whole point!
Science is about continuous improvement.
We live in an era of information saturation. In my own time, I have gone from the Encyclopedia Britannica in the family library, specialty veterinary books on my friend’s shelf, and university lectures and information nights, through specialist CD-ROMs and DVDs, to the internet and now podcasts and webinars.
If someone criticises you for feeding your horse garlic, you can just ask Siri to find articles about ‘the benefits of feeding horses garlic’. They can do the same with the search term ‘garlic poisonous for horses’.
Siri is always there to tell people what they want to hear (read my article ‘Bias Beware’ online here).
This is why evidence-based, scientific knowledge is more valuable than ever. It’s why we no longer ride horses on empty stomachs, why Pony Club Australia revised their 2019 Syllabus of Instruction around Equitation Science and probably why you have chosen to read Horses and People Magazine.
Science is about continuous improvement – getting closer to confidence and certainty by continually testing, revisiting, reviewing and sometimes replacing or outright rejecting that of which we were confident yesterday.
That is, science is not perfect. Science is as critical of itself as are its consumers. Researchers as well as the general public should be critical of the scientific evidence with which they are provided, especially regarding how knowledge is produced.
Most people know some form of what I call science critique 101, covering the following big four: sample size, research bias, peer-review and funding. They want to know ‘how many people or things were in the study?’, ‘who did the research?’, ‘who vouched for it?’ and ‘who funded it?’
However, those critiques are sometimes based on unhelpful assumptions like ‘the bigger the better’, ‘they always push that barrow’, ‘peer-review is a mate’s club’ and ‘they were paid to find that answer’ – especially when people dislike or disagree with the research findings (see my article ‘Make Hay, Not Straw Men!’ here).
Whilst these critiques are legitimate, knee-jerk deferrals to these kinds of criticisms for the purpose of dismissing research findings are usually short-sighted projections that reveal more about the anxieties and personal politics of the critic than the implications of the research.
In future articles, I will provide more context for these critiques so that science consumers can be more nuanced when interpreting and engaging with scientific research. To borrow a metaphor, I want to make sure we aren’t throwing out the baby with the bathwater or missing opportunities to gain something beneficial even from flawed research.
In the present article, I want to simply make the point that no piece of research is perfect. Research is a decision-making process.
When a researcher designs a study, they are faced with many decisions. Whether they are answering a research question or testing a hypothesis, researchers need to decide what approach they will use, what method they will use, who they will do the research with, who the researchers will be, if and what they will measure, what is included, what is excluded, when and where the research will be done, for how long and for how many repetitions.
The answers to all these questions should depend principally on what is going to be most useful to answer the research question or test the hypothesis. They may also depend on how well researched the area already is. For example, new areas of research tend to be more open-ended and exploratory whilst well developed areas of research can have more specific studies, often extending or testing the findings of previous research.
The best way to answer any research question is not the same as the perfect way. Every decision in research design is a trade-off.
I can survey hundreds or thousands of people quickly online, but it is very hard to consider their answers unless they are restricted to units of information that can be easily and quickly analysed. If I want that kind of large sample size, I defer to closed-ended questions (yes/no, how many?) or a Likert scale (always, sometimes, rarely, never, etc).
The trade-off is that I miss all the highly nuanced responses that might more accurately reflect how people actually make decisions.
You know when you do a closed-ended online survey and are faced with yes/no options or a Likert scale, but the first thing that comes to mind is ‘umm, it depends’?
Well, that’s the kind of real-life (‘ecologically valid’) thought process of humans that a simple survey of thousands just can’t capture. I might end up finding out how many people think horse-racing is cruel, and I might have collected enough demographics to consider the age and gender of people who do and don’t think horse-racing is cruel, but I will know nothing about where they draw the line between what is and is not cruel, under what circumstances, and depending on what.
For example, if you asked me if I find rodeo riding more or less cruel than dressage, my answer would depend on many things. Rodeo horses are ridden for a few minutes every so often. Dressage horses might be ridden for an hour a day. In the same way that we know that the ‘goodness’ or ‘badness’ of your average dressage bit and bridle depends on the rider, are there also kind and unkind rodeo riders? Is the dressage horse kept in a stable or housed with company where he or she has room to move?
It all depends.
Faced with only two possible responses in the survey, I might end up ticking the box ‘more cruel’. Still, I would have an immediate snapshot of public perception and findings that could be easily communicated.
I could do a stated preference study where I add these other variables about time spent being ridden and living conditions, but that would be a fairly artificial creation of a fixed number of options. Still, I could weight the variables and determine statistically what people attributed more or less value to, and how that impacted their relative ranking of welfare outcomes.
A 30-60 minute face-to-face interview, however, might capture all of that ‘it depends’ complexity and more. But it would be very hard for the researcher to analyse more than 40 or 50 of those interviews in a way that preserves that complexity and makes it easy to present in a scientific paper.
And 20 or 30 interviews might be the ‘sweet spot’ before the person analysing the interviews stops looking for (i.e., just can’t handle) new insights. And how would I handle the diversity of responses and multiple combinations, even if some of them only occurred once during an interview? And how much does what someone says during an interview reflect how they thought before or after?
Then, what if the nineteenth person you interview brings up something incredibly relevant to the research question, but you never anticipated it, so you never asked participants 1 through 18? How will you handle that information, if at all? You might never have discovered it with a closed-ended online survey, but with an online survey you would have had more people, making it possible to identify trends that might be common in the general population.
This is the breadth versus depth trade-off that is often relied on to distinguish between quantitative and qualitative research. All of the aforementioned approaches have advantages and limitations. It just means that the decision to use a survey-based design or a face-to-face interview design needs to be made in direct relation to the aims of the research. (It makes little sense, for example, to answer a research question about ‘how many people think rodeo is cruel’ with a research design of 20 face-to-face interviews).
But how often does our imperfect world provide the conditions for the best research design to occur?
In deciding on the best way to answer their research question, researchers face many practical constraints, mostly related to the resources available. This includes the people available to do the research, the necessary equipment, ethical considerations and the time available to get the work done (at the right time of year, in the right time frame and at the right pace). Time may be the easiest thing to plan, but it is the hardest thing to manage.
In-depth, face-to-face interviewing takes time to conduct and analyse. Someone might answer ‘question six’ in their response to ‘question two’, or not at all. It takes time to organise the findings of that interview against the others. How well could this have been predicted and put into a timeline?
An online survey of 100 people might be free to host and take three hours to analyse, with the data delivered in a spreadsheet. Each of the 20 interviews might take one hour to conduct, three hours to transcribe and three hours to analyse. That means 140 hours of work, at least.
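For readers who like to see the sums, the time budgets above can be tallied in a few lines of Python. The hour figures are the illustrative ones from this article, not costings from a real study:

```python
# Back-of-the-envelope comparison of researcher time for the two designs,
# using this article's illustrative figures (assumptions, not real costs).

SURVEY_RESPONDENTS = 100
SURVEY_ANALYSIS_HOURS = 3        # data arrives ready-made in a spreadsheet

INTERVIEWS = 20
HOURS_PER_INTERVIEW = 1 + 3 + 3  # conduct + transcribe + analyse

survey_total = SURVEY_ANALYSIS_HOURS
interview_total = INTERVIEWS * HOURS_PER_INTERVIEW

print(f"Survey of {SURVEY_RESPONDENTS} people: {survey_total} hours")
print(f"{INTERVIEWS} interviews: {interview_total} hours")
```

The point of the sketch is not the exact numbers, which will vary from project to project, but the order-of-magnitude gap between the two designs.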
Such practical limitations of time and money are rarely spoken about in the published methods sections of academic articles. I believe this is largely a reflection of the legacy of the monastic history of universities.
There was a magical time that academics dream of fondly, when they could produce knowledge for knowledge’s sake, in the time necessary to do it well, without justification of importance and with full support of their institution and the community.
During that period, there was no need to make it acceptable for a researcher to say something like ‘we studied six horses because the two others we had hoped to include were taken back by their owners’, or ‘we only analysed the effects of the medication for six months because our research assistant took a position elsewhere and we couldn’t find anyone else who wanted to work with us for prestige instead of pay’.
With the corporatisation of university research, academics have to either justify their research in relation to the official list of national research priorities, fund their day-to-day existence by doing research demanded by industry or partner with like-minded organisations.
Few, if any, universities or industries will cover the costs of research that cannot be translated into intellectual property that can be sold. That is, there will be more support for academics to research the treatment of laminitis than to research the reasons why horse owners and carers struggle to objectively score their horse’s body condition. And that’s just the system in which most researchers currently find themselves.
This is why other kinds of research are largely a labour of love, done by researchers at night, on weekends, during their holidays without any assistance or put up as a project for Honours, Masters or PhD students to work on.
Indeed, that is how most of my research on human-horse relations was completed, whilst my salary was directed towards more important/topical/fundable/profitable issues like domestic food waste, train driver fatigue or passenger crowding. They may not have been done perfectly – being subject to the same practical limitations as my ‘bread and butter’ projects, or even more – but they were done.
This is not to say that some research should be subject to less scrutiny than other forms of research. Rather, the question becomes: how can we use popular forms of science critique to contextualise research findings and ensure that they are neither over- nor under-interpreted, nor over- nor under-stated?
Over ten years of being a university researcher, teacher, supervisor, project manager, article reviewer and writer, I was continually reminded that no piece of research is perfect. But that’s the point. That’s exactly why research is necessary and that is exactly why we should question the findings of research.
However, whilst critique is necessary for progress and improvement, it is important to know that even basic concerns over the big four (sample size, research bias, peer-review and funding) are not free of moral values, politics, agendas and biases. But these critiques can be used in much more useful ways than simply disregarding a study.
Over the coming issues of Horses and People, I will explore science critique 101. There are three reasons for this:
First, to make sure that research findings are not being dismissed unfairly.
Second, to make sure that critiques of science are themselves subject to critique and,
Third, to make sure that we don’t deny ourselves the opportunity to be better horse people, and we do better by and for our horses, because we rejected something for being imperfect.
In the next article, I tackle the issue of sample size and the popular belief that the bigger the sample, the better the research. This belief is based on the important need for a sample to adequately represent a population. However, the exact numbers behind ideas of ‘big’ and ‘small’ are relative to the research aims, the size of the total population, the type and quantity of data being considered for each research participant/subject, and the implications that are being drawn from the findings.