
Daniël Lakens is an Associate Professor in the Human-Technology interaction group at Eindhoven University of Technology (TU/e). His areas of expertise include meta-science, research methods and applied statistics. Daniël’s main lines of empirical research focus on conceptual thought, similarity, and meaning.
He also focuses on how to design and interpret studies, applied (meta)-statistics, and reward structures in science. A large part of his work deals with developing methods for critically reviewing and optimally structuring studies. Lakens has made several contributions to the scientific community, including the free open courses on Coursera “Improving Your Statistical Inferences” and “Improving Your Statistical Questions”. He runs a blog called “The 20% Statistician” and is also on Twitter (@lakens). In this exclusive interview with netECR, Lakens provides important insights on science, ethics, and academic careers.
What are your main research interests?
My background is in Experimental Psychology, and I still do this, and this is really work on things like concepts, how people think about conceptual thoughts, especially more the overlap between the Social Psychology concepts such as valence or morality or power or those things – a little bit vague things – how do we manage to think about them. More recently I have been also working a little bit more on things like similarity – how people make similarity judgements between things – really the things that cognitive psychologists study, so circles and squares and colours and those kind of things. So that is a little bit of my background, doing empirical research in this area. Over the last six years I have slowly moved into something that I always considered to be sort of my side job. In every yearly talk that I had with the head of my department I used to say something like “I’m just going to finish this one paper – I think it is really important that someone writes about this – but then I’m going to do empirical research”, up to the point that, like, two years ago, the head of the department said “Daniel, it’s time for you to start writing some grants”, which I never really did in the years before, because I have some problems with the grant system. Then I said “I think it’s a waste of time, because, you know, success rates are so low… Are you sure I should be spending so much time writing a grant when the odds are so low?”. He said “I think the odds are okay, if you write something about what you have been doing in the past couple of years, this meta-science thing, I think it’s necessary and I think you have a good résumé for it”, and now I have this grant, and a couple of people working on it. So this meta-science thing has gone from my hobby to my full-time job. So it’s interesting to see how you just completely shift fields.
So now I’m looking at a lot of statistics, research methods, norms in science and rewards structures in science and how we can do things better at a real, applied, practical level.
Can you briefly describe your career path?
I did a Masters in Social Psychology but also a little bit more in Experimental Psychology so the anchoring effect, which is based on the work by Kahneman and Tversky, if I tell you: “would you want to buy my car for €3K?”, you look at the car and say “well, it’s definitely not worth €3K, it’s worth way less, but okay, how about I offer €2,500?”. So you are anchored by the amount of money. But if somebody says “I want to offer a thousand Euro” – a very low offer – anyway, people’s judgements are anchored towards the numbers that they have as references – this is sort of what I did for my Masters. Then I remember my supervisor saying to me when I got my Masters dissertation: “I think you could have a career in science”. By that moment I hadn’t even seriously thought it, and I remember thinking “Hmmm… Yeah, thanks. It’s a nice compliment but I don’t think so!”. And I had decided that I was going to work at an elderly care home for a year, doing social service, and I thought I would take a moment to think about it. But my supervisor sneakily sort of said “How about we write up your Masters thesis for this Dutch conference?” and I started to sort of like it and get into it, and then I thought “okay, I will apply for a PhD position somewhere”, not realising that if you spend a year not going directly from your Masters to your PhD people generally think “are you motivated enough? It’s like you’re wandering around”, and it was a risky hire, so it took like two years until someone hired me. I was on the brink of thinking “oh, never mind, I will just work somewhere else”. Somebody hired me and then I worked in Social Psychology in Amsterdam, at the University of Amsterdam. Then I moved to Utrecht where I got my PhD, and after the PhD I moved to Eindhoven which is a technical university. In The Netherlands we have a divide between technical universities and non-technical universities. 
So now we are a small group of social scientists within a much larger community that is mainly made of chemistry, physics, material science, architecture, computer science and those kinds of things. So, no social sciences except our small department. It is a very interesting place to work. Some people say I should go abroad and experience a different culture, but in this case I would say “just work with engineers, basically”, because they seriously say things like “I’m not even sure whether Psychology is a science”, and they are serious about that. So they really confront you and then “okay, what are we doing? Are we doing a good job? Are we actually doing science?” I like this sort of critical environment where you have to reflect upon what you are doing. So I have been there for eight years and now still an Assistant Professor, I’m sure I will become an Associate Professor at some point this year but I don’t worry too much about this.
Please tell us about your career highlights and lowlights.
I think lowlights for me… I had a difficult time doing my PhD, to be honest. I am not sure how good I was at what I was doing, to be honest. I mean, I think I tried but it was definitely difficult to agree with my supervisor, sometimes even on research questions that were interesting enough. Strangely enough my first papers are actually without my supervisor, because my supervisor thought: “I don’t think this is good enough science”. I think it was good enough science, I don’t know… It’s always a collaboration, people can have different opinions, so I’m not sure who was right or who was wrong, but it was difficult. It was difficult at the end of my PhD to collaborate; and you have to work with the supervisor, right? So that was really… probably the most negative stress I’ve had.
Honestly, it sounds bad, but I think the highlight is getting this grant that I recently got and being able to hire a small team. So, we are three people now, two PhD students and a postdoc, and I think this time is really like a highlight. I mean, it’s a real pleasure to walk into the office in the morning and just have really smart, motivated people working on something as a team together. It is just a pleasure, which is bad at the same time, because I know these grants are highly competitive to get, they should probably be spread out to more people; I think it’s unfair that I can hire three people, whereas other people who have great ideas would also benefit from hiring one person. So, I think it should be spread out a bit more, and I do work on this topic ironically, right? So, I do think about these things, but I also have the money and it is a pleasure; I have to be honest: it is really nice to have a large amount of money and be able to work with people on something.
What advice would you give to early career researchers working in suicide and self-harm research with quantitative analytical designs?
I would say it is probably good not to limit yourself only to quantitative analytical designs. It is good that you do it as well, but I think that in many fields, and this specific field of suicide research seems to be one of them, you can combine both quantitative and qualitative designs in a very good way. I think that sometimes the division is a bit too much. But I think that, if you do things like quantitative research, I learned a lot after I completed my PhD, like two-three years after, because I finally sat down and thought “okay, what do I need to do to do really good research?” And it is kind of weird that I only had this thought three years after my PhD, but before I was just doing stuff… I was just doing things like people were doing it and then after about two years, two and a half years, I had this moment where I thought “okay, I have to design a really good study” and this was actually for the reproducibility project that ended up being published in Science, 100 replication studies and I did one of them. So, very strangely I thought “okay, this has to be a perfect study, because I’m replicating someone else’s work”, which I had never done before, so I thought that everything needed to be absolutely perfect, well thought through and excellent, which is weird because why wouldn’t I have done this before? But anyway, I felt “okay, if I mess it up, this person will be upset, so I have to do a really good job”. And then I thought about the basic stuff I don’t really know: “why am I doing this? How should I do this? Or what it actually is…” Because I tried to think a little bit one step beyond “this is fine”. I tried to think “is this really the best way? Can I justify this if I have to explain it to this person? Look, I did everything as well as I could because of x, y, and z”. And it turns out that if I asked myself why I was doing this, I didn’t really know exactly why.
I was doing things, but I wasn’t feeling comfortable, strong enough in my understanding of these things. Then I took sort of a break, maybe close to a year, where I didn’t do a lot of new research. I was teaching at the time as well, but I didn’t do much research. However, I really learned a lot of stuff. Now, I’m benefitting so much from really investing some time to learn new things and I wish I had done it in the first year. I thought that maybe I should dedicate more time to learn new stuff rather than doing new research. I always had this feeling, sort of in the back of my mind, like “hmm, I think I kind of know what I am doing here” – this sort of uncomfortable feeling of slight uncertainty. And now, I’m trained in a way, basically. Now I pretty much know what I am clueless about as well, right – there is a lot of stuff I don’t know enough about to do – but I also know that it is dangerous to then just meddle around with it. Now I know that the stuff that today I understand, I look back and know that I used to meddle around in it and I see that I wasn’t doing a good job. So, it really helps me to know like “Okay, I know this stuff pretty well; I also don’t know other stuff at all”, and it’s really valuable to do that. You have to do that sooner or later. You will be confronted with this. And not having this insecurity anymore about certain things is a relief. I don’t know if you feel this, but I felt it. It is horrible to have the feeling of “I’m not sure about what I am doing here” – which I think is weird. As scientists, we should have the time to just really dive into something, especially if you use it a lot. You can’t be an expert in all quantitative analyses or other stuff, but the things that you use most you should feel comfortable and confident. It is also possible to reach that. You just need to commit a bit of time. I would really recommend doing that: figure something out that you use until you really feel confident. 
It’s a much nicer feeling than simply wandering around like “I guess I sort of know what I am doing”. If you have that feeling, commit some time to read and study, and you will feel a lot better.
I’ve got a follow-up question, if I may: When you say “knowing stuff”, what do you mean by that?
I did this replication study, and I had to do a power analysis and I had no clue about what it was and how it worked, and I had to calculate effect sizes. I had no real clue. Then ended up writing a paper in 2013 about effect sizes which is now cited a lot and I think a lot of people find it useful, because it was written for my past self on what would I have needed to understand this when I was working on it two years back in time. So I looked at the paper of the researcher I was trying to do the replication for, I had to calculate an effect size and I said “I’m not really sure. I think I need another number to be able to plug it in the power analysis”. That’s not really a very good level of understanding. Then the researcher said “Well, I also typed in some numbers in this online calculator and I get this number. Is that the number that you need?”. Then I thought: “Now, there are two people just, sort of, blindly doing random stuff without any understanding – this is not going to work for this replication. We need to do better than this.” That is the level of understanding I am talking about – it is really pretty clear, like I am not really sure how this works but I know that if I check these boxes then I get a number, but I didn’t know anything that was going on behind that. That level of understanding. There are different levels of understandings and they are like layers. There are layers, below layers, below layers. There is a level of understanding that is like “Okay, if I do this, I get this number, and that is what I am supposed to do”. There is another level of understanding that is like “Okay, I know why we do these kinds of things, what is the logic behind it, what is the theory behind this”. And then there is the math, and you can go deeper and deeper, like “Why these are the assumptions of this test, what are the foundations of probability”, and so on. 
What I am talking about here is a little bit below the surface level, like “I know why we do this and I know why I wouldn’t do other things”. You don’t have to become a statistician or an expert, but if somebody asks “why do we do it like this?”, you have to know. Yesterday on Twitter there was a very nice question, somebody asked “Why do we use the standard deviation and not the standard error around the median?” I don’t know, I just assume that the standard deviation is a good thing to use. But then, some other people understand and can explain this. This is the level that I would have to look up. I don’t understand some things. You have to start from somewhere.
I have a second follow-up question, if that’s okay. What percentage of all these things did you learn by yourself, and how much were you taught?
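As an illustration of the "one layer below the calculator" understanding described above, here is a minimal Python sketch of the two quantities mentioned in the answer: an effect size and an a priori power analysis. It is not taken from the interview or from Lakens's 2013 paper; the function names `cohens_d` and `n_per_group` are hypothetical, and the sample-size formula is the textbook normal approximation for a two-sided comparison of two independent groups, which gives a slightly smaller number than an exact t-test calculation.

```python
import math
from statistics import NormalDist, mean, stdev


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-group test.

    Normal approximation: n = 2 * (z_(1 - alpha/2) + z_power)^2 / d^2.
    This is an assumption made for illustration; dedicated power software
    works with the noncentral t distribution instead.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)
```

Read this way, the "number to plug into the power analysis" is the standardized effect size `d`: halving the expected effect roughly quadruples the required sample, which is the kind of logic the answer argues a researcher should be able to explain.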
Oh… well, in my case I am reading papers written by other people; sometimes they are educational papers, not lectures; I didn’t get lectures on these things. So I came to Glasgow to teach a course, I also have an online course, the one you told me you did, and that is exactly the stuff I wish I had known. So there was nobody really teaching me any of these things, and it’s weird. For example, we recently wrote a paper about equivalence testing where you can test the absence of an effect. For me, this seems like something I have been struggling with since I don’t know how long – “How do I interpret a nonsignificant result? What am I supposed to do with it?”. Now I realise: “Oh wait, you can just test not if it is exactly zero, but if it is too small to matter”. And that is the question that I sometimes have. It is exactly the same statistics, it’s a t-test but just against a different value, so you need no other understanding of statistics; super useful. And I just don’t know why you are not taught this stuff. I really wonder why this is just not taught in an introductory course to science? Another thing that most people don’t know – and I didn’t know as well – is that p values are uniformly distributed if the null hypothesis is true. The first time somebody taught me this, I was like “no… that can’t be right; that doesn’t make any sense. That is supposed to be really high if it’s nonsignificant”, I basically had no clue. And it just takes 10 minutes at most to mention it; it’s so important. So yes, I don’t know why we are not teaching these things. I have the feeling that I have to learn a lot of things. It is not that the information was nowhere.
But actually, papers about why p values are uniformly distributed are very mathy and stats orientated, they are not written for the general public, which is why you now have journals that are trying to do this, such as Advances in Methods and Practices in Psychological Science, a journal that aims to educate Masters and PhD students – that is the level at which you should write your paper when explaining something new. I think that, as a field, we are realising that we need more of this kind of educational material because otherwise you are not going to learn, as it takes too much time. So I had to learn a lot, a lot of stuff, and I am still learning a lot.
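The two statistical points in this answer can be checked with a short simulation. The Python sketch below is an illustration under stated assumptions, not the procedure from the equivalence testing paper mentioned above: it uses a z-test with a normal approximation instead of a t-test so that it runs on the standard library alone, and the names `null_p_values` and `tost_z` as well as the equivalence bounds passed to it are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev

Z = NormalDist()


def null_p_values(n_sims=5000, n=20, seed=1):
    """Two-sided z-test p-values when the true difference is exactly zero."""
    rng = random.Random(seed)
    se = math.sqrt(2 / n)  # both groups are simulated with a known SD of 1
    p_values = []
    for _ in range(n_sims):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        z = (mean(a) - mean(b)) / se
        p_values.append(2 * (1 - Z.cdf(abs(z))))
    return p_values


def tost_z(a, b, low, high):
    """Large-sample equivalence test via two one-sided tests.

    Tests the observed difference against each bound instead of against
    zero, and returns the larger of the two one-sided p-values. A small
    value suggests the difference lies inside (low, high), i.e. it is
    "too small to matter" for bounds chosen in advance.
    """
    diff = mean(a) - mean(b)
    se = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    p_lower = 1 - Z.cdf((diff - low) / se)  # null hypothesis: diff <= low
    p_upper = Z.cdf((diff - high) / se)     # null hypothesis: diff >= high
    return max(p_lower, p_upper)
```

With the null hypothesis true, about 5% of the simulated p values fall below 0.05 and about half fall below 0.5, which is what a uniform distribution implies. The equivalence test reuses the same machinery and only changes the value the difference is compared against.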
What advice do you give to yourself on difficult days/ Key tips for getting through difficult times in academia?
I get a lot of support from non-academics, to be honest, as these people make things more relative. My wife and I have been together for 14 years, we met just before I started my PhD, and she is not an academic and it’s lovely because she can make things more relative. We are in this academic world where we think “oh, this matters so much”, and certain things become really important. But then, I would come home and I would say “Oh, my papers were rejected, etc.” and my wife tells me “Yeah, it sucks. What are we having for dinner? Stuff happens, but it’s not the end of the world”. She is very good at putting things into perspective. And even now, very often I would get an email from somebody saying things like “Hey, you made this R package and I think there’s a bug in it”, and then I would go like “Oh no, no!”. Then my wife would go “Yeah, that’s how things work, people make mistakes. What do you expect?”. But it’s so weird that in the culture of academia, certain things feel like big things, big failures, big problems. I really benefit from her outside perspective. It’s not that she doesn’t know that it’s a bad thing; she understands it. She basically puts things in perspective: there’s also the normal world, and that’s okay. For me that is very beneficial.
You have been considered as one of the main references of the Open Science movement. What are the main challenges in the implementation of Open Science practices within academia?
Science is really broad and I think this is an issue; it is now one of the challenges. We shouldn’t have a system where certain people are supposedly the main figures of something, that is not how it works. Everybody does their own stuff. Sure, some people spend more time to think about something and may educate others, or suggest ways to solve a problem. But in my experience that is all fine, but these things happen at the level of your research lab. So you are basically the person implementing this. So the challenge is: what can you do? Can you actually use these recommendations that people thought about? Or is your field different, more difficult? The translation into practice; we know that some things have to happen – that is sort of clear now – people wrote a lot about this. So the real challenge is this next step, you implementing stuff in your research: what can you implement? So I think it always comes down to it. You can have in theory, but things get complicated when you try to apply and implement them. When I give workshops I ask people what are their practical problems, what are their real-life limitations, because these don’t end up in a formula where it says like “for this power analysis you need seven thousand people”. The thing is: what you can really do given the situation you’re in – that is the challenge. If there is change, it’s because people manage to implement some of these things into practice. That is difficult.
In your open online course, you dedicate a large part of it talking about the importance of good theories as strong priors for hypothesis testing.
I appreciate that you noticed the importance of good theories as strong priors for hypothesis testing. Sometimes people say that a lot of what we do is basically preventing people from p-hacking, you know, that everything is about the p-value. I think you are right: a lot of it is about the construction of your questions, like “what are you trying to do?”
In suicide research, a recent meta-analysis of 365 studies (analysing a total of 3,428 risk factor effect sizes for suicide) from the past 50 years concluded that prediction was only slightly better than chance for all outcomes, demonstrating an urgency to shift paradigms. What advice would you give to those who are interested in building a strong theory of suicidal behaviour?
That is a very good question and very difficult question. I think we still don’t know much about these things, that is why it is so interesting. These are interesting challenges. The methodological part is always a bit clearer, and I think that that is why a lot of times we spend in figuring out how we should do certain things – it is still not easy but it is mathematics and statistics. I’m a big fan of the work by Klaus Fiedler on theory formation. I think there’s too little work on how we should refine theory better. There is an important paper by Klaus Fiedler entitled “Tools, toys, tenure, truisms, and theories: Some thoughts on the creative cycle of theory formation” – I highly recommend reading it. In the paper, he makes important observations; one is you need times where you’re loosening your research line – anything goes, you try to get creative ideas flowing and anything is fine. And there is the period when you need to really test which of these things work and which don’t. So I think we have too many theories and we don’t drop them somewhere. People keep trying to save their theories even if they don’t work; there is too much commitment. And another point that Fiedler makes in this article, which I think it’s very strong, is why we are doing this: people are too individually committed to their theories. Then, what he says what we need is pluralistic endeavour where we all take theories of other people and really test them, and I will help you with this, I wouldn’t be like “oh are you trying to destroy my theory?”, I should be like “okay, let’s do it. Let’s test what is going on, how can we do it?”, in a collaborative way, but you might be more critical or you might have different viewpoints… really testing theories in a pluralistic endeavour is necessary. We detach things a little bit from the individuals, put them to the community to test, preferably collaborative. 
There are now very interesting things happening in this area, I think that papers are not out about this, but people are now working on getting groups of experts together, even with conflicting opinions, and not just doing replication studies, but also thinking “How can we test this theory? What can we do?”.
Klaus Fiedler has a very nice paper coming out of this where he had a theoretical idea and he basically gave it to researchers and the community and said “Try to tear it down. I think this is a strong theoretical idea, but you try to tear it down”, and it led to a very good discussion in his field. I think his theory was reasonably successful, but nevertheless I think it’s a really good approach. So, we need something like that to move it forward, because there are so many fields with what they call “the undead theories” or “zombie theories” – they just keep wandering around, they don’t die, we don’t move on. One would say that improvements happen one funeral at a time in science. Somebody has to die and then the theory dies, which is not productive, of course. I think there is nothing specific to suicide research here; I think this happens across the board, but if you know that it is happening, I think it’s worth thinking as a community and asking this question: “can we get together and see what we can do about this situation? What would be necessary to move forward?”
Just a follow-up question on this: some researchers have been saying that there is a neglect of qualitative research, and it’s being claimed that qualitative research would help to improve thinking about phenomenon and forming stronger priors through listening to people about their experiences. What is your view on this?
On the neglect part and underappreciated part, I can only share my own experience, which is that I think this is true. Anecdotally, I come from a background which was strongly experimental, and I still do it, I still like this field a lot, and I’m not very good at qualitative research, but when I moved to the department where I work now, there were many more people doing more qualitative research, many times more applied to specific fields, at least starting with sketching the landscape, like “what is going on here? Can we get a feel for the themes that we might need to examine if we really want to make a change somewhere?”. So, my wife tells me that apparently the first couple of months I would walk around and I would say things like “Urgh, these people do qualitative research”, I didn’t really like it – maybe she exaggerates a little bit, but apparently I was really brought up in a culture where qualitative research was not valued a lot. And now, being there for a while, I value it a lot. I am now suddenly like “hey wait, you are telling me that instead of asking my participants to press buttons on a keyboard and then trying to figure out what they are trying to do I can also just ask them afterwards, I can just combine this?”, such as “what were you thinking when you were doing the research task?”. I can just ask, it’s useful information. So, I don’t think that anything is a single solution, but if it’s not done enough, it makes sense to do it. I could also make the argument for some crazy machine learning approach where you just throw in only data, without a theoretical hypothesis, and ask “okay, are there patterns that pop up here? Is there something that we think we didn’t see but that pops out from the data that has better predictive power than this coin flipping that you mentioned?”, okay then go for it.
And again, this is, I think, very important in this sort of more loosening stage that Fiedler also describes: we need all of this, it is where we are going to get our ideas from, otherwise we are stuck in certain paradigms.
Qualitative data can really nicely pump you out of this sort of train track you’re on that is not going anywhere and be like “hey wait, wait, what am I seeing here?”. Machine learning could do the same or whatever other approach. So, yes, I think it’s important to do. I think successful careers in the future – if you combine both, if you’re good at both – I’m not, I’m only good at one – but people who are very good at both are rare, very rare, too rare, those who can see the connections between the two. If you like it, go in depth and learn about that. You can also learn about this sort of mixed approach to research, I see that it’s becoming popular. I wouldn’t know if you can do only one – If you only do qualitative research is fine I guess, but then someone else needs to, at a certain moment, we do need to predict something, and prediction has to happen in some sort of quantitative skills somewhere. So it’s fine if there is someone who does all the qualitative work but, again, and probably in a team you need to know how stuff will be used, whether it’s useful, knowing what you’re doing. If methodological fields don’t communicate with each other then there is a problem. Across disciplines, across anything, even across methods. If you have key people sitting on the fence between these two communities, these can communicate, it’s important for all sorts of things. You can be a physicist and a psychologist as random examples. So people will know something about two fields, that’s important.
The field of psychological science agrees about some behaviours that it universally condemns and punishes, such as fabrication of data. However, we seem to be relatively lenient on things like questionable research practices such as p-hacking. But both behaviours produce identical outcomes: junk empirical findings. Do you think intentional p-hacking should be treated more equivalently? Should fabrication of research *findings* be condemned regardless of the mechanism by which it occurred?
I do really think a lot about that, and discuss it a lot with colleagues. So, you are basically talking about norms, scientific norms; what do we consider acceptable and not acceptable? And it will change, these things change. Fraud is easy, it’s always intentionally misbehaving, it’s pretty clear that fraud is a bad thing. The question is where do we move the line towards? You are talking about p-hacking, but one of my main concerns is publication bias, so intentionally not publishing nonsignificant results that, if they were statistically significant, you would have published. It’s not that it is a bad idea or a badly performed study, those we can discuss whether we should publish or not or how we should share them. It is really something that you thought was good, but just because it doesn’t confirm your hypothesis you don’t publish it; and that happens. A lot. We know it happens, we see it around. And here, interestingly, if you ask the general population whether this is problematic, and they did it in the US, they asked: “General public, what do you think of the scientist who selectively reports studies that worked?”, which is just what we do; 50% of the public said “they should be fired”. They should be fired! Now one could say “oh, that’s pretty extreme”. But the question is there are two huge conflicting norms: the general public might be too naïve, maybe they are like “okay, they don’t really know how it works, that’s too extreme”. But I think we are very much failing in recognising how problematic what we do is. Maybe we shouldn’t be completely fired, but it is a much bigger deal than we make of it. It’s crazy! We are doing stuff, investing a lot of time and effort, and then we selectively report stuff? The literature is a mess, p-hacking is a mess, but publication bias can be a much bigger mess. Say, you don’t p-hack but you publish any fluke that comes out and nothing else?
Then you get a literature that is full of flukes and it’s very difficult; people are completely misled. If 200 studies are done it will all of a sudden look like “hey, but there are 13 studies showing this”, but it’s nothing. So, it can be really really problematic, and I am not sure what will happen, but I just see this huge division between what the general public thinks is okay and what we apparently think is okay; and somewhere someone needs to be wrong. Either we should be fired for this behaviour or it is acceptable and we have a good reason for that. Now, I don’t see a lot of people putting out there good reasons for why we do this publication bias. I think it’s extremely problematic.
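The "200 studies, 13 significant" remark can be made concrete with a small calculation. The sketch below is an illustration only: it assumes a 5% significance threshold, independent studies, and no true effect in any of them, none of which is stated in the interview, and the helper name `prob_at_least` is hypothetical.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_studies = 200  # studies run on an effect that does not exist
alpha = 0.05     # assumed false positive rate per study

# About 10 "significant" studies are expected from chance alone.
expected_false_positives = n_studies * alpha

# How often would chance alone produce 13 or more of them?
p_13_or_more = prob_at_least(13, n_studies, alpha)
```

Under these assumptions, roughly ten false positives are expected, and seeing 13 or more is not an unusual outcome. If only those studies are published, the literature shows a consistent-looking effect that the full set of 200 studies does not support.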
The question is: should there be consequences now? And that’s a weird thing because it’s sort of going back in time. For example, thinking of men who thought that it was sort of okay that women couldn’t vote in the 1830s. Do you think these people are wrong? Should they be punished or something? In that time women couldn’t vote. Now you would be like “it’s pretty crazy, right? We wouldn’t accept going back to that stage like that”, I would hope. But there are other things that take a long time. Gay marriage is another example. 30 years ago it was definitely not the majority of people who thought that this should happen. Now, not in every country but in many countries, you see majorities thinking “yes, this is okay, this is acceptable”. Again, I think, something that hopefully we don’t really go back on. So there are norms that change, I think you could even call them progress to a certain extent. This would be the situation where I hope to see scientific progress, where at a certain moment we will say something like “imagine! There were people 30 years ago who were doing science and then they just hid all the stuff if it didn’t work”. Should we be punished now for it? Yes, society makes the laws; women can vote or they cannot and we update stuff. It’s the same in science. It is difficult now to say “you should be punished personally because you p-hack”. Is it problematic? Yes, of course, because otherwise we would not be discussing now whether this is problematic or not. Of course, we already feel that there is a little bit of the thing, but we are sort of clinging on to “yeah, it’s not too bad”. I think it is too bad! Yes, I think some things are too bad! And I really hope that in 20 years we will look back and be like “well, I did this. I’m sorry, I did this”. I think many people are well intentioned. Okay, there is intentional p-hacking, there are always people who cheat the system.
But if you change the norm, you want the majority to think “damn it, I did this and I now realise this is problematic and I shouldn’t do it again”. I have this about my own first paper: it is p-hacked. We admitted it. There was a meta-analysis, and we said “we don’t believe this effect anymore, we selectively reported, we admit that we did it, we realise the problem that it causes now, we wouldn’t do it again, but at the moment, we were too clueless to do anything about it”.
So norms do change. My personal norms change, the norms in the field will change. So should we be harsh now? I think it’s also risky because you will start to judge people. I can imagine that if you are a young person entering the field and if you look at what I did in my first paper you would be like “man, what kind of person are you?”, being very judgemental about what I did. I can understand it because if you enter the field naïve, young, you didn’t know what the norms were in the past. But I also think “I don’t know what I could have done”. Norms change, should people be punished? There has to be some transition, and the transition should be a little bit slow, because we cannot start now to say things like “oh, we should punish you for what you did then, you don’t get this anymore”. Changing norms takes a bit of time. I think sometimes it’s not acceptable, we know where the standards have shifted towards. But I think you can also say “yes, you did this, it’s not optimal but it’s not that you should be punished for it now”. And the question is where we are now; many people are still not taught about this. So someone publishes a paper, it’s clearly p-hacked, and what are we supposed to do? We should educate them. I think this is more a phase of education than punishment. And eventually, we will draw lines like “now we expect everyone to respect the norms”. I’m not sure if we are there yet. It’s a complex and long enterprise.
And where do you think these educational boundaries should come from? Universities, journals, publishers, funders?
This question I think it’s more controversial than the previous one, because the previous question is pretty clear and fair question. The controversial thing is how do we solve this, and it depends. I think this is a social dilemma to a large extent, this is a social dilemma. I think many understand the problem and would want to do better. However, the dilemma comes into place like “yeah, if I am the only one doing this, I will lose my job”. Then, researchers will just not do it, and they will keep a job or get tenure and, you know, it is already going much better for them. I think the main drivers of these social dilemmas are people who are outside of the system, independent enough not to be a player; not to be part of the social dilemma.
If you look at something like open data, for example, it takes time to have to upload your data, plus it is more likely that somebody will spot an error: all negative possible consequences. Hence, individually one could think: “do I really need to do it? Well, nobody is forcing me so maybe I won’t, because ‘how much do I get?’, ‘how much can it cost?’”; the risk is too high. It is a reasonable, individual evaluation. But a funder says: we fund research; if data is available we basically get more research, we get more knowledge from the same money – so make your data open. Then people say “but it might have negative…”. Yep, it might have negative consequences. Life sucks. We have to do it. That’s what we do. So I think some of these changes have to come from high up because we are not going to fix it ourselves. We are going to take another 50 years before we – I don’t know what is needed to break through a social dilemma. I very often use the example of light bulbs in Europe, where you have energy efficient light bulbs – they are more expensive – people wouldn’t buy them, and then the EU just said “Okay, we are banning these energy inefficient light bulbs. You can’t buy them anymore – you all have to move to this other thing. It is better. It is even cheaper for you in the long run, and it is better for the environment – you have to do it”. Yes? Right. So the social dilemma is sometimes solved by a rule. It is not crazy. I think sometimes a rule helps – so, journals can say that if you want to publish you have to justify your sample size; you can’t just say, “oh, we did something”. No, that is no longer allowed. Put it in there, tell us what you want. And we have to educate, of course, we have to be prepared to do it, but some rules help, I think. That’s my personal thing; I think it helps.
It is common for many PhD students to be frustrated because they did not find statistically significant results. Could you tell us how Equivalence Testing may help us in this context?
This happens at such an early, early time. Like nowadays you can see bachelors students being disappointed because their study didn’t work and it is only interesting and nice if it worked and not if it is a null effect. At the same time, if people are educated a little bit about reasonable expectations, null effects – first of all they happen all the time, because of course not all our ideas are good else it would be too easy. If you want to never find a null effect you study the Stroop effect again and again and again because we know that is true. So null effects are very common. The problem is we don’t see them. So would Equivalence Testing help? Yes, and I will briefly say how, but I think first of all we need to make it acceptable, not acceptable but make people realise that null effects are super common; they happen all the time. Okay, if we then understand that they are supposed to be very common then we want them to be informative, right? So the problem is if you find a null effect and you don’t really know what to conclude, then it feels like a waste of time. And because we don’t design a study to be informative regardless of whether there is an effect or null effect, we end up with these situations where we have null effects which are not exciting – because they are uninformative – and only the significant effects are considered informative, even though the null effects are a mix of false positives, true null effects, so there was nothing, but also a lot of false negatives – that you didn’t have enough power, that there was an effect but you missed it because your sample size was small, it happens all the time as well. Now an Equivalence Test helps because it forces you to think “what would I actually care about? When is it an informative absence of an effect? 
When can I conclude, well, I do suicide research, if this intervention is worth it – because it will cost money and stuff – if it is worth it we think we need to see some effect in the world that has a certain, noticeable size that is worth the cost”. That would be one way. There are many ways to think about what matters, but practical significance is important for some fields, right? Does this matter in practice? Will we see an effect asking people to do this? So then you determine this thing, this boundary, like okay it should be noticeable to this extent. And if it is smaller than this, sorry we have learned that this is not a route that we can take to make things better. So, if you learn about an Equivalence Test you set these bounds, you are able to say “this is too small to matter”, you can design a study that will always yield an informative result. So you can go to a journal and say “hey, I’m going to collect a lot of data here for you and I will tell you this matters to a noticeable extent – there is practical significance – or, the effect is too small to matter, in practice. Are you interested in me collecting data and telling you which of the two it is?”. And then the journal should say “Yes, that is interesting”. The problem is, of course, without this intentional Equivalence Testing we are submitting studies and then with the null result we go to the journal and say “well, I have some data here and yeah, I’m not really sure – it might be that there is something but my sample size is too small or it might be that there is nothing” and then the journal says “yeah, and go back and collect some more data. Whatever. Do something else, right.” So, being able to design an informative study hopefully – hopefully – improves our ability to publish informative null results, reducing publication bias and making it less horrible – it happens; it should be interesting. Make null effects more interesting.
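The idea of setting bounds and then asking whether an effect is “too small to matter” can be sketched in code. One common way to run an equivalence test is the two one-sided tests (TOST) procedure. The sketch below is an illustration under stated assumptions, not the procedure used in any particular study: it works from summary statistics for two independent groups, uses a normal approximation that is only reasonable for large samples, and the function name `tost_two_means`, the bounds of ±0.2 and all example numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def tost_two_means(mean1, sd1, n1, mean2, sd2, n2, low, high, alpha=0.05):
    """Equivalence test (TOST) for a difference in means, normal approximation.

    low and high are the equivalence bounds: the smallest differences
    (in raw units) that the researcher would still consider meaningful.
    """
    std_normal = NormalDist()
    diff = mean1 - mean2
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)

    # One-sided test 1: is the difference reliably above the lower bound?
    p_lower = 1 - std_normal.cdf((diff - low) / se)
    # One-sided test 2: is the difference reliably below the upper bound?
    p_upper = std_normal.cdf((diff - high) / se)
    # Equivalence is claimed only if both one-sided tests reject.
    p_tost = max(p_lower, p_upper)

    # Ordinary two-sided test against a difference of zero, for comparison.
    p_nhst = 2 * (1 - std_normal.cdf(abs(diff / se)))

    return {
        "difference": diff,
        "p_tost": p_tost,
        "p_nhst": p_nhst,
        "equivalent": p_tost < alpha,       # effect is too small to matter
        "different_from_zero": p_nhst < alpha,
    }


# Hypothetical large study: tiny observed difference, bounds of -0.2 and 0.2.
large = tost_two_means(0.02, 1.0, 1000, 0.00, 1.0, 1000, low=-0.2, high=0.2)
# Hypothetical small study with the same observed difference.
small = tost_two_means(0.02, 1.0, 20, 0.00, 1.0, 20, low=-0.2, high=0.2)

print("large study:", large)
print("small study:", small)
```

With these made-up numbers the large study supports the conclusion that any effect is smaller than the bounds, while the small study is neither significantly different from zero nor equivalent, which is the uninformative null result described above. For small samples a t-distribution would be the more appropriate reference distribution.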
What do you look for when hiring early career researchers?
So, I’ve hired 5 people in my career, right. Two PhD students are done, two are almost done, and one postdoc has been working with us for three years. So, I have to say something about these people, because that is my experience, so I had better say something nice! They are all great and I think I made a good choice in choosing people. So what did I think about? So one of the questions I ask in a job interview, which I thought was pretty smart, I like the question, for example I was hiring on this grant proposal, I had sent it to them so they would know what they would be working on more or less, and then in the job interview I asked, “what do you think is the weakness of this grant proposal?” Like “what are the limitations?”, “what are the weaknesses of this grant proposal?”. Not just to randomly criticise it, but to be critical – to say, “hmmm, well to be honest I think that this, I think that you missed this, which is really important, and I am not really sure how this could ever lead to something insightful. I think you need to look for more about this”. That would be an answer, for example. And the grant proposal was funded, right, so it wasn’t crap but nevertheless being able to have some sort of good, constructive criticism – and also telling me about it – because I am a supervisor and if the people that work with me feel that they can’t criticise me that’s not a good thing because I will need criticism. I will always do stuff wrong. I think it is very important to have a team that feels very comfortable criticising the mistakes that I make, because otherwise it is problematic, right; they are going to do stuff that is a waste of time that they could have told me and they will do stuff that they don’t believe in. So, I ask this question and I think that’s really an important thing – having a decent level of, sort of, a critical attitude.
Not trashing things down, right, because that is something that also happens, that people would be like overly critical, but you have to have some good balance of reasonable criticism, like “yep, this is a smart thing you just said. This is right, this is a limitation”, and feeling comfortable voicing it – I greatly value this because then I am slightly more sure that we are not going to mess up stuff, that I get some push back sometimes. I want this, that people say, “yeah, I’m not sure that this is a good idea” and I feel that the team that I have does this pretty well, there are differences of opinion about things and that is useful. So I greatly value that; a good, critical – not overly critical but you know a smart sort of slight push back where it is appropriate – I think that is very nice because it also shows that you are capable of good, independent thought, you know enough about something to be able to voice some criticism, if it is good, if I have to take you seriously. If you say something and I have to take it seriously then I am impressed, because otherwise I could have done it myself – I don’t need you, but if you are able to do that, that is really nice – very good. So that for me is very valuable, that people voice criticism, much more valuable than other stuff. So, I hire people, like the postdoc I hired, I read some of his papers just to see what he is working on, but I already knew, you know, I knew what his interests were, and I thought he was saying sensible things about it, and I’m not even sure if he has a very good resume or not in terms of publications, in truth. He’s already started, I shouldn’t say this, but I’m not really sure if he has – I think it is fine, it is probably good, but I don’t care about it. I just know that he is saying smart stuff so things like publications, I find it relatively uninteresting, especially at this young career stage – it depends so much on how many resources you had, did you work in a rich lab?
Then good for you, you could do a lot of stuff, I understand. Did you work in a not rich lab? It had nothing to do with you. So, I don’t look at those issues, like “yeah, what is your output?”. I want to know, “can I see that you have good ideas?”. That I think is really important, and that is also up to you, right. You can’t determine how much resources you had to run a lot of participants or collect a lot of data or do stuff, but yeah you can think about stuff and say something that is sensible and smart, that benefits me in terms of that I learn something. So I think that is really important.
What’s essential to your well-being? Although our research is highly important, all work and no play is not good for the soul. Please share what you get up to in your free time.
Essential to my wellbeing is just my, we don’t have kids, but my family. We are married, we have a dog that is a little bit sort of a family member. But that is essential, like really, really important. Because I work a lot, my wife also works a lot. We really like what we do, but I think that is the most important thing. Like really hugely important, to be honest, hugely. Now I am away for a week and my wife is not around and I feel that I just, I don’t sleep enough, I keep working too late and doing stuff. I like work, it is not a problem, but I need somebody who really makes sure that I take care of myself – I do the same for my wife of course and we take care of each other. It is immensely important, I have to say, I don’t know how single people do it because I think it is tough. Hopefully then you have good friends or something who play this role. Social network, it is the obvious answer but I think it is important; impossible not to appreciate. I know people always say that when they become a professor at the end of the day, in their speech they always become emotional and say “I want to thank my family”, but it is so important.
Is there anything else that you would like to say?
I think it is nice what you mentioned about netECR – people coming together as young researchers. I mean, I don’t know if there is room to say that I really think that those things are very important. That is what I want to add. I think initiatives where young researchers come together and, early on, find ways to collaborate and exchange information, I think it is useful. If I think what science is going to look like in 20 or 30 years I hope larger teams of researchers within disciplines cooperating and fixing really big problems that are just going to be too big for single people to solve. The individual led approach is all nice but it is running against limitations, especially for certain research specialities – not for everything – but for certain research questions we will have to move to these collaborations and people who know each other from early on and then will rise through academia and have this network is going to be very important. Even just very straightforward for grant proposals in 10 years – you will be very happy that you can put 17 other names on there like “these are the people; we are all going to do this together”. I think in 10 years a grant proposal like that will actually make sure that you get the money, it is going to be more competitive, but we will see challenges are not solvable by single people, we have been trying this for a long time, like you just mentioned the theories – the predictive powers are not very high – so we have to a little bit rethink where we move to and then having a good network of cooperative researchers in a subfield is going to be very important – for the field as well, to get funding, I think. Other fields that are better organised are otherwise going to go with these things. It is a random prediction and in 10 years we might laugh at that statement, but it is now what I think, so I think it is very good to have a network like this.

This interview was conducted by Tiago Zortea (@zortea_tiago). Tiago is a trainee clinical psychologist at the University of Oxford, and an honorary research fellow at the Suicidal Behaviour Research Lab at the University of Glasgow, UK.
Acknowledgement: Many thanks to Emma Nielsen and Kirsten Russell for assisting with this interview’s transcription and revision.