Migration Evidence to Support the Use of Patient-Reported Outcome Measures

July 22, 2019


Serge Bodart

That was a good summary of the guidance and the current literature. One thing that I wanted to stress is that what we do is faithful migration. And it’s not only for BYOD, it’s also for provisioned devices. And we have done this for years, actually. So faithful migration is the implementation of the instrument on different modes of data collection in a way that does not bias responses, that’s exactly what this is about. We are making sure that when the instrument is presented on another device, the patient will still answer the same way.

I’m not going to repeat all this, Willie did a great job at summarizing this.

So for you, what should you do if you have a trial and you are using instruments, what evidence should you produce when you do migration? Usability testing, cognitive interviewing, equivalence testing? Maybe nothing? Full psychometric testing, expert screen reviews? Well, a good place to start of course, as Willie indicated, is to start with the best practices and make sure that you implement the instrument in accordance with those best practices from the ePRO Consortium.

We have four case studies for you. We want to make you work a little bit this morning. The four case studies are about migrating instruments, and we are going to ask you what evidence you need to produce to use those instruments in a clinical trial. And the first one is, let’s say you have the SF-36, you want to use the SF-36 in your trial. You see here the current evidence that has been produced already and that you can find in the literature: there are seven studies with Gwaltney, nine studies with Willie Muehlhausen, three studies in meta-synthesis. We learned about the equivalence of verbal rating scales. There is also a meta-analysis of 25 studies produced by White. And there is also unpublished cognitive interviewing and usability testing because, as you know, the SF-36 has been used quite often. So, according to you, is there enough evidence to not require additional testing? And if so, what conditions would be required for this to be the case? And how would you package the evidence to support migration comparability?

So what I want you to do is maybe discuss that at your table a couple of minutes, and then I’ll ask some comments from the different tables.


Okay, maybe we can—somebody who volunteers to answer the question? Maybe some of you want to answer the question? There is no right or wrong answer here, just a decision about what do you think you should do to use the SF-36 in your trial.


We were just saying that it would be important first of all to make sure that it was the same context of use, so the same kind of patient group and population that it was being used in. And also, we were thinking that usability testing would still be required if it was being migrated to a different type of device. But also the expert review that was mentioned previously, that would also come into play. But yes, I think the key condition was to make sure that it’s actually the same patient group for both types that were being used in the studies. We didn’t quite get to the third one, but those were kind of the things that we were thinking.


Okay. Yes, as I said, there is no right or wrong answer here. This is what we would recommend. We believe that there is enough evidence in the literature, it has been used many times. This instrument uses a verbal rating scale, so definitely that has been tested and proved equivalent. We believe that there is no additional testing required, provided that good practices have been applied, of course. You have to make sure that your vendor knows those best practices. And then the SF-36 is generally not used as a primary endpoint, it’s more of an exploratory endpoint. And keep in mind the PRO guidance applies to instruments that are submitted for labelling claims, so it’s not always necessary to provide additional evidence. In terms of the package, we talked about expert screen review, definitely a good thing to do, that’s a good practice I would say, to have an expert reviewing all those screens. And potentially an author review as well. Some of the authors are asking to check those screenshots. I think the SF-36 is one of them, but also the EuroQol EQ-5D-5L, for example: EuroQol requests to review the screens before you can start your trial.



I might just pick on Michelle White, because I know one of the articles that you refer to, that’s Michelle White’s meta-analysis. And obviously working with Optum, you might have a position on this, because we picked on your instrument, not deliberately but it’d be interesting to hear what you’ve got to say.


Well, thank you, it’s an interesting question and really hard to think about in a small amount of time, because first we sort of skip over the whole question of which parts of the SF-36 we mean. Is it the domains, the summary measures? Which parts are actually the endpoints that are important in the condition or context of use that you’re working on? So let’s just assume for this example that it is appropriate. And in that case, I would argue, based on all the meta-analyses, that this is correct: as long as good practices have been applied in the migration, you wouldn’t need to go back and do an entire study again. Although pretty much anytime it’s a new implementation with new programming that hasn’t been done before, you would probably want to do usability testing. And there is a section in a dossier, if it is used to support a labelling claim, that would be on the mode, and you would have to speak about what you did to show that it would be equivalent. But yes, otherwise I think your advice here is really good.


Thanks Michelle.


Thank you. Case study number two. This is an instrument that is used in different populations. The current evidence that you have, you have equivalence proven in population one, and we’re going to use that instrument with another population. And so, can you provide examples of populations that would not require additional evidence, or examples of populations that would require additional evidence? So, same instrument used in two different types of populations. Again, no right or wrong answer here, just think about it a couple of minutes maybe.


And I think you might want to think about not just measurement equivalence, but also usability, those two aspects.



Okay, we’re going to try to save a little bit of time, so let’s try to find out what this would require. Examples of populations that would require additional evidence or not.


Certainly we agreed that for different populations and different cultural backgrounds, we absolutely need to do some sort of additional testing and cultural adaptation. And then we were discussing what evidence would not be required, and we agreed that there are certain similar TAs where we might not need to do more testing, but I think we all agree that that might not be the rule. Certain TAs have similar characteristics but are still very different: for example, a pediatric study and a geriatric study, while they’re both proxy roles, are very different patient populations. So there are very limited areas where you would not require it.


Yes, absolutely. So I think that’s a very good example, pediatric trials versus geriatric. We are talking here about patient characteristics such as cognitive impairment, dexterity, vision impairment. Usability is not necessarily related to the disease. We are asking that question about therapeutic area, but you can use the EQ-5D-5L for diabetes populations or respiratory trials. It doesn’t matter, that’s not the key aspect. It’s going to be more the patient characteristics that will drive this. Another example: if you use an EQ-5D-5L with a patient with Parkinson’s, of course, there you also have issues in terms of handling the device.

Okay, thank you. Another case study. Maybe I will ask Bill to comment on that picture. It’s a new instrument with or without standard widgets. So what evidence would be needed to demonstrate migration equivalence, and how could this evidence be generated and reported to enable its reuse to support other studies? Can you first explain what this is?



Yes, it’s I guess a different way of illustrating a symptom diary. And this isn’t a CRF product, it’s another vendor product who I won’t mention. But basically they have ten symptoms, and the way that this motif works—if that gives you a clue—is that you drag your finger up and down the petal, so you get a score from one to five effectively by the position of how much you fill up a petal. And so it’s just a new way of thinking about how to do a numeric rating scale, I suppose. I guess the question would be, if we apply that to a standard instrument, which uses a 5- or 10-point scale, whatever, what would we need to do to feel comfortable that we’re measuring the same thing as perhaps the original paper version? Now, to be fair on the company that do this, they do this for de novo new symptom diaries that they develop, so it’s a little different situation. But let’s think about it in terms of perhaps applying a different sort of widget to an existing instrument.


So maybe we can propose an answer to this. We would definitely propose cognitive interviewing, just to make sure that patients understand the question and how to answer the question, and probably usability testing. If it is part of a validated instrument, and probably in this case it was not, keep in mind that this is a moderate or severe change, so you might need to produce equivalence testing evidence. And then, if you want to use it for other trials, other studies, you need testing reports of course and preferably publications about this, so you show the evidence. Any comment on this one?

Then the last one, relatively easy. It’s a respiratory trial. You are using the EXACT diary, you are using the SGRQ, the COPD Assessment Test. And what would you need to do if it was a completely new platform? What kind of evidence do you need to produce, given that the SGRQ and the COPD Assessment Test have been used a lot in those respiratory trials and there is a lot of evidence? So if it is a new platform that you are using, new vendor, new platform, what would you need to do?

Okay let’s take a couple of minutes about this.


All right, so let’s conclude this case study. Somebody who wants to volunteer?


We were thinking, as long as the vendor has been audited and they are a trustworthy vendor, 21 CFR Part 11 compliant, that all we would need to do is UAT. That was our original thought.


Yes, so you definitely need to make sure that your vendor is capable of developing electronic patient-reported outcome or electronic clinical outcome assessments. If it is a new device, it might require usability testing. We’re thinking a little bit about BYOD here, a new device could also be in a BYOD mode. It’s not absolutely necessary, but what I would do is confirm that best practices have been followed and use an expert screen review, that’s always a good thing to do. And confirm that faithful migration has been performed correctly, so that you have documentation of it. The point is to document that the migration has been done correctly and that the instrument is valid for this particular trial.

So, any questions before we go for lunch?

[END AT 13:58]