This is going to be a two-person thing, so I’m going to talk about the Overt Aggression Scale-Modified, the OAS-M, as we developed it, a paper-and-pencil measure, and then Dan will come back and talk about the eCOA version of it. That version was being designed for a study that a drug company was doing to see if their drug actually improved impulsive aggressive behavior.
Aggression is something that a lot of people don’t fully understand. People talk about violence. Violence is more of a lay term; aggression is more of a scientific term. And it’s a behavior by one individual directed at another person or object, in which either verbal force or physical force is used to injure, coerce, or express anger. There are different forms of aggression. There are socially sanctioned forms of aggression, where for example if you’re in the military and you’re in a firefight, you’re supposed to be aggressive, that’s what you’re supposed to do. But there are also people who are impulsively aggressive and people who are proactively aggressive. And the proactive aggressive folks are not the kind of people that I’m talking about; those are the ones who are doing things to people to get something tangible for themselves and the like. But we’re really more interested in the impulsive aggressive folks, the folks who are blowing up and having temper tantrums.
Does anybody in this room know somebody who has frequent temper tantrums? Besides Donald Trump? Just a few of you? How many people in the United States, do you think, have this problem, where they have these kinds of attacks several times a year? What percentage? Do you think it’s 1%, less than 1%, 3%, 10%, what do you think? Less than 1%, no. It’s actually about 3.8%. And that’s the people who actually have intermittent explosive disorder. If you count everybody—this is from the National Comorbidity Survey—it’s closer to 7.7% who have in the course of their life at some point in one year at least three of these big blowouts.
And because the work that I did initially, which was in the biology of aggression, the whole idea was, we would see these relationships between brain serotonin activity and this kind of aggressive behavior, a very strong correlation. So we wanted to see, if we improve serotonin function by giving a drug, let’s say fluoxetine, could we make them less aggressive. We had to have a measure for that. And we started doing this work back in the ‘80s when fluoxetine had just come out on the market, and we had to figure out how to do this. And the tricky thing with measuring aggression is, mostly it’s measured as a trait characteristic, typically done via questionnaire. Questions will be like, in general, are you somebody who, when somebody pesters you enough they deserve a punch in the nose. Or do you sometimes bang your fist on the table when you’re angry, that kind of thing. And that’s a trait characteristic. If you take that kind of questionnaire and you give it to people day one and day twelve, or week twelve of the study, it won’t change much because it’s a trait. So clinical trials of aggression, as all other clinical trials in psychiatry, require assessment of the behavior itself, as opposed to the tendency to behave this way, which is what the trait characteristic is. So someone has a high score in a thing like the Buss-Perry Aggression Questionnaire, it just means that they tend to get aggressive more than somebody else. That doesn’t mean they are getting aggressive at that moment in time.
So the best methods of assessing aggressive behavior would be direct observation, such as video. Well, you can’t really do that in most settings. I mean, if someone is in an inpatient facility or a jail or whatever, you can have video. But otherwise, you really can’t use that, and in fact if you were to somehow put video capability into someone’s home, it wouldn’t work either. Years ago I had a producer from ABC News, I think it was 20/20, and they had this person who was having temper tantrums, and they said, we put cameras in the house. And I said, I’m not sure that’s going to work. So she goes in four weeks later and she says, you know this guy is not getting aggressive. I said, well that’s because he knows he’s being videoed. And in fact, people can hold it together for a while. At some point they won’t be able to continue to do that. I mean, how long can you wait? She says, well this is already too long. So that’s a problem.
Now you can use secondary reports of behavior, could be from the person or from an informant. Probably the informant is not with the person all the time. How many people here are married? How much time of the day do you actually spend with your spouse or your significant other? I mean, you’re sleeping six to eight hours, you’re at work eight-plus hours. So if something happens at work, you may not know it. Then you get the person’s bias. Well would they really want to admit that they’re having problems with aggression. So you have those kinds of issues.
So direct observation, as I’m saying, is difficult to do. In inpatient settings, you can do it. You can’t do it in outpatient settings. The best compromise for secondary reports is getting a frequency of the behaviors on established levels of behavioral severity. So when we look at aggression, we look at verbal aggression, which is screaming and shouting and that sort of thing; indirect, which is breaking things, throwing things around; and direct forms of aggression, which would be pushing, shoving, hitting people, and things like that.
So, where to begin. In 1987, which was around the time that we were looking at our serotonin measures of aggression and starting to think about a clinical trial, Stuart Yudofsky came up with the Overt Aggression Scale. And it was really developed for inpatient use, and really they were using nurse raters, and they would assess each aggressive act, marking the most severe behavior in verbal, indirect, or direct aggression. And that’s what they did, it was a simple report, which we couldn’t use in an outpatient setting because it just doesn’t work.
So what we did was we modified the OAS for outpatient aggression, and we came up with this severity weighted counting of aggressive behaviors in verbal, indirect, and direct aggressive spheres.
Now, unbeknownst to me, somebody else was already doing it. That was a fellow by the name of Kay. And in 1988, he came out with this paper, and it was called the MOAS, the modified OAS. And he made really only a very few changes to this scale. And again, unbeknownst to me that he was doing this, I was working on my OAS-M. And the OAS-M actually doesn’t simply do the counting, but it also has a global assessment of aggression and irritability, which was adapted from some other measure. Initially we called it irritability but now it’s global aggression, global anger and aggression. And as it turns out, the scale’s got two parts to it; so it’s got this frequency weighted counting, and then this global assessment. And the nice thing is, you get this very granular assessment of aggressions. And those numbers can go to infinity, pretty much. And then you’ve got this more restricted thing. Basically the global is 0 to 10, so like a 0 to 10 of how aggressive and angry you are.
So just to go through what this measure looks like, it looks at verbal aggression, aggression against objects, aggression against others, and aggression against self. In practice, not a whole lot of people do things against themselves, for the most part. And then this is the global anger aggression measure. Now, because the aggressive behavior part of it, which is granular, has such a wide range of scores, as a signal it’s actually a little worse than the global anger aggression, because the global score is more compact. But we really want both of those kinds of pieces of information for lots of reasons.
So here’s an example of what the verbal assault scale says. So it’s assessment of verbal outbursts or threats made at spouse, boyfriend, whatever, close friends, strangers. And that’s what it is. And then there are the anchor points, and they go from 0 to 5. And what you do then is you score how many times they snapped or yelled at somebody, and then that’s given a weight of 1. So let’s say they did it three times, that would be 3x1 is 3. If they cursed or insulted somebody, that would be a 2. If they did that twice, it would be 2x2 so it’s 4. And on and on, and that—it’s the scoring of the OAS-M that makes the eCOA so much better. Because scoring this and tabulating it actually gets kind of complicated over time.
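The weighted counting just described can be sketched in a few lines. This is a minimal illustration, not the official OAS-M scoring sheet: only the two weights mentioned in the talk (1 for snapping or yelling, 2 for cursing or insulting) are included, and the names `VERBAL_WEIGHTS` and `weighted_score` are hypothetical.

```python
# Severity weights stated in the talk for the verbal scale. The remaining
# anchor points (up to 5) are not spelled out there, so they are omitted
# here rather than guessed.
VERBAL_WEIGHTS = {"snapped_or_yelled": 1, "cursed_or_insulted": 2}


def weighted_score(counts, weights):
    """Return the sum of (times a behavior occurred) x (its severity weight)."""
    total = 0
    for behavior, n in counts.items():
        if behavior not in weights:
            raise ValueError(f"no weight defined for {behavior!r}")
        if n < 0:
            raise ValueError(f"count for {behavior!r} cannot be negative")
        total += weights[behavior] * n
    return total


# Three snaps and two curses in the week: 3 x 1 + 2 x 2 = 7
week = {"snapped_or_yelled": 3, "cursed_or_insulted": 2}
print(weighted_score(week, VERBAL_WEIGHTS))
```

Because nothing caps the counts, the total has no upper bound, which is why the tabulation gets tedious on paper.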
I don’t really need to go through all these slides, all the other ones are very similar. They define what it is, the anchor points are 0-5, and this is assault against others, and this is assault against self. The global subjective anger, you actually ask the patient, well how angry have you been this week, and then you rate 0-5. The global overt aggression basically is your sense—your assessment, the global assessment of the granular score. And you put the two of them together and get the 0-10 score. The eCOA is less needed for that, but most of the OAS-M interview is this frequency counting, and so it’s really important to have eCOA for that.
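As a small sketch of the global score, assuming that "putting the two together" means adding the two 0-5 ratings to get the 0-10 total (the talk does not state the arithmetic explicitly). The function name is hypothetical.

```python
def global_anger_aggression(subjective_anger, overt_aggression):
    """Combine two 0-5 ratings into a 0-10 global score (assumed to be a sum)."""
    ratings = {
        "subjective_anger": subjective_anger,
        "overt_aggression": overt_aggression,
    }
    for name, value in ratings.items():
        if not isinstance(value, int) or not 0 <= value <= 5:
            raise ValueError(f"{name} must be an integer from 0 to 5")
    return subjective_anger + overt_aggression


# Patient-reported anger rated 3, rater's global aggression rating 4
print(global_anger_aggression(3, 4))  # 7
```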
Now, does this thing even work? Well it turns out it does. The OAS-M aggression scores were studied in a variety of studies, and this is the study that we did in 100 subjects. And two out of three people got fluoxetine and one got placebo. And what you see here is everybody at baseline is pretty much where they are. And as you go through the trial you’re starting to see the effect of fluoxetine. And it really becomes statistically significant between Weeks 3 and 4. And then it pretty much stays significant. These are log transformed scores, so those differences are actually bigger than they look on this graph.
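To see why a gap on a log scale understates the raw difference, here is a short sketch. The exact transform used in the study is not given in the talk; a natural log of (score + 1) is assumed here so that a score of 0 stays defined, and the example scores are made up.

```python
import math


def log_score(raw):
    """Assumed transform: natural log of (raw score + 1)."""
    return math.log(raw + 1)


def fold_difference(log_a, log_b):
    """Convert a gap between two log scores back into a ratio of (raw + 1)."""
    return math.exp(log_a - log_b)


# Hypothetical weekly aggression scores for two groups
placebo_raw, drug_raw = 40, 20
gap = log_score(placebo_raw) - log_score(drug_raw)

print(f"gap on the log scale: {gap:.2f}")
print(f"ratio on the raw scale: {fold_difference(log_score(placebo_raw), log_score(drug_raw)):.2f}")
```

A gap of well under one unit on the plotted axis corresponds here to one group scoring roughly twice as high as the other in raw terms.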
This measure was also used in an Abbott study back in 2003. Same kind of finding, you get an anti-aggressive effect.
Now, as we were doing this research, we eventually had to come up with a conceptualization of what these people looked like as patients. You don’t want to treat people according to scale score, you want to treat people according to their diagnosis. And we don’t have enough time to get into it, but we basically took intermittent explosive disorder in the DSM-III-R and revised the criteria to be much more operationalized. And those criteria ultimately became the DSM-5 criteria.
And so there were two kinds of outburst that we identified. One was the big severe outbursts that were infrequent, you know, three times a year or more. And then there’s the rumblings in between: getting into the arguments, snapping, that sort of thing. So the snapping and the very frequent but low intensity were called A1 outbursts, and A2 outbursts were the big outbursts that would happen less frequently. And so the OAS-M then changed to being a scale that really would also assess intermittent explosive disorder. So the way the interview would go is you would say, okay Joe, how many outbursts (you’d train them of course to understand what an outburst is), how many outbursts did you have this week? They would say, well I had this one with my wife, I had this one with my coworker, I had a couple with these other guys. And then you talk them out a little bit and you get a sense of whether it was an A1 or an A2 outburst. And then you go through the full OAS for every outburst. That’s actually when it clicked, I need a lot of data. And one of the problems with rating is the scoring later, which eCOA totally fixes, which is what’s great about it.
So the raters go through each outburst and record the different aggressive categories in each of the four areas. And then we have these rater guidelines, because as we did it we discovered, obviously, people don’t tell you things that you think you’re going to hear. You’ll only know when you actually do the studies. And so we came up with very detailed rater guidelines. And one thing was, well, how long is an outburst. And we would say, well, if you have an outburst they’re usually short-lived. So if they have two outbursts that are separated by 30 minutes, then it’s a separate outburst. That’s actually important because the number of outbursts becomes an outcome measure as well. But also, if you take two outbursts and conflate them into one, you’re likely to be missing information. One, obviously, is how many outbursts; and two, the intensity of the outburst. So we need to train raters to do that. That’s not that hard to do. We’ve got a training program that is able to do that.
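The 30-minute guideline can be sketched as a simple grouping of incident times. This is an illustration only: it assumes the gap is measured from the previous incident and that a gap of exactly 30 minutes counts as a new outburst, neither of which the talk specifies. `split_into_outbursts` is a hypothetical name.

```python
from datetime import datetime, timedelta

GAP = timedelta(minutes=30)  # separation stated in the rater guidelines


def split_into_outbursts(times, gap=GAP):
    """Group incident times; a gap of at least `gap` starts a new outburst."""
    outbursts = []
    for t in sorted(times):
        if outbursts and t - outbursts[-1][-1] < gap:
            outbursts[-1].append(t)  # still part of the same outburst
        else:
            outbursts.append([t])  # far enough apart: count separately
    return outbursts


day = datetime(2016, 1, 4)
incidents = [
    day.replace(hour=9, minute=0),
    day.replace(hour=9, minute=10),   # 10 minutes later: same outburst
    day.replace(hour=9, minute=45),   # 35 minutes later: new outburst
    day.replace(hour=14, minute=0),   # hours later: new outburst
]
print(len(split_into_outbursts(incidents)))  # 3
```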
Here’s an example of those issues. We came to this thing called the One Event Rule. So here’s what happens when somebody has an aggressive outburst. Often what happens is, they’re interacting with somebody, and the spouse says something and the husband says, what do you want! That’s a snap. And they keep bugging, and he says, what the fuck do you want! Okay that becomes a curse. And then they start arguing, and that’s an argument. Now you really kind of need an argument for the IED diagnosis; snapping and cursing are not quite enough. So what do you do with that?
Well, it’s the One Event Rule. If it’s What do you want! well that’s a snap, and that’s it. However, if it’s What the fuck do you want! well that’s a snap and a curse, so it’s rated as a curse because it’s higher. Get it? That’s kind of how we do it. There’s a whole other bunch of rules, which I don’t really need to get into, it’s not critical for this. But it’s a reasonably straightforward measurement, once you train people up on it. We’ve been successful in doing that with a number of different investigators.
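A minimal sketch of the One Event Rule as described: when a single event contains several behaviors on the same scale, only the most severe one is rated. The severity ranks for snap and curse follow the weights mentioned in the talk; the function name and dictionary are hypothetical, and the other rater rules are not modeled.

```python
# Severity ranks on the verbal scale, per the talk: snap = 1, curse = 2.
VERBAL_SEVERITY = {"snap": 1, "curse": 2}


def rate_event(behaviors, severity=VERBAL_SEVERITY):
    """Return the single most severe behavior observed within one event."""
    if not behaviors:
        raise ValueError("an event needs at least one behavior")
    return max(behaviors, key=lambda b: severity[b])


print(rate_event(["snap"]))           # snap
print(rate_event(["snap", "curse"]))  # curse, because it is the higher level
```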
When we were approached by this drug company, I was their consultant actually, they hired Bracket to work with this. And I guess the drug company wanted Bracket to do an eCOA is how that happened. Because you also train people just in general. And so I was brought in on that as well. But the actual development of the eCOA is something that Dan will actually speak about.
Let me see if there’s anything else I need to say about the OAS-M. I think I’m going to say a few other things that are kind of relevant.
Psychometrics. I said that the aggression score can go from 0 up to a billion (it really doesn’t, but it can go very high). How good is it? How good is the inter-rater reliability? It’s actually very good, it’s 0.95. Global anger assessment is 0.95. Type of IED, the kappa on that is 0.89. So when you train people properly, and again it’s a rigorous training but it’s not like days and days and days, you can get pretty good reliability. And we showed that in a variety of studies. The internal consistency for aggression is 0.78, that’s the log transformed aggression score; you have to transform it because they’re all over the place. The alpha for assault is 0.71, for objects a little less, for others. But if you combine the three non-verbal, it’s 0.65, which is getting close to acceptable. The alpha for the GAA is very very high at 0.90, so the psychometrics are very good.
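For readers less familiar with the internal-consistency figures, the alpha quoted here is Cronbach's alpha. The sketch below uses the standard textbook formula; it is not taken from the OAS-M papers, and the example data are invented.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha from item columns (one list of scores per item).

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if n < 2 or any(len(col) != n for col in items):
        raise ValueError("each item needs the same subjects, at least two")
    totals = [sum(col[i] for col in items) for i in range(n)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total scores have no variance")
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Invented scores for four subjects on three items
items = [
    [1, 2, 3, 4],
    [2, 2, 4, 5],
    [0, 3, 3, 5],
]
print(round(cronbach_alpha(items), 2))
```

Alpha rises toward 1 as the items move together across subjects, which is why the compact global rating can show a higher value than the wide-ranging subscales.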
It also reflects what you would expect it to reflect. So this would be like, convergent validity. So if you look at OAS-M aggression for healthy control, somebody who has no life history of the disorder, it’s 4.2, and again this is over the last week. That’s not a lot. I mean that’s like four snaps. Psych control, who’s not aggressive, is 11.5, that’s not very high either. But IEDs will have 101 or very very high. For the global aggression scale, for controls it’s like 1. For PCs it’s like 2, but for IEDs it’s 5.9. I mean 5.9-6, which is getting into the moderate range. And if you look at other things that are looking at aggression and anger, you see the same kinds of patterns. Life history of aggression is low for healthy controls, a little higher for psych controls, very high for IEDs. Same thing for anger. That’s speaking to the psychometrics of it.
Concurrent validity is also very strong. There’s a 0.75 correlation with aggression and life history of aggression, with PAQ aggression is the tendency to get aggressive, state of anger is obviously anger. And impulsivity, and you get the same kind of correlations with global anger and aggression. So it holds together as an actual measure, which makes it good. And it also doesn’t correlate with things it shouldn’t correlate with. It doesn’t correlate with extroversion, which you wouldn’t expect aggression to correlate with. I mean it literally doesn’t correlate, 0.02-0.01 is nothing.
Now this is important in terms of clinical trial work. So we’ve done a bunch of clinical trials with this measure, and what we discovered actually was, if you don’t have people with stable aggression scores walking into the trial, you won’t get much of a signal. And we’ve looked at this in a whole variety of subjects. And so the left-hand is the OAS-M aggression score. The right side is that global aggression and anger score. So people who are stable—meaning they don’t drop below a certain level—are the ones that actually do well. And the ones that aren’t stable, that are going to go down, you can see that from the very beginning.
And when you actually go back and look at the data, what you see is you get a signal in the people with stable baselines, at the end point. And you get nothing at the endpoint in people who are falling off. You’ve got some difference out here in the middle, but so what. I mean that’s not a result you want. You can’t market a drug that works for a few weeks and then goes away. So this is just noise, and that becomes an important thing. You then have to figure out, well how do you want to put people into the study so they are stable, and that gets into issues of how many subjects you need, and can you get them, and all that, that’s a whole other story.
This is looking at the global aggression measure. Here you can see big differences, big effect size, when you get a stable baseline. And the same sort of thing when you don’t, meaning a result you cannot use, that’s really not going to help you.
So that’s pretty much what the OAS-M is. It’s an example of a rating scale that would be used in psychiatric studies, it’s got good psychometric properties, it works, it can pick up a signal. But the scoring is tricky, you’ve got to multiply by those weights and those other weights and all that, and that’s where the eCOA, I think, really comes into play.
And with that I’ll turn it over to Dan Debonis.
[Q&A Section 19:25]
Thank you, Emil. Do you want to stay up, there’s some questions first.
AUDIENCE MEMBER 1
So thank you. That was really interesting. Just a question. I’m assuming I know the answer to this, actually, but it’s interesting to explore it. In this case, would it have been possible to develop a self-assessment version of this scale? Would a patient have been able to have a diary where they were recording this on a regular basis? I guess if they were prone to aggression they may break the diary quite frequently so that could be an issue. But what made you choose a clinician rating? Was it something to do with the complexity of really teasing out the categorization of these events?
Yes, that was the biggest reason for it. Also, those kinds of studies had never been done before. The study we did was the very first study on aggression, and the subjects actually were IED, they just weren’t called that when we started doing those studies. The tricky thing with self-report is, you have to train the patients. But also, you know, if you’re going to fill out a diary, the problem is if they’re spending a lot of time monitoring themselves, it’s going to drop the behavior. I mean they’ll drop it for a while, they won’t drop it forever but it’ll drop the behavior. And so that’s a problem. Now, with the study that this pharmaceutical company wanted to do, the FDA really was pushing (I went to the FDA meeting and I was surprised by this because I’m not really in this field) the patient-centered outcome, and they really seemed to want some kind of diary. And so the company sort of pulled something together and I think they were starting to do some studies with it. Our concern, I mean the investigators, especially me, were going, this is kind of a treatment. I mean, teaching people self-monitoring is a treatment, and you’re going to lose your signal. And so that’s a problem. And then again, the question, well how many questions do you ask. And then they start thinking about it, these are the questions we want, oh that’s six questions, that’s too many questions. And I think what could work would be an eCOA thing where they get a ping once a day, you know, did you have an outburst today. Something like that. Actually that’s probably the best thing, did you have an outburst, you know they fill it out. That’s not what they did, they asked a few more questions than that. And then you say yes or no, whatever. Then the idea would be, maybe when they came in for the rating, the rater would have this information, and so, it looks like you had da-da-da and let’s talk about those.
If you don’t do that, then what you’re doing is, you’re meeting with the patient, saying, tell me about the last week, how many outbursts did you have? And then they’re sort of recalling right there. But the reality is they’re in a trial, they know what we’re asking, they come in sort of prepared. So I don’t have a problem with doing a self-assessment with an eCOA thing. My worry is, how extensive is it, because you don’t want to have them self-monitoring so much that they don’t tell you what’s really happening.
AUDIENCE MEMBER 1
Thank you. And I guess, when it comes to their recall, their ability to recall over the last week, for example, when they come to a clinician’s office, I’m assuming that because these are fairly significant emotional events, there isn’t a real significant problem with them not being able to remember how many they had, or the nature—
Yes, I mean they seem to remember. I mean the reality is, until we do a study where they’re doing some kind of handheld thing, just saying they had one, we won’t really know. But the other thing about self-report is you have to really train them a lot about what exactly it is they’re looking for. And so I think that becomes really tricky. There is a thing about aggression which is interesting, which is there are people who have aggressive behavior, aren’t distressed by it, and don’t report impairment by it. Those are not the people that will come to a study, because if they don’t think they have a problem, they will not be identified, they will not self-identify and come in for a trial. But the people who we do study, who do come in, don’t lie about stuff. I mean we actually looked at this. For example, we were doing a family study of IED, and we looked at the direct interview of a relative versus the direct interview of a person. And we did it for like 70 people and in only one case did the relative say, yeah they got IED and the patient said they didn’t. So the concordance was high, they weren’t lying about that. So on that level, the people who come in for studies are motivated to tell the truth. And that’s one of the things I sometimes get on grant reviews. Are these people lying to you? Well they might be lying to you if they’re being sent for anger management or being sent by the divorce attorney or whatever. But if they’re coming in on their own to be treated, they’re not really lying, they already want to be treated. In fact, they recognized that there was an issue and they wanted to be treated. They’re pretty much like anybody else.
So those are the things specific to aggression that one needs to think about. The kind of population you have. Because as I said, 3.8% have this IED, which also means that they’re bothered by it and they’re impaired by it. But there’s a whole other group that’s almost as big that doesn’t report distress and doesn’t report impairment, yet they have the same amount of aggressive outbursts and yada yada yada. And again this is from an epidemiological survey, which has its own issues, like you can’t actually talk to those people. Really, you’re not bothered by this at all, never? You know, it’s like they don’t recognize that they could be. It’s like they’re losing friendships but they’re not connecting that their relationships are falling apart because they’re blowing up. So there’s a lot of work to do in terms of all those kinds of things. But when you’re doing a clinical trial, you are looking to study treatment-seeking patients, so you’re probably okay in the area of aggression as long as you’re doing that. As in depression or anxiety.
AUDIENCE MEMBER 2
What about codependency or inter-relatedness where it’s actually not self-generated but prompted due to environmental cause?
No, we don’t actually screen them, and we never really found—again they’re not—well, you could screen them but—well that’s kind of what we did with that family study. So there’s validity there. But you won’t get the right numbers if you’re simply taking the numbers from the significant other, because they’re not with that person all the time. That’s not what you meant?
AUDIENCE MEMBER 2
No, I was referring to if significant other had the issue. It’s not self-prompted, that the person may identify themselves having an anger issue and it may be environmental based on—
Oh well sometimes what happens is, maybe the spouse says they have an anger issue and they come in for evaluation, and we discover, well it doesn’t sound like you really have an anger issue. Or it’s not as severe as your spouse thinks it is. That does happen. Obviously we don’t study those people because they’re not aggressive enough. But I should emphasize that when we study people we’re not studying people who are having any legal issues, we’re not picking people up from the criminal justice system, they’re not on parole. We don’t really want those people, for a variety of reasons, not the least of which is we don’t want to become their forensic psychiatrists, because we’re not. I’m certainly not a forensic psychiatrist, I don’t want to get into writing reports for these folks and stuff, that’s not what we’re trying to do. Any other questions?
Okay, so Dan will talk about turning that paper thing into an eCOA. Should I stay up?
If you like.
We met Dr. Coccaro here through a sponsor; they contacted him and asked us to get involved in the study, and they really wanted an all-eCOA study, and obviously this scale had never been done that way before. I mean, you could see there’s a lot of nuances to it, but the point with bringing Dr. Coccaro here today, other than it’s interesting and it’s kind of neat, is to give you a flavor of the experience, that these are the kinds of things that you have to take into account when you’re doing a ClinRO. There’s a lot of nuances to the scales. They’re designed a certain way. And if you’re going to create eCOA versions of a well psychometrically validated outcome, you’ve got to maintain a lot of the properties that make it psychometrically valid. And you can’t impact the way that it’s administered. Again, back to what I said earlier, you can’t give a rater or clinician something that’s going to make their job harder. You have to give them something that’s hopefully going to make it easier, at least not make it any worse. And what we used to do at Concordant, and I encourage you folks in Bracket, is I’d have the developers come and watch a talk like that or go to a rater training so that they understood what they were dealing with here. Because again, if you don’t understand the subject matter in something like this, it’s a little difficult. Now we brought Dr. Coccaro into the offices here, he met with some of the team, the developers, and walked through a lot of these issues. And that’s kind of a standard practice, to engage an expert with this. But anyway, you could see, we have a process for migration with eCOA, I think I spoke about that earlier. And we also tacked on a small feasibility study just to make sure that the paper and eCOA versions weren’t divergent in any way in terms of the way that the raters were using it to assess the patients.
I think I already stated, this was a fairly decent sized study for a tough-to-enroll population. And again, working with Dr. Coccaro, what can we do here, what are the biggest issues here. I think he pointed out that scoring was the big one here, can we automate that, sure. He had a very nice manual that he designed. But obviously can you imagine a clinician working with an 89-page manual, flipping back and forth to see how they’re going to score a different item. If they’ve been well trained they might remember it, but can we again take parts of that manual and incorporate it so that it’s kind of at the fingertips of a rater. That’s another consideration here. So yes, of course you can, you can do that. And then what you do is, within the context of a question or item, you have an information bar there that they can refer to as they’re looking at the item or doing this or giving the instructions. Again, just putting everything at their fingertips where they’re not having to flip through pages and look back if they’re not sure about something.
Again, very similar, I think he showed this one, but okay, this is the worksheet, so this is what the raters on paper are given. And they’re supposed to fill this out and then score it and then go through, ask about multiple outbursts, come back. And again, you train them, they’ll do it, but can we make it easier for them. Sure, we’ll define an outburst here for them so they don’t have to worry about it. And then as they’re going through, this may look familiar, they’re going to say yes, no, they can add in comments if they want. And then that will again autoscore in the background so that they don’t have to add anything up, they just move on to the next item, they can focus on interviewing the patient rather than refining their math skills.
Again, very similar here, but I think—you can have multiple outbursts. So essentially what you’re supposed to do is, if there’s another outburst that’s reported, go through that whole same process again. So again, if they were selected, if there was an additional one here, they would then be prompted to go in again and ask all those other questions. And note, I don’t know if you saw that, probably not in here, but we won’t let them go on until they ask all these questions. So that’s another thing that we’ve incorporated.
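The "won't let them go on" behavior can be sketched as a completeness check run before the rater advances. This is a hypothetical illustration, not the actual eCOA implementation: the four area names follow the aggression categories described earlier in the talk, and the function names and data layout are assumptions.

```python
# The four areas rated for every outburst, per the talk.
REQUIRED_AREAS = ("verbal", "objects", "others", "self")


def missing_areas(answers):
    """List the required areas the rater has not yet answered."""
    return [area for area in REQUIRED_AREAS if answers.get(area) is None]


def can_advance(answers):
    """Allow moving to the next outburst only when every area is answered."""
    return not missing_areas(answers)


outburst = {"verbal": 2, "objects": 0, "others": None}
print(missing_areas(outburst))  # ['others', 'self']
print(can_advance(outburst))    # False

outburst.update({"others": 0, "self": 0})
print(can_advance(outburst))    # True
```

Note that a score of 0 counts as answered; only an absent or empty response blocks the rater.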
And then the scoring, which you saw here. This is what the rater sees when they’re done. This actually came from the actual study here. This is what they see when they’re done, they can review it, their comments are in there, they’re captured at source. And then the scoring. So this is the summary of scoring. And the rater didn’t really have to enter a score other than when they’re scoring items they just have to have an objective score. These are all calculated for them. So this is the number of outbursts and then it kind of does the math for them, the weighted scores, and gives them this output, they don’t have to worry about doing any of this.
So the feasibility study, you see here, we just had four raters that Dr. Coccaro trained. And then we did a counterbalanced order. One group used the paper version and one group used the eCOA version to evaluate videotaped interviews, and then 48 hours later they switched. It’s pretty simple face validity, but the question was, did using eCOA impact their ability to rate the subject, and did it correlate with what they’re doing on paper? And the answer was no, there was no impact, pretty straightforward. This was presented by Dr. Busner from Bracket, with Dr. Coccaro as the author. So again, these are well-trained raters, you would expect this, and it’s controlled, but these are the kinds of things that we try to think about and do so that we’re not introducing more variability; we’re actually trying to control for it.
So that’s all we have. I don’t know if there’s any other questions or comments. If you ever need an electronic OAS-M, you know who to call. Okay, thanks everyone.
[END AT 34:35]