Podcast published date:
dashboards, data, cases, graph, charts, graphs, people, misinformation, state, arizona, day, zip code, kinds, pandemic, explanation, talking, number, map, tools, updated
SPEAKERS: Shawn Walker, Michael Simeone
Michael Simeone: This is Misinfo Weekly, a somewhat weekly program about misinformation in our time. Misinfo Weekly is made by the Unit for Data Science and Analytics at Arizona State University Library. This week, we talked about the different visualization dashboards that states are using to talk about COVID-19.
Shawn Walker: This week, we're going to talk about dashboards, specifically COVID-19 dashboards. So Michael, what's a dashboard? What's the purpose?
Michael Simeone: So a dashboard is a collection of different charts and graphs all put together. Normally, it's designed to support some kind of decision making process.
Shawn Walker: So in the context of COVID, what kinds of information have we seen on dashboards?
Michael Simeone: Well, I think the information is a little bit more straightforward. We've seen case counts, hospitalizations, and the like. I think the decision that the dashboard is supposed to, you know, support, I feel like that's a little bit more ambiguous when we start looking at COVID dashboards around the world. Plenty of people up to now have seen some kind of COVID dashboard. If you google COVID-19 cases in your state, if you google global COVID-19 cases, Google will make a dashboard for you in your browser. If you go to your state department of health, they'll have a dashboard. Universities have put up their own dashboards. There's a lot of them out there.
Shawn Walker: There's like a new dashboard every day put together by some organization that repurposes these numbers. And of course, they all agree, right?
Michael Simeone: Yeah, we're swimming in charts about COVID-19. To the point where I feel like if all of us close our eyes right now, we can see the spidery jagged lines of states climbing up or climbing down. It's been all over the New York Times, the Washington Post, every major publication. Their data visualization teams are making these kinds of things as well.
Shawn Walker: Are we talking about these because these are just plain misinformation or tools of misinformation? Or why would we talk about these in a misinformation podcast?
Michael Simeone: Talking about these kinds of charts and graphs, you know, we're not looking at state departments of health being malicious actors or trying to spread misinformation. I think when we take a look at some of these dashboards, we can just see how complicated it can be to communicate this information to people and to do a good job. And I think anytime there's an inefficiency or a hiccup in the system, or when you're putting a chart together, then that's an opportunity for misinformation.
Shawn Walker: And I would add that these are also very difficult to interpret, because we're using a lot of technical terms and vernacular in these dashboards around, you know, what is a case count, what is R-naught, and these get graphed. And these are not common statistics that we use in our everyday lives. So we're having to learn a lot of medical terminology to interpret many of these dashboards. So what does it mean to be a positive case? What's a serology test versus a PCR test? And these dashboards kind of lay out these numbers, but we don't necessarily always understand the context, or how long it takes for a number to appear in the graph, or what kind of processing a number might go through. We have a test, and then how does that eventually appear on a dashboard?
Michael Simeone: Yeah, and like, back to the idea of how can a dashboard be misinforming? Well, all it takes is to not really explain what a case is. And all of a sudden, that can be misinforming, or it can be an opportunity for someone else to misinform. So even at the press conference yesterday that the governor put on, somebody asked, Are you sure you're not double counting cases? Because someone might test positive for COVID-19 one week, and then the next week you test them again. Are you sure you're not double counting them and inflating the cases? And, you know, we've talked before about all the different narratives that are kind of swimming around to try to make COVID-19 seem like a hoax or a false flag or some other kind of nefarious thing. And one of the keys to doing that is the claim that the case load is inflated. So poorly described charts can also contribute to that as well. And so I think these kind of sit at the intersection of a lot of different ways of misunderstanding.
Shawn Walker: And just to clarify, you're talking about the June 25 press conference that the Arizona state governor Doug Ducey had to discuss issues around COVID, correct?
Michael Simeone: Yeah, exactly, it's June 26 today. Yesterday was June 25. Helpful clarification. A consistent backdrop for the governor of our state during these briefings has been bar graphs. And so the idea that the governor of the state is going to be giving a briefing on a semi-regular basis, and the backdrop is consistently bar graphs, just kind of speaks to how important data and data visualization are in our moment. I think it's important to address why data visualization can sometimes be a place where we stumble or become misinformed, because we're living right now in a moment where we're putting a lot of stock in these data visualizations.
Shawn Walker: And they become tools to advocate for policy. And this is not nefarious, it might be that the policy you're advocating for is opening up a state. So you want to ensure that the visualizations support that and you want to use visualization as a tool to support that. If you believe that the state should lock down, then you're going to use visualizations as a tool in support of that policy.
Michael Simeone: Yeah, I mean, they're communication devices. Yeah, you know, I think that nothing we're going to talk about today falls under the header of purposefully malicious misinformation campaign. We're going to go through a number of things that almost seem like a blooper reel of ways that we can become misinformed using charts and graphs. Which is ironic, because normally we think about charts and graphs as something that's going to clarify something for us or help us. But there can be a number of ways where it's just, I think, not helpful. So let's get into some things that we've observed and maybe even talk about some specific charts and graphs. Obviously, we can't show these charts and graphs in a conversation, but I think we can do our very best to describe them and paint a picture in everyone's mind.
Shawn Walker: So we've worked on a bit of a framework here to discuss why some of these dashboards might be confusing or easy to misinterpret. So one aspect of the dashboards is that they have a long reporting tail, in that you take a test, and that test is sent to a lab, you know, the lab interprets that test, and then eventually all that takes this long winding road to end up on the dashboard. And so there's a delay in reporting. So whenever someone might die of COVID, or someone has a positive test, it doesn't immediately pop up on that dashboard 30 seconds later. It might take a couple of days or even a week for that to appear on the dashboard. And why might that be kind of confusing?
Michael Simeone: So what's one of the charts that we've seen tons and tons of times? Especially for case rate, we see two different kinds. One is the line graph, which is just going to show cases over time. And that's where I think a lot of times people are talking about the curve. But the other one we see is actually just counts of cases over time as skinny little bars that are all racked up from March all the way to the present moment. And that looks a lot like, I don't know, Fitbit telling you about how many steps you took in a day or how many hours of sleep that you got. I think we run into a lot of trouble when some of these tools look very similar to some of the apps that we use every day. It looks like a dashboard that gives us instant, up-to-date information. It's not unreasonable to expect that, when they look similar. How come this coronavirus information isn't instant and up to date? It's not like everybody understands how long coronavirus testing takes. We wouldn't assume that automatically. To me this is confusing because our expectation is that it's instantaneous. Most of the time when we interact with things that look like this, the latency period between event and reporting is a lot shorter.
Shawn Walker: Like with our Fitbit or our Apple Watch or whatever device or, you know, in my case, my ancient Pebble, my phone display --
Michael Simeone: You have a Pebble?
Shawn Walker: Yes, I have a Pebble.
Michael Simeone: Does that even give you results? Or do you have to, like, start a diesel generator and then that creates a dashboard for you?
Shawn Walker: My Pebble is still happily going strong and impressively so.
Michael Simeone: But that creates a dashboard for you on your activity as well?
Shawn Walker: So Pebble is my watch...
Michael Simeone: For people tuning in who aren't dialed into nerd humor about wearables. Right? So Pebble, the app, still runs on --
Shawn Walker: My iPhone. Yes.
Michael Simeone: Okay.
Shawn Walker: But despite the age of my watch, the dashboard gets updated in real time. So every time I open the dashboard or I look at the health statistics on my phone, it immediately grabs the data from my watch, and that shows up in some graphs that look very similar to all these dashboard graphs that I'm seeing on Arizona's or Georgia's or Florida's website.
Michael Simeone: Yeah, I mean, I do think that with dashboards so far, as we've seen them, we expect them to be connected to some kind of real-time service. Coronavirus testing is not a real-time service.
Shawn Walker: Definitely not. And the confusing part too is that many states, and I think we'll spend a lot of time talking about our home dashboard for the state of Arizona, but many dashboards have similar problems, so we're not just picking on Arizona for the fun of it. But they have a summary page that says 3000 tests came back positive today. But then you go to the daily graph, this bar chart that you were discussing before. And then I see the number of cases today is not 3000. The number of cases today is 45. What the heck?
Michael Simeone: Yeah, it looks like we're doing great. It looks like there's a sudden drop in cases. But back to what we were saying: if the reporting is delayed and still updating, we're not looking at a decrease in the curve. We're just looking at data that's pending.
Shawn Walker: So there's a big mismatch in this delay. Let's say, for example, today, on the 26th, 3428 cases came back positive for COVID. But if we go to the cases confirmed by day graph, we see that today, June 26, only 45 cases have been reported. So what that means is those other 3400 cases have been attributed to previous days. So it's not just 3000 today. Each result gets attributed back to the day that the test took place, but it might take from a day to a week for those numbers to return. So there's this confusing mismatch. We keep reporting 3000 positive cases today, and that could have been a test from a week ago.
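The mismatch described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records below are made up, and the variable names (`results`, `reported_today`, `by_test_date`) are hypothetical rather than anything taken from the Arizona dashboard. The point is that the same set of results produces two different "today" numbers depending on which date you count by.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (date the test was taken, date the result came back).
# These are invented for illustration and are not real Arizona data.
results = [
    (date(2020, 6, 19), date(2020, 6, 26)),
    (date(2020, 6, 22), date(2020, 6, 26)),
    (date(2020, 6, 24), date(2020, 6, 26)),
    (date(2020, 6, 24), date(2020, 6, 26)),
    (date(2020, 6, 26), date(2020, 6, 26)),
    (date(2020, 6, 23), date(2020, 6, 25)),
]

today = date(2020, 6, 26)

# Summary-page style number: every result that came back today,
# no matter when the test itself was taken.
reported_today = sum(1 for _, reported in results if reported == today)

# Daily-graph style number: each result counted on the day of the test.
by_test_date = Counter(tested for tested, _ in results)

print("Came back today:", reported_today)
print("Attributed to today on the daily graph:", by_test_date[today])
```

In this toy data, most of the results that came back today land on earlier bars in the daily graph, which is why the bar for today looks so short.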
Michael Simeone: Right. And so all of these bars are up for revision. They could get taller as the data comes in. So I don't think we're used to that. I mean, can you think of another bar chart that someone is going to interact with in their everyday life where we could say, Okay, you're looking at it now, but two days from now, these values might be completely different? That is a very different world.
Shawn Walker: Especially when it's something that we're having such an intense political and, you know, life-changing discussion around: a virus where the previous seven days are basically always in constant flux.
Michael Simeone: Yeah, and even with any kind of, you know, projected epidemiological curves that we see in some of these charts and graphs, it's also tough to get your head around the fact that the projection will change. And so it is possible for one of these models to say, Hey, here's where our case rate is predicted to be in a week, and then have that be revised once we learn something new. So the fact that both the kind of reported past and the predicted future are under revision based on new information that we get, again, that's just not something we deal with every time we open up our iPhone.
Shawn Walker: Part of this is that we have to focus on certain parts of the data, and we have to understand, when we interpret each of these graphs, that for a certain number of days the numbers are constantly in flux.
Michael Simeone: Yeah. And I mean, I just I think that it looks so much like the way that other kinds of applications visualize data. But the data underneath it is so very different. And so I think one of the ways that some of these dashboards can misinform not intentionally, but one of the ways that they can do it is just being visually ambiguous with other kinds of data driven services that don't behave the same way.
Shawn Walker: And there's no visual indicator in the graph that says, side note, these four days' worth of bars may change, we're still getting data, so don't interpret these days. This graph just shows up and it's like, 45 cases today. This is great, we're good.
Michael Simeone: Yeah, it feels like that could be printed in 45 point font across the top of the webpage, because it might be the most important piece of information to offer up to anybody looking at the website.
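One way a chart could carry that warning is to label recent bars as provisional before drawing them. The sketch below is a minimal illustration, not a description of how any state dashboard works: the function name `label_bars`, the seven-day window, and the older day's count are all assumptions made for the example (the conversation mentions delays of roughly a day to a week).

```python
from datetime import date, timedelta

def label_bars(daily_counts, today, window_days=7):
    """Return (day, count, status) tuples, marking recent days as provisional.

    window_days is an assumed reporting delay, not an official figure.
    """
    cutoff = today - timedelta(days=window_days)
    return [
        (day, count, "provisional" if day > cutoff else "settled")
        for day, count in sorted(daily_counts.items())
    ]

# Hypothetical daily counts; only the 45 echoes the number in the conversation.
counts = {date(2020, 6, 10): 1500, date(2020, 6, 26): 45}

for day, count, status in label_bars(counts, today=date(2020, 6, 26)):
    print(day.isoformat(), count, status)
```

A chart built from these labels could draw the provisional bars in a lighter shade or with a hatch pattern, so the short bar for today reads as "still filling in" rather than "cases dropped."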
Shawn Walker: And this is not explained by the media either. So the media, you know, today in the news, they reported these, you know, 3400 plus cases in Arizona. But that's not an accurate representation of actually today. Those are just the tests that came back today, or the deaths that were processed today. That doesn't mean those folks passed away today or that they tested positive today. It's just a really confusing kind of number that's not necessarily as helpful as we think it might be.
Michael Simeone: Yeah, yeah, totally even, uh, you know, there have been state papers in Arizona who published guides to understanding the dashboard. And the dashboard itself has undergone several revisions. And so these dashboards are just kind of emerging as a response, and nobody has it all figured out yet.
Shawn Walker: So let's move on to one of the next issues. Another issue might be trying to understand the spread statewide versus local. So how are these dashboards confusing in the ways that we might try to figure out what COVID looks like in our neighborhood versus what COVID looks like in the entire state?
Michael Simeone: Yeah, so I feel like this has been a really interesting debate among data visualization people throughout the COVID pandemic, which is: what is the best kind of map to use to represent the outbreak and epidemic over space? So the New York Times had a really kind of popular, and it continues to be popular, visualization, which looks like the United States is... Shawn, I'm sure you've seen this one, right? It looks like a map of the United States with red bubbles of various sizes superimposed over every city's location. And so it basically looks like a whole bunch of red bubbles superimposed on a map. And, you know, at the early point in the pandemic, New York just had so many cases, the bubble for New York was so big, you couldn't even figure out what the case rate was in Delaware or Philadelphia. Are you familiar with that kind of map?
Shawn Walker: Basically, the circles that you're discussing are sized. The center of the circle is attached to a location, and the size of the circle represents the number of cases in a specific area. So you know, in January, there would be very tiny, very small circles, versus now, the whole country's covered in red circles.
Michael Simeone: Yeah, exactly. And so that kind of graphic is called a graduated symbol plot. A lot of folks in the visualization space, Elijah Meeks being one of them... Elijah Meeks, uh, worked for Apple, I think he just quit. He was a kind of visualization lead at Apple, quit Apple, has his own startup right now. But he, one of the founders of the Data Visualization Society, came out really hard against the idea of graduated symbol plots, you know, the case against graduated symbol plots being that they obscure more data than they show. Because after these data get to a certain value, it just looks like a big mess. Nevertheless, we see plenty of these graduated symbol plots. And it makes it very difficult to actually compare what's going on across different areas, right? Anytime we visualize something, we want to be able to make a comparison, and so anytime we have something that doesn't let us make a comparison or interferes with that, we should really question it, right? So that's an issue. But then there are these choropleth maps. And these are, you know, if we've seen a county by county map or a state by state map or a zip code by zip code map, and certain zones are shaded in darker than others, that's a choropleth map. And those have all kinds of problems as well.
Shawn Walker: But what does the shading represent?
Michael Simeone: So a lot of times, we'll put the data into buckets, and then scale the color according to where the data is. And what all that means is, we might have one to 10, 10 to 20, 20 to 30, 30 to 40, right, if our maximum value is 40, and then we'll just shade in the map according to where the counts fall for certain locations, right? So we can even call up the Arizona Department of Health Services map right now, the by zip code map, where we can see what the case rates are by zip code, or case totals, I'm sorry, per zip code. You know, some of them are shaded in darker because there are more cases, and some of them are shaded in lighter because there are fewer cases.
Shawn Walker: So we might imagine sort of a gradient from, you know, zero cases being white to, in this case, over 250 cases, which is a very small number, being a very dark red. And so then you can understand that the area of dark red has more cases. But the problem that we run into is, one, how these buckets are created, which you're talking about. So in your example we have buckets of, you know, zero to 10, 10 to 20. Those are even buckets. But what are the buckets in the Arizona dashboard here?
Michael Simeone: Yeah, I mean, choropleths necessarily draw distinctions between the grades or the buckets of data, right? So what's a pale color versus a slightly darker color versus the darkest color, right? We're making distinctions among those. And normally, it'd be great to have a statistical way to draw those distinctions, or even a theoretical one, or at least some justification for why these things exist, right? How about any justification? Any justification is better than none at all. What we have on the state of Arizona site is actually none at all. But the other limitation that we see with a choropleth is that it actually draws hard lines between things that are relatively arbitrary, like zip code and county and even state. When you're looking at a viral pandemic, that thing isn't going to live by the same boundaries as other kinds of data that we might plot. If we're looking at, say, schools or income, a lot of those things are associated with zip code. But how a virus spreads, that's a vast and sprawling network that's not going to be contained by the bounding boxes of zip codes or counties or anything like that. So if we turn our eyes to the Arizona COVID-19 choropleth map, we see a whole lot of complicated stuff that isn't really clarifying.
Shawn Walker: Yeah, I was looking at this map, and I just see a lot of dark red. And so now I want to close the window, because that just looks problematic. But we see that the buckets that they've put in start at zero and max out at 250, which is a problem because, for example, Maricopa County has thousands and thousands of cases, so above 250. Basically, for a large portion of Maricopa County, they're in the six hundreds, eight hundreds. Basically, all these zip codes are above 250. So I can't really distinguish between these zip codes as to whether or not it's more prevalent or less prevalent. So a zip code that has 400 cases versus a zip code that has 800 cases looks exactly the same. They're presented visually the same on this map, when they're not necessarily the same if you look at the underlying numbers.
Michael Simeone: Yeah, I see where you're going with the buckets for the choropleth. Right. So the way we're shading, we can read it out, right? The lowest value is 0. 0 cases is one color, one to five cases is level one, six to 10 cases is level two, 11 to 100 cases is level three, 100 to 250 cases is level four, and then the greatest possible value is greater than 250. And what I think has happened here, this is conjecture, but I imagine that this map worked a whole lot better even three weeks ago than it does now.
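To make the bucket problem concrete, here is a small Python sketch that sorts a case count into the classes as they were just read out. It is an illustration only: the function name `shade_class` is made up, and because the spoken ranges overlap at 100 ("11 to 100" and "100 to 250"), the sketch assumes that exactly 100 falls in the lower class.

```python
from bisect import bisect_left

# Upper bound of each class, taken from the ranges read out in the conversation.
# Assumption: a count of exactly 100 belongs to the "11-100" class.
UPPER_BOUNDS = [0, 5, 10, 100, 250]
LABELS = ["0", "1-5", "6-10", "11-100", "100-250", ">250"]

def shade_class(cases):
    """Return the label of the color class a case count falls into."""
    return LABELS[bisect_left(UPPER_BOUNDS, cases)]

# The 400 versus 800 comparison from the conversation.
for cases in (3, 90, 400, 800):
    print(cases, "->", shade_class(cases))
```

Once every count is past the last break, 400 and 800 both come out as ">250" and get the same color, even though one zip code has twice the cases of the other.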
Shawn Walker: Why is that?
Michael Simeone: Because now, so many different zip codes have over 100, over 250 cases, that the way that they've separated out the data for shading by color no longer can keep up, right? We can't compare anymore, because almost every single zip code is above 250 cases. And so back to what we were saying before, maps should allow us to make comparisons across different places, and not make all places look the same. And what we have with this choropleth is making all these places look the same. So if I were to just glance at this map, and I didn't, you know, spend 25 minutes staring at it, then I would think that northern Scottsdale, which is a much more spread out area, a much more affluent area, would have the exact same number of cases or the same class of cases as someplace in central Phoenix, which looks very different demographically and income wise and in terms of what's going on with the virus, right? So if we have something that's 1200 cases, right, in a Phoenix zip code, compared against a North Scottsdale zip code that's 278 cases, this map is representing them as the exact same color. And so we talked about, you know, how can we be misinformed by something? I don't think this is intentional. But this is a misleading visualization, right? This is causing a kind of misinformation, an equivalency where no equivalency exists.
Shawn Walker: You said more than likely, this graph was actually very helpful in the early days of the coronavirus pandemic as it spread throughout Arizona, and now these buckets need to be updated to take into account the scale change in the number of --
Michael Simeone: This is chilling to look at, honestly, to see that greater than 250, at some point, was a really good idea. Right? Dated June 19, there's actually a little note with an update: the total case counts using the original class breaks had to be updated. I would bet dollars to donuts that they're going to update this again sometime soon to reflect this, right? We know that this isn't an intentionally misleading artifact. But again, here we are with something that's actually pretty misleading as it stands right now.
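One statistical way to redraw class breaks so they keep up with the data is to derive them from the data itself, for example with quantiles. The sketch below is a hedged illustration, not what the Arizona Department of Health Services does or did: the zip code counts are invented (only 278 and 1200 echo the figures mentioned above), `class_index` is a hypothetical helper, and `statistics.quantiles` requires Python 3.8 or later.

```python
from bisect import bisect_left
from statistics import quantiles

# Hypothetical per-zip-code case totals, invented for illustration.
zip_counts = [12, 40, 95, 180, 278, 350, 400, 620, 800, 1200]

# Four cut points that split the data into five roughly equal-sized classes.
breaks = quantiles(zip_counts, n=5)

def class_index(cases):
    """Return 0 (lightest) through 4 (darkest) for a case count."""
    return bisect_left(breaks, cases)

print("breaks:", breaks)
for cases in (278, 1200):
    print(cases, "-> class", class_index(cases))
```

Because the breaks move with the data, a zip code with 278 cases and one with 1200 land in different classes here. The trade-off is that the colors stop meaning the same thing from one update to the next, so the legend has to state the breaks and how they were chosen.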
Shawn Walker: So most likely, right after this podcast releases, we'll see this graph get updated.
Michael Simeone: Yes. And then we won't be allowed to podcast anymore because of an executive order from the governor.
Shawn Walker: Well, if that comes with some extra funding, maybe that's okay.
Michael Simeone: Yeah, sure. Yeah, we can go with that.
Shawn Walker: So let's talk about the different types of graphs that are used, and why some of these graphs might be confusing or not. So if we look, for example, at the state of Arizona, on their dashboard, at hospital bed usage and availability, this is a stacked bar chart, where the entire bar represents 100%. And then the bar is sliced into various colors representing different categories. So in the case of hospital bed usage, we have a sort of red bar that says these are intensive care unit beds in use, and then we have a gray bar above that. And that's basically open capacity. So are stacked bar charts confusing at all?
Michael Simeone: I mean, this one I want to give a pass to because there's only two categories. So it seems okay. Although you mentioned, you know, some of the most common charts that we see. I feel like if we were to create a COVID-19 data dashboard Greatest Hits, then a stacked bar chart would be one of them. And for people who aren't familiar with stacked bar charts, I'm sure you've seen something similar, where you have one type of bar, one color of bar, and then another kind of bar, which is a different color, stacked on top of it. And so actually, the governor used a slightly different version of the chart that you're talking about, Shawn, in one of his briefings, I want to say last week, where you did have stacked bars representing breakdowns of hospital bed usage. And these are a big problem for me. This one's fine because we only have two different kinds of bars. But as soon as we get more than one kind of bar stacked on top of one another, it's really difficult to compare values again, because what's the easiest thing to do? Compare the height of two things that are starting from the same place. Once you start mixing that up across, you know, something like 90 different instances, it becomes tremendously difficult to figure out what's going on.
Shawn Walker: And this is a bar chart. I think I know which one you're thinking of. But it was actually the ICU bed bar chart, but they had one third color at the bottom that said, of the ICU in use, what percentage of that is due to COVID cases versus non COVID cases? So then we had sort of three bars, which I think was like a gray, a blue and a yellow.
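The comparison problem with stacked bars comes down to where each segment starts. The Python sketch below computes the bottom and top of every segment in a stack. It is only an illustration: `segment_spans` is a made-up helper, and the percentages for the two days are hypothetical, not real Arizona ICU figures.

```python
def segment_spans(segments):
    """Return {name: (bottom, top)} for segments stacked in the given order."""
    spans, bottom = {}, 0
    for name, value in segments:
        spans[name] = (bottom, bottom + value)
        bottom += value
    return spans

# Hypothetical percentages of ICU beds on two days, bottom segment first.
day_a = [("covid", 30), ("non_covid", 50), ("open", 20)]
day_b = [("covid", 45), ("non_covid", 40), ("open", 15)]

spans_a = segment_spans(day_a)
spans_b = segment_spans(day_b)

for name in ("covid", "non_covid", "open"):
    print(name, spans_a[name], spans_b[name])
```

Only the bottom segment starts at zero on both days. The middle segment starts at 30 on one day and 45 on the other, so a reader has to judge its height from a floating baseline, and that is exactly the comparison that gets hard across dozens of bars.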
Michael Simeone: Yeah, exactly. And again, very difficult to compare. And so once it becomes difficult to compare, then there's all kinds of opportunities to be misinformed. There's plenty of opportunities there to think, oh, I don't think the hospital bed situation is as bad as maybe some people are saying. And honestly, I think some of this misinformation, or the misinforming effect that we're talking about today, is in dialogue with some of the stuff that we've talked about before: that if someone is confused, they may look for other explanations, or they may, you know, seek out additional explanations or be vulnerable to additional explanations. And, as we know, right, there's so much misinformation out there right now around COVID-19 that any kind of gaps in the data or the explanation are just going to be opportunities for some of this misinformation to have some kind of explanatory value.
Shawn Walker: Misinformation loves a vacuum.
Michael Simeone: Yeah, exactly. So, you know, if I'm looking at this, and it's not explained to me what the hospital bed capacity situation really is, like, back to what we were talking about before, people can't necessarily agree on what a bed should count as when you're talking about hospital capacity, right? That figure isn't really well documented right now. And so in the absence of a good explanation of what counts as an ICU bed, or what counts as an occupied bed, or what counts as a surge bed, then a piece of misinformation can do a really nice job of explaining that, which is: they're trying to inflate the number of full beds to make this seem like it's worse than it is, because this is just a false flag, because Q told me yesterday that, you know... So these explanations work out really nicely in the absence of a much better, much simpler explanation.
Shawn Walker: Do we want to explain who Q is?
Michael Simeone: Q refers to the QAnon conspiracy theory. And so in kind of using that free indirect discourse to characterize that conspiracy theory, that's what I'm referring to, the QAnon conspiracy theory, which is probably worth its own conversation. But there are frequent updates distributed online and through social media, supposedly from a team of military officials embedded somewhere in the United States government who kind of pass information on to followers. A lot of conspiracy theories about the coronavirus are coming from the QAnon conspiracy theory circles. And so again, a vacuum in the explanation about the data, or confusion created by unclear visualizations, isn't in itself misinforming, but it creates really nice opportunities for conspiracy theories like QAnon, or some of the stuff that we discussed kind of glancingly about the Plandemic propaganda video, right? They gain more explanatory value when there's more ambiguity and confusion.
Shawn Walker: There's more space for them to hook into, so to speak. So if we go back to this graph of hospital bed usage and availability, we see no definition of what it means for an available bed versus an unavailable bed versus an inpatient bed, which some folks might say, well, it's really obvious. We all know what a hospital bed is, we all know what an ICU bed is, what an ER bed is. Not really. So if a bed was in use for part of a day, does that count? If a bed was in use for an hour, does that count? If someone's pending a COVID test, does that count? So that's why it's really important for these definitions to be clear in the data, so we understand and we all agree that this means the same thing.
Michael Simeone: Yeah, exactly. Because otherwise people are going to start asking those questions that you just asked as evidence for there being something wrong happening. And even in what people are asking, or how people are criticizing this kind of reporting, that's where they go first. The response is, They didn't clarify this, or they didn't specify this, so this seems less reliable to me, because I'm not sure exactly what they're talking about.
Shawn Walker: They must be hiding something from me.
Michael Simeone: They must be hiding something. Right, and they're not hiding something. It's that charting really well can be difficult. But yeah, with these charts that we're looking at here, just because we threw a bar graph up online, and it looks really nice, doesn't mean that we've achieved what we really want to do, which is try to achieve clarity and some kind of common ground about where we're at. These kinds of follow-up questions that you asked are the exact kind of stuff that can create the kind of confusion that can be destructive.
Shawn Walker: I imagine that AZDHS, the Department of Health Services, has a very strict definition that hospitals use to report. They're just not communicating it to the public. But what a lot of these dashboards do, and this is not picking on Arizona, is they just basically give you a title, and then they drop a graph in, and then they walk away. And then that graph gets updated every day. And so now we look at this, and that means, you know, there's no real explanation and contextualization. All data emerges from a context, and there's no explanation of the context that this data emerges from. So now, you know, we're already creating a story about the context this data emerges from. And that may or may not be correct.
Michael Simeone: Yeah. And that's an invitation for all kinds of other explanations.
Shawn Walker: Yes, I can't download this data. I can, you know, mouse over the graph. And I, it'll give me the different numbers. But I can't download this data as a table to then explore further myself, unless I want to spend all the day hand coding this data, or find someone who graciously did this for me. So in some ways, we're being transparent with the graph, but in other ways, we're a little less transparent, because you can't download the data to then look at it in different ways.
Michael Simeone: Yeah, and I keep coming back to this: what decision are we supporting by creating a dashboard like this? A lot of times dashboards are instruments that are designed for very specific purposes. The idea of a generally informative dashboard is a thing, but generally we want to think about at least some of the possible decisions that someone might want to make. So if we're talking about a COVID-19 dashboard, there's a small number of decisions that a lot of people are weighing right now. How much should I go out? You know, should I go to X number of businesses, or what kinds of businesses? Should I wear a mask? All of these decisions are relatively straightforward if we're all on the same page about the CDC right now. Where these dashboards can become more useful is when states start to have very different kinds of situations. There are states where the curve is kind of declining right now, so maybe people are making slightly different decisions. But on plenty of these dashboards, in states that have a lot of coronavirus and states that don't have a lot of coronavirus right now, relatively speaking, there's still not an explicit connection to behavior, which I find very unusual: looking at a dashboard and not having it linked to any kind of decision that somebody is going to make. Right, we're observing the data, but that data is kind of in a vacuum. And so again, I just think it kind of sets it up there for ridicule or criticism, rather than helping connect it to some kind of evidence-based practice.
Shawn Walker: And if we think about sort of the genesis of many dashboards, we can think of these as tools to communicate to an executive team, for example, the status of a company. So if they produce widgets, you have dashboards with various, you know, widget production information and sales information, so you can get this overview, and then you can make your next decision. Or with this dashboard: does the number of hospital beds that are available versus not available help me understand whether I should go to the grocery store today? Or does this just feed my anxiety? I don't really understand what's going on. And I just see these red bars growing and growing, but I don't really know what that means.
Michael Simeone: Yeah, I mean, I think there should be some kind of subtext for charts at all times. And if not, there should be some kind of lengthy explanation about why we're all here, so to speak. So there should be a reason and a decision associated with this stuff. And you know, the other point you bring up is also interesting. Not only are dashboards designed oftentimes to help support complex decisions with lots of interdependent factors, so that people who are making decisions can just, you know, more rapidly or in a more informed way make them, but also, right, we can't leave behind this idea about how visually ambiguous these things are with much more mundane things. If we didn't have coronavirus labeled on any of these charts and graphs, it could just as easily look like sales figures from an online retail company. You know, we could just be looking at Q2 sales in the state of Arizona by day, but it looks exactly the same as coronavirus cases. That's not to say that every chart has to look exactly different, right, depending on what it's doing. That's impossible. But it does contribute to some of our expectations, right? It creates that vacuum, so that without some kind of further explanation about why we're paying attention to this stuff, we just default back to our normal expectations about looking at these things, which is, oh, they're just reporting it, and I want to know it. But we have to know what kind of decision we want to be making.
Shawn Walker: And if these were sales figures, we'd be having a rockin' quarter.
Michael Simeone: We'd be having a great quarter. That's really grim, but it's true. I'm also noticing, as we go through newspapers and state departments of health from all over the place, that a lot of these charts and graphs are starting to look the same. Have you noticed that too?
Shawn Walker: Yes, I've noticed that they're similar. Sometimes the colors are different. But the design of the graphs, the way information is presented, how the dashboards load, it just kind of looks like they've all purchased the same product, and that's what they're using to display their data.
Michael Simeone: Actually, if we go through and look at even just the domain names for some of these, we can see that they actually may have purchased the same product. I think something like 15 states are using an Esri product to visualize their geospatial trends. Esri is a geospatial software company. Tableau is another product it looks like some of these people are using. Tableau, again, is a data dashboarding and visualization software product. And there's Microsoft Power BI, which is a business analytics software tool. So I think to some extent people did buy the same product for this purpose, and that might account for some of this sameness that we're observing.
Shawn Walker: I mean, also, these tools have a default set of graphs that they come with, where you've got to click your data, click the graph type, press a button, and then presto, you have a dashboard. So creating a new type of graph that's not in the tool, that's not the sales, quarter three graph, is very difficult, time-consuming, and expensive.
Michael Simeone: Yeah, totally. And, you know, here we are, in a situation where science and information communication are now being marshaled as a form of disaster relief. Statistics, modeling, medicine, clinical care, these are all now bundled together. We normally don't think about scientists and statisticians and biostatisticians as people who are kind of responding directly to an urgent kind of emergency. It's normally not that situation, right? But here we are with people who are reporting out data and statistics. It's an emergency, and getting the information out is important. But it's not like any of these state institutions are so generously funded that they just have a crack team of data visualization and statistics folks working around the clock to make sure we have the best and most nuanced products available. That's just not happening.
Shawn Walker: And we also have to understand that these tools were spun up, in many ways, sort of almost overnight sometimes.
Michael Simeone: And being revised, right?
Shawn Walker: Right, and constantly being revised. But also, there are hundreds of thousands of visitors to each of these dashboards, if not more, every single day. The infrastructure, so the servers and internet connections and tools that sit behind that, these aren't trivial. And you know, this is not an area of expertise of, you know, the Arizona Department of Health or any department of health. They don't run servers; that's not their job. So hooking into these tools in many ways is ingenious, but then leads to a lot of, well, these tools were designed to forecast sales or production. These weren't designed to show epidemiological graphs or the spread of a virus.
Michael Simeone: Right. So the more custom it is, the more it's going to cost you. Or we could use an off-the-shelf product and put it on Amazon Web Services. But if we do something that isn't bespoke, then we get all kinds of problems that we're talking about right now, which is that the charts raise a lot of questions, and then also occlude some of the data, raising more questions. So we walk away with that vacuum you talked about, and that's an invitation to explain what we see, or to criticize what we see. And right now, the conspiracy theory thinking, the misinformation that's kicking around, this is the stuff that's oftentimes flying in to serve as an explanation for what's going on, or really as fuel for the criticism of what we see. I mean, we've talked about these dashboards, and I think there have been some kind of interesting points raised about what goes into making them, and how reliable they are, and what the consequences are for ambiguity and occlusion of data. But how many people do you think actually use these dashboards and take them seriously?
Shawn Walker: I have some anecdotal evidence, I guess I would say. My non-academic friends and family are either direct users of these dashboards, because they have conversations with me about these every day, or indirect users of these dashboards, because journalists are using those to report information, or they're gathering data from this and then re-visualizing it, like in the case of the New York Times. So I think these dashboards are actually heavily used, because we're all trying to figure out what the heck is going on with this pandemic.
Michael Simeone: Yeah, you mentioned the site traffic looks like it's pretty heavy. I think I did a bad job of raising the point that maybe some people don't come to these dashboards at all, because the trust level is so low, or they feel like it's all a hoax anyways.
Shawn Walker: I imagine some of those folks are actually doing the opposite. We see it in elections and other places where these graphs are produced, and then the sort of holes in the graphs are used as entry points to say, this actually isn't a problem. Or, see, we can't trust this data; let me highlight this one issue in this graph, which may have been corrected. So if we go back to, for example, a great example is the Georgia Department of Public Health, where they, I'll say, accidentally produced a graph.
Michael Simeone: Oh, yeah. Early May, Georgia Department of Health Services. Yeah.
Shawn Walker: So traditionally, we would do like a time series where we have June the first then June the second, then June the third, and then we display that data in.
Michael Simeone: June the fourth, also known as the day that comes after June the third.
Shawn Walker: Correct. And what happened is that the Georgia Department of Public Health produced a graph that, instead of being ordered by day, was ordered by number of cases. So these were out of order, making it look like the graph was going down at a downward slope, so it looked like cases were decreasing. And journalists picked up on this. There was a firestorm. The Georgia Department of Public Health said, We're sorry for the mistake. We removed this from our website, from our social media. We fixed that graph. We apologize. But cases like that are used by all sides of the political spectrum to sort of make their point. Whether I would argue their point's valid or invalid, that's a different conversation.
Michael Simeone: It can appear nefarious, but just clicking sort by value, there's a very kind of innocuous explanation there. I will get my facts right, right? I kept calling it Department of Health Services for Georgia, but that's Arizona. No, it's the D, DPH? Okay, yes. Yeah. Just making sure I don't project any misinformation.
Shawn Walker: Not intentionally, right?
Michael Simeone: Not intentionally. Anyway.
Shawn Walker: Long story short here is that even those that don't believe that the pandemic is an issue, they're still either primary or secondary consumers of this information, because they use this as part of their argument.
Michael Simeone: I see what you're saying. So the best worst-case scenario is that these charts are confusing, cover up data, and encourage equivalencies where they don't exist. But the worst worst-case scenario is that these points of weakness or ambiguity can be leveraged by people who are expressly interested or invested in misinformation.
Shawn Walker: So the site might say, well, we're still processing data. So there's a backlog. And then someone says, Well, of course, it takes a long time for you to cook the data. So that's why you can't put it up immediately.
Michael Simeone: Right. The Deep State is very busy right now, and so it needs a little bit of time to get the figures up on the website. All right, so thinking about wrapping up, what use is the data dashboard on COVID-19 right now? What good is it? If we've identified a number of different potential tricky parts, what good can we see in these things?
Shawn Walker: Well, I see them primarily as a transparency tool. So this is a way for public health services and the government to say, now you're working with a lot of the data that we're working with. And you can see this data, we can explain our policies to you in the context of this data. We might disagree, but at least we're working from the same data.
Michael Simeone: Yeah, this feels almost like the accountability side of things is just as important as the communication side of things. It's kind of confusing what exactly, or why, we're communicating this information in terms of people's individual decisions. If just yesterday we were mandating wearing masks, what other decisions were people supposed to be making? But the other part of this seems to be holding people accountable and making sure there's some modicum of transparency.
Shawn Walker: But this also then brings all the complications with it. This is actually highly technical data. You might have one number of cases per day, but that one number has a really complex context from which it emerges: a whole testing environment, a whole set of test criteria, reporting criteria, delay criteria, so there's a latency. That one number wraps up all of that inside of it. So it's not just, this is transparent. It's also, this data is actually really complex. And these dashboards don't do a lot to honor that complexity, which leads to a lot of confusion.
Michael Simeone: So if I'm just gonna look at this now, I feel like maybe I just look at this as an educated guess. Is that a better way of thinking about it, do you think? You know, because like, what's the way forward with this? I'm not going to stop checking this dashboard. But does this mean that I should check it less? That I should take it with only so many grains of salt? Or that if I'm going to check it, I should make sure that I've got some extra coffee, because I'm going to be spending 20 minutes with it just making sure I've got my ducks in a row?
Shawn Walker: Well, I think extra coffee is always a good solution to most problems.
Michael Simeone: Yeah, I forgot about that.
Shawn Walker: So I would say a little bit of extra coffee, and then also finding some individuals, especially researchers, because there are a lot of researchers that are doing daily work to help the public interpret the data. So combining your interpretation of these dashboards with, potentially, a researcher that you feel comfortable with. A lot of them are on Twitter and on Facebook, and they can help you interpret some of that data.
Michael Simeone: Okay, so it seems to me that in looking at these dashboards, maybe one way to put it is we're only looking at half the story. We're looking at the output, but we're not looking as much at the process or institutions that kind of created all this stuff. And so if we want to be responsible users of dashboards about public health information like this, then we have to have homework that leads up to it, which is to just better understand where the data came from, what the techniques are, or what current logic is going into sorting and presenting the data.
Shawn Walker: Yeah, so I think what we almost need is another button or tab on each of these dashboards, where they work with some science communication specialists to visualize, well, how does this number end up on a dashboard, and to help us contextualize that a little more.
Michael Simeone: And until that tab appears, it seems like everybody's got to come to these dashboards with a running start, with some information and understanding, because it's just not here on the dashboards right now, or at least in many cases, it's certainly not.
Shawn Walker: Yeah, we have to go find that fine print, and then read it and interpret it, which is a lot of work.
Michael Simeone: Yeah. And I mean, that's why I value that publications like the New York Times and Washington Post have kept the same graphic up since the very beginning of the pandemic, because it at least means that when you come back to it, you understand it. There's value in repeating the same thing, or just updating the same chart over and over again. This is back to where we start getting a dividend from a dashboard, right? The whole reason we put up the same charts in the same configuration in the same way and show it over and over and over again is not just so we can support a decision and, you know, make these comparisons. It's also so we keep some kind of consistency, so that we don't have to bootstrap every single time we want to look at a chart. And so it feels like that homework is gonna pay a dividend. It's not like every time you check your web browser for this kind of information, you're going to have to do a ton of work. But it does feel like, you know, we should be thinking about the dashboard as having some kind of payoff that you have to invest a little bit more time in. I think I've been using these dashboards all wrong, is what I've come to a realization on.
Shawn Walker: So to wrap up, I would go back to I think a lot of what we're talking about here is the context. So we have to do a little more work to understand the context of the final number, so we can really better understand what it means.
Michael Simeone: So thanks for joining us for this conversation. We'll see you in the next one. For questions or comments, use the email address data email@example.com. And to check out more about what we're doing, try library.asu.edu/data