Rachel Rush-Marlowe Transcript: April 7, 2025
Rachel Jones/NPF (00:00:00):
Session one of the April 2025 Widening the Pipeline virtual training. We’ll focus on the March 21st executive order signed by President Donald Trump that was intended to dismantle the United States Department of Education. In the weeks since then, there’s been a great deal of focus on how that federal policy decision would affect local communities. From the pause in the collection of education data to the cessation of grants for specialized education initiatives, many observers are concerned about how public schools and communities across the country will operate moving forward. So during today’s Widening training, our guest speakers will provide critically important context about how the elimination of key DOE offices, like the National Center for Education Statistics, will impact student learning. In fact, our first speaker, Rachel Rush-Marlowe, worked for both the Department of Education and for the NCES before launching ResearchEd in 2020. That organization provides research, advocacy and data services to post-secondary institutions and affiliated organizations across the country. Rachel has also worked for organizations such as the Association of Community College Trustees, the National Student Clearinghouse Research Center, New America, and Quality Information Partners on contracts for NCES. Today we are especially privileged to have, excuse me, today we are especially privileged to have Rachel provide us with insight into the role that data and research play in shaping the education of American children. Rachel Rush-Marlowe, thank you so much for joining us today.
Rachel Rush-Marlowe/ResearchEd (00:02:03):
Thank you, Rachel, so much for having me. It’s a pleasure to be with you all and I think at a time when education data is really at risk, it’s important that journalists are empowered to know how to use it, where to get it, and what we can do with it when we can no longer rely on the government to be sharing this information and providing the public with the information that they need.
Rachel Jones/NPF (00:02:24):
Now, as I mentioned during our first conversation, prepping for this session, you have a unique perspective on this issue in that I believe you started at DOE at the beginning of the first Trump presidency. You’ve had a sort of span of perception of what’s going on there because you also analyzed the impact of COVID on education and how the DOE operated. So before you get to the presentation that you’ve prepared for us, could you give us a little bit of insight into that context?
Rachel Rush-Marlowe/ResearchEd (00:03:00):
Absolutely. Yeah. So when I first started my career in higher education, I finished graduate school in 2016 and started working on contracts for the Department of Education, the National Center for Education Statistics at the time. So I started in September and then the administration transitioned in January of the following year. So I got to see it from both perspectives, and things changed pretty drastically in January. At that time, I think the administration was notably different than it is today, in that I think they were not anticipating the win. They were not prepared to take office and to take leadership, which to be quite frank, given how things are going now, was perhaps a blessing at that time. But what it meant then was that they didn’t have staff in place. And so our work became incredibly challenging and in fact impossible to do. As a contractor sitting outside of the department, we had certain staff members that we would liaise with, and they had all either been removed because they were political appointees and they had not been replaced by Trump appointees.
(00:04:08):
And many career staff chose to leave the administration as well. And so we essentially had no points of contact within the department, and those roles went unfilled for at least a year. And after about a year, I decided to move on because it was incredibly challenging to do the work, and it felt like I wasn’t able to support students or families through my role because it was incredibly difficult to get anything done. I was and remain incredibly concerned about the impacts that that administration had on the Department of Education. I think that it’s difficult to report on. It’s not a top-level news story. Bureaucratic inefficiency, although we’re talking about it in a very different context today, is not usually kind of sexy reporting, to talk about understaffing and things like that. But I’m concerned that, prior to this administration, we were already feeling and will continue to feel the impacts of the vast understaffing that occurred during the first Trump administration. And so I’m much more concerned, of course, with the dramatic cuts, RIFs and dismantling of our systems that are occurring now.
Rachel Jones/NPF (00:05:19):
And of course, as we discussed, we often think of DOE in terms of K through 12 education, but you have some other insight into the higher ed piece of this. What happened with your father? Can you share that with us?
Rachel Rush-Marlowe/ResearchEd (00:05:36):
Absolutely. Yeah. So the Department of Education also does a ton of work in the higher education space. That’s where most of my portfolio is. My dad is also a professor and dean at the University of South Carolina Beaufort, where they recently brought in a $5 million Teacher Quality Partnership grant. Teacher Quality Partnership grants facilitate relationships between K-12 districts and higher education institutions. This does a number of different things, right? It reduces teacher pipeline shortages. So colleges are paired with K-12 districts so that they can do their teacher in-service training at those schools, which sets them up for roles in those schools right afterwards. It also is a great opportunity for education in place, especially in rural communities where we do a lot of work, so that kids in the K-12 district also see a pipeline and a pathway to what it could look like to remain in their community and get connected with that higher education institution.
(00:06:32):
They do kind of reciprocal programming so they can see themselves as college students as well. Unfortunately, the Teacher Quality Partnership grants, I believe 65 million worth of grants across the country, were canceled a few weeks ago. There have been a number of things in the courts back and forth, legal battles eventually ending in... Most recently, last week, a number of the grants were restored across the country due to ongoing litigation via a large educational association, which fought on behalf of their participating institutions. Some, but not all, grants were restored. However, over the weekend, I’m not sure of the exact ruling date, I believe perhaps last Friday, some cases went all the way to the Supreme Court, and a divided court ruled in favor of the Trump administration, once again rescinding some of those Teacher Quality Partnership funds, which will have just devastating impacts on both higher education and K-12 districts across the country.
Rachel Jones/NPF (00:07:30):
So why don’t, at this point, I’ll let you take it away with your presentation and we can take it from there.
Rachel Rush-Marlowe/ResearchEd (00:07:37):
Sure. Thank you so much Rachel, and thank you for setting the stage kind of with those questions and conversation. I’m going to share my screen, so I have a presentation to share with you all today, but I would like to make this fairly interactive and have a conversation as we go. I’m sure you all know each other quite well by now, but I don’t know any of you and you don’t know me yet, so please feel free to jump in or put questions in the chat and I’ll pause as I go. I have a lot to cover today. So I wanted to focus today’s conversation both on a better understanding of federal education, the way that the current kind of dismantling of the Department of Education is impacting education data and how that connects to how we can support families and students, how we can support the public, and understanding how the Department of Ed supports families and students and what the state of education is.
(00:08:26):
But I also wanted to give some information on how to create meaningful and impactful data visualizations with this data and give you all the tools to do that as part of your reporting. So as we get started, I think I’ve already been sufficiently introduced. I have just the mission and vision of my organization here, but we’ll keep going. So I want to get started in talking about how you might be interested in using education data in your reporting, if you use it already or if it’s something that you’re thinking about. I would love to see and hear a little bit more from you all in the chat and talk a little bit about how you think about education data, what education data sources you might already be using. We’d just love to get a sense of where everyone’s at. And as you are popping that in the chat, I’m also going to bring up a quick poll. So if you go to menti.com on your phone or laptop and you put in the code 7 9 7 4 7 2 3 8, I’ll drop that in the chat as well. And just to get a sense of how comfortable you are, I said data generally, but it could be data, it could be education data, federal data sets. We’d love to get a sense, just kind of level set, of where everyone is at and if you have anything you want to drop in the chat as well.
(00:10:09):
Awesome. So it looks like we’ve got almost everyone participating and looks like the, well, let me not say so. It looks like the majority of you said that you’re somewhat comfortable with data, right? Right there in the middle. Some people said uncomfortable, some people said very comfortable. I think that’s great. I bring this up just to kind of level set and get a sense of where everyone’s at. But also just to say that if you are comfortable reading this visualization, that’s all the baseline that you need in understanding data to do what we’re going to talk about today. So we’re really going to focus on what you can get out of federal data sets, where you can find federal education data if it’s no longer available on federal websites, and how you can use it to make some really compelling, clear visualizations for your readers to help them understand education data as well. I see someone in the chat said they use state gathered metrics and federal ed U and census data. It’s dense usually and overwhelming. Absolutely. So it sounds like you already have a ton of experience with this, Lionel, but we’re going to look at some tools and resources today to help make some of that data a little bit less overwhelming. Thank you all so much for sharing.
(00:11:20):
Sorry, bear with me while I continuously shift between different apps and have my whole screen shared. So I just wanted to, again, bring that up to say that I want to make this as accessible as possible. I think this is really easy stuff once you get your hands on it, and it can really improve how you access federal data sets and use them. So just a quick overview: today we’re going to talk about some different federal education data sets and where you can get those from. I’m going to walk you through something called PowerStats, which is a DataLab tool. It’s, again, through NCES and helps make some of that data a little bit less overwhelming. We’re going to talk about data cleaning, storytelling with data, how to make data visualizations, looking at some best and worst practices, and then planning to make your own based on what we’ve learned today.
(00:12:08):
So I don’t know if there’s any Harry Potter fans, but federal data sets and where to find them. So I just wanted to give a brief overview of what the National Center for Education Statistics is. It’s housed within the Department of Education. They are a congressionally mandated organization that is required by Congress to provide reliable, trustworthy statistics about the condition of education in the U.S. So they were created by an act of Congress, and in theory it should take an act of Congress to dismantle them, but we’ll talk a little bit more about what’s happening with that. So the NCES data includes hundreds and hundreds of data sets with really in-depth information on really anything you could want, I think, at the federal level. And many of the resources are collected at the state and district level as well. And so you can get really granular and look at your local communities through this.
(00:12:59):
So there’s kind of four buckets, I would say, of the unit of measurement of data that you can get with NCES. There’s student level data, so things on enrollment demographics. So if you want to look at K-12 or higher education, looking at student groups broken out by race, ethnicity, gender, disability status, there’s a ton of rich information there. You can look at performance metrics like test scores on standardized tests such as the NAEP. You can look at graduation rates both in high school and college, financial aid and tuition. So I’m not going to read through all of these, but just wanted to give kind of a sense of the types of data we’re talking about, the level of data we’re talking about, and to touch again on the current context before we get into looking at this in more depth. Unfortunately, the Institute of Education Sciences, which is where NCES and all of this data is housed, saw more cuts than any other department within the Education Department.
(00:13:53):
So the acronym here, ED, is what we typically use in the field. A lot of people, you’ll see, sometimes will use DOE. Usually in Washington, DC, DOE refers to the Department of Energy and ED is the Department of Education. So out of 221 employees, about 47% were laid off from IES. And we’re hearing from the field that impacts of this are already being felt. So again, I work primarily in higher education outside of research. Some of the other work that we do supports colleges and universities directly with their data collections. There’s a mandated reporting cycle that’s happening right now for higher education institutions. They have to submit what’s called their IPEDS data to the Department of Education within the next couple of days. And we’re hearing that submissions are failing, the portal is down, all kinds of challenges are being experienced that, colleges are telling us, they have only ever seen happen before during government shutdowns.
(00:14:51):
So the site is typically very stable, easy to access, and right now that’s not the case. On Friday, a lawsuit was filed against the Department of Education. The lawsuit was filed by an organization called the Institute for Higher Education Policy, as well as the Association for Education Finance and Policy. They’re being represented by the Public Citizen Litigation Group to stop the unlawful dismantling of the Institute of Education Sciences. And the lawsuit is seeking to restore all of IES’s staff, contracts and resources. So this is very much in development. I was working on finalizing these slides Friday afternoon when I saw that news. So things are happening very quickly and it’s hard to anticipate what this will look like. But all of this is to say that the future of education data collection in this country is very much at risk, which is why I think it’s so important that we’re having this conversation, that you all are empowered to use this data, report on it and share it with the public for as long as we have it.
(00:15:50):
And again, we’ll be talking about some ways to access this data should the federal sites go down entirely. So the next thing I want to walk you all through is the PowerStats DataLab. This is a fabulous, user-friendly tool that allows you to look at any of these hundred NCES data sets that we talked a little bit about. It’s an interactive website. It allows you to look at higher level data without having to use any statistical packages or, as I think Lionel put in the chat, sometimes it feels really overwhelming. So PowerStats is a great way to get started looking at some data. It doesn’t require anything fancy. Like I said, if you have the tools to analyze this Mentimeter poll we looked at, you have the tools you need to look at PowerStats. It does require an account to log in.
(00:16:34):
I am already logged in today, but all you need is an email address to set that up and you’ll get to this page. So you can see all of these different data sets available, things like the School Survey on Crime and Safety, staffing, the National Teacher and Principal Survey. So today, just to give kind of an example, I’m going to walk us through looking at the Baccalaureate and Beyond. And so if we click here, you can also get some quick information about what’s included in the study, the type of data, how regularly it’s collected, things like that. So if you’re working on a story and you need a specific data point, this is a great place to come. Or if you have kind of a theory, maybe there’s something you want to look into, this can be a great place to find some data to back up your work. So I’m going to hit launch here. So as you’re working through PowerStats, the one thing I’ll say to remember is always just look for the kind of green buttons, green or orange buttons, and those will guide you. If you are looking at this data, the green is always kind of there to walk you through.
(00:17:42):
Oh wow, I just ran this this morning to prep for you all. And it’s saying that there are no variables in this data set. Oh, I’m sorry. I’m sorry. Always follow the green. I’m always doing this and I’m like, is it gone? The data’s already gone. This just happened. Sorry. So the first thing you need to do, as it says in green, is to select an analysis type to begin. So we are going to look at percentage distributions today. Again, just picking something kind of simple to get us started. It’s still saying there are no variables in the data set.
(00:18:18):
This is crazy. It is very possible that this data has disappeared today. Wow. Okay, well that’s great. This is crazy. Okay, we’re going to talk about where you can find data if this happens. I really don’t believe that this is gone though. Alright, well, things are dynamic and changing every moment. Hopefully PowerStats still exists by this afternoon. And I will take a look at this once we’re offline and get back to you all. But I’m so sorry about that. This is so strange. So if PowerStats or other sites no longer exist, there are a few different options. There is something called the Wayback Machine. If you’re not familiar, this is an archive website. So you can go in and type in a URL of a site that maybe used to exist, and it’s a nonprofit organization that is archiving the internet. One thing that is not actually available in PowerStats is NAEP scores.
(00:19:45):
So these are the standardized test scores that children across the country take, I believe third through 12th grade, in some states every year and some states on a less frequent cadence. Zelma AI is a great tool we’ll look quickly at together. And then I’ll also show you the PNPI Data Explorer since we did not have the chance to actually look in PowerStats. Zelma AI is a great resource. It’s hosted by, again, a third party, so no risk of it going down. There is of course a risk that the data won’t continue to be updated as we move forward, right? But Zelma AI uses a very similar format to ChatGPT. You can type in a question and it will query the NAEP data for you. The PNPI Data Explorer is formatted in kind of Tableau. The PNPI Data Explorer is focused on higher education and can go down to the district level, which can be really helpful, again, for kind of that local reporting. So we can take a quick look at both of these tools. In zelma.ai, you can click on explore assessment data and type in a question, and they have some questions kind of ready for you. So if we just click on math scores in Alaska by grade over time, I won’t come up with a new question, just to save us a minute.
(00:21:04):
And right here you can see it generates the visuals for you right away. So you can see, for grades three through eight, those standardized test scores in Alaska from the 2016-17 academic year all the way through to 2023-24, with of course some missing data during COVID. So you can see the way that these proficiency rates changed over time. I will also pull up the PNPI Data Explorer. I wasn’t planning to demo this, but since it seems like our federal data sets are disappearing in real time, we can see here. So this is a really great tool for post-secondary education data. You can select the state, congressional district or territory you’re interested in. So let’s look at Alabama Congressional District 3, give it a second to generate, and it’ll give you some top line numbers, information on enrollment and access, college costs, attainment, student debt. And so everything is really pretty nice and easy to look at here. And you can also pull these images down directly, choose a new format to download them, export it as a PDF. And so these can be included as you kind of do some exploring, research and reporting. So just another great way, and this is all based on different data sets that are from NCES, but they’ve been pulled into this format for easier analysis.
(00:22:36):
The last thing that I will mention is the Data Rescue Project. So I have it linked here. I’ll be sharing these slides at the end of the day. The Data Rescue Project is an incredible, massive data archivist effort. It was actually just covered this morning. I think an article came out in, I believe, the New York Times covering the Data Rescue Project. So they are a group of archivists, researchers, volunteers from all across the country, with all different types of expertise, that are working to preserve as much federal data as possible. It’s also a great kind of social network of folks. And so if there’s something that you’re worried about losing or you’re looking for, there are ways to kind of get connected with people there and see if someone can pull that for you or if someone has perhaps already collected it, and they’re working on finding servers and secure ways to host all of this information.
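[Editor's sketch] For the Wayback Machine option mentioned above, here is a minimal Python sketch of how an archived copy of a page can be addressed. It only builds URLs and makes no network calls. The two function names are hypothetical, and the NCES DataLab address in the usage note is an assumed example, not a link taken from the slides. The `web.archive.org/web/<timestamp>/<url>` pattern and the `archive.org/wayback/available` lookup endpoint are the Internet Archive's public conventions as best understood here; whether a capture of any given page exists is not guaranteed.

```python
from urllib.parse import urlencode

WAYBACK_BASE = "https://web.archive.org/web"
AVAILABILITY_API = "https://archive.org/wayback/available"


def wayback_snapshot_url(page_url: str, timestamp: str = "") -> str:
    """Build a Wayback Machine address for page_url (hypothetical helper).

    timestamp is digits in YYYYMMDD[hhmmss] form. When it is omitted,
    "*" is used, which asks the archive for its list of captures.
    """
    if timestamp and not timestamp.isdigit():
        raise ValueError("timestamp must contain only digits, e.g. 20250101")
    return f"{WAYBACK_BASE}/{timestamp or '*'}/{page_url}"


def availability_query_url(page_url: str, timestamp: str = "") -> str:
    """Build the query URL for the archive's 'closest capture' JSON lookup."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_API}?{urlencode(params)}"
```

Usage, under the same assumptions: `wayback_snapshot_url("https://nces.ed.gov/datalab/", "20250101")` gives an address to paste into a browser, and the archive typically serves the capture closest to that date if one exists.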
(00:23:34):
Alright, so once you’ve decided where you want to grab your data from, the next thing that we always have to do is data cleaning. So you want to pull out typos and inconsistencies. So if you’re pulling a raw data set, like one of those ones from Census or somewhere else, you might see some inconsistencies in the data. It might say NY in some records and New York in another. And if you’re thinking about trying to do analysis or build a visualization, whatever tool you’re using is not going to know that those two are the same, right? Even as AI improves, AI won’t necessarily always recognize those as the same value. And so you need to clean those up. You might want to remove duplicates, think about how to handle missing data, ensure that your data types are correct, right? So for anyone who’s used Excel before, no matter what thing you put in there, it wants it to be a date even if it’s not a date.
(00:24:26):
So you need to make sure that’s right and just getting things structured and formatted properly. So my hope for today was to kind of walk you all through a very concrete example and pull some data out of Power Stats. Not sure what’s going on with power stats, but the data that we were going to walk through together and pull out was looking at students that reported that they were either able or unable to maintain their basic needs and expenses during the pandemic. And so what I had pulled out previously, what a time we were living, I was crushing, yeah, sorry, I’m just looking at the chat. I am absolutely blown away. I mean, I tested this out like an hour before we got on the call together. So I don’t know if it’s user error because I’m not doing things sitting by myself calmly or if it’s really gone already.
(00:25:17):
But I will let you all know once I’ve had a chance to look into it. But I pulled this data out this morning ahead of the call just to save a little bit of time. So we would’ve pulled this together from PowerStats. And what this is looking at is the number of students who reported that they were unable to meet their expenses, essential expenses, over the last months as a result of the pandemic. And what I found, kind of looking into this, trying to find something for us to explore today, was that almost double the share of students who reported having a disability had challenges meeting their essential expenses. So I thought that was something kind of interesting that we might want to look into. When we pull the data from PowerStats, it’s pretty high level, it’s pretty clean. But again, if you want to create a data visualization, do something with this in your reporting...
(00:26:06):
This is still pretty messy. You have these estimates, you have standard errors, you have all of this at the top that none of this is going to go into a visual. And so just to save a little bit of time for the example today, I cleaned the data up outside of the call. And so this is what we would be using to create our visual. So we have a column for disability status, disabled and not disabled. And then those students that answered, yes, COVID-19 was a reason for me failing to meet my essential expenses. So we can see about 9% of students without a disability reported that that was the case compared to 16% of disabled students. So it would’ve been really fun to discover that together in power stats. But here we are. So this is what the cleaned up data looks like and depending on the type of data you’re working with, how many variables you have, you can do it kind of manually as I did here in Excel, just pulling out the pieces that I needed. If you have a larger data set, there are some tools and programming that you can do for that. So just want to kind of highlight those options based on what you’re working with.
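[Editor's sketch] For the larger-data-set case mentioned above, the cleaning steps described (reconciling labels such as "NY" and "New York", removing duplicates, handling missing values, fixing data types) might look roughly like this in Python, using only the standard library. The records, column names and lookup table are invented for illustration; they are not NCES or Census data.

```python
import csv
import datetime
import io

# Invented raw records for illustration only.
RAW_CSV = """state,enrolled,survey_date
NY,1200,2021-03-01
New York,1200,2021-03-01
 ny ,950,2021-04-15
CA,,2021-03-01
California,2100,2021-05-20
"""

# Hypothetical lookup that maps every spelling to one canonical label.
STATE_NAMES = {
    "ny": "New York",
    "new york": "New York",
    "ca": "California",
    "california": "California",
}


def clean_rows(raw_text):
    """Return de-duplicated rows with consistent labels and proper types."""
    cleaned = []
    seen = set()
    for row in csv.DictReader(io.StringIO(raw_text)):
        # 1. Fix inconsistencies: trim whitespace, ignore case, map to one label.
        label = row["state"].strip()
        state = STATE_NAMES.get(label.lower(), label)
        # 2. Handle missing data explicitly instead of letting it become 0.
        enrolled_text = row["enrolled"].strip()
        enrolled = int(enrolled_text) if enrolled_text else None
        # 3. Enforce data types: parse the date rather than keeping a string.
        survey_date = datetime.date.fromisoformat(row["survey_date"].strip())
        # 4. Remove duplicates, judged after normalization.
        key = (state, enrolled, survey_date)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(
            {"state": state, "enrolled": enrolled, "survey_date": survey_date}
        )
    return cleaned
```

Note that the first two records only count as duplicates once "NY" and "New York" have been reconciled, which is why the normalization step runs before the duplicate check.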
(00:27:15):
So connecting the data to your story, none of this matters if you don’t have something else to say to go with it. So I wanted to spend a little bit of time talking about what story your data is telling and how you want to incorporate that into your reporting. Today, for example, I was hoping to look into PowerStats and just kind of see what interesting things we could find. We didn’t have a particular story that we’re working on together. We’re just exploring if there’s something that needs to be told. And when I was preparing for today, and I found this thing where the data showed that disabled students were much more likely to have had difficulty meeting expenses during COVID, I thought that was really interesting. Perhaps that’s not interesting to others who are more familiar with the disability space. That might just be consistent with what we know to be true about disabled students and their ability to meet their financial needs.
(00:28:04):
But maybe there’s a story here. So I would want some more information about disabled students and non-disabled students as I build together what might be a story around this data and around a visual I might want to create. So sometimes you might be working on a story, you’re looking for data to back it up. Maybe you’re just exploring to see what’s out there, see if there’s a story that needs to be told. I think both can be valid, but we have to be really careful of cherry picking data and thinking about our own biases and how our biases can impact the stories that we tell and what we see in the data. And so in order to be effective creators of our own compelling stories and data visualizations, we have to be critical consumers of others. And we have to be critical about our own biases, what we bring to the table in this work.
(00:28:48):
And so to start out that conversation, I have some examples of data visualization and what not to do. I know that’s not usually how you start things, but I think it’s a fun way to get thinking about data, thinking about how we might want to use education data in our stories and visuals by looking at some examples of what works and what doesn’t work. And so silly little cartoon here, right of cherry picking, or I guess maybe apple or number picking, he’s asking her to pull out an 84. It would look really good in their report.
(00:29:24):
So the first one here is a pie chart. There’s nothing wrong with pie charts inherently. It’s a very clear, concise way to show something, very accessible to most readers, but there’s usually a better way to show data, I think, right? I don’t usually recommend pie charts. It’s just not very interesting to a reader. If you’re going to make a pie chart, it can just as easily be a sentence in text, I think. Except for this pie chart, this is obviously perfect. Donut charts are a type of pie chart. They just have the center missing, just like a donut, like this visual here. Again, I’m not a big fan of donut charts. They’re not any different, really, visually from pie charts, and they’re not necessarily the best way to display all types of information. So in this visual, we’re looking at how much of each generation’s spending goes to restaurants.
(00:30:16):
So we see these four different generation categories and the percentage of their income that they spend. I think there’s a couple of different things here besides me personally just thinking that donut charts aren’t a great way to display information. I think there’s a particular issue here with this visual that I want to tease out a little bit with you all. So I’m wondering if you want to come off mute or just put in the chat, what’s something that you notice? What’s the first thing that stands out to you about this image? What are you thinking about when you’re looking at this?
Rachel Jones/NPF (00:30:46):
I’m thinking that the color pattern or whatever is not compelling.
Rachel Rush-Marlowe/ResearchEd (00:30:57):
Yeah, color choice is really important in visuals. And so hot pink, it’s cutesy, it’s a donut, but what is it really telling us? All I can see are donuts. Yeah, sprinkles are distracting. Yeah, it’s not in order. That’s a great point. And it’s kind of strange also that they have them decorated as donuts because the visual has nothing to do with donuts. The way it’s displayed feels counterintuitive. Yeah, yeah, exactly. So not being in order, the display being a little counterintuitive, there’s very little difference between the percentage amounts, so they all look similar. Yeah, exactly. So you all have kind of touched on what is one of my biggest issues with this visual, which is that they’re using it for comparison. Yeah. All the percentages look like they could be about the same size at a glance. Exactly right. So when you actually look at the numbers that they’re telling you, traditionalists at 13% versus millennials at 24, that’s kind of close to half, right?
(00:31:56):
13 to 26. So that would be the thing that’s most interesting to me, is to say that millennials spend almost twice as much of their income on restaurants as traditionalists. But when I look at this image, that’s not what pops out at all. Right? Too playful. Yeah. Your eyes don’t go where you want them to. So much great feedback. So I think that this is a pretty poor chart type choice for this type of information. If the purpose of this, which I imagine it might be, is to compare, we’re comparing different generations, the data that you’re looking at doesn’t feel comparative at all. There’s too much color, there’s too many weird things going on. Your eye is not immediately drawn to those distinctions, which is where it should be.
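[Editor's sketch] To make the comparison above concrete, here is a small Python sketch that computes the "almost twice as much" ratio and prints the shares as sorted bars measured from zero, so the gap is what the eye lands on. Only the two figures quoted in the session (traditionalists 13%, millennials 24%) are used; the other generations' values were not read aloud and are left out. The function names are hypothetical.

```python
def times_as_much(larger_pct, smaller_pct):
    """How many times larger one share is than another."""
    return larger_pct / smaller_pct


def text_bar_chart(shares, width=40):
    """Render shares as text bars, largest first, all measured from zero."""
    largest = max(shares.values())
    lines = []
    for label, pct in sorted(shares.items(), key=lambda item: item[1], reverse=True):
        bar = "#" * round(pct / largest * width)
        lines.append(f"{label:<16} {bar} {pct}%")
    return "\n".join(lines)


# The two restaurant-spending shares quoted in the session.
restaurant_share = {"Traditionalists": 13, "Millennials": 24}

print(f"{times_as_much(24, 13):.2f}x")
print(text_bar_chart(restaurant_share))
```

The same one-line ratio applies to the disability figures cited earlier (16% versus 9%), which is where "almost double" comes from.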
(00:32:39):
Oops, there we go. Just kind of a silly thing here, but you really never, ever, ever need a 3D visual, especially one that’s also a pie chart. 3D visuals just rarely add anything useful to the conversation, especially this one, which has no numbers on it. But even if it did, you imagine that the numbers would likely be related to the size of each slice of pie. And so it also being 3D is just confusing. I’ve never seen a 3D visual where it’s added anything beneficial to the conversation. Alright, so we’re going to talk about some other types of data issues. This is also just kind of a silly one, but the question asks, is truncating the y-axis dishonest? Yes or no? And you’ll notice almost immediately here, I think, right, that this is a pretty dramatically truncated axis. So we see that the numbers start, we see 92 is the first marker. We can assume it starts at 90. And so this is the problem, right? It looks like 50% of people think yes, 50% think no. But really the numbers are starting at 90 here. I don’t say never. I’ll say you should almost never truncate an axis. But there are exceptions to that, which we’ll look at in just a minute. I want to look at a couple different examples of truncated axes and talk about them a little bit more.
(00:33:59):
We have another visual here. We’re looking at the effectiveness of allergy medicines. I thought, it’s not education related, but this felt timely. I don’t know where y’all are located, but I’m in DC and my car is yellow with pollen. I can’t walk outside without just turning into a sneezing fit. So this is looking at the percentage of people who reported fewer symptoms with two different companies, half the joy and poll away, and what do we notice here? Year round? Yeah, the pollen is horrible this year. So what do you all notice in this visual? We have a truncated axis, right? Does this seem like a reasonable way to display this information?
Speaker 3 (00:34:48):
I’d say no, because the truncation is making a tiny difference appear way larger than it actually is.
Rachel Rush-Marlowe/ResearchEd (00:34:56):
Yeah, yeah, exactly. So a lot of people making the same comment in the chat because of the truncation. We’re talking about the difference between, what is this, 30.4% and 30.6%? Totally marginal. Probably within margin of error. Both drugs work about the same, but this is making it look like a huge difference. Alright, how about here? So we have average global temperature data from 1900 to the year 2000. So these two visuals, the one on the left and the right, are both showing the same information. Technically you could say the one on the left is truncated because it starts at 10, not zero. Or I guess maybe it’s not marked, but you can assume the zero line is down here. The one on the right hand side is truncated, starting from 55.5 to 59.5. What do we think about this truncation on the right? Yeah. So Chloe’s saying, yeah, it’s okay because of the context. Lionel, the right is much better because the differences are slight over a long period of time. Yeah. So tell me more about the context. What makes the context make this okay?
Chloe K. Li | Al Jazeera English (00:36:26):
Just because a difference in average, global temperature scientifically is much more significant than a slight percentage in symptom reporting.
Rachel Rush-Marlowe/ResearchEd (00:36:40):
Yeah, exactly right. So even though on both visuals you’re seeing that the temperature just goes from about 57 degrees to 59 degrees, two degrees in global temperature is huge, whereas the difference between 30.2 and 30.6% in symptom reporting is not big. So in this case, I’d caveat that most of the time truncation is not good, but in contexts where small differences are representative of something much larger, it can be. Okay, so we looked at a couple examples of some bad visualizations. What are some other things, either from what we looked at or things that we haven’t seen yet, that make a data visualization bad? Yell it out, pop it in the chat. Yeah.
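[Editor’s illustration, not part of the session.] The truncation effect described above can be put into numbers. The sketch below is a minimal example, assuming the allergy chart’s axis started at 30.0 (the actual starting value is not stated in the session); `drawn_ratio` is a hypothetical helper name. It compares how tall one bar is drawn relative to the other on a zero-based axis versus a truncated one.

```python
def drawn_ratio(a: float, b: float, axis_start: float = 0.0) -> float:
    """How many times taller the larger bar is drawn than the smaller bar
    when the value axis begins at axis_start instead of zero."""
    low, high = sorted((a, b))
    if axis_start >= low:
        raise ValueError("axis_start must be below both values")
    return (high - axis_start) / (low - axis_start)


# 30.4 and 30.6 are the percentages quoted in the session.
# The axis start of 30.0 is an assumption made for illustration.
honest = drawn_ratio(30.4, 30.6)
truncated = drawn_ratio(30.4, 30.6, axis_start=30.0)

print(f"axis from 0:    taller bar drawn {honest:.2f}x the shorter one")
print(f"axis from 30.0: taller bar drawn {truncated:.2f}x the shorter one")
```

Under that assumption, a difference of well under one percent in the data is drawn as a bar roughly one and a half times as tall. The same arithmetic applies to the temperature chart; the difference there is that the small change is itself the story.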
Lionel Ramos/KOSU (00:37:34):
One thing that I tried to avoid personally, can you guys hear me by the way?
Rachel Rush-Marlowe/ResearchEd (00:37:37):
Yep.
Lionel Ramos/KOSU (00:37:38):
Okay. Is decimals and things like that, if they’re not necessarily, I guess vital to the context, for example, in temperatures over a long period of time because of the significance of the small differences, I think that would be something more appropriate for that context rather than something else that can be rounded up or rounded down to keep it neat, avoiding of course, misrepresentations of it by doing so if you don’t want to round up too much, obviously or too far down because it might skew it in the way that we’re talking about.
Rachel Rush-Marlowe/ResearchEd (00:38:13):
Right? Right. Absolutely. I see other ones in the chat, misleading, unclear, hard to read. We saw some of that already, right? The donut chart was a little bit unclear, hard to read. There can be misleading data, all great responses. So when we talk about what makes a visualization bad, I would say it typically falls into three categories. Incorrect, illegible, and for lack of a better word, bullshit. Some visualizations have data that’s incredibly off or misleading, whether or not that’s intentional, right? They’re just wrong. They’re not representing what they claim to. And so all of these things below kind of fall into those three categories. So the scale might be inaccurate. We haven’t really looked at that yet. Axis truncation, we saw a lot. We saw some poor use of coloring, right? With the donuts. As Rachel mentioned, data can be overwhelming. You can have the wrong chart type.
(00:39:05):
Again, I think the donuts, excuse me, were the wrong chart type for that kind of comparison. The context and storytelling might not match the visual. And then poor data handling, right, can happen. Whether or not it’s intentional, you might click on a column and it changes something in a way that you don’t anticipate. You don’t notice it when it gets into the visual. Or it can be intentional. It can be people promoting misinformation through poor data handling, purposely trying to obscure what something’s actually showing. So we’re going to look at just a couple more examples of this and then get into making our own visual. All right. This is one of my favorites. So goofy. What’s wrong with this picture? This one’s a lot of fun. Yeah, scale, truncation. Exactly right. So women in India being around five foot are not one fourth of the height of Latvian women who are five four. I think truncation might be appropriate here. Not a ton of variation in height lower than five foot or higher than five seven. I mean, I might widen it a little bit as someone who’s towards the lower end of the scale myself, but the scale is completely off.
Speaker 6 (00:40:22):
Oops, oops, oops.
Rachel Rush-Marlowe/ResearchEd (00:40:24):
All right. This one is a little bit harder to read. I apologize. It’s blurry at the bottom. That’s because this is a screenshot that I actually took from my cell phone back in 2020. And when I was preparing for today, I somehow found someone else on the internet had also come across this one, which was exciting because it didn’t make the news at the time, nor has it since. But I found another data nerd. So this was actually some reporting that was done by the Georgia Health Department early on in the pandemic. They were trying to make an argument that COVID cases were on a steady decline from late April to early May, and to show that they provided some data from their five counties with the highest COVID rates to show the decline. But there’s something pretty off about this visual, and again, it might be a little hard to see. I know the axis is a little blurry, but let me know in the chat if you see what’s a little off here.
(00:41:28):
Yeah, exactly. Richard, the dates are totally out of order. Oh, that’s your coverage area. Very cool. So you can see that on the left hand side here, right? We’re looking at April 28th, then April 27th, 29th. Then April 30th is over here. Then April 25th, April 26th, it’s all out of order. They did this several times. Wow. So you can see that they just ordered this by height rather than date, which could be accidental, but seems perhaps unlikely. And they also changed the county order. Yeah, yeah, exactly. So the corrected version that someone I found online created now has this by the correct date. So we can see this much more clearly. And you can see that it tells a very different story. And I believe that this reporting came out right around May 8th or ninth. And so you do see the numbers are much, much lower there, probably a lag in the reporting timeline.
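[Editor’s illustration, not part of the session.] The ordering problem described above is easy to reproduce and easy to check for. The sketch below is a minimal example using made-up daily totals (these are not the Georgia figures); `is_chronological` is a hypothetical helper. It sorts the same rows by bar height and by date, then tests whether the x-axis order is actually chronological.

```python
from datetime import date

# Hypothetical daily totals for illustration only, not the Georgia data.
daily_cases = [
    (date(2020, 4, 26), 95),
    (date(2020, 4, 27), 130),
    (date(2020, 4, 28), 140),
    (date(2020, 4, 29), 105),
    (date(2020, 4, 30), 90),
]

# Ordering by height produces a tidy downward slope regardless of the dates.
by_height = sorted(daily_cases, key=lambda row: row[1], reverse=True)

# Ordering by date is what a time axis should show.
by_date = sorted(daily_cases, key=lambda row: row[0])


def is_chronological(rows):
    """True if the rows appear in ascending date order."""
    dates = [day for day, _ in rows]
    return dates == sorted(dates)


print("sorted by height, chronological?", is_chronological(by_height))
print("sorted by date, chronological?  ", is_chronological(by_date))
```

A check like `is_chronological` is a cheap safeguard to run on any chart data with a time axis before publishing.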
(00:42:39):
The other thing that I would note is that while I think this is a much better visual, if I were to put this together, I may make one additional change, which would be to someone else’s point, right? I might put the counties in the same order on each date. It really depends on what you’re trying to visualize. And so I think this is important and ties back to storytelling. So if the story here is about which county was the most impacted on each date, then this might be the way that you want to visualize it. The eye is very quickly drawn to what is the highest number on each date. And you see that that changes for me. I think it might be more interesting to know how the counties are each respectively doing over time. And if that’s what you wanted to emphasize, then you might want to put the counties on the same order on each date so that for example, if we’re looking in yellow at Fulton County, I would want Fulton to be the first bar on each date. Then it’s easier for your eyes visually to look across at each county and how they’re changing over time.
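[Editor’s illustration, not part of the session.] The suggestion above, keeping the counties in the same position on every date, amounts to fixing one ordering and applying it everywhere. The sketch below is a minimal example with hypothetical counts and placeholder county names (only Fulton is named in the session); `bars_for_date` and `series_by_county` are hypothetical helpers.

```python
# One fixed order, applied to every date. "County B" and "County C" are placeholders.
COUNTY_ORDER = ["Fulton", "County B", "County C"]

# Hypothetical counts keyed by ISO date, listed in inconsistent order on purpose.
cases = {
    "2020-04-26": {"County B": 40, "Fulton": 55, "County C": 30},
    "2020-04-27": {"County C": 45, "Fulton": 50, "County B": 35},
}


def bars_for_date(counts, order=COUNTY_ORDER):
    """Bars for one date, always in the same county order."""
    return [(county, counts.get(county, 0)) for county in order]


def series_by_county(all_cases, order=COUNTY_ORDER):
    """One time series per county, with dates in ascending order."""
    days = sorted(all_cases)  # ISO date strings sort chronologically
    return {county: [all_cases[day].get(county, 0) for day in days] for county in order}


for day in sorted(cases):
    print(day, bars_for_date(cases[day]))
print(series_by_county(cases))
```

The per-county series is also the shape a line chart would need, which is the alternative raised in the chat.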
(00:43:41):
Alright, so I’ve picked quite a few things today that maybe get us feeling strongly about a lot of these issues. And I know they’re not all education related, but some of the best examples that I could find of these things were from a number of different fields. And so I also wanted to pick things that are intentionally a little bit polarizing. So it’s not necessarily about how we feel about the issues, but how we can analyze what’s happening in the data and be good arbiters of that. A good point on the last one, a line chart would probably be easier to follow as well. Absolutely. And so when we’re looking at this line chart, what do you notice? What story is this telling? What do you see right away with some of the things we talked about in data visualization do’s and don’ts?
(00:44:29):
So we’re looking here at Planned Parenthood Federation of America. This is a chart that went quite viral a few years ago. I had it in the back of my mind when I was getting ready for this. This was shared in a committee meeting in 2015 by Jason Chaffetz, who’s a former congressman from California who was making an argument about funding for Planned Parenthood. And his kind of top line, you can see, that abortions are up and life-saving procedures are down, was the argument that he was making from 2006 to 2013. So what do you notice about the numbers here, about the axes?
Lionel Ramos/KOSU (00:45:11):
Well, for one, I’m not sure how related cancer screenings are to abortions directly. That kind of stands out to me as odd.
Rachel Rush-Marlowe/ResearchEd (00:45:24):
Yeah, definitely. Yeah. So I think, yeah,
Lionel Ramos/KOSU (00:45:26):
I just don’t understand the comparison at all, just looking at it. So I’m not sure.
Rachel Rush-Marlowe/ResearchEd (00:45:30):
Yeah, so I think the argument that he was trying to make is that Planned Parenthood funding should be primarily focused on life-saving procedures. And that here, that’s not the case. A lot of things popping up in the chat about the y-axis. So this is, even though there’s no labeling for either of the y-axes, it is kind of a dual-axis graph. So the y-axis is not the same for both lines, which is very misleading. Yeah, 935K is lower than 289, 287 is higher than 935. Also, it’s just randomly drawn lines. The data is probably not perfectly linear. Yeah, all great points. So pretty messy here. Not really showing anything that you can make sense of with where the numbers are in the lines. So if we go to our next slide here. So on the left hand side you can see the data normalized with one axis that has the lines shown with the same standard for both.
(00:46:39):
Could have probably been separated better to tell the story. Yeah, exactly. And so here we see the same numbers exactly, but just on a single-axis graph. So we see abortions now look somewhat consistent over time. We see that cancer screenings and preventive services actually have gone down. Part of the reason for that, just for some interesting context perhaps, is that from 2006 to 2013, the recommendations for women’s cancer and preventive screenings changed pretty dramatically. So while it used to be once a year, I think the recommendation now is every three to five years, depending on women’s age and health, as long as you’re getting normal readings back in your screenings. The other thing was that the previous visual mentioned life-saving procedures very specifically. And so someone went as far as to combine all of the services that Planned Parenthood provides outside of cancer screenings and compare all of the abortion and non-abortion services. And you can see here that it looks even more kind of consistent when put into that context. So this data was a little bit cherry-picked with the cancer screenings, because those did go down in fact. But when combined with all of the other services that Planned Parenthood provides, numbers for non-abortion and abortion services remain pretty constant over time, even with that dip in cancer screenings.
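[Editor’s illustration, not part of the session.] The difference between the dual-axis chart and the single-axis redraw can be shown with a few lines of arithmetic. The sketch below is a minimal example using placeholder values in thousands (these are not the figures from the chart); `scale`, `dual_axis`, and `single_axis` are hypothetical helpers. It computes how high each line’s points would be drawn, from 0 (bottom of the plot) to 1 (top), under each approach.

```python
def scale(values, axis_min, axis_max):
    """Map each value to a 0-1 height on an axis running from axis_min to axis_max."""
    span = axis_max - axis_min
    return [(v - axis_min) / span for v in values]


def dual_axis(series):
    """Each line gets its own axis, stretched to that line's own min and max.
    Assumes every line has at least two distinct values."""
    return {name: scale(vals, min(vals), max(vals)) for name, vals in series.items()}


def single_axis(series):
    """Every line shares one axis, from zero to the largest value overall."""
    top = max(max(vals) for vals in series.values())
    return {name: scale(vals, 0, top) for name, vals in series.items()}


# Placeholder start and end values in thousands, for illustration only.
series = {"cancer screenings": [2000, 900], "abortions": [290, 330]}

print("dual axis:  ", dual_axis(series))
print("single axis:", single_axis(series))
```

With separate axes the two lines cross and the smaller series ends at the top of the plot. On a shared axis the smaller series stays well below the larger one and looks nearly flat, which matches the redrawn version described above.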
(00:47:58):
So I would say a few different things. The vertical axis labeling, dual axis, the scale was wrong. And it’s what I would describe as a bullshit visual. Whether intentional or not, it’s not correctly displaying the information that it’s stating that it does. Alright, so we’re going to talk about planning visuals of our own. So when you’re doing that, the first thing that I would encourage you all to think about is what your goal is. What kind of storytelling are you trying to do? Do you want to convey a single data point or are you trying to show comparisons, distributions, trends? So these are kind of a few of the options when we think about what it is that we’re looking to demonstrate with our data.
(00:48:44):
As you do that, after you’ve kind of figured out your goal and what it is you’d want to do with your data, it’s important to think about what kind of data you’re looking at. Yeah, just looking at the chat, the problem is these bad visuals get used widely. People don’t understand how to fact check them. Yeah, a lot with climate denial. Absolutely. Tax cut charts, really interesting. Awful. So the first thing in understanding your data you want to think about is what kind of data you have. Is it continuous or categorical? So for folks who are less on the comfort end of data, nothing to panic about with this. Continuous is numbers, one through four, one through a hundred, whatever it is. Categorical is categories. So in the data that we looked at, we have disabled students and non-disabled students. So I would identify that as categorical even though we are looking at quantitative data.
(00:49:35):
So there are numbers involved, but we’re looking at categories. You also want to think about how many variables you want to visualize. In our case, we kind of have two, right? We’re thinking about disabled students and non-disabled students and their financial stability. How big is your data? Do you have three or four, two data points like we do? Do you have thousands, right? That might impact how you want to build your visual. Is there a time or geographic component? So as somebody already mentioned, kind of with time, you might want to use a line graph. With a geographic component, it can be helpful to have mapping data. Is there race/ethnicity data or gender data involved? If so, we don’t have a ton of time to get into it today, but there are amazing best practice guides out there. I think just kind of basic stuff, right? If you’re doing gender data, everything female shouldn’t be pink. But there are really extensive guides online that can help you think through this. And then today we’re talking about federal data sets, all pretty aggregated, so no privacy concerns, but that might be the case in other projects that you work on.
(00:50:38):
And so once you’ve thought about your goal and you’ve understood the data you’re working with, it’s time to pick a chart type. So again, depending on what you’ve thought about in terms of your goal and the data that you have, there might be different chart types that are a better fit than others. We aren’t going to talk a ton today about colors. Again, there are some great resources on color and design online, but just a few things to think about, right? You want visually appealing, clear, topically relevant. And you also want to keep in mind the accessibility issues. So many of you might have other team members at your papers that deal with that, but as you’re working on data visualizations, there’s also some great tools online to check for 504 compliance. And if you need help thinking about which chart type you want to use, why can’t I?
(00:51:31):
There we go. One resource I wanted to share with you all is called a chart chooser. The one I’m using today is from Highcharts. So this is a free tool to help pick the chart type you might want to use based on your data and goals. So you can pick here what data type you have, whether it’s categorical or continuous. So I’ll pick categorical for ours and then what your objective is. So do you want to make a comparison? Are you looking at composition, distribution? We’re going to look at comparison data, right? Between disabled and non-disabled students. And so here are some options they give. I take issue with the donut chart, but that’s just me. So you can see columns, tree maps, bar charts, and how they match with our objectives and data type.
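[Editor’s illustration, not part of the session.] The idea behind a chart chooser, data type plus objective in and candidate chart types out, can be sketched as a simple lookup. The table below is an assumption built only from pairings mentioned in the session (bar, column, and treemap for categorical comparisons; a line chart for trends over time). It is not the logic of the Highcharts tool, and `suggest_charts` is a hypothetical helper.

```python
# Hypothetical lookup table; entries reflect pairings mentioned in the session only.
CHART_OPTIONS = {
    ("categorical", "comparison"): ["bar", "column", "treemap"],
    ("continuous", "trend"): ["line"],
}


def suggest_charts(data_type: str, objective: str) -> list[str]:
    """Return candidate chart types, or an empty list if the pairing is not in the table."""
    key = (data_type.strip().lower(), objective.strip().lower())
    return list(CHART_OPTIONS.get(key, []))


# The session's example: categorical data (disabled vs. non-disabled students), comparison goal.
print(suggest_charts("categorical", "comparison"))
print(suggest_charts("Continuous", "Trend"))
print(suggest_charts("categorical", "distribution"))  # not covered by this small table
```

A real chooser covers many more pairings; the point is only that the choice follows from the two questions asked earlier, what kind of data you have and what you want it to show.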
(00:52:27):
So a couple of tools I want to share, and I’ll send links and things to all of this in the follow-up email that you’ll get after today, but a few different options as you’re making your own data visualizations. Canva is a wonderful resource. They have some very basic structures for visualizations. Tableau is great if you want to get your hands a little bit more in the weeds with the data. It’s still point and click. No need to code. You can download Tableau Desktop to your machine for free with an EDU email address for a year. Otherwise, you can still use Tableau for free in perpetuity. The only distinction is that without a paid account, any visuals that you want to publish get published to a public site in your name. So if you’re working with any sensitive data or anything that you’re worried about sharing ahead of press time, not a great option. Highcharts and Flourish are both really great online platform options that have free versions. Highcharts is what we just used for the chart chooser, and you can build your chart in Highcharts. The one that I’ll quickly show you all today is called RAWGraphs. We’ll work through our PowerStats data with that. And then of course Excel, right? Simple, basic. You can make lots of different types of charts in Excel.
(00:53:41):
This is never going to let me escape. There we go. So if we go to RAWGraphs, which is one of the options, you can just paste your data in, or you can upload it if it’s bigger. Since we have this really small little PowerStats clean file I made, we’re going to pop that in here. So we see our data has come through. Now that it’s been successfully imported, we can choose a chart type. So again, not a ton of data we’re working with. I want to do a comparison. So I think a bar chart might be a good option. We’ll select that here and scroll down. And then the third step is mapping. So we have to figure out how we want our data to be visualized. So I’m going to put disability status as the bar, and then we need to determine the size of the bar.
(00:54:27):
And the size of the bar will be driven by whether or not COVID-19 was a reason for failing to meet essential expenses. So we’ll put that in size and you can see that it generated this little chart for us, right? Nothing revolutionary, not beautiful or groundbreaking, but just wanted to give a really simple example of how you could bring some data through and use this free tool. And then you can play with the margins, play with the colors, the labels and everything to make this look a little fancier, but pretty quick and easy to grab the data that you want and create a visualization.
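[Editor’s illustration, not part of the session.] The mapping step described above, one categorical column for the bars and one numeric column for their size, can be sketched without any charting tool. The example below is a minimal text-only version with made-up percentages (these are not the PowerStats figures); `text_bar_chart` is a hypothetical helper. Bar lengths are proportional to the values, so the axis effectively starts at zero.

```python
def text_bar_chart(rows, width=40):
    """Render (label, value) rows as text bars whose lengths are proportional to value."""
    top = max(value for _, value in rows)
    if top <= 0:
        raise ValueError("need at least one positive value")
    label_width = max(len(label) for label, _ in rows)
    lines = []
    for label, value in rows:
        bar = "#" * round(width * value / top)
        lines.append(f"{label.ljust(label_width)} | {bar} {value}")
    return "\n".join(lines)


# Hypothetical percentages for illustration only.
rows = [("Disabled", 30.0), ("Non-disabled", 20.0)]
print(text_bar_chart(rows))
```

The two arguments mirror the two mapping choices in the walkthrough: the label is the category assigned to the bars, and the value is the measure assigned to size.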
Rachel Jones/NPF (00:55:00):
Rachel, I’m going to jump in here because first of all, I think we need to steal a little bit of time for our break to keep you here just a little bit longer if you can stay.
Speaker 6 (00:55:11):
Sure.
Rachel Jones/NPF (00:55:15):
One question that I had, and then I’m going to open it up to the journalists, is that when we’re communicating this issue to the public or trying to get them to understand what disappearing data means, would it make sense for the fellows to go back to their newsrooms and maybe take a look at the data in their communities, what’s available? Is that a story, a valid story?
Speaker 6 (00:55:51):
Absolutely.
Rachel Jones/NPF (00:55:52):
Yeah. Help us flesh that out a little bit.
Rachel Rush-Marlowe/ResearchEd (00:55:54):
Absolutely. Yeah, thank you for bringing that up. And I think that’s a great point. You can’t make any analysis or decisions about things that you can’t measure. And the federal government removing all of this, and even I think that trickling down. So the data sets that I described today are federally collected, but they’re federally collected about local communities, about local districts. And if we don’t have that information, that’s hugely impactful, particularly for our marginalized students and marginalized communities, right? Communities of color, rural students, low-income communities and students. The federal government, through the Department of Education, provides resources. I think there’s a huge misconception that the Federal Department of Education sets scripted curriculum or mandates a lot of things. That’s really not the case, but more than anything else, they are an accountability measure. For example, the Department of Education’s Office for Civil Rights, they take care of civil rights complaints coming from schools or from parents.
(00:57:00):
If a student with disabilities is not receiving the services to which they’re entitled, the Department of Education will send investigators to look into that. If we don’t know how many civil rights cases are being opened against different districts, how do we know which of our students are at risk? I think this is absolutely a story, and one of the resources that I have on the page here, Open Campus, they just created, again, it’s higher ed specific, not K-12, but it is focused, really, Open Campus always is, on local communities, and they have a new higher ed under pressure resource center for journalists in particular working on these topics. So they have issue primers, source recommendations. They can help you interpret how national moves might impact your local data, your local communities, and they also have a Slack channel, virtual briefing calls, lots of different things. So I think that this is absolutely a story, Rachel, and this disappearing data is going to be really detrimental to our communities and to our understanding of even how detrimental it is, because we can’t collect information on how poorly people are being served or how well people are being served, or what resources students and families have access to.
Rachel Jones/NPF (00:58:07):
If we have a couple of pressing questions, can I see some zoom hands if somebody, or put it in the chat. Let’s see. Here’s another sort of a devil’s advocate question that I want to put to you while we wait to see if the journalists have, one of the story links that I sent to the journalist to prepare them for this conversation was President Trump using the test scores in Baltimore or achievement record in Baltimore and saying, we’ve been spending all this money collecting all this data, and in many schools around the country, the scores are still low and the performance is still bad. So what would you advise a journalist who is putting together a story like this, how to anticipate that or how to communicate that to the public?
Rachel Rush-Marlowe/ResearchEd (00:59:10):
Yeah, yeah, absolutely. I think test scores, despite my work in higher ed primarily, you’ve touched on something I care very deeply about. In my early career, I worked much more in K-12, and I think that standardized test scores and the use of standardized test scores to make decisions about district funding is really detrimental to our communities. The usage of this standardized test score data and underperforming schools in this kind of narrative that those schools are doing a disservice to students or should be punished, I think really takes information out of context. So I think a lot of the conversation that we had today is about context and storytelling and how do you put the data together with a narrative. And I think that my response to that, and how I hope others would respond in investigating these topics, is to look at who are the students being served in that community, what resources does that school have. To make a comparison, in the higher education space, we work with a lot of community colleges that have graduation rates that are in the twenties, 20%.
(01:00:16):
We work with a college that brought their graduation rate from 10% to 20% in the last year, and they won a national award. And that’s because I think when we’re talking about our students and their needs, we need to put it into context. We need to think about who we’re serving, what their needs are. Community colleges traditionally serve students that couldn’t afford to or maybe didn’t have the academics to make it somewhere else. And so the other option might’ve been that they didn’t go to college. And so when you put it into that context, you’re talking about students. And I think the same is true in local communities. And at the K 12 level, you’re talking about communities that have huge issues with food scarcity, talking about communities that maybe are living in places where their students don’t feel safe walking to school or walking home from school.
(01:01:01):
You’re talking about places where parents are working 1, 2, 5 jobs to make ends meet. And when you’re talking about places like that, schools need more support, not less. We know that teacher quality is the number one in-school factor that changes a student’s trajectory, right? It is the most important in-school factor impacting student success. In-school factors are nowhere near, in comparison, the importance that out-of-school factors play in a student’s success, the neighborhood you live in, your access to food, your access to books at home. And so education can be the great equalizer, but it’s not the only thing. And schools are expected, now more than ever, I think that we’ve finally begun to acknowledge that not all students come from the same households, schools are being expected to fill that gap, and when they can’t, we see that they’re penalized.
(01:01:54):
And the way that our education systems are structured in part from policies that originated all the way back to the Bush administration and No Child Left Behind, test scores are what are being used to determine a school’s value and a school’s success. And I think that’s taking data really out of context and not telling the full story of are there bad teachers? Are there bad schools? Sure, of course. But I don’t think that that’s the story that we should be telling here. And I don’t think that’s the true story. In most places, most teachers are working really, really hard for their communities and their students. Most schools are trying really, really hard to make sure that students are successful, but there’s only so much that they can do, and kids are coming to school hungry and many kids across the country are coming to school hungry.
(01:02:37):
And that’s not reflected in test scores, except in that they might not be performing as highly. Right? I think that we also have to discuss and acknowledge the racist underpinnings of these standardized tests and of the way that they were created and of the fact that they don’t acknowledge the variety of backgrounds that students come from. The variety of ways in which students learn. And I think that more broadly, and this is more of maybe my own personal soapbox, but I think standardized tests overall are a really poor way to assess what students are learning. Students are being asked a number of standardized questions that have no relationship to the joy and inquisition and creativity that is happening in our schools.
Rachel Jones/NPF (01:03:23):
Last call from the Fellows, are there any of you who have questions or should I just take it away here? Adam, I
Lionel Ramos/KOSU (01:03:32):
Had a quick one. Hopefully one thing that I, oh, Lionel Ramos with KOSU. I cover state government for a public radio station in Oklahoma City. One thing that I’ve found reporting on education, we have an education reporter at the station that I work a lot with, especially with our political context here, is that Oklahoma’s structure for how education is funded and what money they get and what they do with it is kind of its own. And other states have their own systems, and that makes for really, really messy data and really messy comparisons I’ve found. And I wonder just how you navigate that complexity when you’re trying to compare one state to the other. And the metrics they have are standardized test scores, and the metrics they don’t have are the things that pertain to poverty and race and such and such, or they conflate, one thing that happens often in Oklahoma is they conflate white and Hispanic and they don’t offer the distinction of the Hispanics by themselves. And so there’s an arbitrary finding there now. So just kind of wondering how you navigate that stuff.
Rachel Rush-Marlowe/ResearchEd (01:04:41):
Yeah, that’s a great question. I think that’s why journalism and storytelling and how you craft these narratives is so important, right? Sometimes the data is insufficient, sometimes the data itself at its source is misleading. And so I think providing that additional context, that qualitative data or pulling in information about demographics and about state policy to say that Oklahoma might be in a story where it’s being compared to New York, but to bring in that context to say, look, these two states, these two districts, these two systems are not funded the same. And here’s how things look different. Another way that we sometimes do this in some of our research and reporting, and again, primarily in higher education, but we just wrapped up a big report looking at some state finance laws in California that impact the community college system there. So California collects their data a certain way.
(01:05:35):
California is funded a certain way, but we picked a couple of states across the country that have more similar funding metrics. So we looked at California, Washington State, and North Carolina I believe, and pulled some federal data sets. So the data there is a little bit standardized, even if it doesn’t capture different state policies. And so we added that qualitative component talking about the ways that California is different from other states, but also pulled in data from what’s called IPEDS, the Integrated Postsecondary Education Data System, to look at financial metrics that are calculated at all three states, show how those work, and then add that kind of contextual additional information. But I think it’s a hard question, doesn’t have any concrete answer, but I think to the extent possible, pulling in those standardized numbers where they exist and providing that context as to why, what aggregations you’re not seeing, right? What’s in the data and what’s missing from the data. And calling that out really vocally to say, this dataset combined white and Hispanic students, despite the fact that we know, insert some information you have about why you think they should be separated out.
Lionel Ramos/KOSU (01:06:41):
Okay. That is really helpful. Thank you. And it is comforting because essentially what I’ve defaulted to is learning each system individually and then being like, okay, now I’m looking at this and what’s different and what isn’t, and how do these match, what metrics do they have that are either similar enough or representing the same thing that may be called something different? And being able to characterize it, I guess, with my own language in that way, and having those to stand on.
Rachel Jones/NPF (01:07:08):
Absolutely.
Lionel Ramos/KOSU (01:07:09):
Thank you,
Rachel Jones/NPF (01:07:10):
Rachel. We are clearly going to have to invite you back to talk with the journalists, and I know that they’ll want to keep in touch with you. We have your email address and whatever, but I just want to take this opportunity to say you have broadened my perspective on how to cover this issue and how to communicate it to readers and listeners and viewers. So Rachel Rush-Marlowe of ResearchEd, thank you so much for joining Widening the Pipeline today.
Rachel Rush-Marlowe/ResearchEd (01:07:43):
Thank you all so much for having me, and I hope you all stay in touch and I will be sharing the slides and also looking into if PowerStats is officially dead.
Rachel Jones/NPF (01:07:52):
Yes, please do let us know about that. That is chilling. But
Rachel Rush-Marlowe/ResearchEd (01:07:58):
Thank you all so much.
Rachel Jones/NPF (01:08:00):
Take care. I’ll be in touch.
Rachel Rush-Marlowe/ResearchEd (01:08:02):
Alright, bye everyone. Thank you.
###
