Davar Ardalan and Lucretia Williams Transcript — Sept. 8, 2025
Rachel Jones, National Press Foundation (00:00:00):
Session three of the September 2025 Widening the Pipeline virtual training. We're joined by two women who will outline a vision of using artificial intelligence to provide voice, clarity, and relevance for communities as a way to rebuild long-abandoned trust in media. First, Davar Ardalan is a technologist and storyteller whose career spans NPR News, National Geographic, and the White House Presidential Innovation Fellowship program. She focuses on embedding cultural context into emerging technologies and ensuring that community stories, languages, and values will be a part of our digital future. As a lead author of AI for Community, she draws on her background in journalism and innovation to explore how AI can preserve heritage and nurture human connection. Davar is joined by Lucretia Williams, a senior research scientist at Howard University. Her work centers on the relationship between artificial intelligence and Black communities. In AI for Community, she explores how technology can more accurately reflect cultural identity through community involvement, inclusive language, datasets, and ethical digital preservation. Her work includes projects that document African-American Vernacular English and efforts to digitize cultural archives at Howard University. Davar and Lucretia, thank you both so much for joining us, and Davar, I'm going to let you take it away.
Davar Ardalan (00:01:47):
Thank you so much, Rachel. It’s a great honor to be joining you. And thank you so much for the National Press Foundation and the opportunity to be joining you to speak to your fellows. It’s a true honor. So I will go ahead and share my screen and we can get started. OK, good. So can we all see the screen?
Rachel Jones, National Press Foundation (00:02:19):
Yes. Great.
Davar Ardalan (00:02:19):
Thank you so much. Great. So super briefly, Rachel gave the introduction. My name is Davar Ardalan, and I focus on ensuring that the future of artificial intelligence also includes and preserves our heritage. We're going to be talking about that in the context of some of the work that we've done and the work that we've presented in AI for Community, which came out in June, published by Taylor and Francis. You can look that up on Amazon, and I'll turn it over to Lucretia.
Lucretia Williams, Howard University (00:03:05):
Yes, so glad to be here. Nice to meet everyone. Well, great introduction by Rachel, but I am Dr. Lucretia Williams, a senior research scientist at Howard University. And that's all I do: research, day in and day out. I mainly focus on human-computer interaction and human-centered AI. I am also a community-based researcher, so I do community work designing and evaluating technology nationally and also internationally. So I'm looking forward to sharing a few of the projects that we also discussed in the book and what we have going on at Howard as well.
Davar Ardalan (00:03:44):
Amazing. Yeah. So what we're going to be doing is talking about how AI can preserve memory, introducing the concept of AI for community, exploring two case studies, then talking about ethics and trustworthy AI, and ending with ideas about how you could create purposeful custom GPTs. Obviously, we're going to leave 15 minutes at the end for questions, because it's not just questions: we would love to hear your ideas, ways that you are using these tools in ways that we might not even know. So it's just the beginning of a conversation, and hopefully in the coming weeks and years we can learn from all of you in terms of what you're going to be doing out in the field. To begin with, we're going to take a look at AI for community. What is the intent? What problem are we solving? Why does it matter to community to have this perspective? How do we preserve voices, lived experiences, and memory? And how do we ensure that AI reflects the lived realities, the languages, and the contexts of different communities? We're going to begin with the intention, what problem we're solving. I know that this is a heavy slide, but I thought it would just be helpful to have this on the screen. I'm not going to read it, but just to be able to discuss it, and then Lucretia and I are going to go back and forth on this screen. So, Lucretia, why don't you talk about what problem you were trying to solve with your case study, even though we're going to go more in depth on it later.
Lucretia Williams, Howard University (00:05:33):
Yeah. So the main problem we tried to solve with Elevate Black Voices is to have automatic speech recognition be able to understand African-American English. Currently, our voice technology sometimes won't understand certain speakers who don't have standard American English speech patterns, and the data proves it. So we just wanted to figure out: how do we develop an inclusive dataset that these LLMs can be trained on to improve automatic speech recognition technology?
Davar Ardalan (00:06:06):
Brilliant.
Lucretia Williams, Howard University (00:06:07):
And I can go to the relevance as well, because the problem always ties hand in hand with relevance. So why does this matter to the community? Black communities have been historically excluded ever since the inception of this country, and that trickles down into the technology that we now own and use every day. So we wanted to make sure that people who spoke African-American English were included within the devices that they use every day.
Davar Ardalan (00:06:40):
Yes. And so what is the approach? What is the human-centered design aspect? How do we preserve voices, lived experiences, and memory? An amazing example that I want you guys to research is the country of India. They are now creating small purpose-built datasets, which essentially means going around the country and recording the voices of farmers, recording the voices of people in rural areas, because India is very big on digital tools. But what's the point of building an AI tool for farmers to get the latest weather report if the AI doesn't recognize their voice or understand what they're saying? So they are the number one country in the world doing what Lucretia just explained that they did with Google. They are doing it at a large scale because they understand that they have to invest in this right now. India has a lot of different ethnic languages and dialects.
(00:07:50):
And so it's the perfect example for you to look up to see how it's human-centered, because it has people at the center, not just the model. And then, in terms of how we ensure AI reflects lived realities, another incredible example is Nvidia, which over the last three years has been working with the Māori community in New Zealand to protect their language and show how AI can help revitalize cultural heritage. The idea here is that AI should adapt to communities, not flatten them. So why does memory matter in journalism? As Rachel mentioned, I was at NPR News for many years, 22 years actually, and Rachel and I go back to working together at NPR and later at National Geographic, because community memory and action is important. As you build the future and start using these AI tools, you can look to see what kind of datasets are out there so that you can build custom AI models that can inform your reporting much more proactively.
(00:09:06):
We'll get to that a little bit later, but for example, and I don't want to get too much into Lucretia's example, but Howard University owns the dataset that they created, and currently it's available for HBCUs to use as they wish. So imagine if you're a reporter and you want to build a dataset around African-American vernacular speech because it's going to really inform some of the reporting that you do. We'll get to that in a second, but it can also amplify forgotten or overlooked narratives. This is really important, because when you look at large language models, for the most part the datasets behind them are Western. I come from an Iranian American community, and the example I'm going to show you is a custom GPT I've created called My Name Is Iran, which allows me to access nine different open source datasets, going back to the stones and inscriptions of Persepolis.
(00:10:11):
These are all datasets that are open, and I am able to just find remarkable resources to help my children or my community learn what the message of the ancestors was back then and what we learn from them. And then editorial responsibility: you have to keep your stories accurate. In the next five years, large language models and AI tools are going to play a big role in how you do your work. So you have to make sure that you're looking for datasets other than the ones that the large language model has been trained on, because it's your responsibility to do the extra work so that your work encompasses more cultural context. And Lucretia, did you want to round this out?
Lucretia Williams, Howard University (00:11:02):
No, you said it perfectly.
Davar Ardalan (00:11:05):
OK, so again, the extra responsibility for journalists: AI for community matters because communities can archive their stories and their heritage in new ways using AI. For journalists, remember, these are very, very early days, but AI inputs, as I mentioned, are incomplete. They don't capture the rich, full story of communities, of farmers, of faith-based groups. These are often voices that are overlooked. So you have to be careful that you're grounding your reporting in the context of history and lived realities. But you guys, this is also an incredible opportunity. When I got into this space in 2018, one of the very well-known journalists, I think from the Guardian, said something like: a machine will win the Pulitzer Prize one day, because a machine is going to find some story inside of a dataset that nobody could have ever imagined, and you're going to be the one who's going to find that. So it's a very exciting time.
(00:12:20):
And also, AI can adapt to community voices, languages, and traditions. We're going to get to learn this from Lucretia in a minute, because cultural AI grounds us in archives, oral histories, and diverse datasets of communities that exist currently. But it is also your responsibility as journalists to find these. You should have some kind of Excel file where you add datasets. A lot of them are out there, but why shouldn't you have the open datasets that exist in your toolkit? For example, if India ever made their datasets open, you should be able to access those, because in the future it's going to give you some really interesting insights. And the reason why cultural AI and contextual AI go hand in hand is because they interpret meaning but bring social and historical context into play, and together this becomes a very nuanced role that you will have as a journalist, because you don't just take the large language model for what it is; you have to be able to add to it with datasets to be really powerful. So I'm going to now throw it to Lucretia.
Lucretia Williams, Howard University (00:13:53):
Yes. So I am going to talk about two projects here at Howard. First project: Elevate Black Voices. That was my team's effort of building a large-scale dataset of African-American English to improve automatic speech recognition. So I'm just going to lay out the process of what we did. This project started around fall 2023, and we knew we had a goal of collecting 600 hours of African-American English from across the United States. We wanted to use a community-based approach, where we hosted about eight community activations across various cities in the US: New York, DC, Atlanta, New Orleans, Northern California, Southern California, Chicago. And we really started out with just community roundtables and community discussions to talk about AI and technology. This was when generative AI was a buzzword, and there was also a lot of fearmongering on social media and the news outlets, especially with our older community: what does this generative AI mean, or are people going to call me?
(00:15:04):
So we wanted to address those thorny issues and those challenges that people did have in the community. And we also just wanted to have the community members consider: how do you feel Black voices will be able to contribute to research or the next generation of AI innovation? In particular, we had an older group in New York City, an older church group, and they were kind of excited. So often with community-based research like this, when it comes to marginalized communities, people think that they won't participate because of historical harms. There is hesitancy, but it's not that they won't participate; you have to make sure that you have someone in the room who relates to them to bring this opportunity up and also discuss the real challenges. As a researcher, oftentimes higher education and institutions go to communities and extract, and then the communities never hear from them again.
(00:16:09):
But with me being from this community, and with Howard University really taking research like this seriously, we wanted to make sure we set the context and the groundwork with the communities first before we introduced our study of why we wanted to collect their voices. So at the end of each community activation, we just laid out the project: who owns our data? Where would it be going? Where would it be stored? I'm actually on campus, where we have the server where all of the data is stored. So Howard University does own the data, and I see it and I work with it. With this project, we really hit our goal of collecting 600 hours of African-American English. The study consisted of three weeks, with 10 survey questions a day: we would automatically send participants 10 questions a day through email, and they could answer random lighthearted questions like, oh, if you won the lottery, what would you do with your money?
(00:17:12):
Oh, what dishes are you going to bring to Thanksgiving or the family cookout? What are some things your mom said to you growing up? So things that were very lighthearted, that wouldn't evoke any emotional triggers or trauma. And the main reason why we wanted to do this is because there are research studies out there that show the error rates for African-American English when people use voice-assisted technology such as Siri, Alexa, or Google Home, and we want to also make it known: you shouldn't have to code-switch to even use your voice assistant devices. And you also have to be realistic and know that not everyone has the ability to code-switch, whether you have different speech patterns, or you're an older adult, or you come from the rural South and that speech is ingrained in you and you don't know what a code switch looks like.
(00:18:06):
Everyone has that right to be able to talk naturally, in their natural speech pattern, in order to use their assisted devices. And also, just to go back to the older community, sometimes people use these devices for caregiving and for help in the home as well. So if there is an emergency where you need to call for help or call a loved one and your voice assistant technology stops working, that's a life-or-death matter right there. But I also want to bring it back to challenges that journalists might have when it comes to automatic speech recognition technologies. This also came up a lot when I was doing interviews for this project, and there were different press journalists who were interviewing me virtually or over the phone. Transcription services like Zoom and recordings sometimes don't pick up accurately on speech patterns. And I know I have an accent, I speak fast, I have different speech patterns.
(00:19:11):
I'm from New York City, so my vowels is all different in certain words. And even in something that came out in one article, the -ed was left off, but the transcription won't catch it in the recording. And that's AI; that's automatic speech recognition technology. So as a journalist, when you are fact-checking and you're going through your transcripts, you may want to go through with a fine-tooth comb to see, OK, where were the errors in this? And if you're not familiar with African-American English, it may not come naturally to you. So this is why this project is important, and also why you should have this in your toolkit and know this is something that we are going to have to keep at the forefront of mind. And then lastly, I want to move on to preserving Black culture through archives. So Howard University has the Moorland-Spingarn Research Center, and it is one of the largest Black archives globally, outside of the Schomburg Center in Harlem, New York City, where I'm from.
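The transcription errors described above, like a dropped "-ed" suffix, are exactly what the standard ASR metric, word error rate (WER), measures. A minimal sketch in pure Python (the example sentences are hypothetical, not from the study):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length,
    computed with a word-level edit-distance (Levenshtein) table."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # d[i][j] = edit distance between first i reference words and first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
        # each row extends the table one reference word at a time

    return d[len(ref)][len(hyp)] / len(ref)

# Two dropped "-ed" suffixes, as in the anecdote above, count as two errors:
ref = "she walked to the store and talked to her neighbor"
hyp = "she walk to the store and talk to her neighbor"
print(f"WER: {word_error_rate(ref, hyp):.2f}")  # 2 errors / 10 words = 0.20
```

Research comparing ASR error rates across dialects typically reports exactly this number, so a journalist can run the same check on a Zoom transcript against a hand-corrected passage to see how badly the tool is failing a given speaker.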
(00:20:16):
So Howard, here on this campus, actually in this little tower here, has rows and rows and rows of Black press archives dating from the early 1900s. I've seen some of them, but they're on film, black and white. It's not digitized yet. So a lot of VCs and a lot of companies are coming to Howard to try to pay to have that digitized. But similar to the Elevate Black Voices project, we negotiated to own that data. Now, what is challenging is that Howard needs to make sure we still own the digitized versions and all AI products that come out of the Black archives, because that's generational wealth for the university. And we are an HBCU and we are private. But this also extends to understanding the media's role in shaping the public perception and accuracy of African-American culture in the current times we have here in America.
(00:21:18):
Black history is not being taught in certain states, and a lot of the students that come to Howard are from these states. So this is a good way to have an educational tool, built from these datasets, for actually learning Black history that has never been seen before in those archives. Some of those things that I saw, I was like, wow, this is interesting. I wish my grandma was here to see this, because she would remember some of these things. But a thorny issue that comes up when preserving Black culture specifically is the argument, or the question, of who should have access to this historical information and who it should be shared with. A lot of people want to gatekeep this information out of fear of it being weaponized or used for further historical harm. So as journalists, as you all are doing your work, just keep in mind: how am I telling this story?
(00:22:16):
How am I being responsible? With whatever dataset I'm seeing or getting this information from, am I giving credit to it in the best, most appropriate, or most accurate way possible? So those are the two main projects we have here at Howard. We're still working on them. We hope to have a 2.0 version of Elevate Black Voices with other Black dialects from the Gullah Geechee, the Caribbean, and certain African countries. And we are still trying to figure out how we are going to find funds to further digitize the Black archives. So thank you.
Davar Ardalan (00:22:57):
Yeah, absolutely. Extraordinary. And maybe Lucretia, we have just a couple minutes before we move on. Maybe you could also share a little bit about how you worked with the communities in the sense that the folks who recorded audio were also paid. And we don’t have to go too much into this, but I think it’s really important for the journalists to know that your framework is very pioneering.
Lucretia Williams, Howard University (00:23:32):
So for research studies, we have to compensate people, and it's best to pay people fairly. With this specific one, we wanted to make sure we paid people the maximum amount we could. Each participant who finished the three weeks of voice recording got $599, which is the highest amount before it can be taxed. We also wanted to let them know that we value their time and their contribution, because if this is going to live on and HBCUs are going to continue to do research and elevate all of the voices, we want to make sure that participants are compensated fairly. So I don't know if you all have any inspiration to build datasets or work with technology or work with the community, but compensation is a very important thing that people underestimate, because people will participate when they feel that their time is valued. And also, when it comes to datasets, we are trying to think about the funding structure when it comes to the community owning something. I know Howard owns this specifically, but we want to make sure that the funding structure is a big part of research components when it comes to datasets or even just technology research.
Davar Ardalan (00:25:04):
Great. And in the book, which I hope you guys will get, it's more detailed, because Lucretia was also very honest about some of the challenges they had; even at the university level, sometimes it's hard to get those checks to the right person, and so there were delays and things like this. But that's why it's important: what do we learn from this incredible pioneering approach, and how can future communities do this and learn lessons from what Howard did? Google is able to improve their technology. They don't own the data, but they're able to improve the technology. OK, so we're going to move on to the next case study, which is My Name Is Iran Interactive. So in 2004, when I was at NPR News, I did a series called My Name Is Iran, and it was basically about my own family. My great-grandfather was the Minister of Justice in Iran in 1927, and he completely revamped and modernized the Iranian justice system.
(00:26:13):
So in 2003, the Nobel Peace Prize went to Shirin Ebadi, an Iranian human rights lawyer. And somebody said, oh, somebody needs to write a book, from Ebadi to Davar, who was my great-grandfather: Iran's 80-year struggle for a lawful society, looking at how, through the decades, Iranians have always tried to have a lawful society, but there have obviously been different political reasons and revolutions that have prevented them from doing that. So I did this NPR series, it became a book, and I wanted to do the next generation of that. I was like, OK, now that it's 2025, what's the next generation? The next generation is this: I researched all of the open datasets that are available on Persian history, and there are amazing ones. The Persepolis tablets, which are the stone tablets written in cuneiform 2,500 years ago that carry incredible messages about human rights, are now an open dataset.
(00:27:24):
There's an Iranian American oral history project at Harvard that has interviews with over 170 leaders, nation builders in Iran before the revolution. Their voices have been recorded in interview form, but they're sitting there in audio format. They're digitized, but they haven't been made AI-ready. And then there are a lot of datasets on Persian poetry. The reason why I wanted to do this is because I can bring all of these datasets together and create a more reliable AI resource than the general AI. Because if you ask ChatGPT about Persepolis, it'll tell you everything about Persepolis. But when you have a narrower, custom AI that is trained on and has access to more open datasets, it's going to be more reliable and it's going to give you deeper stories. The other thing is, think about this: a lot of books go out of print, a lot of archives disappear.
(00:28:29):
I don't need to tell you this, and I know we're being recorded. How many stories have you heard about datasets that are disappearing? Raise your hand. Datasets are disappearing. These are datasets that are invaluable, and they are being deleted. So I thought to myself, my God, these Persepolis tablets, they'll be deleted. Somebody will say, this university doesn't have funding to keep this dataset up; we're just going to delete it. GitHub moves on. They're like, oh, we don't need Iranian American oral history at Harvard; let's take down these 170 oral interviews. So think about that. Some groups of people in the last 20 years have gone through so much effort to create these open datasets, but they could disappear; a link could break. So I wanted to make sure that I have access to these datasets. This is heritage technology, this is my heritage, and now I feel that I'm preserving and elevating the voices in new ways.
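The worry above, that a link could break and a dataset could quietly vanish, can be turned into a routine check. A minimal sketch, assuming you keep the dataset catalog mentioned earlier as a simple list of URLs (the URLs below are placeholders, not real dataset links):

```python
import urllib.error
import urllib.request

def check_links(urls, timeout=10):
    """Report which cataloged dataset URLs still respond, so link rot
    is caught early enough to mirror the data before it disappears."""
    results = {}
    for url in urls:
        req = urllib.request.Request(
            url, method="HEAD", headers={"User-Agent": "dataset-link-check"}
        )
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                results[url] = resp.status  # 200 means still reachable
        except urllib.error.HTTPError as e:
            results[url] = e.code           # e.g. 404: the link has rotted
        except (urllib.error.URLError, TimeoutError):
            results[url] = None             # host gone or unreachable
    return results

# Hypothetical catalog; replace with the open datasets in your own toolkit.
catalog = [
    "https://example.org/persepolis-tablets.zip",
    "https://example.org/oral-history-index.json",
]
for url, status in check_links(catalog).items():
    print(("OK  " if status == 200 else "LOST") + f" {status} {url}")
```

Run monthly against your Excel-file-turned-catalog; anything that stops answering is your cue to download a local copy while one still exists.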
(00:29:41):
So this screen looks like a GPT. It is; it's a custom GPT. Did you ever wonder what Darius the Great etched in stone, or what Hafez whispered in verse? This GPT reads Old Persian cuneiform and Persian poetry. Thanks to open data, from royal inscriptions to centuries of verse, memory is able to speak. OK, so the questions it answers: What did the stones of Persepolis say? What did Qajar women write in their diaries? This is also an open dataset of Qajar women, women from 200 years ago, and what they were writing in their diaries. And then even: how did ancient Persian engineers move water through streams? I mean, there was an engineering feat called the qanat system, and this was from 2,000 years ago. There's a lot that we can learn now around sustainability, when we build our cities, if we go back to these datasets.
(00:30:47):
So this screen just sort of shows: OK, what did Qajar women write in their diaries? And it gives you an answer. Again, I'm not saying that the regular GPT won't be able to do this, but I'm suggesting that it is also our responsibility to curate this data and create these very comprehensive portals, so that if these datasets are deleted, or if we have scholars or children who want to dig further, they actually can, and the information comes directly from data. So we're at 2:33, and we're going to spend the next 10 minutes or so, before we go to your questions, on ethics and trustworthy AI. Then we're going to briefly talk about the tools that you can make. And Lucretia, feel free to join me on these slides. So what are some key principles? I think that Lucretia has talked about the idea of safeguarding your sources and making sure that privacy and protection are there.
(00:32:03):
So she talked about how, as you access these datasets, or if in the future you want to make your own, make sure that it's human-centered and at the heart of them is the community that you're going to be reporting on. Make sure you fact-check everything, because hallucination is a thing, even in the general ChatGPT. If you ask it questions about Persian history, you don't even know if all of it is right. I should have brought that up; that's another reason I did My Name Is Iran, because what you ask my custom GPT is going to be right, because it's going to the source. Fairness and accurate representation: large language models do not have accurate representation. But let me just say something very clearly. AI will be biased until it's not. AI will always be biased, because the second it knows, I'm just making this up, African-American vernacular, it doesn't know a Persian regional accent. So you're never going to win the battle to make AI completely unbiased. It's always going to be biased. Your responsibility is to make sure, when you're using it for your reporting, that you know which community you're reporting on and that you check for accurate representation. Lucretia, did you want to maybe add on that?
Lucretia Williams, Howard University (00:33:28):
Yeah, I would say I would refrain from using AI as a search tool too much, because we love AI, but it's not there yet. So if you're using it for search, to get information about whatever story, I would say not to use it that way. But if there is something specifically curated, like Davar is saying with her custom GPT, if there are curated GPTs and models, those are more beneficial than the larger general models like Claude, Gemini, and ChatGPT, because those are super general, and the way generative AI technically works, it's off of zeros and ones and numbers; it's just predicting the next output. So the more custom the model, the smaller the GPT, the more accurate it is, because it's able to handle the things that you created it to do. So I would say refrain from using it as a search tool if it's not custom.
Davar Ardalan (00:34:45):
Yeah, exactly. And then also transparency and ownership. If you did do a search on AI, which you shouldn't, but if you did, and you ask it to give you all the sources, you have to go back and check the original sources, because it also hallucinates and makes up a lot of bullshit. It'll reference a report, and you'll be like, oh my God, there's a report on that. But if you don't click the link and see that it is completely bogus, that it doesn't exist, you will think that you're actually referring to and referencing a source, and you're not. And then sustainability. This is important because you are going to use AI tools in the future more and more. Remember how much energy and resources AI takes. There are many stories about water resources and the way energy use is going up, because AI systems use so much energy.
(00:35:41):
I read somewhere, and I don't want you to quote me, that a 20-minute conversation with ChatGPT is like two cups of drinking water. And then just imagine how many people are having much more than 20-minute conversations with ChatGPT; there's just a lot of energy there. And then cultural responsiveness. This isn't something anybody teaches you in journalism school or even in AI school. They will never tell you to be culturally responsive when you are thinking about AI and how you use it. But Lucretia and I would like you to please be responsive. What does that mean? That means that you have to make sure you understand the context and that you're not flattening a community just by using an AI. Rachel, are there questions in the chat? I can't see those while I'm talking. I think you're on mute, maybe.
Rachel Jones, National Press Foundation (00:36:47):
No, I don’t see anything right now, so you can continue.
Davar Ardalan (00:36:52):
OK, great. So here are some useful ideas for how you could create your own GPT. Let's say that your beat is education or housing or healthcare. Let's just say you're covering education in the state of Maryland. There are so many datasets and PDFs and reports on education that come out regularly. So you could think about: are you covering K through 12? Are you covering higher education? What are you covering? Then you're going to look for the state of Maryland's datasets on education. You download the PDFs, and you now suddenly have, let's say, and by the way, 10 to 15 PDF docs is the sweet spot. Don't try to do anything more, because even with GPT, it has a problem with too many data points; it'll hallucinate. So I try to stick to about 10, maximum 15, PDFs that are maybe 20 pages each, not more.
(00:38:15):
Next, again, you're thinking about your beat: education, housing, health. You're going to the state of Maryland, trying to see what kind of reports exist, and then you're downloading them, and you then have to do your own search. You want to make sure you do some research on news articles that have come out, so that as you put these PDFs into your own custom GPT, you're able to look at reporting that's already out there around your topic. So let me explain how I used this in the last few months. My husband's a civil engineer, and we spent some time in Florida, where he was working a lot with the Fort Myers municipality on civil engineering stormwater stuff. Well, you can imagine, because of hurricanes, Florida has unbelievable datasets that are publicly available around hurricane and stormwater relief and things like that.
(00:39:25):
So I downloaded any data that I could get from that particular county, and he was able to, as part of his proposal for Fort Myers, give absolutely the latest information on stormwater prevention, because I had downloaded this data. You have to use your imagination. What is your beat? Search: what kind of open data sets are out there? And then just have fun with it. You don’t have to write a report or a story right away. Spend two or three weeks just talking to this data, because what did I say earlier? You can talk to memory, and you can also talk to data. You can ask it for information. So you have a 10-page PDF on the state of Maryland and how kindergarten students are doing; ask it a question: what is the most surprising statistic in your data? The AI is going to answer you.
(00:40:25):
It’s going to say to me: the most surprising is that in the month of November, there were more six-year-olds who dropped out of kindergarten than the year before. I don’t know, but it’s going to tell you something like that. And you’re going to suddenly be like, oh my God, I have a scoop. Editor, I want to do a story about how many six-year-olds have dropped out. Now you have to go and get your own additional information, and you have to do interviews. So you want to make sure that the machine isn’t giving you wrong information. But my point is that this is how you can use it as a resource. So, oh, I wanted to go forward.
(00:41:14):
So again: state education reports, state budget and education reports, testing results, enrollment data. What’s the graduation rate in my district compared to last year? A quick comparison of district spending and outcomes, and always fact check. So, key takeaways from what Lucretia and I have talked about: augmentation, trust, authenticity, and alignment. Make sure that AI is enhancing your reporting and that you’re not replacing your own journalistic judgment that you’ve been trained on. Make sure that you understand the AI that you’re using, make sure that you are transparent about using it, make sure that you’re fair in terms of how you’re approaching it, and make sure that you have an eco-friendly approach to how much AI you use, because there’s a lot of energy involved. Authenticity: think about what community you’re writing about and make sure that you’re culturally responsive, that you’re not just taking what AI is giving you, or any research you do on Google, which doesn’t necessarily have the context of the community. And then alignment: all of you, I can guarantee you, I think 15 people are on this call, every single one of you in the next five to 10 years will build an AI tool that you’ll use in your trade, because that’s the direction. You’re either going to buy it or you’re going to build it. So make sure that you align and build purposeful tools that are going to help you journalistically, but also make sure that the communities that you are writing about are represented accurately.
(00:43:11):
Yeah. So Rachel, I’m going to stop sharing my screen. I have a couple questions here, which is like what opportunities do you see for AI in your reporting or maybe you already are using ’em, and then what risks or concerns stand out to you? Lucretia, why don’t you also give a final thought before I turn it over to Rachel.
Lucretia Williams, Howard University (00:43:33):
I guess my final thought is how do you see AI impacting your work currently and in the future?
Rachel Jones, National Press Foundation (00:43:44):
I have a question based on our recent webinar, but before I get to that, I want to talk to Chloe. Chloe Lee, one of our fellows. She said that she has been asked to use generative AI and she knows that some of her other team members do, but she hasn’t used it yet. So Chloe, can I ask you to tell us what some of your other team members are doing?
Chloe K. Li, Al Jazeera English (00:44:15):
Yes, sorry, I was muted. Yeah, a lot of it has been – horrifically – some of it is using it for cleaning up scripts a lot of the time, which I don’t like, because we do a lot of international journalism. So with countries that the AI isn’t built around, it gets names wrong. We had someone’s name be mistranscribed as a slur instead of their name, stuff like that. I know sometimes there are people who might use it for prep for interviews and for trying to quickly find information. We do have a pretty tight turnaround every day. We do a 20-minute video show on a daily basis, and we only get one day to work on it. So I understand there’s a bit of a thing, but some generative AI has been used for prep, and then obviously we still look at it. But I have yet to use it, because I just feel weird. It feels like it kind of goes against everything journalistically we’ve been trained to do. But yeah, so those are some of the experiences that I’ve been –
Rachel Jones, National Press Foundation (00:45:28):
You should stand in my shoes. As someone who started 40 years ago, I think AI for me stands for archaic intellect. Some days I really feel I’m behind the curve, but before I ask my question, I see that Lionel and Kirti have their hands raised. So Lionel, why don’t you go ahead and ask your question and we’ll move on.
Lionel Ramos, KOSU (00:45:49):
Yeah, thank you, Rachel. I cover state government for Oklahoma for KOSU in Oklahoma City. And my question is, I guess, kind of practically related to going through transcripts and things that are transcribed from people who have accents. I cover a lot of the Latino community here in Oklahoma City. I’m Latino myself, and so I use a transcription service, Sonix, that will take the audio that I record for the radio, which can be in Spanish and English and sometimes both at the same time, and spit out a transcript pretty accurately. I always go and comb through. But one thing that I run into is when people use colloquial terms in an accent and it’s broken in the transcript, and now I have to correct for grammar. And oftentimes I don’t end up using that tape if I don’t have to, because usually I translate that also to the digital story.
(00:47:01):
So those quotes are also in text in our publication. So I try to stay away from situations like that, but sometimes that is the most relevant thing that someone says. So I’m just parsing out the fixing for what you hear and know as a member of a community, versus standard American English and AP style, which tells us that we shouldn’t fix for grammar and that we shouldn’t write how we hear people say things, so that everybody can understand it. And so that’s one thing that I’ve run into, I guess practically and also philosophically.
Lucretia Williams, Howard University (00:47:38):
Yeah. Oh, go ahead. It’s so interesting, because it’s similar to being a researcher: with qualitative research, if you do interviews and focus groups, you have to record and transcribe things as well. I err on the side of keeping it. I know journalism is a different field than research, but because I do community-based research, and also depending on the country it’s in, if that’s how they said it, I’m going to report it like that within the research publication. With the human-computer interaction and human-centered design work that I do, it’s their lived experience, it’s their perceptions. We are trying to capture those nuances, how we can improve technology, and capture their feelings and what they say. So I would say, and this is just my opinion, maybe you can be an advocate for how you see yourself reporting in your specific work. Maybe that’s a lane you want to carve, or maybe some talks that you want to have. But I do think when it comes to certain contexts, it is better to report what they say, broken or grammatically nonstandard English or whatever. I think it feeds the story and it feeds the accurate narrative.
Lionel Ramos, KOSU (00:49:10):
I do want to point out, so I write fiction just on my own, and when I do write my fiction, I write it as I hear it, and it actually feels a lot more accurate and a lot more contextual than anything that has ever been spit out to me by AI. But in the context of news, obviously there is that sense of accuracy. I’m not making stories up or inventing dialogue, perhaps like I would in a fictional story, and so I can do it well when the limitations of the news aren’t there. And so I just wonder, and maybe this is a question for Davar, about the trajectory of the industry, and industries, in producing content that is actually representative of the people that it is about, whether it’s AI or not, I guess.
Davar Ardalan (00:50:04):
Yeah. So I would say that you should start making a list of these phrases and words, and I’m not saying that they’re always the same, but imagine that you end up in the next year with 15 to 20 phrases or grammar combinations that become very, very regular. You actually can create a custom GPT where in the instructions you tell it: the majority of my interviews are from the Hispanic communities; whenever you see these phrases, make sure to add, and I’m making this up, an asterisk in my transcripts. So then what you do is, at the bottom of the transcript, you come up with your own original editorial note. You say: this is Lionel Ramos, I have trained my own custom GPT for copy editing and for AI fluency, and my editorial note to you is that this transcript has been edited and modified for clarity, but these terms that were, I don’t know if you want to say code-switched or whatever, are part of the Latino English vernacular.
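[Editor’s note: the flag-and-footnote idea above can be sketched in a few lines of Python. In practice a custom GPT would do this through its plain-language instructions rather than code; the phrase list, sample text, and note wording below are invented examples of the same logic.]

```python
EDITORIAL_NOTE = (
    "Editor's note: this transcript has been edited for clarity; "
    "terms marked with an asterisk are part of the Latino English "
    "vernacular and are preserved as spoken."
)

def flag_vernacular(transcript, phrases):
    """Mark each known vernacular phrase with an asterisk and, if
    anything was flagged, append the editorial note at the bottom."""
    flagged = False
    for phrase in phrases:
        if phrase in transcript:
            transcript = transcript.replace(phrase, phrase + "*")
            flagged = True
    if flagged:
        transcript = transcript + "\n\n" + EDITORIAL_NOTE
    return transcript
```

Keeping the note in one constant means every story that uses the list carries the same disclosure, which is the point of the original editorial-note suggestion.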
Lionel Ramos, KOSU (00:51:31):
And I think, to that difficulty of calling it what it’s called, there’s so much variability in the different expressions, and what people say, and what is picked up and what isn’t. And so I think that’s a really good note. I think I am going to start making a list. It really stands out to me sometimes when I’m in the middle of my workflow.
Rachel Jones, National Press Foundation (00:51:49):
Great. Yeah, go ahead. We’re running near the end of our session, so let me make sure to let Keerti ask her question.
Keerti Gopal, Inside Climate News (00:51:59):
Hi, thanks for being here. My name’s – hi, I’m a reporter with Inside Climate News, and yeah, you were mentioning some of the environmental concerns. My colleagues have been writing a lot about the data centers and the energy use of AI and the water use of AI, and also the energy bills and the costs and all this stuff. And so for us, whenever we’ve talked about AI and journalism, it feels like a pretty big conflict to be reporting on all the harms of this technology. And so we don’t really use it very much, and I think that’s something that my newsroom has been pretty resistant towards. And I’m curious, you mentioned thinking about using it in a sustainable way, but what does that really look like? And yeah, can you say a little bit more about how you reconcile those things?
Davar Ardalan (00:52:49):
OK, so I have to make sure that you all hear me before we end. I use AI all the time for my writing, for my research, for everything. So I’m not trying to be contradictory. I use it all the time, because I am 61 years old, I have been trained for 30 years as a journalist, and I know exactly how to use it for my purpose and then to be eco-friendly. So I’ll give you an example. Let’s say I have a transcript of an interview that I’ve done. I have created my own custom GPT where I’ve given it instructions. So number one, I put it in my custom GPT, and I say: summarize the important points, right? Then I look at it; I already have the summary. Then I say: this person comes from, this is their background, I met them in 1995 once, but then I met them again yesterday.
(00:53:52):
They are formidable in their space. Write a paragraph including this quote. It writes it. What I’m trying to say is that you can use AI, because it gave me 30% of my time back versus if I had to do that on my own. And that’s not unethical. That’s very ethical, because I’m using AI to, number one, summarize a transcript, and number two, when I prompt it, I’m giving it all of my own context, and I’m saying: now write the paragraph with the quote. So the way to be eco-friendly is to also make sure that you look at a piece of your day or your week as a journalist: what are the things that you do routinely? What are the things that you do all the time? OK, you transcribe, and then you review your transcripts, and you summarize them, and you pick the best quotes. Can you automate that? If you automate that, how much more time are you saving?
(00:54:55):
And then when you prompt the AI, you give it context, more context, and you ask it to write part of your story, because you’re doing part of the prompting; how much more time again are you saving? So the worst thing is when you just feed a whole bunch of stuff to AI and you say, OK, write this story. Number one, that is not journalistic. And number two, you’re wasting a lot of resources, because you’re just going to constantly go back and forth: no, I don’t like that tone, the tone doesn’t work, this doesn’t work. I have a custom-trained AI that knows how I write. I have given it my writing since 2016 related to AI. I know exactly which ways my current workflow is inefficient or how it can be more efficient. And that’s how I use it as a tool. So I hope that helps.
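[Editor’s note: the two-step workflow described above, summarize first, then supply your own context and ask for one paragraph around a chosen quote, can be sketched as prompt templates. The wording below is invented; in practice these instructions would live inside a custom GPT rather than in code.]

```python
def summarize_prompt(transcript):
    """Step one: ask for a summary of the raw interview transcript."""
    return ("Summarize the important points of this interview "
            "transcript:\n\n" + transcript)

def paragraph_prompt(background, quote):
    """Step two: supply the reporter's own context and ask for one
    paragraph built around a quote the reporter has already chosen."""
    return ("Context from the reporter: " + background + "\n"
            "Write one paragraph introducing this person, "
            'including this quote: "' + quote + '"')
```

Splitting the work into two small, specific prompts is also the eco-friendly point: each call does one bounded task, instead of repeatedly regenerating a whole story until the tone feels right.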
Keerti Gopal, Inside Climate News (00:55:51):
Sorry, just to clarify. So you’re saying that basically the eco-friendly way to use it is to be sparing with how much you’re using it, so that you’re not using it a lot?
Davar Ardalan (00:56:01):
Be sparing, but think about your workflow. You don’t want to be sparing because you’re like, oh, I can’t ask AI one more time because it’s another cup of water. No: in a given week, or in a given month, or for every story, I spend at least 20% of my time on picking the best soundbites. Let me create a custom GPT that knows my writing style. Give it 10 pieces of your writing. OK, then say to it: this is the context of my story, which soundbites do you think would be good? So yes, it’s sparing, but it’s very strategic, so use it very strategically. Lucretia, any thoughts from you?
Lucretia Williams, Howard University (00:56:50):
Yeah, I would say it’s the overconsumption of these generative AI tools, because also I want to remind us, AI has always been a thing. I think this generative AI, with all these ChatGPTs and things like that, has led to more overconsumption of running things, which is why they want all the data. There are more data centers to meet the mass consumption. So I would say some people overly use it. And I think what Davar is saying is, if you use it purposefully and strategically, then you specifically aren’t really overusing it. But I will say that there are ways, just in general, even outside of your work. Google automatically has the AI when you do a search; you can turn that off in your Google browser. And if your work uses Teams, if there’s Copilot, you can toggle some things off and on. So when you are going about your day, oh, just Googling where are the best tacos in the area, Google isn’t using generative AI for your every little search move. But if everyone in the world is doing that, then that’s the problem. But AI has always been a thing. We always had data centers. It’s the overconsumption now that is the problem. So be strategic and purposeful if you’re using it.
Rachel Jones, National Press Foundation (00:58:24):
That is very helpful for me as I consider the coverage of AI. The objective of many organizations now is to sound the warning about it, but this flips the script a bit. We are near the end of this session, so I want to let the two of you know that I would love to have a conversation with you at some point about using AI to cover Medicaid and what’s happening in that policy. I think that would be a very instructive way for me to think about it. I’ll quickly just say, and tell me if I’m on the right track: would you take all of the data about the Medicaid program, say in your state, and all the interviews, or go and do a lot of interviews with people who are Medicaid users, would you just put all of that into your custom GPT and then see what it says, and perhaps come up with a story idea? Is that the way to think about using it in your reporting?
Davar Ardalan (00:59:31):
Yeah, absolutely. But make sure that the prompt is very specific. So in other words, your question to the AI is going to say, for example: these are the interviews I’ve done, this is the November 2025 Medicaid report from the state of Maryland, do you see any discrepancies? So it’s going to tell you: I see a discrepancy, because Mrs. Jones told you that they used Medicaid in this way, but the data shows that the majority are using it in this way. You ask the AI to tell you if there are any discrepancies, and then later you go and check that data, and you’re like, holy shit, that’s good. So I would do that. But maybe, Rachel, just very quickly, I wanted to put two dates out there. On October 1st, Howard University is launching AI for Community on campus. Lucretia, can you just briefly talk about that? And Rachel, we can send this to you if you want to share with the students. Absolutely.
Lucretia Williams, Howard University (01:00:45):
Yes. We will be having a book panel discussion at the Howard University bookstore at 5:30 PM. So Davar and I will be there along with two of the other co-authors, and my two students are going to moderate for us. There are going to be refreshments, but we are really going to dive deeper into the book and also the current topics of AI in general. And then on the next day, October 2nd, from 1:30 to 6:30, the Human-Centered AI Institute at Howard, where I’m one of the senior research scientists, is going to have a symposium. It’s really going to be a fruitful discussion where we are going to have breakout discussions. You’re really going to meet other people and network. There are three tracks: the future of work and AI, health, and the education one, or something like that. But this is where you can really meet other people in this space and in the community and learn anything you want to learn about how we center the human within developing and designing AI technology products.
Davar Ardalan (01:01:51):
Will these sessions be recorded?
Lucretia Williams, Howard University (01:01:55):
I’m going to try to record the book panel discussion. It’s not going to be livestreamed, but I’m going to try to record it. And I don’t know if we’ll be able to record the whole day of the symposium, but we’re going to try to record at least the panel discussion at the symposium.
Davar Ardalan (01:02:14):
And then this Sunday, September 14th, there’s an AI for Community Salon in Virginia at Parse Place. I’ll send it to Rachel. And there’s only room for 25 people, and I think so far 16 people have signed up. So if you want to continue this conversation in a smaller group, please, please RSVP and join us. I will send that information to Rachel.
Rachel Jones, National Press Foundation (01:02:39):
So these are incredible opportunities that I hope I can take advantage of. Many of the fellows are located around the country, which is why I was asking about recording. So please do send me any links to any of these activities if they are being recorded, and we will share them with the fellows. But Davar Ardalan and Lucretia Williams, this has been a very important conversation about artificial intelligence and how we ensure that communities benefit from it. And so I’m grateful to both of you for joining us. Thank you so much for being here.
Davar Ardalan (01:03:19):
Thank you. Thank you for having us. Yeah, good luck everyone. Thank you.
Rachel Jones, National Press Foundation (01:03:25):
Take care. I’ll be in touch with both of you.
Davar Ardalan (01:03:27):
Thanks. Bye.
###
