What Could ChatGPT Do for News Production?

Published in Generative AI in the Newsroom

Feb 15, 2023

In early 2022 the Associated Press (AP) published a report on the overall level of AI understanding and readiness in US newsrooms. In response to an extensive survey, more than a hundred participants shared, among other things, their “wish list” of tasks where they really wanted some help from automation. The report aggregates and compiles those responses into a helpful set of tasks where there’s clear demand.

Here I walk through these tasks with an eye towards whether generative AI models like ChatGPT could feasibly help. In virtually all cases additional work is needed to evaluate the quality of output for the tasks. In general, I would not recommend using ChatGPT for these tasks without a strict and close editorial process for fact-checking outputs before anything is published.

Based on my analysis of the tasks there are some patterns in terms of the types of tasks where such models can be helpful. These include tasks involving classification into potentially user-defined categories, rating of documents along dimensions of interest like newsworthiness, summarization of text in journalistically aware ways, and personalization of content based on end-user characteristics. Classification and rating are more analytic and can generate structured data that can be used in downstream filtering or ranking, whereas summarization and personalization are premised on outputting text that is either more efficient, or more relevant to consume.

For each task below, I’ll quote the description from the AP report and then add my commentary for how ChatGPT could be useful, whether it has clear limits, and if there are other considerations at play. For implementation examples of some of these tasks see my Colab Notebook, and if you’re excited to try things yourself, check out the Generative AI in the Newsroom Challenge. These tasks are of course just a starting point — lots more creative applications are still waiting to be explored.

1. Content Discovery

“Flagging and gathering social media content like trends, quotes from newsmakers, flag and gather content on government websites, e.g., COVID-19 data, court records, law enforcement records”

The AP distinguishes between content discovery from structured data, like Excel sheets or databases, and content discovery from unstructured data, like written documents. Traditional analytics methods are probably better for discovery from structured data, whereas ChatGPT seems well-suited to discovery from unstructured data.

Discovery from unstructured data basically boils down to scanning documents and classifying them according to some category of interest, like newsworthy. Every content discovery task is a little bit different though, and there are a lot of different factors that play into whether a document is actually newsworthy. A GPT-based classifier of news value will need to be carefully evaluated for each specific content discovery task. If you want it to flag interesting court records in comparison to, say, law enforcement records you’ll likely need to write different, specifically tailored prompts. And you’ll need to evaluate the accuracy of each specific classifier to assess things like false positives and false negatives.
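To make the evaluation step concrete, here is a minimal sketch in Python. It assumes an editor has hand-labeled a small sample of documents as newsworthy or not, and that you have recorded the model's flag for each of the same documents. The function name and the sample labels are hypothetical and only for illustration.

```python
def evaluate_flags(editor_labels, model_labels):
    """Compare model flags with editor judgments for the same documents."""
    if len(editor_labels) != len(model_labels):
        raise ValueError("label lists must be the same length")
    pairs = list(zip(editor_labels, model_labels))
    tp = sum(1 for e, m in pairs if e and m)          # correctly flagged
    fp = sum(1 for e, m in pairs if not e and m)      # flagged, but not newsworthy
    fn = sum(1 for e, m in pairs if e and not m)      # newsworthy, but missed
    tn = sum(1 for e, m in pairs if not e and not m)  # correctly ignored
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall}


# Hypothetical labels for six court records (True = newsworthy).
editor = [True, True, False, False, True, False]
model = [True, False, True, False, True, False]

report = evaluate_flags(editor, model)
print(report)
```

A low precision means reporters waste time on false alarms, while a low recall means stories are being missed. Which matters more will likely differ between, say, court records and law enforcement records, so each prompt deserves its own labeled sample.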

2. Document Analysis

“Processing large sets of public records like campaign finance records, state legislation, civil complaints, municipal budgets; in combination with text summarization to help reporters”

Much like the content discovery task, one aspect of document analysis comes down to categorizing and flagging interesting documents for further reporter attention. Another aspect is in summarizing documents so they can be quickly scanned by reporters for relevance. These aspects can get blurry though. For instance, in some of our ongoing research on this topic, we use GPT-3 to summarize newsworthy angles from scientific abstracts, embedding a news judgment into the summarization process. Any document analysis task will also need a careful evaluation of accuracy, both of any news value judgment applied and of the text of the summary itself.

When reporters work with documents, those documents sometimes need to be digitized. Text might first need to be indexed through Optical Character Recognition (OCR) processes that can sometimes introduce errors. Whether ChatGPT is robust to these kinds of errors in data is still an open question.

3. Translation

“Translating published stories into multiple languages and processing raw data in various languages”

If you want to publish translations of stories you’re probably better off using a tailored solution (e.g. Google Translate) rather than a general purpose language model like ChatGPT. Even better would be a translation done by a person who can make accurate linguistic and cultural interpretations. But, if you’re thinking more about a use case like multilingual content discovery or document analysis, then ChatGPT might be appropriate for doing a rough translation that can be scanned and evaluated for relevance or interest. This could be interesting, for instance, in multilingual engaged journalism. But ChatGPT is best used only for getting an initial gist of a translation, and is not for publishing. If you want to rely on a document published in another language as a piece of evidence in your reporting, you’ll want to check with a fluent colleague, or maybe even get a certified translation.

4. Tips Processing

“Moderating story idea submissions and questions; verifying tips.”

One aspect of tips processing is a sort of special case of content discovery from unstructured documents that come in via email or other information channels. In some of our work on engaged journalism we’ve looked at survey callouts and online comments as avenues for learning about community interests for follow up reporting. For these types of use cases ChatGPT can be used in all the same ways for detecting newsworthiness and relevance of a document, according to whatever categories of newsworthiness you want to set up. Again, you’ll need to do tailored evaluations of this for the specific kinds of “tips” you’re scanning.

In terms of verifying tips, ChatGPT is not able to help at all. Remember that ChatGPT is limited in terms of its training data. As search engines begin to integrate ChatGPT they’ll build in ways to allow the model to pull from more up-to-date search results, but this, again, will not be of any use in corroborating a tip: if there were data or documents out there to corroborate the tip that the model already had access to then it wouldn’t really be a very good tip, would it?

5. Social Media Content Creation

“Generating content (e.g., text, video, photo, audio) and scheduling optimized posts to Twitter, Facebook, Instagram; cropping photos and videos for different formats.”

It’s a complex communication task to find the right tone for social posts and frame them so they resonate with particular audiences. And so I wouldn’t necessarily recommend trying to automate this task using generative AI. But producing excerpts and summaries of textual content for social channels is something that ChatGPT can definitely help with as long as there’s a human in the loop to dynamically prompt the machine and evaluate the results before they’re published. Perhaps there are patterns of prompts that could accelerate engagement editors’ work.

ChatGPT won’t be able to help with the other aspects of this task category: scheduling posts and cropping visual media. Scheduling is an analytic function that’s better addressed with other types of machine learning based on audience characteristics and data on past publishing. No doubt there will be other types of generative AI models that can help with visual tasks of cropping and editing video, but those will need careful evaluation before they’re used for publishing.

6. Automated Writing (Structured Data)

“High school sports, college sports, weather, natural events (e.g., tides, fires), restaurant report cards, police logs, elections, agriculture grain bids, business licenses, real estate and community calendars.”

Automated writing based on structured data is a familiar use case for news media going back a decade or more. But most of the industrial-scale automated writing in use today is based on more straightforward template-based approaches. ChatGPT can also render fluently written text based on structured data inputs, but it comes with more caveats because of the statistical sampling involved in its text production. Models like ChatGPT also generally can’t do math, and so if there are specific mathematical operations needed, it is better to do those with traditional analytic methods first and then feed those pre-computed data files into ChatGPT for language generation. The bottom line is that texts generated from data need to be evaluated for accuracy before they are published.
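As a rough illustration of doing the math first, the sketch below computes totals and the margin of victory in Python and only then assembles a prompt from those pre-computed facts. The box score, field names, and prompt wording are all hypothetical, and the code stops short of calling any model, since that part depends on whichever service you use.

```python
# Hypothetical high school box score (illustrative data only).
game = {
    "home_team": "Lincoln High",
    "away_team": "Central High",
    "home_points_by_quarter": [14, 7, 10, 3],
    "away_points_by_quarter": [7, 7, 3, 14],
}


def precompute_facts(game):
    """Do all arithmetic here so the model never has to."""
    home_total = sum(game["home_points_by_quarter"])
    away_total = sum(game["away_points_by_quarter"])
    if home_total > away_total:
        winner = game["home_team"]
    elif away_total > home_total:
        winner = game["away_team"]
    else:
        winner = "tie"
    return {
        "home_team": game["home_team"],
        "away_team": game["away_team"],
        "home_total": home_total,
        "away_total": away_total,
        "margin": abs(home_total - away_total),
        "winner": winner,
    }


def build_prompt(facts):
    """Turn the pre-computed facts into a prompt for language generation."""
    fact_lines = "\n".join(f"- {key}: {value}" for key, value in facts.items())
    return (
        "Write a two-sentence game recap using only the facts below. "
        "Do not compute or infer any new numbers.\n" + fact_lines
    )


facts = precompute_facts(game)
prompt = build_prompt(facts)
print(prompt)
```

Because every number in the prompt was computed beforehand, an editor checking the generated recap only has to confirm that the figures in the text match the facts dictionary.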

7. Automated Writing (Unstructured Data)

“Obituaries, press release briefs, event previews, etc”

Automated writing from unstructured data is more challenging than the structured data case. But ChatGPT can also work to extract structured data from unstructured data, and then use that to generate text. Perhaps there are press releases your newsroom receives about community events and the model can extract things like dates, times, locations, and short descriptions of those events. This structured data can then be fed back to the model to output a written description of the event. Much like for the structured case though, you’ll want to check the output for accuracy before publication.
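One way to make the extraction step checkable is to ask the model to reply in JSON and then validate that reply before feeding it back for text generation. The sketch below assumes such a JSON reply. The field names, the date format, and the sample reply are my own illustrative choices, not anything prescribed by the AP report.

```python
import json
from datetime import datetime

# Hypothetical schema for a community event extracted from a press release.
REQUIRED_FIELDS = ("title", "date", "time", "location", "description")


def parse_event(model_output):
    """Parse the model's JSON reply and return (event, problems)."""
    try:
        event = json.loads(model_output)
    except json.JSONDecodeError:
        return None, ["reply was not valid JSON"]
    if not isinstance(event, dict):
        return None, ["reply was not a JSON object"]

    # Flag fields that are absent or empty.
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if not event.get(name)]

    # Flag dates that do not follow the format requested in the prompt.
    date_text = event.get("date")
    if date_text:
        try:
            datetime.strptime(date_text, "%Y-%m-%d")
        except (TypeError, ValueError):
            problems.append(f"date not in YYYY-MM-DD form: {date_text}")
    return event, problems


# Hypothetical model reply with an empty description and a loose date.
reply = ('{"title": "Spring Book Sale", "date": "March 4", "time": "10:00", '
         '"location": "Public Library", "description": ""}')
event, problems = parse_event(reply)
print(problems)
```

These checks only catch missing or malformed fields. They cannot tell you whether an extracted date or location actually matches the press release, so a person still needs to compare the two.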

The AP description also suggests that “obituaries” are an example where there is a need for automation. Ethically it feels wrong to delegate the task of selecting the most important and meaningful life events for a person to a machine. However, if an editor selected life events and wrote up a set of bullets, you could imagine having ChatGPT render that into a draft text. Then again, if you’re going to do the work of researching, selecting, editing, and writing the life points, I doubt ChatGPT can save much time.

8. Newsletters

“Personalize newsletters and optimize newsletter delivery times.”

There are better ways to do newsletter curation using more traditional recommender system approaches, and ChatGPT won’t be any help with optimizing delivery times either. But there’s a lot of potential for newsletter personalization in terms of tailoring the curated content to appeal to an individual’s interests. ChatGPT could be used to frame or rewrite headlines or summaries of the curated content to be more personally appealing, based on a user model. For instance, if I know that you’re the type of person who often reads articles about inflation, I could highlight that in the headline or blurb of an article in a newsletter. This use case is more about emphasizing the frames from an article which are particularly appealing to an individual. Much as for other automated writing use cases these personalized excerpts would need to be carefully evaluated though. To implement this you would probably define a limited set of audience segments for personalization, and then assign an editor to manually review how each audience segment would see the newsletter before publication.
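The segment-based approach could be sketched as follows: one prompt per audience segment, collected in a dictionary so an editor can review every variant before the newsletter goes out. The segment names, interest descriptions, sample article, and prompt wording are hypothetical, and the actual call to a model is left out.

```python
# Hypothetical audience segments and a short description of each one's interests.
SEGMENTS = {
    "economy_readers": "often reads articles about inflation and the cost of living",
    "local_sports_readers": "often reads high school and college sports coverage",
}


def build_blurb_prompts(headline, summary, segments):
    """Return one rewrite prompt per segment, keyed by segment name."""
    prompts = {}
    for name, interests in segments.items():
        prompts[name] = (
            "Rewrite the newsletter blurb below for a reader who "
            f"{interests}. Emphasize only angles already present in the "
            "summary and do not add new facts.\n"
            f"Headline: {headline}\n"
            f"Summary: {summary}"
        )
    return prompts


# Hypothetical article for illustration.
prompts = build_blurb_prompts(
    "City Council Approves New Stadium Budget",
    "The council voted to fund a stadium renovation despite rising construction costs.",
    SEGMENTS,
)
for segment, prompt in prompts.items():
    print(f"--- {segment} ---\n{prompt}\n")
```

Keeping the number of segments small is what makes manual review feasible: the editor reads as many versions of each blurb as there are segments, not one per subscriber.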

9. Text Summarization

“Generate summaries (e.g., briefs, from government meetings and cut downs of broadcast packages)”

There’s a lot of information coming at journalists every day, and summarization could help make scanning that information more efficient. But the key issue here is whether the summarization is journalistically aware. Is the summarization able to pick up on journalistic relevance in selecting and prioritizing what subset of information to include in the summary? You could certainly apply ChatGPT to summarize a government meeting, but it would be a complete failure if it left out a key controversial exchange. The main question is: in your specific context, what is your tolerance for missing something? There are possibilities here, but still much research to do on how to prompt ChatGPT to be sensitive to the right factors in summarization and to evaluate the system’s abilities for this task.

10. Comment Moderation

“Filtering by language, duplicate commenter accounts and compiling comments for marketing”

Comment moderation is a use case of great interest, not only for news organizations but also for social media platforms in general. Insofar as comment moderation can be reduced to classification (e.g. “hate speech”, “top-quality comment”) then ChatGPT can be used here, though with all of the disclaimers described above. For any particular category of comment that should be detected there should be an extensive evaluation of the accuracy, false positives, and false negatives, with an eye towards the cost to the community of any of those moderation errors. The AP report also mentions “compiling comments for marketing” and this starts to get into more of a summarization use-case. If the marketing is low-stakes it should also be possible to use ChatGPT here.
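Weighing moderation errors by their cost to the community can be sketched by attaching a weight to each kind of mistake. In the example below the weights are hypothetical editorial judgments (here, missing a hateful comment is treated as five times worse than wrongly removing a benign one), and the labels are made up for illustration.

```python
def moderation_cost(human_labels, model_labels,
                    cost_false_positive, cost_false_negative):
    """Count moderation errors and weight them by their assumed cost."""
    if len(human_labels) != len(model_labels):
        raise ValueError("label lists must be the same length")
    pairs = list(zip(human_labels, model_labels))
    false_positives = sum(1 for h, m in pairs if m and not h)  # benign comment removed
    false_negatives = sum(1 for h, m in pairs if h and not m)  # harmful comment missed
    total = (false_positives * cost_false_positive
             + false_negatives * cost_false_negative)
    return {"false_positives": false_positives,
            "false_negatives": false_negatives,
            "total_cost": total}


# Hypothetical labels for five comments (True = should be removed).
human = [True, False, False, True, False]
model = [True, True, False, False, False]

result = moderation_cost(human, model,
                         cost_false_positive=1.0,
                         cost_false_negative=5.0)
print(result)
```

Two prompts with the same raw accuracy can have very different total costs, and the weights themselves are a community and editorial decision, not something the model can supply.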

11. Content Transformation and Reuse

“Format articles as structured data to enable reuse in different platforms, format broadcast scripts for the web.”

This will depend on the exact transformations and data extractions needed to reuse content on different platforms, but in general ChatGPT should be able to help with these kinds of tasks. One example would be in taking an article and generating a headline and keywords for posting it to social media. If the metadata generated by the model is not visible to end users, such as with keywords, you can probably make do with lower accuracy. But still, any outputs of the model should be evaluated to ensure they’re high-enough quality and not totally off base.

12. Search Engine Optimization

“Integration with A/B headline testing, offering recommendations and integration with archives to recommend evergreen content”

Non-GPT AI technologies for content recommendation from archives and for A/B headline testing are already well-developed. The area where I see models like ChatGPT helping here is in suggesting variations of headlines for A/B testing. In this case I expect fine-tuned versions of large language models will have a better chance of recreating any particular publication’s headline writing style. Otherwise, you would want to come up with unique and tailored prompts to set apart the headlines generated. As with other use cases though you’ll always want to have a human in the loop. Much better to think of this as a headline brainstorming tool to help you come up with diverse headlines to test with your audience. More broadly, metadata useful for SEO purposes, such as keywords or a summary text, could also be generated. The way search engines integrate generated text from an SEO perspective will be an area of active scrutiny and development as companies like Google and Microsoft also figure out what it means to have the technology more firmly embedded in their search engines.

13. Push-Alert Personalization

“Extending story recommendation abilities to personalize mobile push alerts”

While you would want to use non-GPT AI technologies for the actual push-alert recommendation engine, ChatGPT could be useful in the same ways as discussed above for use cases in headline or newsletter generation. For instance, the writing of push alerts could be personalized to different audience segments. As with virtually all the other use cases you’ll want a person checking the text generations before publishing them.