
1 Introduction

Artificial intelligence (AI) has advanced rapidly since its inception in the mid-1940s [1, 2]. In the public sector, there are several examples of successful AI deployments. For instance, AI helps tax authorities classify individual and corporate taxpayers to customize services and prevent fraud, predicts arrivals, departures, and delays in public transport for commuters, pinpoints locations where police departments should concentrate their efforts to deter crime, and helps immigration authorities identify potential red flags and recommend which immigration applications should be approved [3,4,5,6]. There has also been criticism of the use of AI applications in public services. For instance, concerns regarding prejudice and discrimination were raised when the Austrian labor administration established a system to classify job seekers according to their chances of landing a position [7]. AI is envisaged to change society in both favorable and unfavorable ways, whether by improving healthcare outcomes or streamlining supply chains [1, 8]. Globally, public service organizations are competing to be the first to capitalize on the AI hype and, in the process, are attempting to establish themselves as trustworthy, transparent, and open AI stakeholders [1, 9,10,11].

Transparency is an important cultural value, particularly in the Nordic countries [12, 13]. Transparency in AI concerns the right and ability to obtain information about the use of AI in organizations [1]. Transparency can take many forms, but among its core elements are the availability of information about an organization's internal operations and performance and the improved ability of external actors to monitor decisions or actions within that organization [14]. AI transparency in public service organizations is deemed highly important by the numerous actors engaged in elucidating AI processes [1, 15]. Lack of transparency in public services is often linked with dysfunctional government, poor decision-making, lack of accountability of public officials, social-media-induced polarization, and the spread of misinformation [10, 16,17,18]. It is generally believed that transparency is vital for maintaining a balance of power between public service authorities and citizens, thereby increasing the likelihood of exposing wrongdoing or abuse of power [19].

A common mechanism to promote transparency is providing access to data through Application Programming Interfaces (APIs), dedicated apps, and portals [20]. Such interfaces enable strictly regulated operations on the data, allowing it to be analyzed, visualized, and explored using data-driven algorithms and big data technologies [10]. However, achieving full AI transparency in public services is challenging in practice for several reasons [19]. First, creating transparency from public services data is hampered by numerous socio-technical constraints, so simply opening the data is insufficient [21, 22]. Second, public services data portals may reinforce the opinions of their designers through the presentation of selected and aggregated data, thereby stifling the range of perspectives of a diverse society [10, 23]. Given these challenges, research advancements can help establish transparency in the rendering of public services. To this end, we seek to address the following research question:

What are the effective design strategies for AI transparency in public services?

In doing so, we consider the context of a Norwegian public services organization that plays a key role in the Norwegian government's public administration by managing different types of citizens’ benefits. This study specifically focuses on one of the organization’s several AI initiatives: developing a model to predict the length of sick leaves. Such a model can serve as an additional resource for case handlers, assisting them in concentrating their efforts towards the goal of providing effective services intended for “user-adapted follow-up” [3]. This context is especially fitting for our study because the Nordic countries are known for their innovative approaches to ensuring the transparent functioning of government and are highly regarded within the European Union for their high standards of transparency [1, 11]. This study contributes to the evolving discourse on human-AI interaction and citizens’ needs for AI transparency in the context of public services [8, 24, 25].

This article is structured as follows. Section 2 covers the related literature and background of this study, elaborating on the role and importance of AI transparency in rendering public services. The next section discusses the methodology employed to conduct this research, explaining the details of action design research and data collection. Section 4 presents an overview of the design of the AI prototype used for this study. Subsequently, we discuss the findings of this study. A discussion of the implications and limitations of this study follows in Sect. 6. Finally, Sect. 7 concludes this study.

2 Background

Transparency in public services is holistically defined as “the availability of information about an organization or actor that allows external actors to monitor the internal workings or performance of that organization” [14]. As such, ensuring transparency in rendering public services through AI has the potential to improve quality, enable personalization, and ensure informed decision-making and more efficient use of resources in public service delivery [9, 26, 27]. Although awareness of the ethical importance of algorithms and AI has been growing, the general public remains unaware of even the most fundamental aspects of AI [28]. There are initiatives in place to make AI more understandable, such as Finland's national online AI course, which inspired a global counterpart [1]. However, much remains to be done to overcome misconceptions and reservations about the operation of AI. In the past, citizens’ concerns have led to the termination of several public service AI initiatives shortly after their inception [29]. One such example pertains to System Risk Indication (SyRI), a Dutch initiative for profiling citizens to detect social services fraud [30]. A Dutch court found the SyRI legislation to be in violation of the European Convention on Human Rights (ECHR) and called for a halt to its further use [31]. The plaintiffs contended that, owing to a lack of transparency, citizens were unaware of the risk model or algorithm employed and were therefore unable to defend themselves against decisions made using SyRI [30]. Another example relates to a lawsuit filed by a number of Brazilian legal organizations demanding a halt to the use of facial recognition technology for public safety (Footnote 1). They claimed that the technology violated citizens’ fundamental rights to free speech, assembly, and association, and would worsen issues such as inequality and structural racism. It is therefore crucial for citizens to know that AI is used in a transparent and accountable manner, upholding all established rules for public service delivery [9, 32]. National strategic policies for AI strengthen notions of openness and trust while fostering sociotechnical understanding, resulting in a deeper comprehension of AI transparency [1].

2.1 Role of National Culture in AI Transparency

Though several cultures share the same values of transparency, trust, openness, accountability, and fairness, they do not give those values the same priority [33]. These variations may cause AI policies from various nations to enshrine the same values in different ways [1]. Moreover, with inherent economic, moral, and socio-cultural differences between countries and regions, perceptions of how human-AI interactions should operate do not follow a common mindset [34]. For example, preference disparities in interface design were found when older East Asian and Caucasian adults were presented with two distinct interface designs [35]. As such, various governments (primarily in the Nordic countries) have started to create national strategic policies for AI. It is essential to understand socio-cultural shifts and how they should be reflected in a nation’s strategic AI policy [36]. However, researchers and practitioners who could inform their governments about the national outlook on AI are themselves grappling with its understanding due to the scarcity of appropriate research studies in this field.

The concept of transparency has been frequently debated in the fields of AI and Human-Computer Interaction (HCI) [14, 37]. Proponents stress that transparency fosters understanding, a tighter bond between citizens and government, and better insight into public services [12]. Accordingly, a number of authors have contended that the public's lack of faith in public services stems mainly from citizens not being provided with accurate information regarding the operations and efficacy of public services [38]. On the flip side, critics of transparency argue that showing the outcomes of public services may not actually boost citizens’ trust [39]. They contend that transparency could possibly “delegitimize” the administration [14]. They assert that transparency only functions when users can process the disclosed information, and since most of the material released by public service portals is too difficult even for professionals to understand, transparency in AI merely increases ambiguity and distrust [40, 41].

Despite contrasting viewpoints about AI transparency in public services, governments across the world are jockeying to make their public services more transparent. However, they come across different types of barriers in the design of open data applications and portals [42]. These barriers can be classified as data quality barriers, technical barriers, economic barriers, organizational barriers, ethical barriers, human barriers, political barriers, and usage barriers [10].

2.2 Norwegian Guidelines on AI Transparency

Transparency is highlighted as a key value in AI development in the Norwegian policy guide, which describes the country's national strategy for AI [34]. The guide also makes serious efforts to promote and practice transparency in its discussions of AI [43]. Compared with other national policy documents, which provide truncated technical overviews, the first few pages of the Norwegian policy guide explain AI technologies in detail, providing any stakeholder with fundamental AI literacy [1]. Furthermore, the guide recognizes that a “lack of transparency” is an issue when citizens attempt to access AI-based public services [1]. The policy states that in order to minimize burdens on citizens, improve access to services, and eventually enhance perceptions of transparency, “the Government has established a ‘once only’ principle to ensure that citizens and businesses do not have to provide the same information to multiple public bodies” [44].

This study responds to recent calls for research on citizens’ preferences and AI adoption in public services [26, 45,46,47].

3 Methodology

3.1 Research Context

This study is conducted in collaboration with a large Norwegian public welfare service organization responsible for different kinds of social benefits. The organization has established a specialized team to explore the potential of AI and machine learning technology for improving the efficiency, robustness, and quality of public services. One direction the team explores is using AI systems to reduce administrative burdens for citizens and employees by improving the use of time and resources. The organization has developed a predictive model for estimating the probable duration of a person's sickness absence from work. The model is envisioned as a decision support system that helps case handlers decide whether a legally mandated meeting to discuss additional support for the person on sick leave will be necessary. These meetings can be resource-heavy and involve many actors, but Norwegian law allows them to be omitted if deemed unnecessary.
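
The organization's actual model is not described in this paper. Purely as an illustration of the kind of decision support sketched above, a minimal duration prediction model could look as follows; the feature names, the synthetic data, the model choice, and the waiver threshold are all hypothetical:

```python
# A minimal, hypothetical sketch of a sick-leave duration model used as
# decision support. Not the organization's actual model: features, data,
# and the 12-week threshold are invented for illustration.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 1000
cases = pd.DataFrame({
    "age": rng.integers(18, 66, n),
    "weeks_absent_so_far": rng.integers(1, 26, n),
    "prior_sick_leaves": rng.integers(0, 10, n),
})
# Synthetic target: total sick-leave duration in weeks
duration = cases["weeks_absent_so_far"] + rng.gamma(2.0, 4.0, n)

X_train, X_test, y_train, y_test = train_test_split(cases, duration, random_state=0)
model = GradientBoostingRegressor().fit(X_train, y_train)

# Decision support, not decision-making: flag cases with short predicted
# duration, where a case handler might deem the dialogue meeting unnecessary.
predicted = model.predict(X_test)
may_waive = predicted < 12  # hypothetical 12-week threshold
print(f"{may_waive.mean():.0%} of test cases flagged for possible waiver")
```

In such a setup the prediction only informs the case handler, who retains the final decision, consistent with the role described above.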

3.2 Action Design Research

The study employs the Action Design Research (ADR) methodology, which is a collaborative research approach that allows researchers to work closely with organizations to solve real-world problems [48]. The ADR methodology involves multiple iterative “build-intervene-evaluate” (BIE) cycles, where both researchers and stakeholders jointly shape the research artefact. We held bi-weekly collaboration sessions with the organizational partner to discuss design choices, provide feedback, and plan for the next iterations of the prototype. The emerging artefact then served as a tool to explore and refine the practical application of AI in public service, ensuring that the resulting systems are not only technologically advanced but also ethically aligned and human-centric [49]. Consistent with ADR, we reflected on the problem understanding, theories, and emerging designs, evaluating potential knowledge contributions to the field.

3.3 Data Collection and Analysis

We recruited 28 individuals aged 18–65, mirroring the demographic distribution of Norwegians on sick leave as reported by the central statistics office in Norway (2022). Because the study focused on understanding specific user perspectives in a given context through rich qualitative data, 28 participants were deemed sufficient to provide diverse and in-depth empirical data [50,51,52]. The data collection comprised three consecutive parts. First, participants provided general demographic information and self-assessed their prior AI knowledge and frequency of technology use. Next, a moderated task-based user study was conducted, in which participants interacted with the interactive prototype and made decisions based on a scenario. The scenario led to a decision point at which they were asked whether they would consent to the use of the AI-supported sick leave duration prediction system. Participants were asked to think aloud while interacting with the prototype, which helped capture their thoughts and feelings and provided insights into their reasoning [53]. Finally, participants rated their agreement with a set of statements related to the scenario and answered open-ended questions about their decision-making processes. The study followed ethical standards and legal regulations and was approved by the Norwegian Center for Research Data (NSD).

The collected data was then categorized into three types: scale ratings, verbal feedback, and interactions with the interactive prototype. Each data type was first examined independently before being integrated with the other two. In the first step, we analyzed participants’ responses to predefined questions. We compiled the scale ratings into a spreadsheet, which provided a comprehensive view of the data, revealed diversity and trends, and allowed us to track shifts in opinions throughout the study. We then undertook an exhaustive analysis of the qualitative data, including verbal feedback from the think-aloud recordings and responses to the open-ended questions about the different explanation types. This involved multiple reviews of the study recordings and a detailed examination of interview transcripts. During this process, we highlighted notable incidents, quotes, and expressions from participants, and engaged in in-depth discussions within our research team to collaboratively interpret these insights. Finally, we analyzed how participants interacted with the prototype, drawing on session recordings and notes. We focused on identifying which information elements were accessed, how participants engaged with them, and the time spent on them. By integrating this analysis with the insights from the earlier stages, we aimed to strengthen our overall examination and identify emerging concepts and recurrent themes, cross-referencing observations and findings across the different data categories.

4 Prototype Development

To evaluate the different types of explanations and information in a naturalistic context, we developed an ensemble artefact, embodied in an interactive prototype [54]. The prototype mimics a public service portal, depicting a specific interaction sequence in which users are asked whether they would consent to the optional use of an AI-based prediction system. The development of our prototype was informed by the Google PAIR framework, a comprehensive collection of guidelines formalizing Google’s insights from industry and academia (Footnote 2). Compared with other human-centered AI frameworks, Google’s PAIR framework is the most comprehensive and balanced [55]. Auernhammer [56] emphasizes the importance of embedding technologies within a user-centric interface that aligns with users’ workflows and cognitive models, beyond just crafting a robust algorithm or machine learning model.

4.1 Information Elements

The prototype replicates a public service portal, presenting a scenario where users are asked about their willingness to consent to the use of an AI-based prediction system. The scenario starts with a standard portal notification that directs users to the introduction page of the AI prediction tool. Consistent with the PAIR framework's first chapter (User Needs + Defining Success), it is crucial to identify areas where AI's capabilities align with user requirements. Ehsan and Riedl [57] emphasize that for AI systems to be considered trustworthy and ethically sound, explanations must be clear enough to enable users to make informed judgments. In the specific public service context, it is also vital to consider and convey the rights and responsibilities of all parties involved [58]. For our case, this includes explaining the government's mandate on mandatory dialogue meetings, the possibility of waiving these meetings, and the citizen's right to dissent to the use of the AI system. This information is provided in a textual format (Fig. 1). Furthermore, the information includes the collective benefit derived from consenting to the use of AI in this particular context. This entails addressing considerations of social ethics rather than imposing moral obligations on citizens.

Fig. 1. Introductory interface of the AI prediction tool

To enhance transparency regarding the process, data usage, and the model's functionality, participants were allowed to explore different information elements and descriptions of the AI prediction tool. One such element is a process diagram designed to illustrate the AI's role within the overall process and emphasize the case handler's critical involvement (Fig. 2). As per Chapter 3 of the PAIR guidelines (Mental Models), such visual explanations are vital for setting realistic expectations about the role of an AI system, its capabilities and limitations, and the value it offers (see Footnote 2). The diagram shows the AI system within the context of the full process flow and delineates areas where AI supplements human decision-making. It also depicts the procedure in the absence of the AI-enhanced prediction tool, should a citizen not consent to its use. The chart aims to provide a comprehensive view of the AI's role in the process, thereby fostering transparency and deeper understanding, and enabling citizens to better understand their rights and role in the process.

Fig. 2. Process visualization

The prototype also includes a data table providing information about the data used by the AI model and the rationale for its use (Fig. 3; a hypothetical sketch of such a table follows the figure). This element is based on the PAIR recommendations in Chapter 2, which emphasize the importance of explaining the usage and benefits of specific data to users. The data table is preceded by an introductory paragraph clarifying that no new data will be collected; instead, existing data will be reused. The table itself consists of two columns: one for the type of data and the other for why it is necessary. This transparent approach to data aims to enhance participants’ understanding of the system and provide insights into the personal data used.

Fig. 3. Information about the data used by the AI prediction tool
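
To make the two-column structure concrete, the following sketch renders a table of this shape; the rows are invented for illustration and are not the organization's actual data fields:

```python
# Illustrative rendering of a two-column "data type / rationale" table in the
# style of Fig. 3. All rows are hypothetical examples, not the real fields.
data_rationale = {
    "Age": "Sick-leave duration patterns differ across age groups.",
    "Length of current sick leave": "Main signal for the remaining duration.",
    "Previous sick-leave history": "Recurring absences shift the estimate.",
}

print(f"{'Type of data':30} | Why it is necessary")
print("-" * 72)
for data_type, rationale in data_rationale.items():
    print(f"{data_type:30} | {rationale}")
```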

Lastly, we integrated an interactive chart showing Shapley values for feature importance visualization (Fig. 4; a minimal computational sketch follows the figure) [59]. This chart highlights the importance of various variables across different age groups, providing a visual representation of the model's logic. This enhances exploratory transparency and helps citizens calibrate their trust in the system [58]. The integration aligns with Chapter 4 of the PAIR framework (Explainability + Trust), which states that while providing detailed explanations of the model's logic may not always be possible, the reasoning behind predictions can sometimes be made accessible. The interactive feature importance chart aims to fill this gap by presenting users with a visually engaging way to comprehend the model's inner workings.

Fig. 4. Feature importance chart
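
For readers interested in how such a per-age-group view can be derived, the following is a minimal sketch using the open-source shap library; the model, feature names, age grouping, and synthetic data are all hypothetical and not the organization's actual implementation:

```python
# A minimal sketch of Shapley-value feature importances split by age group,
# in the spirit of Fig. 4. Everything here is hypothetical and synthetic.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 500
X = pd.DataFrame({
    "age": rng.integers(18, 66, n),
    "weeks_absent_so_far": rng.integers(1, 52, n),
    "prior_sick_days": rng.integers(0, 120, n),
})
y = 0.5 * X["weeks_absent_so_far"] + 0.05 * X["prior_sick_days"] + rng.normal(0, 2, n)

model = GradientBoostingRegressor().fit(X, y)
explainer = shap.Explainer(model, X)  # dispatches to a tree explainer here
shap_values = explainer(X)

# Mean absolute SHAP value per feature, within each (hypothetical) age group
for label, mask in {"18-40": X["age"] <= 40, "41-65": X["age"] > 40}.items():
    importance = np.abs(shap_values.values[mask.to_numpy()]).mean(axis=0)
    print(label, dict(zip(X.columns, importance.round(2))))
```

An interactive chart like the one in the prototype would present these per-group importances graphically rather than as printed values.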

At the end of the interaction sequence, citizens reach a consent decision point where they must decide whether to agree to the use of AI or to opt out (Fig. 5). This emphasizes the voluntary nature of their agreement and the option to withhold consent. Consistent with Chapter 5 (Feedback + Control) of the PAIR framework, it is essential to convey citizens’ control over their data usage and the decisions they make.

Fig. 5. Consent decision point

The chapter also stresses the importance of feedback mechanisms, which can help refine the AI model’s performance and user experience. While opting out can be considered implicit feedback, this stage also presents an opportunity to request more detailed feedback from citizens. Within the prototype, we encouraged users to express their opinions and objections (Fig. 6). This not only facilitates the direct participation of citizens but also upholds their right to object to personal data usage without negative consequences or justification requirements. This approach ensures transparency and upholds the principle that the right to object is an inherent right of citizens within the process.

Fig. 6. Feedback form in case of opting out

5 Findings

The participants of the study were generally positive about the use of AI in public services, as indicated by their responses when asked about their level of comfort with such use. Before interacting with the prototype, around 53% of the participants reported feeling comfortable with the general use of AI in public services, while 30% were unsure or declined to say, and 17% felt uneasy about the idea. These initial responses suggest a mix of caution and openness towards AI in the public sector, with some participants expressing concerns about the potential risks and uncertainties associated with AI use. After interacting with the prototype, however, the participants’ outlook became more positive overall: 65% stated feeling at ease, 23% remained uncertain or preferred not to answer, and only 12% maintained reservations. This shift in comfort levels suggests that the interactive experience provided by the prototype helped address some of the concerns and uncertainties that participants had about AI in public services. To further unpack how these interactions might have influenced participants’ stances on the use of AI in public services, we also analyzed the verbalized comments from the think-aloud protocol as well as the actual interactions with the prototype. This analysis revealed three key themes: a) Articulating Information in Written Form, b) Representing Information in Graphical Form, and c) Establishing the Appropriate Level of Information Detail.

5.1 Articulating Information in Written Form

The first theme emphasizes the importance of copywriting in AI interfaces. Participants carefully evaluated the textual explanations used in the system. While feedback on the provided explanations was generally positive, several participants commented that the explanations, as well as other interface text, remained unclear. There were multiple remarks about challenges in comprehending the textual content, with some participants suggesting that the language used was too technical or complex. Participants stated: “There were some words I didn't understand” and “I think that here it needs to be made much easier. Sit down really. It's not designed for a simple user”. Some participants connected their comments about the language to the organization. One participant expressed: “It's their terminology that I think is… is a bit wrong… or I think a lot of people will stop here and really need help.” Another participant remarked: “That's kind of a typical [name of organization] description. […] They can write it a little easier, but I understand what's there.” This underscores the critical role of clear and understandable communication in AI-driven public services and highlights the nuanced role of language in shaping user experiences and trust in AI systems.

5.2 Representing Information in Graphical Form

The second theme identified in our data analysis centers on the visual elements used in the prototype, specifically the process visualization (Fig. 2) and the feature importance chart (Fig. 4). Our analysis examined which graphical representations captured the participants’ attention and their subsequent feedback on these visual elements. We found that different graphical representations resonated differently with users, influenced significantly by their backgrounds. For instance, the process visualization was particularly well received by participants who were familiar with such representations from their professional contexts. These participants stated: “I really love this” and “I love to see how the process is with and without the tool”. At the same time, this type of visualization was difficult to comprehend and puzzled users who had no prior familiarity with such representations. Some users expressed a general preference for visual explanations but still had difficulties understanding the visualization: “I found it exciting that it was explained with pictures or icons, but I also found it a bit confusing”. Others stated that the provided visualization did not match their personal mental model of the process: “I don't quite feel my understanding of the situation is similar to that on the process diagram then.” Some participants stated that they did not understand the process visualization at all: “Hard to figure out the chart” and “I don’t understand anything about this”.

The second graphical element in the prototype was the interactive feature importance chart (Fig. 4). Despite aiming to provide more transparency into the inner workings of the sick-leave prediction model, the chart was found too difficult to comprehend by participants, who described it as “too academic”, or were unclear about the reason for including it: “Yes, I understand what it shows. But I don't understand why it shows it really.” These findings point to the need to consider the diversity of potential prior experience with certain forms of visualization, as well as to incorporate diverse graphical representations that accommodate different citizen groups’ needs. The right combination of visual approaches tailored to diverse users is key to building transparent, intuitive interfaces that help citizens of all backgrounds interact successfully with AI systems.

5.3 Establishing the Appropriate Level of Information Detail

The third theme identified in our analysis explores the optimal level of detail for the information presented to citizens, particularly concerning the data table (Fig. 3) as well as the textual information. The data table provided a comprehensive overview of the data utilized by the sick leave prediction model, including explanations of the necessity of each data type. Several participants perceived the level of detail as overly extensive, describing it as “too much to read” or “Way too much text”. A few participants, however, appreciated the availability of detailed information, delving into it to thoroughly grasp the scope of and reason for the data repurposing. One participant noted, “I think it is useful that it says what the purpose of collecting information is, and what kind of information is collected.” Even without reading the details in full, many participants acknowledged the value of having this information available. As one participant expressed: “I thought it was really nice. Even though I haven't read it very thoroughly right now, I thought it was really good to know exactly why they're using it and that information and what it's being used for. […] And then I get some confidence in understanding that that's why and that they're not just gathering any information and kind of that it's to give rise to something.”

The level of detail in the provided information also came up elsewhere. One instance was the explanation of the governmental mandate for dialogue meetings. One participant remarked: “I would rather have said that it is required by law, not necessarily the National Insurance Act. In other words, for someone who has nothing to do with the National Insurance Act, it is not so important which law it is.” But the same participant also appreciated the availability of more information if desired, stating: “[…] if you don't know what it entails then it's nice that you can read a little more about what it entails.” The overarching insight from this theme is that citizens appreciate the presence of detailed information, even if they do not engage with it extensively. The availability of thorough explanations contributed to a sense of ease among participants, illustrating that the mere provision of detailed explanations can foster trust in AI systems. Leveraging progressive disclosure, which presents essential information first and elaborates on more detailed aspects on demand, can help maintain focus on the primary objectives. This approach empowers citizens and fosters confidence through a commitment to openness and accountability.
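
As a schematic illustration of the progressive disclosure pattern (the content strings below are invented for this sketch, not taken from the prototype):

```python
# A minimal sketch of progressive disclosure: essential information is shown
# first, with further detail revealed only on request. Content is invented.
DETAIL_LEVELS = [
    "An AI tool estimates how long your sick leave may last.",
    "It reuses data the agency already holds; no new data is collected.",
    "Predictions support, but never replace, your case handler's judgment.",
]

def disclose(level: int) -> str:
    """Return explanation text up to the requested level of detail."""
    level = max(1, min(level, len(DETAIL_LEVELS)))
    return "\n".join(DETAIL_LEVELS[:level])

print(disclose(1))  # essential information only
print(disclose(3))  # full detail for citizens who want it
```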

6 Discussion and Contributions

As evident from our findings, enhancing transparency and driving citizens’ engagement with AI-enabled public services requires restructuring information elements and realigning content according to citizens’ preferences. To generate trust, citizens must play an active role in using public services and provide actionable feedback about their perceptions of the use of AI in these services. AI capabilities in public services can advance the creation of public value in the form of trust, flexibility, adaptability, and responsiveness [26]. One of the most significant obstacles public services must overcome is driving engagement with their stakeholders amid growing debate about the responsible use of AI [10]. To ensure that AI policy initiatives are accurate and pertinent, it is imperative that public offices increase awareness of and involvement with citizens. Evaluating the preferences of the public can thus promote continued cooperation between all stakeholders, accelerating the flow of information and the ensuing opportunities for further development of public AI-based interfaces.

6.1 Contribution to Research

The findings of this study underscore the link between government and citizens, thereby broadening the scope of current research on the adoption of AI in the public sector. The three themes identified in this study emphasize the critical role of clear and comprehensible communication in AI-enabled public services. Educating the public about the features of AI systems and the role that humans play in decision-making significantly improves public perceptions of AI, which is crucial for its adoption. Our research goes beyond earlier conceptual and review-based studies [1, 10, 15, 19, 60] by empirically assessing individual preferences and perceptions of AI transparency. The information citizens receive strengthens their faith in public service portals, and having a human in the loop gives them confidence that they can clarify their unique situation in an open, accountable, and trustworthy manner.

6.2 Contribution to Practice

This study sheds light on how interactive engagement with the AI prototype enhances its positive perception, suggesting that structured information and hands-on engagement with AI can positively shape public opinion about transparency and trust in public services. The right combination of comprehensible communication, visual approaches, and requisite information availability is key to building transparent, intuitive interfaces that help citizens of all backgrounds interact successfully with AI-enabled public services. Public entities may find it easier to provide their services efficiently if citizens’ trust is enhanced and citizens derive value from the use of AI in public services. Through the identification of characteristics that contribute to a positive attitude toward AI, we offer practical guidance to practitioners on the transparent design of AI systems within a public service setting.

6.3 Limitations and Future Research

Citizens’ attitudes toward AI are influenced by their socio-cultural identity and, in particular, their general level of trust in the government, which is notably high in Norway (OECD 2022). Consequently, it is critical to carry out more targeted research with larger samples as well as studies in other socio-cultural contexts. Secondly, this study focuses on citizens’ preferences regarding transparency in AI-based public services. Further research may delve into the interplay of key characteristics such as transparency, trust, and openness of AI-based public services. Trust is systemic in nature, embedded in the broader system of public and private entities connected to AI [1, 61]. As such, there is a fascinating opportunity for research partners from around the world to investigate and create a framework for human-centered AI in public services.

7 Conclusion

This study provided valuable insights into how citizens perceive and interact with AI systems introduced in public services. The findings demonstrate that directly engaging users helps reinforce generally positive views by addressing concerns through informed experience. The three key themes that emerged from our analysis show that citizens have high standards for transparent communication. The implications of this research can help guide the development of future citizen-facing AI to maximize benefits and allay worries through responsible, user-centered practices. In this way, it contributes practical insights to the extant body of research on normative guidelines for the responsible design and development of AI with the goal of human benefit [9, 26, 55, 62, 63]. Our research contributes to the broader understanding of public sentiment towards AI in public services and offers practical implications for the design of AI-based systems in the public sector.