Data Science & Machine Learning for the Public Sector

Public Sector Network 21 May 2021

Accelerating Citizen Outcomes by Transforming Data into Value with Data Science

Government Keynote:
Leveraging machine learning to glean detailed insights from qualitative as well as quantitative data

Stefan Kuegler
Associate Director,
Data Science & Visualisation
NSW Department of Premier and Cabinet

Developing a data science mind-set and team

Most organisations, especially those that hold or produce large amounts of data, are constantly looking to ease the analysis process and take some of the manual, repetitive steps out of it. In the past this was difficult because the technology wasn’t up to it, but as technology continues to evolve, there are more options for automation.

Stefan Kuegler, the Associate Director of Data Science and Visualisation at the NSW Department of Premier and Cabinet (DPC), says that recently, they have been able to “leverage machine learning to glean detailed insights into qualitative as well as quantitative data.” This has been a long and complex process but has produced positive results.

The DPC had long had a data team, but “we were not really a data science place. We were very much just an analytical team.” A department like DPC has a constant flow of data coming in, and analysing it efficiently and appropriately is the role of the data team. They were good at their job, “but we found that there was more that we could be doing with the data to really exploit that data in a better way. We also needed to do it quicker.” There was a lot of talk around the team about how this could be achieved, and the creation of a data strategy was proposed, “though what we actually needed was an overarching data science strategy.” It became clear that “data science was evolving and really taking off.” They were a “little unit” doing their own thing, but decided that data science was where they needed to be.

Out of their little team, they decided to build a data science unit. This started with the establishment of appropriate “reporting lines and a structure.” That allowed them “to take other people along on the journey as well,” and ensured they had buy-in from the senior executive. Once the foundations were laid, it was all about “showcasing little wins along the way.” For data science to really be effective, it has to work with live, current data, and has to prove that it can be efficient. Analysing data from months ago, or planning projects still in the pipeline, is worthwhile, but especially when setting up, work on something current shows everyone involved that what is being initiated has value. Therefore, “education was a huge part of this – not only of our own people, but also education for the people who are interested in what we do as well.”

Since the original team was small and quite niche, “our recruitment strategy also changed from hiring just analysts, to hiring data science specialists or experts. It was really important for us to put the right people in the right place at the right time.” Then once the foundations were set and the experts were in place, “we allowed the people an ability to play. We gave them the data that we needed and gave them the chance to service it in their own time.” At least initially, it wasn’t about timeframes or “an agenda for individual projects. We hired people for their skills, and we wanted to see what they could do.”

Data challenges and misconceptions

Although there was buy-in from the whole department, that didn’t mean that everyone was constantly supportive. Before education around data science commenced, many people, including colleagues from other teams, “questioned what we were doing.” Some accused the team, in a derogatory way, of “just playing with data.” It is true that they were ‘playing’ with it, but for a very specific purpose.

The other misconception was that once machine learning was applied, everything would change and speed up instantly. Even the experts within the team needed to do “a lot of learning, education and expertise training.”

Moreover, there was a perception that machine learning was about setting and forgetting, and that it would do everything, almost without even the need for human involvement. “But a lot of the time we still need to be sitting there in the background and helping guide it to get the information we are after, something that’s usable.” The machines can perform a lot of tasks quickly, but they still need to be taught what to do and what results are being sought.

Benefits of machine learning

Though there were barriers initially, eventually the machines were set up and ready to produce results. The machines were set up in the first place because DPC collects a lot of data, particularly qualitative data with a significant amount of free text. Before the machines, “much of the qualitative data was turned into quantitative data for better analysis,” but generally that meant that things like “subjectivity, context, and the interpretation of what people were really saying was lost.”

“We looked at machine learning to see if we can speed up some of the processes that we do in order to get richer and deeper insights from the data, and also to use more of the data. It was about combining the machine with our innate human ability to understand, and using the machine learning to speed up some of our processes, especially the repetitive ones.”

Stefan Kuegler, Associate Director, Data Science & Visualisation, NSW Dept. of Premier and Cabinet

Humans were always intended to be part of the process. It was never about replacing humans with machines, but it was about “accentuating and accelerating our ability to get information out. And it was also about replicating and repeating the processes.” One perception was that machine learning would do everything, “but we had to remind people, that as the name suggests, it is still learning.”

For the DPC, machine learning was generally for text analysis in three specific areas:

Sentiment analysis – “Looking at that positive or negative context of what people say and trying to understand how that fits across the overarching feedback that people are giving us.”
N-grams – “Looking at key phrases and seeing where those phrases are used the most.”
LDA (Latent Dirichlet Allocation) – “Looking at words and clustering them together.”
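
As a rough illustration of the n-gram idea, the sketch below counts the most frequent two-word phrases across a handful of free-text responses. It is not DPC's implementation: the function names (tokenize, ngrams, top_phrases) and the sample responses are invented for this example, and the tokenizer is deliberately simplistic.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_phrases(responses, n=2, k=3):
    """Count n-grams across all responses and return the k most common."""
    counts = Counter()
    for text in responses:
        counts.update(ngrams(tokenize(text), n))
    return counts.most_common(k)


# Hypothetical free-text responses, invented for illustration only.
responses = [
    "The online form was easy to use",
    "I found the online form confusing",
    "Staff were helpful but the online form kept timing out",
]

print(top_phrases(responses, n=2, k=2))
```

In this toy data the phrases "the online" and "online form" each occur in all three responses, so they come out on top. A real pipeline would usually also drop common stop words before counting.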

Each of these processes required algorithms and time to set up, but the same analysis would have taken “many weeks” to do manually. The machines returned results quickly. Some data is still being processed in new ways so not all the results are in, but the sentiment analysis shows “not only the overarching sentiment, which we can build up, but we can also look at individuals and see how they compare.”
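
The idea of comparing individual responses against an overall sentiment can be sketched as follows. This is a minimal example that assumes a simple word-list (dictionary) approach. The tiny lexicon, the response identifiers and the scoring rule are all hypothetical, not the ones DPC used.

```python
import re

# Hypothetical mini-lexicon: +1 for positive words, -1 for negative words.
# A real sentiment dictionary would be far larger and carefully curated.
LEXICON = {
    "helpful": 1, "easy": 1, "good": 1,
    "slow": -1, "confusing": -1, "poor": -1,
}


def score(text, lexicon=LEXICON):
    """Average the lexicon values of matched words; 0.0 if none match."""
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = [lexicon[t] for t in tokens if t in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


# Hypothetical responses keyed by an invented respondent id.
responses = {
    "resp_1": "The staff were helpful and the process was easy",
    "resp_2": "The website was slow and confusing",
    "resp_3": "Helpful staff but a slow website",
}

# Score each individual, then average for the overall sentiment.
individual = {who: score(text) for who, text in responses.items()}
overall = sum(individual.values()) / len(individual)

for who, value in individual.items():
    print(who, value, "above" if value > overall else "at or below", "overall")
```

Scores fall between -1 (all matched words negative) and +1 (all positive), which lets each respondent be compared directly with the overall average.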

In terms of the LDA analysis, “we’re looking at those clusters and trying to see what are the main themes that are coming out. These clustering methods allow us to go very quickly into what is the theme that people are trying to convey.” It allows the analysts to look at things holistically or to drill down as far as they want to, based on the algorithms that they build. In terms of sentiment analysis, since that is related to particular words or phrases, it is “built around various dictionaries,” but some of them are quite limiting and specific to a region, “so we might be building our own.” Each tweak takes time to develop and produces different results, so having expertise is critical.
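
To give a feel for how LDA groups words into themes, here is a small, self-contained sketch that fits a topic model by collapsed Gibbs sampling using only the Python standard library. It is an illustration under stated assumptions rather than the team's method: the function name lda_gibbs, the hyperparameter values and the toy documents are invented, and in practice an established library implementation would normally be used.

```python
import random


def lda_gibbs(docs, num_topics=2, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(num_topics)]
    topic_total = [0] * num_topics
    # Start from a random topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                # Resample the topic and put the token back in the counts.
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    # The three highest-count words for each topic.
    top_words = [
        sorted(vocab, key=lambda w: -topic_word[k][w])[:3]
        for k in range(num_topics)
    ]
    return top_words, doc_topic


# Hypothetical tokenised responses, invented for illustration only.
docs = [
    "parking permit fee parking".split(),
    "permit fee renewal parking".split(),
    "library hours books library".split(),
    "books library opening hours".split(),
]

top_words, doc_topic = lda_gibbs(docs)
for k, words in enumerate(top_words):
    print("topic", k, words)
```

Each row of doc_topic shows how many of a document's words were assigned to each topic, which is what lets an analyst look at themes across the whole set or drill down into a single response. The sampler is random, so the themes found on real data depend on the seed, the number of topics and the number of iterations.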

Having the right people in place has been crucial because part of the analysis has been about understanding the background to the data in the first place, and then creating the machine processes for that.

This process of machine learning for text analysis is not only much faster, “but it allows us really to go back into the data and to provide us a better understanding of the information, with better insights.” It allows real stories to emerge and ensures that no data is wasted. This process has been successful because “we got the right people for the right data,” with ample time to set it up and play with it. “We were also lucky that we had a ready-made topic so that we could start with data straight away.” Anyone who doubted the team soon saw results, and soon saw the benefits. In fact, it has been so successful that “in the future it will probably allow us to look at different unstructured data, such as social media and even reviews.” Using data for “real analytics and for good” has always been the goal.
