Perth Children’s Hospital’s Paediatric Anaesthesia Research Team is looking for the next generation of AI scientists.
The team plans to use advanced data science and AI in several groundbreaking projects to streamline administrative tasks and improve outcomes for clinicians and patients alike. The work offers an invaluable glimpse into how AI can improve health outcomes in the real world, and the team is actively recruiting students from WA educational institutions to be part of it.
Alligator
The first project, Alligator, aims to develop text analytics for clinical notes in WA Health, starting with perioperative risk stratification from clinical notes.
The Paediatric Anaesthesia Research Team has access to a large volume of unstructured medical notes, which can be converted into structured data for use in risk prediction tools, retrospective audits, qualitative assessments of generated text and other novel studies.
They’ll be integrating it with in-house text analytics tools and datasets to create novel analyses and tools, combining the power of large language models and natural language processing.
The aim is to develop a system to automatically transcribe and analyse patient interviews. The team works with clinical psychologists to support children before and after surgery, to help understand and reduce the psychological trauma associated with surgery and medications.
Building on a recent study they conducted, they’d like to standardise the transcription and analysis of these interviews using AI.
The project will require speech recognition using existing tools such as Whisper, analysis of the resulting text with techniques such as embedding vectors, retrieval-augmented generation and sentiment analysis, and the abstraction of these steps into a reproducible pipeline.
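The transcribe-then-analyse pipeline described above could be wired together roughly as follows. This is a minimal sketch under stated assumptions, not the team's implementation: the names `run_pipeline`, `whisper_transcribe` and `keyword_sentiment` are hypothetical, and the word-list sentiment step is only a placeholder for a real model. The only external calls are `whisper.load_model` and `transcribe` from the open-source Whisper package, which is imported lazily so the rest of the sketch runs without it installed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class InterviewResult:
    audio_path: str
    transcript: str
    sentiment: float  # -1.0 (most negative) to 1.0 (most positive)


def whisper_transcribe(audio_path: str, model_name: str = "base") -> str:
    """Transcribe one audio file with the open-source Whisper package."""
    import whisper  # lazy import: only needed when real audio is processed

    model = whisper.load_model(model_name)
    return model.transcribe(audio_path)["text"]


# Hypothetical placeholder lexicon; a real study would use a validated model.
POSITIVE = {"happy", "calm", "brave", "good"}
NEGATIVE = {"scared", "hurt", "worried", "bad"}


def keyword_sentiment(text: str) -> float:
    """Score text by counting positive and negative words (placeholder)."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def run_pipeline(
    audio_path: str,
    transcribe: Callable[[str], str] = whisper_transcribe,
    score: Callable[[str], float] = keyword_sentiment,
) -> InterviewResult:
    """Run each stage in a fixed order so every interview is handled alike."""
    transcript = transcribe(audio_path)
    return InterviewResult(audio_path, transcript, score(transcript))
```

Passing the stages in as functions keeps the pipeline reproducible while letting each step be swapped out, for example replacing the placeholder scorer with an embedding-based or language-model-based analysis.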
Alligator will need people interested in using natural language processing in a user-focused tool in studies involving clinical interviews.
The team is looking at several clinical imaging datasets, starting with a dataset of skin infection grading over time and images of throats with streptococcus infection, both of which are important health issues for Indigenous patients.
The aim is to apply AI modalities to these datasets, including semantic segmentation of lesions of interest, visual question answering with transformer models and image depth prediction for the throat images to build a model for clinical studies.
Students interested in applying should have knowledge of Python and a basic understanding of AI model inference and computer vision.
They also have a histopathology imaging dataset of labelled megakaryocytes in bone marrow images in normal and disease states, and want to apply and compare several semantic segmentation models such as Segment Anything Model and GroundingDINO to this dataset to help in diagnosing megakaryocyte diseases.
This capability can be extended to image/text pairs using public histopathology datasets. As above, those interested need knowledge of Python and a basic understanding of AI model inference and computer vision.
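When several segmentation models are compared on a labelled dataset, a common approach is to score each predicted mask against the annotated mask with overlap metrics such as intersection-over-union (IoU) and the Dice coefficient. The sketch below is illustrative only: the article does not say which metrics the team will use, and the function names and toy masks are hypothetical. Masks are plain nested lists of 0s and 1s so the example needs nothing beyond the standard library.

```python
from itertools import chain
from statistics import mean


def _flatten(mask):
    """Turn a 2D mask (rows of 0/1 values) into a flat list of booleans."""
    return [bool(v) for v in chain.from_iterable(mask)]


def overlap_scores(pred, truth):
    """Return (IoU, Dice) for one predicted mask against its annotation."""
    p, t = _flatten(pred), _flatten(truth)
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(a and b for a, b in zip(p, t))
    union = sum(a or b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    # Two empty masks agree perfectly, so both scores are defined as 1.0.
    iou = 1.0 if union == 0 else inter / union
    dice = 1.0 if total == 0 else 2 * inter / total
    return iou, dice


def rank_models(predictions, truths):
    """Rank models by mean IoU; predictions maps model name -> list of masks."""
    scores = {
        name: mean(overlap_scores(p, t)[0] for p, t in zip(masks, truths))
        for name, masks in predictions.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Here `rank_models` expects one predicted mask per annotated image for every model, in the same order as the annotations.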
Description of the AI tools
The team is planning to run AI models across a variety of clinical studies.
This includes live creation of language model datasets using reinforcement learning from human feedback (RLHF). To make this easier, they intend to use open-source tools such as Langfuse to create a language model tracker for different studies and a benchmarking process.
In one example, the team might want to deploy a generative discharge summary model and then refine it over time based on clinician feedback. Taking part in this project requires knowledge of container tools such as Docker, established web development skills and an interest in creating AI tools for clinical research.
They also plan to deploy multiple language models for different research studies, and would like to develop a tool to orchestrate the inference and fine-tuning processes for these models within their HPC environment.
For example, one aim might be to run inference on a smaller vision-language model using CPU only, or to allow burstable inference for large language models using one or more GPUs. This project involves integrating traditional HPC scheduling software like Slurm with containers using open-source tools such as Soperator or Slinky, and it requires an established knowledge of DevOps and container software such as Docker or Kubernetes.
The Paediatric Anaesthesia Research Team runs 10 to 20 studies concurrently, which involves patient recruitment, consenting, follow-up and recording of various study statistics, typically in a tool called REDCap.
Studies are tracked in several places including OneDrive, whiteboards, REDCap and Excel spreadsheets, and they’d like a unified, centralised tracker of these studies to assist in data analysis and reduce data entry errors.
This involves creating a simple dashboard web app that collates data from REDCap. It requires established knowledge of web app development or dashboard tools such as Grafana, as well as knowledge of REST to export data through REDCap’s API.
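As a rough illustration of the data-collation step, the sketch below exports records from a REDCap project over its API and counts them by one field, which is the kind of summary a dashboard might display. It is a minimal sketch under stated assumptions: the function names, the API URL and the `consent_status` field are hypothetical, and a real deployment would need error handling and secure token storage. The request uses the standard REDCap record-export parameters (`token`, `content`, `format`, `type`) sent as a form-encoded POST.

```python
import json
from collections import Counter
from urllib import parse, request


def build_export_payload(token: str) -> dict:
    """Form fields for a REDCap record export, one flat row per record."""
    return {
        "token": token,       # project-specific API token (keep secret)
        "content": "record",
        "format": "json",
        "type": "flat",
    }


def export_records(api_url: str, token: str) -> list:
    """POST the export request and return the decoded list of records."""
    data = parse.urlencode(build_export_payload(token)).encode("utf-8")
    req = request.Request(api_url, data=data, method="POST")
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def count_by_field(records: list, field: str) -> Counter:
    """Count records by the value of one field (blank if the field is absent)."""
    return Counter(r.get(field, "") for r in records)


if __name__ == "__main__":
    # Hypothetical URL, token and field name, for illustration only.
    records = export_records("https://redcap.example.org/api/", "YOUR_API_TOKEN")
    print(count_by_field(records, "consent_status"))
```

Keeping the counting logic separate from the network call means the summary step can be checked against sample records without touching a live REDCap instance.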
About The Perth Children’s Hospital’s Paediatric Anaesthesia Research Team
The Paediatric Anaesthesia Research Team is a multidisciplinary team of doctors, data analysts and software engineers led by Professor Britta von Ungern-Sternberg that runs many concurrent studies in paediatric anaesthesia.
They’ve recently built MERLIN, a high-performance computing cluster for clinical AI research with a special focus on the application of large language models in paediatric medicine. It mainly consists of four H100s with access to historical and prospective patient data from WA Health.
The team also has access to the Pawsey Supercomputer for pretraining on public data and running simulations.
The team is looking for motivated, capable students who are interested in building WA Health’s AI capability by working on real-world clinical problems. The work ranges from backend infrastructure and tooling support to applied AI in a clinical setting.