Fabian D’Agnone, Data Scientist at Synergy
What is your academic background?
I hold a Bachelor of Science (Hons) in Mathematics and Statistics from the University of Western Australia.
Have you completed any other training in data science?
I have not completed formal training courses per se, but day-to-day job requirements have meant I have digested a lot of online videos, tutorials and blogs. I attended the Australian AI conference in 2019 and am currently completing Azure training at Synergy. I recommend becoming a member of the Statistical Society of Australia and joining the ANZSTAT mailing list.
What sparked your interest in working with data?
A big moment for me was when a spatial expert at the Water Corporation helped me link the size of tree canopies in aerial images to nearby assets (like sewer pipes). This linkage became the basis of my entire thesis, and before that I never thought such data would come from photos. I realised that working on real problems with different (dirty) data allows you to be incredibly innovative. This innovation is what solves past, current and future problems, and it has motivated me to work in the data science space.
How did you come to work in your current role?
While I studied, I worked at the Australian Bureau of Statistics, and during my final year I had an industry project at the Water Corporation (I can’t recommend industry projects enough!). After completing my degree, I started working for a statistical consulting firm, and it was there that a recruiter reached out about a role at Synergy. I wanted to return to a large organisation, and I knew Synergy had an endless supply of data science problems.
What sort of projects have you been working on?
My team works on data science problems in the retail space. Short-term forecasting of consumption across the entire grid has been my life for the last few months. Most of 2019 was spent modelling customer classification and trying to understand and improve customer profitability. The aim for 2020 is more customer-focused modelling, such as predicting churn and tracking customers’ journeys to help them achieve their objectives more quickly and smoothly.
What tools/platforms do you use in your work?
I bounce between R and Python. My statistics background has meant I favour R, and the tidyverse is a great way for anyone to use R. I also use Python as it has some very useful libraries, for example for natural language processing (NLP). A more recent development in the Windows operating system is the “Windows Subsystem for Linux (WSL)”, which means you can use Linux on a corporate Windows machine. WSL is one of the most powerful tools at the moment, even if you only use it to speed up run time.
What has been a highlight of your data science career so far?
I developed a predictive model which needed to be communicated to customers. The model used random forest algorithms to predict consumption, but customers needed to know the risk associated with that consumption. An additional model was created to convert consumption to risk bands (high, medium, low). The risk bands are surfaced to business customers daily. It has been a highlight knowing that a model I developed is in front of business customers and influencing business decisions.
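As an illustration only, converting predicted consumption into risk bands could look something like the sketch below. The interview does not say how the bands were actually derived, so the tercile cut points, the `risk_band` helper and the sample values are all assumptions, and the random forest that produces the predictions is left out.

```python
from statistics import quantiles


def risk_band(predicted_kwh, low_cut, high_cut):
    """Map one predicted consumption value to a risk band (hypothetical helper)."""
    if predicted_kwh < low_cut:
        return "low"
    if predicted_kwh < high_cut:
        return "medium"
    return "high"


# Made-up consumption predictions in kWh, standing in for model output.
predictions = [120.0, 95.5, 210.3, 180.2, 60.1, 150.7, 240.9, 101.4, 133.3]

# Assumption: split the predictions into three bands at the tercile cut points.
low_cut, high_cut = quantiles(predictions, n=3)

bands = [risk_band(p, low_cut, high_cut) for p in predictions]

for value, band in zip(predictions, bands):
    print(f"{value:7.1f} kWh -> {band}")
```

Fixed cut points agreed with the business would work just as well as terciles; the point is that customers see a band rather than a raw number.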
What challenges have you faced as a data scientist?
The biggest challenge is a very common one: the acceptance of analytics. You have to sell the analytics (without going completely technical), not just the idea. If the analytics aren’t trusted, the idea is forgotten. If you help people understand a high-level view of the technical logic, they will buy in.
The second is a bit unexpected: computer permissions. Don’t underestimate the need for data scientists to have admin rights, which not all organisations allow. I believe granting them is a strong sign of the trust an organisation places in its analytical staff and the value it sees in them.
What are some of the big areas of opportunity/questions you want to tackle in this space?
In my current role, the biggest unknown is how much rooftop solar is generated, since the panels are installed behind the meter. As more solar is installed and the grid potentially becomes more unstable, an accurate model would be invaluable for all involved. Personally, I would like to be able to predict AFL scores more accurately. The goal/behind nature of AFL scoring creates randomness that makes predictions harder than expected.
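To make the goal/behind point concrete: in AFL a goal is worth six points and a behind is worth one, so two teams with the same number of scoring shots can finish far apart. The small sketch below uses made-up tallies, and `afl_score` is a hypothetical helper name.

```python
def afl_score(goals, behinds):
    """Total AFL score: six points per goal, one point per behind."""
    return 6 * goals + behinds


# Two made-up teams, each with 15 scoring shots.
team_a = afl_score(goals=10, behinds=5)
team_b = afl_score(goals=5, behinds=10)

margin = team_a - team_b
print(team_a, team_b, margin)
```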
What excites you most about recent developments in Data Science?
The increased number of data science students. When I started uni (2012), data science was a rare term, but by the time I graduated, UWA was teaching undergrad and postgrad degrees in it. The rapid increase in computing power, students and roles suggests data science is about to boom.
What does the future of data science look like?
The future is promising: more universities are teaching it, more organisations are understanding its value and more students are graduating. There will be a positive impact on the development of software and hardware. Personally, I feel the biggest impact data science will make is in advertising and marketing, where consumers will be targeted with incredibly personalised content.
For people considering a career in data science, what is one piece of advice you would give?
Be proactive and get industry experience. This helps you understand the complete data science pipeline, from raw, dirty data to a prediction or insight. If you don’t have work experience or an internship, find a problem that interests you and implement a solution, e.g. for sport predictions: web-scrape data, fit a model and surface the results (add the code to GitHub). Employers don’t care if you can fit a model to a clean iris dataset, but being able to talk through the complete modelling process will stand you in good stead. You will also quickly learn whether data science work is for you, as 75% of it is data preparation and wrangling.