“Stop overthinking, be curious about data and enjoy your data scientist journey!”
What is your academic background? What and where did you study?
I have a bachelor’s degree in computer science (2007) from the State University of Western Parana/Brazil. I hold an M.Sc. and a Ph.D. in electrical and computer engineering (2009 and 2013, respectively) from São Carlos School of Engineering/USP (Brazil). During this period, I worked with graph theory, dynamical systems, time series analysis, machine learning, multi-objective optimisation and high-performance computing to solve problems related to electric power systems. During my Ph.D., I spent about 3 months as a Visiting Researcher at the Cornell Business and Technology Park (USA) working with meta-optimisation techniques for electric power systems.
I was a Research Fellow for about 2 years at the Institute of Mathematical and Computer Sciences/USP (Brazil). In this period, I worked with data visualisation, social networks, time series analysis and natural language processing. I also spent about 7 months at the University of Illinois at Urbana-Champaign (USA) as a Research Fellow working with data visualisation, natural language processing, high-performance computing and complex network analysis. Currently, I am a Senior Adjunct Lecturer at the University of Western Australia/Department of Mathematics and Statistics.
Have you completed any other training in data science? (Up-skilling, MOOCs, short courses, etc.)
Most of my training comes from my industry and academic experience and from working on R&D projects. I have also attended several conferences and short courses in machine learning, statistics and high-performance computing.
For my continuous improvement, I did two nanodegrees at Udacity (Artificial Intelligence and Deep Learning) and a specialization at Coursera (Deep Learning). Currently, I am spending my time studying for Azure and AWS certifications in data science/data engineering.
If you pivoted into a data science role from another area, how did you go about this and what advice would you give others looking to do the same?
Since I started my master’s studies, I have aimed to pursue a career where I could apply my research in industry. In my view, the gap between academia and industry can be narrowed by applying fundamental research to practical industry problems. This was the key factor in my becoming a data scientist.
To achieve this goal, I sought to understand industry needs and how I could solve their problems in practice using proper data science techniques (mathematics and statistics skills). This goal also required me to learn different skills, for instance, how to write reusable source code, deploy solutions (development skills) and communicate results to non-technical users. In other words, I spent most of my time gaining academic, technical, business and interpersonal skills.
My advice to those seeking a data science career is: learn how to frame a problem, ask questions that lead you to solve it practically and efficiently, and be able to communicate the results to non-technical people. If you don’t have the skills to communicate your results, you won’t be able to convince people of the importance of your solutions.
What sparked your interest in working with data?
I have always loved solving problems, especially when data is involved. For me, it is like a Sherlock Holmes novel: you have limited information to solve a mystery, such as the crime scene and the information given by the police and witnesses; you have a question to guide you (which might change over the course of the investigation); and you have to communicate your result to the community.
Day-to-day data science is similar to detective work, and that is why I am so passionate about it. For me, the problem is the mystery, and every mystery has its own challenges. To solve the mystery, you need to dive deep into the data, have the skills to frame the problem, understand and interpret the data, and ask the right questions. During this journey, you need to have a lot of fun, enjoy every discovery you make during the investigation and be able to communicate your findings.
How did you come to work in your current role?
I saw the job description on LinkedIn and found it very interesting. Alcoa’s values resonated with me, and I liked what the employees said about working there and the range of problems to solve there.
What sort of projects have you been working on?
I have been working on projects related to anomaly detection (computer vision, unsupervised learning and deep learning), predictive modelling (multivariate time series analysis) and data visualisation (developing new visual metaphors to support decision making in business).
I also have research projects, such as the use of deep learning techniques for anomaly detection and clustering techniques for data fusion and dimensionality reduction. My most recent papers are related to some of the topics I mentioned.
What tools/platforms do you use in your work?
Mainly, but not limited to, Azure, Spark, Python and R.
What has been a highlight of your data science career so far?
I have gained the ability to solve problems in many different fields (electric power systems, surveillance, social media, biology, mining, refining, etc.). My favourite highlight, however, is when I see users using the solutions I deployed, and when these solutions help them to have a less stressful job and have more time to enjoy their lives.
What challenges have you faced as a data scientist?
As the technology related to data science is quite dynamic and changes very fast, there is not yet an adequate infrastructure for data scientists to be efficient. Besides that, several projects in data science have a high risk of failure because the data associated with them might be irrelevant to the problem, or the question wasn’t clearly stated. Identifying these problems early enough to mitigate their risk is quite challenging. Hence, agile methodologies might come into play to help identify the risks in the projects and reduce the time wasted trying to solve the problem blindly.
What are some of the big areas of opportunity/questions you want to tackle in this space?
I would really love to see more work that enables the use of multivariate time series analysis in industry. Most of the data comes from sensors (a.k.a. IoT), which is time-series data. By taking the time aspect of the data into account in the modelling, better and more accurate machine learning models can be developed, improving decision making and bringing more value to the business.
Besides that, information fusion is quite a common topic in academia, but not in industry. Using data from different sources efficiently can improve model accuracy. In other words, there is a lot of data, but few techniques use it properly.
What excites you most about recent developments in Data Science?
This field is so dynamic that it is hard for me to say what excites me most about it. I would say, everything (?!). Every day, there is something new and exciting in data science, for example, companies trying to automate data science tasks, or research projects investigating machine learning algorithms that are less of a black box and more interpretable.
What does the future of data science look like?
I believe that several tasks in data science are going to be automated, which will allow data scientists to spend their time on the more complex projects demanded by industry (which can’t be easily automated). In the future, data scientists will have a critical role in advanced analytics of complex projects and in democratizing data science (a.k.a. citizen data scientists) in the business.
For people considering a career in data science, what is one piece of advice you would give?
Stop overthinking, be curious about data and enjoy your data scientist journey! 🙂 It is also important to gain technical and business skills. So, start studying software engineering, statistics and how to communicate your achievements now. These subjects are not boring; they are cool, and they are cornerstones of data science.
Note: Any responses or recommendations expressed in this material are those of the interviewee and do not necessarily reflect the views of Alcoa.