Data Scientist Robert Rodger About His Work
Robert Rodger, Data Scientist at GoDataDriven, is willing to share his knowledge as part of 'A day in the life of a Data Scientist' at GoDataDriven.
Writing Intellectually Challenging Code
Robert Rodger is a data scientist, but one of his greatest strengths is his ability to explain technical ideas to a lay audience. He has a professional background in risk management and likes consulting because it accelerates his soft skills development.
Giving a successful presentation, talking to a satisfied student after teaching a course, or finally resolving an issue I’ve been stuck on: these are the things that make me happy on a working day.
For instance, today I was stuck on a small difference between the ways two versions of Python handle text - particularly how they encode it. I was banging my head against the wall for a few hours. But just before this interview, I figured it out and the code ran correctly. So now this won’t be on my mind all weekend.
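The interview does not say which encoding difference was involved, so the following is only a minimal sketch of the general kind of issue, written for Python 3. In Python 3, `str` holds Unicode text and `bytes` holds encoded data, whereas Python 2's `str` was a byte string. The helper `to_text` is a hypothetical name introduced for illustration.

```python
# Illustrative sketch only; not the code discussed in the interview.
text = "caf\u00e9"               # Python 3 str: a sequence of Unicode code points
encoded = text.encode("utf-8")   # explicit conversion from text to bytes

# The accented character occupies two bytes in UTF-8, so the lengths differ.
print(len(text), len(encoded))


def to_text(value, encoding="utf-8"):
    """Return value as str, decoding it first if it is bytes (hypothetical helper)."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Both inputs normalize to the same Unicode string.
print(to_text(encoded) == to_text(text))
```

Normalizing to text at the boundaries of a program, as `to_text` does, is one common way to keep code behaving the same regardless of whether an upstream component hands over bytes or text.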
Machine Learning Meetup
A few years ago I attended a Meetup about machine learning co-organized by GoDataDriven. I was impressed by the company’s maturity, and we stayed in touch. Now, I’m happy to say I’ve been working here for almost three years.
GoDataDriven is a good fit for me. My colleagues are all highly talented individuals and social at the same time. Everyone is willing to share their knowledge, so it’s also a great place for technical growth if you are less experienced in some areas. I also like how working as a consultant accelerates your soft skills development.
Perhaps because of my background in risk management, I have worked for several of GoDataDriven’s financial clients. I also worked on a project for a publishing company. They were looking for tooling that could take their English content, translate it into Dutch, and then summarize it automatically. They wanted to minimize human intervention in this process, and they wanted whatever we built to be hosted in the cloud.
Gensim, Elasticsearch and Flask
The assignment started with a proof of concept (PoC). My approach, as always, was to build a prototype tool quickly. Once I had shown that what the client wanted was possible, I iterated on this PoC to improve its output.
To that end, I took advantage of some pre-existing technologies. I used Gensim, which can perform automatic text summarization, and the Google Cloud Translation API. I also used Elasticsearch to provide a prototype recommender system that helps users of the platform find relevant articles. And I used Flask to create a front-end for demonstration purposes.
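To give a rough idea of what automatic extractive summarization involves, here is a minimal sketch using only the Python standard library. It is not Gensim's algorithm and not the code from the project: it simply scores sentences by average word frequency, and the function name `summarize` and its parameters are assumptions made for illustration.

```python
import re
from collections import Counter


def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Count word frequencies; skipping very short words is a crude stop-word filter.
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(w for w in words if len(w) > 3)

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    # Rank sentence indices by score, keep the best, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


article = "Cats sleep. Data pipelines move data between data systems. Birds sing."
print(summarize(article, max_sentences=1))
```

An extractive approach like this only selects existing sentences, which makes it a quick fit for a proof of concept; a production system would normally rely on a tested library rather than a hand-rolled scorer.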
After trying out some code in Jupyter notebooks, I rewrote everything as Python scripts and set them up to run periodically on a Google Compute Engine instance, using some ops experience I had picked up on a previous assignment. I was pleased that all the pieces came together so well, and the client was impressed by the result.
My audience is not always very technical, and I need to think carefully about the order and presentation of ideas.
We develop and deliver our own training courses on the Python data science ecosystem, machine learning, and natural language processing.
All of these provide ongoing opportunities for me to refresh my knowledge base and work on my communication skills.
In addition to my consulting and training work, I enjoy interviewing and assessing potential colleagues. I appreciate that management places trust in us as consultants to make hiring decisions. Providing positive and negative feedback is an invaluable soft skill to have.