Mike Walker is the author of the Python Data Cleaning Cookbook. We got the chance to sit down with him and find out more about his experience of writing with Packt.
Q: What is/are your specialist tech area(s)?
Mike: Data science, statistical methods, data cleaning and preparation, machine learning, application development, research methods, and full-stack development.
Q: How did you become an author for Packt? Tell us about your journey. What was your motivation for writing this book?
Mike: My focus over the last couple of decades has been on the use of data for decision-making in education and social service agencies. I strongly believe that a greater understanding and use of machine learning tools can significantly improve the quality of services offered to students and clients. I proposed a book on that topic, with a large number of applications in those sectors. However, it was decided that a book on data cleaning with Python would be much more marketable at this time. I agreed to write that instead.
Q: What kind of research did you do, and how long did you spend researching before beginning the book?
Mike: Since the topic of my book is something I have to think about every day in my work, I had already read the vast majority of books written on the topic over the last few years. I also keep up with blogs and websites on Python and pandas. I needed to think through what my book would have to offer that was not already out there. I decided before writing that both the tone and approach would be very much like that of a mentor to someone who had some knowledge of the field but was still fairly junior. This meant that instead of just demonstrating the relevant techniques, I tried to help the reader understand the why, as well as the benefits and disadvantages of different strategies.
Q: Did you face any challenges during the writing process? How did you overcome them?
Mike: The main challenge was the very quick turnaround, so I needed to be efficient and sure-footed. There was no time for significant changes in direction.
Q: What’s your take on the technologies discussed in the book? Where do you see these technologies heading in the future?
Mike: This is a bit specific, but I love that pandas is now a decade old and that there are so many good libraries. I probably incorporate another library into my regular work each month. At a more conceptual level, I anticipate further convergence of what we once saw as separate operational and analytical databases. Data scientists will have to oversee the development of interactive tools for drawing insights from live data. To me, this means additional blurring of careers in statistical methodology and computer programming.
Q: Why should readers choose this book over others already on the market? How would you differentiate your book from its competition?
Mike: As I discussed in response to an earlier question, I focus on mentoring the reader through real-world applications. I try to emphasize professional judgment and statistical insight as much as programming technique.
Q: What are the key takeaways you want readers to come away from the book with?
Mike: The most important thing for me is that folks just starting out in the field feel confident about their ability to do this work after reading this book, and that they find data issues rather than allowing those issues to find them. For folks who are more experienced, I want them to recognize a fellow traveler in the pages of the book, one with whom they can be in regular conversation about familiar problems.
Q: What advice would you give to readers learning tech? Do you have any top tips?
Mike: My number one tip is to be excited about new challenges. Also, if possible, find work where your professional development is as important to the organization as how quickly you accomplish a particular task.
Q: Do you have a blog that readers can follow?
Mike: Yes, https://www.datasciencecentral.com/profile/MichaelBWalker
Q: How would you describe your author journey with Packt? Would you recommend Packt to aspiring authors?
Mike: The editors and marketing folks are all incredibly pleasant and thoughtful. I very much enjoyed working with them.
Q: How did you organize, plan, and prioritize your work and write the book?
Mike: The book is organized according to the steps an analyst takes in the data cleaning process, from importing data to producing summary statistics. It was helpful to me to write the book in that order.
Q: What is the one writing tip that you found most crucial and would like to share with aspiring authors?
Mike: I wish I had thought to pad the time from the production of the first draft of the final chapter to the publication of the book. We really could have used two more weeks.
You can find Mike’s book on Amazon.