Savaş Yıldırım is the author of Mastering Transformers. We got the chance to sit down with him and find out more about his experience of writing with Packt.
Q: What are your specialist tech areas?
SAVAŞ: I am an academic. I teach and do research on machine learning, deep learning, and natural language processing.
Q: How did you become an author for Packt? Tell us about your journey. What was your motivation for writing this book?
SAVAŞ: There are so many valuable online resources, blog posts, tutorials, and code documentation for Transformers, but they are not gathered under one roof, and I had long thought of this as a gap. When Packt approached me to write this book, I took the opportunity to bring all these resources together under one umbrella. I must admit that, due to my busy schedule, I had some doubts about whether I could focus on the writing process, but on the other hand I knew that writing a book is also a learning process, which made me genuinely excited and very motivated.
Q: What kind of research did you do, and how long did you spend researching before beginning the book?
SAVAŞ: I have been doing research on machine learning for 20 years. For the last 15 years, I have been working especially on NLP. With the emergence of Transformer architectures, I focused on the subject and developed applications. As for the book, my co-author and friend Meysam and I worked for a year to prepare it.
Q: Did you face any challenges during the writing process? How did you overcome them?
SAVAŞ: This subject is part of a rapidly developing research area, which means new approaches and new libraries are constantly emerging. We tried, as much as possible, not to miss any important development or approach. While writing some chapters, we had to change the book outline proposal that we had drawn up at the beginning, because new successful models and new directions had emerged in the meantime.
Q: What’s your take on the technologies discussed in the book? Where do you see these technologies heading in the future?
SAVAŞ: In the past, architectures based on machine learning technologies could solve only one specific problem, such as text classification or summarization. The new paradigm in deep learning is that a single architecture can solve many problems, since it can understand both what to do and how to do it. And of course, this is achieved through transfer learning. We train a model once with a huge amount of data and many parameters, and then, thanks to transfer learning, we no longer need to fine-tune it for a specific task; this is what we call zero-shot learning. On the other hand, we also want to make it independent of language: one can train a single architecture so that it works for any task and any language. Here we see some research trends pointing towards artificial general intelligence (AGI).
Q. Why should readers choose this book over others already on the market? How would you differentiate your book from its competition?
SAVAŞ: As I mentioned before, there are a lot of good online resources about Transformers. HuggingFace is a great resource in itself. However, at the moment, no single resource comprehensively covers all topics, from training tokenizers to training efficient transformers, from monolingual models to multilingual models. The book also offers a well-prepared path for those who want to learn transformers step by step. Another important point is that we cover each subject in the book with hands-on coding.
Q. What are the key takeaways you want readers to come away with from the book?
SAVAŞ: In general, deep learning architectures, and of course Transformer architectures, are a bit difficult to train and adapt. We, as researchers and practitioners, can overcome this difficulty with powerful libraries such as Keras, TensorFlow, PyTorch, Transformers, and so on. Through this book, readers will learn how to apply these complex architectures, Transformers in this case, to real-life problems. They will be able to train their own NLP models by following the steps in this book. In short, they will be able to solve many NLP problems in any language with Transformer architectures. Most importantly, they will discover which Transformer model best meets their needs.
Q. What advice would you give to readers learning tech? Do you have any top tips?
SAVAŞ: We have to move forward both in theory and in practice, so it is very important to follow developments closely. Curiosity and a genuine commitment to continuous learning are key. I strongly advise readers to attend meetups, workshops, conferences, and tech events, and to follow valuable blogs; these will feed them and enable them to move forward along with the industry.
It is necessary to understand the path from artificial intelligence to machine learning, and from machine learning to deep learning. It is absolutely worth reading the books that explain the historical and theoretical framework of this subject. In parallel, it is also necessary to master important tools and to constantly improve coding skills. PyTorch, TensorFlow, Transformers, spaCy, NLTK, Gensim, and many other NLP libraries should be known and practiced. It is important to participate in competitions on Kaggle, to contribute to open source code on GitHub, to train and share models on HuggingFace, or to try to develop models that challenge the top ranks of GLUE-like benchmarks.
Q. Can you share any blogs, websites, and forums to help readers gain a holistic view of the tech they are learning?
SAVAŞ: The blog most suited to this book's content would be the HuggingFace blog: https://huggingface.co/blog
Q. How would you describe your author journey with Packt? Would you recommend Packt to aspiring authors?
SAVAŞ: We proceeded according to a very well-planned program. We created a schedule for chapter delivery, and there was only a one-to-two-week delay at the end. The Packt team supported us with many things, such as styling, reviewing the chapters, and code testing. In this way, we were able to focus on our main job: writing the book. Do I recommend authoring with Packt? Yes. It is not easy to write a book, but the Packt team both encouraged and supported us a lot.
Q. Do you belong to any tech community groups?
SAVAŞ: Not really. But I support and contribute to the HuggingFace community. I am an associate professor at Istanbul Bilgi University. I am also a reviewer for IEEE and ACL journals, and for the R&D projects of The Scientific and Technological Research Council of Turkey.
Q. What are your favorite tech journals? How do you keep yourself up to date on tech?
SAVAŞ: I mostly follow academic conferences and journals such as ACL, NAACL, EACL, ICLR, CICLing, COLING, IEEE, and the AAAI conferences.
Q. How did you organize, plan, and prioritize your work and write the book?
SAVAŞ: Since there were two of us writing, we had an average of one month for each chapter. Due to my other duties and academic work at the university, I spent two to three days a week on the chapters. For each chapter, we spent the first week purely on research. This was important for building the general skeleton and making sure we did not skip any subject or approach. Then we wrote the chapters as comprehensively as possible. For some code-heavy chapters, we first implemented the code and then wrote the sections.
Q. What is the one writing tip that you found most crucial and would like to share with aspiring authors?
SAVAŞ: They should know that there is no end to research. They should let go of the urge to tell everything; they have to stop somewhere. Every book will have its gaps.
You can find SAVAŞ’s book on Amazon by following this link: Please click here