ABOUT ME

I am an assistant professor in the Faculty of Electrical Engineering, Mathematics, and Computer Science (EEMCS) at Delft University of Technology (TU Delft), the Netherlands. At TU Delft, I direct the AI-enabled Software Engineering (AISE) research lab, where I supervise several PhD, MSc, and BSc students. I am also the lab manager of AI4SE, TU Delft's tenth ICAI lab with industry, a collaboration with JetBrains consisting of 10 PhD students (as well as several MSc/BSc students and interns). Lastly, I am part of the Software Engineering Research Group (SERG) at TU Delft.

My research lies at the intersection of Natural Language Processing (#NLP), deep learning, and source code analysis. I study, design, and develop learning-based models, specifically Large Language Models (#LLM) for source code, to automate software engineering and developer-related tasks such as understanding, generating, and documenting source code. My work has been published at several top venues in the software engineering community, such as IEEE/ACM ICSE, EMSE, MSR, ICSME, SANER, and JSS.

I hold a PhD in Software Engineering from Sharif University of Technology (SUT). My PhD research focused on mining information from version control systems to facilitate software production-related tasks by automatically generating reports (such as release notes) for developers and software teams. I also hold a master's degree in IT Engineering from SUT. My MSc research focused on improving evaluation methods for recommender systems, which resulted in several publications.

If you are a researcher working at the intersection of NLP/ML and software engineering and you are interested in my work, please feel free to reach out to propose or discuss possible collaborations.

NOTE for prospective PhD students and postdocs: I have two fully-funded PhD vacancies as well as a postdoc position available on improving development tools and software engineering processes using #LLMs4code. Please apply through the portal for the PhD vacancies (link) or reach out to me via email for the postdoc position. Make sure to introduce yourself, describe your research interests, and enclose a copy of your CV.

NOTE for prospective BSc and MSc students: If you are a TU Delft student who would like to work at the intersection of LLMs and SE in my lab, I have opportunities for theses, internships, and research projects. Feel free to reach out to me via email to set up a meeting to discuss potential opportunities. Last but not least, I have limited capacity to supervise MSc/BSc students with scholarships from universities other than TU Delft.

Research Interests

  • AI-enabled software engineering
  • Language models for source code
  • Applied machine learning and deep learning
  • Recommender systems
  • Natural language processing

Highlights of Recent News

  • Oct 2023: Two full papers on LLMs for code accepted at the #FORGE'24 conference! 🎉
  • Oct 2023: The first paper from our collaboration with JetBrains Research got accepted at the #IDE'24 workshop! 🎉
  • Oct 2023: Our paper on a longitudinal and independent assessment of open-source LLMs4Code got accepted in the main track of the #ICSE'24 conference! 🎉
  • Oct 2023: Our investigation on memorization in LLMs4Code got accepted in the main track of the #ICSE'24 conference! 🎉
  • Aug 2023: Our study on the representation of code tokens in multilingual LLMs4Code got accepted at the #SCAM'23 conference! 🎉
  • Jun 2023: Our work on using dynamic assessment to improve automatic tutors got accepted in the Education and Information Technologies journal! 🎉
  • Jun 2023: I have started a new role as an assistant professor at TU Delft, the Netherlands.
  • Mar 2023: Our study on the impact of contextual data on the performance of LLM4Code got accepted to the technical track of the #MSR'23 conference! 🎉
  • Feb 2023: Our study on the (ab)use of open-source code for training LLM4Code got accepted to the #NLBSE'23 workshop, co-located with #ICSE'23! 🎉
  • Jan 2023: Our EMSE paper on semantic-based recommenders got accepted for presentation at #ICSE'23 (Journal First Track). 🎉
  • Dec 2022: Our work on extending LLMs to decompiled code got accepted in the #SANER'23 conference. 🎉
  • Nov 2022: Our study on missing topic recommendation got accepted in the #EMSE journal. 🎉
  • Feb 2022: Our EMSE paper on issue report management got accepted for presentation at #ICSE'22 (Journal First Track). 🎉
  • Dec 2021: Our study on automatic code completion based on GPT-2 got accepted in #ICSE'22 (Technical Track). 🎉
  • Nov 2021: Our paper on automatic issue management for GitHub repositories got accepted in the #EMSE journal. 🎉
  • Jun 2021: Our study on recovering links among software artifacts got accepted to #ICSME'21 (Technical Track). 🎉
  • Apr 2021: My PhD paper on an automatic topic recommendation for GitHub repositories got accepted in the #EMSE journal. 🎉

Latest Awards and Visits

  • 2024: I won an *Amazon Research Award* for my work on measuring and understanding memorization in #LLM4Code.
  • 2023: My grant proposals for two fully-funded PhD positions on #LLMs4code with *JetBrains* were approved.
  • 2023: Received one fully-funded PhD position as part of my starting package for my new role as an assistant professor at TU Delft.
  • 2023: Received the distinguished service award for co-chairing the NLP-based SE competition on issue classification.
  • 2023: Our tool for classifying code comments won first place in the #NLBSE'23 tool competition, co-located with #ICSE'23 (Melbourne, Australia).
  • 2023: I was invited to and attended the Dagstuhl seminar on Programming Language Processing, organized by Pradel et al., in Germany.
  • 2023: Our attack won first place in the LLM data extraction challenge organized by Nicholas Carlini et al. at #SaTML'23 (North Carolina, USA).
  • 2022: My tool for issue management won first place in the #NLBSE'22 tool competition, co-located with #ICSE'22 (Pittsburgh, USA).
  • 2021: Graduated with a PhD in software engineering with distinction.
  • 2020: Received a travel and accommodation scholarship for a six-month research visit at TU Delft, the Netherlands.

Publications (in reverse chronological order)

  1. Katzy, J.; Popescu, R.; van Deursen, A.; & Izadi, M., An Exploratory Investigation into Code License Infringements in Large Language Model Training Datasets, In the proceedings of the 1st International Conference on AI Foundation Models and Software Engineering (FORGE), 2024.
  2. van Dam, T.; van der Heijden, F.; de Bekker, P.; Nieuwschepen, B.; Otten, M.; & Izadi, M., Investigating the Performance of Language Models for Completing Code in Functional Programming Languages, In the proceedings of the 1st International Conference on AI Foundation Models and Software Engineering (FORGE), 2024.
  3. Sergeyuk, A.; Titov, S.; & Izadi, M., In-IDE Human-AI Experience in the Era of Large Language Models: A Literature Review, First Workshop on IDEs (IDE), co-located with ICSE, 2024.
  4. Izadi, M.; Katzy, J.; van Dam, T.; Otten, M.; Popescu, R.; & van Deursen, A., Language Models for Code Completion: A Practical Evaluation, In the proceedings of the 46th IEEE/ACM International Conference on Software Engineering (ICSE), 2024.
  5. Al-Kaswan, A.; Izadi, M.; & van Deursen, A., Traces of Memorisation in Large Language Models for Code, In the proceedings of the 46th IEEE/ACM International Conference on Software Engineering (ICSE), 2024.
  6. Katzy, J.; Izadi, M.; & van Deursen, A., On the Impact of Language Selection for Training and Evaluating Programming Language Models, In the proceedings of the 23rd IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM), 2023.
  7. Izadi, M.; Izadi, M.; & Heidari, F., The potential of an adaptive computerized dynamic assessment tutor in diagnosing and assessing learners’ listening comprehension, Education and Information Technologies, Springer, 1-25, 2023.
  8. van Dam, T.; Izadi, M.; & van Deursen, A., Enriching Source Code with Contextual Data for Code Completion Models: An Empirical Study, In the proceedings of the 20th IEEE Working Conference on Mining Software Repositories (MSR), Technical track, 2023.
  9. Al-Kaswan, A. & Izadi, M., The (Ab)use of Open Source Code to Train Large Language Models, In the proceedings of the 2nd International Workshop on Natural Language-based Software Engineering (NLBSE), co-located with ICSE, 2023.
  10. Al-Kaswan, A.; Izadi, M.; & van Deursen, A., STACC: Code Comment Classification using SentenceTransformers, In the proceedings of the 2nd International Workshop on Natural Language-based Software Engineering (NLBSE), co-located with ICSE, 2023.
  11. Al-Kaswan, A.; Izadi, M.; & van Deursen, A., Targeted Attack on GPT-Neo for the SATML Language Model Data Extraction Challenge, Data extraction challenge co-located with SaTML, North Carolina, 2023.
  12. Al-Kaswan, A.; Ahmed, T.; Izadi, M.; Sawant, A.; Devanbu, P.; & van Deursen, A., Extending Source Code Pre-Trained Language Models to Summarise Decompiled Binaries, In the proceedings of the IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER), 2023.
  13. Izadi, M.; Nejati, M.; & Heydarnoori, A., Semantically-enhanced Topic Recommendation System for Software Projects, Empirical Software Engineering Journal (EMSE), 2022.
  14. Izadi, M.; Mazrae, P. R.; Mens, T.; & van Deursen, A., LinkFormer: Automatic Contextualised Link Recovery of Software Artifacts in both Project-based and Transfer Learning Settings, arXiv preprint, 2022.
  15. Izadi, M., CatIss: An Intelligent Tool for Categorizing Issues Reports using Transformers, In the Proceedings of the 1st International Workshop on Natural Language-based Software Engineering (NLBSE), co-located with ICSE, 2022.
  16. Izadi, M. & Nili, M., On the Evaluation of NLP-based Models for Software Engineering, In the Proceedings of the 1st International Workshop on Natural Language-based Software Engineering (NLBSE), co-located with ICSE, 2022.
  17. Izadi, M.; Gismondi, R.; & Gousios, G., CodeFill: Multi-token Code Completion by Jointly Learning from Structure and Naming Sequences, In the Proceedings of the 44th IEEE/ACM International Conference on Software Engineering (ICSE), 2022.
  18. Izadi, M.; Heydarnoori, A.; & Gousios, G., Tag Recommendation for Software Repositories using Multi-label Multi-class Classification, Empirical Software Engineering Journal (EMSE), Springer, 2022.
  19. Izadi, M.; Akbari, K.; & Heydarnoori, A., Predicting the Objective and Priority of Issue Reports for Software Repositories, Empirical Software Engineering Journal (EMSE), Springer, 2021.
  20. Rostami, P.; Izadi, M.; & Heydarnoori, A., Automated Recovery of Issue-Commit Links Leveraging Both Textual and Non-textual Data, 37th International Conference on Software Maintenance and Evolution (ICSME), Research Track, 2021.
  21. Aghamohammadi, A.*; Izadi, M.*; & Heydarnoori, A., Generating Summaries for Methods of Event-Driven Programs: an Android Case Study, Journal of Systems and Software (JSS), Elsevier, 2020, *co-first authors.
  22. Tavakoli, M.; Izadi, M.; & Heydarnoori, A., Improving Quality of a Post's Set of Answers in Stack Overflow, Euromicro Conference on Software Engineering and Advanced Applications (SEAA), 2020.
  23. Jalili, M.; Ahmadian, S.; Izadi, M.; Moradi, P.; & Salehi, M., Evaluating Collaborative Filtering Recommender Algorithms: A Survey, IEEE Access, 6, 74003-74024. 2018.
  24. Izadi, M.; Izadi, M.; & Azarsa, B., The Intonation Patterns of English and Persian Sentences: A Contrastive Study, Research Journal of Education (RJE), 3(9), 97-101, 2017.
  25. Javari, A.; Izadi, M.; & Jalili, M., Recommender Systems for Social Networks Analysis and Mining: Precision versus Diversity, In Complex Systems and Networks (pp. 423-438). Springer, Berlin, Heidelberg, 2016.
  26. Izadi, M.; Javari, A.; & Jalili, M., Unifying Inconsistent Evaluation Metrics in Recommender Systems, In the Proceedings of the REDD Workshop, co-located with ACM RecSys'14, Silicon Valley, USA, pp. 1-7, 2014.