ABOUT ME

I am an assistant professor in the Faculty of Electrical Engineering, Mathematics, and Computer Science (EEMCS) at Delft University of Technology (TU Delft), the Netherlands. At TU Delft, I direct the AI-enabled Software Engineering (AISE) research lab, where I supervise several PhD, MSc, and BSc students. I am also the lab manager of AI4SE, TU Delft's tenth ICAI lab with industry, a collaboration with JetBrains consisting of 10 PhD students (as well as several MSc/BSc students and interns). Lastly, I am part of the Software Engineering Research Group (SERG) at TU Delft.

My research lies at the intersection of Natural Language Processing (#NLP), deep learning, and source code analysis. I study, design, and develop learning-based models, specifically Large Language Models (#LLM) for source code, to automate software engineering and developer-related tasks such as understanding, generating, and documenting source code. My work has been published at several top venues in the software engineering community, such as IEEE/ACM ICSE, EMSE, MSR, ICSME, SANER, and JSS.

I hold a PhD in Software Engineering from Sharif University of Technology (SUT). My PhD research focused on mining information from version control systems to facilitate software production-related tasks by automatically generating reports (such as release notes) for developers and software teams. I also hold a master's degree in IT Engineering from SUT. My MSc research focused on improving evaluation methods for recommender systems, which resulted in several publications.

If you are a researcher working at the intersection of NLP/ML and software engineering and you are interested in my work, please feel free to reach out to propose or discuss possible collaborations.

NOTE for prospective PhD students and postdocs: I have two fully-funded PhD vacancies as well as a postdoc position available on improving development tools and software engineering processes using #LLMs4code. Please apply through the portal for the PhD vacancies (link) or reach out to me via email for the postdoc position. Make sure to introduce yourself, describe your research interests, and enclose a copy of your CV.

NOTE for prospective BSc and MSc students: If you are a TU Delft student who would like to work at the intersection of LLMs and SE in my lab, I have opportunities for theses, internships, and research projects. Feel free to reach out to me via email to set up a meeting to discuss potential opportunities. Last but not least, I have limited capacity to supervise MSc/BSc students with scholarships from universities other than TU Delft.

Research Interests

  • AI-enabled software engineering
  • Language models for source code
  • Applied machine learning and deep learning
  • Recommender systems
  • Natural language processing

Highlights of Recent News

  • Oct 2023: Two full papers on LLMs for code accepted at the #FORGE'24 conference! 🎉
  • Oct 2023: The first paper from our collaboration with JetBrains Research got accepted at the #IDE'24 workshop! 🎉
  • Oct 2023: Our paper on a longitudinal and independent assessment of open-source LLMs4Code got accepted in the main track of the #ICSE'24 conference! 🎉
  • Oct 2023: Our investigation on memorization in LLMs4Code got accepted in the main track of the #ICSE'24 conference! 🎉
  • Aug 2023: Our study on the representation of code tokens in multilingual LLMs4Code got accepted at the #SCAM'23 conference! 🎉
  • Jun 2023: Our work on using dynamic assessment to improve automatic tutors got accepted in the Education and Information Technologies journal! 🎉
  • Jun 2023: I have started a new role as an assistant professor at TU Delft, the Netherlands.
  • Mar 2023: Our study on the impact of contextual data on the performance of LLM4Code got accepted to the technical track of the #MSR'23 conference! 🎉
  • Feb 2023: Our study on the (ab)use of open-source code for training LLM4Code got accepted to the #NLBSE'23 workshop, co-located with #ICSE'23! 🎉
  • Jan 2023: Our EMSE paper on semantic-based recommenders got accepted for presentation at #ICSE'23 (Journal First Track). 🎉
  • Dec 2022: Our work on extending LLMs to decompiled code got accepted in the #SANER'23 conference. 🎉
  • Nov 2022: Our study on missing topic recommendation got accepted in the #EMSE journal. 🎉
  • Feb 2022: Our EMSE paper on issue report management got accepted for presentation at #ICSE'22 (Journal First Track). 🎉
  • Dec 2021: Our study on automatic code completion based on GPT-2 got accepted in #ICSE'22 (Technical Track). 🎉
  • Nov 2021: Our paper on automatic issue management for GitHub repositories got accepted in the #EMSE journal. 🎉
  • Jun 2021: Our study on recovering links among software artifacts got accepted to #ICSME'21 (Technical Track). 🎉
  • Apr 2021: My PhD paper on an automatic topic recommendation for GitHub repositories got accepted in the #EMSE journal. 🎉

Latest Awards and Visits

  • 2024: I won an *Amazon Research Award* for my work on measuring and understanding memorization in #LLM4Code.
  • 2023: My grant proposals for two fully-funded PhD positions on #LLMs4code with *JetBrains* were approved.
  • 2023: Received one fully-funded PhD position as part of my starting package for my new role as an assistant professor at TU Delft.
  • 2023: Received the distinguished service award for co-chairing the NLP-based SE competition on issue classification.
  • 2023: Our tool for classifying code comments won first place in the #NLBSE'23 tool competition, co-located with #ICSE'23 (Melbourne, Australia).
  • 2023: I was invited to and attended the Dagstuhl seminar on Programming Language Processing, organized by Pradel et al., in Germany.
  • 2023: Our attack won first place in the LLM data extraction challenge organized by Nicholas Carlini et al. at #SaTML'23 (North Carolina, USA).
  • 2022: My tool for issue management won first place in the #NLBSE'22 tool competition, co-located with #ICSE'22 (Pittsburgh, USA).
  • 2021: Graduated with a PhD in software engineering with distinction.
  • 2020: Received a travel and accommodation scholarship for a six-month research visit at TU Delft, the Netherlands.

Publications (in reverse chronological order)

  1. Katzy, J.; Popescu, R.; van Deursen, A.; & Izadi, M., An Exploratory Investigation into Code License Infringements in Large Language Model Training Datasets, In the proceedings of the 1st International Conference on AI Foundation Models and Software Engineering (FORGE), 2024.
  2. van Dam, T.; van der Heijden, F.; de Bekker, P.; Nieuwschepen, B.; Otten, M.; & Izadi, M., Investigating the Performance of Language Models for Completing Code in Functional Programming Languages, In the proceedings of the 1st International Conference on AI Foundation Models and Software Engineering (FORGE), 2024.
  3. Sergeyuk, A.; Titov, S.; & Izadi, M., In-IDE Human-AI Experience in the Era of Large Language Models: A Literature Review, First Workshop on IDEs (IDE), co-located with ICSE, 2024.
  4. Izadi, M.; Katzy, J.; van Dam, T.; Otten, M.; Popescu, R.; & van Deursen, A., Language Models for Code Completion: A Practical Evaluation, In the proceedings of the 46th IEEE/ACM International Conference on Software Engineering (ICSE), 2024.
  5. Al-Kaswan, A.; Izadi, M.; & van Deursen, A., Traces of Memorisation in Large Language Models for Code, In the proceedings of the 46th IEEE/ACM International Conference on Software Engineering (ICSE), 2024.
  6. Katzy, J.; Izadi, M.; & van Deursen, A., On the Impact of Language Selection for Training and Evaluating Programming Language Models, In the proceedings of the 23rd IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM), 2023.
  7. Izadi, M.; Izadi, M.; & Heidari, F., The potential of an adaptive computerized dynamic assessment tutor in diagnosing and assessing learners’ listening comprehension, Education and Information Technologies, Springer, 1-25, 2023.
  8. van Dam, T.; Izadi, M.; & van Deursen, A., Enriching Source Code with Contextual Data for Code Completion Models: An Empirical Study, In the proceedings of the 20th IEEE Working Conference on Mining Software Repositories (MSR), Technical track, 2023.
  9. Al-Kaswan, A. & Izadi, M., The (Ab)use of Open Source Code to Train Large Language Models, In the proceedings of the 2nd International Workshop on Natural Language-based Software Engineering (NLBSE), co-located with ICSE, 2023.
  10. Al-Kaswan, A.; Izadi, M.; & van Deursen, A., STACC: Code Comment Classification using SentenceTransformers, In the proceedings of the 2nd International Workshop on Natural Language-based Software Engineering (NLBSE), co-located with ICSE, 2023.
  11. Al-Kaswan, A.; Izadi, M.; & van Deursen, A., Targeted Attack on GPT-Neo for the SATML Language Model Data Extraction Challenge, Data extraction challenge co-located with SaTML, North Carolina, 2023.
  12. Al-Kaswan, A.; Ahmed, T.; Izadi, M.; Sawant, A.; Devanbu, P.; & van Deursen, A., Extending Source Code Pre-Trained Language Models to Summarise Decompiled Binaries, In the proceedings of the IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER), 2023.
  13. Izadi, M.; Nejati, M.; & Heydarnoori, A., Semantically-enhanced Topic Recommendation System for Software Projects, Empirical Software Engineering Journal (EMSE), 2022.
  14. Izadi, M.; Mazrae, P. R.; Mens, T.; & van Deursen, A., LinkFormer: Automatic Contextualised Link Recovery of Software Artifacts in both Project-based and Transfer Learning Settings, arXiv preprint, 2022.
  15. Izadi, M., CatIss: An Intelligent Tool for Categorizing Issues Reports using Transformers, In the Proceedings of the 1st International Workshop on Natural Language-based Software Engineering (NLBSE), co-located with ICSE, 2022.
  16. Izadi, M. & Nili, M., On the Evaluation of NLP-based Models for Software Engineering, In the Proceedings of the 1st International Workshop on Natural Language-based Software Engineering (NLBSE), co-located with ICSE, 2022.
  17. Izadi, M.; Gismondi, R.; & Gousios, G., CodeFill: Multi-token Code Completion by Jointly Learning from Structure and Naming Sequences, In the Proceedings of the 44th IEEE/ACM International Conference on Software Engineering (ICSE), 2022.
  18. Izadi, M.; Heydarnoori, A.; & Gousios, G., Tag Recommendation for Software Repositories using Multi-label Multi-class Classification, Empirical Software Engineering Journal (EMSE), Springer, 2022.
  19. Izadi, M.; Akbari, K.; & Heydarnoori, A., Predicting the Objective and Priority of Issue Reports for Software Repositories, Empirical Software Engineering Journal (EMSE), Springer, 2021.
  20. Rostami, P.; Izadi, M.; & Heydarnoori, A., Automated Recovery of Issue-Commit Links Leveraging Both Textual and Non-textual Data, 37th International Conference on Software Maintenance and Evolution (ICSME), Research Track, 2021.
  21. Aghamohammadi, A.*; Izadi, M.*; & Heydarnoori, A., Generating Summaries for Methods of Event-Driven Programs: an Android Case Study, Journal of Systems and Software (JSS), Elsevier, 2020, *co-first authors.
  22. Tavakoli, M.; Izadi, M.; & Heydarnoori, A., Improving Quality of a Post's Set of Answers in Stack Overflow, Euromicro Conference on Software Engineering and Advanced Applications (SEAA), 2020.
  23. Jalili, M.; Ahmadian, S.; Izadi, M.; Moradi, P.; & Salehi, M., Evaluating Collaborative Filtering Recommender Algorithms: A Survey, IEEE Access, 6, 74003-74024. 2018.
  24. Izadi, M.; Izadi, M.; & Azarsa, B., The Intonation Patterns of English and Persian Sentences: A Contrastive Study, Research Journal of Education (RJE), 3(9), 97-101, 2017.
  25. Javari, A.; Izadi, M.; & Jalili, M., Recommender Systems for Social Networks Analysis and Mining: Precision versus Diversity, In Complex Systems and Networks (pp. 423-438). Springer, Berlin, Heidelberg, 2016.
  26. Izadi, M.; Javari, A.; & Jalili, M., Unifying Inconsistent Evaluation Metrics in Recommender Systems, In the Proceedings of the REDD Workshop, co-located with ACM RecSys'14, Silicon Valley, USA, pp. 1-7, 2014.