AISE lab | Maliheh Izadi

Director: Dr. Maliheh Izadi

Assistant Professor

Computer Science

AISE research explores several topics at the intersection of AI and Software Engineering, including:

LLMs for Code Generation, Summarization, Refactoring, and Bug Fixing: Leverage LLMs to accelerate various development tasks.
Longitudinal Evaluation and Benchmarking of Code LLMs: Study the long-term performance of LLMs across languages, tools, and developer workflows. Develop comprehensive benchmarks to assess and compare LLM effectiveness in diverse software engineering tasks.
Autonomous Software Engineering Agents: Build intelligent, task-driven agents capable of independently executing and managing software engineering workflows.
Mitigating Memorization and Hallucination: Investigate strategies to reduce factual inaccuracies, hallucinated code, and overfitting in LLM outputs, ensuring reliability in practical applications.
Human-AI Collaboration in IDEs: Design intuitive IDE interfaces and workflows that foster seamless collaboration between developers and GenAI assistants, maximizing productivity and usability.
Explainability in Code LLMs: Improve the transparency of LLM-generated suggestions to enhance developer trust and facilitate understanding of model behavior.
Domain Adaptation and Personalization: Fine-tune models to specific domains or codebases to improve contextual relevance, precision, and performance.
Repository Management: Develop techniques to automatically link commits to relevant issues; triage, assign, and resolve issues; and generate human-readable documentation from codebases, commit history, and other project artifacts.

PhD Students

Ali Al-kaswan

PhD candidate (Sep'22)

Former MSc student

Privacy/Security in LLMs

Code Red! On the Harmfulness of Applying Off-the-shelf Large Language Models to Programming Tasks, ACM International Conference on the Foundations of Software Engineering (FSE), main track, 2025
How Much Do Code Language Models Remember? An Investigation on Data Extraction Attacks before and after Fine-tuning, IEEE/ACM 22nd International Conference on Mining Software Repositories (MSR), 2025
Traces of Memorisation in Large Language Models for Code, IEEE/ACM 46th International Conference on Software Engineering (ICSE), 2024
Towards Safe, Secure, and Usable LLMs4Code, IEEE/ACM 46th International Conference on Software Engineering (ICSE), Doctoral Symposium, 2024
Extending Source Code Pre-trained Language Models to Summarise Decompiled Binaries, IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 2023
STACC: Code Comment Classification using SentenceTransformers, IEEE/ACM 2nd International Workshop on Natural Language-Based Software Engineering (NLBSE), 2023
Targeted Attack on GPT-Neo for the SATML Language Model Data Extraction Challenge, The 1st IEEE Conference on Secure and Trustworthy Machine Learning, 2023
The (ab)use of Open Source Code to Train Large Language Models, IEEE/ACM 2nd International Workshop on Natural Language-Based Software Engineering (NLBSE), 2023

Jonathan Katzy

PhD candidate (Jan'23)

Multilinguality in LLMs

A Qualitative Investigation into LLM-Generated Multilingual Code Comments and Automatic Evaluation Metrics, The International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE), 2025
The Heap: A Contamination-Free Multilingual Code Dataset for Evaluating Large Language Models, The 2nd ACM International Conference on AI Foundation Models and Software Engineering (FORGE), 2025
An Exploratory Investigation into Code License Infringements in Large Language Model Training Datasets, The 2nd ACM International Conference on AI Foundation Models and Software Engineering (FORGE), 2024
Language models for code completion: A practical evaluation, IEEE/ACM 46th International Conference on Software Engineering (ICSE), 2024
Programming Language Models in Multilingual Settings, IEEE/ACM 46th International Conference on Software Engineering (ICSE), Doctoral Symposium, 2024
On the Impact of Language Selection for Training and Evaluating Programming Language Models, IEEE 23rd International Working Conference on Source Code Analysis and Manipulation (SCAM), 2023

Egor Bogomolov

PhD candidate (Mar'24)

Evaluation in LLMs

Long Code Arena: a Set of Benchmarks for Long-context Code Models, under review, 2025

Agnia Sergeyuk

PhD candidate (Apr'24)

Human-AI Interaction in IDE

Human-AI Experience in Integrated Development Environments: A Systematic Literature Review, under review, 2025
The Design Space of in-IDE Human-AI Experience, under review, 2024
In-IDE Human-AI Experience in the Era of Large Language Models; A Literature Review, The 1st Workshop on Integrated Development Environments (IDE), 2024

Daniele Cipollone

PhD candidate (Sep'24)

LLM Integration in IDE

Automating the Detection of Code Vulnerabilities by Analyzing GitHub Issues, The 2nd International Workshop on LLM4Code, co-located with ICSE, 2025
Enhancing Large Language Model Integration in Integrated Development Environments, ACM International Conference on the Foundations of Software Engineering (FSE), Doctoral Symposium, 2025

Ziyou Li

PhD candidate (Dec'24)

AI/AI Interaction in IDE

Enhancing Human-IDE Interaction in the SDLC using LLM-based Mediator Agents, The 1st International Workshop on AI-Augmented SDLC, co-located with FSE, 2025
Mediating between Human Programmers and Integrated Development Environments using LLM-based Agents, ACM International Conference on the Foundations of Software Engineering (FSE), Doctoral Symposium, 2025

Razvan Popescu

PhD candidate (Feb'25)

Former BSc/MSc student

Robust Datasets for LLM4Code

The Heap: A Contamination-Free Multilingual Code Dataset for Evaluating Large Language Models, The 2nd ACM International Conference on AI Foundation Models and Software Engineering (FORGE), 2025
An Exploratory Investigation into Code License Infringements in Large Language Model Training Datasets, The 2nd ACM International Conference on AI Foundation Models and Software Engineering (FORGE), 2024
Language models for code completion: A practical evaluation, IEEE/ACM 46th International Conference on Software Engineering (ICSE), 2024

Research Assistants/Honour Students

Roham Koohestani

Scientific Dev (Jan'24)

BSc student

Guarantees in GenAI

Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Enhancement Protocol, under review, 2025
Leveraging large language models for enhancing the understandability of generated unit tests, IEEE/ACM 47th International Conference on Software Engineering (ICSE), main track, 2025
HyperSeq: A Hyper-Adaptive Representation for Predictive Sequencing of States, ACM International Conference on the Foundations of Software Engineering (FSE), 2025
Rethinking IDE Customization for Enhanced HAX: A Hyperdimensional Perspective, The 2nd Workshop on Integrated Development Environments (IDE), 2025

Yongcheng Huang

Research Assistant

BSc/MSc student

Refactoring via LLMs

A Qualitative Investigation into LLM-Generated Multilingual Code Comments and Automatic Evaluation Metrics, The International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE), 2025

MSc Students

Nadine Kuo

Yash Mudhra

Venelina Pocheva

Visitors

Fabio Salerno

MSc student from Italy

Memorization in LLMs4Code

Next: SE @ Stema

How Much Do Code Language Models Remember? An Investigation on Data Extraction Attacks before and after Fine-tuning, IEEE/ACM 22nd International Conference on Mining Software Repositories (MSR), 2025

Alumni

Andrei Ionescu

Former BSc/MSc student

Onboarding Agents

Next: Intern @ Microsoft

A Multi-agent Onboarding Assistant based on Large Language Models, Retrieval Augmented Generation, and Chain-of-Thought, ACM International Conference on the Foundations of Software Engineering (FSE), 2025

Aral De Moor

Scientific Dev (Sep'23)

Former BSc student

Smart Trigger Models

Next: ML engineer @ JetBrains

A Transformer-based Approach for Smart Invocation of Automatic Code Completion, The 1st ACM International Conference on AI-powered Software (AIware), 2024

Tim van Dam

Former BSc/MSc student

AutoCompletion via LLMs

Next: Software Engineer @ Teifi

Investigating the performance of language models for completing code in functional programming languages: a Haskell case study, The 1st ACM International Conference on AI Foundation Models and Software Engineering (FORGE), 2024
Language models for code completion: A practical evaluation, IEEE/ACM 46th International Conference on Software Engineering (ICSE), 2024
Enriching Source Code with Contextual Data for Code Completion Models: An Empirical Study, IEEE/ACM 20th International Conference on Mining Software Repositories (MSR), 2023

Frank van der Heijden

Former BSc/MSc student

AutoCompletion via LLMs

Next: CTO @ Teifi Digital

Investigating the performance of language models for completing code in functional programming languages: a Haskell case study, The 1st ACM International Conference on AI Foundation Models and Software Engineering (FORGE), 2024

Philippe de Bekker

Former MSc student

AI4SE Benchmarking

Next: SE @ Booking.com

Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Enhancement Protocol, under review, 2025
Investigating the performance of language models for completing code in functional programming languages: a Haskell case study, The 1st ACM International Conference on AI Foundation Models and Software Engineering (FORGE), 2024

Remco Schrijver

Former MSc student

AI Assistants Impact

Next: SE @ Booking.com

Beyond Acceptance Rates: The Impact of JetBrains AI Assistant and FLCC, TU Delft - MSc Thesis, 2024

Thesis Supervision

I have (co-)supervised 41 MSc/BSc students, and many of my students have graduated cum laude (top 5% of class).

MSc Level

#	Degree	Year	University	Student	Title
11	MSc	2025	TUDelft	R. Popescu	Dataset Development for LLMs4Code: Licensing, Contamination, and Reproducibility Challenges
10	MSc	2024	TUDelft	A.C. Ionescu	Meet Your Onboarding Buddy: A Smart, Adaptive, and Conversational LLM Assistant
9	MSc	2024	TUDelft	R. Schrijver	Beyond Acceptance Rates: The Impact of JetBrains AI Assistant and FLCC
8	MSc	2024	TUDelft	T. van Dam	Black-box Context-Aware Code Completion
7	MSc	2024	TUDelft	P. de Bekker	AI for Software Engineering: Reviewing and Improving Benchmarking Practices
6	MSc	2024	TUDelft	F. van der Heijden	Interactive & Adaptive LLMs: Building and Evaluating an LLM-based Code Completion Plugin
5	MSc	2024	U. of Milano-Bicocca	F. Salerno	Extracting Training Data from Fine-tuned Large Language Models
4	MSc	2022	TUDelft	A. Al-Kaswan	Limits of Binary Code Summarization with Transformers
3	MSc	2021	Sharif	M. Nejati	Missing Software Tag Recommendation
2	MSc	2021	Sharif	P. Rostami	Issue Commit Linking
1	MSc	2020	Sharif	K. Akbari	Issue Report Classification

BSc Level

#	Degree	Year	University	Student	Title
30	BSc	2024	TUDelft	B. Koc	Implications of LLMs4Code on Copyright Infringement
29	BSc	2024	TUDelft	P. Deatc	Red Teaming LLMs for Dangerous and Unfair Software Applications
28	BSc	2024	TUDelft	C. Ionescu	Red-Teaming Code LLMs for Malware Generation
27	BSc	2024	TUDelft	F. Ignijic	Evaluating Adaptive Activation Functions in Language Models
26	BSc	2024	TUDelft	Y. Wu	Sparse Transformers are (in)Efficient Learners
25	BSc	2024	TUDelft	R. Mota Borges	Tokenization Matters: Training Your Tokenizer Right
24	BSc	2024	TUDelft	P. Loizides	LLM of Babel: Evaluation of LLMs on Code (Greek Focus)
23	BSc	2024	TUDelft	G. Panchu	LLM of Babel: Java Code Summarization in Dutch
22	BSc	2024	TUDelft	M. Ziemlewski	LLM of Babel: Code Summarization in Polish
21	BSc	2024	TUDelft	S. Vermeulen	Evaluating CodeGemma-7B for Dutch Code Comment Generation
20	BSc	2024	TUDelft	Y. Huang	LLM of Babel: Broader Multilingual Evaluation
19	BSc	2024	TUDelft	I. Vasiliauskas	Detecting Weaknesses in LLM Generated Code
18	BSc	2024	TUDelft	I. Moruz	How Can LLMs Harm Privacy? Red-Teaming Exploration
17	BSc	2024	TUDelft	K. Gulamov	Speed/Quality Trade-offs in Attention Mechanisms
16	BSc	2023	TUDelft	D. Sochirca	Compressing Code Generation Language Models on CPUs
15	BSc	2023	TUDelft	M. Keeler	Cross-Lingual Evaluation of CodeGen in Code Completion
14	BSc	2023	TUDelft	E. Malmsten	Distil-CodeGPT: Distilling Code-Generation Models
13	BSc	2023	TUDelft	M. Storti	Efficient Transformer Quantization for CodeGPT
12	BSc	2023	TUDelft	H. Kuo	Cross-Lingual Performance of CodeGPT in Completion Tasks
11	BSc	2023	TUDelft	E. Malmsten	Distil-CodeGPT: Distilling Code-Generation Models
10	BSc	2023	TUDelft	A. de Moor	Compressing CodeGPT via Layer Reduction and Quantisation
9	BSc	2023	TUDelft	R. Popescu	Common Code Structures Impact on CodeParrot Completion
8	BSc	2022	TUDelft	T. van Dam	Performance Analysis of UniXcoder
7	BSc	2022	TUDelft	F. van der Heijden	Analysis of InCoder on Statement Prediction
6	BSc	2022	TUDelft	M. Turk	Improving Source Code Conversion for Code Completion
5	BSc	2022	TUDelft	J. de Weerdt	User Evaluation of UniXcoder with Statement Completion
4	BSc	2022	TUDelft	M. Otten	User Evaluation of InCoder with Statement Completion
3	BSc	2022	TUDelft	A.C. Ionescu	Repository Recommender System Using Tag Hierarchies
2	BSc	2022	TUDelft	C. Botocan	Duplicate Stack Overflow Detection Using Tags and Text
1	BSc	2022	TUDelft	A. van der Rande	Improving GitHub Tag Recommenders Using Tag Hierarchies