Kento Nishi 西健斗

San Jose, CA / Cambridge, MA / Chiba, Japan

Email/Resume/GitHub/Scholar/LinkedIn

Hi! My name is Kento, and I'm a PhD student at MIT EECS/CSAIL advised by Phillip Isola. I graduated from Harvard College/SEAS in 2026, with Honors AB/SM degrees in Computer Science. Click here to read my full bio.

In the past, I had the pleasure of being advised by Hidenori Tanaka, Ekdeep Singh Lubana, and Hanspeter Pfister at Harvard, as well as Tobias Höllerer at UCSB. Throughout my undergraduate and ongoing graduate studies, the Ezoe Memorial Recruit Foundation Scholarship has graciously supported my academic pursuits.

My research aims to understand the surprising quirks of deep learning. Why does training give birth to well-organized representations for certain concepts and tasks, but not others? What mechanistic motifs emerge across different models, and why? What properties of the underlying learning algorithm and data distribution lead to these phenomena, and how can we leverage this understanding to build safer and more capable systems? I want to answer these fundamental questions by building a scientific theory of artificial intelligence. Incidentally, I strongly support interdisciplinary collaboration, open access, and open source.

Aside from academics, I'm an avid long-distance runner (mainly half and full marathons). I also love F1, public transit, anime/vtubers, music production, local eats, and Rocket League. Feel free to reach out via email or Discord @kento24!

My Publications

Mechanisms of Misgeneralization in Physical Sequence Modeling

Kento Nishi, Raphael Tang, Karun Kumar, Core Francisco Park, Hidenori Tanaka

Preprint 2026, as first author.

arXiv/Website/Tweet Thread

We show that generative sequence models can produce individually plausible physical trajectories while shifting aggregate quantities such as distance or energy, and use a data deviation kernel to predict and reduce this drift.

Evolutionary Curriculum Learning for Biological Sequence Modeling

Richard Yuxuan Zhu, Kento Nishi

ICML 2026 SPIGM Workshop, as co-author.

OpenReview

We train biological sequence models with a curriculum that gradually expands from nearby evolutionary neighbors to more distant homologs, improving protein variant-effect prediction and RNA sequence generation.

When does Observational Data Teach Latent Dynamics? Understanding Control Misalignment with Synthetic Tasks

Kento Nishi, Raphael Tang, Karun Kumar, Core Francisco Park, Hidenori Tanaka

Sci4DL 2026 Workshop, as first author.

OpenReview

We show that generated samples can fit the observed data distribution while violating the distribution of hidden controls such as speed, energy, or speaking rate.

Representation Shattering in Transformers: A Synthetic Study with Knowledge Editing

Kento Nishi, Rahul Ramesh, Maya Okawa, Mikail Khona, Hidenori Tanaka, Ekdeep Singh Lubana

ICML 2025, as first author.

Poster/OpenReview/arXiv/Code/Tweet Thread

We show that knowledge editing distorts entity representations beyond targeted facts, fracturing geometries that support factual recall and reasoning.

In-Context Learning of Representations

Core Francisco Park, Andrew Lee, Ekdeep Singh Lubana, Yongyi Yang, Maya Okawa, Kento Nishi, Martin Wattenberg, Hidenori Tanaka

ICLR 2025, as co-author.

Poster/OpenReview/arXiv

We show that a concept representation can mediate in-context learning; when a graph is specified by enough examples, model representations shift toward that graph rather than only pretrained semantic associations.

Structured In-Context Task Representations

Core Francisco Park, Andrew Lee, Ekdeep Singh Lubana, Kento Nishi, Maya Okawa, Hidenori Tanaka

NeurIPS 2024 NeurReps Workshop, as co-author.

OpenReview

We find internal task representations during in-context learning on synthetic sequences from geometric graphs, including cases where model behavior follows context-specified structure over semantic priors.

Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model

Mikail Khona, Maya Okawa, Jan Hula, Rahul Ramesh, Kento Nishi, Robert Dick, Ekdeep Singh Lubana, Hidenori Tanaka

ICML 2024, as co-author.

Poster/OpenReview/arXiv

We use a synthetic graph-navigation task to measure stepwise inference, covering the reasoning gap, sampling-temperature tradeoff, simplicity bias, compositional generalization, and in-context primacy.

Stepwise Inference in Transformers: Exploring a Synthetic Graph Navigation Task

Mikail Khona, Maya Okawa, Rahul Ramesh, Kento Nishi, Robert P. Dick, Ekdeep Singh Lubana, Hidenori Tanaka

NeurIPS 2023 R0-FoMo Workshop, as co-author.

OpenReview

In directed acyclic graph navigation, we find better route generation when step-by-step paths expose hierarchical subpaths seen during pretraining; generated routes are further biased by in-context examples.

Joint-Task Regularization for Partially Labeled Multi-Task Learning

Kento Nishi, Junsik Kim, Wanhua Li, Hanspeter Pfister

CVPR 2024, as first author.

Poster/OpenAccess/arXiv/Website/Code

We regularize dense prediction outputs in a shared task space, improving partially labeled multi-task learning while scaling linearly with task count.

Augmentation Strategies for Learning with Noisy Labels

Kento Nishi, Yi Ding, Alex Rich, Tobias Höllerer

CVPR 2021, as first author.

Video/OpenAccess/arXiv/Website/Code

We separate augmentation strategies for loss modeling and model learning, improving robustness to noisy labels, including a large gain on CIFAR-10 with 90% symmetric noise.

Improving Label Noise Robustness with Data Augmentation and Semi-Supervised Learning

Kento Nishi, Yi Ding, Alex Rich, Tobias Höllerer

AAAI 2021 Student Abstract Track, as first author.

AAAI/DOI

We combine data augmentation with semi-supervised learning to improve noisy-label robustness on CIFAR-10 and CIFAR-100 variants.

My Apps & Websites

LiveTL

Get live translations for YouTube streams, crowdsourced from multilingual viewers.

GitHub/Website/Chrome Web Store/Mozilla Add-ons

HyperChat by LiveTL

HyperChat enhances your YouTube chat with a smoother, more feature-packed experience!

GitHub/Website/Chrome Web Store/Mozilla Add-ons

YtcFilter by LiveTL

Capture YouTube chat messages based on filter rules alongside the standard chat, with persistent logs and import/export.

GitHub/Website/Chrome Web Store/Mozilla Add-ons

holoEN Christmas Advent Calendar

Officially commissioned by Cover Corp. and used as a yearly hololive English community event platform.

Website/Cover Corp. Website

My Developer Tools

slsh: ssh without keyboard lag

A drop-in SSH wrapper with local latency compensation for interactive terminal sessions.

Project/GitHub/Release

Torch Pitch Shift

The first Python library for GPU pitch shifting at the time; later added to PyTorch upstream.

PyPI/GitHub

iframe Translator

Translate text in the browser by triggering in-browser translations programmatically.

npm/GitHub

exio UI Library

Framework-agnostic interactive components that can be mounted via content scripts.

Website/Docs/npm/GitHub

My Music

Date	Status	Length	Title
Mar. 2026	Draft	2:11	Bliss
Aug. 2023	Demo	1:59	Celesta
Jan. 2023	Done	3:12	Forfeit
Sept. 2022	Done	3:24	Voyage
Sept. 2022	Done	3:53	New Beginning
Jul. 2022	Done	1:58	HyperChat Trailer Theme
Mar. 2020	Demo	1:17	Dubstep Demo

My Course Projects

Course	Link	Title
COMPSCI 2760	PDF	Do AI Conferences' Ethics Reviews Steer Research Practices? (No!)
COMPSCI 1050	PDF	Does CS1050's Chatham House Policy Protect Against LLM-Powered Stylometric Linkage De-Identification? (Yes!)
COMPSCI 271	PDF	Can Temporal Distance Maps Communicate Variability? (Yes!) A User Study with Maps of Transit Travel Times
COMPSCI 175	PDF	Implementing Portals in Unity
HISTSCI 1990	PDF	Browser Extension Standards: How Google Monopolized and Exploited the Web Browser Industry
