セミナー
Seminars

リサーチセミナー Research Seminars

リサーチグループメンバーや学内外の研究者・学生による研究成果や進捗の発表会
Research seminars invite group members and affiliated researchers and students to present their work

2026年

リサーチセミナー Research Seminar #7 2026.06.17

  • 日付 Date:2026.06.17
  • 発表者 Speaker:Sarmistha Das (Indian Institute of Technology Patna)
  • 題目 Title:Reasoning-Aware Multimodal NLP for Human-Centered FinTech and Cross-Cultural AI
  • 要旨 Abstract:Financial technology services increasingly depend on intelligent systems that can understand user needs, interpret noisy real-world evidence, and provide reliable decision support. However, practical FinTech scenarios are rarely limited to clean textual inputs. Users often express complaints, financial concerns, advisory needs, or questions through multilingual text, multi-turn conversations, images, videos, audio signals, numerical tables, and culturally grounded expressions. This creates important challenges for current NLP and multimodal AI systems, including weak numerical reasoning, limited explainability, poor multilingual robustness, and insufficient grounding in visual or conversational context. In this talk, I will present my research journey toward building reasoning-aware, multilingual, and multimodal AI systems for financial and human-centered applications. The first part of the talk focuses on financial complaint understanding, covering text-based complaint identification, conversational complaint mining, multimodal customer grievance analysis, and video-based financial complaint detection. These works move beyond simple classification by incorporating aspect-level reasoning, severity modeling, cause extraction, multimodal fusion, and explainability. The second part discusses financial advisory and reasoning systems, including commonsense-aware conversational agents, multimodal financial advisory summarization, and multilingual financial question answering. These studies explore how large language models and vision-language models can be adapted through task-specific frameworks, retrieval, preference optimization, constrained decoding, and supervised fine-tuning to improve reliability in high-stakes financial settings. Finally, I will discuss my broader interest in culturally grounded multimodal AI, including multilingual idiom understanding across Hindi, Bengali, and Thai. This direction extends the same core research question beyond finance: how can AI systems understand not only what users say, but also the intent, cultural meaning, visual evidence, and reasoning process behind it? The talk concludes with possible directions for India–Japan collaboration on trustworthy, multilingual, multimodal, and culturally adaptive AI systems.
  • プロフィール Bio:Sarmistha Das is a Ph.D. student in the Department of Computer Science and Engineering at the Indian Institute of Technology Patna, India. Her research focuses on financial NLP, multimodal learning, multilingual reasoning, explainable AI, and human-centered large language models. She works on building AI systems for financial complaint understanding, conversational financial advisory, multimodal financial summarization, multilingual financial question answering, and culturally grounded multimodal reasoning. Her recent research investigates how language models and vision-language models can be adapted for real-world financial and social-good applications through multitask learning, mixture-of-experts architectures, retrieval-augmented generation, preference optimization, constrained decoding, and reasoning-aware evaluation. She has contributed to datasets and systems for financial complaint mining, video-based complaint generation, financial advisory summarization, multilingual financial QA, and multimodal idiom understanding. Her work has appeared or is under review in venues including IEEE Transactions on Computational Social Systems, WACV, ACM Multimedia, CIKM, AAAI, and related NLP/multimodal AI venues. Her long-term research goal is to develop trustworthy, explainable, multilingual, and multimodal AI systems that can support financial literacy, user assistance, and cross-cultural understanding.

リサーチセミナー Research Seminar #6 2026.05.25

  • 日付 Date:2026.05.25
  • 発表者 Speaker:Charles Clarke (University of Waterloo)
  • 題目 Title:A Sequential Decision Framework for LLM-Guided Search
  • 要旨 Abstract:Recent systems applying "retrieval-augmented generation" (RAG) and "deep research" iteratively generate queries, retrieve documents, and examine their contents, using the resulting information to guide subsequent search steps. When this process is carried out autonomously rather than by a human searcher, the system must make explicit decisions about which search trajectories to explore and what objective the search is intended to optimize. We introduce a simple framework for reasoning about such search policies. The framework models the search environment as a set of ranked lists and represents behavior using two primitive actions: select, which advances judgment through an existing ranking, and spawn, which generates a new ranking through query generation or reformulation. Each action produces a document to inspect, yields information that updates the search state, and contributes a reward for a task-specific objective. Within this abstraction, a variety of existing retrieval strategies can be expressed as policies operating over a common environment, while common retrieval objectives can be represented through reward and weighting functions over the resulting search trajectory, and the resulting formulation admits a natural reinforcement learning interpretation. We illustrate the approach using high-recall information retrieval (HRIR), where the goal is to identify as many relevant documents as possible with limited judging effort, showing how the framework provides a unified way to describe and reason about strategies for both classical retrieval tasks and emerging LLM-guided search systems. Through experiments, we demonstrate that both the underlying retrieval method and the deep research policy employed can materially affect performance.
  • プロフィール Bio:Charles Clarke is a Professor of Computer Science at the University of Waterloo, Canada, and a Visiting Professor at the National Institute of Informatics, Japan. His research focuses on data-intensive tasks and efficiency, including search, ranking, question answering, and other problems involving human language data. He has supervised over 30 graduate students to completion and published over 200 refereed contributions on a wide range of topics, including search, metrics, user interfaces, filesystem search, natural language processing, machine learning, and databases. He has worked on search engine technology for both Microsoft Bing and Facebook Search. Clarke is an ACM Distinguished Scientist and a leading member of the search and information retrieval community, serving as the Chair of the Executive Committee for the ACM Special Interest Group from 2013 to 2016 and as the Co-Editor-in-Chief of the Information Retrieval Journal from 2010 to 2018.
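
The select/spawn abstraction described in the abstract above can be illustrated with a small sketch. This is not the speaker's implementation: the names `SearchState`, `select`, `spawn`, `judge`, and `run_policy` are hypothetical, the reward (1 per newly found relevant document) is an assumed recall-style objective, and the round-robin policy is only one simple example of a policy over the shared environment.

```python
from dataclasses import dataclass, field


@dataclass
class SearchState:
    """Assumed environment: a set of ranked lists plus a read cursor into each."""
    rankings: list = field(default_factory=list)   # list of ranked lists of doc ids
    cursors: list = field(default_factory=list)    # next unread position per ranking
    judged: set = field(default_factory=set)       # doc ids inspected so far
    reward: float = 0.0                            # accumulated task reward


def select(state, i):
    """Advance judgment one step through ranking i; return the document to inspect."""
    pos = state.cursors[i]
    if pos >= len(state.rankings[i]):
        return None  # ranking i is exhausted
    state.cursors[i] = pos + 1
    return state.rankings[i][pos]


def spawn(state, new_ranking):
    """Add a new ranking (e.g. from a reformulated query) and inspect its top document."""
    state.rankings.append(list(new_ranking))
    state.cursors.append(0)
    return select(state, len(state.rankings) - 1)


def judge(state, doc, relevant):
    """Update the state; assumed reward is 1 for each newly found relevant document."""
    if doc is None or doc in state.judged:
        return 0.0
    state.judged.add(doc)
    gain = 1.0 if doc in relevant else 0.0
    state.reward += gain
    return gain


def run_policy(query_rankings, relevant, budget):
    """Toy policy: spawn each ranking once, then select round-robin until the budget is spent."""
    state = SearchState()
    steps = 0
    for ranking in query_rankings:
        if steps >= budget:
            return state
        judge(state, spawn(state, ranking), relevant)
        steps += 1
    i = 0
    while steps < budget and any(c < len(r) for c, r in zip(state.cursors, state.rankings)):
        doc = select(state, i % len(state.rankings))
        i += 1
        if doc is not None:  # only inspected documents consume judging effort
            judge(state, doc, relevant)
            steps += 1
    return state
```

In this sketch, a different retrieval strategy would correspond to a different `run_policy`, and a different objective to a different `judge`; the environment and the two actions stay the same.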

リサーチセミナー Research Seminar #5 2026.05.20

  • 日付 Date:2026.05.20
  • 発表者 Speaker:Benjamin Clavié (Mixedbread / National Institute of Informatics)
  • 題目 Title:ColBERT and Late Interaction Retrieval: Why, How, and What Next?
  • 要旨 Abstract:Within Neural IR, three major paradigms have emerged: dense single-vector retrieval (e.g. DPR), the most dominant one, contextualized sparse retrieval (e.g. SPLADE), and dense multi-vector retrieval (ColBERT), also called "late interaction". In this talk, we will briefly discuss the strengths and weaknesses of each of those paradigms. We will then take a deeper dive into late interaction models and cover both their theoretical and practical advantages. We will then assess how much we understand about the mechanisms of late interaction models, covering aspects which are well-understood as well as others which remain empirical or subject to strong practical constraints. Building on this, we will finally outline potential next steps for late interaction methods.
  • プロフィール Bio:Benjamin Clavié is a senior researcher at MixedBread AI and a PhD student at the National Institute of Informatics, based in Tokyo, Japan. He specializes in natural language processing and information retrieval, with a focus on late-interaction (“ColBERT”) models and novel scoring paradigms in general. Recently, Ben co-led the ModernBERT project, contributing significantly to renewed interest in streamlined, encoder-only models. He earned his MSc in Artificial Intelligence from the University of Edinburgh in Scotland.
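
As background to the abstract above, the difference between single-vector scoring and late interaction can be sketched in a few lines. This is a minimal illustration, not material from the talk: it assumes the commonly described ColBERT-style "MaxSim" operator (each query token vector is matched to its most similar document token vector and the maxima are summed), uses cosine similarity, and works on toy hand-written vectors; the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def single_vector_score(query_vec, doc_vec):
    """Dense single-vector retrieval: one vector per query and one per document."""
    return cosine(query_vec, doc_vec)


def maxsim_score(query_vecs, doc_vecs):
    """Late interaction: sum, over query token vectors, of the best match among
    the document token vectors. doc_vecs must be non-empty."""
    return sum(max(cosine(q, d) for d in doc_vecs) for q in query_vecs)


# Toy example: two query token vectors, two candidate documents.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # covers both query tokens
doc_b = [[1.0, 0.0]]                          # covers only the first
scores = {"doc_a": maxsim_score(query, doc_a), "doc_b": maxsim_score(query, doc_b)}
```

Because document token vectors can be computed ahead of time and the query-document interaction happens only in the final max-and-sum step, the interaction is "late" compared with a cross-encoder that reads query and document together.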

リサーチセミナー Research Seminar #4 2026.04.15

  • 日付 Date:2026.04.15
  • 発表者 Speaker:ファム・フーロン(筑波大学)Huu-Long Pham (University of Tsukuba)
  • 題目 Title:機械学習モデル検索 Machine Learning Model Retrieval
  • 要旨 Abstract:現在、AI技術は急速に発展し、自然言語処理や画像処理など多くの分野で活用されている。機械学習モデルは、学習データの種類や構造によって異なる性質を持ち、法律や医療など特定分野に特化したものも存在する。ユーザは自身の解きたい課題に適したものを無数のモデルから選ぶ必要があるが、最適なモデルの発見には多大なコストがかかる。そこで本研究では、Web上に提供されている多様な機械学習モデルの中から、ユーザの特定の課題に最適なモデルを効率的に検索する手法の確立を目指している。 AI technology is developing rapidly and is being applied in many fields, such as natural language processing and image processing. Machine learning models have different characteristics depending on their structure and training data, and some are specifically tailored to domains such as law and medicine. Users must select the model best suited to their task from countless available options, but discovering the optimal model incurs significant cost. The goal of this research is therefore to establish a method for efficiently retrieving the most suitable models for a user's specific task from the diverse array of machine learning models available on the Web.
  • プロフィール Bio:筑波大学 図書館情報メディア系 特任助教。博士(情報科学、兵庫県立大学)。専門分野は機械学習モデルの検索。 Huu-Long Pham earned his PhD at the University of Hyogo. His research focuses on machine learning model retrieval.

リサーチセミナー Research Seminar #3 2026.03.18(延期 Postponed)

  • 日付 Date:2026.03.18
  • 発表者 Speaker:Benjamin Clavié (Mixedbread / National Institute of Informatics)
  • 題目 Title:ColBERT and Late Interaction Retrieval: Why, How, and What Next?
  • 要旨 Abstract:Within Neural IR, three major paradigms have emerged: dense single-vector retrieval (e.g. DPR), the most dominant one, contextualized sparse retrieval (e.g. SPLADE), and dense multi-vector retrieval (ColBERT), also called "late interaction". In this talk, we will briefly discuss the strengths and weaknesses of each of those paradigms. We will then take a deeper dive into late interaction models and cover both their theoretical and practical advantages. We will then assess how much we understand about the mechanisms of late interaction models, covering aspects which are well-understood as well as others which remain empirical or subject to strong practical constraints. Building on this, we will finally outline potential next steps for late interaction methods.
  • プロフィール Bio:Benjamin Clavié is a senior researcher at MixedBread AI and a PhD student at the National Institute of Informatics, based in Tokyo, Japan. He specializes in natural language processing and information retrieval, with a focus on late-interaction (“ColBERT”) models and novel scoring paradigms in general. Recently, Ben co-led the ModernBERT project, contributing significantly to renewed interest in streamlined, encoder-only models. He earned his MSc in Artificial Intelligence from the University of Edinburgh in Scotland.

リサーチセミナー Research Seminar #2 2026.02.18

  • 日付 Date:2026.02.18
  • 発表者 Speaker:Jiaman He (RMIT University)
  • 題目 Title:Characterizing Personality from Eye-Tracking: The Role of Gaze and Its Absence in Interactive Search Environments
  • 要旨 Abstract:Personality traits influence how individuals engage, behave, and make decisions during the information-seeking process. However, few studies have linked personality to observable search behaviors. This study aims to characterize personality traits through a multimodal time-series model that integrates eye-tracking data and gaze missingness, that is, periods when the user's gaze is not captured. This approach is based on the idea that people often look away when they think, signaling disengagement or reflection. We conducted a user study with 25 participants, who used an interactive application on an iPad, allowing them to engage with digital artifacts from a museum. We rely on raw gaze data from an eye tracker, minimizing preprocessing so that behavioral patterns can be preserved without substantial data cleaning. From this perspective, we trained models to predict personality traits using gaze signals. Our results from a five-fold cross-validation study demonstrate strong predictive performance across all five dimensions: Neuroticism (Macro F1 = 77.69%), Conscientiousness (74.52%), Openness (77.52%), Agreeableness (73.09%), and Extraversion (76.69%). The ablation study examines whether the absence of gaze information affects the model performance, demonstrating that incorporating missingness improves multimodal time-series modeling. The full model, which integrates both time-series signals and missingness information, achieves 10–15% higher accuracy and macro F1 scores across all Big Five traits compared to the model without time-series signals and missingness. These findings provide evidence that personality can be inferred from search-related gaze behavior and demonstrate the value of incorporating missing gaze data into time-series multimodal modeling.
  • プロフィール Bio:Ms. Jiaman He is a PhD student at RMIT University, Australia. Her research focuses on information retrieval, human behavior, and physiological sensing. Ms. He develops novel methodologies that use eye-tracking data to understand human behavior and cognitive states during search activities. In addition, her work investigates the statistical differences between large language model (LLM) annotations and human decision-making in annotation tasks.

リサーチセミナー Research Seminar #1 2026.01.14

  • 日付 Date:2026.01.14
  • 発表者 Speaker:上保秀夫(筑波大学)Hideo Joho (University of Tsukuba)
  • 題目 Title:モデル検索行動 Model Search Behaviour
  • 要旨 Abstract:近年、Retrieval Augmented Generation(RAG)手法などを背景に、人間ユーザのリクエストを受けた大規模言語モデルが、ユーザの代わりに文書コーパスを検索する応用が拡大している。しかし、大規模言語モデルを含む生成AIモデルの出力機序は未解明な部分が多く、各種モデルの検索行動も明らかになっていない。そこで、本研究では、従来人間ユーザを対象に実施してきたユーザスタディ手法を生成AIモデルに適応する「モデル検索行動研究」を紹介する。また、モデル検索行動分析のために開発したgeniie-labライブラリも紹介することで、研究の具体的な手続きや得られる知見を提示する。In recent years, with the rise of methods such as Retrieval-Augmented Generation (RAG), applications have increasingly emerged in which large language models search document corpora on behalf of human users. However, the output mechanisms of generative AI models, including large language models, are essentially black boxes, and their search behaviors remain poorly understood. In this talk, I introduce "model search behaviour" research, which adapts traditional user‑study methodologies, originally designed for human users, to generative AI models. I also present the geniie-lab library, developed for analysing model search behaviour, and illustrate the concrete research procedures and insights that can be obtained.
  • プロフィール Bio:筑波大学図書館情報メディア系・教授。専門はインタラクティブ情報検索、人間情報インタラクション、モデル検索行動、など。 Full Professor at the Institute of Library, Information and Media Science, University of Tsukuba. His research interests include Interactive Information Retrieval, Human Information Interaction, and Model Search Behaviour.