陶斯遠

Siyuan TAO

Graduate School of Engineering, Department of Mechanical Engineering, The Univ. of Osaka.
MAIL: suen.tou@eom.mech.eng.osaka-u.ac.jp

I am an M2 student in the Ishikawa Laboratory, where I conduct research on embodied learning algorithms and quantization for Visual SLAM. My current research interests lie in Musculoskeletal Robotics and World Models. I have taken part in various projects and competitions, where I gained hands-on experience in robot design and development. During my undergraduate years, I was a member of Robohan and contributed to a variety of robot-building initiatives. Outside of my academic and research pursuits, I enjoy playing tennis and watching anime.

Education

the University of Osaka

Bachelor of Mechanical Engineering

GPA: 3.61 / 4.00

April 2019 - May 2024

the University of Osaka

Master of Mechanical Engineering (M1)

GPA: 3.71 / 4.00

April 2024 - May 2026

Skills

Programming Languages & Tools

Other Skills

1. Robotic control using ROS2

2. Automation Workflow Creation using Make

3. Chinese, Japanese, English

Interests

Apart from my research, I love staying active outdoors, especially playing tennis with friends. When indoors, I enjoy a variety of genres, including sci-fi, biographies, and anime, with Haikyuu!! being my all-time favorite. Additionally, I stay engaged with the latest advancements in biography, embodied AI and autonomous drone technology. I also enjoy creating automation tools to enhance my work efficiency.

Achievements

  1. S. Tao, Y. Minami and M. Ishikawa: “Performance evaluation of ORB-SLAM3 with quantized images”, Artif. Life Robotics, Vol. 30, No. 2, 2025.

  1. S. Tao, Y. Minami, M. Ishikawa: “Assessing the Impact of Dynamic Image Quantization based on Error Diffusion on Visual SLAM”, Proc. of 29th Int’l Symp. on Artificial Life and Robotics, Japan, 2024.

  2. S. Tao, Y. Minami, M. Ishikawa: “Memory-Saving Factor Analysis of Visual-Inertial SLAM with Quantized Images”, SICE Festival 2024 with Annual Conference, Japan, 2024.

  3. S. Tao, Y. Minami, M. Ishikawa: “Performance Enhancement in Autonomous Drone with CLAHE-based Fog Removal Algorithm”, Robomech Conference, Japan, 2025.

  4. A. Michikawa, S. Tao, Y. Masuda, M. Gunji, A. Fukuhara, H. Nabae, Y. Harada, K. Suzumori: “Deep Biomimetic Printing Enabling Integration and Composite Fabrication of Tendon-Ligament-Skeletal Structure”, Robomech Conference, Japan, 2025.

  5. A. Michikawa, S. Tao, Y. Masuda, M. Gunji, A. Fukuhara, H. Nabae, Y. Harada, K. Suzumori: “Deep Biomimetic Printing using Fiber Embedding and Sponge Ossification”, International Symposium on Adaptive Motion of Animals and Machines (AMAM2025), Germany, 2025.

  1. Received an award for excellence in university-wide education (大阪大学全学教育優秀賞), 2020.

  2. Awarded the Special Prize at the NHK Robot Contest (Robocon), 2021.

  3. Won a bronze medal in a Kaggle competition: "Google AI4Code - Understand Code in Python Notebooks", 2022.

  4. Received the Hatakeyama Award (畠山賞) from the Japan Society of Mechanical Engineers, 2024.

  1. TOEFL iBT: 99 (2022.6)

  2. TOEIC Reading&Listening: 860 (2022.7), 855 (2025.3)

  3. Passed the JDLA Deep Learning for GENERAL (G検定) 2024#3 certification. (2024.3)

  4. Completed the Large Language Model Course 2024 (大規模モデル講座2024) organized by Matsuo-Iwasawa Lab at the University of Tokyo, 2024.

  5. Completed the World Model Course 2024 (世界モデル講座2024) organized by Matsuo-Iwasawa Lab at the University of Tokyo, 2025.

  6. Completed the AI Business Insight Course 2025 (AI経営講座2025) organized by Matsuo-Iwasawa Lab at the University of Tokyo, 2025.

  7. Completed the Deep Generative Model Course 2025 (深層生成モデル2025) organized by Matsuo-Iwasawa Lab at the University of Tokyo, 2025.

  1. Exempted from the graduate school entrance examination due to excellent undergraduate academic performance, 2023.

  2. Selected as a scholarship recipient by the Kajima Ikueikai Foundation (鹿島育英会), 2024.

  3. S. Tao, K. Konno, K. Takaoka: “Assessment of the Relationship Between Intelligence Scaling and Task Efficiency in Embodied Multi-Agent Systems with LLMs”, poster presentation for the World Model Course 2024 at Sanjo Conference Hall, The University of Tokyo, Japan, 2025.

  4. Employed as a Research Assistant (RA) under the JST Fusion Oriented Research for disruptive Science and Technology Program, 2025.

Research

1. [AROB2024・SICE FES2024・AROB Journal] Assessing the Impact of Dynamic Image Quantization based on Error Diffusion on Visual SLAM / Memory-Saving Factor Analysis of Visual-Inertial SLAM with Quantized Images / Performance Evaluation of ORB-SLAM3 with Quantized Images

Visual simultaneous localization and mapping (SLAM) is a critical technology for high-precision robot navigation, and improving its accuracy has drawn growing attention from researchers. However, improvements in SLAM accuracy always come at the cost of an increased memory footprint, which limits the long-term operation of devices with constrained hardware resources. Quantization has been proposed as a promising solution to this problem. Since quantization can degrade performance, it is crucial to quantitatively evaluate the trade-off between potential degradation and memory savings to assess its practicality for visual SLAM. This paper introduces a mechanism to evaluate the influence of a quantization method on visual SLAM and applies it to assess the impact of three different quantization methods on ORB-SLAM3. Specifically, we examine two static quantization methods and a dynamic quantization method called error diffusion, which can pseudo-preserve image shading information. The paper concludes that error diffusion, with controlled weight parameters in the error diffusion filter, can suppress degradation while reducing the memory footprint, demonstrating its effectiveness in dynamic environments.
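As a rough illustration of the difference between static quantization and error diffusion described above, the sketch below quantizes an 8-bit grayscale image (a list of rows) both ways. It is a minimal sketch under assumptions: the function names are hypothetical, and the Floyd-Steinberg weights (7/16, 3/16, 5/16, 1/16) are used only as a familiar stand-in, since the weight parameters evaluated in the paper are not given here.

```python
def static_quantize(image, bits):
    """Uniformly quantize 8-bit grayscale pixels to 2**bits levels."""
    levels = (1 << bits) - 1
    step = 255 / levels
    return [[int(round(round(p / step) * step)) for p in row] for row in image]


def error_diffusion_quantize(image, bits):
    """Quantize with error diffusion (Floyd-Steinberg weights, an assumption)."""
    levels = (1 << bits) - 1
    step = 255 / levels
    height, width = len(image), len(image[0])
    # Working copy that accumulates the error pushed from earlier pixels.
    work = [[float(p) for p in row] for row in image]
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            # Pick the nearest representable level, clamped to the valid range.
            level = min(levels, max(0, round(old / step)))
            new = level * step
            out[y][x] = int(round(new))
            err = old - new
            # Spread the quantization error over not-yet-visited neighbours.
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


if __name__ == "__main__":
    flat_gray = [[100] * 8 for _ in range(8)]
    print(static_quantize(flat_gray, 1)[0])
    print(error_diffusion_quantize(flat_gray, 1)[0])
```

On a flat mid-gray patch, 1-bit static quantization maps every pixel to the same level, whereas error diffusion mixes black and white pixels so that the local average stays close to the original shade. This is one way to read "pseudo-preserving" shading information at a reduced bit depth.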

Furthermore, as an extension beyond the conference presentation, we validated the experiment in a real-time environment using Autoware and AWSIM (a Unity-based autonomous driving simulator). The experimental setup is as follows.
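
2. [World Model2024] Analysis of Intelligence Scaling and Diversity in Embodied Multi-Agent Systems with LLMs

In recent years, advances in world models, Large Language Models (LLMs), and Vision-Language Models (VLMs) have greatly improved the problem-solving and collaboration abilities of embodied AI agents, and research on them is active in robotics and games. Minecraft-based studies such as Voyager [Wang 2023] are representative examples. Voyager showed that a single LLM-driven agent in an unknown environment can autonomously plan goals for a given task, explore the environment, acquire new skills, and solve difficult tasks through continual learning. Swarm intelligence in conventional multi-agent systems has also been confirmed to be effective for problem solving [Gronauer 2022]. Applying LLMs to multi-agent systems is therefore expected to improve task efficiency further. VillagerAgent [Dong 2024] and S-Agent [Chen 2024] showed that systems of agents with LLM-derived intelligence improve task efficiency through cooperative work. Subsequent research aimed at improving task efficiency has centered on the design of agent organizational structures and the use of multimodal techniques [Li 2024]. However, these experiments use homogeneous agents based on high-performance LLMs with large model sizes, and how different forms of intelligence interact and integrate remains insufficiently understood. In addition, using LLMs of such large model size is costly, which may cause practical problems for agent systems. This study therefore analyzes how agent intelligence and diversity affect task efficiency within a multi-agent system. Specifically, using MineLand [Yu 2024], a Minecraft-based simulator, we assign groups of AI agents tasks that require cooperation within the group, such as construction and resource gathering, give the agents LLMs of different model sizes and different internal states, and evaluate their performance. To give agents different personalities, we design system prompts based on studies that applied Theory of Mind [Li 2023] and the Big Five [Huang 2024] to LLM-based agent systems, and examine the effects on task execution and emergent behavior. This study was carried out as the final project of the "World Model 2024" course, and the results were presented as a poster at Sanjo Conference Hall, The University of Tokyo. The program used in this study is available at https://github.com/Kosuke-K-21/embodied-world-model.git.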
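
3. [Robomech2025] Performance Enhancement in Autonomous Drone with CLAHE-based Fog Removal Algorithm

At disaster sites such as earthquakes, floods, and wildfires, rapid damage assessment and search-and-rescue operations are required. Drones play an important role as a means of efficiently monitoring wide areas while avoiding sending personnel directly into hazardous zones. However, dense fog, smoke, and dust often occur at disaster sites and greatly degrade the performance of camera-based autonomous navigation systems for drones. As a result, drone safety and reliability are compromised, making it difficult to provide adequate support. Recent image restoration techniques for foggy or smoky environments fall mainly into prior-knowledge-based methods and learning-based methods. Prior-knowledge-based methods include those that exploit physical priors such as the Atmospheric Scattering Model (ASM) and histogram-based methods such as CLAHE (Contrast Limited Adaptive Histogram Equalization) [1][2]. These methods have low computational cost and achieve fast, stable fog removal. Learning-based methods, in contrast, remove fog directly from image data using deep learning models. Methods using U-Net and diffusion models have drawn particular attention in recent work and achieve high-accuracy fog removal [3][4]. However, learning-based methods are computationally very expensive and are difficult to apply where real-time processing is required, as at disaster sites. They also consume large amounts of computing resources, which substantially reduces drone operating time. Furthermore, neither the prior-knowledge-based nor the learning-based methods have been sufficiently evaluated quantitatively when integrated with SLAM (Simultaneous Localization and Mapping) systems. An approach that keeps computational cost low while maintaining high accuracy is therefore needed, along with quantitative evaluation for integration with SLAM systems. In this study, we apply a low-cost, high-performance image restoration technique based on CLAHE and quantitatively evaluate localization performance with ORB-SLAM3 [5] in dense fog. Currently available foggy datasets lack temporal information between images, so no dataset suitable for quantitative SLAM evaluation exists. We therefore generate datasets with various fog intensities using the ASM, based on the MH04 sequence of the Visual SLAM dataset EuRoC MAV [6]. By quantitatively verifying ORB-SLAM3 localization performance under various dense-fog conditions, we demonstrate the effectiveness of the proposed method.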
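The fog-synthesis step mentioned above can be sketched with the standard form of the atmospheric scattering model, I = J·t + A·(1 − t) with transmission t = exp(−β·d). This is only an illustrative sketch: the function name, the per-pixel depth map input, and the default airlight value are assumptions, and how depth was obtained for the MH04 sequence is not stated here. CLAHE itself is not implemented in this sketch.

```python
import math


def add_fog(image, depth, beta, airlight=255.0):
    """Synthesize fog on a grayscale image with the atmospheric scattering model.

    image    : rows of clear-scene intensities J (0-255)
    depth    : rows of per-pixel scene depth d (hypothetical input, same shape)
    beta     : scattering coefficient; larger values mean denser fog
    airlight : global atmospheric light A (assumed constant)
    """
    foggy = []
    for img_row, depth_row in zip(image, depth):
        row = []
        for j, d in zip(img_row, depth_row):
            t = math.exp(-beta * d)  # transmission decays with depth
            row.append(j * t + airlight * (1.0 - t))
        foggy.append(row)
    return foggy


if __name__ == "__main__":
    clear = [[10, 200], [60, 120]]
    depth = [[2.0, 2.0], [8.0, 8.0]]
    for beta in (0.0, 0.1, 0.5):
        print(beta, add_fog(clear, depth, beta))
```

Sweeping `beta` gives a family of fog intensities from one clear sequence. Pixels that are farther away lose more contrast, which is the kind of degradation a contrast-enhancement step such as CLAHE is meant to counteract before feature extraction.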
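
4. [Independent research] Investigating Consciousness and Behavior Patterns of Musculoskeletal Robots with World Models

Musculoskeletal robots can imitate human motor function and achieve highly adaptive, flexible motion. However, conventional control methods are limited in adapting to environmental changes and in long-term action planning, and lack more advanced perception and decision-making capabilities. Robot control using world models has attracted attention in recent years, with research on approaches in which a robot predicts and understands its environment and selects optimal actions accordingly. This study aims to investigate the consciousness and behavior patterns of musculoskeletal robots using world models, examining how a robot perceives its environment and forms adaptive motor strategies. In particular, we focus on the emergence of self-awareness during learning and on the mechanisms of action selection under environmental uncertainty, and verify them through simulation and real-robot experiments. We are currently building musculoskeletal models for reinforcement learning in MuJoCo and Genesis; shown here is a simplified musculoskeletal model of a horse in the MuJoCo environment we created.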
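
5. [Independent research] Reinforcement Learning in the MsPacman Environment with DreamerV2

We train the model-based reinforcement learning algorithm DreamerV2 on the Ms. Pacman environment and evaluate its performance and learning process. DreamerV2 achieves efficient learning by having the agent learn the environment dynamics as an internal model and perform virtual trial and error without interacting with the real environment. Because rewards in Ms. Pacman are sparse, the agent must plan long-term actions while predicting enemy movements, which makes it one of the tasks that conventional reinforcement learning algorithms find hard to learn. We undertook this work for learning purposes and confirmed that accuracy improves as DreamerV2 is scaled up. As future work, we plan to run experiments with more accurate and stable world models such as DreamerV3 and STORM.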
