Zico Xu · 许梓超
Algorithm Assistant Engineer  ·  CYDOO

I hold a B.Eng. in Computer Science from Beijing Institute of Technology, Zhuhai (BITZH). My previous research focused on computer audition and speech signal processing.

I will join CYDOO, a robotics company, in an algorithm role, while independently exploring AI agents and AI system design. My trajectory converges toward the intersection of AI algorithms and AI systems — understanding not just what works, but why, and under what conditions.


News

2026.05 Thrilled to join CYDOO as an Algorithm Assistant Engineer!

Selected Work

Research Project
System Reconstruction under Distribution Shift: A Case Study in Child Speech Emotion Recognition

Reproduced a published model reporting 86.82% accuracy and discovered 63% speaker overlap between the train and test sets: systemic data leakage inflating all reported results. After enforcing strict speaker independence, accuracy collapsed to ~35%, near chance. Root cause: mean-pooled handcrafted features (162-dim) destroyed temporal structure, so Conv1d slid along the feature-concatenation axis rather than time. The architecture was mathematically incapable of temporal modeling.
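The leakage check and the speaker-independent split described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function names are hypothetical, overlap is assumed to mean the fraction of test speakers that also appear in training, and the 6:2:2 ratio is assumed to apply to speakers rather than utterances.

```python
import random
from collections import defaultdict


def speaker_overlap(train_speakers, test_speakers):
    """Fraction of test speakers that also appear in the training set (assumed definition)."""
    train, test = set(train_speakers), set(test_speakers)
    return len(train & test) / len(test) if test else 0.0


def speaker_independent_split(utterances, ratios=(0.6, 0.2, 0.2), seed=0):
    """Split (speaker_id, utterance_id) pairs so that no speaker spans two subsets."""
    by_speaker = defaultdict(list)
    for speaker, utt in utterances:
        by_speaker[speaker].append(utt)

    # Shuffle speakers, not utterances, so every speaker lands in exactly one subset.
    speakers = sorted(by_speaker)
    random.Random(seed).shuffle(speakers)
    n_train = round(ratios[0] * len(speakers))
    n_val = round(ratios[1] * len(speakers))
    groups = {
        "train": speakers[:n_train],
        "val": speakers[n_train:n_train + n_val],
        "test": speakers[n_train + n_val:],
    }
    return {
        name: [(s, u) for s in group for u in by_speaker[s]]
        for name, group in groups.items()
    }
```

Running `speaker_overlap` on the speaker IDs of the resulting train and test subsets should return 0.0. Any value above zero means the test set contains voices the model has already heard.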

Rebuilt the full pipeline from data to evaluation. Switched to WavLM frame-level SSL features (768-dim × T, preserving the true time axis). Designed Prosody-guided Temporal Importance Pooling — using children's F0 and energy contours to weight frames by importance, replacing blind mean pooling. Conducted negative augmentation experiments, quantifying distribution shift via Fréchet Distance.
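One way to picture prosody-guided pooling is as a weighted average over frames, where the weights come from the F0 and energy contours. The sketch below is an assumption about the general idea, not the project's implementation: the scoring rule (sum of z-normalized F0 and energy, softmax over time), the `temperature` parameter, and the function name are all hypothetical, and `f0` is assumed to be already interpolated over unvoiced frames.

```python
import numpy as np


def prosody_weighted_pool(frames, f0, energy, temperature=1.0):
    """Pool (T, D) frame features into a (D,) vector using prosody-derived weights.

    Hypothetical scoring: z-normalized F0 plus z-normalized energy,
    turned into frame weights with a softmax over the time axis.
    """
    frames = np.asarray(frames, dtype=float)

    def zscore(x):
        x = np.asarray(x, dtype=float)
        return (x - x.mean()) / (x.std() + 1e-8)

    scores = (zscore(f0) + zscore(energy)) / temperature
    scores = scores - scores.max()  # subtract the max for numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum()
    return weights @ frames, weights
```

With flat F0 and energy contours the weights become uniform and the result reduces to plain mean pooling, so mean pooling is the special case in which prosody carries no information.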

Reached 80.85% test accuracy under a strict 6:2:2 speaker-independent protocol: about +30pp over the strict MFCC baseline (50.2%) and about +5pp over the published C-BESD baseline (76%). Prosody Pooling is the dominant driver (+2.24pp); the Adapter contributes only ~0.5pp. Augmentation experiments confirmed FD ∝ 1/Accuracy: distribution shift quantitatively predicts performance degradation. Cross-language transfer (English↔Telugu) largely fails (19–28%), identifying language shift as a harder problem than acoustic shift.
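For reference, a common way to compute a Fréchet Distance between two feature sets is to fit a Gaussian to each and compare means and covariances, as in FID. Whether the project used exactly this Gaussian form is an assumption; the function name is hypothetical.

```python
import numpy as np


def frechet_distance(x, y):
    """Fréchet distance between Gaussians fitted to feature sets x and y, each (N, D).

    FD = ||mu_x - mu_y||^2 + Tr(C_x) + Tr(C_y) - 2 Tr((C_x C_y)^{1/2})
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.atleast_2d(np.cov(x, rowvar=False))
    cov_y = np.atleast_2d(np.cov(y, rowvar=False))

    # Tr((C_x C_y)^{1/2}) equals the sum of the square roots of the eigenvalues of C_x C_y.
    eigvals = np.linalg.eigvals(cov_x @ cov_y)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_x - mu_y
    return float(diff @ diff + np.trace(cov_x) + np.trace(cov_y) - 2.0 * tr_sqrt)
```

Comparing original and augmented features with such a function gives one scalar per augmentation setting, which can then be plotted against the accuracy obtained under that setting.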

35% → 80.85% UAR  ·  6:2:2 speaker-independent  ·  Prosody Pooling +2.24pp
System Building
Enterprise Knowledge Base RAG + Agent System

Built an end-to-end retrieval-augmented generation system for enterprise document Q&A. Pipeline spans document chunking, embedding, vector retrieval, prompt assembly, and LLM generation. Extended with agent tool-calling capabilities — summarization, task extraction, and structured querying — demonstrating integration across embedding, retrieval, API orchestration, and LLM inference.
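The retrieval half of such a pipeline (chunking, embedding, vector retrieval, prompt assembly) can be sketched with the standard library alone. This is a toy illustration under stated assumptions, not the system described above: the bag-of-words `embed` is a stand-in for a real embedding model, all function names are hypothetical, and the LLM call is deliberately left out.

```python
import math
import re
from collections import Counter


def chunk(text, size=40, overlap=10):
    """Split text into overlapping word windows (requires size > overlap)."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, max(len(words) - overlap, 1), step)]


def embed(text):
    """Toy embedding: bag-of-words term counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query, contexts):
    """Assemble retrieved chunks and the question into a single prompt string."""
    joined = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(contexts))
    return f"Answer using only the context below.\n\nContext:\n{joined}\n\nQuestion: {query}"
```

In a full system, the string returned by `build_prompt` would be passed to the LLM, and `embed` and `retrieve` would be replaced by an embedding model and a vector store.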

Publication  ·  IMCEC 2024  ·  April 16, 2024
A Multi-Means Speech Encryption Algorithm Based on Improved Zigzag Transformation
..., Zichao Xu, Shan Huang
Software Reg. No. 2024SR1790888  ·  2024.11.14
IoT-Based Smart Stereoscopic Agriculture System [HuiNong] V1.0
Beijing Institute of Technology, Zhuhai; ...; Zichao Xu

Research Interests

Current Exploration
  • AI Algorithms — representation learning, sequential decision-making, structured prediction
  • AI Systems — efficient inference, model deployment, system-algorithm co-design
  • Robot Motion Control — planning and control for robotic platforms
Previous Focus
  • Computer Audition — speech and audio representation, deep learning for acoustic tasks
  • Speech Security — encryption algorithms, information hiding
  • Affective Computing — speech emotion recognition, distribution shift, speaker generalization

Currently converging from single-domain applications toward generalizable methods. The core question: how do we build AI systems that are both capable and reliably understood?

Technical Skills

AI Agent & LLM Tools
Claude Code, Codex, Cursor, ChatGPT, Gemini, DeepSeek — active across research and development workflows. Experienced in AI-augmented development: human-led decisions, AI-assisted execution.
Deep Learning & Infrastructure
Python, PyTorch, CUDA/GPU training, Linux, Git. Speech and temporal data modeling with Librosa. Model deployment via ONNX. Data augmentation and few-shot learning.
AI Application Development
RAG, Agent, and tool-calling architectures. Prompt engineering. Vibe Coding and Harness Engineering patterns. End-to-end AI system integration.
Reinforcement Learning & Embodied AI
PPO and policy optimization methods. Isaac Gym and MuJoCo simulation for robot control policy training. RL fundamentals and practical implementation.

Experience

2026.05 —
Algorithm Assistant Engineer
CYDOO (Robotics), Shenzhen
Joining the algorithm team at a robotics company. Looking forward to this new chapter.
2025.01 — 2026.01
Research Intern
Brain Health Engineering (BHE) Lab, BITZH
Contributed to the Guangdong Provincial Key Special Project (No. 2024ZDZX2097). Participated in research projects on speech and audio processing. Co-authored a conference paper (under submission, 2026).
2021.09 — 2025.09
B.Eng. Computer Science and Technology
Beijing Institute of Technology, Zhuhai (BITZH)
Research focus: computer audition, signal processing, and deep learning. Joined the Brain Health Engineering (BHE) Lab where Associate Prof. Zhou works.

Growth & Thinking

Pre-2021
Before university
Knowledge came from textbooks and teachers. Learning meant absorbing what was assigned. I was curious about technology but had no concept of independent inquiry — "research" was something that happened elsewhere.
2021-2022
First major: Automation
Entered BITZH as an Automation major. University felt like an extension of high school — attend lectures, complete assignments. The routine was comfortable but increasingly felt like someone else's path.
2022-2023
A bold move: Computer Science
Realized my real interest lay in coding, algorithms, and AI. Transferred to Computer Science and Technology in the first semester of freshman year. Course projects became my first exposure to problems requiring answers beyond the syllabus. Self-directed learning shifted from necessity to habit.
2023-2024
A fateful encounter
Took Associate Prof. Zhou's Speech Recognition course and served as course representative. Consistent engagement — in and after class — earned her trust. She invited me to join the AI & Embedded Lab, my first entry into a research environment.
2025-2026
Research internship — BHE Lab
Senior year: university restructuring led Associate Prof. Zhou to the Brain Health Engineering (BHE) Lab at BIT. She invited me for a research internship. Under her guidance, I contributed to a provincial-level key project and chose a focused direction for my graduation thesis. Transitioned from coursework-based learning to independent, in-depth research.
2026-Now
Research Assistant — BHE Lab
After graduation, continued at BHE Lab as a research assistant. Currently fixing the data leakage problem discovered in earlier experimental work (63% speaker overlap), re-running experiments under corrected protocols, organizing experimental data, and drafting manuscripts.
2026-Now
From school to society: first job at CYDOO
Navigating the shift from process-oriented research to result-driven engineering. Continuing to develop research skills alongside industry preparation. Learning when to apply research thinking and when to ship. Looking forward.