Zico's home

Zico Xu

许梓超

Algorithm Assistant Engineer · CYDOO

I hold a B.Eng. in Computer Science from Beijing Institute of Technology, Zhuhai (BITZH). My previous research focused on computer audition and speech signal processing.

I will join CYDOO, a robotics company, in an algorithm role, while independently exploring AI agents and AI system design. My trajectory converges toward the intersection of AI algorithms and AI systems — understanding not just what works, but why, and under what conditions.

Email · GitHub · CV

News

2026.05 Thrilled to join CYDOO as an Algorithm Assistant Engineer!

Selected Work

Research Project

System Reconstruction under Distribution Shift: A Case Study in Child Speech Emotion Recognition

Reproduced a published model reporting 86.82% accuracy. Discovered 63% speaker overlap between train and test sets — systemic data leakage inflating all reported results. After enforcing strict speaker independence, accuracy collapsed to ~35%, near chance. Root cause: mean-pooled handcrafted features (162-dim) destroyed temporal structure, making Conv1d slide on the feature-concatenation axis rather than time — the architecture was mathematically incapable of temporal modeling.
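The leakage check and the stricter split described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names are hypothetical, and "overlap" is assumed to mean the fraction of distinct test speakers that also appear in the training set, which may differ from how the 63% figure was computed.

```python
import random


def speaker_overlap(train_speakers, test_speakers):
    """Fraction of distinct test speakers that also occur in training.

    Assumed definition of "overlap"; other definitions are possible.
    """
    train, test = set(train_speakers), set(test_speakers)
    if not test:
        return 0.0
    return len(train & test) / len(test)


def speaker_independent_split(utterances, ratios=(0.6, 0.2, 0.2), seed=0):
    """Split (speaker_id, utterance) pairs so every speaker lands in one subset.

    Speakers, not utterances, are shuffled and partitioned, so no speaker
    can appear in more than one of train / validation / test.
    """
    speakers = sorted({spk for spk, _ in utterances})
    random.Random(seed).shuffle(speakers)
    n_train = round(ratios[0] * len(speakers))
    n_val = round(ratios[1] * len(speakers))
    groups = (
        speakers[:n_train],
        speakers[n_train:n_train + n_val],
        speakers[n_train + n_val:],
    )
    assignment = {spk: i for i, group in enumerate(groups) for spk in group}
    splits = ([], [], [])
    for spk, utt in utterances:
        splits[assignment[spk]].append((spk, utt))
    return splits
```

Running `speaker_overlap` on the speaker IDs of the resulting train and test subsets should return 0.0; any non-zero value signals the kind of leakage described above.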

Rebuilt the full pipeline from data to evaluation. Switched to WavLM frame-level SSL features (768-dim × T, preserving the true time axis). Designed Prosody-guided Temporal Importance Pooling — using children's F0 and energy contours to weight frames by importance, replacing blind mean pooling. Conducted negative augmentation experiments, quantifying distribution shift via Fréchet Distance.
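One way to picture prosody-guided pooling is the sketch below. It is an assumption-laden illustration rather than the project's implementation: the scoring rule (a weighted sum of z-normalized F0 and energy passed through a softmax) and the parameters `alpha` and `beta` are hypothetical choices, and plain Python lists stand in for the WavLM frame features.

```python
import math


def zscore(values):
    """Standardize a contour to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # a flat contour keeps all z-scores at 0
    return [(v - mean) / std for v in values]


def prosody_weights(f0, energy, alpha=1.0, beta=1.0):
    """Per-frame importance weights from F0 and energy contours.

    Assumed scoring rule: softmax over alpha * z(F0) + beta * z(energy).
    """
    scores = [alpha * a + beta * b for a, b in zip(zscore(f0), zscore(energy))]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_pool(frames, weights):
    """Collapse T frames of dimension D into one D-dim vector."""
    dim = len(frames[0])
    return [
        sum(w * frame[d] for w, frame in zip(weights, frames))
        for d in range(dim)
    ]
```

With flat F0 and energy contours the weights become uniform and the result reduces to ordinary mean pooling, which makes mean pooling a special case of this scheme rather than a separate method.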

80.85% test accuracy under a strict 6:2:2 speaker-independent protocol: about +30pp over the strict MFCC baseline (50.2%) and about +5pp over the published C-BESD baseline (76%). Prosody Pooling is the dominant driver (+2.24pp); the Adapter contributes only ~0.5pp. In the augmentation experiments, accuracy fell as Fréchet Distance rose (roughly FD ∝ 1/Accuracy), so the measured distribution shift tracks performance degradation. Cross-language transfer (English↔Telugu) largely fails (19–28%), which points to language shift as a harder problem than acoustic shift.

35% → 80.85% UAR · 6:2:2 speaker-independent · Prosody Pooling +2.24pp
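For readers unfamiliar with Fréchet Distance as a shift measure, a simplified sketch follows. It assumes each feature set is modeled as a Gaussian with diagonal covariance, which avoids the matrix square root of the general formula; the project may well have used the full-covariance version. The function names are hypothetical, and the returned value is the squared distance, as is conventional in FID-style scores.

```python
import math


def diag_gaussian_stats(features):
    """Per-dimension mean and variance of a list of feature vectors."""
    n, dim = len(features), len(features[0])
    mean = [sum(f[d] for f in features) / n for d in range(dim)]
    var = [sum((f[d] - mean[d]) ** 2 for f in features) / n for d in range(dim)]
    return mean, var


def frechet_distance_diag(feats_a, feats_b):
    """Squared Fréchet distance between two diagonal Gaussians.

    With diagonal covariances the general expression
    ||mu_a - mu_b||^2 + Tr(S_a + S_b - 2 (S_a S_b)^(1/2))
    reduces to a sum over dimensions of squared mean and std differences.
    """
    mean_a, var_a = diag_gaussian_stats(feats_a)
    mean_b, var_b = diag_gaussian_stats(feats_b)
    mean_term = sum((a - b) ** 2 for a, b in zip(mean_a, mean_b))
    cov_term = sum(
        (math.sqrt(va) - math.sqrt(vb)) ** 2 for va, vb in zip(var_a, var_b)
    )
    return mean_term + cov_term
```

Comparing clean features against each augmented variant with a function like this yields one shift score per augmentation, which can then be set against the accuracy obtained on that variant.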

System Building

Enterprise Knowledge Base RAG + Agent System

Built an end-to-end retrieval-augmented generation system for enterprise document Q&A. Pipeline spans document chunking, embedding, vector retrieval, prompt assembly, and LLM generation. Extended with agent tool-calling capabilities — summarization, task extraction, and structured querying — demonstrating integration across embedding, retrieval, API orchestration, and LLM inference.
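The retrieval side of such a pipeline can be sketched in a few functions. Everything here is a toy stand-in chosen for illustration: the bag-of-words `embed` replaces a real embedding model, the in-memory list replaces a vector store, and the prompt template is hypothetical. The LLM call itself is left out.

```python
import math
from collections import Counter


def chunk(text, size=40, overlap=10):
    """Split text into overlapping windows of `size` words."""
    words = text.split()
    step = size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]


def embed(text):
    """Toy bag-of-words vector; a placeholder for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, contexts):
    """Assemble retrieved chunks and the question into one prompt string."""
    context_block = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(contexts))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\nQuestion: {query}"
    )
```

The string returned by `build_prompt` is what would be sent to the LLM; swapping `embed` for a learned embedding model changes retrieval quality without touching the other stages.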

Publication · IMCEC 2024 · April 16, 2024

A Multi-Means Speech Encryption Algorithm Based on Improved Zigzag Transformation

..., Zichao Xu, Shan Huang

Software Reg. No. 2024SR1790888 · 2024.11.14

IoT-Based Smart Stereoscopic Agriculture System [HuiNong] V1.0

Beijing Institute of Technology, Zhuhai; ...; Zichao Xu

Research Interests

Current Exploration

AI Algorithms — representation learning, sequential decision-making, structured prediction
AI Systems — efficient inference, model deployment, system-algorithm co-design
Robot Motion Control — planning and control for robotic platforms

Previous Focus

Computer Audition — speech and audio representation, deep learning for acoustic tasks
Speech Security — encryption algorithms, information hiding
Affective Computing — speech emotion recognition, distribution shift, speaker generalization

Currently converging from single-domain applications toward generalizable methods. The core question: how do we build AI systems that are both capable and reliably understood?

Technical Skills

AI Agent & LLM Tools

Claude Code, Codex, Cursor, ChatGPT, Gemini, DeepSeek — active across research and development workflows. Experienced in AI-augmented development: human-led decisions, AI-assisted execution.

Deep Learning & Infrastructure

Python, PyTorch, CUDA/GPU training, Linux, Git. Speech and temporal data modeling with Librosa. Model deployment via ONNX. Data augmentation and few-shot learning.

AI Application Development

RAG, Agent, and tool-calling architectures. Prompt engineering. Vibe Coding and Harness Engineering patterns. End-to-end AI system integration.

Reinforcement Learning & Embodied AI

PPO and policy optimization methods. Isaac Gym and MuJoCo simulation for robot control policy training. RL fundamentals and practical implementation.

Experience

2026.05 —

Algorithm Assistant Engineer

CYDOO (Robotics), Shenzhen

Joining the algorithm team at a robotics company. Looking forward to this new chapter.

2025.01 — 2026.01

Research Intern

Brain Health Engineering (BHE) Lab, BITZH

Contributed to the Guangdong Provincial Key Special Project (No. 2024ZDZX2097). Participated in research projects on speech and audio processing. Co-authored a conference paper (under submission, 2026).

2021.09 — 2025.09

B.Eng. Computer Science and Technology

Beijing Institute of Technology, Zhuhai (BITZH)

Research focus: computer audition, signal processing, and deep learning. Joined the Brain Health Engineering (BHE) Lab where Associate Prof. Zhou works.

Growth & Thinking

Pre-2021

Before university

Knowledge came from textbooks and teachers. Learning meant absorbing what was assigned. I was curious about technology but had no concept of independent inquiry — "research" was something that happened elsewhere.

2021-2022

First major: Automation

Entered BITZH as an Automation major. University felt like an extension of high school — attend lectures, complete assignments. The routine was comfortable but increasingly felt like someone else's path.

2022-2023

A bold move: Computer Science

Realized my real interest lay in coding, algorithms, and AI. Transferred to Computer Science and Technology in the first semester of freshman year. Course projects became my first exposure to problems requiring answers beyond the syllabus. Self-directed learning shifted from necessity to habit.

2023-2024

A fateful encounter

Took Associate Prof. Zhou's Speech Recognition course and served as course representative. Consistent engagement — in and after class — earned her trust. She invited me to join the AI & Embedded Lab, my first entry into a research environment.

2025-2026

Research internship — BHE Lab

Senior year: university restructuring led Associate Prof. Zhou to the Brain Health Engineering (BHE) Lab at BIT. She invited me for a research internship. Under her guidance, I contributed to a provincial-level key project and chose a focused direction for my graduation thesis. Transitioned from coursework-based learning to independent, in-depth research.

2026-Now

Research Assistant — BHE Lab

After graduation, continued at BHE Lab as a research assistant. Currently fixing the data leakage problem discovered in earlier experimental work (63% speaker overlap), re-running experiments under corrected protocols, organizing experimental data, and drafting manuscripts.

2026-Now

From school to society: first job at CYDOO

Navigating the shift from process-oriented research to result-driven engineering. Continuing to develop research skills alongside industry preparation, and learning when to apply research thinking and when to ship. Looking forward to what comes next.