A timestamp taken on USB arrival reflects host reception, not the instant of acquisition. Here is what that means for sub-millisecond multimodal sync.
Many pipelines stamp a sample at the moment it reaches the host USB port. That timestamp lies: it describes the instant of software reception, after hardware buffering, driver traversal and OS scheduling, not the instant the converter actually sampled.
Invisible but structural jitter
Between hardware acquisition and host reception, variable latency and jitter accumulate. On a single stream this goes unnoticed. As soon as you want to align two modalities at the sub-millisecond level, say EEG and video or EEG and an inertial sensor, each stream carries its own independent offset, and the mismatch between them makes the alignment wrong.
The host timestamp is not an acquisition clock; it is an arrival clock.
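To make the effect concrete, here is a minimal simulation sketch. It assumes two sensors that sample the very same instants, each with its own fixed latency plus random jitter before the host sees the data. The latency figures, the uniform jitter model, and the function name `arrival_times` are hypothetical, chosen only for illustration and not measured on any real device.

```python
import random


def arrival_times(acq_times, base_latency_s, jitter_s, rng):
    """Host arrival time = acquisition time + fixed latency + random jitter."""
    return [t + base_latency_s + rng.uniform(0.0, jitter_s) for t in acq_times]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Both sensors acquire the same 1000 instants, 1 ms apart.
acq = [i / 1000.0 for i in range(1000)]

# Hypothetical latency and jitter figures, for illustration only.
eeg_arrival = arrival_times(acq, base_latency_s=0.004, jitter_s=0.002, rng=rng)
imu_arrival = arrival_times(acq, base_latency_s=0.012, jitter_s=0.005, rng=rng)

# The true acquisition offset between the streams is zero.
# Whatever appears here is produced entirely by the arrival path.
errors = [imu - eeg for eeg, imu in zip(eeg_arrival, imu_arrival)]
mean_offset = sum(errors) / len(errors)
spread = max(errors) - min(errors)

print(f"apparent offset: {mean_offset * 1e3:.2f} ms")
print(f"spread of that offset: {spread * 1e3:.2f} ms")
```

Under these assumed numbers, both the apparent offset and its sample-to-sample spread come out at several milliseconds, even though the two sensors sampled at identical instants. A constant offset could be calibrated away; the spread cannot, because it changes from sample to sample.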
Sync layer or single clock
A first approach adds a synchronization layer: a common resampling base and a shared physical signal (a TTL or sync pulse) present in every stream, so the streams can be realigned afterward. The radical alternative is the single hardware clock: one oscillator drives all channels, and the alignment problem disappears by construction.
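The sync-pulse approach can be sketched as follows, assuming each stream records the shared TTL line as one of its own channels and has a known nominal sample rate. The helper names `rising_edge_index` and `times_relative_to_pulse`, the sample rates, and the synthetic recordings are all hypothetical.

```python
def rising_edge_index(ttl, threshold=0.5):
    """Return the index of the first sample where ttl crosses threshold upward."""
    for i in range(1, len(ttl)):
        if ttl[i - 1] < threshold <= ttl[i]:
            return i
    raise ValueError("no rising edge found")


def times_relative_to_pulse(n_samples, rate_hz, edge_index):
    """Sample times in seconds, with t = 0 at the shared pulse."""
    return [(i - edge_index) / rate_hz for i in range(n_samples)]


# Synthetic TTL channels: the same physical pulse lands at a different
# sample index in each stream, because the streams started at different times.
eeg_ttl = [0] * 120 + [1] * 880   # assumed 1000 Hz stream
imu_ttl = [0] * 37 + [1] * 163    # assumed 200 Hz stream

eeg_edge = rising_edge_index(eeg_ttl)
imu_edge = rising_edge_index(imu_ttl)

# Time axes are built from sample indices, never from host arrival stamps.
eeg_t = times_relative_to_pulse(len(eeg_ttl), 1000.0, eeg_edge)
imu_t = times_relative_to_pulse(len(imu_ttl), 200.0, imu_edge)

print(eeg_edge, imu_edge)
print(eeg_t[eeg_edge], imu_t[imu_edge])
```

Both time axes now share the pulse as their zero, and host arrival times play no part in the alignment. Two limits of this sketch are worth noting: the edge is located only to within one sample period of each stream, and a single pulse fixes the offset but not any drift between the two sample clocks, which would need at least a second pulse to estimate.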
The takeaway: never synchronize on the instant of arrival. Either you inject a shared physical marker, or you share a single clock.