
My guess is that it’s an unfortunate combination of several problems:

- audio and video capture has to start before the call is actually established at the signaling level, in order to minimize call establishment delay. Audio may be going through Bluetooth, for example, and waking up the BT Handsfree mode may take 1-2 sec

- most of the group calling functionality was developed by a separate team, and group calling signaling may be loosely integrated at the UI level, where, once the UI triggers a switch to a group call, a whole new library may kick in internally and get the current 1-1 call state transferred to it.

- when this “transfer” happens, the state of the first 1-1 call gets affected (at either the local or the remote side, due to signaling). That leads to either the remote side thinking that the call was answered (a lack of protection in the call signaling state machine to ensure it was the user's UI action), or the local side thinking it’s OK that the remote user answered the call (in this case FT must have been streaming audio even during the 1-1 call establishment phase)

- lack of a check for your own phone number being added to a call. This, due to having the same IDs/tokens twice in a group call, may lead to an unexpected call signaling state machine switch

- lack of manual testing with a focus on edge cases (the described flow to repro the bug may not be the main flow for how users start group calls on FT)
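
To make the two "lack of protection" points above concrete, here is a minimal Python sketch of the guards I have in mind. Everything in it (`CalleeLeg`, `AnswerSource`, `add_participant`) is a hypothetical name made up for illustration. It says nothing about how FT is actually implemented; it only shows the kind of check that seems to be missing.

```python
from enum import Enum, auto


class CallState(Enum):
    RINGING = auto()
    ACTIVE = auto()


class AnswerSource(Enum):
    USER_UI = auto()    # the callee tapped "accept"
    SIGNALING = auto()  # inferred from a signaling message, e.g. a group upgrade


class CalleeLeg:
    """Hypothetical callee-side state for one call leg."""

    def __init__(self, local_id):
        self.local_id = local_id
        self.state = CallState.RINGING

    def answer(self, source):
        # Guard: only an explicit user action may move RINGING -> ACTIVE.
        # An "answer" that arrives purely via signaling is ignored.
        if self.state is CallState.RINGING and source is AnswerSource.USER_UI:
            self.state = CallState.ACTIVE
        return self.state

    def may_send_media(self):
        # Media may leave the device only once the leg is ACTIVE.
        return self.state is CallState.ACTIVE


def add_participant(participants, new_id):
    """Return a new participant list, rejecting IDs already in the call."""
    if new_id in participants:
        raise ValueError(f"{new_id} is already in the call")
    return participants + [new_id]
```

With a guard like this, a signaling-only "answer" leaves the leg in `RINGING`, and adding your own number to the call is rejected before it can confuse the state machine.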

I never worked at Apple, but I built VoIP stuff for the past 20 years.



They were trying to reduce latency and accidentally made it negative.


"negative latency". That's a Jobs-worthy dodge right there.


It's weird seeing my username come up in a comment.


Does it mean I got to read your msg before you clicked the post button! :-)


> They were trying to reduce latency and accidentally made it negative.

Ha! FaceTime is now a non-causal filter.


latency of local hardware is much lower than cellphone/internet latency... so even if you bring local hardware latency to 0, you still have major network lag... I know you are kidding but this is ridiculous

They supposedly have these chips that make your phone much more secure and they can't get stupid stuff like this right? LOL ... GO APPLE. My trust level was already close to zero before this...


Latency between answering the call and being able to actually talk to the person on the other end.


right... the network creates the latency, not the time it takes to turn on a microphone (which is instant when compared to network latency)


> leads to either the remote side thinking that the call was answered

Maybe that's triggered by adding your own number? Since you're clearly on the call already, your own number is obviously going to answer immediately and that kicks the whole call into "active" (since you presumably want a call to become active when more than one person has answered) without considering that you've actually got A+A+B instead of A+B+C.
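
A rough Python sketch of that failure mode, assuming a made-up activation rule of "go active once more than one answered leg exists" (the function names and the rule itself are hypothetical, not anything known about FT's code):

```python
def should_activate_naive(answered_legs):
    # Counts legs, so the same ID appearing twice counts as two answers.
    return len(answered_legs) > 1


def should_activate_distinct(answered_legs):
    # Counts distinct participants, so A+A is still just one person.
    return len(set(answered_legs)) > 1


# A is already on the call, then adds their own number; B is still ringing.
legs_self_added = ["A", "A"]
# A is on the call and a genuine third party C answers.
legs_real_group = ["A", "C"]
```

Here `should_activate_naive(legs_self_added)` is true even though nobody but A has answered, while `should_activate_distinct(legs_self_added)` is false. Both versions agree on the genuine A+C case.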


maybe - it's hard to tell, and I did bring up that other option as well. It's just that when a 1-1 call upgrades to a multiparty call, there should be a lot of new stuff going on at the signaling level to convert that 1-1 call to multiparty - and chances are that it's a combination that leads to this bug - a call gets upgraded to multiparty AND the number being added is already part of the call...


> audio and video capture has to start before the call is actually established at the signaling level, in order to minimize call establishment delay. Audio may be going through Bluetooth, for example, and waking up the BT Handsfree mode may take 1-2 sec

As a user, while I can accept capture starting before I answer, I cannot accept sending. I understand how it helps the speed of establishing calls.

But it means the only thing needed to spy on me from that is a software change ON THE OTHER SIDE. No way to know from my side if I'm good or not.


But your comment is missing the point. The parent commenter said it needs to start recording audio and video earlier, and that a combination of other circumstances causes the device to send the data. Nobody argues this is acceptable.


I don't think that's what he meant. It can be sending empty audio, but the whole path needs to be up and running.


I am not saying whether sending happens at the establishment phase or not - it's just that once your AV capture is started and even encoding is going on, it's a matter of dropping the actual AV packets or sending them at the network stack/jitter buffer level. By the way, if not for privacy/security, it's actually very useful to start sending the AV stream over the network, to pre-heat the network & test its throughput. For example, cellular data bitrate ramps up only when you actually send data, and it takes some time to ramp up to more power-hungry levels and for cellular to even test what's possible. Also, estimating network bandwidth for the application layer requires measuring the round-trip time for a few seconds...
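
A minimal Python sketch of the "drop or send" decision, assuming a hypothetical `outgoing_payload` hook sitting between the encoder and the network stack. The zero-filled buffer is only a stand-in for "empty audio" and is not a valid frame for any real codec:

```python
def outgoing_payload(encoded, answered, warm_up=False):
    """Decide what leaves the device for one already-encoded media frame.

    encoded  -- bytes produced by the (already running) encoder
    answered -- True only after the local user accepted the call
    warm_up  -- if True, send same-sized filler before the answer to keep
                the network path busy without leaking captured media
    """
    if answered:
        return encoded
    if warm_up:
        # Same length as the real frame, but carries no captured content.
        return bytes(len(encoded))
    # Default before the answer: drop the frame entirely.
    return None
```

Before the answer, `outgoing_payload(frame, answered=False)` returns `None` (drop), and with `warm_up=True` it returns filler of the same size. The captured bytes pass through only once `answered` is true, so capture and encoding can stay warm without anything real going out.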



