So it builds a linear timeline?
No, just duplicates of the sources, but with the external audio embedded and synchronized. A little like the transcode function, which asks whether to hide the original media, leaving only the newly transcoded files.
1- Collect the files into one folder (video files with internal audio + external audio). Duplicate them. Calculate the length of each file from its timecode. If there is more than one external audio file, concatenate them and insert blanks to keep the timecode linear. [?]
2- Synchronize the files with the external audio track (your choice of timecode {or waveform}). [?; you consider the external audio track to be the "master" and synchronize the other media with reference to its timecode. Or you take advantage of CinGG's code for "Align Timecodes".]
3- Duplicate and trim the external audio track so that there is a matching chunk for every file. [?; you assign each media file a timecode based on that of the master and trim copies of the master for each media file; ffmpeg -ss, -t, -map]
4- Associate the external audio chunks with the corresponding files. [ffmpeg -map]
5- Delete the internal audio tracks, leaving only the external one. [ffmpeg -map]
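To make steps 3 to 5 more concrete, here is a rough Python sketch that only builds the ffmpeg command for one clip; it does not run anything. It assumes non-drop-frame "HH:MM:SS:FF" timecode, the same frame rate for the clip and the master audio, and a clip that lies entirely inside the external audio. The function names and file names are made up for illustration, and this is not how CinGG does it internally.

```python
from fractions import Fraction


def tc_to_seconds(tc: str, fps: Fraction) -> Fraction:
    """Convert a non-drop-frame 'HH:MM:SS:FF' timecode to seconds."""
    hh, mm, ss, ff = (int(part) for part in tc.split(":"))
    return hh * 3600 + mm * 60 + ss + Fraction(ff) / fps


def build_mux_command(video, audio, video_tc, audio_tc, duration, fps, out):
    """Build an ffmpeg argument list that swaps in the external audio.

    video_tc / audio_tc are the start timecodes of the clip and of the
    master audio; duration is the clip length in seconds.
    """
    # Step 3: where the clip starts inside the master audio.
    offset = tc_to_seconds(video_tc, fps) - tc_to_seconds(audio_tc, fps)
    if offset < 0:
        raise ValueError("clip starts before the external audio begins")
    return [
        "ffmpeg",
        "-i", video,
        # -ss and -t placed before -i apply to the audio input only.
        "-ss", f"{float(offset):.3f}",
        "-t", f"{float(duration):.3f}",
        "-i", audio,
        # Steps 4 and 5: keep the video of input 0 and the audio of input 1;
        # the internal audio is dropped because it is never mapped.
        "-map", "0:v",
        "-map", "1:a",
        "-c:v", "copy",  # video is copied untouched; audio uses the container default
        out,
    ]


if __name__ == "__main__":
    cmd = build_mux_command(
        "clip001.mov", "recorder.wav",      # hypothetical file names
        "01:00:10:12", "01:00:00:00",       # clip starts 10 s 12 frames in
        5, Fraction(25), "clip001_sync.mov",
    )
    print(" ".join(cmd))
```

Looping this over every duplicated clip would give one new file per source, each carrying only its chunk of the external audio.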