-
Notifications
You must be signed in to change notification settings - Fork 168
Feature/trim #50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/trim #50
Conversation
- New component TrimDataSource, wrapping DataSource to be trimmed. - MediaExtractorDataSource is an abstract class to limit visibility of MediaExtractor to package - Updates to Engine to replace selectAudio/transcode/selectVideo/transcode sequence by selectAudio/selectVideo/transcode/transcode
Using 2 editText fields, default value is zero.
Builder can add directly trim values for UriDataSource
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
M concern is that this does not seem to work for audio-only files? I think the implementation could be simpler if you don't differentiate between video and audio, and work for both.
Thank you very much for working on this!
lib/src/main/java/com/otaliastudios/transcoder/engine/Engine.java
Outdated
Show resolved
Hide resolved
lib/src/main/java/com/otaliastudios/transcoder/source/MediaExtractorDataSource.java
Outdated
Show resolved
Hide resolved
lib/src/main/java/com/otaliastudios/transcoder/source/TrimDataSource.java
Outdated
Show resolved
Hide resolved
lib/src/main/java/com/otaliastudios/transcoder/source/TrimDataSource.java
Outdated
Show resolved
Hide resolved
lib/src/main/java/com/otaliastudios/transcoder/source/TrimDataSource.java
Outdated
Show resolved
Hide resolved
- fixed case where video track is absent - throw exceptions for invalid trim values
In the original sequence selectAudio / transcodeAudio / selectVideo / transcodeVideo the first step (selectAudio) is intercepted by selectVideo + seekVideo and the 3rd step (selectVideo) is skipped. So it becomes selectVideo / seekVideo / selectAudio / transcodeAudio / transcodeVideo - Also added throws IllegalArgumentException to TrimDataSource
@mudar I'll answer here:
In the end I guess what I am saying is that the solution should not rely on the presence of a video track, as a file might not have one at all. If this is fixed now, I'm fine!
Is this 100%-ish now? I couldn't test myself yet. |
If we're not 100% yet my suggestion would be to experiment more with selecting normally and seeking twice during I don't think MediaExtractor is bad at seeking audio, rather it's bad at seeking. Often what happens is that the extractor frames are out of order:
When you select both and call Another way is maybe not skip at all and keep calling |
The current solution does not rely on the presence of a video track, it does support audio-only files. My tests are 💯 when trimming a single file, it's the concatenation of multiple files that has some issues, mainly off-sync audio. This could be related to the wrong comparison with
I was not aware of that! Will look into this and come back with questions if I don't manage.
I'll try that too. My understanding is that seeking twice must be done before any reading. If that's not the case, then I think I'll need some more explanations! |
Cleaner simplified code :) The extractor needs a second call to seekTo() after reaching a video keyframe, to obtain better values for audio track. Otherwise, too many audio frames can be lost, causing visible off-sync.
I think things are starting to look good 😅 I moved There are 2 sequences, depending on which track started first:
In the second sequence, we have 3 calls to seekTo(). The hacky one is the third one Anyways... one last thing: I called the method |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mudar ! I like it, looking elegant. I think we should fix a couple of things before merging though.
private boolean isAudioTrackReady; | ||
private boolean isVideoTrackReady; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it helps (not sure), you could use a TrackTypeMap here instead of these two. Just in case you haven't noticed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done 👍
lib/src/main/java/com/otaliastudios/transcoder/source/TrimDataSource.java
Outdated
Show resolved
Hide resolved
@Override | ||
public boolean canReadTrack(@NonNull TrackType type) { | ||
if (source.canReadTrack(type)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this method you call source.canReadTrack(type)
and if true apply our logic. Can we do the opposite?
The upstream source might be on AUDIO
position, but if you seek later, the next position might VIDEO
instead. So it would be better to do our stuff first (which might seek) and then return source.canReadTrack(type)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done 👍
private boolean hasTrack(@NonNull TrackType type) { | ||
return source.getTrackFormat(type) != null; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is needed as is. We don't care if the track exist but rather if the track is selected, and this can only be checked in canRead/read, not during the constructor. (if it doesn't exist, it will never be selected, so you can simply check selection)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok... I think this the only one that remains open 😅 multiple part answer!
We don't care if the track exist but rather if the track is selected
I agree 💯 The error in my code is that it assumes that all existing tracks will be selected. I now see that's not the case!
I might need to keep a list of tracks selected in selectTrack()
to set non-selected tracks to ready at the beginning of canReadTrack()
and this can only be checked in canRead/read
That's the issue I was trying to resolve in Engine, now that I see my error, I think we need to do some changes in the following block
if (!audioCompleted) {
stepped |= getCurrentTrackTranscoder(TrackType.AUDIO, options).transcode(forceAudioEos);
}
if (!videoCompleted) {
stepped |= getCurrentTrackTranscoder(TrackType.VIDEO, options).transcode(forceVideoEos);
}
What this does is that getCurrentTrackTranscoder(AUDIO)
calls selectTrack(audio)
then transcode()
will call canRead(audio)
. Then we'll seek and read audio, before calling getCurrentTrackTranscoder(VIDEO)
. At this point, we can still select the video track, but it's too late to call seekTo()
because that will rewind the audio track.
I'm sorry if this is redundant, but I'm not sure if my explanation was clear the first time. Or maybe I'm just completely lost here 🤔
thanks again for your patience ✌️
extractor.seekTo(trimStartUs, MediaExtractor.SEEK_TO_CLOSEST_SYNC); | ||
updateTrimValues(extractor.getSampleTime()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion - instead of calling extractor.seekTo()
and then update trimStartUs
... You can just leave trimStartUs
as is (even make it final) and call extractor.seekTo(extractor.getSampleTime() + trimStartUs)
anytime you must seek.
This second option would be better because we could get rid of the MediaExtractorDataSource interface! The default data source could simply implement seekTo(long)
by doing what I suggested above (seeking to first sample time + desired value
). So that seekTo(0)
seeks to start. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 Let me know if you still prefer to get rid of updateTrimValues()
. I have removed MediaExtractorDataSource
, related to comment below https://github.com/natario1/Transcoder/pull/50#discussion_r361033517
@Override | ||
public boolean isDrained() { | ||
return source.isDrained(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you tested trimEnd values? I think it should be implemented here, returning true
if getReadUs() >= getDurationUs()
. So that reading stops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done 👍 TrimEnd was auto-magically ok since the beginning, I thought that was because of
public long getDurationUs() {
return trimDurationUs;
}
but adding the condition to isDrained()
does seem the right thing to do 🙃 I was also able to stop manually setting KEY_DURATION for the mediaFormat
/** | ||
* DataSource that allows access to its MediaExtractor. | ||
*/ | ||
abstract class MediaExtractorDataSource implements DataSource { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope we get rid of this (see comments below) but if we don't, this should be made public because it's in the TrimDataSource constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, 👍 I added
long seekTo(long timestampUs);
to the DataSource
interface. The method also returns the new timestamp. This solves the problem where TrimDataSource
could not call getSampleTime()
anymore since we're removing its access to the extactor.
Signature is a bit similar to https://github.com/bcgit/bc-java/blob/master/core/src/main/java/org/bouncycastle/crypto/SkippingCipher.java#L23
Replaced by adding seekTo() to DataSource interface
The rest of the lib already assumes that timestamps start from arbitrary values.
replacing two booleans
Stop reading when readUs is past duration. This removes the need to manually define KEY_DURATION in the mediaFormat
to avoid possible bug where seekTo lands on a different track. Ex: The upstream source might be on AUDIO position, but if you seek later, the next position might VIDEO instead.
@mudar I'm sorry, I completely forgot about this! Let's merge this soon. I understand now what you are saying about the fact the before the second track is selected, the first one has already been through a Now that we have fixed many details in the source implementation, I wonder if you original natario1@fb9a53d#diff-4bd91bf352680b43ed4066c95874c6b2L367-R373 Maybe it wasn't working due to other things that we have solved. In theory if we apply that change, the canRead() could be as simple as: @Override
public boolean canReadTrack(@NonNull TrackType type) {
if (!didSeekTracks) {
source.seekTo(trimStartUs);
didSeekTracks = true;
}
return source.canReadTrack(type);
} What do you think about this? Note that, since the first extractor timestamp is not always 0, there are 2 time scales here: one goes from To solve this we could simply change @Override
public void seekBy(long durationUs) {
ensureExtractor();
mExtractor.seekTo(mExtractor.getSampleTime() + durationUs, MediaExtractor.SEEK_TO_CLOSEST_SYNC);
} This method goes forward by |
- Updates to Engine to replace selectAudio/transcode/selectVideo/transcode sequence by selectAudio/selectVideo/transcode/transcode - remove unnecessary hasTrack() - seekTo() is applied in canReadTrack(), once per selected track. This now works because all track selection operations are done before the first call to canReadTrack(). - When 2 tracks are selected, seekTo() is called twice and this helps the extractor with Audio sampleTime issues.
Hi @natario1, |
- reverted unnecessary changes to Engine class. Previous changes cannot guarantee that all calls to selectTracks() are done before the first canRead(). Latest bug was with merging multiple trimmed files. - use seekBy() to better handle first extractor timestamp - DefaultDataSource makes sure all available tracks are selected by extractor (without adding to mSelectedTracks array). Then seekTo() is called mutlitple times, using the resulting sampletimeUs for the later calls.
Well, another commit, hopefully the last one!
Any suggestions? I was not at ease messing with the drain state.. |
Thanks for doing it! MediaExtractor is a pain, no wonder Google itself dropped it (thinking of ExoPlayer). If this works now I would merge it as is and maybe try to improve it later. If I do I'll ping you. Can you just sync with master though? @mudar |
done 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! I will try to release a new version soon, but if you need this now please fetch the library from JitPack.
Hi Mattia,
Thanks for all the work you've done on this library 👍
After some struggle with the media APIs, I think I have a TrimDataSource solution that works fine, 80% of the time 😅 sometimes audio is offsync, but I am not sure if there's a bug the way I'm trying to handle the feature.
I've based the solution on your answer on issue #37. Please let me know if you would like any changes to fit better into the library.
thanks,
mudar